1 Introduction

John von Neumann is a prominent figure in the history of quantum mechanics. Most notably, we owe him the standard axiomatization of quantum mechanics in Hilbert space. Along with this contribution, he introduced a theorem that, according to many historians and philosophers, was intended as a proof of the impossibility of hidden variables [1, pp. 205–211]. According to a popular narrative, this proof contributed to establishing a hostile environment for the hidden variables program. However, three decades after its formulation, Bell [2] leveled a criticism against the theorem that turned the tide. After Bell’s critique, the proof came to be regarded as a rather unimportant result. Hermann [3] had prefigured Bell’s attack decades earlier, but her argument did not make a significant impact on the community.

This widely accepted account has recently been challenged by Bub [4] and Dieks [5]. They claim that there has been a huge misunderstanding, even from the early stages of our story. Although they of course agree with the standard narrative that the theorem is not an impossibility proof, they claim that von Neumann’s result is nevertheless important. The core idea in Bub’s reassessment, which is shared by Dieks, is that what von Neumann really showed is that hidden variables theories cannot represent their physical quantities (beables) by means of Hermitian operators, so such theories cannot be theories in Hilbert space. Both authors also agree that von Neumann correctly understood the meaning and restricted scope of his theorem, and Dieks adds that although Hermann’s criticism relies on a misunderstanding, she (unlike Bell) did grasp the right lesson to be learned from it.

In a recent paper, Mermin and Schack [6] presented a critical reply to Bub and Dieks. As we will see in detail, Bell’s criticism is that a premise in von Neumann’s proof constitutes a physically unjustified and unnatural assumption for hypothetical hidden variables theories. Bell argued that this premise amounts to the silly imposition of additivity of expectation values for experimentally incompatible quantities and hypothetical deterministic quantum states (determined by hidden variables). However, Bub and Dieks affirm that Bell missed the real foundation and goal of von Neumann’s assumption: to allow the mathematical representation of quantities like \(f(\mathcal{R},\mathcal{S})\) when the quantities \(\mathcal{R}\) and \(\mathcal{S}\) are experimentally incompatible. Correctly understood, Bub and Dieks claim, there is nothing wrong with it. Mermin and Schack have now replied that the premise in question is superfluous, for its goal can allegedly be attained with the rest of von Neumann’s (uncontroversial) assumptions. Furthermore, they claim that their rebuttal of Bub’s and Dieks’ stance was already contained in Hermann’s critical assessment of the proof.

In this paper I critically examine the points of controversy raised by Mermin and Schack. Their main argument brings a new perspective to the evaluation of von Neumann’s debated premise, so a careful consideration of their reply can lead us to further clarification and to a deeper understanding of the relevance and scope of the proof. I show that the assumption is not superfluous, and that its precise meaning and goal are better understood if we keep in mind that the original formulation of the theorem did not involve issues about hidden variables—this is a crucial fact that has not been sufficiently taken into account in the discussion. Then, making use of my reading of the relevance and content of the debated premise, I propose a novel characterization of von Neumann’s understanding of his theorem, which, I think, fits better with the textual evidence. I also defend Dieks’ reading of Hermann’s stance, and draw attention to the fact that Jammer [7] prefigured an interpretation of the theorem along the same lines as Bub’s and Dieks’.

A second goal in this paper is to correctly characterize the connection between von Neumann’s proof and Gleason’s theorem. The link between them has been noted in the course of the discussion, but so far it has not been carefully analyzed. I show that if, once again, we consider von Neumann’s theorem in its initial formulation, we see that although it is true that Gleason’s proof is more powerful in the sense that it establishes the same result on the basis of weaker premises, von Neumann’s is, after all, an equally valuable achievement. First, the same result for hidden variables that we obtain from von Neumann’s theorem can also be obtained from Gleason’s. Second, Gleason’s theorem yields a vindication of von Neumann’s debated premise. That is, given Gleason’s theorem and a correct reading of the original goal of von Neumann’s theorem, we see that in the end there is nothing questionable or misleading in the debated assumption.

2 The Story So Far

In order to have all the cards on the table, I will first review the essentials and the successive stages in the debate about the interpretation of von Neumann’s theorem. I present an outline of the 1932 proof, Hermann’s and Bell’s critical reactions, Bub’s and Dieks’ reassessment, and Mermin and Schack’s subsequent reply.

2.1 von Neumann, 1932

In his seminal book Mathematical Foundations of Quantum Mechanics, published in 1932, von Neumann [1] presented the theorem at issue. As I mentioned above, it has usually been understood as an attempted proof of the impossibility of hidden variables in quantum mechanics. [Footnote 1] Its premises are the following [1, pp. 201–205]:

  • A′. If the quantity \(\mathcal{R}\) is by nature non-negative (if, for example, it is the square of another quantity \(\mathcal{S}\)) then also \(Exp(\mathcal{R})\ge 0\).

  • B′. If \(\mathcal{R}\), \(\mathcal{S}\),… are arbitrary quantities, and \(a\), \(b\),… are real numbers, then \(Exp\left(a\mathcal{R}+b\mathcal{S}+\cdots\right)=aExp(\mathcal{R})+bExp(\mathcal{S})+\cdots\).

  • I. If the quantity \(\mathcal{R}\) has the operator \(R\), then the quantity \(f(\mathcal{R})\) has the operator \(f(R)\).

  • II. If the quantities \(\mathcal{R}\), \(\mathcal{S}\),… have the operators \(R\), \(S\),…, then the quantity \(\mathcal{R}+\mathcal{S}+\cdots\) has the operator \(R+S+\cdots\).

Assuming these premises, von Neumann derives the trace rule \(Exp\left(\mathcal{R}\right)=\mathrm{Tr}(UR)\) [1, pp. 205–207, see Appendix 1], and from this result he proves two corollaries. First, that the trace rule does not admit dispersion-free (deterministic) states, that is, states such that, for any quantity \(\mathcal{R}\), \(Exp\left(\mathcal{R}^{2}\right)=\left[Exp(\mathcal{R})\right]^{2}\) [1, pp. 208–209, see Appendix 2]. Now, if a state is dispersive, von Neumann remarks that it can always be decomposed into two sub-ensembles such that \(Exp\left(\mathcal{R}\right)=\alpha Exp^{\prime}\left(\mathcal{R}\right)+\beta Exp^{\prime\prime}(\mathcal{R})\), where \(\alpha >0\), \(\beta >0\), and \(\alpha +\beta =1\). Then he defines a homogeneous (pure) state as a state for which it holds that \(Exp\left(\mathcal{R}\right)=Exp^{\prime}\left(\mathcal{R}\right)=Exp^{\prime\prime}(\mathcal{R})\) for any sub-ensembles. As a second corollary, he proves that a state is homogeneous iff the density operator \(U\) that represents it is a projector onto a unit vector [1, pp. 209–210, see Appendix 3].
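
To make the two corollaries concrete, here is a minimal worked example in a two-dimensional Hilbert space. It is an illustration added for the reader, not part of von Neumann’s text: the choice of the state \(U=P_{\phi}\) and of the Pauli operator \(\sigma_x\) as the operator of \(\mathcal{R}\) are assumptions made only for this sketch.

\[
U = P_{\phi} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
R = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
R^{2} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]
\[
Exp(\mathcal{R}) = \mathrm{Tr}(UR) = 0, \qquad
Exp(\mathcal{R}^{2}) = \mathrm{Tr}(UR^{2}) = 1 \neq 0 = \left[Exp(\mathcal{R})\right]^{2}
\]

Here \(U\) is a projector onto a unit vector, so by the second corollary the state is homogeneous, yet it is dispersive for \(\mathcal{R}\), as the first corollary requires of every state given by the trace rule.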

The introduction of hidden variables should yield dispersion-free states, and with suitable values for the hidden parameters a dispersive state could be decomposed into deterministic sub-ensembles, so with hidden variables a dispersive state cannot be homogeneous. However, the first corollary shows that dispersion-free states are impossible, and the second corollary shows that there are dispersive homogeneous states. Von Neumann then concludes that:

In fact, we have even established that it is impossible for the same physical quantities to exist with the same functional connections (i.e., for I and II to hold) if other variables (i.e., “hidden parameters”) exist in addition to the wave functions. Nor would it help if there existed other, as yet undiscovered, physical quantities in addition to those represented by the operators in quantum mechanics because the relations assumed by quantum mechanics (i.e., I and II) would have to fail already for the by-now-known quantities […]. It is therefore not, as is often assumed, a question of the reinterpretation of quantum mechanics—the present system of quantum mechanics would have to be objectively false for a description other than the statistical description of elementary processes to be possible. [1, pp. 211–212]

2.2 Hermann, 1935

In Section 7 of an essay published in 1935, entitled Natural-Philosophical Foundations of Quantum Mechanics, Grete Hermann criticized von Neumann’s proof by challenging premise B′. Hermann starts out by underscoring that the theorem crucially relies on it: “Neumann assumes that \(Exp\left(\mathcal{R}+\mathcal{S}\right)=Exp\left(\mathcal{R}\right)+Exp(\mathcal{S})\). In words: the expectation value of a sum of physical quantities is equal to the sum of the expectation values of the two quantities. Neumann’s proof stands or falls with this assumption” [3, pp. 251–252]. She comments that B′ is a natural assumption in classical mechanics, and that in quantum mechanics it is also unproblematic for compatible quantities for which uncertainty relations do not hold—i.e., for quantities represented by commuting Hermitian operators. In the case of two such quantities \(\mathcal{A}\) and \(\mathcal{B}\), the value of the quantity \(\mathcal{A}+\mathcal{B}\) is simply the sum of the values of \(\mathcal{A}\) and \(\mathcal{B}\), and from this it follows that the expectation values are also additive.
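
The commuting case can be sketched as follows, under the simplifying assumption (made only for this illustration) that \(A\) and \(B\) are the operators of \(\mathcal{A}\) and \(\mathcal{B}\) and have a discrete common eigenbasis \(\{\phi_i\}\) with eigenvalues \(a_i\) and \(b_i\). These symbols, and the probabilities \(p_i\), are not Hermann’s.

\[
A\phi_i = a_i\phi_i, \quad B\phi_i = b_i\phi_i
\;\;\Longrightarrow\;\;
(A+B)\phi_i = (a_i+b_i)\phi_i
\]
\[
Exp(\mathcal{A}+\mathcal{B}) = \sum_i p_i\,(a_i+b_i) = \sum_i p_i\,a_i + \sum_i p_i\,b_i = Exp(\mathcal{A}) + Exp(\mathcal{B})
\]

Here \(p_i\) is the probability of the joint outcome \((a_i, b_i)\). Additivity of values holds outcome by outcome, so additivity of expectation values follows for any distribution \(p_i\), including one concentrated on a single outcome.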

However, for experimentally incompatible quantities represented by non-commuting operators, the assumption, she claims, does become problematic: “for the so-defined concept of the sum of two quantities that are not simultaneously measurable, the formula given above [B′] requires a proof” [3, p. 252]. Hermann remarks that B′ does indeed hold for states represented by a wavefunction (i.e., by a density operator), and she claims that this is the proof that von Neumann is providing for B′:

Neumann relies on the fact that, in the context of the formalism, the rule \(((R + S)\phi , \phi ) = (R\phi , \phi ) + (S\phi , \phi )\) holds for the symbol \(\left(R\phi , \phi \right)\) [Footnote 2], which represents a number and is interpreted as the expectation value of the quantity \(\mathcal{R}\) in the state \(\phi\). (Here \(R\) and \(S\) are the mathematical operators assigned to the quantities \(\mathcal{R}\) and \(\mathcal{S}\); \(\phi\) specifies the wave function of the systems under consideration.) From this rule Neumann concludes that for ensembles of systems with equal wave functions, and therefore for all ensembles generally, the addition theorem for expectation values holds also for quantities that are not simultaneously measurable. [3, p. 252]

But this maneuver is circular, Hermann states, because we are excluding from the outset the possibility of dispersion-free sub-ensembles, characterized by “new features” beyond wavefunctions. For such sub-ensembles, claims Hermann, the linear additivity of expectation values in premise B′ should not be expected for incompatible quantities represented by non-commuting Hermitian operators. She then concludes that the proof is question-begging, for a justification of B′ relying on the additivity that holds for quantum mechanical states given by wavefunctions—i.e., dispersive states represented by density operators \(U\) in the trace rule—amounts to excluding in advance the possibility of dispersion-free quantum sub-ensembles specified by hidden variables. In her own words,

one has implicitly absorbed into the interpretation the unproven assumption that there can be no distinguishing features, of the elements of an ensemble of physical systems characterized by \(\phi\), on which the result of the \(\mathcal{R}\)-measurement depends. However, the impossibility of such features is precisely the claim to be proven. Thus the proof runs in a circle. [3, p. 252]

2.3 Bell, 1966

In an article entitled On the Problem of Hidden Variables in Quantum Mechanics, published three decades after Hermann’s critique, Bell [2] introduced a somewhat similar rebuttal of von Neumann’s impossibility proof. Bell identifies what he considers the crucial assumption in the proof, namely that “any real linear combination of any two Hermitian operators represents an observable, and the same linear combination of expectation values is the expectation value of the combination” [2, pp. 448–449]—i.e., the conjunction of II and B′. Then he states that for dispersion-free states, the expectation value of a quantity is given by one of the eigenvalues of the operator that represents the quantity. However, in the case of non-commuting operators that represent experimentally incompatible quantities, their eigenvalues are, in general, not linear-additive. Thus, Bell states, B′ cannot hold for dispersion-free states, so if we assume this premise, such states are not possible.
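
A standard spin-1/2 illustration of this step, added here as a sketch: the choice of the Pauli operators \(\sigma_x\) and \(\sigma_y\), and the notation \(v(\cdot)\) for the value that a hypothetical dispersion-free state assigns to a quantity, are assumptions of the example.

\[
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_x\sigma_y + \sigma_y\sigma_x = 0
\]
\[
(\sigma_x + \sigma_y)^{2} = \sigma_x^{2} + \sigma_y^{2} + \sigma_x\sigma_y + \sigma_y\sigma_x = 2I
\]

So \(\sigma_x\) and \(\sigma_y\) each have eigenvalues \(\pm 1\), while \(\sigma_x + \sigma_y\) has eigenvalues \(\pm\sqrt{2}\). A dispersion-free state would require \(v(\sigma_x), v(\sigma_y) \in \{+1,-1\}\) and \(v(\sigma_x+\sigma_y) \in \{+\sqrt{2}, -\sqrt{2}\}\), but \(v(\sigma_x) + v(\sigma_y) \in \{-2, 0, 2\}\), so B′ cannot be satisfied by such a state.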

But Bell then argues that the imposition of B′ for hypothetical dispersion-free states is unjustified. Along the same lines as Hermann, he complains that if we consider experimentally incompatible quantities, the description of dispersion-free states by means of hidden variables would make it natural that for such states expectation values are not additive, while additivity does hold when the states are averaged over the values of hidden variables (resulting in dispersive states described by wavefunctions/density operators):

The essential assumption can be criticized as follows. At first sight the required additivity of expectation values seems very reasonable, and it is rather the nonadditivity of allowed values (eigenvalues) which requires explanation. Of course the explanation is well known: A measurement of a sum of noncommuting observables cannot be made by combining trivially the results of separate observations on the two terms—it requires a quite distinct experiment […]. But this explanation of the nonadditivity of allowed values also establishes the nontriviality of the additivity of expectation values. The latter is a quite peculiar property of quantum mechanical states, not to be expected a priori. There is no reason to demand it individually of the hypothetical dispersion free states, whose function it is to reproduce the measurable peculiarities of quantum mechanics when averaged over. [2, p. 449]

Bell then concludes that von Neumann’s proof misses the mark. Since B′ is an unjustified assumption for hypothetical dispersion-free states specified by hidden variables, a proof of the impossibility of hidden variables that relies on it is a non-starter: “It was not the objective measurable predictions of quantum mechanics which ruled out hidden variables. It was the arbitrary assumption of a particular (and impossible) relation between the results of incompatible measurements” [2, p. 449]. In other words, in Bell’s eyes von Neumann’s theorem is only a proof of the impossibility of an uninteresting and irrelevant class of hidden variables theories. Furthermore, in a 1988 interview, he went as far as describing the whole proof in the following terms:

Yet the Von Neumann proof, if you actually come to grips with it, falls apart in your hands! There is nothing to it. It’s not just flawed, it’s silly. If you look at the assumptions made, it does not hold up for a moment. It’s the work of a mathematician, and he makes assumptions that have a mathematical symmetry to them. When you translate them into terms of physical disposition, they’re nonsense. You may quote me on that: The proof of Von Neumann is not merely false but foolish! [11, p. 88]

2.4 Bub, 2010

In Von Neumann’s ‘No Hidden Variables’ Proof: a re-appraisal, Bub [4] argues that Bell [2] misconstrues von Neumann’s proof, and that the theorem does show something significant. Bub’s reassessment is based on the fact that the conjunction of A′, B′, I and II leads to the trace rule, which tells us that quantum mechanical states are always given by density operators (see Appendix 1). From this, it follows that there cannot be dispersion-free states (see Appendix 2), and that there are dispersive homogeneous states (see Appendix 3). Now, Bub states that the significance of premises I and II for hidden variables theories is that they amount to the principle that the beables in such theories are represented by Hermitian operators: “I, II, relate physical quantities to Hilbert space operators. It is assumed that each physical quantity of a quantum mechanical system is represented by a […] Hermitian operator in a Hilbert space” [4, p. 1336]. Then, since—as von Neumann claims—the introduction of hidden variables amounts to the rejection of I and II, Bub concludes that although the theorem is not an absolute proof of the impossibility of hidden variables, it is still important:

what von Neumann’s proof excludes, then, is the class of hidden variable theories in which (i) dispersion free (deterministic) states are the extremal states, and (ii) the beables of the hidden variable theory correspond to the physical quantities represented by the Hermitian operators of quantum mechanics. [4, p. 1340]

Bub also defends von Neumann’s rationale for B′. In a nutshell (more about this below), von Neumann noted that if we consider experimentally incompatible quantities \(\mathcal{R}\) and \(\mathcal{S}\), the incompatibility prevents the value of the quantity \(\mathcal{R}+\mathcal{S}\) from being obtained by simultaneous or successive measurements of \(\mathcal{R}\) and \(\mathcal{S}\). Thus, he stated, we must assume B′, so that \(Exp\left(\mathcal{R}+\mathcal{S}\right)=Exp\left(\mathcal{R}\right)+Exp(\mathcal{S})\) implicitly defines the quantity \(\mathcal{R}+\mathcal{S}\).

Besides, Bub states that Bell’s understanding of the theorem was rather shortsighted. He describes Bell’s stance as a quick argument that follows from the core of the proof. That is, the quick argument tells us that for hypothetical dispersion-free states expectation values are eigenvalues of the corresponding operators, but given B′, such states are impossible because the eigenvalues of non-commuting operators are not linearly additive. However, Bub claims, this does not capture the deeper meaning of von Neumann’s theorem. First, the quick argument fails to realize that the proof determines which states are allowed by the quantum formalism in Hilbert space:

What the quick argument shows is that we cannot identify the physical quantities of a hidden variable theory with Hermitian operators, according to I, II, if we require the existence of dispersion free states. What the argument does not show is what sorts of states are allowed for ‘physical quantities’ in a generalized sense characterized by the conditions A′, B′, I, II, and this is clearly the more interesting question. Von Neumann’s proof is designed to answer this question, i.e., to derive the full convex set of quantum probability distributions, and once this question is answered, the quick argument is redundant. [4, p. 1337]

Second, Bub argues that Bell’s claim that von Neumann presented his theorem as an absolute proof of the impossibility of hidden variables is wrong. According to Bub, von Neumann was clear and correct that the meaning of his theorem is that it rules out hidden variables theories in which I and II hold, that is, hidden variables theories in which the beables are represented by Hermitian operators. To defend this view, he states that in the passage where von Neumann asserts that “the present system of quantum mechanics would have to be objectively false for a description other than the statistical description of elementary processes to be possible” [1, p. 212], by “objectively false” he refers to the invalidity of I and II, not to the empirical inadequacy of quantum mechanics:

So the sense in which ‘the present system of quantum mechanics would have to be objectively false’ if the quantum statistics could be derived from a distribution of dispersion free or deterministic states is that, in a hidden variable theory, the association of known physical quantities—like energy, position, momentum—with Hermitian operators in Hilbert space would have to fail. [4, p. 1338]

2.5 Dieks, 2017

In a recent article, Von Neumann’s Impossibility Proof: mathematics in the service of rhetorics, Dieks [5] endorses Bub’s reading of the relevance of the theorem: “the proof tells us that the hidden physical properties added to the quantum description in a hidden variables completion of the theory cannot correspond to Hermitian operators in Hilbert space in the way the standard quantum quantities do” [5, p. 140]. He also defends the view, along the same lines as Bub, that von Neumann understood the significance of his theorem quite correctly. Dieks argues that Bell failed to understand the motivation for B′ that von Neumann offered—that it allows the definition of quantities like \(\mathcal{R}+\mathcal{S}\) when \(\mathcal{R}\) and \(\mathcal{S}\) are experimentally incompatible—and this failure misled him into thinking that von Neumann intended to impose B′ for hypothetical dispersion-free states on any hidden variables theory:

Bell thus misconstrues the premises of von Neumann’s proof. He incorrectly interprets them as a requirement imposed on the expectation values of physical quantities that are defined via their representation by operators in standard quantum mechanics. His triviality objection boils down to the observation that it is easy to see that this requirement cannot be imposed anyway. By contrast, there is nothing obviously wrong or impossible when we follow von Neumann’s own reasoning. If von Neumann’s premises are formulated as von Neumann himself stated them, there is no triviality in his proof. [5, p. 142]

Dieks also argues that the core of Hermann’s criticism is unfounded. Recall that she claims that the theorem is circular insofar as the proof of B′ that von Neumann allegedly offers is that additivity of expectation values holds for all quantum states allowed by the trace rule, but this amounts to assuming from the outset that dispersion-free states specified by hidden variables are not possible. Dieks states that von Neumann never intended to offer a proof of B′, and that he never needed one. Again, he introduced this assumption as a principle that allows one to define quantities which are the sum of incompatible quantities, and once this principle is assumed, it is obvious that additivity of expectation values will hold in the result obtained:

There seems to be a serious misunderstanding here on Hermann’s part: indeed, there is no need at all for von Neumann to worry about a proof for the additivity of the expectation values of physical quantities, since in his argumentative set-up this is not an assumption at all but a conclusion that follows analytically from the definition of \(\mathcal{R}+\mathcal{S}\). That von Neumann felt that he needed a proof for the “additivity assumption” and subsequently attempted to provide such a proof in several steps is therefore—strangely enough—a fabrication. [5, p. 143]

In any case, Dieks also affirms that despite this unfounded criticism, and despite the allegedly wrong accusation that von Neumann introduced his theorem as an absolute impossibility proof, Hermann correctly understood the restricted scope of the no-go result. There are passages in which she was clear that what von Neumann proved is that the quantum formalism in Hilbert space cannot be completed with the introduction of hidden variables, and that this leaves open the possibility of viable hidden variables formulated beyond the boundaries of that formalism:

Remarkably, in their final assessments of the situation Hermann and von Neumann substantially agree—even though Hermann apparently is unaware of this. Both think that hidden variables have only been excluded to the extent that they could fit in the Hilbert space formalism, and that it is an empirical question whether this means that they will never be needed. [5, p. 145]

2.6 Mermin and Schack, 2018

In the most recent work in this discussion, Homer Nodded: von Neumann’s surprising oversight, Mermin and Schack [6] challenge the analysis of the theorem proposed by Bub and Dieks, and they also defend Hermann and Bell from the accusation of having misunderstood von Neumann’s proof.

Mermin and Schack claim that Bub and Dieks defend the view, also endorsed by von Neumann, that the theorem forces hidden variables theories to drop I and II. They also assert that Bub and Dieks argue that since B′ amounts to an implicit definition of quantities that are the sum of experimentally incompatible quantities, its analytic-definitional character makes the assumption untouchable. After considering the explanation that von Neumann offers for B′, Mermin and Schack state that:

Bub and Dieks both take this to mean that von Neumann uses assumption B′ to define linear combinations of physical quantities that are not simultaneously measurable. This is the entire basis for their criticisms of Bell and Hermann. If B′ is just a definition, it cannot also be an invalid assumption, as Hermann and Bell maintain. [6, p. 1009]

Mermin and Schack further claim that I and II are enough to fulfill the purpose of mathematically defining physical quantities, so that B′ is not necessary after all. Assumption I guarantees, they claim, that the correspondence between physical quantities and Hermitian operators is one-to-one, so I secures that operators can always define physical quantities. Accordingly, assumption II is then simply an instance of this definitional procedure in the case of quantities that are the linear sum of experimentally incompatible quantities. On the basis of this analysis, Mermin and Schack conclude that the allegedly superfluous character of assumption B′ undermines the reassessment of the theorem proposed by Bub and Dieks:

This observation invalidates what Bub and Dieks have to say about Hermann’s and Bell’s alleged misunderstanding of von Neumann. Whether von Neumann intended to define such sums through Assumption II is beside the point, though we believe he did […]. To invalidate Bub’s and Dieks’ criticism of Hermann and Bell it is enough that an alternative definition exists in addition to the definition Bub and Dieks attribute to von Neumann. [6, p. 1010]

Mermin and Schack also claim that if it were true that the introduction of hidden variables to describe dispersion-free states forces us to drop I and II, then von Neumann’s conclusion that the present system of quantum mechanics would have to be objectively false would be justified: “so strong a conclusion might indeed be appropriate if Assumption I and II were the only suspects” [6, p. 1012]. However, they assert that it is B′ that must be jettisoned, and that its rejection does not lead to the objective falsity of the system of quantum mechanics: “it is not only meaningful to reject B′ for the hypothetical dispersion-free subensembles, but quite compatible with the general structure of ordinary quantum mechanics” [6, p. 1013].

Furthermore, they argue that Hermann’s critical account of the theorem was correct, for she allegedly noticed that B′ is unnecessary to fulfill the task of defining physical quantities like \(\mathcal{R}+\mathcal{S}\). Hermann indeed notices that the incompatibility between \(\mathcal{R}\) and \(\mathcal{S}\) involves a difficulty in defining the quantity that is their sum, and then she makes the following comment: “only by the detour over certain mathematical operators assigned to these quantities does the formalism introduce the concept of a sum also for such quantities” [3, p. 252]. Mermin and Schack quote this passage as evidence that Hermann realized that B′ is superfluous, for they claim that by a “detour over certain mathematical operators” she refers to I and II:

Hermann is saying here that because it is not clear how to define the sum in [\(Exp\left(\mathcal{R}+\mathcal{S}\right)=Exp\left(\mathcal{R}\right)+Exp(\mathcal{S})\)] or in Assumption B′ of two quantities that are not jointly measurable, “to introduce the concept of a sum… for such quantities” requires a detour involving mathematical operators assigned to them—i.e. von Neumann’s Assumptions I and II. By emphasizing the need for a detour into I and II she underlines that it is not necessary to take B′ to define the sum of quantities that are not simultaneously measurable. Hermann is reading von Neumann just as we do. [6, p. 1015]

3 Von Neumann Revisited

Now that we have reviewed the different stances and arguments in the discussion, I will critically consider the points of controversy introduced by Mermin and Schack [6], with the goal of getting a deeper and clearer understanding of the meaning and relevance of von Neumann’s theorem. I will first call attention to a crucial fact that has not been considered in the discussion, namely, that von Neumann’s theorem was not originally presented in connection with issues about hidden variables. In its first formulation, von Neumann [12] aimed at a derivation of the predictive formula of quantum theory in Hilbert space from basic principles. Considering this fact, I provide a careful analysis of his justification of the controversial premise B′. On the basis of a correct understanding of the goal of the proof, and of the meaning of assumption B′ in the context of that goal, I will show that this premise is not superfluous at all. Furthermore, I will reconsider the discussion about what von Neumann’s own understanding of his theorem was, proposing a middle-way alternative between the popular narrative and Bub’s and Dieks’ interpretation. Finally, I will reconsider the discussion about Grete Hermann’s understanding of the theorem, complementing Dieks’ stance, and I will briefly comment on Max Jammer’s illuminating remarks on von Neumann’s proof.

3.1 von Neumann, 1927

Von Neumann’s theorem is usually analyzed as presented in the Mathematical Foundations of Quantum Mechanics, published in 1932. However, he had already introduced the conceptual basis for his seminal book in a series of papers that appeared in 1927. [Footnote 3] In the first article, von Neumann [13] presented the mathematical formalism of Hilbert space, and showed that it allows a novel formulation of quantum theory. One of von Neumann’s main achievements in that paper was that in Hilbert space the Born rule takes the form of the trace rule \({\langle R\rangle }_{U}=\mathrm{Tr}(UR)\).

In his subsequent Wahrscheinlichkeitstheoretischer Aufbau der Quantenmechanik, von Neumann [12] complained that in the previous paper he had simply assumed the Born rule (in the form of the trace rule) on the basis of its empirical adequacy as a central part of the new quantum formalism:

The method hitherto used in statistical quantum mechanics was essentially deductive: the square of the norm of certain expansion coefficients of the wave function or of the wave function itself was fairly dogmatically set equal to a probability, and agreement with experience was verified afterwards. A systematic derivation of quantum mechanics from empirical facts or fundamental probability-theoretic assumptions, i.e., an inductive justification, was not given. [12, p. 246, quoted in 14, p. 246]

Considering this shortcoming, he now wanted to proceed in such a way that the trace rule could be derived from basic probability-theoretic assumptions, with the goal of building the Hilbert space formalism of quantum mechanics on stronger foundations. The theorem that we have been discussing is von Neumann’s fulfillment of this task. A′, B′, I and II (which in von Neumann’s second 1927 paper are labeled B, A, D and C, respectively) are the probability-theoretic assumptions from which he derived the trace rule (see Appendix 1).

As a corollary to this theorem, von Neumann [12, p. 255] proved that a quantum state represented by the density operator in the trace rule is homogeneous (i.e., for that state and any quantity \(\mathcal{R}\), it holds that \(Exp\left(\mathcal{R}\right)=Exp^{\prime}\left(\mathcal{R}\right)=Exp^{\prime\prime}\left(\mathcal{R}\right)\) for any sub-ensembles) iff the density operator \(U\) that represents it is a projector onto a normalized vector. A notable feature in this corollary (that will be important below) is that in its derivation the assumptions A′, B′, I and II are not invoked; only the trace rule and some basic properties of projectors are needed (see Appendix 3).

A crucial point for our discussion is that neither the theorem nor the described corollary was presented or discussed by von Neumann in connection with the (im)possibility of hidden variables. As we mentioned, von Neumann’s goal was to provide stronger mathematical and theoretical foundations for his formalism of quantum mechanics in Hilbert space, particularly in the case of the trace rule. Hidden variables were not considered at all in the original formulation of the theorem. Von Neumann’s own words to characterize the goal and importance of the Aufbau paper are the following:

the goal of the present paper was to show that quantum mechanics is not only compatible with the usual probability calculus, but that, if it [probability calculus]—along with a few plausible factual assumptions—is taken as given, it [quantum mechanics] is actually the only possible solution. [12, p. 271, quoted in 14, p. 247]

3.2 The rationale for B′

In his 1932 book, von Neumann is careful to justify his premises. A′ is practically analytic, and required to prove that the density operators in the trace rule are positive semi-definite (see Appendix 1). I and II are justified in terms of the representational correspondence between Hermitian operators and quantities. In von Neumann’s words, “in quantum mechanics […], the quantities \(\mathcal{R}\) correspond one-to-one to the hypermaximal Hermitian operators \(R\)” [1, p. 159]. Commenting on II he adds that “this operation depends upon the fact that for two Hermitian operators, \(R\), \(S\), the sum \(R+S\) is also a Hermitian even if \(R\) and \(S\) do not commute” [1, pp. 201–202]. In short, I tells us that if a quantity \(\mathcal{R}\) is represented by operator \(R\), the quantity \(f(\mathcal{R})\) is represented by \(f(R)\); whereas II tells us that if \(\mathcal{R}\) and \(\mathcal{S}\) are represented by \(R\) and \(S\), the quantity that is the sum of \(\mathcal{R}\) and \(\mathcal{S}\) is represented by the sum of \(R\) and \(S\). That is, I and II establish that the functional relations between the represented quantities are mirrored by the representing operators.

Let us now focus on the rationale for premise B′ that von Neumann offers. In the Aufbau paper, he briefly refers to its definitional role: if two quantities are experimentally incompatible, to define the quantity that is their sum we need to assume the linear additivity of expectation values, which accords with the basic principles of probability calculus [12, p. 249]. As we mentioned above, in the Aufbau paper hidden variables are not discussed, so the assumption of the linear additivity of expectation values is defended in this way in the context of the task of obtaining a derivation of the trace rule from basic theoretical-probabilistic principles.

Von Neumann offers a much more detailed discussion of assumption B′ in the 1932 book, but again, not in connection to hidden variables. The basic idea is of course the same: B′ is invoked to define quantities like \(\mathcal{R}+\mathcal{S}\) in the face of experimental incompatibility, but now he adds important background considerations. He first asks us to “forget the whole of quantum mechanics” [1, p. 194], which means that the line of reasoning to be introduced is about physical reasoning in general, not only about quantum theory. Then he tells us to retain the following principle. Assume we have a physical system, and a set of well-defined quantities, together with their experimental methods of measurement. Given one of those quantities, say, \(\mathcal{R}\), we can always take a function \(f(x)\) and define the quantity \(f(\mathcal{R})\), which will also have an experimental method of measurement: measure \(\mathcal{R}\), and then apply the function \(f\) to the outcome obtained to get the value of \(f(\mathcal{R})\). This means that all the quantities \(f(\mathcal{R})\) (\(\mathcal{R}\) fixed, \(f\) arbitrary) are simultaneously measurable.

By the same token, if we have two quantities \(\mathcal{R}\) and \(\mathcal{S}\), we assume, according to standard physical reasoning, that we can legitimately define the quantity \(f(\mathcal{R},\mathcal{S})\). We could, again by the same token, say that the experimental method of measurement is simply to perform measurements of \(\mathcal{R}\) and \(\mathcal{S}\) and apply the function \(f\) to the outcomes to get the value of \(f(\mathcal{R},\mathcal{S})\). However, von Neumann states, it could be the case that our quantities \(\mathcal{R}\) and \(\mathcal{S}\) are experimentally incompatible, in the sense that they cannot be simultaneously measured on a system, nor successively measured on a system in the same state. In this case, we can construct an ensemble \({\varvec{S}}\), which we can divide into sub-systems \({{\varvec{S}}}_{1},{{\varvec{S}}}_{2},\dots , {{\varvec{S}}}_{n}\), with \(n\) large. For such an ensemble we do not measure the value of a quantity, but we get the distribution of values for the sub-systems, and then we can define the corresponding expectation values, for example \(Exp(\mathcal{R})\) and \(Exp(\mathcal{S})\). Consider then two sub-ensembles in \({\varvec{S}}\), \({{\varvec{S}}}_{1},\dots ,{{\varvec{S}}}_{m}\) and \({{\varvec{S}}}_{m+1},\dots ,{{\varvec{S}}}_{2m}\), with \(m\) large, but with \(2m\ll n\). We can determine the distribution and expectation value \(Exp(\mathcal{R})\) for \(\mathcal{R}\) in the sub-ensemble \({{\varvec{S}}}_{1},\dots ,{{\varvec{S}}}_{m}\), and the distribution and expectation value \(Exp(\mathcal{S})\) for \(\mathcal{S}\) in the sub-ensemble \({{\varvec{S}}}_{m+1},\dots ,{{\varvec{S}}}_{2m}\), apply the function \(f\) to the results obtained, and get \(Exp\left(f\left(\mathcal{R},\mathcal{S}\right)\right)\) for \({\varvec{S}}\).
This line of reasoning allows us to implicitly define the quantity \(f(\mathcal{R},\mathcal{S})\) and to determine an experimental method to measure it, despite the experimental incompatibility between \(\mathcal{R}\) and \(\mathcal{S}\). In the case of the quantity \(\mathcal{R}+\mathcal{S}\), what we have said amounts to assuming that \(Exp\left(\mathcal{R}+\mathcal{S}\right)=Exp\left(\mathcal{R}\right)+Exp(\mathcal{S})\), which is of course B′.
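The sub-ensemble procedure for \(\mathcal{R}+\mathcal{S}\) can be illustrated with a small simulation. This is only a minimal sketch under assumptions that are not in von Neumann's text: a single qubit, with \(\mathcal{R}\) and \(\mathcal{S}\) taken to be the spin components represented by \(\sigma_z\) and \(\sigma_x\), a hypothetical pure state parametrized by an angle `theta`, and outcome probabilities borrowed from standard quantum mechanics. The function names are illustrative.

```python
import math
import random


def sample_mean(prob_plus, m, rng):
    """Mean of m simulated +/-1 outcomes; +1 occurs with probability prob_plus."""
    total = 0
    for _ in range(m):
        total += 1 if rng.random() < prob_plus else -1
    return total / m


def exp_sum_from_subensembles(theta, m, seed=0):
    """Estimate Exp(R + S) from two disjoint sub-ensembles of size m each."""
    rng = random.Random(seed)
    # Assumed toy state: cos(theta/2)|0> + sin(theta/2)|1>, for which
    # Exp(R) = cos(theta) with R = sigma_z, and Exp(S) = sin(theta) with S = sigma_x.
    p_r_plus = (1 + math.cos(theta)) / 2
    p_s_plus = (1 + math.sin(theta)) / 2
    exp_r = sample_mean(p_r_plus, m, rng)  # systems 1..m: only R is measured
    exp_s = sample_mean(p_s_plus, m, rng)  # systems m+1..2m: only S is measured
    # B' is what licenses reading this sum as Exp(R + S).
    return exp_r + exp_s


theta = math.pi / 3
estimate = exp_sum_from_subensembles(theta, m=100_000)
reference = math.cos(theta) + math.sin(theta)  # value the estimate should approach
print(estimate, reference)
```

No individual system is ever subjected to both measurements, which is the point of the construction: the quantity \(\mathcal{R}+\mathcal{S}\) is fixed only at the level of expectation values over the ensemble.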

In sum, if we have a theory in which \(\mathcal{R}\) and \(\mathcal{S}\) are experimentally incompatible, in order to be able to define a quantity \(\mathcal{Q}=\mathcal{R}+\mathcal{S}\) in such a way that its mathematical representative reflects \(\mathcal{Q}\)’s functional dependence on \(\mathcal{R}\) and \(\mathcal{S}\), we need to invoke B′. In other words, if we have incompatible quantities \(\mathcal{R}\) and \(\mathcal{S}\) represented by the mathematical objects \(R\) and \(S\), we can only represent \(\mathcal{Q}=\mathcal{R}+\mathcal{S}\) with the operator \(Q=R+S\) by assuming B′. Thus, von Neumann’s justification for this assumption is that it is required to functionally define quantities that are a linear combination of experimentally incompatible observables. Therefore—and now returning to quantum theory—in order to extract the full representational power of Hilbert space according to I and II, i.e., respecting the principle that the functional relations between quantities are mirrored by the representing operators, B′ is required. Although B′ and II are logically independent, in the context of experimental incompatibility they work in tandem.

Notice that the preceding discussion that von Neumann offers to defend B′ does not involve hidden variables at all. The motivation for B′ only responds to the incompatibility between physical quantities, a scenario that occurs in quantum physics. Furthermore, although von Neumann is not explicit about it, the defense of B′ does not assume that the expectation values involved are probabilistically spread. It is of course true that B′ accords with the rules of probability calculus [12, p. 249], but the point is that the argument holds also if we assume that either \({\varvec{S}}\) or the subsystems \({{\varvec{S}}}_{i}\) are dispersion-free—once again, the relevant issue is experimental incompatibility, not probabilities.

3.3 B′ is not superfluous

Considering what we have said in Sects. 3.1 and 3.2, let us now take a look at the main points of controversy. As we saw, Mermin and Schack [6] claim that the reassessment of von Neumann’s theorem that Bub and Dieks propose crucially relies on their alleged defense of B′ as a definitional principle. This definitional-analytic character, Mermin and Schack affirm, is what Bub and Dieks invoke to shield B′ from revision, turning thus to I and II as the crucial premises for the no-go result. Given this reading of Bub’s and Dieks’ stance, Mermin and Schack argue that B′ is superfluous, for I and II are allegedly enough to define quantities like \(\mathcal{R}+\mathcal{S}\) despite experimental incompatibility—we just need to invoke II, they claim, to define the quantity \(\mathcal{R}+\mathcal{S}\) in terms of the operator \(R+S\).

I think there is a misunderstanding here, for Bub and Dieks do not present B′ as an untouchable assumption. They are both clear and explicit that, in a viable hidden variables theory, expectation values for dispersion-free states and incompatible quantities cannot be linear-additive. Their main point, however, is that the viability of the theory requires in the first place that its beables are not represented by Hermitian operators, thus violating I and II. Dieks comments on the relevance of von Neumann’s theorem for hidden variables theories in the following terms. After stating that the theorem establishes that hidden variables theories cannot be Hilbert space theories, he adds that “hidden values of at least some physical quantities will not obey the same relations as the corresponding quantum observables: as already pointed out above, if two such quantum observables add up to a third one, their hidden values will generally not add up in the same way” [5, p. 141].

On the other hand, Bub [4, pp. 1338–1339] comments on the example of Bohmian mechanics. The beables in this theory are represented by functions of position and momentum, not by Hermitian operators. This is of course an illustration of Bub’s reading of the relevance of von Neumann’s theorem for hidden variables theories: in Bohmian mechanics I and II do not hold (for a treatment of the role that Hermitian operators play in the Bohm theory, see [15] and [16]). Besides, Bub also states, quite correctly, that in Bohm’s theory expectation values for dispersion-free states are not additive, and the fact that they are additive for dispersive states is explained by the dynamics of measurements.

In other words, Bub and Dieks do not claim that the hidden variables theories that von Neumann’s theorem rules out violate I and II, but respect B′. Rather, they are clear in that B′ does not hold for expectation values of hypothetical dispersion-free states in viable hidden variables theories, but the very viability of such theories also requires that I and II are violated anyway. In short, they read von Neumann’s theorem as establishing that viable hidden variables theories must indeed break B′ for dispersion-free states, but it is the concomitant violation of I and II that determines the interesting result that in such theories beables cannot be represented by Hermitian operators. Despite this misreading, though, I think that Mermin and Schack’s charge must be carefully scrutinized. First, Bub and Dieks do not discuss this point. Second, and more importantly, the argument that the alleged superfluousness of B′ as a definitional principle ruins the whole proof is a novel standpoint, and deserves to be critically assessed.

To do so, let us consider a putative theory in which B′ does not hold for hypothetical dispersion-free states, and in which Hermitian operators are the formal tools to mathematically represent the theory’s beables (see Footnote 4). Let \(\mathcal{R}\) and \(\mathcal{S}\) be two experimentally incompatible quantities, represented, respectively, by the non-commuting operators \(R\) and \(S\). Let us now consider the quantity \(\mathcal{Q}\), represented by the operator \(Q=R+S\). For dispersion-free states, the expectation values \(Exp\left(\mathcal{R}\right)\), \(Exp\left(\mathcal{S}\right)\), and \(Exp\left(\mathcal{Q}\right)\) must be eigenvalues of \(R\), \(S\), and \(Q\), respectively. But since the eigenvalues of \(Q\) are not, in general, sums of the eigenvalues of \(R\) and \(S\), \(Exp\left( {\mathcal{Q}} \right) \ne Exp\left( {\mathcal{R}} \right) + Exp\left( {\mathcal{S}} \right)\). But then, how could we affirm that the operator \(Q\) represents the quantity that is the sum of quantities \({\mathcal{R}}\) and \({\mathcal{S}}\)? Given experimental incompatibility, we cannot measure \({\mathcal{R}}\) and \({\mathcal{S}}\) simultaneously nor successively and then add the results obtained, and B′ cannot be invoked, of course. Furthermore, the experimental methods to measure the quantity represented by \(Q\) are entirely different from the methods to measure the quantities \({\mathcal{R}}\) and \({\mathcal{S}}\) (see Footnote 5). In what sense could we state that \({\mathcal{Q}} = {\mathcal{R}} + {\mathcal{S}}\)?
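The claim about eigenvalues can be checked on a concrete pair of non-commuting operators. The sketch below is only an illustration and its choice of operators is an assumption, not something taken from the text: it uses the Pauli matrices \(R=\sigma_x\) and \(S=\sigma_z\), and a hypothetical helper `eigenvalues_sym2` based on the closed-form eigenvalues of a real symmetric 2×2 matrix.

```python
import math


def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], ascending."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return (mean - radius, mean + radius)


# Illustrative choice (an assumption of this sketch): R = sigma_x, S = sigma_z.
eig_r = eigenvalues_sym2(0, 1, 0)    # sigma_x = [[0, 1], [1, 0]]
eig_s = eigenvalues_sym2(1, 0, -1)   # sigma_z = [[1, 0], [0, -1]]
eig_q = eigenvalues_sym2(1, 1, -1)   # Q = R + S = [[1, 1], [1, -1]]

# Every value obtainable by adding an eigenvalue of R to an eigenvalue of S.
possible_sums = sorted({r + s for r in eig_r for s in eig_s})

print("eigenvalues of R:", eig_r)
print("eigenvalues of S:", eig_s)
print("eigenvalues of Q:", eig_q)
print("sums of eigenvalues of R and S:", possible_sums)
```

In this example the eigenvalues of \(Q\) are \(\pm\sqrt{2}\), while the eigenvalues of \(R\) and \(S\) are \(\pm 1\), so their possible sums are \(-2\), \(0\) and \(2\). No dispersion-free assignment of eigenvalues to all three operators can therefore be additive.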

We could declare by fiat that the operator \(Q\) represents the quantity that we label “\({\mathcal{R}} + {\mathcal{S}}\)”; this is actually what Mermin and Schack’s proposal amounts to. However, it is clear that this maneuver is not enough to establish that the quantity \({\mathcal{Q}}\) denoted by \(Q\) is the sum of the quantities \({\mathcal{R}}\) and \({\mathcal{S}}\). That is, we have no grounds to assert that \({\mathcal{Q}}\) functionally depends on \({\mathcal{R}}\) and \({\mathcal{S}}\) as their linear sum. It could be replied that in our putative quantum theory we can stipulate that expectation values for dispersive states (that we can obtain by averaging over the hypothetical hidden variables) are indeed linear-additive. However, it is clear that this stipulation cannot be invoked to state that \({\mathcal{Q}}\) is the linear sum of \({\mathcal{R}}\) and \({\mathcal{S}}\), for the additivity of expectation values at the dispersive level would be a fact crying for an explanation precisely because it does not hold at the dispersion-free level: we would be putting the cart before the horse.

In sum, in our scenario of dispersion-free states and experimental incompatibility it holds that \({\mathcal{Q}} \ne {\mathcal{R}} + {\mathcal{S}}\) even though \(Q = R + S\), which means that the rejection of B′ leads to the violation of II: \({\mathcal{R}}\) and \({\mathcal{S}}\) are represented by \(R\) and \(S\), but \({\mathcal{R}} + {\mathcal{S}}\) is not represented by \(R + S\). Actually, without B′, the possibility of dispersion-free states would lead us to the fact that the legitimate quantity \(f\left( {{\mathcal{R}},{\mathcal{S}}} \right) = {\mathcal{R}} + {\mathcal{S}}\) cannot be captured by the Hilbert space formalism. We can conclude then, contra Mermin and Schack, that B′ is not superfluous at all.

These remarks are reinforced when we recall that, as we saw in Sect. 3.1, von Neumann’s theorem is primarily a derivation of the formula for predictions in a quantum theory in Hilbert space from basic theoretical-probabilistic principles. The rationale for B′ as a premise in the theorem has nothing to do with hidden variables, and it is introduced on the grounds of basic principles in physical reasoning in general, not only in quantum mechanics. Von Neumann’s point is that since (i) given two physical quantities, we can always define the quantity that is their linear sum, and (ii) there are experimentally incompatible quantities; we need B′ to guarantee that the functional relations between the mathematical objects that represent physical quantities mirror the functional connections between the represented quantities—so that in the case of quantum mechanics in Hilbert space I and II hold. As we saw, without B′ (or restricting its validity to dispersive states), we cannot guarantee that the mathematical object \(Q = R + S\) represents the quantity \({\mathcal{R}} + {\mathcal{S}}\), nor that this quantity can be represented at all.

Furthermore, since when von Neumann provides a rationale for B′ he is setting the theoretical-probabilistic principles from which the formula for experimental predictions using Hermitian operators as representatives of physical quantities is to be derived, the possibility of dispersion-free states in the Hilbert space formalism should not be excluded from the outset. Now, it is exactly this preliminary possibility of dispersion-free states in a Hilbert space quantum theory that strongly motivates B′ as a premise in the derivation that von Neumann is attempting. As we just saw, it is precisely the possibility of deterministic states that can lead us to a scenario in which \({\mathcal{Q}} \ne {\mathcal{R}} + {\mathcal{S}}\) even though \(Q = R + S\). Without B′, we cannot guarantee with the rest of our basic theoretical-probabilistic principles that the Hilbert space formalism will be able to grasp all the quantities we can define according to general principles in physical reasoning. We conclude, then, that B′ is not superfluous, let alone silly. In the context of a derivation of the predictive formula in a quantum theory in Hilbert space, it is strongly justified.

Summarizing, von Neumann sets himself the task of deriving the formal recipe for experimental predictions in a quantum theory formulated in Hilbert space, in which physical quantities are represented by Hermitian operators (he considered this representation a lesson to be learned from wave and matrix mechanics; cf. [1], Section III.1). Now, considering that the quantum realm features experimental incompatibility between quantities (reflected in the non-commutativity of their representing operators), in order to guarantee that the Hilbert space formalism is capable of representing all the physical quantities we can define, mirroring the functional relations among those quantities, we need to assume B′ as a premise in the derivation of the recipe. That is, without B′, hypothetical dispersion-free states would lead us to a scenario in which legitimate physical quantities cannot be represented in the formalism. Anyhow, when the formula is finally derived in the form of the trace rule (see Appendix 1), it turns out that the no dispersion-free states corollary (see Appendix 2) and the pure states-projectors corollary (see Appendix 3) show that the quantum formalism in Hilbert space does not admit a completion in terms of hidden variables. These remarks show that B′ is neither silly nor superfluous, and that the reevaluation of the theorem proposed by Bub and Dieks is correct: the proof shows that beables in viable hidden variables theories cannot be represented by Hermitian operators, so such theories cannot be Hilbert space theories.

3.4 What did von Neumann really believe?

A second point of controversy concerns von Neumann’s own understanding of the relevance and scope of his theorem. Hermann [3] and Bell [2] claim that he read it as an absolute proof of the impossibility of hidden variables. Bub [4] and Dieks [5] claim that he clearly understood that he had proved only that Hilbert space hidden variables theories are not possible. I think that there is not enough textual evidence to settle this controversy for good, but I will propose a third “middle-way” alternative that at least allows us to make sense of all of von Neumann’s statements concerning the significance of his proof.

The main support for the first reading comes from the passage in which von Neumann states that “the present system of quantum mechanics would have to be objectively false for a description other than the statistical description of elementary processes to be possible” [1, p. 212].

The textual evidence for the second reading is given mainly by two passages. The first one states that:

In the analysis of fundamental questions, it will be shown how the statistical formulas of quantum mechanics can be derived from a few qualitative, basic assumptions. Furthermore, there will be a detailed discussion of the problem as to whether it is possible to trace the statistical character of quantum mechanics to an ambiguity (i.e., incompleteness) in our description of nature […]. This explanation “by hidden parameters” […] has been proposed more than once. However, it will appear that this can scarcely succeed in a satisfactory way, or more precisely, such an explanation is incompatible with certain qualitative fundamental postulates of quantum mechanics. [1, pp. 2–3, my emphasis]

This passage is very telling in that in the 1932 book von Neumann still understood his theorem primarily as a derivation of the trace rule from basic principles in Hilbert space, including the representation of quantities by Hermitian operators (I and II)—just like in the 1927 Aufbau paper. As we have seen, in 1932 von Neumann further showed by means of the two corollaries that the derived statistical formula does not admit a completion in terms of hidden variables. The “qualitative fundamental postulates of quantum mechanics” which are incompatible with the introduction of hidden variables are of course postulates that concern a Hilbert space formulation of quantum mechanics. Thus, I think it is more than reasonable to assert that von Neumann was quite clear in that his theorem was a proof of the impossibility of introducing hidden variables in a Hilbert space quantum theory.

A second passage that suggests this reading is the following:

Whether or not an explanation of this type, by means of hidden parameters, is possible for quantum mechanics is a much discussed question. The view that it will sometime be answered in the affirmative has at present some prominent representatives. If it were correct, it would brand the present form of the theory provisional, since then the description of states would be essentially incomplete. We shall show later (IV.2) that an introduction of hidden parameters is certainly not possible without a basic change in the present theory. [1, p. 136, my emphasis]

Here von Neumann is explicit in that the introduction of hidden variables would force us to a significant change in the present theory, i.e., a significant change in quantum theory as formulated in Hilbert space—he certainly does not claim that their impossibility is absolute. Again, it is clear that von Neumann never lost sight of the fact that the theorem is first and foremost a derivation of the statistical formula of quantum mechanics in Hilbert space. Thus, it is highly reasonable to believe that he was clear in that the scope of the no-go result for hidden variables that follows in the form of the two corollaries is restricted to the Hilbert space formalism of quantum theory—both quoted passages strongly support this view.

This interpretation of von Neumann’s stance also allows us to make sense of the passage in section IV.2 of the book in which he comments on the relevance of the theorem there presented. As we saw in Sect. 2 above, there he says that “we have even established that it is impossible for the same physical quantities to exist with the same functional connections (i.e., for I and II to hold) if other variables (i.e., “hidden parameters”) exist in addition to the wave functions” [1, p. 211]. Again, in this passage the impossibility of hidden variables is characterized as conditional, not as absolute. That is, von Neumann seems to be quite clear in that hidden variables are possible, but that their introduction would lead to a significant alteration in the formal structure of the theory: it could not be a theory in Hilbert space.

But how do we make sense of the statement that “the present system of quantum mechanics would have to be objectively false for a description other than the statistical description of elementary processes to be possible” [1, p. 212]? The reading proposed by Bub [4] and Dieks [5] is that by “objectively false” von Neumann refers only to the validity of I and II, not to empirical adequacy. I think this interpretation is too generous. The text is clear in that von Neumann is referring to quantum mechanics itself as that which would have to be objectively false if hidden variables are introduced (see Footnote 6).

But we can make sense of this statement in the light of what we said in Sect. 3.1. A derivation of the formula for the predictions of quantum mechanics from basic probabilistic-theoretical principles in the formalism of Hilbert space leads us to the trace rule, which does not admit dispersion-free states but admits homogeneous ones. Based on this result, von Neumann may have concluded that if hidden variables are taken on board in a quantum theory, the corresponding formal recipe for empirical predictions could not be the trace rule, because it does not admit hidden variables. Thus, the predictions we obtain with the introduction of hidden parameters would diverge from the predictions of quantum theory in Hilbert space. If we add that von Neumann was explicit in that a strongly grounded lesson that we should learn from matrix and wave mechanics is that physical quantities should be represented by Hermitian operators, he may have concluded that the empirical confirmation of the predictions of Hilbert space quantum mechanics strongly suggests that hidden variables theories, despite being possible, can be discarded from the outset. Von Neumann might have concluded that theories with hidden variables are condemned to predictive failure—this reading naturally explains his “objectively false” claim.

This reading, I think, allows us to make sense also of von Neumann’s statement that “nor would it help if there existed other, as yet undiscovered, physical quantities in addition to those represented by the operators in quantum mechanics because the relations assumed by quantum mechanics (i.e., I and II) would have to fail already for the by-now-known quantities” [1, p. 211]. That is, even if the introduction of hidden variables would lead to the introduction of further quantities not considered in the Hilbert space formalism, in the hidden variables theory I and II would not hold, so the trace rule would not be the recipe for predictions, and this would result in predictive divergence between the theories. But since we already know that the predictions we obtain from the trace rule for physical quantities represented by Hermitian operators are correct, we could conclude in advance that a hidden variables theory would be an empirical failure.

In other words, I agree with Bub [4] and Dieks [5] in that von Neumann was clear in that the scope of his no-go result was restricted to hidden variables in Hilbert space, so that he did not understand his theorem as an absolute impossibility proof—he was aware that consistent hidden variables theories are possible. However, I think that the way in which he understood the importance of his result went (wrongly) much further than the reading that Bub and Dieks propose. Since he was committed to the view that in quantum theory physical quantities should be represented by Hermitian operators, he concluded that a theory in which this principle is not respected would lead to empirical divergences with respect to Hilbert space quantum theory, so that the predictive success of the latter discards hidden variables theories from the outset—not for being impossible, but for being objectively false.

This conclusion is of course wrong, but it is not silly. Von Neumann did not foresee the possibility of a hidden variables theory in which Hermitian operators do not represent physical quantities (beables), but in which these operators still play an operational role of representing experimental outcomes that is consistent with the trace rule. Bohm’s theory is of course such a theory, and Bohm himself was quite clear about the role that Hermitian operators play in it:

the measurement of an “observable” is not really a measurement of any physical property belonging to the observed system alone. Instead, the value of an “observable” measures only an incompletely predictable and controllable potentiality belonging just as much to the measuring apparatus as to the observed system itself. [17, p. 183] (see Footnote 7)

A hidden variables theory like this does not fall under the scope of von Neumann’s theorem, precisely because its beables are not represented by Hermitian operators. However, since in this theory Hermitian operators play the operational role of mapping the interaction between system and apparatus to numerical (eigen)values according to the trace rule (see [15] and [16]), there is no predictive divergence with respect to Hilbert space quantum mechanics. Von Neumann was thus wrong in that his theorem yields the lesson that hidden variables theories are doomed to empirical failure. But I would not call such a conclusion silly—that he was unable to envision a theory in which Hermitian operators do not represent physical quantities whereas they still yield predictions according to the trace rule is not a mortal sin.

3.5 Hermann, 1933 (and Jammer, 1974)

A third point of controversy consists in Hermann’s understanding of the theorem. As we saw, Dieks [5] states that the circularity claim that she leveled relies on a confusion. She claimed that von Neumann offered as a proof of B′ that expectation values given by the trace rule are always additive. However, since the trace rule does not admit dispersion-free states, Hermann claims, that proof begs the question. After our examination of the rationale that von Neumann provided for B′, it is quite clear that he never intended (or needed) to offer a proof of B′. Dieks [5] is certainly right in that this charge of circularity rests on a misunderstanding.
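The additivity that Hermann has in mind can be spelled out in one line. The following is only a sketch, written in the notation of the trace rule \({\langle R\rangle }_{U}=\mathrm{Tr}(UR)\) and assuming II for the first equality, so that \(\mathcal{R}+\mathcal{S}\) is represented by \(R+S\); the remaining steps use nothing but the linearity of the trace.

```latex
\begin{align*}
Exp(\mathcal{R}+\mathcal{S})
  &= \mathrm{Tr}\big(U(R+S)\big) \\
  &= \mathrm{Tr}(UR) + \mathrm{Tr}(US) \\
  &= Exp(\mathcal{R}) + Exp(\mathcal{S}).
\end{align*}
```

The equalities hold for every density operator \(U\), whether or not \(R\) and \(S\) commute. This is why, once the trace rule is in place, additivity is automatic, and why it could not serve as an independent proof of B′.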

Mermin and Schack [6] make the further point that Hermann allegedly stated that B′ is superfluous as a principle that allows us to define quantities which are the linear sum of experimentally incompatible quantities. As we saw, they quote a passage where she writes that “only by the detour over certain mathematical operators assigned to these quantities does the formalism introduce the concept of a sum also for such quantities” [10, 3, p. 252] as the sole evidence supporting that Hermann allegedly shared their view that II is enough to define the mentioned type of quantities.

Although it is likely that in this passage Hermann is referring to I and II (for she talks about operators), I find Mermin and Schack’s interpretation unconvincing. First, this passage appears in the context of an explanation of B′, and it is quite clear that Hermann understands that the justification for this assumption that von Neumann provides has to do with the representation of quantities like \({\mathcal{R}} + {\mathcal{S}}\) when \({\mathcal{R}}\) and \({\mathcal{S}}\) are experimentally incompatible. Thus, we can understand the passage as a statement that the detour over mathematical operators that allows us to represent such quantities relies on B′. In other words, taken in isolation, the sentence that Mermin and Schack quote could be read in the way they propose. However, when considered in context, I think it is clear that Hermann is simply explaining why B′ is required as a definitional principle, and not (yet) criticizing it:

For classical physics this assumption [B′] is trivial. So, too, it is for those quantum mechanical features that do not mutually limit each other’s measurability, thus between which there are no uncertainty relations. Because for two such quantities, the value of their sum is nothing other than the sum of the values that each of them separately takes, from which follows immediately the same relation for the mean values of these magnitudes. The relation is, however, not self-evident for quantum mechanical quantities between which uncertainty relations hold, and in fact for the reason that the sum of two such quantities is not immediately defined at all: since a sharp measurement of one of them excludes that of the other, so that the two quantities cannot simultaneously assume sharp values, the usual definition of the sum of two quantities is not applicable. Only by the detour over certain mathematical operators assigned to these quantities does the formalism introduce the concept of a sum also for such quantities. [3, p. 252]

Second, and more importantly, the sentence Mermin and Schack quote is the only passage where Hermann refers to the challenge of representing the type of quantities under discussion, and she says nothing explicit about a criticism in terms of the superfluousness of B′. If that were the point on which she based her alleged rebuttal of von Neumann’s theorem, one would expect her to have been clear and categorical about it. However, what we find after the paragraph just quoted is her criticism based on the circularity claim, which is clearly the criticism of the theorem she presents.

Thus, in this controversy I think we must side with Dieks. To complement and strengthen Dieks’ reading of Hermann’s understanding of von Neumann’s theorem, I can add that in a recently discovered essay written in 1933, entitled Determinism and Quantum Mechanics [10], Footnote 8 she presented an assessment of the proof in which it is also clear that, despite the circularity confusion, her understanding of the restricted scope of the no-go result for hidden variables is correct. The structure of the earlier essay is much the same as that of her 1935 work. Hermann first mentions the challenge of representing a quantity like \({\mathcal{R}} + {\mathcal{S}}\) when \({\mathcal{R}}\) and \({\mathcal{S}}\) are experimentally incompatible: “the sum \(R + S\) in this case can be defined only indirectly as the quantity corresponding to the sum \(r + s\) of the operators belonging to \(R\) and \(S\)” [10, p. 233]. Footnote 9 The indirect mode of definition that Hermann refers to is of course B′, and now the passage about the detour over mathematical operators is absent, although the criticism of the theorem she presents in the 1933 essay is the same as in the 1935 paper: the circularity claim. This reinforces the view that she was not reading von Neumann in the way Mermin and Schack do.

After these preliminary remarks, Hermann introduces the begging-the-question complaint. Since a direct definition of \({\mathcal{R}} + {\mathcal{S}}\) is not possible given the incompatibility between \({\mathcal{R}}\) and \({\mathcal{S}}\), so that we must appeal to assumption B′ to define it indirectly, she states that with respect to this assumption “Neumann thus needs another proof for quantum mechanics” [10, p. 233], and she claims that the proof he allegedly offers is that expectation values calculated in terms of the ensembles allowed by the trace rule are always additive. Despite this confused criticism, she then writes that:

for these ensembles—but only for these—has Neumann proved the inevitability of dispersion. The question, however, was whether at some time in the progress of research a new hitherto unknown physical trait might not be found through which the dispersion (for at least some physical quantities) could be reduced beneath the scale fixed by the uncertainty relations. Such a discovery, which would provide a cue for predicting the result of eigenvalue measurements of these quantities, is not excluded by Neumann’s proof. [10, p. 234, my emphasis]

It is clear in this passage that Hermann correctly sees von Neumann’s theorem as a proof that the states allowed by the Hilbert space formulation of quantum mechanics cannot be dispersion-free, so that Hilbert space hidden variables theories are not possible. She is also explicit and correct that hidden variables theories with deterministic states are nevertheless possible, and, most notably, she does not fall into von Neumann’s mistake of holding that theories of this type are empirically doomed. That is, despite her confusion in the circularity critique, her understanding of the scope and relevance of the theorem is spot on.

Interestingly, Jammer [7] offers an evaluation of von Neumann’s theorem that is quite consistent with Hermann’s, Bub’s, and Dieks’ understanding. After rebutting Hermann’s accusation of begging the question, he states that:

we agree with Grete Hermann’s criticism that the proof did not achieve its declared objective of demonstrating that quantum mechanical ensembles cannot be decomposed into any kind of dispersion-free sub-ensembles […]. But we do not dismiss the proof as nugatory. True, in view of von Neumann’s excessively restricted assumptions it is not an impossibility proof of any conceivable class of hidden variables, but it is a completeness proof, in this respect, of von Neumann’s axiomatics (with the inclusion of postulate [B′]), since it shows that this formalism does not admit non-quantum mechanical [dispersion-free] ensembles. [7, p. 274, fn. 45]

Jammer presents this reading of the theorem as if it had escaped Hermann’s eye (he also wrongly believes, just like Hermann, that von Neumann regarded his result as an absolute impossibility proof). However, he did not have access to Hermann’s 1933 essay, which, as we saw, contains a passage in which she presents an assessment of von Neumann’s proof that fully coincides with Jammer’s. Footnote 10

It is a pity that von Neumann’s theorem has been understood mostly as a proof of the impossibility of hidden variables in quantum theory, rather than as a derivation of the trace rule from probabilistic-theoretical principles expressed in the formalism of Hilbert space—which is of course the way in which von Neumann himself regarded it. Had this been clear in the subsequent discussion in the community, the restricted-to-Hilbert-space scope of the no-go result for hidden variables (the two corollaries) would have been, perhaps, conspicuous to everybody. For the same reason, it is also a shame that Hermann’s and Jammer’s correct assessments of the scope and relevance of the theorem did not have a big impact on the philosophy and foundations of quantum mechanics community.

4 Dropping B′: Gleason’s Theorem

In this final section, I will offer an analysis of the connection between von Neumann’s and Gleason’s theorems. It has been noted in the discussion that Gleason derived the trace rule on the basis of uncontroversial premises weaker than von Neumann’s. However, there remain very important points to be made about this connection. First, I will characterize precisely the worry about assumption B′ that remains once we properly consider Bell’s criticism and the true goal and relevance of von Neumann’s theorem. Then I will show that the logical structure of von Neumann’s proof is such that the very same restricted-to-Hilbert-space no-go result for hidden variables can be derived from Gleason’s celebrated proof without using B′. Furthermore, I will argue that Gleason’s result vindicates assumption B′, in the sense that the remaining worry I mentioned does not materialize. That is, from Gleason’s proof we see that von Neumann’s stronger assumption was, after all, safe and sound, and so the same holds for the no-go result for hidden variables in Hilbert space.

4.1 With or Without B′: a Trade-Off?

As we saw, Bell’s main point against von Neumann’s theorem was that we should naturally expect that in hidden variables theories expectation values for hypothetical dispersion-free states and experimentally incompatible quantities are not linear-additive. Thus, he argued, it would be silly to impose assumption B′ on each and every hypothetical hidden variables theory. We also saw, however, that this is not what von Neumann did. Rather, he introduced this assumption in order to guarantee that all physical quantities can be represented by Hermitian operators in Hilbert space, mirroring the functional connections between such quantities. Since von Neumann was attempting a derivation of the formula that yields the predictions of quantum theory in Hilbert space from basic principles, he was certainly justified in assuming B′. After he succeeded in the derivation, it turned out that the formula obtained does not admit hidden variables.

In short, Bell was certainly right in his remark about hidden variables theories and additivity of expectation values in general, but von Neumann’s proof teaches us that viable hidden variables theories in which B′, most naturally, does not hold for dispersion-free states, cannot be Hilbert space theories. But let us forget this lesson for a second. That is, let us pretend that we do not have a derivation of the predictive formula of quantum mechanics in Hilbert space, while retaining Bell’s advice about hypothetical hidden variables theories. If we set ourselves the task of deriving that formula while heeding Bell’s advice, the possibility of hidden variables and dispersion-free states (a possibility that we should not deny in advance, of course) would lead us to avoid B′ as a premise in our derivation.

However, after what we said in Sect. 3.3, we know that if we leave B′ aside in our attempt to derive the predictive formula of quantum theory in Hilbert space, we run the risk that hypothetical hidden variables and dispersion-free states might leave us in a scenario in which legitimate quantities such as \(f\left( {{\mathcal{R}},{\mathcal{S}}} \right)\) cannot be mathematically represented in the formalism if \({\mathcal{R}}\) and \({\mathcal{S}}\) are experimentally incompatible. Thus, there seems to be a trade-off between assuming and not assuming B′ in our task. If we take it on board, we avoid the mentioned risk, but we would exclude the most interesting class of hypothetical hidden variables theories: those that deny B′ for incompatible quantities. But if we reject B′ for this reason, we would be running the ‘hidden-quantities’ risk.

With this in mind, we can now formulate two interesting and important questions. Can we derive the predictive formula of quantum mechanics in Hilbert space without B′? If the answer is yes, we can then ask: what becomes of the possibility of hidden variables and dispersion-free states in Hilbert space in the light of that derivation?

4.2 Gleason, 1957

The celebrated theorem introduced by Andrew Gleason [20] provides an affirmative answer to the first question. He proved that in an \(n\)-dimensional Hilbert space \({\mathcal{H}}\), with \(n \ge 3\), every probability measure of a subspace \(A\) of \({\mathcal{H}}\) is given by \(\mu_{A} = {\text{Tr}}\left( {UP_{A} } \right)\), where \(P_{A}\) is the projector onto \(A\), and \(U\) is a density operator. The crucial premises in Gleason’s proof are that \(\langle I \rangle = 1\), where \(I\) is the identity operator, and that for mutually orthogonal projectors \(P_{i}\) such that \(\sum P_{i} = I\), it holds that \(\langle \sum P_{i} \rangle = \sum \langle P_{i} \rangle\). As is clear, the second premise amounts to the linear-additivity of expectation values for quantities represented by the mutually commuting Hermitian operators \(P_{i}\), which is weaker than von Neumann’s B′. Notice that Gleason’s second premise is therefore not enough to guarantee that quantities like \({\mathcal{R}} + {\mathcal{S}}\), when \({\mathcal{R}}\) and \({\mathcal{S}}\) are experimentally incompatible, can be mathematically represented in the formalism. Thus, Gleason’s premises seem to run the ‘hidden-quantities’ risk for hypothetical hidden variables and dispersion-free states that we described in the previous section.
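To make the two premises concrete, the following is a minimal numerical sketch, not part of Gleason’s proof. It checks only the easy converse direction: a measure of the form \({\text{Tr}}\left( {UP} \right)\) satisfies normalization and additivity over mutually orthogonal projectors that sum to the identity. The density operator `U`, the basis, and all function names are hypothetical choices made for illustration, and a real three-dimensional space is used to keep the arithmetic simple.

```python
import math

def projector(v):
    """Projector onto the ray spanned by the unit vector v (real 3-vector)."""
    return [[a * b for b in v] for a in v]

def trace_rule(u, p):
    """mu = Tr(U P) for square matrices given as nested lists."""
    n = len(u)
    return sum(u[i][k] * p[k][i] for i in range(n) for k in range(n))

def mix(weights, states):
    """Convex combination of the projectors onto the given unit vectors."""
    projs = [projector(s) for s in states]
    return [[sum(w * p[i][j] for w, p in zip(weights, projs)) for j in range(3)]
            for i in range(3)]

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# Hypothetical density operator: a mixture of two non-orthogonal pure states.
U = mix([0.6, 0.4], [[1.0, 0.0, 0.0], [1 / r2, 1 / r2, 0.0]])

# An orthonormal basis; its projectors are mutually orthogonal and sum to I.
basis = [
    [1 / r3, 1 / r3, 1 / r3],
    [1 / r2, -1 / r2, 0.0],
    [1 / r6, 1 / r6, -2 / r6],
]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

probs = [trace_rule(U, projector(e)) for e in basis]
total = sum(probs)                # additivity: should equal <I>
norm = trace_rule(U, identity)    # normalization: <I> = 1
print(probs, total, norm)
```

The hard content of Gleason’s theorem is the opposite direction, namely that every measure satisfying these two premises must have this form when \(n \ge 3\); the sketch does not address that.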

Bub [4, p. 1336] and Mermin and Schack [6, p. 1011] notice that Gleason’s theorem, just like von Neumann’s, is a derivation of the trace rule in Hilbert space, but this time on the basis of weaker premises that do not include B′. However, they mention this fact only as a side remark, and they do not discuss its relevance for the possibility of hidden variables. But there are very important things to say about this. Let us recall the logical structure of von Neumann’s proof. First he derived the trace rule from assumptions A′, B′, I and II (see Appendix 1). Then he obtained the corollary that the trace rule does not admit dispersion-free states (see Appendix 2), and a second corollary that a state is homogeneous iff its density operator is a projector onto a unit vector (see Appendix 3). Now, the proofs of the corollaries do not make use of the premises from which the trace rule is derived; only basic properties of (projectors in) Hilbert space and the continuity of the trace rule are invoked. This means that both these results can be obtained as corollaries of Gleason’s theorem too, but this time conditional on weaker premises (and Hilbert space dimensionality \(n \ge 3\)), not conditional on B′.
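The first corollary can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not von Neumann’s argument from Appendix 2: it takes the trace rule as given and samples the dispersion \({\text{Tr}}\left( {UP} \right) - {\text{Tr}}\left( {UP} \right)^{2}\) of one-dimensional projectors along a continuous family of directions. The two states and the function names are hypothetical, and a real two-dimensional space is used because the corollary concerns the trace rule itself rather than Gleason’s dimensional restriction.

```python
import math

def expectation(u, phi):
    """<phi|U|phi> = Tr(U P_phi) for a real 2x2 density matrix u and unit vector phi."""
    return sum(phi[i] * u[i][j] * phi[j] for i in range(2) for j in range(2))

def dispersion(u, phi):
    """Variance of the projector P_phi in state u, using P_phi^2 = P_phi."""
    p = expectation(u, phi)
    return p - p * p

def max_dispersion(u, steps=360):
    """Largest dispersion over sampled directions phi = (cos t, sin t), t in [0, pi)."""
    angles = (math.pi * k / steps for k in range(steps))
    return max(dispersion(u, (math.cos(t), math.sin(t))) for t in angles)

# Hypothetical states: one pure (homogeneous) and one mixed.
pure = [[1.0, 0.0], [0.0, 0.0]]
mixed = [[0.7, 0.0], [0.0, 0.3]]

max_pure = max_dispersion(pure)
max_mixed = max_dispersion(mixed)
print(max_pure, max_mixed)
```

A dispersion-free state would need zero dispersion for every projector. Since the expectation varies continuously with the direction and takes different values on orthogonal directions, it must pass through intermediate values, which is why even the pure state shows nonzero dispersion for some projectors.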

Furthermore, if we inspect Gleason’s weaker premises, it is clear that, just like von Neumann’s I and II, they amount to the assumption that physical quantities are represented by Hermitian operators. Thus, we can also conclude from Gleason’s proof that hidden variables theories with dispersion-free states in which Hermitian operators are the representatives of the theory’s beables are not possible. In other words, the lesson for hidden variables theories that we learn from Gleason’s theorem is the same lesson we learn from von Neumann’s: such theories cannot be Hilbert space theories. If we put aside his daring view that hidden variables theories are condemned to empirical failure, it turns out that von Neumann was right after all (the Hermann-Bell objection cannot be leveled against Gleason’s proof). Footnote 11

On the other hand, the fact that we can prove this no-go result from Gleason’s theorem implies that the ‘hidden-quantities’ risk of not assuming B′ does not materialize. The trace rule does not admit dispersion-free states, so its linearity entails that expectation values for all allowed states and quantities in Hilbert space are additive. That is, B′ holds as a result in Gleason’s theorem, and it guarantees that Hilbert space is able to capture and mathematically represent all the quantities of the quantum theory formulated within its boundaries. More concretely, the no-go result for Hilbert space hidden variables theories in Gleason’s proof means that we can always define a quantity of the type \({\mathcal{R}} + {\mathcal{S}}\) when \({\mathcal{R}}\) and \({\mathcal{S}}\) are incompatible, and represent it with the Hermitian operator \(R + S\), where \(R\) and \(S\) are non-commuting. In short, from Gleason’s result we learn that B′ (and also II) holds in quantum theory in Hilbert space precisely because hidden variables are not possible in this formalism. Thus, despite the seeming trade-off and risk we explained in the previous section, after Gleason’s theorem we see that von Neumann is fully vindicated in assuming B′ as a premise in his proof. Although it is true that Gleason’s theorem is more powerful in the sense that its premises are weaker, the vindication of B′ just explained lends von Neumann’s proof significant appeal. The mathematical complexity of Gleason’s celebrated theorem is well known, so von Neumann’s elegant and much simpler proof of the same remarkable result (see Appendix 1) has a pragmatic and pedagogical value that has been, most unfortunately, wasted.
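The contrast between additivity of trace-rule expectation values and non-additivity of individual values can be seen in a small example. The sketch below is an illustration under stated assumptions: it uses the Pauli matrices \(\sigma_{x}\) and \(\sigma_{z}\) as the non-commuting operators \(R\) and \(S\) (the kind of spin example commonly associated with Bell’s criticism), a hypothetical density operator `U`, and hypothetical helper names.

```python
import math

def trace_rule(u, a):
    """Expectation value Tr(U A) for real 2x2 matrices."""
    return sum(u[i][k] * a[k][i] for i in range(2) for k in range(2))

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def eigenvalues_sym(a):
    """Eigenvalues of a real symmetric 2x2 matrix, in ascending order."""
    mean = (a[0][0] + a[1][1]) / 2
    radius = math.hypot((a[0][0] - a[1][1]) / 2, a[0][1])
    return (mean - radius, mean + radius)

R = [[0.0, 1.0], [1.0, 0.0]]    # sigma_x
S = [[1.0, 0.0], [0.0, -1.0]]   # sigma_z, does not commute with sigma_x
U = [[0.8, 0.3], [0.3, 0.2]]    # hypothetical density operator (trace 1, positive)

# Additivity of expectation values under the trace rule (B' as a result).
lhs = trace_rule(U, add(R, S))
rhs = trace_rule(U, R) + trace_rule(U, S)

# Individual values are not additive: eigenvalues of R + S versus sums of eigenvalues.
eig_sum_op = eigenvalues_sym(add(R, S))
pairwise = {r + s for r in eigenvalues_sym(R) for s in eigenvalues_sym(S)}
print(lhs, rhs, eig_sum_op, sorted(pairwise))
```

The operator \(R + S\) has eigenvalues \(\pm \sqrt 2\), while sums of eigenvalues of \(R\) and \(S\) can only be \(-2\), \(0\), or \(2\). A dispersion-free state assigning eigenvalues to all three operators therefore could not be additive, whereas every state allowed by the trace rule is.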