1 Introduction

The debate regarding the interpretation of probability theory remains open in the literature [3], and these interpretational problems become particularly pressing when the probabilities arising in quantum phenomena are considered [37]. The statistical nature of Quantum Theory has posed intriguing questions since its beginnings. This was expressed, for example, by R. P. Feynman [51], who stressed the radical changes needed in the methods for computing probabilities:

I should say, that in spite of the implication of the title of this talk, the concept of probability is not altered in quantum mechanics. When I say the probability of a certain outcome of an experiment is p, I mean the conventional thing, that is, if the experiment is repeated many times one expects that the fraction of those which give the outcome in question is roughly p. I will not be at all concerned with analyzing or defining this concept in more detail, for no departure from the concept used in classical statistics is required. What is changed, and changed radically, is the method of calculating probabilities.

The sum rule of probability amplitudes, which gives rise to interference terms, was quickly recognized as a non-classical feature [5]. Later, it was discovered that this is strongly related to the nonexistence of joint distributions for noncommuting observables. These peculiarities and the formal aspects of the probabilities involved in quantum theory have been extensively studied in the literature [6,7,8,9,10,11,12].

One of the most important axiomatizations of probability theory is due to Kolmogorov [13]. In his approach, probabilities are considered as measures defined over a Boolean sigma-algebra of a sample space, i.e., as positive maps defined on certain subsets of a given set. Interestingly enough, states of classical statistical theories can be described using Kolmogorov’s axioms, because they define measures over the sigma-algebra of measurable subsets of phase space. An interesting approach to the statistical character of quantum systems consists in considering quantum states as measures over the non-Boolean structure of projection operators in a Hilbert space [4, 5, 8]. As is well known, projection operators can be used to describe elementary experiments (the analogue of this notion in the classical setting is represented by subsets of phase space). In this way, a comparison between quantum states and classical probabilistic states can be traced on formal and conceptual grounds. The equivalence between this approach and the usual one, based on Born’s rule [26], is provided by Gleason’s theorem [35, 36]. This is the reason why quantum states are termed “non-Boolean” or “non-Kolmogorovian” probability measures [8].

It is important to remark that Kolmogorov’s axioms can be generalized in terms of measures over arbitrary orthomodular lattices (instead of Boolean algebras) [6, 23, 24]. This approach contains quantum and classical statistical models as particular instances [5, 8]. Another way to put this in a more general setting is to consider the set of states of a particular probabilistic model as a convex set [5]. While the state spaces of classical systems are simplexes, non-classical theories can display a more involved geometrical structure. These models go far beyond classical and quantum mechanics, and can be used to describe many different theories (see for example [28, 29] and references therein). We will discuss these notions in Sect. 2 of this work.

The fact that states can be considered as measures over different sets of possible experimental results reveals an essential structural feature of a vast family of physical statistical theories. A statistical model must specify the probabilities of actualization of all possible measurable quantities of the system involved: this feature is common to all models, no matter how different they are. In this paper, we study which ontologies are compatible with the general features of generalized probabilistic models. The fact that generalized models of physical theories can be characterized using very precise mathematical structures should allow us to draw conclusions about their possible interpretations. A study of the ontological constraints imposed by this general structure has not been addressed previously in the literature. As we shall see, the algebraic and geometric features of the event structures defined by these measurable properties impose severe restrictions on the interpretation of the probabilities defined by generalized states. In Sect. 3 we show that a novel approach, based on placing constraints on degree-of-belief functions defined over arbitrary orthomodular lattices [37, 38], is particularly suitable for extending the Bayesian interpretation to arbitrary contextual probabilistic models. We also show that an ontology based on bundles of actual properties poses serious difficulties in most models of interest (especially in all those which are not classical). An approach based on bundles of possible properties is discussed as an alternative [52]. Finally, we draw our conclusions in Sect. 4.

2 Non-Kolmogorovian Probabilistic Models

Suppose that we have a physical system whose states are given by measures which yield definite probabilities for the different outcomes of all possible experiments. From an operational perspective, these probabilities can be understood, in the first instance, in the sense used by Feynman in the quotation of Sect. 1. Then, for an experiment E with discrete outcomes \(\{E_{i}\}_{i=1,\ldots ,n}\), a state \(\nu \) gives us a probability \(p(E_{i},\nu ):=\nu (E_{i})\in [0,1]\) for each possible value of i. The real numbers \(p(E_{i},\nu )\) must satisfy \(\sum ^{n}_{i=1}p(E_{i},\nu )=1\); otherwise, the probabilities would not be normalized. In this way, each state \(\nu \) defines a concrete probability for each possible experiment. A crucial assumption here is that the set of all possible states \(\mathcal {C}\) is convex: this allows us to form new states by mixing old ones [29, 34]. In formulae, if \(\nu _{1}\) and \(\nu _{2}\) are states in \(\mathcal {C}\), then

$$\begin{aligned} \nu =\alpha \nu _{1}+(1-\alpha )\nu _{2} \end{aligned}$$
(1)

belongs to \(\mathcal {C}\) for all \(\alpha \in [0,1]\). We will also assume that \(\mathcal {C}\) is a compact set. This mixing property extends trivially to finite mixtures with more than two elements. Notice that each possible outcome \(E_{i}\) of each possible experiment E induces a linear functional \(E_{i}(\ldots ):\mathcal {C}\longrightarrow [0,1]\), with \(E_{i}(\nu ):=\nu (E_{i})\). Functionals of this form are usually called effects. An experiment is then a collection of effects (functionals) satisfying \(\sum ^{n}_{i=1}E_{i}(\nu )=1\) for all states \(\nu \in \mathcal {C}\). In other words, the functional \(\sum ^{n}_{i=1}E_{i}(\ldots )\) equals the identity functional \(\mathbf 1 \) (which satisfies \(\mathbf 1 (\nu )=1\) for all \(\nu \in \mathcal {C}\)). Any compact convex set \(\mathcal {C}\) can be canonically embedded as a base for the positive cone \(V_{+}(\mathcal {C})\) of a regularly ordered linear space \(V(\mathcal {C})\) (see [30, 31] for details). This means that every element z in \(V_{+}(\mathcal {C})\) can be written as \(z=t\nu \) in a unique way, with \(t\ge 0\) and \(\nu \in \mathcal {C}\).

In this way, any possible experiment that we can perform on the system is described as a collection of effects, represented mathematically by elements of the space \(V^{*}(\mathcal {C})\) of affine functionals. A model represented by a convex set \(\mathcal {C}\) will be said to be finite dimensional if and only if \(V(\mathcal {C})\) is finite dimensional. As in the quantum and classical cases, the extreme points of the convex set of states represent pure states. A minimal numerical sketch of these notions follows.
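To make the framework concrete, here is a minimal sketch in Python (the names `mix` and `effect` are illustrative choices, not part of the formal framework): states of a finite-outcome model are represented by the probabilities they assign, effects by the functionals they determine, and convex mixtures of states are again states.

```python
import numpy as np

# Two states, each given by the probabilities nu(E_i) it assigns
# to the outcomes E_1, E_2, E_3 of a three-outcome experiment.
nu1 = np.array([0.7, 0.2, 0.1])
nu2 = np.array([0.1, 0.3, 0.6])

def mix(alpha, nu_a, nu_b):
    """Convex mixture alpha*nu_a + (1 - alpha)*nu_b, as in Eq. (1)."""
    return alpha * nu_a + (1 - alpha) * nu_b

def effect(i, nu):
    """The effect induced by outcome E_i: E_i(nu) := nu(E_i)."""
    return nu[i]

nu = mix(0.25, nu1, nu2)
# The effects of one experiment sum to the identity functional:
assert np.isclose(sum(effect(i, nu) for i in range(3)), 1.0)
# Effects are affine: they preserve convex mixtures of states.
assert np.isclose(effect(0, nu), 0.25 * effect(0, nu1) + 0.75 * effect(0, nu2))
```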

It is worth stressing the generality of the framework described above: all possible probabilistic models with finitely many outcomes can be described in this way. Furthermore, with suitable definitions, continuous outcomes can also be included in this setting.

A face F of a convex set \(\mathcal {C}\) is a convex subset of \(\mathcal {C}\) with the following property: whenever \(\mu =\alpha \mu _{1}+(1-\alpha )\mu _{2}\) with \(\alpha \in (0,1)\), then \(\mu \in F\) if and only if \(\mu _{1}\in F\) and \(\mu _{2}\in F\). Faces can be interpreted geometrically as subsets that are stable under mixing and purification. Faces are very important for our discussion, because it can be proved that the set of all faces of any convex set forms a lattice. For very important models this lattice is orthomodular, and it can be connected with the approach described below [1, 24]. In particular, the lattice of faces of the convex set of states of a quantum system is isomorphic to the lattice of projection operators in the associated Hilbert space [1, 24]; a numerical illustration of this correspondence is sketched below. A similar result holds for classical statistical models: there, the lattice of faces is Boolean. This means that, at least for the most important models, there exists a strong connection between the geometry of the convex set of states and the propositional algebra associated to the system in question [37, 47]. This connection can be exploited in order to draw conclusions about how to interpret the states of the given model.
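As an illustration of the face-projection correspondence, the following sketch (illustrative code; the particular projection and the sampler are invented for the example) checks numerically that, for a projection P, the set \(F_{P}=\{\rho : \mathrm{tr}(\rho P)=1\}\) of density matrices concentrated on the range of P behaves as a face: mixtures of its members stay inside, while mixing in any state outside \(F_{P}\) leaves it.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_density_matrix(dim, rank):
    """Sample a random density matrix of the given rank."""
    a = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

# P projects onto the first two basis vectors of a 3-dimensional space;
# the corresponding face is F_P = {rho : tr(rho P) = 1}.
P = np.diag([1.0, 1.0, 0.0])

def in_face(rho, tol=1e-9):
    return abs(np.trace(rho @ P).real - 1.0) < tol

# States supported inside ran(P) lie in F_P, and so does any mixture of them:
rho1 = np.zeros((3, 3), dtype=complex); rho1[:2, :2] = random_density_matrix(2, 2)
rho2 = np.zeros((3, 3), dtype=complex); rho2[:2, :2] = random_density_matrix(2, 1)
assert in_face(0.3 * rho1 + 0.7 * rho2)

# Mixing in a component outside F_P takes the mixture out of the face,
# so a mixture lies in F_P only if all of its components do:
rho3 = random_density_matrix(3, 3)        # generic state, tr(rho3 P) < 1
assert not in_face(0.5 * rho1 + 0.5 * rho3)
```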

Birkhoff and von Neumann showed [4] that the empirical propositions associated to a classical system can be naturally organized as a Boolean algebra (i.e., an orthocomplemented distributive lattice [6, 16]). While classical observables are defined as functions over phase space and form a commutative algebra, quantum observables are represented by self-adjoint operators, which fail to commute. Due to this fact, empirical propositions associated to quantum systems are represented by projection operators, which are in one-to-one correspondence with the closed subspaces related to the projective geometry of a Hilbert space [27, 32]. Thus, empirical propositions associated to quantum systems form a non-distributive (and thus non-Boolean) lattice; this failure can be checked directly, as in the sketch below.
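The failure of distributivity appears already for a qubit. In the following sketch (illustrative code; the helper functions are our own), a, b, c are projections onto three distinct lines of a two-dimensional space, so \(b\vee c\) is the whole space and \(a\wedge b=a\wedge c=\mathbf 0 \); hence \(a\wedge (b\vee c)=a\) while \((a\wedge b)\vee (a\wedge c)=\mathbf 0 \).

```python
import numpy as np

def proj_onto(columns):
    """Orthogonal projection onto the column space of a matrix (via SVD)."""
    u, s, _ = np.linalg.svd(columns)
    u = u[:, : int(np.sum(s > 1e-10))]
    return u @ u.T

def line(v):
    """Projection onto the line spanned by the vector v."""
    return proj_onto(np.array(v, dtype=float).reshape(-1, 1))

def join(p, q):
    """p v q: projection onto the span of both ranges."""
    return proj_onto(np.hstack([p, q]))

def meet(p, q):
    """p ^ q, computed via De Morgan: (p' v q')' with p' = 1 - p."""
    one = np.eye(p.shape[0])
    return one - join(one - p, one - q)

a, b, c = line([1, 0]), line([0, 1]), line([1, 1])

lhs = meet(a, join(b, c))             # a ^ (b v c) = a ^ 1 = a
rhs = join(meet(a, b), meet(a, c))    # (a ^ b) v (a ^ c) = 0 v 0 = 0
print(np.allclose(lhs, a), np.allclose(rhs, 0))   # True True: not distributive
```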

An important example of a classical probabilistic model is provided by a point particle moving in space-time, whose states are described by probability functions over its phase space \(\varGamma \). Suppose that A represents an observable quantity (i.e., a function defined over the phase space). Then the proposition “the value of A lies in the interval \(\varDelta \)” defines a testable proposition, which we denote by \(A_{\varDelta }\). The proposal of [4] is to associate \(A_{\varDelta }\) with the measurable set \(A^{-1}(\varDelta )\), the set of all phase-space points that make the proposition true. If the probabilistic state of the system is given by \(\mu \), the corresponding probability of occurrence of \(A_{\varDelta }\) will be given by \(\mu (A^{-1}(\varDelta ))\). The situation is analogous for more general classical probabilistic systems. There is a strict correspondence between classical probabilistic states and the axioms of classical probability theory. Indeed, the axioms of Kolmogorov [13] define a probability function as a measure \(\mu \) on a sigma-algebra \(\varSigma \) such that

$$\begin{aligned} \mu :\varSigma \rightarrow [0,1] \end{aligned}$$
(2)

which satisfies

$$\begin{aligned} \mu (\emptyset )=0 \end{aligned}$$
(3)
$$\begin{aligned} \mu (A^{c})=1-\mu (A), \end{aligned}$$
(4)

where \((\ldots )^{c}\) means set-theoretic complement. For any pairwise disjoint denumerable family \(\{A_{i}\}_{i\in I}\),

$$\begin{aligned} \mu (\bigcup _{i\in I}A_{i})=\sum _{i}\mu (A_{i}). \end{aligned}$$
(5)

A state of a classical probabilistic theory will be defined as a Kolmogorovian measure with \(\varSigma =\mathcal {P}(\varGamma )\) (where \(\varGamma \) and \(\mathcal {P}(\varGamma )\) denote the phase space of the system and its measurable subsets, respectively). It is straightforward to show that the set of all possible measures of this form is convex. A discrete sketch of these notions is given below.
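In a discrete toy version (an illustrative sketch; the four-point phase space, the weights and the observable are invented for the example), a state is an additive normalized measure on the power set of a finite \(\varGamma \), and the proposition \(A_{\varDelta }\) is tested by summing the measure over the preimage \(A^{-1}(\varDelta )\):

```python
from fractions import Fraction

# Toy phase space with four points; the state mu is given by point masses.
Gamma = {"x1", "x2", "x3", "x4"}
mu_point = {"x1": Fraction(1, 2), "x2": Fraction(1, 4),
            "x3": Fraction(1, 8), "x4": Fraction(1, 8)}

def mu(subset):
    """Kolmogorovian measure on P(Gamma): additive over points."""
    return sum(mu_point[x] for x in subset)

# An observable A: Gamma -> R and the proposition A_Delta = A^{-1}(Delta).
A = {"x1": 0.0, "x2": 1.0, "x3": 1.0, "x4": 2.0}
Delta = (0.5, 1.5)
A_Delta = {x for x in Gamma if Delta[0] < A[x] < Delta[1]}

print(mu(A_Delta))                                  # 3/8
assert mu(set()) == 0                               # Eq. (3)
assert mu(Gamma - A_Delta) == 1 - mu(A_Delta)       # Eq. (4)
```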

Quantum models can be described in an analogous way, but using operators acting on Hilbert spaces instead of functions over a phase space. If \(\mathbf {A}\) represents the self-adjoint operator associated to an observable of a quantum system, the proposition “the value of \(\mathbf {A}\) lies in the interval \(\varDelta \)” will define a testable experiment represented by the projection operator \(\mathbf {P}_{\mathbf {A}}(\varDelta )\in \mathcal {P}(\mathcal {H})\), i.e., the projection that the spectral measure of \(\mathbf {A}\) assigns to the Borel set \(\varDelta \) [25]. The probability assigned to the event \(\mathbf {P}_{\mathbf {A}}(\varDelta )\), given that the system is prepared in the state \(\rho \), is computed using Born’s rule [1, 26]:

$$\begin{aligned} p(\mathbf {P}_{\mathbf {A}}(\varDelta ))=\text{ tr }(\rho \mathbf {P}_{\mathbf {A}}(\varDelta )). \end{aligned}$$
(6)

Born’s rule defines a measure on \(\mathcal {P}(\mathcal {H})\) with which it is possible to compute all probabilities and mean values for all physical observables of interest [1, 26]. It is well known that, due to Gleason’s theorem [35, 36], a quantum state can be defined as a measure s over the orthomodular lattice of projection operators \(\mathcal {P}(\mathcal {H})\) as follows [8, 26]:

$$\begin{aligned} s:\mathcal {P}(\mathcal {H})\rightarrow [0;1] \end{aligned}$$
(7)

such that:

$$\begin{aligned} s(\mathbf 0 )=0 \,\, (\mathbf 0 \,\, \text {is the null subspace}). \end{aligned}$$
(8)
$$\begin{aligned} s(P^{\bot })=1-s(P), \end{aligned}$$
(9)

and, for a denumerable and pairwise orthogonal family of projections \(\{P_{j}\}\),

$$\begin{aligned} s(\sum _{j}P_{j})=\sum _{j}s(P_{j}). \end{aligned}$$
(10)

As in the classical case, the set of states defined by the above equations is convex. Despite their mathematical resemblance, there is a big difference between classical and quantum measures. In the latter case, the Boolean algebra \(\varSigma \) is replaced by \(\mathcal {P}(\mathcal {H})\), and the other conditions are the natural generalizations of the classical event structure to the non-Boolean setting. The fact that \(\mathcal {P}(\mathcal {H})\) is not Boolean lies behind the peculiarities of the probabilities arising in quantum phenomena. In particular, the geometrical features of the corresponding convex sets are very different; while classical state spaces are simplexes, the state spaces of quantum systems have a more involved geometry [28]. As an example, the set of probabilistic states of a classical bit (a classical system with only two possible outcomes) forms a line segment, while the set of states of its quantum version, the qubit (a quantum system represented by a two-dimensional Hilbert space), has the form of a ball (the Bloch ball). As the dimensionality grows, the geometrical features of the sets of quantum states become more and more involved [1]. The fact that classical state spaces are always simplexes implies that each point has a unique decomposition in terms of pure states; this property no longer holds in the quantum case, posing a serious problem for attempts to give an ignorance interpretation of mixed states (the sketch below exhibits two distinct pure-state decompositions of the same qubit state).
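The following sketch (illustrative numerics; the chosen states are our own) computes probabilities via Born’s rule, Eq. (6), and exhibits the non-uniqueness of pure-state decompositions: the maximally mixed qubit state arises both as an equal mixture of the computational basis states and as an equal mixture of the \(|+\rangle ,|-\rangle \) states.

```python
import numpy as np

def ket_proj(v):
    """Projection |v><v| onto the normalized state vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def born(rho, P):
    """Born's rule, Eq. (6): p = tr(rho P)."""
    return np.trace(rho @ P).real

# Two different equal-weight mixtures of pure states...
rho_z = 0.5 * ket_proj([1, 0]) + 0.5 * ket_proj([0, 1])
rho_x = 0.5 * ket_proj([1, 1]) + 0.5 * ket_proj([1, -1])

# ...yield exactly the same density matrix (the maximally mixed state),
# so the mixed state carries no record of a unique pure ensemble:
assert np.allclose(rho_z, rho_x)

# Consequently, Born probabilities agree for every event:
P0 = ket_proj([1, 0])
print(born(rho_z, P0), born(rho_x, P0))   # 0.5 0.5
```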

In a series of papers, Murray and von Neumann searched for algebras more general than \(\mathcal {B}(\mathcal {H})\) [43,44,45,46]. These algebras are known today as von Neumann algebras, and their elementary components can be classified as Type I, Type II and Type III factors. It can be shown that the projective elements of a factor form an orthomodular lattice. Classical models can be described using commutative von Neumann algebras. The models of standard quantum mechanics can be described using Type I factors (Type \(I_{n}\) for finite dimensional Hilbert spaces and Type \(I_{\infty }\) for infinite dimensional models); these are algebras isomorphic to the set of bounded operators of a Hilbert space. Further work revealed that a rigorous approach to the study of quantum systems with infinitely many degrees of freedom requires Type III factors, as is the case in the axiomatic formulation of relativistic quantum mechanics [8, 48, 49]. A similar situation holds in algebraic quantum statistical mechanics [8, 50]. In these models, states are described as complex functionals satisfying certain normalization conditions, and, when restricted to the projective elements of the algebras, they obey laws similar to those given by Eqs. 7–10. In other words, they define measures over lattices which are not the same as those of standard quantum mechanics. This opens the door to a meaningful generalization of Kolmogorov’s axioms to a wide variety of orthomodular lattices. Thus, a general probabilistic framework can be described by the following equations. Let \(\mathcal {L}\) be an orthomodular lattice. Then, define

$$\begin{aligned} s:\mathcal {L}\rightarrow [0;1], \end{aligned}$$
(11)

(\(\mathcal {L}\) standing for the lattice of all events) such that:

$$\begin{aligned} s(\mathbf 0 )=0. \end{aligned}$$
(12)
$$\begin{aligned} s(E^{\bot })=1-s(E), \end{aligned}$$
(13)

and, for a denumerable and pairwise orthogonal family of events \(E_{j}\)

$$\begin{aligned} s(\sum _{j}E_{j})=\sum _{j}s(E_{j}). \end{aligned}$$
(14)

where \(\mathcal {L}\) is a general orthomodular lattice, with \(\mathcal {L}=\varSigma \) and \(\mathcal {L}=\mathcal {P}(\mathcal {H})\) for the Kolmogorovian and quantum cases, respectively.

Equations 11–14 define what is known as a non-commutative probability theory [8]. It is very important to remark that such measures do not exist for certain orthomodular lattices; for a detailed discussion of the conditions under which these measures are meaningful, see [24], Chap. 11. It suffices for us that the most important physical examples fall into this scheme, and this is indeed the case. A small concrete instance is sketched below.
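As a minimal concrete instance (the lattice is the standard example often called MO2; the numerical state is invented for the illustration), consider the orthomodular lattice with elements \(\mathbf 0 , a, a^{\bot }, b, b^{\bot }, \mathbf 1 \), where the only orthogonal pairs of nontrivial events are \((a,a^{\bot })\) and \((b,b^{\bot })\). A state in the sense of Eqs. 11–14 can assign s(a) and s(b) independently, because a and b share no Boolean context:

```python
# MO2: the smallest orthomodular lattice that is not Boolean.
elements = ["0", "a", "a_perp", "b", "b_perp", "1"]
orthocomplement = {"0": "1", "1": "0", "a": "a_perp", "a_perp": "a",
                   "b": "b_perp", "b_perp": "b"}

# x is orthogonal to y iff x <= y'; here the only nontrivial orthogonal
# pairs are (a, a') and (b, b'), and each pair joins to the top element.
orthogonal_pairs = {("a", "a_perp"): "1", ("b", "b_perp"): "1"}

# A state in the sense of Eqs. 11-14: s(a) and s(b) are unconstrained
# relative to each other, unlike in a Boolean algebra.
s = {"0": 0.0, "1": 1.0, "a": 0.7, "a_perp": 0.3, "b": 0.2, "b_perp": 0.8}

assert s["0"] == 0.0                                          # Eq. (12)
assert all(abs(s[x] + s[orthocomplement[x]] - 1.0) < 1e-12
           for x in elements)                                 # Eq. (13)
assert all(abs(s[top] - (s[x] + s[y])) < 1e-12
           for (x, y), top in orthogonal_pairs.items())       # Eq. (14)
print("a valid state on MO2:", s)
```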

3 Different Ontologies

Now a question arises: can we say something about the nature of probabilities simply by looking at the structural properties of the general framework described above? Is it possible to find a theoretical framework which allows us to give a general interpretation of probabilistic statistical models? The generalized setting is a mathematical framework capable of accommodating models of a very different nature. Given that it contains classical statistical mechanics and quantum mechanics as particular instances, we know that the models involved can be very different. But there are specific features which allow us to extract structural conclusions by studying the relationship between the lattice of properties and the geometry of the set of states.

In order to answer the above questions, let us first consider an approach based on the restrictions that the algebraic features of the event structure impose on the probability measures which can be defined in a compatible way. When the property lattice is Boolean, R. T. Cox [14, 15] showed that the only measures compatible with the algebraic symmetries of the lattice are those obeying the usual Kolmogorov axioms. Furthermore, in this approach, Shannon’s entropic measure appears as the most natural information measure for these models [20]. Cox’s approach works as follows. Start with a Boolean algebra \(\mathcal {B}\), representing the domain of possible events available to a rational agent (which could be an automaton). The algebra is assumed to be distributive, reflecting the fact that the logic used by the agent is classical. The agent has to assign a numerical valuation to each possible proposition with a degree-of-belief function \(\varphi \). This is done in a conditional way: the real number \(\varphi (a|b)\) represents the degree of belief of the agent that a is true given that b is true. This works as a sort of inference calculus: when there is complete certainty, the agent uses classical logic in order to make deductions; when there is not, he must use a degree-of-belief function. But it turns out that, if the function \(\varphi \) is to be compatible with the algebraic properties of \(\mathcal {B}\), there are not many options at hand: up to rescaling, \(\varphi \) has to satisfy a set of laws equivalent to Kolmogorov’s axioms. This approach has been modified and used to derive Feynman’s laws of probability in the quantum setting [17,18,19,20]. The sketch below illustrates, on a small Boolean algebra, the laws that such a \(\varphi \) ends up satisfying.
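Illustrative sketch (the finite algebra and the weights are invented for the example): on the Boolean algebra of subsets of a four-element set, a degree-of-belief function derived from a normalized weighting satisfies the sum rule and the product rule, the two functional forms that Cox’s argument singles out up to rescaling.

```python
from fractions import Fraction
from itertools import chain, combinations

universe = frozenset({1, 2, 3, 4})
weight = {1: Fraction(1, 2), 2: Fraction(1, 4),
          3: Fraction(1, 8), 4: Fraction(1, 8)}

def phi(a, b):
    """Degree of belief in event a, given event b (both subsets)."""
    return sum(weight[x] for x in a & b) / sum(weight[x] for x in b)

def events():
    return map(frozenset, chain.from_iterable(
        combinations(universe, r) for r in range(len(universe) + 1)))

for a in events():
    for b in events():
        if not b:
            continue
        # Sum rule: phi(a|b) + phi(not-a|b) = 1.
        assert phi(a, b) + phi(universe - a, b) == 1
        # Product rule: phi(a and c | b) = phi(a | c and b) * phi(c | b).
        for c in events():
            if c & b:
                assert phi(a & c, b) == phi(a, c & b) * phi(c, b)
print("sum and product rules hold across the whole Boolean algebra")
```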

But as we have seen in Sect. 2, if the lattice of properties involved is not Boolean, non-Kolmogorovian measures appear. This is the case for standard non-relativistic quantum mechanics and many other statistical models of interest, such as those provided by algebraic quantum field theory and quantum statistical mechanics. What happens if a rational agent has to define belief functions under the condition that the empirical event structures depart from the Boolean realm of classical physics?

Indeed, if the lattice of properties is the orthomodular lattice of projection operators in a Hilbert space, it is possible to show that the only consistent possibility is given by Born’s rule [37]. Moreover, if the lattices are more general and non-Boolean, it can be shown that the probability measures will not be Kolmogorovian either [37]. In this approach, Shannon’s information measure must be replaced by von Neumann’s entropy in the quantum case and by the measurement entropy in the general case [38]. Let us briefly describe how this method works in the general case. One first identifies the algebraic structure of the event structure of a given theory \(\mathcal {T}\). In many cases of interest, this will be specified as a particular orthomodular lattice \(\mathcal {L}\). Notice that, in principle, this can be considered as empirical information available to the agent. Once the algebraic properties of the event structure are determined, a variant of Cox’s method can be applied by studying the constraints imposed on the degree-of-belief functions. The crucial point here is that event structures are not always organized as Boolean lattices. Thus, in order to determine the general properties of the probabilities of a given theory, Cox’s method has to be applied to lattices more general than Boolean ones.

The existence of this approach opens an interesting perspective for the Bayesian interpretation of probabilities. The interpretation would be as follows. There is an empirical scenario in which a rational agent (which could be an automaton) must take a decision, and with that aim, he must define a degree-of-belief function. Different possible experiments and results are available, and they are organized in an event structure, assumed to be an orthomodular lattice. If the lattice of events the agent is facing is Boolean (as in Cox’s approach), then his degree-of-belief measures will obey laws equivalent to those of Kolmogorov. On the contrary, if the state of affairs that the agent must face presents contextuality (as in standard quantum mechanics), the measures involved must be non-Kolmogorovian. The natural information measures will be Shannon’s or more general ones, according to the algebraic structure of the contexts involved [38]. This kind of approach allows for a natural justification of the peculiarities of the probabilities arising in quantum phenomena from the standpoint of the Bayesian approach.

But one of the problems of the Bayesian interpretation of probabilities is that it deliberately says nothing about ontology. What can we do if we want to go beyond the subjective approach and say something about the nature of the models involved? As we have seen, a generalized probabilistic model establishes a relationship between the state of the system and the results of possible experiments to be performed on it. Is it possible to assign concrete properties of the system to these experiments? In the Kolmogorovian setting, it is possible to find global valuations of the Boolean lattice onto the set \(\{0,1\}\). In other words, for each possible property, we can consistently affirm that the system either possesses that property or does not possess it. Thus, at least in principle, we can consider classical models as objects with definite properties, and the probabilities simply reflect our subjective ignorance about this objective situation. A sketch of such a global valuation is given below.
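Illustrative sketch: on a finite Boolean algebra, every point x of the underlying set induces a global two-valued valuation \(v_{x}(A)=1\) iff \(x\in A\), and this valuation respects meets, joins and complements. It is exactly this kind of global valuation that the quantum case forbids.

```python
from itertools import chain, combinations

universe = frozenset({1, 2, 3})
events = list(map(frozenset, chain.from_iterable(
    combinations(universe, r) for r in range(len(universe) + 1))))

def valuation(x):
    """The global {0,1}-valuation induced by the 'hidden state' x."""
    return lambda a: 1 if x in a else 0

for x in universe:
    v = valuation(x)
    for a in events:
        assert v(universe - a) == 1 - v(a)          # respects complements
        for b in events:
            assert v(a & b) == min(v(a), v(b))      # respects meets
            assert v(a | b) == max(v(a), v(b))      # respects joins
print("each point of the sample space yields a consistent global valuation")
```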

The situation changes radically in the quantum case. The Kochen-Specker (KS) theorem poses a serious threat to the interpretation of a quantum system as an object with definite properties: it is not possible to establish a global Boolean valuation on the elements of the lattice of projection operators. Of course, this problem can be avoided if hidden variables are assumed, as in the Bohmian formulation of quantum theory (where the interpretation displays highly non-local features). But one thing we know for sure: we cannot interpret the elements of the event structure as representing definite properties of the system. The trick of considering the state of the system as the quantum logical conjunction of all actual properties (i.e., those properties whose probability of occurrence equals one in the given state) becomes untenable when entangled states and improper mixtures are considered [53]. The obstruction can be exhibited by finite means, as in the sketch below.
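A finite illustration of the KS obstruction (a sketch; the configuration of 18 rays in dimension four, grouped into 9 orthogonal bases with each ray shared by exactly two bases, is the well-known construction of Cabello, Estebaranz and García-Alcaine): a noncontextual assignment of definite values would select exactly one ray per basis, and a brute-force search confirms that no \(\{0,1\}\)-assignment achieves this. A parity argument explains why: the nine bases would require nine 1s in total, but each ray is counted in two bases, so the total must be even.

```python
from itertools import product
import numpy as np

# 18 rays in R^4 grouped into 9 orthogonal bases; each ray appears in
# exactly two bases (Cabello-Estebaranz-Garcia-Alcaine configuration).
bases = [
    [(0,0,0,1), (0,0,1,0), (1,1,0,0), (1,-1,0,0)],
    [(0,0,0,1), (0,1,0,0), (1,0,1,0), (1,0,-1,0)],
    [(1,-1,1,-1), (1,-1,-1,1), (1,1,0,0), (0,0,1,1)],
    [(1,-1,1,-1), (1,1,1,1), (1,0,-1,0), (0,1,0,-1)],
    [(0,0,1,0), (0,1,0,0), (1,0,0,1), (1,0,0,-1)],
    [(1,-1,-1,1), (1,1,1,1), (1,0,0,-1), (0,1,-1,0)],
    [(1,1,-1,1), (1,1,1,-1), (1,-1,0,0), (0,0,1,1)],
    [(1,1,-1,1), (-1,1,1,1), (1,0,1,0), (0,1,0,-1)],
    [(1,1,1,-1), (-1,1,1,1), (1,0,0,1), (0,1,-1,0)],
]

# Sanity check: the members of each basis are mutually orthogonal.
for basis in bases:
    for i in range(4):
        for j in range(i + 1, 4):
            assert np.dot(basis[i], basis[j]) == 0

rays = sorted({r for basis in bases for r in basis})
assert len(rays) == 18

def admits_noncontextual_assignment():
    """Search all 2^18 {0,1}-valuations for one with one 1 per basis."""
    for bits in product((0, 1), repeat=len(rays)):
        value = dict(zip(rays, bits))
        if all(sum(value[r] for r in basis) == 1 for basis in bases):
            return True
    return False

print("definite values possible:", admits_noncontextual_assignment())  # False
```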

What about an interpretation in terms of bundles of properties for quantum systems? One may think that the problem of the interpretation of quantum systems is related to the assumption of an ontology of substances and properties, with the system being a sort of ‘carrier’ of its actual properties. In order to avoid this, one may try to think of quantum systems not as individual objects, but as bundles of properties. According to this interpretation, properties have an ontological priority, and there is no individual substratum acting as a carrier. In other words, an object is no longer considered as an individual substratum possessing properties, but simply as a convergent bundle of properties without any substratum. But again, the KS theorem threatens the interpretation of a quantum system as a bundle of actual properties. For this reason, some authors have attempted an interpretation of standard quantum mechanics based on bundles of possible properties (see, for example, [52] and references therein). The fact that possible properties do not pertain to the realm of actuality would avoid the problems imposed by the KS theorem.

But the above considerations regarding the impossibility of considering the elements of the event structure as a set of definite (or actual) properties impose a severe restriction on the interpretation of the probabilities arising in quantum phenomena: an ignorance interpretation becomes problematic, due to the fact that there will always exist sets of properties to which definite values cannot be consistently assigned prior to the measurement process. For these reasons, it seems natural to take quantum probabilities as ontological (provided that we want to avoid hidden variable models).

One of the key features that allows for the KS theorem is the fact that the orthomodular lattice of projections is not Boolean. Indeed, in [40], a detailed study of the orthomodular structures underlying the Kochen-Specker construction is presented. As we have seen, the event structures associated to generalized probabilistic models can be non-Boolean in the general case. This means that, for a vast family of non-Kolmogorovian models, we will not be able to think of the elements of the event structure as representing actual properties of an individual system. And with regard to the algebraic formulation of physical probabilistic theories, a generalized version of the KS theorem exists for von Neumann algebras [41] (see also [42]). Due to these results, an interpretation based on bundles of actual properties seems problematic for generalized probabilistic models. As in the quantum case, an approach based on bundles of possible properties could be used instead [52]. But again, as in the example of standard quantum mechanics, the probabilities involved will no longer admit an ignorance interpretation in generalized probabilistic models displaying contextuality.

4 Conclusions

In this work we have discussed the connection between the event structures associated to general non-Kolmogorovian models and the measures representing states. We reviewed an approach in which states are regarded as functions measuring the degree of belief of a rational agent, and found that a Bayesian interpretation seems suitable for the most important probabilistic models, provided that contextual phenomena are accepted as a starting point. Of course, this program should be worked out in more detail, especially with regard to the study of conditional probabilities and Bayes’ rule in the generalized setting.

In order to go beyond the subjective interpretation, we also discussed the conditions under which the event structures can be related to properties of a system, and inquired into the ontological aspects of such an association. Due to the existence of generalized versions of the KS theorem, we found that, for the majority of models, the description of systems as bundles of actual properties is problematic. This opens the door to a generalization of previous approaches in which bundles of possible properties are the elementary bricks out of which reality is constructed. But in all these interpretations, an ignorance interpretation of probabilities will no longer be possible.