The Weirdness Theorem and the Origin of Quantum Paradoxes

Benavoli, Alessio; Facchini, Alessandro; Zaffalon, Marco

doi:10.1007/s10701-021-00499-w

Open access
Published: 28 September 2021

Volume 51, article number 95, (2021)

Foundations of Physics

Abstract

We argue that there is a simple, unique, reason for all quantum paradoxes, and that such a reason is not uniquely related to quantum theory. It is rather a mathematical question that arises at the intersection of logic, probability, and computation. We give our ‘weirdness theorem’ that characterises the conditions under which the weirdness will show up. It shows that whenever logic has bounds due to the algorithmic nature of its tasks, then weirdness arises in the special form of negative probabilities or non-classical evaluation functionals. Weirdness is not logical inconsistency, however. It is only the expression of the clash between an unbounded and a bounded view of computation in logic. We discuss the implication of these results for quantum mechanics, arguing in particular that its interpretation should ultimately be computational rather than exclusively physical. We develop in addition a probabilistic theory in the real numbers that exhibits the phenomenon of entanglement, thus concretely showing that the latter is not specific to quantum mechanics.

1 Introduction

We are interested in defining bounds on the algorithmic capabilities of a mathematical theory and in analysing their implications. We articulate our views from a logical standpoint, by postulating the following principle of rationality:

(Coherence)
The theory should be logically consistent.

This is essentially what we require of each well-founded mathematical theory: it has to be based on a few axioms and rules from which we can unambiguously derive its mathematical truths. The next postulate defines the computational limitations we want our theory to be subject to:

(Computation)
Inferences in the theory should be computable in polynomial time.

The second postulate will turn out to be central. It requires that there should be an efficient way to execute the theory, and in fact we are going to adopt the metaphor of a computer that executes the theory, i.e., that yields inferences out of it.

In what follows, we shall develop our considerations with regard to the special case of a theory of uncertainty. It will essentially coincide with the Bayesian theory once it is freed of the constraint of completeness (or precision); loosely speaking, such a theory is equivalent to modelling uncertainty with sets of probabilities. This choice will define a perimeter for the mathematical technicalities, while focusing on a case of wide interest and impact.

The postulates of coherence and computation are apparently in conflict with each other: intuitively, if the computer can only execute polynomial tasks, the theory will be consistent only up to what polynomial calculus allows. This is a view from outside the computer, however; it is the view of a hypothetical ‘classical’ observer with no computational limitations and thus external to the theory. An observer that is instead internal to the theory and behaves according to it is still subject to the coherence of the theory; it will therefore be impossible to prove any inconsistency from the inside. This is an instance of what we call an external-internal clash.

1. We formalise such a clash by what we refer to as ‘the weirdness theorem’. It shows that any theory of ‘algorithmic rationality’, that is, one that obeys the two postulates of coherence and computation, necessarily departs in a very peculiar way from the probabilistic point of view. In particular, the theorem proves that all models compatible with the theory will present some negative ‘probabilities’ (these models are sometimes referred to as ‘quasi-probabilities’ in the literature). Negative probabilities are however incompatible with classical rationality, and for this reason a hypothetical classical observer may regard the internal world as incoherent. Equivalently, to a classical observer the behaviour of the internal world may appear to be incompatible with so-called classical evaluation functionals (a concept used in particular in quantum logic).
2. We then define a theory of probability on a continuous space of complex vectors that complies with the two postulates of coherence and computation, and we show that its deductive closure (internal view) is tantamount to quantum theory (QT). The complex vectors represent the possible states of the computer while it runs the theory, and any such state bears within it the properties of the particles in QT, such as their directions or angles of polarisation.

By framing it as a theory of rationality, we therefore view QT as a normative theory guiding an agent in assessing her subjective beliefs about the results of a quantum experiment. As we stress throughout the paper, we ground the normativity of QT on three aspects. Firstly, its deductive structure is tantamount to a logical theory, and therefore it is based on a requirement of consistency (coherence): to follow the rules of QT is to be assured of consistency. Secondly, the model is based on a possibility (phase) space whose elements are interpreted as states of the world. Finally, we advance that the specific feature of our world that grounds the use of QT is its being a computation.

The external-internal clash, when transposed to QT, is thus a clash between a computational view and a view stemming from classical physics, the weirdness theorem providing its formal, mathematical formulation. When we try to give a classical physical interpretation to QT we fail, because classical physics, in our common understanding, needs classical probability, and the latter grounds its normativity essentially only on its internal consistency, given that it does not require any limitation on the computational resources available for executing its inferences. As such, to an external observer, QT presents a number of weird phenomena, such as entanglement, and is made up of negative probabilities or is characterised by a non-Boolean structure of events. We will show that this weirdness follows from the computation postulate.

There is also more to it. In our framework QT is naturally based on sets of (or imprecise) probabilities: in fact, requiring the computation postulate is similar to defining a probabilistic model using only a finite number of moments;^{Footnote 1} and therefore, implicitly, to defining the model as the set of all the (quasi-)probabilities compatible with the given moments’ constraints.
3. As mentioned above, quantum paradoxes appear to be entirely a consequence of the weirdness theorem; in particular, the weirdness does not follow from having to deal with complex numbers or quantised states.

We reinforce this view by working out another example theory, which is unrelated to QT. Such a theory uses real numbers to model the experiment of tossing a classical coin under algorithmic rationality. Eventually the theory turns out to be based on Bernstein polynomials^{Footnote 2} and to admit entanglement. This shows in addition that the quantum-logic and the quasi-probability foundations of QT are two faces of the same coin, being natural consequences of the computation principle, as formalised by the weirdness theorem.

In order to develop the results mentioned above, we rely on a dual^{Footnote 3} characterisation of probability in terms of lotteries (or gambles). In doing so, we provide a subjective foundation, à la de Finetti, of so-called generalised probability theories. Compared to algebraic or information-based extensions of probability theories (e.g., [3, 4]), a gambling foundation, which emphasises the notion of logical consistency, ensures soundness and naturally provides a ground for comparing different theories: it boils down to assessing the compatibility of their different notions of consistency.

1.1 Related Work

The perspective given in this paper may be related to the agent-centered interpretation of QT advanced by QBism [5, 6]. However, by grounding the use of QT particularly in the world as being a computation, we depart from QBism, which puts the Born rule at the center but, for now, is unable to ground its use on anything other than a coherence constraint. In this, our view looks more similar to the one advanced by Pitowsky [7], whose empirical premise in the derivation of the Born rule is that the structure corresponding to the outcomes of incompatible measurements is a non-Boolean algebra.^{Footnote 4}

There is a long tradition of denying quantum states any reference to the outside world that can be traced back at least to Bohr, and more broadly to the Copenhagen interpretation of QT. Similar contemporary views are labelled as $\psi $-epistemic. In addition to QBism, they include for instance Healey’s quantum pragmatism [8], Rovelli’s relational quantum mechanics [9], and the empiricist interpretation of de Muynck [10].^{Footnote 5} All these interpretations “do not view the quantum state as an intrinsic property of an individual system and they do not believe that a deeper reality is required to make sense of quantum theory” [12, p. 72]. Opposite to this tradition stand $\psi $-ontic views such as the many-worlds interpretation [13, 14], hidden variable theories like Bohmian mechanics [15, 16], collapse theories [17,18,19,20], or the transactional interpretations [21,22,23,24], the common trait being that quantum states are regarded as descriptions of physical systems.

Since the foundation of QT, there have been two main ways to explain the quantum-classical probability clash. The first one, which goes back to Birkhoff and von Neumann [25], explains these differences with the premise that, in QT, the Boolean algebra of events is taken over by the ‘quantum logic’ of projection operators on a Hilbert space. The second one is based on the view that the quantum-classical clash is due to the appearance of negative probabilities [26]. More recently, this research programme has been explored following different avenues: extending Boolean logic [25, 27, 28], using operational primitives [3, 29,30,31], using information-theoretic postulates [3, 32,33,34,35,36,37,38,39], building upon the subjective foundation of probability [5,6,7, 40,41,42,43,44,45,46,47], and starting from the phenomenon of quantum nonlocality [3, 33, 34, 48, 49].

Without aiming at reconstructing QT, the present manuscript provides an alternative and original explanation of the differences between quantum and classical probability: the algorithmic intractability of classical probability theory contrasted with the polynomial-time complexity of QT.

1.2 Outline of the Paper

Section 2 is concerned with the coherence postulate. We recall the relatively little known fact that (imprecise) probability is the mathematical dual^{Footnote 6} of a coherent logical theory. Addressing consistency (coherence or rationality) in such a setting is a standard task in logic; in practice, it reduces to proving that a certain real-valued function is nonnegative.

Section 3 details the computation postulate and its role in developing a model of algorithmic rationality. We consider the problem of verifying the nonnegativity of a function as above. This problem is generally undecidable or, when decidable, NP-hard. We make the problem polynomially solvable by redefining the meaning of (non)negativity. We give our fundamental weirdness theorem (Theorem 1) showing that such a redefinition is at the heart of the clash between classical probability and algorithmic rationality.

We show in Sect. 4 that QT is a special instance of algorithmic rationality and hence that Theorem 1 is the distinctive reason for all quantum paradoxes: the case of entanglement is detailed in Sect. 4.3; in Sect. 4.4 we show that the witness function, in the fundamental ‘entanglement witness theorem’, is nothing else than a negative function whose negativity cannot be assessed in polynomial time—whence it is not ‘negative’ in QT.

Section 5 devises a further theory related to the experiment of tossing a classical coin under algorithmic rationality, which is hence unrelated to QT. We show that the theory admits entangled states, as prescribed by the weirdness theorem.

We give our concluding views in Sect. 6.

Appendix A discusses our view of QT in more detail, highlighting in particular some aspects of our theory as well as some connections with other research fields. Appendix B contains the proofs of formal statements.

2 Classical Rationality

De Finetti’s subjective foundation of probability [51] is based on a notion of rationality (consistency or coherence). The idea is that of introducing a betting scheme and defining bettors as rational if their stakes are placed so as to avoid a sure loss (this is traditionally called a Dutch book; Economics refers to it as ‘arbitrage’). De Finetti shows that avoiding sure loss is equivalent to representing a bettor’s beliefs through classical (subjective) probability, thus providing a solid foundation for the latter.

2.1 Desirability

What is less known, however, is that de Finetti’s bright intuition has greatly been extended in [52, 53], giving rise to the so-called theory of desirable gambles (TDG). This can equivalently be regarded as a reformulation of the well-known Bayesian decision theory (à la Anscombe-Aumann [54]) once it is freed of the constraint to deal with complete preferences [55, 56]. TDG is a dual theory of probability in the sense that probability is recovered from TDG through standard mathematical duality. In such a dual form, TDG appears just as a set of logical axioms.

These axioms have a natural interpretation as rationality requirements on the way a ‘classical’ subject (we call him Isaac) accepts gambles on the results of an uncertain experiment. For instance, Isaac might claim ‘I find the gamble that returns 1 utile^{Footnote 7} if the coin lands heads (H) and $-2$ utiles if it lands tails (T) to be desirable’. This means that he is willing to accept the gamble $g=(1,-2)$, that is, $g(H)=1$ and $g(T)=-2$: he commits to winning 1 utile if the coin lands heads and to losing 2 utiles if it lands tails.

Gambles are thus rewards about the uncertain outcome of an ‘experiment’, such as tossing a coin in the example above. We denote by $\varOmega $ its possibility space (e.g., $\{heads, tails\}$, ${\mathbb {R}}^n$, ${\mathbb {C}}^n$). For many experiments, there may be more than one possibility space of interest to the ‘experimenter’, $\varOmega _1,\varOmega _2,\ldots ,\varOmega _k$. A possibility space describing the joint outcome of this k-valued experiment can be constructed as the Cartesian product $\varOmega = \varOmega _1 \times \varOmega _2 \times \cdots \times \varOmega _k$.^{Footnote 8} Formally, a gamble g on $\varOmega $ is a bounded real-valued function on $\varOmega $.

In an experiment, not all the quantities are observable and, therefore, bettable; we denote by ${\mathscr {L}}_R$ the restricted set of all ‘permitted gambles’ on $\varOmega $. We assume that ${\mathscr {L}}_R$ is a linear space (a vector space) including the constant functions. The subset of all nonnegative gambles in ${\mathscr {L}}_R$, that is, of gambles for which Isaac is never expected to lose utiles, is denoted as ${\mathscr {L}}^{\ge } _R:=\{g \in {\mathscr {L}}_R: \inf g\ge 0 \}$ (analogously negative gambles are denoted as ${\mathscr {L}}^{<} _R:=\{g \in {\mathscr {L}}_R: \sup g < 0 \}$). In the following, with ${\mathscr {G}}:=\{g_1,g_2,\ldots ,g_{|{\mathscr {G}}|}\} \subset {\mathscr {L}}_R$ we denote a finite set of gambles that Isaac finds desirable:^{Footnote 9} these are the gambles that he is willing to accept and thus commits himself to the corresponding transactions.

The crucial question is how to provide a criterion for a set ${\mathscr {G}}$ of gambles, representing assessments of desirability, to be regarded as rational. As we said, rationality is traditionally imposed by avoiding sure losses: that is, by requiring that Isaac should not be forced to find a negative gamble desirable as a logical consequence of his initial assessments of desirability. An elegant way to formalise this intuition is to regard ${\mathscr {L}}_R$ as an algebra of formulas on top of which to define a logic. This leads us directly to formulate rationality as logical consistency.

To proceed on this route, we first need to define an appropriate logical calculus (characterising the set of gambles that Isaac must find desirable as a consequence of having desired ${\mathscr {G}}$ in the first place) and based on it to characterise the family of consistent sets of assessments.

First of all, since nonnegative gambles may increase Isaac’s utility without ever decreasing it, we have that:

A0. ${\mathscr {L}}^{\ge } _R$ should always be desirable.

This defines the tautologies of the calculus.

Moreover, whenever f, g are desirable for Isaac, then any positive linear combination of them should also be desirable (this amounts to assuming that Isaac has a linear utility scale, which is a standard assumption in probability). Hence the corresponding deductive closure of a set ${\mathscr {G}}$ is given by:

A1. ${\mathcal {K}}:={{\,\mathrm{posi}\,}}({\mathscr {L}}^{\ge } _R\cup {\mathscr {G}})$.

Here ‘${{\,\mathrm{posi}\,}}$’ denotes the conic hull operator.^{Footnote 10}

In the betting interpretation given above, a sure loss for an agent is represented by a negative gamble: by accepting a negative gamble an agent will lose utiles whatever the outcome of the experiment. We are led therefore to the following:

Definition 1

(Coherence postulate) A set ${\mathcal {K}}$ of desirable gambles is coherent if and only if

A2. $ {\mathscr {L}}^{<} _R \cap {\mathcal {K}}=\emptyset $.

Note that ${\mathcal {K}}$ is incoherent if and only if $-1 \in {\mathcal {K}}$; therefore $-1$ can be regarded as playing the role of the Falsum in logic and hence A2 can be reformulated as

A2′. $-1 \notin {\mathcal {K}}$.

An example that gives an intuition of the postulates is given in Fig. 1.

Postulate A2 (resp. A2$'$), which presupposes postulates A0 and A1, provides the normative definition of TDG, referred to by ${\mathcal {T}}$. Moreover, as simple as it looks, alone it is the pillar of the foundation of classical subjective probability.

2.2 Probability (The Desirability Dual)

Let us show that probability is dual to desirability as described in Sect. 2.1. First of all, however, let us make some terminology precise: when we write probability, as a function, we mean probability charge, i.e., a finitely additive probability.^{Footnote 11} In fact the Analysis literature calls ‘charge’ a finitely additive set function [58, Chap. 11]. It coincides then with what we have called a quasi-probability; if we instead want to refer to an actual probability, we have to use the qualified expression probability charge.

Assume ${\mathcal {K}}$ is coherent. We give ${\mathcal {K}}$ a probabilistic interpretation by first observing that, since ${\mathscr {L}}_R$ is a topological vector space,^{Footnote 12} we can consider its dual space ${\mathscr {L}}_R^*$ of all bounded linear functionals $L: {\mathscr {L}}_R \rightarrow {\mathbb {R}}$. Then the dual of ${\mathcal {K}}$ is defined as:

$$ {\mathcal {K}}^\circ =\left\{ L \in {\mathsf {S}} \mid L(g)\ge 0, ~\forall g \in {\mathscr {G}}\right\} , \tag{1}$$

where ${\mathsf {S}}=\{L \in {\mathscr {L}}_R^* \mid L(1)=1,~~L(h)\ge 0 ~~\forall h \in {\mathscr {L}}_R^{\ge }\}$ is the set of (belief) states; $L(1)=1$ means that linear functionals preserve the unitary gamble (normalisation). $L(g)\ge 0$ means that L(g) must be a nonnegative real number for all gambles $g \in {\mathscr {G}}$ that Isaac finds desirable.^{Footnote 13} To ${\mathcal {K}}^\circ $ we can then associate its extension $ {\mathcal {K}}^\bullet $ in ${\mathscr {M}}$, that is, the set of all probability charges on $\varOmega $ extending an element in ${\mathcal {K}}^\circ $.

In other words, we can write L(g) as an expectation with respect to a probability: $L(g)=\int _{\varOmega } g(\omega )d\mu (\omega )$. One can then show that the extension $ {\mathcal {K}}^\bullet $ is equal to:

$$ {\mathscr {P}}:={\mathcal {K}}^\bullet =\left\{ \mu \in {\mathsf {S}} \,\Big |\, \int _{\varOmega } g(\omega ) d\mu (\omega )\ge 0, ~\forall g\in {\mathscr {G}}\right\} , \tag{2}$$

where ${\mathsf {S}}=\{ \mu \in {\mathscr {M}} \mid \inf \mu \ge 0,~\int _{\varOmega } d\mu (\omega )=1\}$ is the set of all probability charges on $\varOmega $, and $ {\mathscr {M}}$ the set of all charges on $\varOmega $.

Equation (2) states that, whenever an agent is coherent, desirability of g corresponds to nonnegative expectation, that is $\int _{\varOmega } g(\omega ) d\mu (\omega )\ge 0$ for all probabilities in ${\mathscr {P}}$. When ${\mathcal {K}}$ is incoherent, ${\mathscr {P}}$ turns out to be empty: there is no probability compatible with the assessments in ${\mathcal {K}}$. Stated otherwise, satisfying the axioms of classical probability (that is, being a nonnegative function that integrates to one) is tantamount to being in the dual of a set ${\mathcal {K}}$ satisfying the coherence postulate (‘no-Dutch book’).
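As a minimal sketch of this duality, consider the coin of Sect. 2.1, where a probability is fully described by the single number p = P(heads). The code below is an illustration under our own assumptions, not part of the formal development: the function name `credal_interval` and the second gamble $(-2,1)$ are hypothetical. It computes the set ${\mathscr {P}}$ as an interval of values of p for which every assessed gamble has nonnegative expectation.

```python
from fractions import Fraction

def credal_interval(gambles):
    """Return (lo, hi) bounds on p = P(heads) such that every gamble
    g = (g_heads, g_tails) has nonnegative expectation, or None if no
    probability satisfies all constraints (the assessments incur a sure loss)."""
    lo, hi = Fraction(0), Fraction(1)      # classical probabilities: 0 <= p <= 1
    for g_h, g_t in gambles:
        g_h, g_t = Fraction(g_h), Fraction(g_t)
        slope = g_h - g_t                  # E_p[g] = g_t + p * (g_h - g_t)
        if slope > 0:
            lo = max(lo, -g_t / slope)     # constraint reads p >= -g_t / slope
        elif slope < 0:
            hi = min(hi, -g_t / slope)     # constraint reads p <= -g_t / slope
        elif g_t < 0:                      # constant negative gamble
            return None
    return (lo, hi) if lo <= hi else None

isaac = credal_interval([(1, -2)])            # Isaac's gamble g = (1, -2)
dutch = credal_interval([(1, -2), (-2, 1)])   # these two gambles sum to -1
print(isaac, dutch)
```

With the single gamble $(1,-2)$ the compatible probabilities are those with p at least 2/3. Adding the hypothetical gamble $(-2,1)$ makes the two assessments sum to the constant $-1$, so the interval is empty, in agreement with the statement that ${\mathscr {P}}$ is empty when ${\mathcal {K}}$ is incoherent.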

3 Algorithmic Rationality

Let us reconsider the classical theory introduced in the previous section. Assume that Isaac makes an initial (finite) set of assessments ${\mathscr {G}}$, which represent his initial beliefs about an experiment. In order to evaluate Isaac’s desirability of a further gamble $f \in {\mathscr {L}}_R$, we need to solve the membership problem $f \overset{?}{\in } {\mathcal {K}}$. This can equivalently be expressed as the following nonnegativity decision problem:

$$ \exists \lambda _i\ge 0:f-\sum \limits _{i=1}^{|{\mathscr {G}}|} \lambda _i g_i \in {\mathscr {L}}^{\ge }_R. \tag{4}$$

If the answer is ‘yes’, then the gamble f belongs to ${\mathcal {K}}$, which is the conic closure of ${\mathscr {G}}\cup {\mathscr {L}}^{\ge }_R$, and this proves its desirability. Note that checking whether ${\mathcal {K}}$ is coherent or not is tantamount to solving (4) for $f=-1$.
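For a finite possibility space and a single assessment g, the decision problem (4) reduces to intersecting intervals for the multiplier $\lambda $. The following sketch makes this concrete; the function name `multiplier_range` and the test gamble $f=(2,-3)$ are our own illustrative choices, and the general case with several assessments would require a linear program instead.

```python
from fractions import Fraction

def multiplier_range(f, g):
    """Solve problem (4) for a single assessment g on a finite space:
    return the interval (lo, hi) of lambda >= 0 with f - lambda * g >= 0
    at every outcome (hi is None when unbounded), or None if no such
    lambda exists."""
    lo, hi = Fraction(0), None
    for f_w, g_w in zip(f, g):
        f_w, g_w = Fraction(f_w), Fraction(g_w)
        if g_w > 0:                        # f_w - lambda * g_w >= 0 bounds lambda above
            bound = f_w / g_w
            hi = bound if hi is None else min(hi, bound)
        elif g_w < 0:                      # ... or bounds lambda below
            lo = max(lo, f_w / g_w)
        elif f_w < 0:                      # g_w == 0: f_w itself must be nonnegative
            return None
    if hi is not None and lo > hi:
        return None
    return (lo, hi)

g = (1, -2)                                # Isaac's assessment on (heads, tails)
desirable = multiplier_range((2, -3), g)   # is f = (2, -3) in K?
falsum = multiplier_range((-1, -1), g)     # coherence check: f = -1
print(desirable, falsum)
```

Here any $\lambda $ between 3/2 and 2 certifies that $f=(2,-3)$ belongs to ${\mathcal {K}}$, whereas no multiplier exists for $f=-1$, so this particular ${\mathcal {K}}$ is coherent.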

3.1 Algorithmic Desirability

However, computing such an inference may be ‘costly’, if not virtually unfeasible. Indeed, when $\varOmega $ is infinite (later on we shall consider the case $\varOmega \subset {\mathbb {C}}^n$) and for generic functions $f,g_i$, the nonnegativity decision problem is undecidable. In this paper, we consider the case where gambles are (complex) multivariate polynomials of degree at most d. In this case, by Tarski-Seidenberg’s quantifier elimination theory [59, 60], the problem (4) becomes decidable but still intractable, being in general NP-hard. From this perspective, the classical theory is therefore not suitable for constituting a realistic model of rationality.

The idea of modifying the standard theory by considering computational issues traces back to the work of Good [61] and Simon [62]. Since then there have been two main approaches to the problem, either by charging an agent for doing costly computation (as initiated in [63]), or by limiting the computation that agents can do (as initiated in [64], and first used in the context of decision theory in [65]^{Footnote 14}). In what follows, we take inspiration from the second approach, and, employing a terminology stemming from [66], develop a model of algorithmic rationality for the framework under consideration. Our subject in such an algorithmic world is now called Alice, to distinguish her from Isaac, who lives in the classical world.

The intuition behind our approach is the following. Assume that, due to computational, or other types of, limits, Alice can only work out the decision problem (4) for a closed subcone $\varSigma ^{\ge }$ of the nonnegative gambles ${\mathscr {L}}^{\ge }_R$:

$$ \exists \lambda _i\ge 0:f-\sum \limits _{i=1}^{|{\mathscr {G}}|} \lambda _i g_i \in \varSigma ^{\ge }. \tag{5}$$

This means in particular that there will be a nonnegative gamble $f \in {\mathscr {L}}^{\ge }_R$ that Alice cannot actually assess as nonnegative; thus she may well decide not to accept it. Similarly, Alice’s initial set of assessments ${\mathscr {G}}$ may contain a negative gamble and yet pass the corresponding coherence check: solving (5) for $f=-1$ may still return the answer ‘no’, so that no sure loss is detected.

Should these behaviours be counted as rational? Logic claims that they should: in fact, from the perspective of an agent whose rationality is constrained by $\varSigma ^{\ge }$, a collection of assessments is logically consistent whenever its deductive closure contains all tautologies as given by $\varSigma ^{\ge }$ but does not contain $-1$, the Falsum.

In other words, an algorithmic TDG, which we denote by ${\mathcal {T}}^\star $, should be based on a logical redefinition of the tautologies, i.e., by stating that

B0. $\varSigma ^{\ge }$ should always be desirable,

in the place of A0, where $\varSigma ^{\ge }$ is a closed subcone of ${\mathscr {L}}^{\ge } _R$ whose corresponding membership problem (5) delimits the type of computation that an agent can actually do.

The rest of the theory follows exactly the footprints of ${\mathcal {T}}$. In particular, the deductive closure for ${\mathcal {T}}^\star $ is defined by:

B1. ${\mathcal {C}}:={{\,\mathrm{posi}\,}}(\varSigma ^{\ge }\cup {\mathscr {G}})$.

And finally the coherence postulate is simply reformulated by stating that a set ${\mathcal {C}}$ of desirable gambles is said to be A-coherent if and only if

B2. $-1 \notin {\mathcal {C}}$,

where ‘A’ stands for the fact that in ${\mathcal {T}}^\star $ the algorithmic bounds of the coherence problem for a finite set of assessments are established according to the particular choice of $\varSigma ^{\ge }$.

Hence, ${\mathcal {T}}^\star $ and ${\mathcal {T}}$ have the same deductive apparatus; they just differ in the considered set of tautologies, and thus in their (in)consistencies. An example that gives an intuition of the postulates is given in Fig. 2.

3.2 Quasi-Probability (The Algorithmic Desirability Dual)

Interestingly, as we did previously, we can associate a ‘probabilistic’ interpretation to the desirability calculus, now defined by B0–B2, through the dual of an A-coherent set.

So let us consider again the dual space ${\mathscr {L}}_R^*$ of all bounded linear functionals $L: {\mathscr {L}}_R \rightarrow {\mathbb {R}}$. With the additional condition that linear functionals preserve the unitary gamble, the dual cone of an A-coherent ${\mathcal {C}}\subset {\mathscr {L}}_R$ is given by

$$ {\mathcal {C}}^\circ =\left\{ L \in {\mathsf {S}} \mid L(g)\ge 0, ~\forall g \in {\mathscr {G}}\right\} , \tag{6}$$

where ${\mathsf {S}}=\{L \in {\mathscr {L}}_R^* \mid L(1)=1,~~L(h)\ge 0 ~~\forall h \in \varSigma ^{\ge }\}$ is the set of states. To ${\mathcal {C}}^\circ $ we can associate its extension $ {\mathcal {C}}^\bullet $ in ${\mathscr {M}}$, that is, the set of all charges on $\varOmega $ extending an element in ${\mathcal {C}}^\circ $. In other words, we can attempt to write L(g) as an ‘expectation’, that is, an integral with respect to a charge: $L(g)=\int _{\varOmega } g(\omega )d\mu (\omega )$. In general, however, this set does not yield a classical probabilistic interpretation of ${\mathcal {T}}^\star $: in fact, whenever $\varSigma ^{\ge }\subsetneq {\mathscr {L}}^{\ge } _R$, there are negative gambles that Alice, given her rationality constraints, does not recognise as such and that therefore, from her perspective, do not lead to a sure loss. This is stated more precisely by the following:

Theorem 1

(The weirdness theorem) Assume that $\varSigma ^{\ge }$ includes all positive constant gambles and that it is closed (in ${\mathscr {L}}_R$). Denote by $\varSigma ^{<} $ the interior of $-\varSigma ^{\ge }$. Let ${\mathcal {C}}\subseteq {\mathscr {L}}_R$ be an A-coherent set of desirable gambles. The following statements are equivalent:

1. ${\mathcal {C}}$ includes a negative gamble that is not in $\varSigma ^{<} $.
2. ${{\,\mathrm{posi}\,}}({\mathscr {L}}^{\ge } _R\cup {\mathscr {G}})$ is incoherent, and thus ${\mathscr {P}}$ is empty.
3. ${\mathcal {C}}^{\circ }$ is not (the restriction to ${\mathscr {L}}_R$ of) a closed convex set of mixtures of classical evaluation functionals.^{Footnote 15}
4. The extension $ {\mathcal {C}}^\bullet $ of ${\mathcal {C}}^{\circ }$ in the space ${\mathscr {M}}$ of all charges in $\varOmega $ includes only non-probabilistic charges (those with some negative value).

Theorem 1 is the central result of this paper (its proof is in Appendix B).

It states that whenever ${\mathcal {C}}$ includes a negative gamble (item 1), there is no classical probabilistic interpretation for it (item 2). The other points suggest alternative solutions to overcome this deadlock: either to change the notion of evaluation functional (item 3) or to use quasi-probabilities as a means for interpreting ${\mathcal {T}}^\star $ (item 4). The latter case means that, when we write $L(g)=\int _{\varOmega } g(\omega )d\mu (\omega )$, then $\mu $ satisfies $L(1)=\int _{\varOmega } d\mu (\omega )=1$ but it is not a probability charge.
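A toy instance on a two-outcome space may help to see items 1, 2 and 4 at work. The sketch below rests entirely on our own assumptions and is not the construction used later for QT: we take $\varSigma ^{\ge }$ to be the hypothetical cone generated by the gambles $(1,2)$ and $(2,1)$, which is closed, contains the positive constants and is strictly smaller than ${\mathscr {L}}^{\ge }_R$, and we let Alice assess the negative gamble $g=(-1,-3)$, which lies outside $-\varSigma ^{\ge }$. A state is written as $L=(q,1-q)$, so that $L(1)=1$ holds automatically; the function name `state_interval` is likewise hypothetical.

```python
from fractions import Fraction

def state_interval(tautology_rays, assessments):
    """States on a two-outcome space are L = (q, 1 - q) with L(1) = 1.
    Return the interval (lo, hi) of q for which L(h) >= 0 on every
    generating ray h of the tautology cone and on every assessed gamble,
    or None if no such q exists."""
    lo = hi = None
    for a, b in list(tautology_rays) + list(assessments):
        a, b = Fraction(a), Fraction(b)
        slope = a - b                       # L(h) = b + q * (a - b)
        if slope > 0:
            bound = -b / slope              # constraint reads q >= bound
            lo = bound if lo is None else max(lo, bound)
        elif slope < 0:
            bound = -b / slope              # constraint reads q <= bound
            hi = bound if hi is None else min(hi, bound)
        elif b < 0:                         # constant negative gamble
            return None
    if lo is not None and hi is not None and lo > hi:
        return None
    return (lo, hi)

classical_rays = [(1, 0), (0, 1)]   # generate all nonnegative gambles
bounded_rays = [(1, 2), (2, 1)]     # hypothetical smaller cone Sigma
g = (-1, -3)                        # a negative gamble outside -Sigma

isaac = state_interval(classical_rays, [g])   # classical dual
alice = state_interval(bounded_rays, [g])     # algorithmic dual
print(isaac, alice)
```

Under these assumptions Isaac's dual is empty (item 2), while Alice's dual is the nonempty interval of q between 3/2 and 2, so her assessments are A-coherent. Every compatible charge $(q,1-q)$ then assigns a negative value to the second outcome, which is the situation described by item 4.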

Observe that requiring polynomial time complexity is just one way to create the conditions for Theorem 1 to hold. But there are others, in that it is enough that one single negative gamble belongs to ${\mathcal {C}}$ to make the theorem hold. In other words, even if we allowed for exponential time complexity, there would still be gambles whose negativity we would not be able to evaluate (those that lead to undecidability). This is the reason why we use the terminology ‘algorithmic’ rationality, which appears to faithfully capture the idea that our capabilities are limited by the very fact of reasoning algorithmically.

However, and since we are particularly concerned with physics in this paper, we also embrace Aaronson’s point of view in [67]:

$\ldots $ while experiment will always be the last appeal, the presumed intractability of NP-complete problems might be taken as a useful constraint in the search for new physical theories

as a reason to focus on a polynomial-time complexity definition of algorithmic rationality.

4 QT as a Theory of Algorithmic Rationality

We are going to show that QT can be deduced from a particular instance of the theory ${\mathcal {T}}^\star $. As a consequence, we get that the computation postulate, and in particular B0, is the unique reason for all its paradoxes, which all boil down to a rephrasing of the various statements of Theorem 1 in the considered quantum context.

4.1 Setting

Let us initially focus on the possibility space we shall use. Consider first a single particle n-level system and let

$$\begin{aligned} {\overline{{\mathbb {C}}}}^{n}:=\{ x\in {\mathbb {C}}^{n}: ~x^{\dagger }x=1\}. \end{aligned}$$

In some cases we can interpret an element $x\in {\overline{{\mathbb {C}}}}^{n}$ as ‘input data’ for some classical preparation procedure. For instance, in the case of the spin-1/2 particle ($n = 2$), if $\theta = [\theta _1 , \theta _2 , \theta _3 ]$ is the direction of a filter in the Stern-Gerlach experiment, then x is its one-to-one mapping into ${\overline{{\mathbb {C}}}}^{2}$ (apart from a phase term). For spin greater than 1/2, however, the element $x\in {\overline{{\mathbb {C}}}}^{n}$ cannot directly be interpreted only in terms of a ‘filter direction’. In our framework element x is thus better interpreted as the state of the ontological world, which we have sketched in the Introduction. It is a world that is not directly accessible to an observer inside the theory (Alice), albeit it has implications for observables within such a theory.

For a composite system of m particles (each one an $n_j$-level system), the joint possibility space is the Cartesian product

$$\begin{aligned} \varOmega =\times _{j=1}^m {\overline{{\mathbb {C}}}}^{n_j}. \end{aligned}$$

Having defined the possibility space, the next step is the definition of the observables, which define the gambles in our setting. Let us recall that in QT any real-valued observable is described by a Hermitian operator. This naturally imposes restrictions on the type of ‘permitted gambles’ g on a quantum experiment. For a single particle, given a Hermitian operator $G\in {\mathscr {H}}^{n\times n}$ (with ${\mathscr {H}}^{n\times n}$ being the set of Hermitian matrices of dimension $n \times n$), a gamble on $x \in {\overline{{\mathbb {C}}}}^{n}$ can be defined as:

$$\begin{aligned} g(x)=x^\dagger G x. \end{aligned}$$

Since G is Hermitian and x is bounded ($x^{\dagger }x=1$), g is a real-valued bounded function ($g(x)=\langle x|G|x \rangle $ in ‘bra-ket’ notation). For a composite system of m particles, the gambles are m-quadratic forms:

$$\begin{aligned} g(x_1,\ldots ,x_m)=(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j), \end{aligned}$$

(7)

with $G \in {\mathscr {H}}^{n \times n}$, $n=\prod _{j=1}^m n_j$, and where $\otimes $ denotes the tensor product between vectors regarded as column matrices. Therefore, we have that

$$\begin{aligned} {\mathscr {L}}_R =\{g(x_1,\ldots ,x_m)=(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j)\mid ~ G \in {\mathscr {H}}^{n \times n}, x=[x_1,\ldots ,x_m]\in \varOmega \} \end{aligned}$$

(8)

is the restricted set of ‘permitted gambles’ in a quantum experiment. We can also define the subset of nonnegative gambles ${\mathscr {L}}^{\ge } _R:=\{g \in {\mathscr {L}}_R: \min g\ge 0 \}$ and the subset of negative gambles ${\mathscr {L}}^{<} _R:=\{g \in {\mathscr {L}}_R: \max g < 0 \}$.^{Footnote 16}
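As a minimal numerical sketch of the gambles in (7), the following evaluates an m-quadratic form for $m=2$ two-level particles and checks that it is real and bounded. The matrix `G` and the helper names (`random_unit_vector`, `kron`, `gamble`) are illustrative assumptions, not objects defined in the paper.

```python
import math
import random

def random_unit_vector(n, rng):
    """Draw a random complex vector x with x^dagger x = 1."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]

def kron(u, v):
    """Tensor product of two column vectors, stored as flat lists."""
    return [a * b for a in u for b in v]

def gamble(G, vectors):
    """Evaluate (x_1 kron ... kron x_m)^dagger G (x_1 kron ... kron x_m)."""
    z = vectors[0]
    for v in vectors[1:]:
        z = kron(z, v)
    n = len(z)
    return sum(z[i].conjugate() * G[i][j] * z[j]
               for i in range(n) for j in range(n))

# Hypothetical Hermitian matrix for two 2-level particles (n = 2 * 2 = 4).
G = [[1, 2j, 0, 0],
     [-2j, 0, 1, 0],
     [0, 1, -1, 1 + 1j],
     [0, 0, 1 - 1j, 2]]

rng = random.Random(0)
values = [gamble(G, [random_unit_vector(2, rng), random_unit_vector(2, rng)])
          for _ in range(200)]
max_imag = max(abs(v.imag) for v in values)  # zero up to rounding: g is real
max_abs = max(abs(v.real) for v in values)   # bounded by the norm of G
print(max_imag, max_abs)
```

The bound used in the check is the Frobenius norm of `G` (here $\sqrt{20}$), which dominates $|g|$ because the tensor product of unit vectors is again a unit vector.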

Remark 1

(Hidden-variable theories) The model we have just presented has originally been discussed by Holevo in [68, Sect. 1.7], who treats it as a hidden-variable model. For a single particle ($m=1$), Holevo shows that this model does not contradict the existing ‘no-go’ theorems for hidden-variables. For $m\ge 2$ entangled particles, ‘no-go’ theorems apply to this model; in [68, Supplementary 3.2] Holevo discusses a way this model could still be considered a hidden variable model. We will detail these points in Appendices A.2 and A.3.

Remark 2

(The tensor product) In our setting the tensor product is ultimately a derived notion, not a primitive one, as it follows by the properties of m-quadratic forms (see Appendix A.2).

4.2 Polynomial Inference and Agreement with Born’s Rule

For $m=1$ (a single particle), evaluating the nonnegativity of the quadratic form $x^\dagger G x$ boils down to checking whether the matrix G is positive semi-definite (PSD) and therefore the membership problem

$$\begin{aligned} g(x)=x^\dagger G x \;\overset{?}{\in }\; {\mathscr {L}}^{\ge } _R \end{aligned}$$

(9)

can be solved in polynomial time and so can be problem (4). This is no longer true for $m\ge 2$: indeed, in this case there exist polynomials of type (7) that are nonnegative, but whose matrix G is indefinite (it has at least one negative eigenvalue). Moreover, it turns out that problem (4) is not tractable:

Proposition 1

([69]) The problem of checking the nonnegativity of functions of type (7) is NP-hard for $m\ge 2$.
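To make the gap between ‘nonnegative’ and ‘PSD’ concrete for $m=2$, here is a small sketch using the swap operator, a standard textbook example that is not taken from the paper. On product vectors it evaluates to $|x^{\dagger }y|^2\ge 0$, yet it has eigenvalue $-1$ on the antisymmetric vector, so its matrix is indefinite. All names in the code are illustrative.

```python
import math
import random

def random_unit_vector(n, rng):
    """Draw a random complex vector x with x^dagger x = 1."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]

def kron(u, v):
    """Tensor product of two column vectors, stored as flat lists."""
    return [a * b for a in u for b in v]

def quad_form(G, z):
    """Real part of z^dagger G z (the form is real for Hermitian G)."""
    n = len(z)
    total = sum(complex(z[i]).conjugate() * G[i][j] * z[j]
                for i in range(n) for j in range(n))
    return total.real

# Swap operator on C^2 (x) C^2: SWAP (x kron y) = y kron x.
SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]

rng = random.Random(1)
samples = [quad_form(SWAP, kron(random_unit_vector(2, rng),
                                random_unit_vector(2, rng)))
           for _ in range(1000)]
min_on_products = min(samples)  # nonnegative: equals |x^dagger y|^2

# Antisymmetric unit vector (not a product): eigenvector with eigenvalue -1.
s = 1 / math.sqrt(2)
anti = [0, s, -s, 0]
value_on_anti = quad_form(SWAP, anti)
print(min_on_products, value_on_anti)
```

Sampling only illustrates nonnegativity on product vectors; the identity $(x\otimes y)^{\dagger }(y\otimes x)=|x^{\dagger }y|^2$ is what proves it.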

What to do? As discussed previously, we could change the meaning of ‘being nonnegative’ by considering a subset $\varSigma ^{\ge }\subsetneq {\mathscr {L}}^{\ge } $ for which the membership problem, and thus (4), is in P. For functions of type (7), we can extend the notion of nonnegativity that holds for a single particle to $m>1$ particles:

$$\begin{aligned} \varSigma ^{\ge }:=\{g(x_1,\ldots ,x_m)=(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j): G\ge 0\}. \end{aligned}$$

(10)

That is, the function is ‘nonnegative’ whenever G is PSD. Note that $\varSigma ^{\ge }$ is the so-called cone of Hermitian sum-of-squares polynomials (see Sect. A.4), and that in $\varSigma ^{\ge }$ the nonnegative constant functions take the form $g(x_1,\ldots ,x_m)=c (\otimes _{j=1}^m x_j)^\dagger I (\otimes _{j=1}^m x_j)$ with $c\ge 0$.

Now, consider any set of desirable gambles ${\mathcal {C}}$ satisfying B0–B2 with the given definition of (10); this results in an algorithmic rationality theory that is precisely QT. In other words, from the algorithmic rationality axioms and the given definition of (10), we can derive the first postulate of QT (see for instance Postulate 1 in [70, p. 110]):

Associated to any isolated physical system is a complex vector space with inner product (that is, a Hilbert space) known as the state space of the system. The system is completely described by its density operator, which is a positive operator $\rho $ with trace one, acting on the state space of the system.

Indeed, although the possibility space $\varOmega $ is infinite (e.g., the ‘directions’ of the particle’s spins), the vector space of gambles $ {\mathscr {L}}_R$ is finite dimensional: any polynomial $(\otimes _{j=1}^m x_j)^{\dagger }G(\otimes _{j=1}^m x_j) \in {\mathscr {L}}_R$ can then be written as the inner product of a vector of complex coefficients, coming from the matrix G, and a vector of complex monomials: the elements of the matrix $(\otimes _{j=1}^m x_j) (\otimes _{j=1}^m x_j)^{\dagger }$ that constitute the basis of the vector space $ {\mathscr {L}}_R$. Therefore the dual space $ {\mathscr {L}}^*_R$ is finite dimensional too and corresponds to the space of linear operators ${\tilde{L}}: {\mathbb {C}}\rightarrow {\mathbb {C}}$, whose basis is given by the elements of the matrix ${\tilde{L}}((\otimes _{j=1}^m x_j) (\otimes _{j=1}^m x_j)^{\dagger })$ (where ${\tilde{L}}$ is applied component-wise to $(\otimes _{j=1}^m x_j) (\otimes _{j=1}^m x_j)^{\dagger }$).

That said, let ${\mathscr {G}}$ be a finite set of assessments, and ${\mathcal {K}}$ the deductive closure as defined by B1; it is not difficult to prove that the dual of ${\mathcal {K}}$ is

$$\begin{aligned} {\mathcal {Q}}&=\{ \rho \in {\mathcal {S}} \mid Tr(G \rho ) \ge 0,~ ~\forall g \in {\mathscr {G}}\}, \end{aligned}$$

(11)

where ${\mathcal {S}}=\{ \rho \in {\mathscr {H}}^{n\times n} \mid \rho \ge 0,~~Tr(\rho )=1\}$ is the set of all density matrices. As before, whenever the set ${\mathcal {C}}$ representing Alice’s beliefs about the experiment is coherent, Eq. (11) means that desirability implies nonnegative ‘expected value’ for all models in ${\mathcal {Q}}$. Note that in QT the expectation of g is $Tr(G \rho )$. This follows by Born’s rule, a law giving the probability that a measurement on a quantum system will yield a given result.

The agreement with Born’s rule is an important constraint in any alternative axiomatisation of QT. This is also the case of our theory, but in the sense that Born’s rule can be derived from it. In fact, in the view of a density matrix as a dual operator, $\rho $ is formally equal to

$$\begin{aligned} \rho ={\tilde{L}}\left( (\otimes _{j=1}^m x_j)(\otimes _{j=1}^m x_j)^\dagger \right) . \end{aligned}$$

(12)

Example 1

Consider the case $n=m=2$, then

$$\begin{aligned} {\tilde{L}}\left( (x_1 \otimes x_2)^{\dagger } G (x_1 \otimes x_2)\right) =Tr\left( G {\tilde{L}}\left( (\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger \right) \right) ; \end{aligned}$$

this follows by the linearity of the trace operator. The expression ${\tilde{L}}\left( (\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger \right) $ means that the operator ${\tilde{L}}$ is applied component-wise to the elements of the matrix $(\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger $:

$$\begin{aligned} {\tilde{L}}\left( (\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger \right) = {\tilde{L}}\left( \left[ {\begin{matrix}x_{11} x_{11}^{\dagger } x_{21} x_{21}^{\dagger } &{} x_{11}^{\dagger } x_{12} x_{21} x_{21}^{\dagger } &{} x_{11} x_{11}^{\dagger } x_{21}^{\dagger } x_{22} &{} x_{11}^{\dagger } x_{12} x_{21}^{\dagger } x_{22}\\ x_{11} x_{12}^{\dagger } x_{21} x_{21}^{\dagger } &{} x_{12} x_{12}^{\dagger } x_{21} x_{21}^{\dagger } &{} x_{11} x_{12}^{\dagger } x_{21}^{\dagger } x_{22} &{} x_{12} x_{12}^{\dagger } x_{21}^{\dagger } x_{22}\\ x_{11} x_{11}^{\dagger } x_{21} x_{22}^{\dagger } &{} x_{11}^{\dagger } x_{12} x_{21} x_{22}^{\dagger } &{} x_{11} x_{11}^{\dagger } x_{22} x_{22}^{\dagger } &{} x_{11}^{\dagger } x_{12} x_{22} x_{22}^{\dagger }\\ x_{11} x_{12}^{\dagger } x_{21} x_{22}^{\dagger } &{} x_{12} x_{12}^{\dagger } x_{21} x_{22}^{\dagger } &{} x_{11} x_{12}^{\dagger } x_{22} x_{22}^{\dagger } &{} x_{12} x_{12}^{\dagger } x_{22} x_{22}^{\dagger }\end{matrix}}\right] \right) , \end{aligned}$$

(13)

where the monomials inside the above matrix constitute the basis of ${\mathscr {L}}_R$ and ${\tilde{L}}:{\mathbb {C}}\rightarrow {\mathbb {C}}$, so:

$$\begin{aligned} \rho := {\tilde{L}}\left( (\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger \right) =\left[ \begin{matrix} \rho _{11} &{} \rho _{12} &{} \rho _{13} &{} \rho _{14}\\ \rho _{12}^{\dagger } &{} \rho _{22} &{} \rho _{23} &{} \rho _{24}\\ \rho _{13}^{\dagger } &{} \rho _{23}^{\dagger } &{} \rho _{33} &{} \rho _{34}\\ \rho _{14}^{\dagger } &{} \rho _{24}^{\dagger } &{} \rho _{34}^{\dagger } &{} \rho _{44} \end{matrix}\right] , \end{aligned}$$

(14)

$\rho _{11}={\tilde{L}}(x_{11} x_{11}^{\dagger } x_{21} x_{21}^{\dagger })\in {\mathbb {C}}$, $\rho _{12}={\tilde{L}}(x_{11}^{\dagger } x_{12} x_{21} x_{21}^{\dagger } )\in {\mathbb {C}}$, etcetera.

Hence, when a projection-valued measurement characterised by the projectors $\varPi _1,\ldots ,\varPi _n$ is considered, it holds that

$$\begin{aligned} {\tilde{L}}( (\otimes _{j=1}^m x_j)^\dagger \varPi _i (\otimes _{j=1}^m x_j))=Tr(\varPi _i {\tilde{L}}((\otimes _{j=1}^m x_j)(\otimes _{j=1}^m x_j)^\dagger ))=Tr(\varPi _i \rho ). \end{aligned}$$

Since $\varPi _i\ge 0$ and the polynomials $(\otimes _{j=1}^m x_j)^\dagger \varPi _i (\otimes _{j=1}^m x_j)$ for $i=1,\ldots ,n$ form a partition of unity, i.e.:

$$\begin{aligned} \sum _{i=1}^n (\otimes _{j=1}^m x_j)^\dagger \varPi _i (\otimes _{j=1}^m x_j)= (\otimes _{j=1}^m x_j)^\dagger I (\otimes _{j=1}^m x_j)=1, \end{aligned}$$

we have that

$$\begin{aligned} Tr(\varPi _i \rho )\in [0,1] \text { and } \sum _{i=1}^n Tr(\varPi _i \rho )=1, \end{aligned}$$

which is Born’s rule.
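The two properties above can be checked exactly on a small case. The sketch below assumes a hypothetical single-particle density matrix `rho` and a projective measurement along the vectors $(1,1)/\sqrt{2}$ and $(1,-1)/\sqrt{2}$; neither is taken from the paper.

```python
from fractions import Fraction as F

def trace_product(A, B):
    """Tr(A B) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

# Hypothetical density matrix: Hermitian, positive semi-definite, trace one.
rho = [[F(3, 4), F(1, 4)],
       [F(1, 4), F(1, 4)]]

# Projectors onto (1, 1)/sqrt(2) and (1, -1)/sqrt(2); they sum to the identity.
projectors = [
    [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]],
    [[F(1, 2), F(-1, 2)], [F(-1, 2), F(1, 2)]],
]

probabilities = [trace_product(P, rho) for P in projectors]
print(probabilities)  # [Fraction(3, 4), Fraction(1, 4)]
```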

Remark 3

(Discrete vs. continuous space probability) Quantum measurements are discrete: when we perform a measurement, we observe a detection along one of the directions $ \varPi _i$. This phenomenon of quantisation is one of the major differences between quantum and classical physics. We took it into account in the choice of the framework, the possibility space being (only) the ‘directions’ of the particle’s spins and the measurement apparatus sensing only certain fixed ‘directions’ ($x^\dagger \varPi _i x=x^\dagger v_iv_i^{\dagger } x$ is a function of two ‘directions’ x and $v_i$). Despite its centrality, we want however to point out that quantisation is not the source of Bell-like inequalities and entanglement. As said before, this is because ‘quantum weirdness’ is intrinsic to any theory of algorithmic rationality as above, and is hence not confined to QT only.

It is often claimed that QT includes classical probability theory (CPT) as a special case, or better that QT includes discrete-space CPT.^{Footnote 17} However, as the possibility space $\varOmega $ is infinite (e.g., the ‘directions’ of the particle’s spins), in this paper when we speak about CPT (and compare it with QT), we mean continuous-space classical probability theory (in the complex numbers). Hence again, since both B1,B2 and A1,A2 are the same logical postulates parametrised by the appropriate meaning of ‘being negative/nonnegative’, the only axiom truly separating (continuous-space) classical probability theory from the quantum one is B0 (with the specific form of (10)), thus implementing the requirement of computational efficiency.

In other words, we claim that QT is ‘easier’ than CPT because, once the appropriate possibility space, observables and queries are specified, evaluating the consistency of the theory is NP-hard for CPT. In QT, we realise this clearly when we try to address the question of whether or not an experimentally generated state is entangled. We will discuss in Sect. 4.3 that determining entanglement of a general state is equivalent to proving the nonnegativity of a polynomial that, as we discussed in Proposition 1, is NP-hard. In fact, we can reformulate the entanglement witness theorem as the clash between the classical notion of coherence and A-coherence (see Theorem 2).

Remark 4

(Truncated moment matrices vs. density matrices) In a single particle system of dimension n, $\rho ={\tilde{L}}(x x^{\dagger })$. In such case, $ \rho $ can be interpreted as a truncated moment matrix, i.e., there exists a probability distribution on the complex vector $x\in \varOmega $ such that

$$\begin{aligned} \rho =\int _{x\in \varOmega } xx^{\dagger } d\mu (x). \end{aligned}$$

(15)

In fact, consider the eigenvalue-eigenvector decomposition of the density matrix:

$$\begin{aligned} \rho =\sum \limits _{i=1}^n \lambda _i v_iv_i^{\dagger }, \end{aligned}$$

with $\lambda _i\ge 0$ and $v_i \in {\mathbb {C}}^{n}$ being orthonormal. We can define the probability distribution

$$\begin{aligned} \mu (x)= \sum \limits _{i=1}^n \lambda _i \delta _{v_i}(x), \end{aligned}$$

where $\delta _{v_i}$ is an atomic charge (Dirac’s delta) on $v_i$. Then it is immediate to verify that

$$\begin{aligned} \int _{x\in \varOmega } x x^{\dagger }d\mu (x)=\sum \limits _{i=1}^n \lambda _i v_iv_i^{\dagger }=\rho . \end{aligned}$$

In Sect. 4.4, we will extend this result to separable states. Note also that a truncated moment matrix does not uniquely define a probability distribution, i.e., for a given $\rho $ there may exist two probability distributions $\mu _1(x)\ne \mu _2(x)$ such that

$$\begin{aligned} \rho =\int _{x\in \varOmega } xx^{\dagger } d\mu _1(x)=\int _{x\in \varOmega } xx^{\dagger } d\mu _2(x). \end{aligned}$$

This means that, if we interpret $\rho $ as a truncated moment matrix, and thus as defining via (15) a closed convex set of probabilities (more precisely, charges), QT is a theory of imprecise probability [53]. We will discuss this topic further in Sect. A.3. In fact, Karr [71] has proved that the set of probabilities that are feasible for the truncated moment constraint, e.g., $\rho ={\tilde{L}}(x x^{\dagger })$, is convex and compact with respect to the weak$^*$-topology. Moreover, the extreme points of this set are probabilities that have a finite number of distinct points of support (e.g., they are finite mixtures of Dirac’s deltas). A similar characterisation for POVM measurements is discussed in the QT context in [72].
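The non-uniqueness of the distribution behind a truncated moment matrix can be seen on the simplest case, $\rho =I/2$ for $n=2$. The sketch below builds the moment matrix of two different mixtures of Dirac’s deltas (an illustrative choice, not from the paper) and checks that both give the same $\rho $.

```python
import math

def moment_matrix(weights, points):
    """Sum_i w_i x_i x_i^dagger for a finite mixture of Dirac deltas."""
    n = len(points[0])
    return [[sum(w * x[i] * complex(x[j]).conjugate()
                 for w, x in zip(weights, points))
             for j in range(n)] for i in range(n)]

s = 1 / math.sqrt(2)

# mu_1: equal weights on the canonical basis vectors.
rho1 = moment_matrix([0.5, 0.5], [[1, 0], [0, 1]])
# mu_2: equal weights on (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
rho2 = moment_matrix([0.5, 0.5], [[s, s], [s, -s]])

max_difference = max(abs(rho1[i][j] - rho2[i][j])
                     for i in range(2) for j in range(2))
print(rho1, rho2, max_difference)
```

Both mixtures are atomic with two support points, in line with the characterisation of extreme points quoted above.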

The case of a many-particle system is discussed in the next sections.

4.3 Entanglement

Entanglement is usually presented as a characteristic of QT. In this section we are going to show that it is actually an immediate consequence of algorithmic rationality.

To illustrate the emergence of entanglement from A-coherence, we verify that the set of desirable gambles whose dual is an entangled density matrix $\rho _{e}$ includes a negative gamble that is not in $\varSigma ^{<} $, and thus, although being logically coherent, it cannot be given a classical probabilistic interpretation.

In what follows we focus only on bipartite systems $\varOmega _A \times \varOmega _B$, with $n=m=2$. The results are nevertheless general.

Let $(x,y) \in \varOmega _A \times \varOmega _B$, where $x=[x_1,x_2]^T$ and $y=[y_1,y_2]^T$. We aim at showing that there exists a gamble $h(x,y)=(x \otimes y)^{\dagger } H (x \otimes y) $ satisfying:

$$\begin{aligned} \begin{aligned} Tr(H \rho _{e})&\ge 0 \text { and }\\ h(x,y)=(x \otimes y)^{\dagger } H (x \otimes y)&< 0 \text { for all } (x,y)\in \varOmega _A\times \varOmega _B.\\ \end{aligned} \end{aligned}$$

(16)

The first inequality says that h is desirable in ${\mathcal {T}}^\star $. That is, h is a gamble desirable to Alice whose beliefs are represented by $\rho _{e}$. The second inequality says that h is negative and, therefore, leads to a sure loss in ${\mathcal {T}}$. By B0–B2, the inequalities in (16) imply that H must be an indefinite Hermitian matrix.

Assume that $n=m=2$ and consider the entangled density matrix:

$$\begin{aligned} \rho _{e}=\frac{1}{2}\begin{bmatrix} 1 &{} 0 &{} 0 &{}1\\ 0 &{} 0 &{} 0 &{}0\\ 0 &{} 0 &{} 0 &{}0\\ 1 &{} 0 &{} 0 &{}1\\ \end{bmatrix}, \end{aligned}$$

and the Hermitian matrix:

$$\begin{aligned} H=\left[ \begin{matrix}0.0 &{} 0.0 &{} 0.0 &{} 1.0\\ 0.0 &{} -2.0 &{} 1.0 &{} 0.0\\ 0.0 &{} 1.0 &{} -2.0 &{} 0.0\\ 1.0 &{} 0.0 &{} 0.0 &{} 0.0\end{matrix}\right] . \end{aligned}$$

This matrix is indefinite (its eigenvalues are $\{1, -1, -1, -3\}$) and is such that $Tr(H\rho _{e})=1$. Since $Tr(H\rho _{e})\ge 0$, the gamble

$$\begin{aligned} \begin{aligned} (x \otimes y)^{\dagger } H (x \otimes y)&= - 2 x_{1} x_1^{\dagger } y_{2} y_2^{\dagger } + x_{1} x_2^{\dagger } y_{1} y_2^{\dagger } + x_{1} x_2^{\dagger } y_1^{\dagger } y_{2} + x_1^{\dagger } x_{2} y_{1} y_2^{\dagger } \\&\qquad + x_1^{\dagger } x_{2} y_1^{\dagger } y_{2} - 2 x_{2} x_2^{\dagger } y_{1} y_1^{\dagger }, \end{aligned} \end{aligned}$$

(17)

is desirable for Alice in ${\mathcal {T}}^\star $.
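The two numerical claims about H can be verified exactly without an eigenvalue routine, by exhibiting one eigenvector per eigenvalue. The eigenvectors listed below were worked out by hand for this check; the helper names are illustrative.

```python
from fractions import Fraction as F

H = [[0, 0, 0, 1],
     [0, -2, 1, 0],
     [0, 1, -2, 0],
     [1, 0, 0, 0]]

rho_e = [[F(1, 2), 0, 0, F(1, 2)],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [F(1, 2), 0, 0, F(1, 2)]]

def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def trace_product(A, B):
    """Tr(A B) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

# (eigenvalue, unnormalised eigenvector); the four vectors are orthogonal.
eigenpairs = [(1, [1, 0, 0, 1]),
              (-1, [1, 0, 0, -1]),
              (-1, [0, 1, 1, 0]),
              (-3, [0, 1, -1, 0])]

eigen_ok = all(matvec(H, v) == [lam * c for c in v] for lam, v in eigenpairs)
trace_value = trace_product(H, rho_e)
print(eigen_ok, trace_value)  # True 1
```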

Let $x_i=x_{ia}+\iota x_{ib}$ and $y_i=y_{ia}+\iota y_{ib}$ with $x_{ia},x_{ib},y_{ia},y_{ib}\in {\mathbb {R}}$, for $i=1,2$, denote the real and imaginary components of x, y. Then

$$\begin{aligned} \begin{aligned} (x \otimes y)^{\dagger } H (x \otimes y)&=- 2 x_{1a}^{2} y_{2a}^{2} - 2 x_{1a}^{2} y_{2b}^{2} + 4 x_{1a} x_{2a} y_{1a} y_{2a} + 4 x_{1a} x_{2a} y_{1b} y_{2b} \\&\quad - 2 x_{1b}^{2} y_{2a}^{2} - 2 x_{1b}^{2} y_{2b}^{2} + 4 x_{1b} x_{2b} y_{1a} y_{2a} + 4 x_{1b} x_{2b} y_{1b} y_{2b}\\&\quad - 2 x_{2a}^{2} y_{1a}^{2} - 2 x_{2a}^{2} y_{1b}^{2} - 2 x_{2b}^{2} y_{1a}^{2} - 2 x_{2b}^{2} y_{1b}^{2}\\&=-(\sqrt{2}x_{1a}y_{2a}-\sqrt{2}x_{2a}y_{1a})^2-(\sqrt{2}x_{1a}y_{2b}-\sqrt{2}x_{2a}y_{1b})^2\\&\quad -(\sqrt{2}x_{1b}y_{2b}-\sqrt{2}x_{2b}y_{1b})^2 -(\sqrt{2}x_{1b}y_{2a}-\sqrt{2}x_{2b}y_{1a})^2\le 0. \end{aligned} \end{aligned}$$

(18)

Equality is attained on some product states (for instance at $x=y=[1,0]^T$), so h is nonpositive rather than strictly negative. Since $Tr(H\rho _{e})=1$, replacing H by $H-\epsilon I$ with $0<\epsilon \le 1$ keeps the gamble desirable and yields the strict inequality required in (16).

This is the essence of the quantum puzzle: ${\mathcal {C}}$ is A-coherent but (Theorem 1) there is no ${\mathscr {P}}$ associated to it and therefore, from the point of view of Isaac, who holds a classical probabilistic interpretation, it is not coherent: in any classical description of the composite quantum system, x and y appear to be entangled in a way unusual for classical subsystems.
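The sign of h can also be probed numerically. The sketch below samples random product states, compares the quadratic form with a sum-of-squares grouping, and records the largest sampled value. The fourth square is taken as $(\sqrt{2}x_{1b}y_{2a}-\sqrt{2}x_{2b}y_{1a})^2$, the grouping that reproduces the expanded polynomial term by term; function names are illustrative.

```python
import math
import random

H = [[0, 0, 0, 1],
     [0, -2, 1, 0],
     [0, 1, -2, 0],
     [1, 0, 0, 0]]

def random_unit_vector(n, rng):
    """Draw a random complex vector x with x^dagger x = 1."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]

def h(x, y):
    """(x kron y)^dagger H (x kron y), with x kron y = [x1y1, x1y2, x2y1, x2y2]."""
    z = [a * b for a in x for b in y]
    return sum(z[i].conjugate() * H[i][j] * z[j]
               for i in range(4) for j in range(4)).real

def h_sos(x, y):
    """Minus a sum of four squares in the real and imaginary parts."""
    x1a, x1b, x2a, x2b = x[0].real, x[0].imag, x[1].real, x[1].imag
    y1a, y1b, y2a, y2b = y[0].real, y[0].imag, y[1].real, y[1].imag
    return -2 * ((x1a * y2a - x2a * y1a) ** 2 + (x1a * y2b - x2a * y1b) ** 2
                 + (x1b * y2b - x2b * y1b) ** 2 + (x1b * y2a - x2b * y1a) ** 2)

rng = random.Random(2)
pairs = [(random_unit_vector(2, rng), random_unit_vector(2, rng))
         for _ in range(2000)]
max_gap = max(abs(h(x, y) - h_sos(x, y)) for x, y in pairs)
max_h = max(h(x, y) for x, y in pairs)

e1 = [1 + 0j, 0j]
h_at_e1 = h(e1, e1)  # the value 0 is reached at x = y = [1, 0]^T
print(max_gap, max_h, h_at_e1)
```

Random sampling is only a sanity check; it is the sum-of-squares identity that certifies the sign on the whole of $\varOmega _A\times \varOmega _B$.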

As previously mentioned, there are two possible ways out from this impasse: to claim the existence of either non-classical evaluation functionals or of negative probabilities. Let us examine them in turn.

(1)
Existence of non-classical evaluation functionals: From an informal betting perspective, the effect of a quantum experiment on h(x, y) is to evaluate this polynomial to return the payoff for Alice. By Theorem 1, there is no compatible classical evaluation functional, and thus in particular no values $x,y\in \varOmega _A \times \varOmega _B$ such that $ h(x,y)=1$. Hence, if we adopt this point of view, we have to find another, non-classical, explanation for $ h(x,y)=1$. The following evaluation functional, denoted as $ev(\cdot )$, may do the job:
$$\begin{aligned} \text {ev}\left( \begin{bmatrix} x_1y_1\\ x_2y_1\\ x_1y_2\\ x_2y_2\\ \end{bmatrix}\right) =\begin{bmatrix} \tfrac{\sqrt{2}}{2}\\ 0\\ 0\\ \tfrac{\sqrt{2}}{2}\\ \end{bmatrix},~\text {which implies}~ \text {ev}\left( (x \otimes y)^{\dagger } H (x \otimes y)\right) =1. \end{aligned}$$
Note that $x_1y_1=\tfrac{\sqrt{2}}{2}$ and $x_2y_1=0$ together imply that $x_2=0$, which contradicts $x_2y_2=\tfrac{\sqrt{2}}{2}$. Similarly, $x_2y_2=\tfrac{\sqrt{2}}{2}$ and $x_1y_2=0$ together imply that $x_1=0$, which contradicts $x_1y_1=\tfrac{\sqrt{2}}{2}$. Hence, as expected, the above evaluation functional is non-classical. It amounts to assigning a value to the products $x_iy_j$ but not to the single components of x and y separately. Quoting Holevo in [68, Supplement 3.4]:

entangled states are holistic entities in which the single components only exist virtually.
(2)
Existence of negative probabilities: Negative probabilities are not an intrinsic characteristic of QT. They appear whenever one attempts to explain QT ‘classically’ by looking at the space of charges on $\varOmega $. To see this, consider $\rho _e$, and assume that, based on (12), one calculates:
$$\begin{aligned} \int \begin{bmatrix}x_{1} x_1^{\dagger } y_{1} y_1^{\dagger } &{} x_1^{\dagger } x_{2} y_{1} y_1^{\dagger } &{} x_{1} x_1^{\dagger } y_1^{\dagger } y_{2} &{} x_1^{\dagger } x_{2} y_1^{\dagger } y_{2}\\ x_{1} x_2^{\dagger } y_{1} y_1^{\dagger } &{} x_{2} x_2^{\dagger } y_{1} y_1^{\dagger } &{} x_{1} x_2^{\dagger } y_1^{\dagger } y_{2} &{} x_{2} x_2^{\dagger } y_1^{\dagger } y_{2}\\ x_{1} x_1^{\dagger } y_{1} y_2^{\dagger } &{} x_1^{\dagger } x_{2} y_{1} y_2^{\dagger } &{} x_{1} x_1^{\dagger } y_{2} y_2^{\dagger } &{} x_1^{\dagger } x_{2} y_{2} y_2^{\dagger }\\ x_{1} x_2^{\dagger } y_{1} y_2^{\dagger } &{} x_{2} x_2^{\dagger } y_{1} y_2^{\dagger } &{} x_{1} x_2^{\dagger } y_{2} y_2^{\dagger } &{} x_{2} x_2^{\dagger } y_{2} y_2^{\dagger }\end{bmatrix} d \mu (x,y)=\frac{1}{2}\begin{bmatrix} 1 &{} 0 &{} 0 &{}1\\ 0 &{} 0 &{} 0 &{}0\\ 0 &{} 0 &{} 0 &{}0\\ 1 &{} 0 &{} 0 &{}1\\ \end{bmatrix}. \end{aligned}$$
(19)
Because of Theorem 1, there is no probability charge $\mu $ satisfying these moment constraints, the only compatible being quasi-probabilities. Table 1 reports the nine components and corresponding weights of one of them:
$$\begin{aligned} \mu (x,y)=\sum \limits _{i=1}^{9}w_i\delta _{\{(x^{(i)},y^{(i)})\}}(x,y) ~~\text { with }~~ (w_i,x^{(i)},y^{(i)}) ~~\text { as in Table}~1. \end{aligned}$$
(20)
Note that some of the weights are negative but $\sum _{i=1}^{9}w_i=1$, meaning that we have an affine combination of atomic charges (Dirac’s deltas). Consider for instance the first monomial $x_{1} x_1^{\dagger } y_{1} y_1^{\dagger }$ in (12), its expectation w.r.t. the above charge is
$$\begin{aligned} \begin{aligned}&\int x_{1} x_1^{\dagger } y_{1} y_1^{\dagger } \left( \sum _{i=1}^{9}w_i\delta _{\{(x^{(i)},y^{(i)})\}}(x,y)\right) dxdy=\sum _{i=1}^{9}w_i x^{(i)}_{1} {x^{(i)}_1}^{\dagger } y^{(i)}_{1} {y^{(i)}_1}^{\dagger }\\&\quad = 0.4805 (-0.0963 - 0.6352\iota )(-0.0963 + 0.6352\iota )(-0.3727 \\&\qquad - 0.3899\iota )(-0.3727 + 0.3899\iota )\\&\qquad + 0.7459 (0.251 - 0.9665\iota )(0.251 + 0.9665\iota )(-0.1628 \\&\qquad + 0.561\iota )(-0.1628 - 0.561\iota )\\&\qquad +\dots \\&\qquad + 0.1755(-0.1255 - 0.3078\iota )(-0.1255 + 0.3078\iota )(0.0933 \\&\qquad - 0.4588\iota )(0.0933 + 0.4588\iota )=\frac{1}{2}. \end{aligned} \end{aligned}$$

Table 1 Weights for the charge in (12)


The charge described in Table 1 is one among the many that satisfy (12) and has been derived numerically. Explicit procedures for constructing such negative-probability representations have been developed in [73,74,75,76].

Again, we want to stress that the two above paradoxical interpretations are a consequence of Theorem 1, and therefore can emerge when considering any instance of a theory of A-coherence in which the hypotheses of this result hold.
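The non-classical evaluation functional of item (1) can be checked in a few lines. A genuine product $x\otimes y$ always satisfies $(x_1y_1)(x_2y_2)-(x_1y_2)(x_2y_1)=0$; the sketch below shows that the values assigned by $ev(\cdot )$ give payoff 1 while violating this identity. Variable names are illustrative.

```python
import math

H = [[0, 0, 0, 1],
     [0, -2, 1, 0],
     [0, 1, -2, 0],
     [1, 0, 0, 0]]

s = math.sqrt(2) / 2
# Values assigned by ev to the products, in the order x1y1, x2y1, x1y2, x2y2.
ev = [s, 0.0, 0.0, s]

# Payoff assigned to h: the quadratic form evaluated on the assigned values.
payoff = sum(ev[i] * H[i][j] * ev[j] for i in range(4) for j in range(4))

# Product-state identity: (x1y1)(x2y2) - (x1y2)(x2y1) must vanish.
product_defect = ev[0] * ev[3] - ev[2] * ev[1]
print(payoff, product_defect)
```

A nonzero `product_defect` is another way of stating the contradiction derived in item (1): no single pair $(x,y)$ produces these four products.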

4.4 Entanglement Witness

Do quantum and classical probability sometimes agree? Yes, they do, when the density matrices $\rho $ at play are such that Eq. (16) does not hold, which is in particular the case for separable density matrices. We make this claim precise by providing a link between Eq. (16) and the entanglement witness theorem [77, 78].

We first report the definition of entanglement witness [79, Sect. 6.3.1]:

Definition 2

(Entanglement witness) A Hermitian operator $W \in {\mathscr {H}}^{n \times n}$, with $n=n_1 n_2$, is an entanglement witness if and only if W is not a positive operator but $(x_1 \otimes x_2)^{\dagger } W (x_1 \otimes x_2) \ge 0$ for all vectors $(x_1,x_2)\in \varOmega _1\times \varOmega _2$.^{Footnote 18}

The next well-known result (see, e.g., [79, Theorem 6.39, Corollary 6.40]) provides a characterisation of entanglement and separable states in terms of entanglement witness.

Proposition 2

A state $\rho _e$ is entangled if and only if there exists an entanglement witness W such that $Tr(\rho _e W ) < 0$. A state is separable if and only if $Tr(\rho _e W ) \ge 0$ for all entanglement witnesses W.

Assume that W is an entanglement witness for the entangled density matrix $\rho _e$ and consider $W'=-W$. By Definition 2 and Proposition 2, it follows that

$$\begin{aligned} Tr(\rho _e W' ) > 0 \text { and } (x_1 \otimes x_2)^{\dagger } W' (x_1 \otimes x_2) \le 0. \end{aligned}$$

(21)

The first inequality states that the gamble $(x_1 \otimes x_2)^{\dagger } W' (x_1 \otimes x_2)$ is strictly desirable for Alice (in theory ${\mathcal {T}}^\star $) given her belief $\rho _e$. Since the set of desirable gambles (B1) associated to $\rho _e$ is closed, there exists $\epsilon >0$ such that $W''=W'-\epsilon I$ is still desirable (any $0<\epsilon \le Tr(\rho _e W')$ works, because $Tr(\rho _e)=1$), i.e., $Tr(\rho _e W'' )\ge 0$ and

$$\begin{aligned} (x_1 \otimes x_2)^{\dagger } W'' (x_1 \otimes x_2) = (x_1 \otimes x_2)^{\dagger } W' (x_1 \otimes x_2) - \epsilon <0, \end{aligned}$$

where we have exploited that $(x_1 \otimes x_2)^{\dagger } \epsilon I (x_1 \otimes x_2)= \epsilon $. Therefore, after renaming $W''$ as $W'$, (21) is equivalent to

$$\begin{aligned} Tr(\rho _e W' ) \ge 0 \text { and } (x_1 \otimes x_2)^{\dagger } W' (x_1 \otimes x_2) <0, \end{aligned}$$

(22)

which is the same as (16).

Hence, by Theorem 1, we can equivalently formulate the entanglement witness theorem as an arbitrage/Dutch book:

Theorem 2

Let ${\mathcal {C}}=\{g(x_1,\ldots ,x_m)=(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j)\mid Tr(G{\tilde{\rho }})\ge 0\}$ be the set of desirable gambles corresponding to some density matrix ${\tilde{\rho }}$. The following claims are equivalent:

1.
${\tilde{\rho }}$ is entangled;
2.
${{\,\mathrm{posi}\,}}({\mathcal {C}}\cup {\mathscr {L}}_R^{\ge })$ is not coherent in ${\mathcal {T}}$.

This result provides another view of the entanglement witness theorem in light of A-coherence. In particular, it tells us that the existence of a witness satisfying Eq. (21) boils down to the disagreement on rationality (coherence) between Isaac’s classical probabilistic interpretation and Alice’s theory ${\mathcal {T}}^\star $, and therefore that whenever they agree it means that $\rho _e$ is separable. This connection explains why the problem of characterising entanglement is hard in QT: it amounts to proving the negativity of a function, which is NP-hard. We can also prove the following

Corollary 1

Let ${\tilde{\rho }}$ be separable, then ${\tilde{\rho }}$ is a truncated moment matrix.

In other words, when ${\tilde{\rho }}$ is separable, Isaac’s classical view and Alice’s theory ${\mathcal {T}}^\star $ of rationality agree, and therefore we can give ${\tilde{\rho }}$ a fully classical probabilistic interpretation by regarding it as a truncated moment matrix.
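Proposition 2 and Corollary 1 can be illustrated with the matrices of Sect. 4.3. The sketch below takes $W=-H$ as a candidate witness (an assumption made here: $-H$ is not positive and is nonnegative on product states), checks that it detects $\rho _e$, and checks that it does not detect a separable state built as a finite mixture of product states, i.e., as a truncated moment matrix. The weights and sampled points are hypothetical.

```python
import math
import random

H = [[0, 0, 0, 1],
     [0, -2, 1, 0],
     [0, 1, -2, 0],
     [1, 0, 0, 0]]
W = [[-H[i][j] for j in range(4)] for i in range(4)]  # candidate witness

rho_e = [[0.5, 0, 0, 0.5],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [0.5, 0, 0, 0.5]]

def random_unit_vector(n, rng):
    """Draw a random complex vector x with x^dagger x = 1."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]

def kron(u, v):
    """Tensor product of two column vectors, stored as flat lists."""
    return [a * b for a in u for b in v]

def trace_product(A, B):
    """Tr(A B) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

rng = random.Random(3)
weights = [0.2, 0.3, 0.5]  # hypothetical probability weights
points = [kron(random_unit_vector(2, rng), random_unit_vector(2, rng))
          for _ in weights]
# Separable state as a moment matrix: sum_i w_i z_i z_i^dagger, z_i = x_i kron y_i.
rho_sep = [[sum(w * z[i] * z[j].conjugate() for w, z in zip(weights, points))
            for j in range(4)] for i in range(4)]

tr_entangled = trace_product(W, rho_e)                      # negative: detected
tr_separable = trace_product(W, rho_sep).real               # nonnegative
trace_rho_sep = sum(rho_sep[i][i] for i in range(4)).real   # equals one
print(tr_entangled, tr_separable, trace_rho_sep)
```

For the separable state the trace is a weighted average of the nonnegative values $(x\otimes y)^{\dagger }W(x\otimes y)$, which is why no witness can make it negative.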

5 A Theory of Algorithmic Rationality and Entanglement in the Reals

In this section we are going to present an example of entanglement in an A-coherent theory of probability that is different from QT. For this purpose, we consider two classical coins, which we denote as l (left) and, respectively, r (right), and define

$$\begin{aligned} \begin{bmatrix} \theta _1\\ \theta _2\\ \theta _3\\ 1-\theta _1-\theta _2-\theta _3 \end{bmatrix} =\text {Prob}\begin{bmatrix} H_lH_r\\ T_lH_r\\ H_lT_r\\ T_lT_r\end{bmatrix}, \end{aligned}$$

where $H_i,T_j$ denote the outcome heads and, respectively, tails for the left or right coin. We consider the possibility space

$$\begin{aligned} \varOmega =\left\{ \theta \in {\mathbb {R}}^3: ~ \theta _1,\theta _2,\theta _3\ge 0, ~~1-\theta _1-\theta _2-\theta _3\ge 0 \right\} . \end{aligned}$$

(23)

Note that the following marginal relationships hold:

$$\begin{aligned} \begin{aligned} \theta _{H_l}=\text {Prob}(H_l)=\theta _1+\theta _3,~~\theta _{H_r}=\text {Prob}(H_r)=\theta _1+\theta _2. \end{aligned} \end{aligned}$$

As the space of gambles ${\mathscr {L}}_R$, we consider the set of all polynomials of the unknowns $\theta =[\theta _1,\theta _2,\theta _3]$ of degree 2:^{Footnote 19}

$$\begin{aligned} {\mathscr {L}}_R=\{g(\theta ): g(\theta ) \text { is a degree 2 polynomial}\}. \end{aligned}$$

(24)

For instance, these are two elements of ${\mathscr {L}}_R$:

$$\begin{aligned} g_1(\theta )&=\theta _1^2-\theta _2^2+2\theta _1\theta _3+2\theta _2\theta _3 -\theta _1 -\theta _3, \end{aligned}$$

(25)

$$\begin{aligned} g_2(\theta )&=\theta _1 +\theta _2^2 +3 \theta _3. \end{aligned}$$

(26)

Evaluating the nonnegativity of polynomials in ${\mathscr {L}}_R$ is in general NP-hard. Therefore, Alice may not have the computational resources to enforce full rationality, A0–A2, or, equivalently, to solve (4).

However, she can use a quick algorithm to prove a sufficient condition for a polynomial in ${\mathscr {L}}_R$ to be nonnegative: a polynomial of $\theta $ is nonnegative in $ \varOmega $ if its coefficients are nonnegative. For instance, under this criterion, Alice can easily verify that $g_2(\theta )$ is nonnegative.

Proposition 3

Let ${\mathcal {C}}\subseteq {\mathscr {L}}_R$ be a set of desirable gambles satisfying B0, B1, with $ {\mathscr {L}}_R$ defined in (24) and $\varSigma ^{\ge }$ defined as follows:^{Footnote 20}

$$\begin{aligned} \begin{aligned} \varSigma ^{\ge }=\Bigg \{ \sum \limits _{\alpha _i\ge 0, \alpha _1+\alpha _2+\alpha _3+\alpha _4\le 2} u_{{\alpha _1\alpha _2\alpha _3\alpha _4}} \theta _{1}^{\alpha _1}\theta _{2}^{\alpha _2}\theta _{3}^{\alpha _3}(1-\theta _1-\theta _2-\theta _3)^{\alpha _{4}} : u_{{\alpha _1\alpha _2\alpha _3\alpha _4}}\in {\mathbb {R}}^{\ge } \Bigg \}, \end{aligned} \end{aligned}$$

(27)

which is the cone of (multivariate) Bernstein polynomials of degree less than or equal to 2. A-coherence of ${\mathcal {C}}$ (or equivalently B2) can be proven in polynomial time by solving a linear programming problem.

Therefore, the definition of nonnegativity (27) gives an algorithmically efficient way to assess rationality: linear programming.
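Membership of a single polynomial in the cone (27) can be sketched as follows. The code rests on one assumption, stated here rather than in the paper: writing $\theta _4=1-\theta _1-\theta _2-\theta _3$, the ten products $\theta _i\theta _j$ ($i\le j$) form a basis of the degree-2 polynomials, and every lower-degree generator of (27) is a nonnegative combination of them (multiply by $\theta _1+\theta _2+\theta _3+\theta _4=1$). Membership then reduces to a sign check on the coefficients in that basis. This addresses only membership in $\varSigma ^{\ge }$, not the linear program for A-coherence of ${\mathcal {C}}$.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def bernstein_coefficients(poly):
    """Coefficients of a degree <= 2 polynomial in the products t_i t_j,
    where t_0, t_1, t_2 are theta_1..theta_3 and t_3 = 1 - t_0 - t_1 - t_2.
    `poly` maps exponent tuples (a1, a2, a3) to coefficients."""
    coeffs = {pair: Fraction(0)
              for pair in combinations_with_replacement(range(4), 2)}
    for exps, c in poly.items():
        c = Fraction(c)
        idx = [i for i, a in enumerate(exps) for _ in range(a)]
        if len(idx) == 2:                      # already degree 2
            coeffs[tuple(sorted(idx))] += c
        elif len(idx) == 1:                    # multiply by (t_0+...+t_3) = 1
            for k in range(4):
                coeffs[tuple(sorted((idx[0], k)))] += c
        elif len(idx) == 0:                    # multiply by (t_0+...+t_3)^2 = 1
            for k in range(4):
                for l in range(4):
                    coeffs[tuple(sorted((k, l)))] += c
        else:
            raise ValueError("polynomial must have degree at most 2")
    return coeffs

def in_cone(poly):
    """True if all coefficients in the t_i t_j basis are nonnegative."""
    return all(c >= 0 for c in bernstein_coefficients(poly).values())

# g_2 from (26) and -g_1 from (25), keyed by exponents of theta_1..theta_3.
g2 = {(1, 0, 0): 1, (0, 2, 0): 1, (0, 0, 1): 3}
neg_g1 = {(2, 0, 0): -1, (0, 2, 0): 1, (1, 0, 1): -2,
          (0, 1, 1): -2, (1, 0, 0): 1, (0, 0, 1): 1}

print(in_cone(g2), in_cone(neg_g1))  # True False
```

Under this assumption $-g_1$ is not certified: its coefficient on $\theta _2\theta _3$ is $-1$, even though $-g_1$ is nonnegative on $\varOmega $. This is the kind of gap between $\varSigma ^{\ge }$ and ${\mathscr {L}}_R^{\ge }$ that the rest of the section exploits.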

Also in this case, we can define the dual operator ${\tilde{L}}$. First of all, observe that the vector of monomials $b(\theta )=[1, \theta _1, \theta _2, \theta _3, \theta _1\theta _2 , \theta _1\theta _3 , \theta _2\theta _3, \theta _1^2, \theta _2^2, \theta _3^2]$ constitutes a basis for $ {\mathscr {L}}_R$ in (24). Therefore, the dual space $ {\mathscr {L}}^*_R$ corresponds to the space of linear operators ${\tilde{L}}: {\mathbb {R}}\rightarrow {\mathbb {R}}$, whose basis is given by the elements of the vector ${\tilde{L}}(b)$, where ${\tilde{L}}$ is applied component-wise to the elements of $b(\theta )$. The dual of an A-coherent set of desirable gambles ${\mathcal {C}}$ is

$$\begin{aligned} {\mathcal {C}}^\circ =\left\{ L \in {\mathsf {S}} \mid L(g)\ge 0, ~\forall g \in {\mathscr {G}}\right\} , \end{aligned}$$

(28)

where ${\mathsf {S}}=\{ {\tilde{L}} \in {\mathscr {L}}_R^* \mid {\tilde{L}}(1)=1,~{\tilde{L}}(g)\ge 0~\forall g \in \varSigma ^{\ge }\}$ is the set of states.

Consider for instance the state:

$$\begin{aligned} \begin{aligned} {\tilde{L}}(\theta _1)&=1/3,&\qquad {\tilde{L}}(\theta _1^2)&=1/3,\\ {\tilde{L}}(\theta _2)&=1/6,&\qquad {\tilde{L}}(\theta _2^2)&=0,\\ {\tilde{L}}(\theta _3)&=1/6,&\qquad {\tilde{L}}(\theta _3^2)&=0,\\ {\tilde{L}}(\theta _1\theta _2)&=0,&\qquad {\tilde{L}}(\theta _1\theta _3)&=0,\\ {\tilde{L}}(\theta _2\theta _3)&=1/6,&\qquad {\tilde{L}}(1)&=1, \end{aligned} \end{aligned}$$

(29)

which, as it can be verified, belongs to ${\mathsf {S}}$. We aim at showing that there exists a gamble $h \in {\mathscr {L}}_R$ such that:

$$\begin{aligned} \begin{aligned} {\tilde{L}}(h)&\ge 0 \text { and }\\ h(\theta )&< 0 \text { for all } \theta \in \varOmega .\\ \end{aligned} \end{aligned}$$

(30)
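The membership of (29) in ${\mathsf {S}}$ can be checked mechanically. The following Python sketch is an illustration rather than part of the original derivation. It assumes that $\varSigma ^{\ge }$ is generated by the Bernstein monomials of degree at most 2 in $\theta _1,\ldots ,\theta _4$ with $\theta _4=1-\theta _1-\theta _2-\theta _3$ (this is consistent with the four-index coefficients $u_{\alpha _1\alpha _2\alpha _3\alpha _4}$, but it is not spelled out in this passage). The names `STATE`, `poly_mul`, `apply_state`, `generator_values`, `is_state` and `value_of_square` are ours.

```python
from fractions import Fraction as F
from itertools import combinations_with_replacement

# State (29): value of L~ on each monomial theta_1^a * theta_2^b * theta_3^c,
# keyed by the exponent tuple (a, b, c).
STATE = {
    (0, 0, 0): F(1),
    (1, 0, 0): F(1, 3), (0, 1, 0): F(1, 6), (0, 0, 1): F(1, 6),
    (2, 0, 0): F(1, 3), (0, 2, 0): F(0),    (0, 0, 2): F(0),
    (1, 1, 0): F(0),    (1, 0, 1): F(0),    (0, 1, 1): F(1, 6),
}

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponents: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, F(0)) + ca * cb
    return out

def apply_state(p):
    """Apply L~ linearly, monomial by monomial."""
    return sum(c * STATE[e] for e, c in p.items())

ONE = {(0, 0, 0): F(1)}
THETA = [
    {(1, 0, 0): F(1)},
    {(0, 1, 0): F(1)},
    {(0, 0, 1): F(1)},
    # ASSUMPTION: theta_4 = 1 - theta_1 - theta_2 - theta_3
    {(0, 0, 0): F(1), (1, 0, 0): F(-1), (0, 1, 0): F(-1), (0, 0, 1): F(-1)},
]

def generator_values():
    """L~ evaluated on every assumed generator of the cone Sigma^>=."""
    values = {"1": apply_state(ONE)}
    for i in range(4):
        values[f"t{i + 1}"] = apply_state(THETA[i])
    for i, j in combinations_with_replacement(range(4), 2):
        values[f"t{i + 1}*t{j + 1}"] = apply_state(poly_mul(THETA[i], THETA[j]))
    return values

def is_state():
    """Normalisation plus nonnegativity on the generators."""
    vals = generator_values()
    return vals["1"] == 1 and all(v >= 0 for v in vals.values())

def value_of_square():
    """L~ applied to the square (theta_2 - theta_3)^2."""
    diff = {(0, 1, 0): F(1), (0, 0, 1): F(-1)}
    return apply_state(poly_mul(diff, diff))

if __name__ == "__main__":
    print(is_state(), value_of_square())
```

Under these assumptions every generator receives a nonnegative value, so (29) passes the test for being a state. At the same time the square $(\theta _2-\theta _3)^2$ is evaluated at $-1/3$, which no classical expectation operator can do.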

Consider the polynomial gamble:

$$\begin{aligned} h(\theta )=g_1(\theta ) -\epsilon , \end{aligned}$$

with $\epsilon >0$ and $g_1$ defined in (26). It can be shown that $h(\theta )\le -\epsilon $ and so the polynomial is negative. However, whenever $\epsilon \le 1/6$, its ‘expectation’ w.r.t. the state (29) is nonnegative, being equal to

$$\begin{aligned} {\tilde{L}}(h)={\tilde{L}}(\theta _1^2)-{\tilde{L}}(\theta _2^2)+{\tilde{L}}(2\theta _1\theta _3)+{\tilde{L}}(2\theta _2\theta _3) -{\tilde{L}}(\theta _1) -{\tilde{L}}(\theta _3)-{\tilde{L}}(\epsilon ) =\frac{1}{6}-\epsilon \ge 0. \end{aligned}$$

Therefore, we have violated an inequality that holds in classical probability ($E(h)\le -\epsilon $ in ${\mathcal {T}}$), although the set of desirable gambles

$$\begin{aligned} {\mathcal {C}}=\{g \in {\mathscr {L}}_R \mid {\tilde{L}}(g)\ge 0\}, \end{aligned}$$

with ${\tilde{L}}$ defined in (29), is logically consistent in ${\mathcal {T}}^\star $ (A-coherent). This is the essence of Bell-type inequalities, and it is the same quantum weirdness that is present in this example.
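The arithmetic behind this violation can be reproduced in a few lines. The sketch below is only illustrative: the coefficients of $g_1$ are read off from the expansion of ${\tilde{L}}(h)$ given above (the definition (26) itself is not repeated here), and $\varOmega $ is assumed to be the simplex $\theta _i\ge 0$, $\theta _1+\theta _2+\theta _3\le 1$. The names `STATE`, `G1`, `expectation_h`, `g1` and `max_g1_on_grid` are ours.

```python
from fractions import Fraction as F

# State (29), keyed by exponents (a, b, c) of theta_1^a theta_2^b theta_3^c.
STATE = {
    (0, 0, 0): F(1),
    (1, 0, 0): F(1, 3), (0, 1, 0): F(1, 6), (0, 0, 1): F(1, 6),
    (2, 0, 0): F(1, 3), (0, 2, 0): F(0),    (0, 0, 2): F(0),
    (1, 1, 0): F(0),    (1, 0, 1): F(0),    (0, 1, 1): F(1, 6),
}

# Coefficients of g_1, read off from the expansion of L~(h) in the text:
# theta_1^2 - theta_2^2 + 2 theta_1 theta_3 + 2 theta_2 theta_3 - theta_1 - theta_3
G1 = {
    (2, 0, 0): F(1), (0, 2, 0): F(-1),
    (1, 0, 1): F(2), (0, 1, 1): F(2),
    (1, 0, 0): F(-1), (0, 0, 1): F(-1),
}

def expectation_h(eps):
    """L~(h) for h = g_1 - eps, using linearity and L~(1) = 1."""
    return sum(c * STATE[e] for e, c in G1.items()) - eps * STATE[(0, 0, 0)]

def g1(t1, t2, t3):
    """Pointwise value of g_1 at theta = (t1, t2, t3)."""
    return sum(c * t1 ** e[0] * t2 ** e[1] * t3 ** e[2] for e, c in G1.items())

def max_g1_on_grid(n=20):
    """Largest value of g_1 on a rational grid of the assumed simplex."""
    best = None
    for i in range(n + 1):
        for j in range(n + 1 - i):
            for k in range(n + 1 - i - j):
                v = g1(F(i, n), F(j, n), F(k, n))
                if best is None or v > best:
                    best = v
    return best

if __name__ == "__main__":
    eps = F(1, 10)
    print("L~(h) =", expectation_h(eps))
    print("max of g_1 on the grid =", max_g1_on_grid())
```

The grid search is a numerical sanity check rather than a proof that $h(\theta )\le -\epsilon $ on all of $\varOmega $. The first function returns $1/6-\epsilon $ exactly, since it works with rational arithmetic.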

It is then possible [80] to set up a thought experiment where two coins are drawn from a bag in the state (29). If we give the left coin to Alice and the right coin to her friend Bob as depicted in Fig. 3, then we can show that after the coins move apart, there are ‘matching’ correlations between the outcomes of their tosses. That is, if Alice measures (through a toss) the bias of one coin, then she can predict with certainty the outcome of the measurement (toss) on the other coin. This correlation cannot be explained classically, because there does not exist any classical correlation model that can violate the Bell-type inequality (30). We have entanglement!

6 Discussions

This paper grew out of our desire to understand QT, in the sense of giving it a meaning clear to us. We have been favoured in this by the fact that we have quite a strong background in the foundations of probability, and QT, mathematically, can be regarded as a generalised theory of probability. But, given this, why is probability generalised in such a way, and what does it mean?

We believe that the present paper, without aiming at reconstructing QT, provides a new way to explain the differences between classical and quantum probability: the algorithmic intractability of classical probability theory contrasted to the polynomial-time complexity of QT.

We have obtained this result in a setting that is more general than QT itself. Our ‘weirdness theorem’ establishes that the weirdness of QT is not exclusive to QT: it appears in any probabilistic theory that is (i) logically consistent and (ii) computationally bounded.^{Footnote 21} QT is just a special case, in the same way as our theory of Bernstein polynomials is another special case.

Yet, our result does talk in particular of QT. And hence it is interesting to know, for one thing, that QT is logically consistent, in the sense that it is a mathematical theory that cannot be proven inconsistent from the inside, by Alice. But it is actually inconsistent from the ‘outside’, i.e., from the point of view of our external observer Isaac, who has unbounded computational capabilities and who, in other words, identifies rationality with the logical consistency of classical probability theory. This is the essence of the clash between classical and quantum physics. It also explains why QT is so peculiarly hard for us to grasp: to classical eyes there is a degree of incoherence in it, and we tend to be able to actually understand only logically consistent theories or ideas.

We believe that such a degree of incoherence is also the reason why we should abandon our attempt to reconcile traditional physics with quantum theory. In our narration, such an abandonment is embodied by the metaphor of a computer that ‘runs the universe’. This is not a new idea at all [86]. However, it is new in the sense that the computer has limits due to the algorithmic nature of its tasks; and this is the reason for the weirdness of QT. Stated differently, what follows from this work, in our view, is that there is room for the idea of a more fundamental reality than classical physics, a reality that is just computational. It is by detaching computation from classical physics, in such a way, that we can finally have a solid grip on the meaning of QT and eventually be able to identify the specific features of our world that ground its use.

In order to hold onto some purely physical intuition, instead, one might want to consider for instance the many-worlds interpretation of QT [13], as many physicists do nowadays. It is certainly a fascinating view of QT, of which we feel the appeal. However, we perceive also the discomfort of having to embrace an interpretation that appears to require an incommensurably huge, and possibly infinite, amount of resources in order to have a universe that branches continuously in multiple copies of itself. Our own algorithmically bounded theory is much more parsimonious. It tells us that we can implement a quantum world in polynomial time, by definition, and such a world would obey the usual axioms of QT: Bob might then as well believe that he is living in one of many worlds, but he would just be wrong. So should we, as entities of our universe, really go as far as postulating the existence of many worlds in the presence of such a more parsimonious alternative? Is there not an Occam’s razor issue at stake here?

Of course one could still criticise our appeal to a more fundamental algorithmic reality on the basis of our postulating the existence of a computer that executes the universe. We have been careful in referring to this as a metaphor, however: in that it need not be a computer that someone has built and in particular there is no need of a programmer. It can simply be another level of reality, which can be interpreted as a computer;^{Footnote 22} in a sense, our picture only suggests that there can be more levels of reality, one nested into the other.

One might also wonder why we humans perceive the quantum-physical clash given that we, as Alice, are subjects within the quantum theory—in our narrative the inconsistencies of QT are observed from Isaac’s point of view, externally to the theory. The explanation that we give to ourselves about this point is that we are used to the illusion of living in a classical universe. This is just in our minds, however, as we cannot make any physical experiment that reveals an actual inconsistency in our wonderland. And yet, we believe that this illusion can be explained within our framework: classical rationality emerges from algorithmic rationality when we consider the joint state of a system of many identical particles. We plan to address this issue in future work.

Finally, we think that the foundation of generalised probability theory via algorithmic rationality provided in this paper could possibly be useful outside the context of QT, for instance in decision theory. We also plan to address this research direction in future work.

Notes

1. In classical probability, given a (real) variable x and an expectation operator E, the n-th (non-central) moment of x is defined as $m_n:=E[x^n]$ (we can also define multivariate moments, e.g., $E[x_1^nx_2^m]$). Given a sequence of moments $m_0,m_1,m_2,\ldots ,m_n$, there exist infinitely many probability distributions corresponding to the same moments and they form a convex set. A sequence of scalars $m_0,m_1,m_2,\ldots ,m_n$ is a valid sequence of moments provided that they satisfy certain consistency constraints. For instance, the moment matrix, obtained by organizing that sequence into a matrix (in a certain way), must be positive semi-definite, see for instance [1]. As we will show in this paper, this gives reason for the constraint $\rho \ge 0$ for density matrices in QT.
2. Given $\theta \in [0,1]$, univariate Bernstein polynomials of degree n are defined as $b_{\nu ,n}(\theta )\propto \theta ^{\nu }(1-\theta )^{n-\nu }$ for $\nu =0,1,\ldots ,n$. This definition can be extended to multivariate polynomials.
3. A duality, generally speaking, translates concepts or mathematical structures into other concepts or structures. Two dual concepts or mathematical structures can be regarded as equivalent, as essentially the same. “Fundamentally, duality gives two different points of view of looking at the same object” [2].
4. Notice that, in our perspective, the non-boolean structure is a consequence of algorithmic rationality as it follows from ‘the weirdness theorem’: the non-boolean structure arises from the non-standard evaluation functionals.
5. See for instance [11] for an overview of these views.
6. More precisely, here we refer to conic duality [50]. A coherent set of desirable gambles is a pointed closed convex cone and the 1-norm cross-section of its dual cone is a closed convex set of probabilities. We review this result in Sect. 2.
7. Abstract units of utility; we can approximately identify them with money provided we deal with small amounts of it [57, Sect. 3.2.5].
8. In other words, the outcomes of different experiments are assumed to be logically independent.
9. The case when ${\mathscr {G}}$ is infinite is analogous, but see Footnote 10. However, we will only consider the finite case in this paper because it suffices for deriving finite-dimensional QT.
10. The conic hull of a set of gambles ${\mathscr {A}}$ is defined as ${{\,\mathrm{posi}\,}}({\mathscr {A}})=\{\sum _i \lambda _ig_i: \lambda _i\in {\mathbb {R}}^{\ge }, g_i\in {\mathscr {A}}\}$. A technicality is that when ${\mathscr {G}}$ is not finite, A1 should require in addition that ${\mathcal {K}}$ is topologically closed.
11. Historically, de Finetti works with finitely additive probabilities, while Kolmogorov stays within the special case of sigma additivity.
12. Equipped with the supremum norm, ${\mathscr {L}}_R$ constitutes a Banach space, and its topological dual ${\mathscr {L}}_R^*$ is the space of all bounded linear functionals on it. We assume the weak${}^*$ topology on ${\mathscr {L}}_R^*$.
13. Technically, one can show that ${\mathcal {K}}$ is a closed convex cone and ${\mathcal {K}}^\circ $ is a section of the dual cone of ${\mathcal {K}}$. ${\mathcal {K}}^\circ $ is a closed convex set. Note also that, given a closed convex set ${\mathscr {R}}\subseteq {\mathsf {S}}$ of states L, we can define its dual cone as
$$\begin{aligned} {\mathscr {R}}^{\circ } = \{g \in {\mathscr {L}}_R \mid L(g) \ge 0, ~~~\forall L \in {\mathscr {R}}\}, \end{aligned}$$
(3)
which is a coherent set of desirable gambles (it satisfies A0–A2$'$). Therefore there is a bijection between coherent sets of desirable gambles and closed convex sets of states. Since this relation preserves all relevant operations, such as conditioning and marginalisation, the two views (in terms of sets of desirable gambles and of convex sets of states) are, mathematically, the same.
14. This work was originally presented on the 2003 Review of Economic Studies Tour.
15. Here ‘closed’ is with respect to the weak$^*$-topology, which is the coarsest topology on the dual space making the evaluation functions continuous. Note also that evaluation functionals or, more generally, the elements of the dual space, can informally be interpreted simply as states.
16. Notice that, since g is a polynomial and $\varOmega $ is bounded, $\min g = \inf g$ and $\max g = \sup g$.
17. In the present framework, such a view is due to properties of quadratic forms. Indeed $x^\dagger \varPi _1 x,\ldots ,x^\dagger \varPi _n x$ form a partition of unity, and therefore $E[x^\dagger \varPi _i x]=Tr(\varPi _i\rho )=p_i$, whenever $\rho =E[xx^\dagger ]=\sum _{i=1}^{n}p_i \varPi _i$ (with $p_i\ge 0$ and $\sum _{i=1}^{n} p_i=1$).
18. In [79, Sect. 6.3.1], the last part of this definition says ‘for all factorised vectors $x_1 \otimes x_2$’. This is equivalent to considering the pair $(x_1,x_2)$.
19. Degree 2 polynomials allow Alice to express desirability judgments about the probability that the outcome is $H_lH_r$ (e.g., is the gamble $\theta _1-0.5$ desirable?) and also about the probability of the outcome $H_lH_r,H_lT_r$ (e.g., is the gamble $\theta _1\theta _2-0.25$ desirable?). Therefore, this choice of ${\mathscr {L}}_R$ is expressive; we have fixed the maximum degree to 2 just to keep the dimension of ${\mathscr {L}}_R$ small.
20. Note that $\varSigma ^{\ge }$ is the set of all degree 2 polynomials of $\theta $ that have nonnegative coefficients; the subscript in $u_{{\alpha _1\alpha _2\alpha _3\alpha _4}}$ is just an index that allows us to define the coefficients for the elements in the sum (e.g., for $\alpha _1=1,\alpha _2=0,\alpha _3=1,\alpha _4=0$, the term is $u_{1010}\theta _1 \theta _3$).
21. Note that there have been previous investigations into the computational nature of QT but they have mostly focused on topics of undecidability and of potential computational advantages of non-standard theories involving modifications of quantum theory [81,82,83,84]. In particular, the undecidability results in QT are usually obtained via a limiting argument, as the number of particles goes to infinity (see, e.g., [85]). These results do not apply to our setting as we rather take the stance that the Universe is a finite physical system.
22. Reality is for instance interpreted as a computer in a recent conjecture that the holographic universe could just act as a quantum-correcting code [87]; in a sense, our view is similar in spirit.
23. There are other definitions of dependence/independence, such as epistemic irrelevance [53], that are not compatible with a notion of generative model.
24. We do not aim to review all literature on hidden variable models. Clearly, it would be interesting to place our work within the ‘$\psi $-ontic/$\psi $-epistemic’ interpretation of QT [12, 92, 93] agreeing with Born’s rule but not preserving the structure of functional dependencies in QT.
25. See Sect. 4 but also [68, Sect. 1.7, and Supplem. 3.2]. Notice that, when it corresponds to a pure quantum state, that is to a truncated moment matrix of rank one, the model is compatible with exactly one probability.
26. Such a nonuniqueness argument is also developed in [94] and [68, Supplementary 3.2].

References

1. Lasserre, J.B.: Moments, Positive Polynomials and Their Applications, vol. 1. World Scientific, Singapore (2009)
2. Atiyah, M.: Duality in mathematics and physics. Conferències FME 5, 2007–2008 (2007)
3. Barrett, J.: Information processing in generalized probabilistic theories. Phys. Rev. A 75(3), 032304 (2007)
4. Chiribella, G., Cabello, A., Kleinmann, M., Müller, M.P.: General bayesian theories and the emergence of the exclusivity principle, arXiv preprint arXiv:1901.11412 (2019)
5. Caves, C.M., Fuchs, C.A., Schack, R.: Unknown quantum states: the quantum de finetti representation. J. Math. Phys. 43(9), 4537–4559 (2002)
6. Fuchs, C.A., Schack, R.: A quantum-bayesian route to quantum-state space. Found. Phys. 41(3), 345–356 (2011)
7. Pitowsky, I.: Betting on the outcomes of measurements: a bayesian theory of quantum probability. Stud. Hist. Philos. Sci. Part B 34(3), 395–414 (2003)
8. Healey, R.: Quantum theory: a pragmatist approach. Br. J. Philos. Sci. 63(4), 729–771 (2012)
9. Rovelli, C.: Relational quantum mechanics. Int. J. Theoret. Phys. 35(8), 1637–1678 (1996)
10. De Muynck, W.M.: Foundations of Quantum Mechanics, an Empiricist Approach, vol. 127. Springer, New York (2006)
11. Healey, R.: Quantum-bayesian and pragmatist views of quantum theory. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy, Metaphysics Research Lab. Stanford University (2017)
12. Leifer, M.S.: Is the quantum state real? an extended review of psi-ontology theorems. Quanta 3(1), 67–155 (2014)
13. Everett, H.: On the Foundations of Quantum Mechanics. PhD thesis, Princeton University (1957). Reprinted in [14], pp. 3–140
14. DeWitt, B.S., Graham, N.: The Many Worlds Interpretation of Quantum Mechanics. Princeton University Press, Princeton (1973)
15. Bohm, D.: A suggested interpretation of the quantum theory in terms of “hidden” variables. I. Phys. Rev. 85(2), 166–179 (1952)
16. Bohm, D.: A suggested interpretation of the quantum theory in terms of “hidden” variables. II. Phys. Rev. 85(2), 180–193 (1952)
17. Ghirardi, G.C., Rimini, A., Weber, T.: Unified dynamics for microscopic and macroscopic systems. Phys. Rev. D 34(2), 470 (1986)
18. Ghirardi, G.C., Pearle, P., Rimini, A.: Markov processes in hilbert space and continuous spontaneous localization of systems of identical particles. Phys. Rev. A 42(1), 78 (1990)
19. Diósi, L.: Models for universal reduction of macroscopic quantum fluctuations. Phys. Rev. A 40(3), 1165 (1989)
20. Penrose, R.: On gravity’s role in quantum state reduction. Gen. Relativ. Gravit. 28(5), 581–600 (1996)
21. Cramer, J.G.: The transactional interpretation of quantum mechanics. Rev. Mod. Phys. 58(3), 647 (1986)
22. Cramer, J.G.: An overview of the transactional interpretation of quantum mechanics. Int. J. Theoret. Phys. 27(2), 227–236 (1988)
23. Kastner, R.E.: On the status of the measurement problem: recalling the relativistic transactional interpretation. Int. J. Quantum Found. 4(1), 128–141 (2017)
24. Kastner, R.E.: The Transactional Interpretation of Quantum Mechanics: The Reality of Possibility. Cambridge University Press, Cambridge (2013)
25. Birkhoff, G., Von Neumann, J.: The logic of quantum mechanics. Ann. Math. 823–843 (1936)
26. Feynman, R.P.: Negative Probability. Quantum Implications: essays in honour of David Bohm, pp. 235–248 (1987)
27. Mackey, G.W.: Mathematical Foundations of Quantum Mechanics. Courier Corporation, North Chelmsford (2013)
28. Jauch, J.M., Piron, C.: Can hidden variables be excluded in quantum mechanics. Helv. Phys. Acta 36, 827–837 (1963)
29. Hardy, L.: Foliable operational structures for general probabilistic theories. In: Halvorson, H. (ed.) Deep Beauty: Understanding the Quantum World Through Mathematical Innovation, p. 409 (2011)
30. Hardy, L.: Quantum theory from five reasonable axioms, arXiv preprint arXiv:quant-ph/0101012 (2001)
31. Chiribella, G., D’Ariano, G.M., Perinotti, P.: Probabilistic theories with purification. Phys. Rev. A 81(6), 062348 (2010)
32. Barnum, H., Wilce, A.: Information processing in convex operational theories. Electron. Notes Theoret. Comput. Sci. 270(1), 3–15 (2011)
33. Van Dam, W.: Implausible consequences of superstrong nonlocality, arXiv preprint arXiv:quant-ph/0501159 (2005)
34. Pawłowski, M., Paterek, T., Kaszlikowski, D., Scarani, V., Winter, A., Żukowski, M.: Information causality as a physical principle. Nature 461(7267), 1101 (2009)
35. Dakic, B., Brukner, C.: Quantum theory and beyond: Is entanglement special?, arXiv preprint arXiv:0911.0695 (2009)
36. Fuchs, C.A.: Quantum mechanics as quantum information (and only a little more), arXiv preprint arXiv:quant-ph/0205039, abridged version in Quantum Theory: Reconsideration of Foundations, edited by A. Khrennikov (2002)
37. Brassard, G.: Is information the key? Nat. Phys. 1(1), 2 (2005)
38. Mueller, M.P., Masanes, L.: Information-theoretic postulates for quantum theory. In: Quantum Theory: Informational Foundations and Foils, pp. 139–170. Springer (2016)
39. Coecke, B., Spekkens, R.W.: Picturing classical and quantum bayesian inference. Synthese 186(3), 651–696 (2012)
40. Appleby, D.: Facts, values and quanta. Found. Phys. 35(4), 627–668 (2005)
41. Appleby, D.: Probabilities are single-case or nothing. Opt. Spectrosc. 99(3), 447–456 (2005)
42. Timpson, C.G.: Quantum Bayesianism: a study. Stud. Hist. Philos. Sci. Part B 39(3), 579–609 (2008)
43. Fuchs, C.A., Schack, R.: Quantum-Bayesian coherence. Rev. Mod. Phys. 85(4), 1693 (2013)
44. Mermin, N.D.: Physics: Qbism puts the scientist back into science. Nature 507(7493), 421–423 (2014)
45. Pitowsky, I.: Physical Theory and its Interpretation: Essays in Honor of Jeffrey Bub, ch. Quantum Mechanics as a Theory of Probability, pp. 213–240. Springer, Dordrecht (2006)
46. Benavoli, A., Facchini, A., Zaffalon, M.: Quantum mechanics: the Bayesian theory generalized to the space of Hermitian matrices. Phys. Rev. A 94(4), 042106 (2016)
47. Benavoli, A., Facchini, A., Zaffalon, M.: A Gleason-type theorem for any dimension based on a gambling formulation of quantum mechanics. Found. Phys. 47(7), 991–1002 (2017)
48. Popescu, S., Rohrlich, D.: Causality and nonlocality as axioms for quantum mechanics. In: Causality and Locality in Modern Physics, pp. 383–389. Springer (1998)
49. Navascués, M., Wunderlich, H.: A glance beyond the quantum model. In: Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 466, pp. 881–890, The Royal Society (2010)
50. Aliprantis, C.D., Tourky, R.: Cones and Duality, vol. 84. American Mathematical Soc., Providence (2007)
51. de Finetti, B.: La prévision: ses lois logiques, ses sources subjectives. Annales de l’Institut Henri Poincaré 7, 1–68 (1937)
52. Williams, P.M.: Notes on conditional previsions, tech. rep., School of Mathematical and Physical Science, University of Sussex, UK (1975)
53. Walley, P.: Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, New York (1991)
54. Anscombe, F.J., Aumann, R.J.: A definition of subjective probability. Ann. Math. Stat. 34, 199–205 (1963)
55. Zaffalon, M., Miranda, E.: Axiomatising incomplete preferences through sets of desirable gambles. J. Artif. Intell. Res. 60, 1057–1126 (2017)
56. Zaffalon, M., Miranda, E.: Desirability foundations of robust rational decision making. Synthese
57. de Finetti, B.: Theory of Probability: A Critical Introductory Treatment, vol. 1. Wiley, Chichester (1974)
58. Aliprantis, C., Border, K.: Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, New York (2007)
59. Tarski, A.: A Decision Method for Elementary Algebra and Geometry (1951)
60. Seidenberg, A.: A new decision method for elementary algebra. Ann. Math. 365–374 (1954)
61. Good, I.J.: Rational decisions. J. R. Stat. Soc. Ser. B 14(1), 107–114 (1952)
62. Simon, H.A.: A behavioral model of rational choice. Q. J. Econ. 69(1), 99–118 (1955)
63. Rubinstein, A.: Finite automata play the repeated prisoner’s dilemma. J. Econ. Theory 39(1), 83–96 (1986)
64. Neyman, A.: Bounded complexity justifies cooperation in the finitely repeated prisoners’ dilemma. Econ. Lett. 19(3), 227–229 (1985)
65. Wilson, A.: Bounded memory and biases in information processing. Econometrica 82(6), 2257–2294 (2014)
66. Halpern, J.Y., Pass, R.: Algorithmic rationality: game theory with costly computation. J. Econ. Theory 156, 246–268 (2015)
67. Aaronson, S.: NP-complete problems and physical reality. ACM Sigact News 36(1), 30–52 (2005)
68. Holevo, A.S.: Probabilistic and Statistical Aspects of Quantum Theory, vol. 1. Springer, New York (2011)
69. Gurvits, L.: Classical deterministic complexity of edmonds’ problem and quantum entanglement. In: Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, pp. 10–19. ACM (2003)
70. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2010)
71. Karr, A.F.: Extreme points of certain sets of probability measures, with applications. Math. Oper. Res. 8(1), 74–85 (1983)
72. Chiribella, G., D’Ariano, G.M., Schlingemann, D.: How continuous quantum measurements in finite dimensions are actually discrete. Phys. Rev. Lett. 98(19), 190403 (2007)
73. Schack, R., Caves, C.M.: Explicit product ensembles for separable quantum states. J. Mod. Opt. 47(2–3), 387–399 (2000)
74. Sperling, J., Vogel, W.: Necessary and sufficient conditions for bipartite entanglement. Phys. Rev. A 79(2), 022318 (2009)
75. Gerke, S., Sperling, J., Vogel, W., Cai, Y., Roslund, J., Treps, N., Fabre, C.: Multipartite entanglement of a two-separable state. Phys. Rev. Lett. 117(11), 110502 (2016)
76. Gerke, S., Vogel, W., Sperling, J.: Numerical construction of multipartite entanglement witnesses. Phys. Rev. X 8(3), 031047 (2018)
77. Horodecki, M., Horodecki, P.: Reduction criterion of separability and limits for a class of distillation protocols. Phys. Rev. A 59(6), 4206 (1999)
78. Horodecki, R., Horodecki, P., Horodecki, M., Horodecki, K.: Quantum entanglement. Rev. Mod. Phys. 81(2), 865 (2009)
79. Heinosaari, T., Ziman, M.: The Mathematical Language of Quantum Theory: From Uncertainty to Entanglement. Cambridge University Press, Cambridge (2011)
80. Benavoli, A., Facchini, A., Zaffalon, M.: Bernstein’s socks, polynomial-time provable coherence and entanglement. In: Bock, J.D., de Campos, C., de Cooman, G., Quaeghebeur, E., Wheeler, G. (eds.) ISIPTA ’19: Proceedings of the Eleventh International Symposium on Imprecise Probability: Theories and Applications, PJMLR, JMLR (2019)
81. Bacon, D.: Quantum computational complexity in the presence of closed timelike curves. Phys. Rev. A 70(3), 032309 (2004)
82. Aaronson, S.: Quantum computing and hidden variables II: the complexity of sampling histories, arXiv preprint arXiv:quant-ph/0408119 (2004)
83. Aaronson, S.: Quantum computing, postselection, and probabilistic polynomial-time. Proc. R. Soc. Lond. A 461, 3473–3482 (2005)
84. Chiribella, G., D’Ariano, G.M., Perinotti, P., Valiron, B.: Quantum computations without definite causal structure. Phys. Rev. A 88(2), 022318 (2013)
85. Cubitt, T.S., Perez-Garcia, D., Wolf, M.M.: Undecidability of the spectral gap. Nature 528(7581), 207 (2015)
86. Zuse, K.: Rechnender raum. Elektronische Datenverarbeitung 8, 336–344 (1967)
87. Almheiri, A., Dong, X., Harlow, D.: Bulk locality and quantum error correction in AdS/CFT. J. High Energy Phys. 2015(4), 163 (2015)
88. D’Angelo, J.P., Putinar, M.: Polynomial optimization on odd-dimensional spheres. In: Emerging applications of algebraic geometry, pp. 1–15. Springer, New York (2009)
89. Josz, C., Molzahn, D.K.: Lasserre hierarchy for large scale polynomial optimization in real and complex variables. SIAM J. Optim. 28(2), 1017–1048 (2018)
90. Benavoli, A., Facchini, A., Zaffalon, M.: Bernstein’s socks, polynomial-time provable coherence and entanglement. In: Bock, J.D., de Campos, C., de Cooman, G., Quaeghebeur, E., Wheeler, G. (eds.) ISIPTA ’19: Proceedings of the Eleventh International Symposium on Imprecise Probability: Theories and Applications, PJMLR (2019)
91. Kochen, S., Specker, E.: The problem of hidden variables in quantum mechanics. J. Math. Mech. 17, 59–87 (1968)
92. Spekkens, R.W.: Evidence for the epistemic view of quantum states: a toy theory. Phys. Rev. A 75(3), 032110 (2007)
93. Pusey, M.F., Barrett, J., Rudolph, T.: On the reality of the quantum state. Nat. Phys. 8(6), 475 (2012)
94. Srinivas, M.: When is a hidden variable theory compatible with quantum mechanics? Pramana 19(2), 159–173 (1982)
95. Parrilo, P.A.: Semidefinite programming relaxations for semialgebraic problems. Math. Programm. 96(2), 293–320 (2003)
96. Landau, L.J.: Empirical two-point correlation functions. Found. Phys. 18(4), 449–460 (1988)
97. Doherty, A.C., Parrilo, P.A., Spedalieri, F.M.: Distinguishing separable and entangled states. Phys. Rev. Lett. 88(18), 187904 (2002)
98. Doherty, A.C., Parrilo, P.A., Spedalieri, F.M.: Complete family of separability criteria. Phys. Rev. A 69(2), 022308 (2004)
99. Wehner, S.: Tsirelson bounds for generalized clauser-horne-shimony-holt inequalities. Phys. Rev. A 73(2), 022110 (2006)
100. Doherty, A.C., Liang, Y.-C., Toner, B., Wehner, S.: The quantum moment problem and bounds on entangled multi-prover games. In: 23rd Annual IEEE Conference on Computational Complexity, 2008. CCC’08, pp. 199–210. IEEE (2008)
101. Navascués, M., Pironio, S., Acín, A.: A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations. New J. Phys. 10(7), 073013 (2008)
102. Pironio, S., Navascués, M., Acin, A.: Convergent relaxations of polynomial optimization problems with noncommuting variables. SIAM J. Optim. 20(5), 2157–2180 (2010)
103. Bamps, C., Pironio, S.: Sum-of-squares decompositions for a family of clauser-horne-shimony-holt-like inequalities and their application to self-testing. Phys. Rev. A 91(5), 052111 (2015)
104. Barak, B., Kothari, P.K., Steurer, D.: Quantum entanglement, sum of squares, and the log rank conjecture. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp. 975–988. ACM (2017)
105. Benavoli, A., Facchini, A., Piga, D., Zaffalon, M.: Sum-of-squares for bounded rationality. Int. J. Approx. Reason. 105, 130–152 (2019)
106. Benavoli, A., Facchini, A., Zaffalon, M., Vicente-Pérez, J.: A polarity theory for sets of desirable gambles. In: Antonucci, A., Corani, G., Couso, I., Destercke, S. (eds.) Proceedings of the Tenth International Symposium on Imprecise Probability: Theories and Applications, vol. 62 of Proceedings of Machine Learning Research, pp. 37–48, PMLR, 10–14 (2017)

Funding

Open Access funding provided by the IReL Consortium.

Author information

Authors and Affiliations

School of Computer Science and Statistics, Trinity College, Dublin, Ireland
Alessio Benavoli
Dalle Molle Institute for Artificial Intelligence (IDSIA), Lugano, Switzerland
Alessandro Facchini & Marco Zaffalon

Corresponding author

Correspondence to Alessio Benavoli.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix: Additional Discussion on QT in Relation to Other Notions

In the present section we shall discuss a few main questions that our view of QT appears to raise.

1.1 The Class of Hermitian Sum-of-Squares

The class $\varSigma ^{\ge }$ of nonnegative gambles, defined in Sect. 4, is the closed convex cone of all Hermitian sum-of-squares in ${\mathscr {L}}_R=\{(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j) \mid G\in {\mathscr {H}}^{r \times r}\}$, that is, of all gambles $ g(x_1,\ldots ,x_m) \in {\mathscr {L}}_R$ for which G is PSD. In particular this means that Alice can efficiently determine whether a gamble belongs to $\varSigma ^{\ge }$ or not. But is this class the only closed convex cone of nonnegative polynomials in ${\mathscr {L}}_R$ for which the membership problem can be solved efficiently (in polynomial time)? It turns out that the answer is negative (see for instance [88, 89]): in addition to Hermitian sum-of-squares (the one that Nature has chosen for QT) one could also consider real sum-of-squares in ${\mathscr {L}}_R$, that is, polynomials of the form $(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j)$ that are sum-of-squares of polynomials of the real and imaginary parts of $ x_j$.

A separating example is the polynomial in (17), which is not a Hermitian sum-of-squares but is a real sum-of-squares, as can be seen from (18). This polynomial was used in our example because it can be constructed by inspection and its nonnegativity follows immediately from (18). Clearly, there exist nonnegative polynomials in ${\mathscr {L}}_R$ that are neither Hermitian sum-of-squares nor real sum-of-squares.

Why has Nature chosen Hermitian sum-of-squares? This is an open question that we will investigate in future work. A possible explanation may reside in the different size of the corresponding optimisation problems [89]. Another possible explanation is that the class of Hermitian sum-of-squares is always strictly included in the class of real sum-of-squares polynomials. Therefore the former may be the smallest class of gambles that allows one to efficiently determine whether a gamble is nonnegative according to it, but that is still expressive enough [90, Proposition 6].
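The membership test for $\varSigma ^{\ge }$ discussed at the start of this subsection amounts to checking that the Hermitian matrix G is PSD. The following is a minimal sketch of such a check in Python with NumPy; the function name `in_hermitian_sos`, the tolerance `tol` and the two test matrices are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def in_hermitian_sos(G, tol=1e-10):
    """Return True if the Hermitian matrix G is PSD (up to tol), i.e. if the
    gamble (⊗ x_j)^† G (⊗ x_j) passes the Hermitian sum-of-squares test."""
    G = np.asarray(G, dtype=complex)
    if G.ndim != 2 or G.shape[0] != G.shape[1]:
        raise ValueError("G must be a square matrix")
    if not np.allclose(G, G.conj().T):
        raise ValueError("G must be Hermitian")
    # eigvalsh returns the real eigenvalues of a Hermitian matrix in
    # ascending order, so the first entry is the smallest eigenvalue.
    return bool(np.linalg.eigvalsh(G)[0] >= -tol)

# Illustrative matrices for two qubits (r = 4).
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
G_psd = np.outer(psi, psi.conj())          # rank-one projector, PSD
G_indef = np.diag([1.0, 0.0, 0.0, -1.0])   # has a negative eigenvalue

print(in_hermitian_sos(G_psd))
print(in_hermitian_sos(G_indef))
```

The cost is that of one Hermitian eigenvalue computation, which is polynomial in the matrix size r.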

1.2 On the Use of Tensor Product

In Sect. 4 we saw that the possibility space of composite systems of m particles, each one with $n_j$ degrees of freedom, is given by $\varOmega =\prod _{j=1}^m {\overline{{\mathbb {C}}}}^{n_j}$. We saw that gambles on such a space are actually bounded real functions $g(x_1,\ldots ,x_m)=(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j)$, where $\otimes $ denotes the tensor product between vectors, understood as column matrices.

In what follows, we justify the use of the tensor product, and more specifically the type of gambles on the possibility space of composite systems, as a consequence of the way a multivariate theory of probability is usually formulated.

As a start, let us consider the case of classical probability. In CPT, we make the reasonable assumption that, since agents are expressing beliefs about physical systems, the underlying notion of dependence/independence should be compatible with that of a generative model.^{Footnote 23} Under this assumption, structural judgements of independence/dependence are expressed via products: given factorised gambles $g(x_1,\ldots ,x_m)=\prod _{j=1}^m g_j(x_j)$, $x_1,\ldots ,x_m$ are said to be independent if $E[\prod _{j=1}^m g_j(x_j)]=\prod _{j=1}^m E[g_j(x_j)]$ for all $g_j$, where $E[\cdot ]$ denotes the expectation operator. With this in mind, let us go back to our setting. Marginal gambles are of type $g_j(x_j)=x_j^{\dagger }G_j x_j$. This means that structural judgements are performed by considering factorised gambles of the form $\prod _{j=1}^m x_j^{\dagger }G_j x_j$. It is then not difficult to verify that

$$\begin{aligned} \prod _{j=1}^m x_j^{\dagger }G_j x_j=(\otimes _{j=1}^m x_j)^\dagger (\otimes _{j=1}^m G_j) (\otimes _{j=1}^m x_j). \end{aligned}$$

By closing the set of factorised gambles under the operations of addition and scalar multiplication, one finally gets a vector space whose domain coincides with the collection of all gambles of the form $(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j)$. Hence, structural judgements of independence/dependence are stated by considering the desirability of gambles belonging to ${\mathscr {L}}_R$.
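The factorisation identity above can be checked numerically. The sketch below is only an illustration under assumed inputs: the dimensions in `dims`, the random unit vectors and the random Hermitian matrices are arbitrary choices, and the helper names are hypothetical.

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(1)

def random_unit(n):
    """Random complex vector with x^† x = 1."""
    x = rng.normal(size=n) + 1j * rng.normal(size=n)
    return x / np.linalg.norm(x)

def random_hermitian(n):
    """Random n x n Hermitian matrix."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

dims = [2, 3, 2]                      # assumed n_j for m = 3 subsystems
xs = [random_unit(n) for n in dims]
Gs = [random_hermitian(n) for n in dims]

# Left-hand side: product of the marginal gambles x_j^† G_j x_j.
lhs = np.prod([np.real(x.conj() @ G @ x) for x, G in zip(xs, Gs)])

# Right-hand side: a single quadratic form in the tensor-product vector.
x_all = reduce(np.kron, xs)
G_all = reduce(np.kron, Gs)
rhs = np.real(x_all.conj() @ G_all @ x_all)

print(np.isclose(lhs, rhs))
```

The check relies on the mixed-product property of the Kronecker product, which is what turns a product of quadratic forms into one quadratic form on the tensor-product vector.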

1.3 Hidden Variable Models

In [91], Kochen and Specker gave a hidden variable model for QT.^{Footnote 24} Their idea amounts to introducing a ‘hidden variable’ for each observable H producing stochasticity in outcomes of measurement of H. The totality of all such hidden variables is then the phase space variable $\omega $ of the model.

In the case of a single n-dimensional quantum system, our model based on the phase space

$$\begin{aligned} \varOmega =\{x\in {\mathbb {C}}^n: x^{\dagger }x=1\}, \end{aligned}$$

can also be understood as a hidden variable model, and essentially coincides with the one introduced by Holevo in [68, Sect. 1.7]. The point is that, for a single quantum system, we have both $\varSigma ^{\ge }={\mathscr {L}}_R^{\ge }$ and $\varSigma ^{<}={\mathscr {L}}_R^{<}$, meaning that Alice will never accept negative gambles. Hence, in such a case, the density matrix $\rho =L(xx^{\dagger })$ can be interpreted as a truncated moment matrix and it is therefore compatible with (can be extended to) a probability distribution over $\varOmega $. Now, as there may be more than one probability compatible with it,^{Footnote 25} such a model does not fulfil one of the key requirements imposed in many existing ‘no-go’ theorems, namely the uniqueness of the associated classical description.

However, despite the fact that a hidden variable theory would necessarily treat as distinct two probabilities that define the same density matrix, they are underdetermined by the observations and therefore they can be regarded as corresponding to two unidentifiable, or indistinguishable, classical models. In fact, since any real-valued observable is described by a Hermitian operator and the expectation of a Hermitian operator w.r.t. a given density matrix (truncated moment matrix) $\phi $ is unique ($Tr(G\phi )$), $\phi $ is sufficient to provide an adequate characterisation of these two probabilities. To sum up, if we accept the view that, because of the underdetermination of classical models by observations, the requirement of a one-to-one correspondence between classical and quantum states is not grounded and hence can be relaxed, a hidden-variable model may be simply defined as the equivalence class of all probabilities associated to a given truncated moment matrix.^{Footnote 26}

What about the case when there are $m>1$ particles? In this case, Theorem 1 applies, and it can therefore be read as a no-go theorem pointing to two ways to extend the classical model: either by allowing negative probabilities or by redefining the notion of evaluation functionals. Moreover, the result elucidates the role of the tensor product. In order to see this, let us consider two quantum systems A and B, with corresponding Hilbert spaces ${\mathscr {H}}_A$ and ${\mathscr {H}}_B$. By duality, the density matrix (state) of the joint system lives in the tensor product space ${\mathscr {H}}_A\otimes {\mathscr {H}}_B$. Indeed, we have that

$$\begin{aligned} \begin{aligned} L((\otimes _{j=1}^2 x_j)^\dagger G (\otimes _{j=1}^2 x_j))&=L(Tr(G(\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger ))\\&=Tr(G\; L((\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger )), \end{aligned} \end{aligned}$$

and $L((\otimes _{j=1}^2 x_j)(\otimes _{j=1}^2 x_j)^\dagger )$ belongs to ${\mathscr {H}}_A\otimes {\mathscr {H}}_B$. However, as mentioned before, when (16) holds, we may justify entanglement by hypothesising the existence of non-classical evaluation functionals or, equivalently, a larger possibility space (Theorem 1). This is clearly discussed in [68, Supplement 3.4]:

“Since the set of pure states of the composite system $\varOmega $ is larger than Cartesian product $\varOmega _A\times \varOmega _B$, the phase space of the classical description of the composite system will be larger than the product of phase spaces for the components: $\varOmega _A\times \varOmega _B \varsubsetneq \varOmega $. Therefore this classical description is not a correspondence between the categories of classical and quantum system preserving the operation of forming the composite systems. Moreover, it appears that there is no way to establish such a correspondence. In any classical description of a composite quantum system the variables corresponding to observables of the components are necessarily entangled in the way unusual for classical subsystems.”

To sum up, $\varOmega _A\times \varOmega _B \varsubsetneq \varOmega $ may be understood as another manifestation of algorithmic rationality.
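The gap between the product phase space and the joint state space can be illustrated numerically. The sketch below uses two choices that are assumptions made for illustration only and are not taken from the paper: the two-qubit swap matrix as G, for which the gamble equals $|x_1^{\dagger }x_2|^2\ge 0$ on product vectors, and the singlet state as density matrix. Any probability on $\varOmega _A\times \varOmega _B$ gives such a gamble a nonnegative expectation, whereas the singlet gives it a negative one.

```python
import numpy as np

# Swap matrix on two qubits in the basis |00>, |01>, |10>, |11>.
# On product vectors: (x ⊗ y)^† SWAP (x ⊗ y) = |x^† y|^2 >= 0.
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)

# Singlet state (|01> - |10>)/sqrt(2) and its density matrix.
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho = np.outer(psi, psi.conj())
quantum_value = np.trace(SWAP @ rho).real

rng = np.random.default_rng(4)

def random_unit(n):
    """Random complex vector with x^† x = 1."""
    x = rng.normal(size=n) + 1j * rng.normal(size=n)
    return x / np.linalg.norm(x)

# Values of the same gamble at randomly sampled product points.
classical_values = []
for _ in range(1000):
    v = np.kron(random_unit(2), random_unit(2))
    classical_values.append((v.conj() @ SWAP @ v).real)

print(quantum_value, min(classical_values))
```

Since every sampled value is nonnegative, no mixture of classical evaluation functionals at product points can reproduce the negative value obtained from the singlet.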

1.4 Sum-of-Squares Optimisation

The theory of moments (and its dual theory of positive polynomials) is used to develop efficient numerical schemes for polynomial optimisation, i.e., global optimisation problems with polynomial functions. Such problems arise in the analysis and control of nonlinear dynamical systems, and also in other areas such as combinatorial optimisation. The scheme consists of a hierarchy of semidefinite programs (SDP) of increasing size, which define tighter and tighter relaxations of the original problem. Under some assumptions, it can be shown that the associated sequence of optimal values converges to the global minimum, see for instance [1, 95]. Note that every polynomial in

$$\begin{aligned} \varSigma ^{\ge }:=\{g(x_1,\ldots ,x_m)=(\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j): G\ge 0\}, \end{aligned}$$

is a (Hermitian) sum-of-squares because it can be rewritten as:

$$\begin{aligned} (\otimes _{j=1}^m x_j)^\dagger H H^{\dagger } (\otimes _{j=1}^m x_j)=\sum _{i=1}^k |(H^{\dagger } (\otimes _{j=1}^m x_j))_i|^2, \end{aligned}$$

with $G= HH^{\dagger }$.
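The decomposition above can be reproduced numerically. The following sketch builds one factor H from the eigendecomposition of G (a Cholesky factor would serve equally well) and compares the quadratic form with the sum of squared moduli; the random PSD matrix and the two-qubit product vector are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_unit(n):
    """Random complex vector with x^† x = 1."""
    x = rng.normal(size=n) + 1j * rng.normal(size=n)
    return x / np.linalg.norm(x)

# A random PSD matrix G acting on two qubits (size 4 x 4).
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
G = A @ A.conj().T

# Factor G = H H^† from the eigendecomposition G = U diag(w) U^†.
w, U = np.linalg.eigh(G)
w = np.clip(w, 0.0, None)        # remove tiny negative round-off
H = U * np.sqrt(w)               # scales column i of U by sqrt(w_i)

v = np.kron(random_unit(2), random_unit(2))   # tensor-product vector

quadratic_form = np.real(v.conj() @ G @ v)
squares = np.abs(H.conj().T @ v) ** 2          # |(H^† v)_i|^2

print(np.isclose(quadratic_form, squares.sum()))
```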

In QT, SDP has been used to numerically prove that a certain state is entangled [96,97,98,99,100,101,102,103,104]. In [97, 98] it was realised that the set of separable quantum states can be approximated by sum-of-squares hierarchies. This leads to the SDP hierarchy of Doherty-Parrilo-Spedalieri, which is extensively employed in quantum information.

The present, purely foundational, work differs from these approaches by stating that the (microscopic) world is actually running on a ‘computer’ that solves SOS optimisation problems.

Technicalities

1.1 Sections 2 and 4

The proofs of the results in Sect. 2.2 were derived in [105]. Hereafter, we extend those results to prove Theorem 1.

We define the dual of a subset ${\mathcal {K}}$ of ${\mathscr {L}}$ as:

$$\begin{aligned} {\mathcal {K}}^\bullet =\left\{ \mu \in {\mathscr {M}}: \int gd\mu \ge 0, ~\forall g \in {\mathcal {K}}\right\} . \end{aligned}$$

(31)

By an argument analogous to that of [106, Th. 4], it is easy to check that:

Proposition 4

The map

$$\begin{aligned} {\mathcal {K}}\mapsto {\mathscr {P}}:={\mathcal {K}}^\bullet \cap {\mathcal {S}} \end{aligned}$$

establishes a bijection between coherent sets of desirable gambles and non-empty closed convex sets of states.

It is also easy to verify the following characterisation of the dual of a closed convex cone which is not coherent.

Proposition 5

Let ${\mathcal {K}}$ be a non-empty closed convex cone. Then the following are equivalent:

1. ${\mathcal {K}}\ne {\mathscr {L}}$ and ${\mathcal {K}}$ is not coherent;
2. ${\mathcal {K}}^\bullet \not \subseteq {\mathscr {M}}^{\ge }$;
3. ${\mathcal {K}}^\bullet \cap \{ \mu \in {\mathscr {M}} \mid \langle \mathbbm {1}, \mu \rangle =1 \} \not \subseteq {\mathcal {S}}$;
4. $\{0\}\subsetneq {\mathcal {K}}^\bullet $ and ${\mathcal {K}}^\bullet \cap {\mathcal {S}}= \emptyset $.

Essentially, Proposition 5 tells us that, from the dual point of view, non-degenerate closed convex cones of gambles that are not coherent are characterised by quasi-probabilities (charges).

Proof of Theorem 1

Assume that ${\mathscr {L}}_R$ includes all positive constant gambles and that $\varSigma ^{\ge }$ is closed (in ${\mathscr {L}}_R$). Let ${\mathcal {C}}\subseteq {\mathscr {L}}_R$ be an A-coherent set of desirable gambles. We have to verify that the following statements are equivalent:

1. ${\mathcal {C}}$ includes a negative gamble that is not in $\varSigma ^{<} $;
2. ${{\,\mathrm{posi}\,}}({\mathscr {L}}^{\ge } \cup {\mathscr {G}})$ is incoherent, and thus ${\mathscr {P}}$ is empty;
3. ${\mathcal {C}}^{\circ }$ is not (the restriction to ${\mathscr {L}}_R$ of) a closed convex set of mixtures of classical evaluation functionals;
4. The extension $ {\mathcal {C}}^\bullet $ of ${\mathcal {C}}^{\circ }$ in the space ${\mathscr {M}}$ of all charges in $\varOmega $ includes only quasi-probabilities.

First of all, notice that the restriction to ${\mathscr {L}}_R$ of the set of all normalised charges that correspond to bounded linear functionals coincides with ${\mathcal {C}}^{\bullet }$. Given this, the equivalence between (3) and (4) is immediate, whereas the equivalence between (2) and (4) is given by Proposition 5. We finally verify the equivalence between (1) and (3). In this case, the direction from left to right being obvious, the other direction is due to the fact that $g \le f$, for every $g \in {\mathcal {C}}$ and $f \in {{\,\mathrm{posi}\,}}({\mathscr {L}}^{\ge } \cup {\mathcal {C}})\setminus {\mathcal {C}}$. $\square $

1.2 Section 4: Duality in QT

Recall from Sect. 3 that the set $\left\{ L \in {\mathscr {L}}_R^* \mid L(g)\ge 0 ~\forall g \in {\mathcal {C}}, ~~ L(\mathbbm {1}_R)=1\right\} $ is the dual of ${\mathcal {C}}\subset {\mathscr {L}}_{R}$.

The monomials $\otimes _{j=1}^m x_j$ form a basis of the space ${\mathscr {L}}_{R}$. Define the Hermitian matrix of scalars

$$\begin{aligned} Z:= L\left( (\otimes _{j=1}^m x_j)(\otimes _{j=1}^m x_j)^\dagger \right) , \end{aligned}$$

(32)

and let $\{z_{ij}\}\in {\mathbb {C}}^{d}$, with $d=\frac{n(n+1)}{2}$ and $n=\prod _{j=1}^m n_j$, be the vector of variables obtained by taking the elements of the upper triangular part of Z. Given any gamble g, we can therefore rewrite L(g) as a function of the vector $\{z_{ij}\}\in {\mathbb {C}}^{d}$. This means that the dual space ${\mathscr {L}}_{R}^*$ is isomorphic to ${\mathbb {C}}^{d}$, and we can then define the dual maps $(\cdot )^\circ $ between ${\mathscr {L}}_{R}$ and ${\mathbb {C}}^{d}$ as follows.

Definition 3

Let ${\mathcal {C}}$ be a closed convex cone in ${\mathscr {L}}_{R}$. Its dual cone is defined as

$$\begin{aligned} {\mathcal {C}}^\circ =\left\{ {z} \in {\mathbb {C}}^{{d}}: L(g)\ge 0, ~\forall g \in {\mathcal {C}}\right\} , \end{aligned}$$

(33)

where L(g) is completely determined by $\{z_{ij}\}$ via the definition (32).

In discussing properties of the dual space, we need the following well-known result from linear algebra:

Lemma 1

For any $M \in H^{d\times d}$ and $v \in {\mathbb {C}}^{d}$, it holds that

$$\begin{aligned} Tr(M (v v^\dagger )) = Tr((v v^\dagger )M) = v^\dagger M v. \end{aligned}$$

(34)

By Lemma 1 and the definitions of g and Z, we obtain the following result.

Proposition 6

Let $g(x_1,\ldots ,x_m) = (\otimes _{j=1}^m x_j)^\dagger G (\otimes _{j=1}^m x_j)$ with G Hermitian. Then for every $z \in {\mathbb {C}}^{{d}}$, it holds that $ L(g) = Tr({G} Z)$, where Z is defined in (32).
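Proposition 6 can be checked numerically in a special case. The sketch below assumes that L is a classical expectation, namely a finite mixture of evaluation functionals at product points; this choice of L, the weights `p`, the dimensions and the helper names are hypothetical and serve only as an illustration of the identity $L(g)=Tr(GZ)$.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_unit(n):
    """Random complex vector with x^† x = 1."""
    x = rng.normal(size=n) + 1j * rng.normal(size=n)
    return x / np.linalg.norm(x)

# Assumed functional: L(f) = sum_k p_k f(x_1^(k), x_2^(k)), with n_1 = 2, n_2 = 3.
K = 5
p = rng.dirichlet(np.ones(K))                       # mixture weights, sum to 1
points = [np.kron(random_unit(2), random_unit(3)) for _ in range(K)]

# A random Hermitian G defining the gamble g = (x_1 ⊗ x_2)^† G (x_1 ⊗ x_2).
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
G = (A + A.conj().T) / 2

# Z = L((⊗ x_j)(⊗ x_j)^†), applied entrywise as in (32).
Z = sum(pk * np.outer(v, v.conj()) for pk, v in zip(p, points))

# L(g) computed directly, and through the trace formula of Proposition 6.
L_g = sum(pk * np.real(v.conj() @ G @ v) for pk, v in zip(p, points))
trace_form = np.real(np.trace(G @ Z))

print(np.isclose(L_g, trace_form))
```

For this classical choice of L, the matrix Z is automatically PSD with unit trace; the interesting cases discussed in the paper are those where Z is PSD with unit trace but does not arise from any such mixture.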

It is then possible to verify that:

Proposition 7

Let ${\mathcal {C}}$ be an A-coherent set of desirable gambles. The following holds:

$$\begin{aligned} {\mathcal {C}}^\circ =\left\{ {z} \in {\mathbb {C}}^{{d}}: L(g)=Tr({G} Z)\ge 0, ~Z\ge 0,~\forall g \in {\mathcal {C}}\right\} . \end{aligned}$$

(35)

Proof

By A-coherence, ${\mathcal {C}}$ includes $\varSigma ^{\ge }$, which is isomorphic to the closed convex cone of PSD matrices. We have that

$$\begin{aligned} L(g)=Tr({G} Z)\ge 0 ~~\forall g\in \varSigma ^{\ge } \subseteq {\mathcal {C}}. \end{aligned}$$

From a standard result in linear algebra, see for instance [68, Lemma 1.6.3], this implies that $Z \ge 0$, i.e., it must be a PSD matrix. $\square $

In what follows, we verify that the dual ${\mathcal {C}}^\circ $ is completely characterised by a closed convex set of states. But before doing that, we have to clarify what is a state in this context.

In an algorithmic TDG, postulate A0 is replaced with postulate B0. Hence, to define what a state is, one can no longer refer to nonnegative gambles but must refer to gambles that are A-nonnegative. This means that states are linear operators that: (1) assign nonnegative real numbers to A-nonnegative gambles, and (2) preserve the unit gamble. In the context of Hermitian gambles, the unit gamble is

$$\begin{aligned} \mathbbm {1}_R(x_1,\ldots ,x_m)=(\otimes _{j=1}^m x_j)^\dagger I (\otimes _{j=1}^m x_j)= \prod \limits _{j=1}^m {x_j}^{\dagger }x_j=1, \end{aligned}$$

(36)

where I is the identity matrix. Therefore, we want that

$$\begin{aligned} \begin{aligned} L\Big ((\otimes _{j=1}^m x_j)^\dagger I (\otimes _{j=1}^m x_j)\Big )&= L\Big (Tr\Big ( I ~(\otimes _{j=1}^m x_j)(\otimes _{j=1}^m x_j)^\dagger \Big )\Big )\\&=Tr\Big (I~ L\Big ( (\otimes _{j=1}^m x_j)(\otimes _{j=1}^m x_j)^\dagger \Big )\Big )\\&=Tr(I ~Z)=Tr(Z)=1. \end{aligned} \end{aligned}$$

(37)

Hence, the set of states is

$$\begin{aligned} {\mathcal {S}}_B=\{{z} \in {\mathbb {C}}^{{d}} \mid Z \ge 0, ~~Tr(Z)=1\}. \end{aligned}$$

(38)

By reasoning exactly as for Proposition 4, we then have the following result.

Theorem 3

The map

$$\begin{aligned} {\mathcal {C}}\mapsto {\mathcal {Q}}:={\mathcal {C}}^\circ \cap {\mathcal {S}}_B \end{aligned}$$

is a bijection between A-coherent sets of desirable gambles in ${\mathscr {L}}_R$ and closed convex subsets of ${\mathcal {S}}_B$.

We can therefore identify the dual of an A-coherent set of desirable gambles ${\mathcal {C}}$, with the closed convex set of states

$$\begin{aligned} {\mathcal {Q}} =\left\{ z \in {\mathcal {S}}_B: L(g)=Tr({G} Z)\ge 0, ~~\forall g \in {\mathcal {C}}\right\} . \end{aligned}$$

(39)

Notice that since matrices corresponding to states are density matrices, (39) is in fact equivalent to

$$\begin{aligned} \left\{ \rho \in {\mathscr {H}}^{n\times n}: \rho \ge 0, Tr(\rho )=1, Tr({G} \rho )\ge 0, ~~\forall G \in {\mathcal {C}}\right\} , \end{aligned}$$

meaning that we can identify the set ${\mathcal {S}}_B$ with the set of density matrices and denote its elements as usual with the symbol $\rho $.
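As an illustration of this description in terms of density matrices, the sketch below tests membership in ${\mathcal {S}}_B$ and in a set of the form (39). It assumes that ${\mathcal {C}}$ is generated by $\varSigma ^{\ge }$ together with finitely many additional gambles, so that only the listed matrices need to be checked once $\rho \ge 0$ holds; the function names, the tolerance and the single-qubit example are hypothetical.

```python
import numpy as np

def is_state(Z, tol=1e-10):
    """Check that Z is Hermitian, PSD and has unit trace (membership in S_B)."""
    Z = np.asarray(Z, dtype=complex)
    if not np.allclose(Z, Z.conj().T):
        return False
    psd = np.linalg.eigvalsh(Z)[0] >= -tol
    unit_trace = abs(np.trace(Z).real - 1.0) <= tol
    return bool(psd and unit_trace)

def in_Q(rho, generators, tol=1e-10):
    """Check that rho is a state with Tr(G rho) >= 0 for every listed G."""
    if not is_state(rho, tol):
        return False
    return all(np.trace(G @ rho).real >= -tol for G in generators)

# Assumed single-qubit example: besides Sigma^>=, Alice finds desirable the
# gamble with matrix G = |0><0| - 0.5 I, which forces rho_00 >= 0.5.
G = np.diag([0.5, -0.5]).astype(complex)
rho_in = np.diag([0.7, 0.3]).astype(complex)
rho_out = np.diag([0.3, 0.7]).astype(complex)

print(in_Q(rho_in, [G]))
print(in_Q(rho_out, [G]))
```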

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Benavoli, A., Facchini, A. & Zaffalon, M. The Weirdness Theorem and the Origin of Quantum Paradoxes. Found Phys 51, 95 (2021). https://doi.org/10.1007/s10701-021-00499-w


Received: 05 January 2021
Accepted: 08 September 2021
Published: 28 September 2021
DOI: https://doi.org/10.1007/s10701-021-00499-w
