1 Introduction

Contextuality is a key feature of quantum systems, as no noncontextual hidden-variable theory exists that is consistent with quantum theory. This feature has been at the core of recent research in quantum information, such as attempts to identify the underlying principles for the quantum boundary. Despite its importance, there seems to be no universally accepted measure of contextuality (see, e.g., the different approaches in Refs. [6, 10, 12–17, 20, 22, 26, 27, 29, 37]). Here, we consider and compare two measures inspired by the idea that contextuality means the impossibility of finding a joint probability distribution (jpd) for different sets of random variables with some elements in common. One measure (denoted below by \(\varDelta _{\min }\)) is based on extended sets of context-indexed random variables; its precursors can be found in Refs. [10, 11, 14, 29, 33, 37, 38] and in its current form it is presented in Refs. [12, 15–18, 27, 28]. The other measure (denoted below by \(\varGamma _{\min }\)) is based on negative (quasi-)probabilities dating back to Dirac, and recently explored in connection to contextuality in Refs. [1, 7–9, 31, 35].

As an example of contextuality, let there be three properties of a system, P, Q, and R, whose measurement outcomes are represented by the random variables \(\mathbf {P}\), \(\mathbf {Q}\), and \(\mathbf {R}\).Footnote 1 Assume we can never observe P, Q, and R simultaneously, but only in pairwise combinations, \(\left( \mathbf {P},\mathbf {Q}\right) \), \(\left( \mathbf {P},\mathbf {R}\right) \), or \(\left( \mathbf {Q},\mathbf {R}\right) \). We may think of each pair as recorded under a different experimental condition providing a context. The system exhibits contextuality if one cannot find a jpd of \(\left( \mathbf {P},\mathbf {Q},\mathbf {R}\right) \) that agrees with the observed distributions of \(\left( \mathbf {P},\mathbf {Q}\right) \), \(\left( \mathbf {P},\mathbf {R}\right) \), and \(\left( \mathbf {Q},\mathbf {R}\right) \) as its marginals. The two approaches to be considered in this paper deal with this situation differently. The negative probabilities (NP) approach relaxes the notion of a jpd by allowing some (unobservable) joint probabilities for \(\left( \mathbf {P},\mathbf {Q},\mathbf {R}\right) \) to be negative. The “contextuality-by-default” (CbD) approach treats random variables recorded under different conditions as different “by default”, so that, e.g., property P in the context of experiment \(\left( \mathbf {P},\mathbf {Q}\right) \) is represented by some random variable \(\mathbf {P}_{A}\), and in the context \(\left( \mathbf {P},\mathbf {R}\right) \) by another random variable, \(\mathbf {P}_{B}\). Denoting the three contexts by A, B, and C, this yields three pairs of contextually labeled random variables, \(\left( \mathbf {P}_{A},\mathbf {Q}_{A}\right) \), \(\left( \mathbf {P}_{B},\mathbf {R}_{B}\right) \), and \(\left( \mathbf {Q}_{C},\mathbf {R}_{C}\right) \), and in the CbD approach the joint distribution imposed on them allows, say, \(\mathbf {P}_{A}\) and \(\mathbf {P}_{B}\) to be unequal with some probability.

Here we compare the NP and CbD approaches applied to the simplest contextual case possible, with three pairwise correlated random variables, and to the standard EPR-Bell experiment. We show that for such examples the two measures of contextuality coincide (although they differ for more complex systems). The details of proofs and computations used in the main text are presented in the Appendix.

2 Negative Probabilities (NP)

Using our above example, with \(\mathbf {P},\mathbf {Q},\mathbf {R}\) observed in pairs, in the NP approach one ascribes to the vector \(\left( \mathbf {P},\mathbf {Q},\mathbf {R}\right) \) a joint quasi-distribution by means of assigning to each possible combination \(w=\left( p,q,r\right) \) a real number \(\mu \left( w\right) \) (possibly negative), such that

$$\begin{aligned} \begin{array}{c} \sum \nolimits _{r}\mu \left( w\right) =\Pr \left[ \mathbf {P}=p,\mathbf {Q}=q\right] ,\\ \sum \nolimits _{q}\mu \left( w\right) =\Pr \left[ \mathbf {P}=p,\mathbf {R}=r\right] ,\\ \sum \nolimits _{p}\mu \left( w\right) =\Pr \left[ \mathbf {Q}=q,\mathbf {R}=r\right] . \end{array} \end{aligned}$$
(1)

Such \(\mu \) exists if and only if the no-signaling condition (built into EPR paradigms with spacelike separation) is satisfied [2, 3, 31], i.e., the distribution of, say, \(\mathbf {P}\) is the same in \(\left( \mathbf {P},\mathbf {Q}\right) \) and in \(\left( \mathbf {P},\mathbf {R}\right) \).Footnote 2 The numbers \(\mu \left( w\right) \) can then be interpreted as quasi-probabilities of events \(\left\{ w\right\} \), with the quasi-probability of any other event (subset of w values) being computed by additivity, inducing thereby a signed measure [21] on the set of all events. The quasi-probability of the entire set of w will then be necessarily equal to unity, because, e.g.,

$$\begin{aligned} 1=\sum _{p,q}\Pr \left[ \mathbf {P}=p,\mathbf {Q}=q\right] =\sum _{w}\mu \left( w\right) . \end{aligned}$$
(2)
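As a minimal sketch of the no-signaling condition (not taken from the paper), the following Python snippet checks that each variable has the same marginal distribution in both contexts in which it is observed. The pairwise distributions are hypothetical data stored as dictionaries, and the names `marginal`, `same_dist`, and `is_no_signaling`, as well as the tolerance, are illustrative assumptions.

```python
def marginal(pair_dist, index):
    """Marginal of one variable from a pairwise distribution {(x, y): prob}."""
    out = {}
    for values, prob in pair_dist.items():
        out[values[index]] = out.get(values[index], 0.0) + prob
    return out


def same_dist(d1, d2, tol=1e-12):
    """True if two distributions agree on every value up to the tolerance."""
    keys = set(d1) | set(d2)
    return all(abs(d1.get(k, 0.0) - d2.get(k, 0.0)) <= tol for k in keys)


def is_no_signaling(pq, pr, qr, tol=1e-12):
    """True if P, Q, and R each have the same marginal in both of their contexts."""
    return (same_dist(marginal(pq, 0), marginal(pr, 0), tol)       # P in (P,Q) vs (P,R)
            and same_dist(marginal(pq, 1), marginal(qr, 0), tol)   # Q in (P,Q) vs (Q,R)
            and same_dist(marginal(pr, 1), marginal(qr, 1), tol))  # R in (P,R) vs (Q,R)


# Hypothetical data: perfectly anticorrelated +/-1 pairs with uniform marginals.
anti = {(1, 1): 0.0, (1, -1): 0.5, (-1, 1): 0.5, (-1, -1): 0.0}
# Hypothetical pair whose first variable is biased toward +1.
biased = {(1, 1): 0.4, (1, -1): 0.4, (-1, 1): 0.1, (-1, -1): 0.1}

ok = is_no_signaling(anti, anti, anti)     # all marginals uniform
bad = is_no_signaling(anti, biased, anti)  # P is uniform in (P,Q) but not in (P,R)
```

In the second call the marginal of \(\mathbf {P}\) differs between the two contexts, so no \(\mu \) satisfying (1) can exist for those data.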

The function \(\mu \) is generally not unique. In our approach [7, 8] we restrict the class of possible \(\mu \) to those as close as possible to a proper jpd by requiring that the L1 norm of the quasi-probability distribution, defined by \(M=\sum _{w}\left| \mu \left( w\right) \right| ,\) be minimized. This ensures that if the class of all possible \(\mu \) satisfying (1) contains proper probability distributions, the chosen \(\mu \) will have to be one of them. Since in this case \(\left| \mu \left( w\right) \right| =\mu \left( w\right) \) for all w, the minimum of M is 1. If (and only if) no proper probability distribution exists, then the minimum of M exceeds 1. As a result, the smallest possible value \(\varGamma _{\min }\) of \(M-1\) can be taken as a measure of contextuality.
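As an illustrative worked example (the numbers are our own choice, not taken from the paper), suppose \(\mathbf {P},\mathbf {Q},\mathbf {R}\) are \(\pm 1\)-valued with uniform marginals and every observed pair is perfectly anticorrelated, so that each pairwise probability is \(\frac{1}{2}\) when the two values differ and 0 when they agree. With the pairwise distributions fixed, the only freedom left in \(\mu \) is the coefficient t of the triple product pqr, and (1) is satisfied by

$$\begin{aligned} \mu \left( p,q,r\right) =\frac{1}{8}\left[ 1-\left( pq+pr+qr\right) +t\,pqr\right] . \end{aligned}$$

For example, \(\sum _{r}\mu \left( 1,1,r\right) =0\) and \(\sum _{r}\mu \left( 1,-1,r\right) =\frac{1}{2}\) for every t. The L1 norm is then

$$\begin{aligned} M\left( t\right) =\frac{1}{8}\left( \left| t-2\right| +\left| t+2\right| +3\left| 2-t\right| +3\left| 2+t\right| \right) =\frac{1}{2}\left( \left| t-2\right| +\left| t+2\right| \right) \ge 2, \end{aligned}$$

with equality for \(\left| t\right| \le 2\). Taking \(t=0\) gives \(\mu \left( 1,1,1\right) =\mu \left( -1,-1,-1\right) =-\frac{1}{4}\) and \(\mu \left( w\right) =\frac{1}{4}\) for the remaining six w, so in this example the minimal norm is 2 and \(\varGamma _{\min }=1\). No proper jpd exists here, since three \(\pm 1\) values cannot all be pairwise unequal.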

3 Contextuality-by-Default (CbD)

A more direct approach to contextuality [10–18, 27, 28] is to posit that the identity of a random variable is determined by all systematically recorded conditions under which it is observed. Thus, in \(\left( \mathbf {P}_{A},\mathbf {Q}_{A}\right) \), \(\left( \mathbf {P}_{B},\mathbf {R}_{B}\right) \), and \(\left( \mathbf {Q}_{C},\mathbf {R}_{C}\right) \) of our example, any random variable in any of the pairs is a priori different from and stochastically unrelated to any random variable in any other pair [12, 14], but a jpd can always be imposed on the six random variables. In other words, one can always assign probability masses \(\lambda \) to \(v=\left( p_{A},p_{B},q_{A},q_{C},r_{B},r_{C}\right) \) in such a way that

$$\begin{aligned} \begin{array}{c} \sum \nolimits _{p_{B},q_{C},r_{B},r_{C}}\lambda \left( v\right) =\Pr \left[ \mathbf {P}_{A}=p_{A},\mathbf {Q}_{A}=q_{A}\right] ,\\ \sum \nolimits _{p_{A},q_{A},q_{C},r_{C}}\lambda \left( v\right) =\Pr \left[ \mathbf {P}_{B}=p_{B},\mathbf {R}_{B}=r_{B}\right] ,\\ \sum \nolimits _{p_{A},p_{B},q_{A},r_{B}}\lambda \left( v\right) =\Pr \left[ \mathbf {Q}_{C}=q_{C},\mathbf {R}_{C}=r_{C}\right] . \end{array} \end{aligned}$$
(3)

The noncontextuality hypothesis for \(\mathbf {P}_{A},\mathbf {Q}_{A},\mathbf {R}_{B}\) and \(\mathbf {P}_{B},\mathbf {Q}_{C},\mathbf {R}_{C}\) is that among these jpds \(\lambda \) we can find at least one for which \(\Pr \left[ \mathbf {P}_{A}\not =\mathbf {P}_{B}\right] =\Pr \left[ \mathbf {Q}_{A}\not =\mathbf {Q}_{C}\right] =\Pr \left[ \mathbf {R}_{B}\not =\mathbf {R}_{C}\right] =0,\) which is equivalent to \(\varDelta =\Pr \left[ \mathbf {P}_{A}\not =\mathbf {P}_{B}\right] +\Pr \left[ \mathbf {Q}_{A}\not =\mathbf {Q}_{C}\right] +\Pr \left[ \mathbf {R}_{B}\not =\mathbf {R}_{C}\right] =0.\) Such a jpd need not exist, and then the smallest possible value \(\varDelta _{\min }\) of \(\varDelta \) for which a jpd of \(\left( \mathbf {P}_{A},\mathbf {P}_{B},\mathbf {Q}_{A},\mathbf {Q}_{C},\mathbf {R}_{B},\mathbf {R}_{C}\right) \) exists can be taken as a measure of contextuality.Footnote 3
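To make the role of the minimization concrete, here is a small Python sketch (our own illustration, with hypothetical data and function names) that evaluates \(\varDelta \) for two different jpds \(\lambda \) imposed on the same observed pairs. The data assume \(\pm 1\)-valued variables with every observed pair perfectly anticorrelated and uniform marginals. The first coupling makes the three pairs stochastically independent; the second makes all six variables functions of a single fair coin.

```python
from itertools import product

# Coordinate order follows the text: v = (p_A, p_B, q_A, q_C, r_B, r_C).


def delta(lam):
    """Delta = Pr[P_A != P_B] + Pr[Q_A != Q_C] + Pr[R_B != R_C] under the jpd lam."""
    total = 0.0
    for (p_a, p_b, q_a, q_c, r_b, r_c), prob in lam.items():
        total += prob * ((p_a != p_b) + (q_a != q_c) + (r_b != r_c))
    return total


def pair_marginal(lam, i, j):
    """Joint distribution of coordinates i and j of v, as in the sums of Eq. (3)."""
    out = {}
    for v, prob in lam.items():
        key = (v[i], v[j])
        out[key] = out.get(key, 0.0) + prob
    return out


# Hypothetical observed pair distribution (zero-probability outcomes omitted).
anti = {(1, -1): 0.5, (-1, 1): 0.5}

# Coupling 1: the three observed pairs are stochastically independent.
independent = {}
for (p_a, q_a), (p_b, r_b), (q_c, r_c) in product(anti, repeat=3):
    weight = anti[(p_a, q_a)] * anti[(p_b, r_b)] * anti[(q_c, r_c)]
    independent[(p_a, p_b, q_a, q_c, r_b, r_c)] = weight

# Coupling 2: every variable is a function of one fair coin x.
one_coin = {(x, x, -x, -x, -x, x): 0.5 for x in (1, -1)}

delta_independent = delta(independent)  # each same-property pair differs half the time
delta_one_coin = delta(one_coin)        # only R_B and R_C ever differ
```

Both couplings reproduce the observed pairs, which sit at coordinates (0, 2), (1, 4), and (3, 5) of v, yet they give different values of \(\varDelta \) (1.5 and 1, respectively). This is why the measure is defined through the smallest value over all admissible \(\lambda \) and not through any particular coupling.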

The CbD approach has its precursors in the literature: various aspects of the contextual indexation of random variables and probabilities of the kind shown are considered in Refs. [10, 14, 23–25, 29, 33, 37, 38]. The principal difference, however, is in the use of minimization of \(\varDelta \) under the assumption that a jpd exists. This is a well-defined mathematical problem, solvable in principle for any set of distributions observed empirically. We will now compare and interrelate the two approaches, NP and CbD, by applying them to the Leggett-Garg and the EPR-Bell setups.

4 Leggett-Garg

Let us consider Leggett and Garg’s \(\pm 1\)-valued random variables, \(\mathbf {Q}_{1}\), \(\mathbf {Q}_{2}\), and \(\mathbf {Q}_{3}\) [30]. Applying the NP approach, we seek signed probabilities \(\mu \) for \(\left( \mathbf {Q}_{1},\mathbf {Q}_{2},\mathbf {Q}_{3}\right) \) that are consistent with the observed correlations \(\left\langle \mathbf {Q}_{i}\mathbf {Q}_{j}\right\rangle \) and individual expectations \(\left\langle \mathbf {Q}_{i}\right\rangle \), with the smallest possible value of the L1 norm \(M\equiv \sum _{w}\left| \mu \left( w\right) \right| \), where w denotes all possible combinations of values \(\left( q_{1},q_{2},q_{3}\right) \) for \(\left( \mathbf {Q}_{1},\mathbf {Q}_{2},\mathbf {Q}_{3}\right) \). Here, we use the standard notation \(\left\langle \cdot \right\rangle \) for the expectation operator. This problem can be easily solved, as we only have \(2^{3}\) atomic elements w: \(\left( 1,1,1\right) \), \(\left( 1,1,-1\right) \), ... , \(\left( -1,-1,-1\right) \). Thus, for \(\mathbf {Q}_{1}\), \(\mathbf {Q}_{2}\), and \(\mathbf {Q}_{3}\), the minimal L1 norm \(1+\varGamma _{\min }\) satisfies

$$\begin{aligned} \varGamma _{\min }=\max \left\{ 0,-\frac{1}{2}+\frac{1}{2}S_{LG}\right\} , \end{aligned}$$
(4)

where \(S_{LG}\) is defined as

$$\begin{aligned} S_{LG}\equiv \max _{\#^{-}=1,3}\{\pm \left\langle \mathbf {Q}_{1}\mathbf {Q}_{2}\right\rangle \pm \left\langle \mathbf {Q}_{1}\mathbf {Q}_{3}\right\rangle \pm \left\langle \mathbf {Q}_{2}\mathbf {Q}_{3}\right\rangle \}, \end{aligned}$$
(5)

where each \(\pm \) in the expression should be replaced with \(+\) or \(-\), and \(\#^{-}\) indicates the possible numbers of minuses. Notice that \(S_{LG}\le 1\), which is equivalent to \(\varGamma _{\min }=0\), is a necessary and sufficient condition for the existence of a proper jpd.
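The following Python sketch evaluates (4) and (5) and compares the result with a direct minimization of the L1 norm. The direct route is our own shortcut and is valid only for three \(\pm 1\)-valued variables: once the means and pairwise correlations are fixed, \(\mu \) depends on a single free parameter t (the coefficient of \(q_{1}q_{2}q_{3}\)), and \(M\left( t\right) \) is convex and piecewise linear, so its minimum is attained at one of its kinks. The function names and the example inputs are hypothetical.

```python
from itertools import product


def s_lg(c12, c13, c23):
    """Eq. (5): maximum of +/-c12 +/-c13 +/-c23 over sign choices with 1 or 3 minuses."""
    best = float("-inf")
    for signs in product((1, -1), repeat=3):
        if signs.count(-1) % 2 == 1:
            best = max(best, signs[0] * c12 + signs[1] * c13 + signs[2] * c23)
    return best


def gamma_formula(c12, c13, c23):
    """Eq. (4): closed-form value of Gamma_min."""
    return max(0.0, -0.5 + 0.5 * s_lg(c12, c13, c23))


def gamma_direct(means, c12, c13, c23):
    """Minimal L1 norm minus 1, found by scanning the kinks of M(t)."""
    m1, m2, m3 = means
    atoms = []  # (part of 8*mu(w) that is fixed by the data, sign of q1*q2*q3)
    for q1, q2, q3 in product((1, -1), repeat=3):
        fixed = (1 + q1 * m1 + q2 * m2 + q3 * m3
                 + q1 * q2 * c12 + q1 * q3 * c13 + q2 * q3 * c23)
        atoms.append((fixed, q1 * q2 * q3))

    def l1_norm(t):
        return sum(abs(fixed + sign * t) for fixed, sign in atoms) / 8

    # Each term vanishes at t = -fixed * sign; a convex piecewise-linear
    # function that grows at both ends is minimized at one of these kinks.
    return min(l1_norm(-fixed * sign) for fixed, sign in atoms) - 1


# Hypothetical inputs: (means, (c12, c13, c23)).
examples = [
    ((0, 0, 0), (-0.5, -0.5, -0.5)),    # S_LG = 1.5, contextual
    ((0, 0, 0), (0.2, 0.3, -0.1)),      # S_LG = 0.6, noncontextual
    ((0.2, 0, 0), (-0.5, -0.5, -0.5)),  # nonzero mean, same correlations
]
results = [(gamma_formula(*c), gamma_direct(m, *c)) for m, c in examples]
```

For these inputs the two computations agree, which is consistent with (4) but is of course only a spot check and not a proof.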

Turning now to the CbD approach, we create a set of six random variables

$$\begin{aligned} \mathbf {Q}_{1,2},\mathbf {Q}_{1,3},\mathbf {Q}_{2,1},\mathbf {Q}_{2,3},\mathbf {Q}_{3,1},\mathbf {Q}_{3,2}, \end{aligned}$$
(6)

each indexed by the measurement conditions under which it is recorded: for any two random variables recorded at moments \(t_{i}\) and \(t_{j}\), with \(i<j\), the \(\mathbf {Q}_{i,j}\) designates the earlier variable and \(\mathbf {Q}_{j,i}\) the later one. We have thus three pairs of variables with known jpds:

$$\begin{aligned} \left( \mathbf {Q}_{1,2},\mathbf {Q}_{2,1}\right) ,\left( \mathbf {Q}_{1,3},\mathbf {Q}_{3,1}\right) ,\left( \mathbf {Q}_{2,3},\mathbf {Q}_{3,2}\right) . \end{aligned}$$
(7)

A jpd can always be constructed for these pairs (e.g., they can always be connected as stochastically independent pairs), but we seek a jpd with the smallest value \(\varDelta _{\min }\) of

$$\begin{aligned} \begin{array}{r} \varDelta =\Pr \left[ \mathbf {Q}_{1,2}\ne \mathbf {Q}_{1,3}\right] +\Pr \left[ \mathbf {Q}_{2,1}\ne \mathbf {Q}_{2,3}\right] +\Pr \left[ \mathbf {Q}_{3,1}\ne \mathbf {Q}_{3,2}\right] .\end{array} \end{aligned}$$
(8)

A classical joint exists for \(\mathbf {Q}_{1}\), \(\mathbf {Q}_{2}\), and \(\mathbf {Q}_{3}\) (no contextuality) if and only if a joint exists for (7) with \(\varDelta =0\). The more we depart from the classical joint, the larger the minimum value \(\varDelta _{\min }\). Thus, \(\varDelta _{\min }\) can serve as a measure of contextuality.

Requiring a jpd consistent with (7) means to assign a probability to each of the \(2^{6}\) possible values of these random variables,

$$\begin{aligned} \mathbf {Q}_{1,2}=\pm 1,\mathbf {Q}_{1,3}=\pm 1,\ldots ,\mathbf {Q}_{3,2}=\pm 1, \end{aligned}$$
(9)

constrained by being nonnegative and summing to the observed probabilities. For instance, the probabilities assigned to all combinations with \(\mathbf {Q}_{1,2}=1\) and \(\mathbf {Q}_{2,1}=-1\) should sum to the observed \(\Pr \left[ \mathbf {Q}_{1,2}=1,\mathbf {Q}_{2,1}=-1\right] \). A computer-assisted Fourier-Motzkin elimination algorithm gives the following analytic expression for the minimum value of \(\varDelta \) consistent with the observable pairs (7):

$$\begin{aligned} \varDelta _{\min }=\max \left\{ 0,-\frac{1}{2}+\frac{1}{2}S_{LG}\right\} . \end{aligned}$$
(10)

This is a special case of the result in Refs. [16, 17, 27].
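One way to see why \(\varDelta \) cannot fall below the right-hand side of (10) is the following short argument, offered as a sketch and not as a reproduction of the Fourier-Motzkin computation. Under any jpd of the six variables in (6), each variable appears twice in the cyclic product

$$\begin{aligned} \left( \mathbf {Q}_{1,2}\mathbf {Q}_{2,1}\right) \left( \mathbf {Q}_{2,1}\mathbf {Q}_{2,3}\right) \left( \mathbf {Q}_{2,3}\mathbf {Q}_{3,2}\right) \left( \mathbf {Q}_{3,2}\mathbf {Q}_{3,1}\right) \left( \mathbf {Q}_{3,1}\mathbf {Q}_{1,3}\right) \left( \mathbf {Q}_{1,3}\mathbf {Q}_{1,2}\right) =1. \end{aligned}$$

If the six \(\pm 1\)-valued factors are given signs with an odd number of minuses, the signed factors multiply to \(-1\), so at least one of them equals \(-1\) and their sum is at most 4. Putting plus signs on the three same-property products and an odd number of minuses on the three observed products, then taking expectations, gives

$$\begin{aligned} S_{LG}+\left\langle \mathbf {Q}_{1,2}\mathbf {Q}_{1,3}\right\rangle +\left\langle \mathbf {Q}_{2,1}\mathbf {Q}_{2,3}\right\rangle +\left\langle \mathbf {Q}_{3,1}\mathbf {Q}_{3,2}\right\rangle \le 4. \end{aligned}$$

For \(\pm 1\)-valued variables \(\left\langle \mathbf {X}\mathbf {Y}\right\rangle =1-2\Pr \left[ \mathbf {X}\ne \mathbf {Y}\right] \), so the three same-property correlations sum to \(3-2\varDelta \), and therefore

$$\begin{aligned} \varDelta \ge -\frac{1}{2}+\frac{1}{2}S_{LG}. \end{aligned}$$

Together with \(\varDelta \ge 0\) this yields the lower bound; that the bound is actually attained is the content of (10).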

Comparing the general expressions (4) for \(\varGamma _{\min }\) and (10) for \(\varDelta _{\min }\) we see that the two simply coincide:

$$\begin{aligned} \varDelta _{\min }=\varGamma _{\min }. \end{aligned}$$
(11)

5 EPR-Bell

We now turn to the EPR-Bell case where Alice and Bob each have two distinct settings, 1 and 2, corresponding to four observable random variables \(\mathbf {A}_{1}\), \(\mathbf {A}_{2}\), \(\mathbf {B}_{1}\), and \(\mathbf {B}_{2}\). This notation implicitly contains the assumption that the identity of Alice’s measurements as random variables does not depend on Bob’s settings, and vice versa. It is well known [19] that under the no-signaling conditions the existence of the jpd is equivalent to the CHSH inequalities being satisfied. In the NP approach, the minimal L1 norm of the quasi-probability distribution is \(1+\varGamma _{\min }\), with \(\varGamma _{\min }\) given by [31]

$$\begin{aligned} \varGamma _{\min }=\max \left\{ 0,\frac{1}{2}S_{CHSH}-1\right\} , \end{aligned}$$
(12)

where

$$\begin{aligned} \begin{array}{r} S_{CHSH}={\displaystyle \max _{\#^{-}=1,3}}\{\pm \left\langle \mathbf {A}_{1,1}\mathbf {B}_{1,1}\right\rangle \pm \left\langle \mathbf {A}_{1,2}\mathbf {B}_{1,2}\right\rangle \pm \left\langle \mathbf {A}_{2,1}\mathbf {B}_{2,1}\right\rangle \pm \left\langle \mathbf {A}_{2,2}\mathbf {B}_{2,2}\right\rangle \}.\end{array} \end{aligned}$$
(13)

Here \(\varGamma _{\min }=0\) corresponds to the CHSH inequalities, and \(\varGamma _{\min }>0\) to contextuality.
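As a quick numerical illustration of (12) and (13), the Python sketch below evaluates \(S_{CHSH}\) and \(\varGamma _{\min }\) for three standard sets of correlations: vanishing correlations, the maximal quantum violation with correlations \(\pm 1/\sqrt{2}\), and a PR box. The function names are hypothetical, and the arguments are the four correlations \(\left\langle \mathbf {A}_{i,j}\mathbf {B}_{i,j}\right\rangle \) in the order they appear in (13).

```python
from itertools import product
from math import sqrt


def s_chsh(e11, e12, e21, e22):
    """Eq. (13): maximum of the signed sum over sign choices with 1 or 3 minuses."""
    correlations = (e11, e12, e21, e22)
    return max(
        sum(sign * value for sign, value in zip(signs, correlations))
        for signs in product((1, -1), repeat=4)
        if signs.count(-1) % 2 == 1
    )


def gamma_min(e11, e12, e21, e22):
    """Eq. (12): Gamma_min as a function of the four correlations."""
    return max(0.0, 0.5 * s_chsh(e11, e12, e21, e22) - 1.0)


r = 1 / sqrt(2)
cases = {
    "no correlations": gamma_min(0, 0, 0, 0),         # S_CHSH = 0
    "maximal quantum violation": gamma_min(r, r, r, -r),  # S_CHSH = 2*sqrt(2)
    "PR box": gamma_min(1, 1, 1, -1),                 # S_CHSH = 4
}
```

The three values are 0, \(\sqrt{2}-1\), and 1, so on this scale the maximal quantum violation sits a little under halfway between a noncontextual system and a PR box.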

Turning now to the CbD approach, we have four pairs of random variables,

$$\begin{aligned} \left( \mathbf {A}_{1,1},\mathbf {B}_{1,1}\right) ,\left( \mathbf {A}_{1,2},\mathbf {B}_{1,2}\right) ,\left( \mathbf {A}_{2,1},\mathbf {B}_{2,1}\right) ,\left( \mathbf {A}_{2,2},\mathbf {B}_{2,2}\right) . \end{aligned}$$
(14)

Here, \(\mathbf {A}_{i,j}\) denotes Alice’s measurement under her setting \(i=1,2\) when Bob’s setting is \(j=1,2\), and analogously for \(\mathbf {B}_{i,j}\). We seek a jpd with the smallest value \(\varDelta _{\min }\) of

$$\begin{aligned} \begin{array}{r} \Pr \left[ \mathbf {A}_{1,1}\ne \mathbf {A}_{1,2}\right] +\Pr \left[ \mathbf {A}_{2,1}\ne \mathbf {A}_{2,2}\right] +\Pr \left[ \mathbf {B}_{1,1}\ne \mathbf {B}_{2,1}\right] +\Pr \left[ \mathbf {B}_{1,2}\ne \mathbf {B}_{2,2}\right] .\end{array} \end{aligned}$$
(15)

No contextuality means \(\varDelta _{\min }=0\). A computer-assisted Fourier-Motzkin elimination algorithm yields (this is a special case of the result in Refs. [16, 17, 27])

$$\begin{aligned} \varDelta _{\min }=\max \left\{ 0,\frac{1}{2}S_{CHSH}-1\right\} . \end{aligned}$$
(16)

We have the same simple coincidence of the two measures as in the case of the Leggett-Garg systems,

$$\begin{aligned} \varDelta _{\min }=\varGamma _{\min }. \end{aligned}$$
(17)

6 Final Remarks

We have discussed two ways to measure contextuality. The direct approach, named Contextuality-by-Default (CbD), assigns to each random variable an index related to its context. If a system is noncontextual, a jpd can be imposed on the random variables so that any two of them representing the same property in different contexts always have the same values. If the system is contextual, the minimum value of \(\varDelta \) across all possible jpds, given by (10) and (16), measures how close to identical the variables representing the same property in different contexts can be made: the larger the value, the greater the contextuality, with zero being a necessary and sufficient condition for no contextuality.

The other approach maintains the original set of random variables, but requires negative (quasi-)probabilities. This leads to nonmonotonicity (i.e., a set of outcomes can have a smaller probability than some of its proper subsets), which is a characteristic of quantum interference. The departure from a proper probability distribution is measured by \(\varGamma _{\min }\) in the minimum L1 norm \(1+\varGamma _{\min }\). Similar to the CbD approach, we use here a minimization principle that gives the closest probability distribution to an ideal (but impossible) jpd. The value of \(\varGamma _{\min }\) has the interpretation of how contextual the system is: a necessary and sufficient condition for no contextuality is \(\varGamma _{\min }=0\), and the larger the value of \(\varGamma _{\min }\), the more contextual the system is.

As we have seen, in the case of EPR-Bell and Leggett-Garg systems the two approaches lead to simple coincidence, \(\varDelta _{\min }=\varGamma _{\min }\). The two measures, \(\varGamma _{\min }\) and \(\varDelta _{\min }\), can be computed for any given system, and they do not generally coincide for more complex systems. For example, for a bipartite system with three settings each for Alice and Bob, our computations show that \(\varGamma _{\min }\) has a value of 1 for all Popescu-Rohrlich (PR) boxes [5, 32], whereas for some PR boxes \(\varDelta _{\min }=2\) and for others \(\varDelta _{\min }=1\). Still more complex systems exhibit still richer patterns of values for \(\varGamma _{\min }\) and \(\varDelta _{\min }\).

Of the two measures of contextuality, \(\varGamma _{\min }\) is computationally much simpler, as it involves fewer random variables and a simpler set of conditions (no nonnegativity constraints). However, CbD has the advantage of being more general than NP, as it can include cases where no NP distributions exist due to violations of the no-signaling condition [15, 17, 27].