1 Introduction

This invitation aims mainly at calling your attention to an emerging effort to improve the way we do econometrics. In fact, the effort started at the dawn of econometrics with the man who created the field (as a synthesis of mathematics, economic theory and statistics), Jan Tinbergen, recipient of the first Nobel Memorial Prize in Economics in 1969. A physicist turned economist, he proposed the gravity model of international trade, a formula similar to Newton’s law of gravity in which mass is replaced by GDP. This connection with physics, or more precisely with mechanics, seems natural, as both mechanics and econometrics, especially finance, are concerned with models and predictions of (uncertain) dynamical systems. Earlier, to “capture” (explain) the observed fluctuations of stock returns, Louis Bachelier in his Ph.D. thesis (1900) proposed a continuous-time model based on Brownian motion, which later formed the foundation of financial mathematics (through the works of Black, Scholes and Merton, 1973, whose diffusion models are based on Brownian motion). But Brownian motion, as explained by Albert Einstein in 1905, is the motion of minuscule pollen particles suspended in water (which can be seen to wiggle and wander when examined under a strong microscope), i.e., motion in the realm of extremely “small” objects, the realm of quantum mechanics! Thus, a shift from Newtonian mechanics to quantum mechanics seems natural in this context; more specifically, a shift from Kolmogorov probability to quantum probability seems desirable.

Remark. In fact, Brownian motion is modeled probabilistically as a “limit” of random walks within Kolmogorov’s probability theory. The same situation occurred in statistical mechanics (see, e.g., Sethna [12]). Surrounding the famous Black-Scholes option pricing formula are tools such as PDEs, Ito stochastic calculus, the martingale method, and various extended models related to volatility. If quantum probability is to replace Kolmogorov probability, then we should turn to Quantum Stochastic Calculus (see, e.g., [10]).

It is well known that Kolmogorov probability theory is not appropriate for quantum mechanics (as exemplified by the two-slit experiment); the failure of the additivity property of probability measures is the most striking symptom. In fact, a radically different formalism has been developed for calculating probabilities in quantum mechanics, with great success (i.e., confirmed by experiments).

This is essentially a lesson learned from physics. It is not just a matter of importing material from physics into economics, but of looking at physics as an evolving science with great successes (as testified by what engineering has given us in daily life!). When we are uncertain (whether the uncertainty is epistemic or random) about some phenomenon, e.g., in “classical” mechanics (Newtonian mechanics and Einstein’s relativity theories) or in quantum mechanics, we propose models based, of course, on “evidence” from observations, measurements, and “imagination”. This is common to physics and to statistics (as used in, say, econometrics). Since physics has an advantage over the social sciences (such as economics), namely that we can perform experiments to test the predictions of our models against observations, the evolution of physics (from one model to another) has proceeded peacefully, as opposed to statistical debates on modern methodologies! Let’s give a striking example:

For the purpose of “improving” statistical methods (which are used in various applied fields), at least three things surfaced recently:

  1. (i)

    The questionable use of P-values in hypothesis testing,

  2. (ii)

    The seemingly realistic prediction methodology based on calibration vs estimation (especially when big data are available),

  3. (iii)

    The possible use of quantum probability calculus in applied statistics.

Let’s “compare” the reactions of statisticians (to the above three proposed “innovative” things) with the succession of three models in quantum physics. It was discovered that a hydrogen atom consists of a single proton at the center and a single electron orbiting around it. The problem is: How does the electron move around the proton? Since we cannot “see” the electron’s movement, we must propose models (and then verify whether such models reflect “reality”, i.e., are compatible with “observations”/possible measurements).

(1) First model (Ernest Rutherford): Just like the earth rotating around the sun, the electron could simply follow a “solar system” pattern. The “reality” is this: the solar system is stable (that’s why we are still alive today! the earth does not collapse into the sun, despite the gravitational force between them), and so is the electron-proton system. But, unlike the solar system, the subatomic particles carry electric charges (of opposite signs), and as such the Rutherford model is unstable: the electron would spiral into the proton at the center. Hence this model does not correspond to reality.

(2) Bohr’s model: Thus the first model had to be replaced. To explain the stability of hydrogen atoms, Bohr proposed the following model: the electron rotates around the proton, not in a continuous fashion, but at “discrete” levels, i.e., there is a countable number of orbits that the electron can travel, between which it can jump, so that the atom does not collapse. However, this model is only good for the hydrogen atom, not for other atoms.

(3) Schrodinger’s model: Not only to model the dynamics of all particles, but also to explain Bohr’s model for the hydrogen atom, Schrodinger proposed that the electron is in many places at once, in an “electron cloud” whose shape is given by a wave function (governed by Schrodinger’s fundamental equation).

This evolving understanding of hydrogen atom dynamics is a “peaceful” and productive phenomenon! Newly proposed models were received with an open mind. As Box said, “all models are wrong, but some are useful”; an open-minded attitude is helpful in the sciences. Tradition should not be an obstacle to scientific progress.

Now, with respect to the main theme of this paper, namely the proposal to see whether quantum probability (a generalization of the Kolmogorov probability calculus), viewed as a “new model for probability calculus” (not a new meaning of probability per se), could be used in the social sciences (e.g., economics) when appropriate, the situation is this. Again, by “tradition” (as with issues (i) and (ii) listed above), progress is slow, as usual! Perhaps only a handful of statisticians are aware of the proposal, let alone taking a closer look at it.

Let’s quote a recent opinion of some prominent statisticians on this proposal, namely Andrew Gelman and Michael Betancourt [5]:

“Does quantum uncertainty have a place in everyday applied statistics?”

(a) Open mind: “We are sympathetic to the proposal of modeling joint probabilities using a framework more general than standard model by relaxing the law of conditional probability”.

“The generalized probability theory suggested by quantum physics might very well be relevant in the social sciences”.

Remark. An obvious research issue arises right here: Beyond copulas? Interference vs correlation.

(b) A closer look at a new proposal: “Some of our own applied work involves political science and policy, often with analysis of data from opinion polls, where there are clear issues of the measurement affecting the outcome”.

Remark. “Measurement affecting the outcome” is the main real phenomenon in quantum physics, as expressed by Heisenberg’s uncertainty principle (responsible for the lack of a phase space in quantum mechanics). The point is this: it’s all about data (observations). The data dictate the methods used to analyze them, and not the other way around.

(c) Some possible gains: “Just as psychologists have found subadditivity and superadditivity of probability in many contexts, we see the potential gain of thinking about violating of the conditional probability law”.

Remark. To apply quantum probability calculus to social science problems, one needs clear evidence of the failure of classical probability theory. Remember: “The ultimate challenge in statistics is to solve applied problems”.

Finally, and this is important (!), let us say it loud and clear: even if quantum probability calculus turns out to be useful for applied statistics, it does not mean that we have to “ignore” standard probability theory, i.e., replace the latter by the former. This is important for two reasons:

(i) “Traditional” researchers need not worry about abandoning what they have worked with until now, since Kolmogorov probability theory could remain appropriate for many situations,

(ii) Quantum probability calculus may be only suitable for some situations, but not all.

This is completely similar to the situation in mechanics: the discovery of quantum mechanics did not invalidate Newtonian mechanics, which is still valid in macrophysics.

So, let us assume that we have an open mind and want to understand the new proposal before making our own judgement on whether it could be used in, say, financial econometrics. Thus, tradition aside, let’s find out why in quantum physics the calculus of probabilities is different from the classical Kolmogorov one.

Again, mathematical finance was founded on the Black-Scholes option pricing PDE, which was based upon the modeling of financial returns as diffusion processes within probability theory. In this modeling approach, the return distributions are classical probability distributions. The basic question of “econophysicists” is this:

“Should we model return distributions with distributions which reflect the data in a much closer way?”

Clearly, predictions would improve if the models were better! In fact, research reported in the literature shows that this quantum approach can be of potential benefit.

By the very nature of Brownian motion, should we instead study finance in the context of quantum mechanics, with the hope that “quantum probability distributions” will supply a reasonable answer to the above question? The attempt to put the Black-Scholes pricing formula into the quantum context was discussed in [11], which rationalized the use of quantum principles in the option pricing context:

“A natural explanation of extreme irregularities in the evolution of prices in financial markets is provided by quantum effects”.

See also [7].

At a technical level, the difference between the classical modeling approach (i.e., based on Kolmogorov’s probability theory) and the “quantum approach” can be explained as follows.

(i) The Kolmogorov probability formalism covers both objective and subjective probabilities, which represent man-made uncertainty (i.e., uncertainty imposed by humans), e.g., mixed strategies in Von Neumann’s game theory for economics, or Savage’s expected utility theory, whereas uncertainty in quantum physics is due to nature itself,

(ii) While we keep the same interpretation of the concept of probability (“chance”), the calculus of these types of probabilities is different. For example, quantum probability measures are not additive (due to quantum interference of the waves of particles). This clearly affects our attempts at making financial predictions!

In any case, the quantum approach to finance in particular, and to econometrics in general, is an ongoing research direction. For an introduction to Quantum Finance, see [1].

In this introductory lecture to quantum econometrics, we will only focus on the main ingredient, namely the concept of quantum probability (and the context giving rise to it), which plays a crucial role in the uncertainty analysis of quantum mechanics and possibly in the social sciences. While the Feynman path integral is useful for solving the initial value problem for the Schrodinger equation, it will not be discussed in this introductory lecture. Curious readers may consult Keller and McLaughlin [6].

As such, in Sect. 2, a bit of quantum mechanics is given. Section 3 presents the uncertainty analysis in quantum context. Section 4 presents a mathematical formulation for quantum probability together with a comparison with Kolmogorov probability theory. Section 5 concludes the paper by discussing econometrics issues. Along the way, research issues will be mentioned.

2 A Bit of Quantum Mechanics

Unlike statistical mechanics, quantum mechanics deals with randomness believed to be caused by nature itself. As we are going to examine whether economic fluctuations can be modeled by quantum uncertainty, we need to take a quick look at quantum mechanics. For a good and enjoyable reading on quantum mechanics, consult Feynman [3, 4].

The big picture of quantum mechanics is this. A particle with mass m and potential energy \(V(x_{o})\) at a position \(x_{o}\in \mathbb {R}^{3}\) at time \(t=0\) will move to a position x at a later time \(t>0\). But unlike Newtonian mechanics (where moving objects obey a law of motion and their time evolutions are deterministic trajectories, a state being a point in \(\mathbb {R}^{6}\): position and velocity), the motion of a particle is not deterministic, so at most we can only ask for the probability that it is in a small neighborhood of x at time t. Thus the problem is: how to obtain such a probability? According to quantum mechanics, the relevant probability density \(f_{t}(x)\) is of the form \(|\psi (x,t)|^{2}\), where the (complex) “probability amplitude” \(\psi (x,t)\) satisfies the Schrodinger equation (playing the role of Newton’s law of motion in macrophysics)

$$\begin{aligned} ih\frac{\partial \psi (x,t)}{\partial t}=-\frac{h^{2}}{2m}\varDelta _{x}\psi (x,t)+V(x)\psi (x,t) \end{aligned}$$

where h is Planck’s constant, \(i=\sqrt{-1}\), and \(\varDelta _{x}\) is the Laplacian \(\varDelta _{x}\psi =\frac{\partial ^{2}\psi }{\partial x_{1}^{2}}+ \frac{\partial ^{2}\psi }{\partial x_{2}^{2}}+\frac{\partial ^{2}\psi }{ \partial x_{3}^{2}}\), \(x=(x_{1},x_{2},x_{3})\in \mathbb {R}^{3}\).
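The Born-rule reading of \(|\psi (x,t)|^{2}\) as a probability density can be checked numerically. A minimal 1D sketch (the Gaussian wave function and the grid are assumptions for illustration): the density must integrate to 1 over the whole line.

```python
import numpy as np

# Minimal 1D sketch (the Gaussian wave function and grid are assumptions):
# the Born rule reads |psi(x,t)|^2 as a probability density, so its
# integral over the whole line must be 1.
sigma = 1.0                                   # assumed wave-packet width
x = np.linspace(-10, 10, 2001)                # spatial grid
dx = x[1] - x[0]
psi = (np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (2 * sigma**2))

density = np.abs(psi) ** 2                    # f_t(x) = |psi(x,t)|^2
total_prob = np.sum(density) * dx             # numerical integral of |psi|^2
print(total_prob)
```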

Solutions of the Schrodinger equation are “wave-like”, and hence are called wave functions of the particle (the equation itself being called the wave equation). Of course, solving this PDE in each specific situation is crucial. Richard Feynman [2] introduced the concept of the path integral to solve it.

For a solution of the form \(\psi (x,t)=\varphi (x)e^{it\theta }\), \(|\psi (x,t)|^{2}=|\varphi (x)|^{2}\) with \(\varphi \in L^{2}(\mathbb {R}^{3}, \mathscr {B}(\mathbb {R}^{3}),dx)\); in fact \(||\varphi ||=1\). Now, since the particle can take any path from \((x_{o},0)\) to (x, t), its “state” has to be described probabilistically. Roughly speaking, each \(\varphi \) (viewed as a “vector” in the complex, infinite-dimensional Hilbert space \(L^{2}( \mathbb {R}^{3},\mathscr {B}(\mathbb {R}^{3}),dx)\)) represents a state of the moving particle. Now \(L^{2}(\mathbb {R}^{3},\mathscr {B}(\mathbb {R}^{3}),dx)\) is separable, so it has a countable orthonormal basis, \(\varphi _{n}\) say, and hence

$$\begin{aligned} \varphi =\sum _{n}<\varphi _{n},\varphi>\varphi _{n}=\sum _{n}c_{n}\varphi _{n}=\sum _{n}|\varphi _{n}> <\varphi _{n}|\varphi > \end{aligned}$$

where \(<.,.>\) denotes the inner product in \(L^{2}(\mathbb {R}^{3},\mathscr {B}( \mathbb {R}^{3}),dx)\), and the last notation on the right is written in popular Dirac’s notation, noting that \(||\varphi ||^{2}=1=\sum _{n}|c_{n}|^{2} \), and

$$\sum _{n}|\varphi _{n}> <\varphi _{n}|=I~ (\mathrm{identity~operator~on}~L^{2}(\mathbb {R}^{3},\mathscr {B}(\mathbb {R}^{3}),dx))$$

where \(|\varphi > <\psi |\) is the operator \(f\in L^{2}(\mathbb {R}^{3}, \mathscr {B}(\mathbb {R}^{3}),dx)\rightarrow \,<\psi ,f>\varphi \in L^{2}( \mathbb {R}^{3},\mathscr {B}(\mathbb {R}^{3}),dx)\).
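The expansion \(\varphi =\sum _{n}c_{n}\varphi _{n}\) and the resolution of identity \(\sum _{n}|\varphi _{n}> <\varphi _{n}|=I\) can be illustrated in a finite-dimensional stand-in for the separable Hilbert space (using \(\mathbb {C}^{4}\) instead of \(L^{2}\) is an assumption for illustration only):

```python
import numpy as np

# Finite-dimensional analogue (C^4 instead of the infinite-dimensional L^2
# space, an assumption for illustration) of the basis expansion and the
# resolution of identity  sum_n |phi_n><phi_n| = I.
rng = np.random.default_rng(0)
n = 4
# A random orthonormal basis of C^n via QR decomposition (Q is unitary)
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
basis = [Q[:, k] for k in range(n)]

# A normalized state vector phi
phi = rng.normal(size=n) + 1j * rng.normal(size=n)
phi = phi / np.linalg.norm(phi)

# Expansion coefficients c_n = <phi_n, phi> and reconstruction of phi
c = np.array([np.vdot(b, phi) for b in basis])
phi_rebuilt = sum(ck * b for ck, b in zip(c, basis))

# Resolution of identity: the outer products |phi_n><phi_n| sum to I
I_rebuilt = sum(np.outer(b, b.conj()) for b in basis)
print(np.sum(np.abs(c) ** 2))   # equals ||phi||^2 = 1
```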

From the solution \(\varphi (x)\) of the Schrodinger equation, the operator

$$\begin{aligned} \rho =\sum _{n}|c_{n}|^{2}|\varphi _{n}> <\varphi _{n}| \end{aligned}$$

is positive definite with unit trace (\(tr(\rho )=\sum _{n}<\varphi _{n}|\rho |\varphi _{n}>\,=\sum _{n}|c_{n}|^{2}=1\)).

Thus it plays the role of a classical probability density function. By separability of \(L^{2}(\mathbb {R}^{3},\mathscr {B}(\mathbb {R}^{3}),dx)\), we are simply in a natural extension of the finite-dimensional Euclidean setting, and as such the operator \(\rho \) is called a density matrix; it represents the “state” of a quantum system.

This “concrete setting” suggests a general setting (which generalizes Kolmogorov probability theory), namely a complex, infinite-dimensional, separable Hilbert space \(H=L^{2}(\mathbb {R}^{3},\mathscr {B}(\mathbb {R} ^{3}),dx)\), and a density matrix \(\rho \), which is a (linear) positive definite operator on H (i.e., \(<f,\rho f>\) \(\ge 0\) for any \(f\in H\), implying that it is self-adjoint) of unit trace. A quantum probability space is simply a pair \((H,\rho )\).
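The defining properties of a density matrix (positivity and unit trace) can be checked numerically in a finite-dimensional stand-in for H (the dimension 3 and the random state are assumptions for illustration):

```python
import numpy as np

# Sketch of a density matrix built from nonnegative weights |c_n|^2 summing
# to one, in a finite-dimensional stand-in for H (dimension 3 is an
# assumption): rho is Hermitian, has unit trace, and nonnegative eigenvalues.
rng = np.random.default_rng(1)
dim = 3
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))

c = rng.normal(size=dim) + 1j * rng.normal(size=dim)
c = c / np.linalg.norm(c)                 # enforce sum_n |c_n|^2 = 1

rho = sum(abs(c[k]) ** 2 * np.outer(Q[:, k], Q[:, k].conj()) for k in range(dim))

trace = np.trace(rho).real                # should be 1 (unit trace)
eigvals = np.linalg.eigvalsh(rho)         # should all be >= 0 (positivity)
print(trace, eigvals)
```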

Remark. At a given time t, it is the entire function \( x\rightarrow \psi (x,t)\) which describes the state of the quantum system, not just one point! The wave function \(\psi (x,t)\) has a probabilistic interpretation: the squared modulus of its amplitude, \(|\psi (x,t)|^{2}\), gives the probability density for the position, a physical quantity of the system.

Now, observe that for \(\varphi (p)\) arbitrary, where \(p=mv\) is the particle momentum, a solution of Schrodinger’s equation is

$$\begin{aligned} \psi (x,t)=\int _{\mathbb {R}^{3}}\varphi (p)e^{-\frac{i}{h}(Et-<p,x>)}dp/(2\pi h)^{\frac{3}{2}} \end{aligned}$$

(where \(E=\frac{||p||^{2}}{2m}\)), i.e., \(\psi \) is the Fourier transform of the function \(\varphi (p)e^{-\frac{i}{h}(Et)}\), and hence, by Parseval-Plancherel,

$$\begin{aligned} \int _{\mathbb {R}^{3}}|\psi (x,t)|^{2}dx=\int _{\mathbb {R}^{3}}|\varphi (p)|^{2}dp \end{aligned}$$

Thus, it suffices to choose \(\varphi (\text {.})\) such that \(\int _{\mathbb {R} ^{3}}|\varphi (p)|^{2}dp=1\) (to have all wave functions in \(L^{2}(\mathbb {R} ^{3},\mathscr {B}(\mathbb {R}^{3}),dx)\), as well as \(\int _{\mathbb {R} ^{3}}|\psi (x,t)|^{2}dx=1\)). In particular, this holds for stationary solutions of the Schrodinger equation, of the form \(\psi (x)e^{-iEt/h}\), which describe the same stationary state; here, note that \(||\psi ||=1\).
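The Parseval-Plancherel identity invoked above has a discrete analogue that can be checked with the FFT (the grid and the Gaussian packet are assumptions for illustration): the total probability computed from \(\psi \) equals that computed from its Fourier transform.

```python
import numpy as np

# Discrete sketch of the Parseval-Plancherel identity: the total probability
# computed from psi equals that computed from its Fourier transform phi.
# The grid and the Gaussian packet below are assumptions for illustration.
N = 1024
x = np.linspace(-20, 20, N, endpoint=False)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2) * np.exp(1j * 3 * x)      # Gaussian packet with momentum
psi = psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)  # normalize in L^2

phi = np.fft.fft(psi) * dx / np.sqrt(2 * np.pi)   # discrete Fourier transform
dp = 2 * np.pi / (N * dx)                         # momentum-grid spacing

norm_x = np.sum(np.abs(psi) ** 2) * dx            # integral of |psi|^2 dx
norm_p = np.sum(np.abs(phi) ** 2) * dp            # integral of |phi|^2 dp
print(norm_x, norm_p)
```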

Three things come up:

  1. (i)

    With the addition of waves and square integrability, the state space in quantum mechanics is a complex, infinite-dimensional Hilbert space,

  2. (ii)

    Unlike Newtonian mechanics, the dynamics of particles are random in nature (in the sense that, under the same “state” (initial conditions), results are different), so we cannot talk about “the trajectory” of a moving particle,

  3. (iii)

    We need to be able to find the probability distribution of the possible “trajectories”. A plausible suggestion is \(|\psi (x,t)|^{2}\) for the probability density of the position. But then, while the meaning of probability remains the usual one (e.g., the frequency interpretation), its calculus based on this formalism is different from Kolmogorov’s probability calculus; e.g., the additivity property breaks down (in, say, interference of waves).
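The breakdown of additivity in point (iii) can be seen in a two-line computation: amplitudes add, probabilities do not. A toy two-slit sketch with two equal-amplitude paths and an assumed relative phase:

```python
import numpy as np

# Two-slit toy sketch: amplitudes add, probabilities do not. The equal
# amplitudes and the relative phase theta below are assumptions.
theta = np.pi / 3                        # assumed relative phase
psi1 = 1 / np.sqrt(2)                    # amplitude through slit 1
psi2 = np.exp(1j * theta) / np.sqrt(2)   # amplitude through slit 2

p_quantum = abs(psi1 + psi2) ** 2        # detection probability: |psi1+psi2|^2
p_classical = abs(psi1) ** 2 + abs(psi2) ** 2      # additive prediction
interference = 2 * (np.conj(psi1) * psi2).real     # cross (interference) term
print(p_quantum, p_classical, interference)
```

For \(\theta =\pi /3\) the quantum value is 1.5 while the additive value is 1: the difference is exactly the interference term.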

Remark. Note that when dealing with uncertainty (ordinary or quantum), it is necessary to invoke its underlying logic for purposes of “reasoning” (inference is based on logic, not on mathematical theorems, unlike the way statistical hypothesis testing problems are carried out using p-values!). It turns out that quantum logic is non-Boolean, but seems to have a pleasant connection with the so-called Conditional Event Algebra. See the recent paper by Nguyen [9].

In summary, quantum mechanics concerns the motion of particles. Particles move like waves, with random behavior. The law of quantum mechanics is given by the Schrodinger equation, whose solution is the wave function describing the motion of a particle. States of quantum systems are determined by quantum probabilities. Quantum mechanics does not predict a single definite (observed) outcome; it predicts a number of different possible outcomes and tells us how likely each of these is (somewhat similar to coarse data in classical statistics). Interference occurs with particles because of wave/particle duality.

3 Measuring Physical Quantities

Physical quantities are numerical values associated with a quantum system, such as position, momentum, velocity, and functions of these, such as energy.

In classical mechanics, the result of measuring a physical quantity is just a number at each instant of time. In quantum mechanics, at a given time, repeated measurements under the same state of the system give different values of a physical quantity A: there should exist a probability distribution on its possible values, and we could use its expected (mean) value.

For some simple quantities, such as the position x and the momentum p, it is not hard to figure out their probability distributions (use the Fourier transform to find the probability distribution of p), from which we can compute expected values of functions of them, such as the potential energy V(x), or the kinetic energy (a function of p alone). But how about, say, the mechanical energy \(V(x)+\frac{p^{2}}{2m}\), which is a function of both position x and momentum p? Well, its expected value is not a problem, since \(E(V(x)+\frac{p^{2}}{2m})=EV(x)+E(\frac{p^{2}}{2m})\); but how do we get its distribution when we need it? And what if the quantity of interest is not a sum, so that knowledge of E(x) and E(p) is not sufficient to compute its expectation?

If you think in terms of classical probability, you would say this: we know the marginal distributions of the random variables x and p. To find the distribution of \(V(x)+\frac{p^{2}}{2m}\), we need the joint distribution of (x, p). How? Could copulas help? But are we in the context of classical probability!?

We need a general way to come up with the necessary probability distributions for all physical quantities, from the knowledge of the wave function \( \psi (x,t)\) in the Schrodinger equation. It is right here that we need mathematics for physics!

For a spatial quantity like the position X (of the particle), or V(X) (the potential energy), we know its probability distribution \(x\rightarrow |\psi (x,t)|^{2}\), so that its expected value is given by

$$\begin{aligned} EV(X)=\int _{\mathbb {R}^{3}}V(x)|\psi (x,t)|^{2}dx=\int _{\mathbb {R}^{3}}\psi ^{*}(x,t)V(x)\psi (x,t)dx \end{aligned}$$

If we group the term \(V(x)\psi (x,t)\), it looks as if we applied an “operator” V to the function \(\psi (.,t)\in L^{2}(\mathbb {R}^{3})\) to produce another function of \(L^{2}(\mathbb {R}^{3})\). That operator is precisely the multiplication \(A_{V}(\text {.}):L^{2}(\mathbb {R}^{3})\rightarrow L^{2}(\mathbb {R} ^{3}):\psi \rightarrow V\psi \). It is a bounded, linear map from a (complex) Hilbert space H to itself, which we call, for simplicity, an operator on H.

We also observe that EV(X) is real (!) since

$$\begin{aligned} EV(X)=\int _{\mathbb {R}^{3}}V(x)|\psi (x,t)|^{2}dx \end{aligned}$$

with \(V(\text {.})\) being real-valued. Now,

$$\begin{aligned} \int _{\mathbb {R}^{3}}\psi ^{*}(x,t)V(x)\psi (x,t)dx=\,<\psi ,A_{V}\psi > \end{aligned}$$

is the inner product on \(H=L^{2}(\mathbb {R}^{3})\). We see that, for any \(\psi ,\varphi \in H\), \(<\psi ,A_{V}\varphi >\,=\,<A_{V}\psi ,\varphi>\), since V is real-valued, meaning that the operator \(A_{V}(\text {.}):\psi \rightarrow V\psi \) is self adjoint.
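On a discretized grid, the equality \(EV(X)=\int V(x)|\psi (x,t)|^{2}dx=\,<\psi ,A_{V}\psi>\) can be verified directly (the harmonic potential and the Gaussian state below are assumptions for illustration):

```python
import numpy as np

# 1D grid sketch (the potential V and the state psi are assumptions):
# E V(X) computed from the density |psi|^2 coincides with the inner
# product <psi, A_V psi>, where A_V is multiplication by V.
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
psi = np.pi ** (-0.25) * np.exp(-x**2 / 2) * np.exp(1j * x)  # normalized state
V = 0.5 * x**2                                               # assumed potential

ev_density = np.sum(V * np.abs(psi) ** 2) * dx               # integral of V |psi|^2
ev_operator = (np.sum(psi.conj() * (V * psi)) * dx).real     # <psi, A_V psi>
print(ev_density, ev_operator)
```

Note that the phase factor \(e^{ix}\) in the state does not affect the result, since only \(|\psi |^{2}\) enters.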

For the position \(X=(X_{1},X_{2},X_{3})\), we compute the vector mean \( EX=(EX_{1},EX_{2},EX_{3})\); we can obtain, for example, \(EX_{1}\) directly via the observable of \(Q=X_{1}\), namely \(A_{X_{1}}:\psi \rightarrow x_{1}\psi \) (multiplication by \(x_{1}\)), i.e., \(A_{X_{1}}(\psi )(x,t)=x_{1}\psi (x,t)\).

Remark. The inner product in the (complex) Hilbert space \(H=L^{2}( \mathbb {R}^{3})\) (complex-valued functions on \(\mathbb {R}^{3}\), square integrable with respect to the Lebesgue measure dx on \(\mathscr {B}(\mathbb {R}^{3})\)) is defined as

$$\begin{aligned} <\psi ,\varphi >\,=\int _{\mathbb {R}^{3}}\psi ^{*}(x,t)\varphi (x,t)dx \end{aligned}$$

where \(\psi ^{*}(x,t)\) is the complex conjugate of \(\psi (x,t)\). The adjoint of the (bounded) operator \(A_{V}\) is the unique operator, denoted \(A_{V}^{*}\), such that \(<A_{V}^{*}(f),g>\,=\,<f,A_{V}(g)>\) for all \(f,g\in H\) (its existence is guaranteed by the Riesz theorem in functional analysis). It can be checked that \(A_{V}^{*}=A_{V^{*}}\), so that if \(V=V^{*}\) (i.e., V is real-valued), then \( A_{V}^{*}=A_{V}\), meaning that \(A_{V}\) is self-adjoint. Self-adjoint operators are also called Hermitian operators (complex symmetry), just as for complex matrices. Self-adjointness is important since the eigenvalues of such operators are real and, as we will see later, correspond to the possible values of the physical quantities under investigation, which are real-valued.

As another example, let’s proceed directly to find the probability distribution of the momentum \(p=mv\) of a particle at time t, in the state \(\psi (x,t)\), \(x\in \mathbb {R}^{3}\), and from it compute, for example, expected values of functions of the momentum, such as \(Q=\frac{||p||^{2}}{2m}\).

The Fourier transform of \(\psi (x,t)\) is

$$\begin{aligned} \varphi (p,t)=(2\pi h)^{-\frac{3}{2}}\int _{\mathbb {R}^{3}}\psi (x,t)e^{- \frac{i}{h}<p,x>}dx \end{aligned}$$

so that, by Parseval-Plancherel, \(|\varphi (p,t)|^{2}\) is the probability density for p, so that

$$\begin{aligned} E(\frac{||p||^{2}}{2m})=\int _{\mathbb {R}^{3}}\frac{||p||^{2}}{2m}|\varphi (p,t)|^{2}dp \end{aligned}$$

But we can obtain this expectation via an appropriate operator \(A_{p}\) as follows. Since

$$\begin{aligned} \psi (x,t)=(2\pi h)^{-\frac{3}{2}}\int _{\mathbb {R}^{3}}\varphi (p,t)e^{\frac{ i}{h}<p,x>}dp \end{aligned}$$

with \(x=(x_{1},x_{2},x_{3})\), we have

$$\begin{aligned} \frac{h}{i}\frac{\partial }{\partial x_{1}}\psi (x,t)=(2\pi h)^{-\frac{3}{2} }\int _{\mathbb {R}^{3}}p_{1}\varphi (p,t)e^{\frac{i}{h}<p,x>}dp \end{aligned}$$

i.e., \(\frac{h}{i}\frac{\partial }{\partial x_{1}}\psi (x,t)\) is the Fourier transform of \(p_{1}\varphi (p,t)\), and since \(\psi \) is the Fourier transform of \(\varphi \), Parseval-Plancherel implies

$$\begin{aligned} E(p_{1})=\int _{\mathbb {R}^{3}}\varphi ^{*}(p,t)p_{1}\varphi (p,t)dp=\int _{\mathbb {R}^{3}}\psi ^{*}(x,t)[\frac{h}{i}\frac{\partial }{ \partial x_{1}}\psi (x,t)]dx \end{aligned}$$

We see that the operator \(A_{p_{1}}=\frac{h}{i}\frac{\partial }{\partial x_{1}} (\text {.})\) on H extracts information from the wave function \(\psi \) to provide a direct way to compute the expected value of the component \(p_{1}\) of the momentum vector \(p=(p_{1},p_{2},p_{3})\) (note \(\ p=mv\), with \( v=(v_{1},v_{2},v_{3})\)) on one axis of \(\mathbb {R}^{3}\). For the vector p (three components), the operator is \(A_{p}=\frac{h}{i}\nabla \), where \(\nabla =( \frac{\partial }{\partial x_{1}},\frac{\partial }{\partial x_{2}},\frac{ \partial }{\partial x_{3}}).\)
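The two routes to \(E(p_{1})\) above, via the momentum density \(|\varphi (p,t)|^{2}\) and via the operator \(\frac{h}{i}\frac{\partial }{\partial x_{1}}\), can be compared numerically. A 1D sketch with h set to 1 and an assumed Gaussian state of mean momentum \(p_{0}\):

```python
import numpy as np

# 1D sketch (h = 1 and the Gaussian state with mean momentum p0 are
# assumptions): E(p) from the momentum density |phi(p)|^2 agrees with
# <psi, (h/i) d/dx psi> computed directly from the wave function.
h = 1.0
p0 = 2.0                                          # assumed mean momentum
N = 2048
x = np.linspace(-25, 25, N, endpoint=False)
dx = x[1] - x[0]
psi = np.pi ** (-0.25) * np.exp(-x**2 / 2) * np.exp(1j * p0 * x / h)
psi = psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)

# Route 1: momentum density from the (discrete) Fourier transform
phi = np.fft.fftshift(np.fft.fft(psi)) * dx
p = np.fft.fftshift(np.fft.fftfreq(N, d=dx)) * 2 * np.pi * h
dp = p[1] - p[0]
phi = phi / np.sqrt(np.sum(np.abs(phi) ** 2) * dp)   # normalize |phi|^2
Ep_density = np.sum(p * np.abs(phi) ** 2) * dp

# Route 2: the operator (h/i) d/dx applied to psi (finite differences)
dpsi = np.gradient(psi, dx)
Ep_operator = (np.sum(psi.conj() * (h / 1j) * dpsi) * dx).real
print(Ep_density, Ep_operator)
```

Both numbers come out close to \(p_{0}\); the small discrepancy in the second route is the finite-difference error.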

As for \(Q=\frac{||p||^{2}}{2m}\), we have

$$\begin{aligned} EQ=\int _{\mathbb {R}^{3}}\psi ^{*}(x,t)[(\frac{-h^{2}}{2m})\varDelta \psi (x,t)]dx \end{aligned}$$

where \(\varDelta \) is the Laplacian. The corresponding operator is \(A_{Q}=( \frac{-h^{2}}{2m})\varDelta \).

Examples such as the above suggest that, for each physical quantity of interest Q (associated with the state \(\psi \) of a particle), we could look for a self-adjoint operator \(A_{Q}\) on H such that

$$\begin{aligned} EQ=\,<\psi ,A_{Q}\psi > \end{aligned}$$

Such an operator extracts information from the state (wave function) \(\psi \) for computations on Q. The operator \(A_{Q}\) is referred to as the observable for Q.

Remark. If we just want to compute the expectation of the random variable Q, without knowing its probability distribution, we look for the operator \(A_{Q}\). On the surface, it looks as if we only need weaker information than the complete information provided by the probability distribution of Q. This is somewhat similar to a situation in statistics where getting the probability distribution of a random set S, say on \(\mathbb {R}^{3}\), is difficult, but weaker and more accessible information about S can be obtained, namely its coverage function \(\pi _{S}(x)=P(S\ni x)\), \(x\in \mathbb {R}^{3}\), from which the expected value of the measure \(\mu (S)\) can be computed as \(E\mu (S)=\int _{\mathbb {R}^{3}}\pi _{S}(x)d\mu (x)\), where \(\mu \) is the Lebesgue measure on \(\mathscr {B}( \mathbb {R}^{3})\). See, e.g., Nguyen [8].
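The coverage-function identity \(E\mu (S)=\int \pi _{S}(x)d\mu (x)\) can be illustrated by Monte Carlo for an assumed simple random set, the interval \(S=[0,U]\) with U uniform on (0, 1), where \(\pi _{S}(x)=P(U\ge x)=1-x\) and \(E\mu (S)=EU=1/2\):

```python
import numpy as np

# Monte-Carlo sketch of E mu(S) = integral of pi_S(x) dx for the (assumed)
# random interval S = [0, U], U uniform on (0,1): pi_S(x) = 1 - x on [0,1]
# and E mu(S) = E U = 1/2.
rng = np.random.default_rng(42)
U = rng.uniform(size=200_000)             # random right endpoints of S

E_measure = U.mean()                      # Monte-Carlo estimate of E mu(S)

x = np.linspace(0, 1, 1001)
dx = x[1] - x[0]
coverage = 1 - x                          # pi_S(x) = P(x in S)
integral = np.sum((coverage[:-1] + coverage[1:]) / 2) * dx  # trapezoid rule
print(E_measure, integral)
```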

But how do we find \(A_{Q}\) for general Q? Well, a “principle” used in quantum measurement is this: just as in classical mechanics, all physical quantities associated with a dynamical system are functions of the system state, i.e., of position and momentum (x, p), i.e., Q(x, p), such as \(Q(x,p)= \frac{||p||^{2}}{2m}+V(x)\). Thus, the observable corresponding to Q(x, p) should be \(Q(A_{x},A_{p}),\) where \(A_{x},A_{p}\) are the observables corresponding to x and p, which we already know from the above analysis. For example, if the observable of Q is \(A_{Q}\), then the observable of \( Q^{2}\) is \(A_{Q}^{2}\).

An interesting example: what is the observable \(A_{E}\) corresponding to the energy \(E=\frac{||p||^{2}}{2m}+V\)?

We have \(A_{V}=V\), i.e., multiplication by the function V: \((A_{V}f)(x)=V(x)f(x)\). Moreover,

$$\begin{aligned} A_{p}=\frac{h}{i}\nabla =\frac{h}{i}\left( \begin{array}{c} \frac{\partial }{\partial x_{1}} \\ \frac{\partial }{\partial x_{2}} \\ \frac{\partial }{\partial x_{3}} \end{array} \right) \end{aligned}$$

so that

$$\begin{aligned} A_{p}^{2}(f)= & {} (A_{p}\circ A_{p})(f)=A_{p}(A_{p}(f))=A_{p}\left[ \frac{h}{i} \begin{array}{c} \frac{\partial f}{\partial x_{1}} \\ \frac{\partial f}{\partial x_{2}} \\ \frac{\partial f}{\partial x_{3}} \end{array} \right] \\= & {} (\frac{h}{i})^{2}\left[ \begin{array}{c} \frac{\partial ^{2}f}{\partial x_{1}^{2}} \\ \frac{\partial ^{2}f}{\partial x_{2}^{2}} \\ \frac{\partial ^{2}f}{\partial x_{3}^{2}} \end{array} \right] =-h^{2}\left[ \begin{array}{c} \frac{\partial ^{2}f}{\partial x_{1}^{2}} \\ \frac{\partial ^{2}f}{\partial x_{2}^{2}} \\ \frac{\partial ^{2}f}{\partial x_{3}^{2}} \end{array} \right] \end{aligned}$$

Thus, the observable of \(\frac{||p||^{2}}{2m}\) is \(\frac{-h^{2}}{2m}\varDelta ,\) and that of \(E=\frac{||p||^{2}}{2m}+V\) is \(A_{E}=\frac{-h^{2}}{2m}\varDelta +V \), which is an operator on \(H=L^{2}(\mathbb {R}^{3})\).
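The composition rule above (the observable of \(Q^{2}\) is \(A_{Q}^{2}\)) can be checked for the momentum: applying \(\frac{h}{i}\frac{d}{dx}\) twice reproduces \(-h^{2}\frac{d^{2}}{dx^{2}}\). A 1D finite-difference sketch (the test function is an assumption):

```python
import numpy as np

# Sketch: applying the momentum operator (h/i) d/dx twice reproduces
# -h^2 d^2/dx^2, the observable of p^2 (1D, finite differences; the
# test function f is an assumption).
h = 1.0
x = np.linspace(-5, 5, 20001)
dx = x[1] - x[0]
f = np.exp(-x**2)                       # smooth, rapidly decaying test function

def A_p(g):
    # momentum observable: (h/i) dg/dx via central differences
    return (h / 1j) * np.gradient(g, dx)

lhs = A_p(A_p(f))                       # A_p composed with itself
rhs = -h**2 * np.gradient(np.gradient(f, dx), dx)   # -h^2 f''
print(np.max(np.abs(lhs - rhs)))
```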

For historical reasons, this observable of the energy (of the quantum system) is called the Hamiltonian of the system (in honor of W. R. Hamilton, 1805–1865) and is denoted

$$\begin{aligned} \mathscr {H}=\frac{-h^{2}}{2m}\varDelta +V \end{aligned}$$

Remark. Since

$$\begin{aligned} E(V)=\int _{\mathbb {R}^{3}}\psi ^{*}(x,t)V(x)\psi (x,t)dx=\int _{\mathbb {R} ^{3}}\psi ^{*}(x,t)(A_{V}\psi )(x,t)dx \end{aligned}$$

it follows that \(A_{V}=V\).

The Laplacian operator is

$$\begin{aligned} \varDelta f(x)=\frac{\partial ^{2}f}{\partial x_{1}^{2}}+\frac{\partial ^{2}f}{ \partial x_{2}^{2}}+\frac{\partial ^{2}f}{\partial x_{3}^{2}} \end{aligned}$$

where \(x=(x_{1},x_{2},x_{3})\in \mathbb {R}^{3}\).

Now, if we look back at Schrodinger’s equation

$$\begin{aligned} ih\frac{\partial }{\partial t}\psi (x,t)=-\frac{h^{2}}{2m}\varDelta \psi (x,t)+V(x)\psi (x,t) \end{aligned}$$

and look for (stationary) solutions of the form \(\psi (x,t)=\varphi (x)e^{-i\omega t} \), it becomes

$$\begin{aligned} (-i^{2})h\omega \varphi (x)e^{-i\omega t}=-\frac{h^{2}}{2m}\varDelta \varphi (x)e^{-i\omega t}+V(x)\varphi (x)e^{-i\omega t} \end{aligned}$$

or

$$\begin{aligned} h\omega \varphi (x)=-\frac{h^{2}}{2m}\varDelta \varphi (x)+V(x)\varphi (x) \end{aligned}$$

With \(E=h\omega \), this is

$$\begin{aligned} -\frac{h^{2}}{2m}\varDelta \varphi (x)+V(x)\varphi (x)=E\varphi (x) \end{aligned}$$

or simply, in terms of the Hamiltonian,

$$\begin{aligned} \mathscr {H}\varphi =E\varphi \end{aligned}$$

Restoring the factor \(e^{-i\omega t}\), Schrodinger’s equation for such a stationary state is written as

$$\begin{aligned} \mathscr {H}\psi =E\psi \end{aligned}$$

i.e., the state \(\psi \) (a stationary solution of Schrodinger’s equation) is precisely an eigenfunction of the Hamiltonian \(\mathscr {H}\) of the system, with corresponding eigenvalue E. In other words, a stationary wave function of a quantum system (as described by Schrodinger’s equation) is an eigenfunction of the observable of the system’s energy.

In fact, the Schrodinger equation is

$$\begin{aligned} ih\frac{\partial }{\partial t}\psi (x,t)=\mathscr {H}\psi (x,t) \end{aligned}$$

with \(\mathscr {H}\) as an operator on a complex Hilbert space H in a general formalism, where the wave function is an element of H: Schrodinger’s equation is an “equation” in this “operators on complex Hilbert spaces” formalism. This equation tells us clearly: it is precisely the observable of the energy that determines the time evolution of the states of a quantum system. On the other hand, being an element of a separable Hilbert space, a wave function \(\psi \) can be decomposed as a linear superposition of stationary states, corresponding to the fact that energy is quantized (i.e., it has discrete levels, corresponding to stationary states). Specifically, the states (wave functions in Schrodinger’s equation) of the form \(\psi (x,t)=\varphi (x)e^{-i\omega t}\) are stationary states since \(|\psi (x,t)|=|\varphi (x)|\), independent of t, so that the probability density \(|\varphi (x)|^{2}\) (of finding the particle in a neighborhood of x) does not depend on time, leaving everything in the system unchanged (no evolution in time). That is the meaning of stationarity of a dynamical system (the system does not move). To have motion, the wave function has to be a linear superposition of stationary states in interference (as waves). And this can be formulated “nicely” in Hilbert space theory! Indeed, let the \(\varphi _{n}\) be eigenfunctions of the Hamiltonian; then (elements of a separable Hilbert space have representations with respect to some orthonormal basis) \(\psi (x,t)=\sum _{n}c_{n}\varphi _{n}(x)e^{-iE_{n}t/h}\), where \(E_{n}=h\omega _{n}\) is the n-th energy level. Note that, as seen above, for stationary states \(\varphi _{n}(x)e^{-iE_{n}t/h}\), we have \(\mathscr {H}\varphi _{n}=E_{n}\varphi _{n}\), i.e., \(\varphi _{n}\) is an eigenfunction of \(\mathscr {H}\). Finally, note that, since energy is quantized in quantum physics, the search for the (discrete) energy levels \(E_{n}=h\omega _{n}\) fits this formalism well.
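
The eigenvalue problem \(\mathscr {H}\varphi =E\varphi \) above can be illustrated numerically. The following is a minimal sketch under our own assumptions (units with \(h=m=1\), a finite-difference grid, and \(V=0\) inside an infinite square well; none of this comes from the text): we discretize the one-dimensional Hamiltonian and check that its lowest eigenvalues match the known energy levels of the well.

```python
import numpy as np

# Discretize H = -(hbar^2/2m) d^2/dx^2 + V on [0, L] with Dirichlet
# boundary conditions (infinite square well, V = 0 inside) and solve
# the eigenvalue problem H phi = E phi with a dense symmetric solver.
hbar = 1.0
m = 1.0
n = 500                     # interior grid points (toy resolution)
L = 1.0                     # well width
dx = L / (n + 1)

# Second-difference approximation of the Laplacian (tridiagonal matrix).
lap = (np.diag(-2.0 * np.ones(n)) +
       np.diag(np.ones(n - 1), 1) +
       np.diag(np.ones(n - 1), -1)) / dx**2
H = -(hbar**2 / (2.0 * m)) * lap    # V = 0 inside the well

E, phi = np.linalg.eigh(H)          # eigenvalues E_n, eigenfunctions phi_n

# Exact well energies are E_n = (hbar*pi*n)^2 / (2 m L^2), n = 1, 2, ...
E_exact = (hbar * np.pi * np.arange(1, 4))**2 / (2.0 * m * L**2)
print(np.allclose(E[:3], E_exact, rtol=1e-3))
```

A bounded potential can be added to this sketch as `np.diag(...)` on the grid, without changing the eigen-solve.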

We can say that Hilbert spaces and linear operators on them form the language of quantum mechanics.

Thus, before continuing, let’s put down an abstract definition: An observable is a bounded, linear, and self adjoint operator on a Hilbert space.

We have seen that the multiplication operator \(M_{f}:g\in H=L^{2}(\mathbb {R}^{3})\rightarrow M_{f}(g)=fg\) is self adjoint when f is real-valued. In particular, for \(f=1_{B}\), \(B\in \mathscr {B}(\mathbb {R}^{3})\), \(M_{1_{B}}\) is an (orthogonal) projection on H, i.e., it satisfies \(M_{1_{B}}=(M_{1_{B}})^{2}\) (idempotent) \(=(M_{1_{B}})^{*}\), which is a special self adjoint operator. This motivates taking the space \(\mathscr {P}(H)\) of all projections on H as the set of “events”.

Each observable A is supposed to represent an underlying physical quantity. So, given a self adjoint operator A on H, what is the value that we are interested in, in a given state \(\psi \)? Well, it is \(<\psi ,A\psi>\) (e.g., \(\int _{\mathbb {R}^{3}}\psi ^{*}(x,t)A(\psi )(x,t)dx\)), which, by abuse of notation, is denoted as \(<A>_{\psi }\). Note that \(<\psi , A\psi >\,\in \mathbb {R}\) for any \(\psi \in H\), since A is self adjoint, which is “consistent” with the fact that physical quantities are real-valued.

Remark. If we view the observable A as a random variable, and the state \(\psi \) as a probability measure on its “sample space” H, in the classical setting of probability theory, then \(<A>_{\psi }\) plays the role of the expectation of A wrt the probability measure \(\psi \). But here is the fundamental difference with classical probability theory: as operators, the “quantum random variables” do not necessarily commute, so that we are facing a noncommutative probability theory. This is compatible with the “matrix” viewpoint of quantum mechanics, suggested by Heisenberg, namely that numerical measurements in quantum mechanics should be matrices, which form a noncommutative algebra.

4 Distributions of Observables

Let’s look back at the finite-dimensional case. This is in fact the origin of the so-called spectral theory (of operators).

For simplicity, and for concreteness, consider the euclidean space \(\mathbb {R}^{n}\). This is a vector space over the scalar field \(\mathbb {R}\). Moreover, it has a bilinear form which is called an inner product: \(<x,y>\,=\sum _{j=1}^{n}x_{j}y_{j}\), where the \(x_{j}^{\prime }s\) are the coordinates of \(x\in \mathbb {R}^{n}\) with respect to the orthonormal (canonical) basis of \(\mathbb {R}^{n}\). When we consider infinite-dimensional spaces with similar properties, we will call them Hilbert spaces. Thus euclidean spaces \(\mathbb {R}^{n}\) are finite-dimensional Hilbert spaces.

A (real) \(n\times n\) matrix \(A=[a_{jk}]\) is a linear transformation (we will call it an operator) on \(\mathbb {R}^{n}\) (i.e., \(A:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\)). If the matrix A is symmetric, i.e., \(A=A^{t}\) (the transpose of A, i.e., \(a_{jk}=a_{kj}\)), then by changing coordinate systems (the principal axes theorem in analytical geometry), we can represent A in a “nice” form, namely diagonal, where the diagonal entries are the roots of the characteristic equation \(\det (A-\lambda I)=0\), called the eigenvalues of A. If we let \(\sigma (A)\) be the set of all eigenvalues of A, called the spectrum of A, then A can be written as \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\), where \(P_{\lambda }\) is the (orthogonal) projection of \(\mathbb {R}^{n}\) onto the eigensubspace \(S(\lambda )=\{x\in \mathbb {R}^{n}:Ax=\lambda x\}\), i.e., the set of eigenvectors associated with the eigenvalue \(\lambda \). This is referred to as the spectral decomposition of the matrix (operator) A.
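
As a concrete illustration of this spectral decomposition (the matrix below is a toy example of our own choosing, not one from the text), one can diagonalize a symmetric matrix numerically and rebuild it from its eigenvalues and rank-one projections:

```python
import numpy as np

# Diagonalize a real symmetric matrix and rebuild it from its spectral
# decomposition A = sum_lambda lambda P_lambda, where P_lambda = u u^T
# projects onto the line spanned by the (orthonormal) eigenvector u.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

eigvals, U = np.linalg.eigh(A)      # columns of U are orthonormal eigenvectors

# Summing lambda * P_lambda over the spectrum recovers A.
A_rebuilt = sum(lam * np.outer(u, u) for lam, u in zip(eigvals, U.T))
print(np.allclose(A, A_rebuilt))

# Each P_lambda is an orthogonal projection: idempotent and symmetric.
P0 = np.outer(U[:, 0], U[:, 0])
print(np.allclose(P0 @ P0, P0), np.allclose(P0, P0.T))
```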

Remark. The term “spectrum” (or “spectral”) is used possibly in relation to the spectra of atoms in physics. Spectral theory was so named by D. Hilbert (1910). But of course, “Hilbert space” was not named by Hilbert!

When we need to consider matrices with complex entries, e.g., linear operators on \(\mathbb {C}^{n}\), symmetry is extended to the Hermitian (or self adjoint) property, i.e., \(A=A^{*}\) (the transpose of the complex conjugate matrix). Even in this case, the remarkable fact is that the eigenvalues of self adjoint operators are real: \(\sigma (A)\subseteq \mathbb {R}\). In particular, when \(\sigma (A)\subseteq \mathbb {R}^{+}\), A is said to be a positive operator, which is equivalent to \(<x,Ax>\,\ge 0\) for all \(x\in \mathbb {C}^{n}\).

Look at the spectral decomposition \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\) of the symmetric matrix A. Since \(\sigma (A)\subseteq \mathbb {R}\), we can define a map \(\xi _{A}:\mathscr {B}(\mathbb {R})\rightarrow \mathscr {P}(\mathbb {R}^{n})\) (the space of projections on \(\mathbb {R}^{n}\)) by \(\xi _{A}(B)=\sum _{\lambda \in B\cap \sigma (A)}P_{\lambda }\). Then \(\xi _{A}(\mathbb {R})=\sum _{\lambda \in \sigma (A)}P_{\lambda }=I\), and for any pairwise disjoint \(B_{j}\in \mathscr {B}(\mathbb {R})\), \(\xi _{A}(\cup _{j}B_{j})=\sum _{j}\xi _{A}(B_{j})\). The set function \(\xi _{A}(\text {.})\) looks like a probability measure, but \(\mathscr {P}(\mathbb {R}^{n}){-}\)valued, instead of \([0,1]{-}\)valued. Such a set function is called a spectral measure, and \(\xi _{A}(\text {.})\) is the (discrete) spectral measure of the matrix A. In fancy notation (but useful when considering the infinite-dimensional setting), we write \(A=\int _{\sigma (A)}\lambda d\xi _{A}(\lambda )\).
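
The discrete spectral measure \(\xi _{A}\) can be sketched in a few lines (the toy matrix and the helper name `xi` below are our own assumptions, not notation from the text):

```python
import numpy as np

# Toy symmetric matrix with a repeated eigenvalue, so that one spectral
# projection has rank 2.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])
eigvals, U = np.linalg.eigh(A)

def xi(B):
    """Spectral measure xi_A(B): sum of the projections P_lambda with
    lambda in B, where B is given as a predicate (indicator) on the reals."""
    P = np.zeros_like(A)
    for ev, u in zip(eigvals, U.T):
        if B(ev):
            P += np.outer(u, u)
    return P

# xi_A(R) = I: the resolution of the identity.
print(np.allclose(xi(lambda t: True), np.eye(3)))
# Additivity over disjoint sets: xi_A(B1 u B2) = xi_A(B1) + xi_A(B2).
P1, P2 = xi(lambda t: t < 1.5), xi(lambda t: t >= 1.5)
print(np.allclose(P1 + P2, np.eye(3)))
# A = "integral" of lambda d(xi_A), here a finite sum over the spectrum.
A_sum = sum(lam * xi(lambda t, lam=lam: np.isclose(t, lam))
            for lam in np.unique(np.round(eigvals, 8)))
print(np.allclose(A_sum, A))
```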

We see that the study of a (random) physical quantity Q on a quantum system, in a state \(\psi \), is via its observable \(A_{Q}\). Observables (in the quantum context) play the role of random variables in Kolmogorov’s probability theory.

For simplicity, let’s elaborate on this in the finite-dimensional case where, for “concreteness”, observables are taken to be \(n\times n\) matrices with complex entries. These are linear, Hermitian operators on \(\mathbb {C}^{n}\).

Recall that a matrix \(A=[a_{jk}]\), as an operator on \(\mathbb {R}^{n}\), gives rise to a quadratic form \(<Ax,x>\,=\sum a_{jk}x_{j}x_{k}\). If A is symmetric, i.e., \(a_{jk}=a_{kj}\), then, using an orthogonal transformation (leaving the Euclidean metric on \(\mathbb {R}^{n}\) invariant), as in analytic geometry, it can be rewritten in the normal form \(<Ax,x>\,=\sum \lambda _{j}x_{j}^{2}\). Sylvester, in 1852, showed that the \(\lambda _{j}^{\prime }s\) are the roots of the characteristic polynomial \(\det (\lambda I-A)\), i.e., the eigenvalues of the matrix (operator) A. This reduction of the form corresponds to a diagonalization of the matrix A: for some orthogonal matrix B, the matrix \(D=B^{-1}AB\) is in diagonal form. The diagonal entries of D are the eigenvalues of A. The set of eigenvalues of A is called the spectrum of A, and is denoted as \(\sigma (A)\). Thus, there exists an orthonormal basis \(\{e_{1},e_{2},...,e_{n}\}\) of \(\mathbb {R}^{n}\) with respect to which A is diagonal, with diagonal entries the eigenvalues of A. In other words, \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\), where \(P_{\lambda }\) is the (orthogonal) projection onto the eigensubspace \(S_{\lambda }=\{x:Ax=\lambda x\}\). This is referred to as the spectral decomposition of the symmetric matrix A.

Remark. In quantum mechanics, certain physical quantities cannot be measured simultaneously. This fact is interpreted as their observables (e.g., Hermitian matrices) not commuting (since the algebra of matrices is noncommutative). The set of possible values of a quantity Q is the spectrum of \(A_{Q}\). Thus, the spectrum of the Hamiltonian of an atom (the energy levels, recalling that energy is quantized) is precisely the spectrum of the atom.

With this spectral decomposition of an observable (i.e., a self adjoint operator) in the finite case, let’s point out right away that observables play the role of random variables in Kolmogorov’s setting. First, observe that a projection operator is an “event” in the quantum setting: for example, \(P_{\lambda }\) is the event that the underlying physical quantity, represented by A, takes the value \(\lambda \).

In Kolmogorov’s setting, given a measurable space \((\varOmega ,\mathscr {A})\), an event is an element of the \(\sigma {-}\)algebra \(\mathscr {A}\) of subsets of \(\varOmega \). We identify \(B\in \mathscr {A}\) with its indicator function \(1_{B}:\varOmega \rightarrow \{0,1\}\) which, in turn, is identified with the multiplication operator on the Hilbert space \(L^{2}(\varOmega ,\mathscr {A},P)\); \(f\rightarrow 1_{B}f:(1_{B}f)(\omega )=1_{B}(\omega )f(\omega )\) (so that if B happens, i.e., \(\omega \in B\), then \((1_{B}f)(\omega )=f(\omega )\); otherwise, it is 0). This operator on \(L^{2}(\varOmega ,\mathscr {A},P)\) is an orthogonal projection, and hence self adjoint. In other words, in the quantum setting, projections correspond to events. Note also that two quantum events (projections) p, q are compatible when pq is also an event (a projection): in this case, \(pq=(pq)^{*}=q^{*}p^{*}=qp\), i.e., p and q commute. The counterpart of \(\mathscr {A}\) is the set \(\mathscr {P}(\mathbb {R}^{n})\) of all projections on \(\mathbb {R}^{n}\).

Now the spectral decomposition \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\) is similar to a “simple random variable” in classical probability. A simple random variable X is of the form \(X(\omega )=\sum x_{j}1_{B_{j}}(\omega )\), where \(B_{j}=\{\omega :X(\omega )=x_{j}\}\), so that when the event \(B_{j}\) occurs, \(X=x_{j}\). The probability density of X is \(P(X=x_{j})=P(B_{j})=P[X^{-1}(\{x_{j}\})]\).

What is the probability density of the observable A? I.e., what is \(P(P_{\lambda })\) in the quantum formalism? The counterpart of the probability measure P on \((\varOmega ,\mathscr {A})\) is the state \(\psi \) of the Schrodinger equation. Let \(\rho \) be a positive operator on \(\mathbb {R}^{n}\) with unit trace (i.e., \(tr(\rho )=\sum _{j=1}^{n}<e_{j},\rho e_{j}>\,=1\)). Note that a positive operator is necessarily self adjoint. The triple \((\mathbb {R}^{n},\mathscr {P}(\mathbb {R}^{n}),\rho )\) is called a (finite-dimensional) quantum probability space, and the “state” \(\rho \) is called a “density matrix”.

For \(B\in \mathscr {P}(\mathbb {R}^{n})\), we have \(tr(\rho B)=\sum _{j}<u_{j},\rho u_{j}>\), where the \(u_{j}^{\prime }s\) form an orthonormal basis for the range of the projection B, so that \(tr(\rho B)\in [0,1]\). Moreover, for \(B_{1},...,B_{k}\) pairwise orthogonal (\(B_{j}B_{m}=0\) for \(j\ne m\)), \(B_{1}+...+B_{k}\) is the event that at least one of the \(B_{j}^{\prime }s\) occurs, and \(tr(\rho (B_{1}+...+B_{k}))=\sum _{j=1}^{k}tr(\rho B_{j})\). Thus the map \(tr(\rho \,(\text {.})):\mathscr {P}(\mathbb {R}^{n})\rightarrow [0,1]\) acts like a probability distribution, with \(tr(\rho B)\) being the probability of the event B under the state \(\rho \).
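
A small numerical sketch of the map \(B\rightarrow tr(\rho B)\) (the \(2\times 2\) density matrix and observable below are our own toy choices):

```python
import numpy as np

# rho is a density matrix (positive, unit trace); the P_lambda are the
# spectral projections of an observable A; tr(rho P_lambda) then forms a
# probability mass function on sigma(A).
rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])                 # positive, tr(rho) = 1
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])                   # observable with sigma(A) = {-1, +1}

eigvals, U = np.linalg.eigh(A)
probs = [np.trace(rho @ np.outer(u, u)) for u in U.T]

print(np.isclose(sum(probs), 1.0))           # the tr(rho P_lambda) sum to 1
# The expectation tr(rho A) agrees with sum_lambda lambda * Pr(lambda):
print(np.isclose(np.trace(rho @ A),
                 sum(l * p for l, p in zip(eigvals, probs))))
```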

For \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\), \(\Pr (A\) takes the value \(\lambda )=tr(\rho P_{\lambda })\). Thus, the observable A on \(\mathbb {R}^{n}\) is a discrete (finite) random variable with a probability mass function.

In summary, let H be a (finite-dimensional) complex Hilbert space, representing the states of a quantum system. Let \(\mathscr {P}(H)\) denote the set of all projections on H (playing the role of events), and let \(\rho \) be a positive operator on H with unit trace. The triple \((H,\mathscr {P}(H),\rho )\) is a quantum probability space.

In such a quantum probability space, under the state \(\rho \), an observable A, with spectral decomposition \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\), has a probability distribution given by \(\Pr (P_{\lambda })=tr(\rho P_{\lambda })\). The converse to this construction from a given \(\rho \) is Gleason’s theorem, which says that (when \(\dim H\ge 3\)) any probability distribution \(\mu :\mathscr {P}(H)\rightarrow [0,1]\) is of this form, i.e., has a density \(\rho \).

Remark. There are several different definitions of quantum probability space in the literature, depending on the level of generality, e.g., in terms of \(C^{*}{-}\)algebras. Here we consider a low level in terms of Hilbert spaces.

Let \(A_{Q}\) be the observable of the quantity Q. What are the possible values of Q? In fact, what are the values of Q that we can actually measure? And what is the probability distribution of \(A_{Q}\)?

To answer this, observe that if the state \(\psi \in H\) is an eigenfunction of \(A_{Q}\), i.e., there is some scalar a (corresponding eigenvalue), here real since \(A_{Q}\) is self adjoint, such that \(A_{Q}(\psi )=a\psi \), then

$$\begin{aligned} EQ= & {} \int _{\mathbb {R}^{3}}\psi ^{*}(x,t)(A_{Q}\psi )(x,t)dx=\int _{ \mathbb {R}^{3}}\psi ^{*}(x,t)a\psi (x,t)dx \\= & {} a\int _{\mathbb {R}^{3}}|\psi (x,t)|^{2}dx=a \end{aligned}$$
$$\begin{aligned} E(Q^{2})= & {} \int _{\mathbb {R}^{3}}\psi ^{*}(x,t)((A_{Q})^{2}\psi )(x,t)dx=\int _{\mathbb {R}^{3}}\psi ^{*}(x,t)(A_{Q})(a\psi (x,t))dx \\= & {} a^{2}\int _{\mathbb {R}^{3}}|\psi (x,t)|^{2}dx=a^{2} \end{aligned}$$

so that \(Var(Q)=EQ^{2}-(EQ)^{2}=0\), i.e., for sure, Q will take the value a (no uncertainty involved). Thus, the possible measurements of a quantity Q are precisely the eigenvalues of its observable, i.e., the spectrum of the observable representing it, denoted \(\sigma (A_{Q})\). Note that, since every \(A_{Q}\) is self adjoint, \(\sigma (A_{Q})\subseteq \mathbb {R}\), consistent with the fact that measured values of physical quantities have to be real (not complex numbers!).
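
This zero-variance computation can be replayed on a finite-dimensional toy observable (our own example, with the integrals replaced by inner products on \(\mathbb {R}^{2}\)):

```python
import numpy as np

# If psi is a (normalized) eigenvector of A with eigenvalue a, then
# E(Q) = <psi, A psi> = a and E(Q^2) = <psi, A^2 psi> = a^2, so Var(Q) = 0.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # self adjoint toy observable
eigvals, U = np.linalg.eigh(A)
psi = U[:, 1]                              # normalized eigenstate
a = eigvals[1]                             # its eigenvalue

EQ = psi @ (A @ psi)                       # <psi, A psi>
EQ2 = psi @ (A @ (A @ psi))                # <psi, A^2 psi>
var = EQ2 - EQ**2

print(np.isclose(EQ, a))                   # E(Q) = a
print(np.isclose(var, 0.0))                # no uncertainty in an eigenstate
```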

Here, again, is a theory extending what we know from matrix theory.

With the interest in transforming symmetric quadratic forms (\(<Ax,x>\,=\sum _{j,k}\alpha _{jk}x_{j}x_{k}\), with the matrix A symmetric) into normal form (\(\sum _{j}\beta _{j}x_{j}^{2}\)) via an orthogonal transformation \(T:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) (\(||Tx||=||x||\) for any \(x\in \mathbb {R}^{n}\), norm invariance), going back to the times of Descartes (1637), it was known that any symmetric matrix A is orthogonally equivalent to a diagonal matrix D, i.e., \(D=B^{-1}AB\) for some orthogonal matrix B (\(||Bx||=||x||\)). Note that B is orthogonal iff its columns form an orthonormal basis for \(\mathbb {R}^{n}\). The diagonal entries of D are the eigenvalues of A, i.e., the roots of the polynomial equation \(\det (A-\lambda I)=0\). The set \(\sigma (A)\) of eigenvalues of a matrix A is called the spectrum of the operator (matrix) A. If x is a nonzero vector such that \(Ax=\lambda x\), then x is called an eigenvector (with associated eigenvalue \(\lambda \)). Thus, a symmetric matrix A can be written as \(A=\sum _{j}\lambda _{j}u_{j}u_{j}^{t}\), where the \(u_{j}^{\prime }s\) are orthonormal eigenvectors of A.

As we will see, when we generalize from a matrix A on \(\mathbb {R}^{n}\) to a “nice” bounded operator (e.g., compact) on a “nice” Hilbert space H (separable), we will have countably many eigenvalues and eigenvectors, the latter forming an orthonormal basis for H, so that each \(h\in H\) can be written as an infinite series, and hence any operator on H can be represented as an “infinite matrix”.

Let’s start with matrices to bring out the things we wish to generalize. For \(n\times n\) matrices with complex entries (i.e., operators on \(\mathbb {C}^{n}\), a finite-dimensional Hilbert space), it is known from matrix algebra that a self adjoint matrix A has real eigenvalues \(\lambda _{j},j=1,2,...,n\) (i.e., the \(A-\lambda _{j}I\) are not invertible). The set \(\sigma (A)\) of eigenvalues is called the spectrum of A. The eigenspaces associated with the eigenvalues \(\lambda _{j}\) (i.e., \(S(\lambda _{j})=\{x\in \mathbb {C}^{n}:Ax=\lambda _{j}x\}\)) are orthogonal (for \(\lambda _{j}\ne \lambda _{k}\), \(S(\lambda _{j})\perp S(\lambda _{k})\)). Moreover, \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\), where \(P_{\lambda }\) is the projection onto \(S(\lambda )\). This is referred to as the spectral decomposition of A.

Remark. If A is an operator on a Hilbert space H, then its spectrum \(\sigma (A)\subseteq \mathbb {C}\) is, by definition, the complement of its resolvent set \(\{\lambda \in \mathbb {C}:(A-\lambda I)^{-1}\) exists\(\}\). In general, the spectrum could be uncountable. The spectral decomposition of a self adjoint operator will be defined as an “integral wrt a spectral measure on \(\mathscr {B}(\mathbb {C})\)”. In quantum mechanics, the quantities which are measured are matrices (more generally, operators) rather than real numbers. Also, “observables” may be functions of other observables, such as f(A) where \(f:\mathbb {R}\rightarrow \mathbb {R}\). As such, we need to make sense of f(A) as an operator: this is the problem of functional calculus. If \(Au=\lambda u\), then we could set \(f(A)u=f(\lambda )u\), so that clearly there is a connection between spectral theory and functional calculus. Both are related to quantum mechanics. When \(A=\sum _{j}\lambda _{j}P_{j}\), we set \(f(A)=\sum _{j}f(\lambda _{j})P_{j}\). For a Hilbert space H, an observable (i.e., a self adjoint operator A on H) has its spectral measure \(\xi _{A}\) on \(\mathscr {B}(\mathbb {C})\), such that \(A=\int _{\sigma (A)}\lambda d\xi _{A}(\lambda )\), and we set \(f(A)=\int _{\sigma (A)}f(\lambda )d\xi _{A}(\lambda )\).
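
The rule \(f(A)=\sum _{j}f(\lambda _{j})P_{j}\) is easy to check numerically (the \(2\times 2\) matrix below is a toy example of our own; for this particular A the matrix exponential has a closed form in cosh and sinh, which we use as the reference):

```python
import numpy as np

# Functional calculus for a symmetric matrix: f(A) = sum_j f(lambda_j) P_j,
# with P_j = u_j u_j^T the rank-one spectral projections.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
eigvals, U = np.linalg.eigh(A)

def f_of_A(f):
    """Apply f spectrally: f(A) = sum_j f(lambda_j) u_j u_j^T."""
    return sum(f(lam) * np.outer(u, u) for lam, u in zip(eigvals, U.T))

# f(t) = t^2 reproduces the matrix product A A.
print(np.allclose(f_of_A(lambda t: t**2), A @ A))
# f = exp reproduces the matrix exponential; for this A it equals
# [[cosh 1, sinh 1], [sinh 1, cosh 1]].
expA = np.array([[np.cosh(1.0), np.sinh(1.0)],
                 [np.sinh(1.0), np.cosh(1.0)]])
print(np.allclose(f_of_A(np.exp), expA))
```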

The spectral decomposition of a self adjoint operator A in the finite-dimensional case is obtained from a “resolution of identity” \(\{P_{\lambda };\lambda \in \sigma (A)\}\). The map \(\lambda \in \sigma (A)\subseteq \mathbb {C}\rightarrow P_{\lambda }\in \mathscr {P}(H)\), the space of projections on H, acts like a finite probability density whose values are projections! Note that, like [0, 1], \(\mathscr {P}(H)\) is not a Boolean lattice. For a “random variable X” taking values in \(\sigma (A)\), formally, \(\Pr (X=\lambda )=P_{\lambda }\). When H is infinite-dimensional, this “density” should be replaced by a measure on \(\mathscr {B}(\mathbb {C})\). Thus, a spectral measure is defined as a map \(\xi (\text {.}):\mathscr {B}(\mathbb {C})\rightarrow \mathscr {P}(H)\) having the analogous properties of a numerical measure, namely \(\xi (\mathbb {C})=I\), and \(\xi (\cup _{n}B_{n})=\sum _{n}\xi (B_{n})\) for any sequence of pairwise disjoint \(B_{n}\in \mathscr {B}(\mathbb {C})\), where the infinite sum is taken in the sense of convergence wrt the norm. This is a projection-valued probability measure.

The upshot is that any (bounded) self adjoint operator A on a Hilbert space H admits a unique spectral measure \(\xi _{A}\) such that A has the spectral decomposition (in the infinite-dimensional case) \(A=\int _{\sigma (A)}\lambda d\xi _{A}(\lambda )\) (von Neumann’s spectral theorem), which is the extension of \(A=\sum _{\lambda \in \sigma (A)}\lambda P_{\lambda }\) in the finite case. The spectral integral is defined as a Lebesgue-Stieltjes integral, here of the function \(f(\lambda )=\lambda \), wrt \(\xi _{A}\).

Now, \(\mathscr {P}(H)\) is the set of events, i.e., of special \(\{0,1\}{-}\)valued random variables, while general random variables (observables) are represented by (bounded) self adjoint operators on H. The spectral measure of a self adjoint operator thus plays the role of the probability law of a random variable in the quantum context; its existence and uniqueness are guaranteed by von Neumann’s spectral theorem.

Remark. Why is the spectral integral representing A taken over its spectrum \(\sigma (A)\)? First, note that \(\sigma (A)\) need not be discrete, as it is \(\{\lambda \in \mathbb {C}:A-\lambda I\) is not invertible\(\}\).

The “support” of the spectral measure \(\xi _{A}\) is \(\varLambda (\xi _{A})=\mathbb {C}\backslash \cup _{k}B_{k}\), where the union is over all open sets \(B_{k}\) in \(\mathbb {C}\) for which \(\xi _{A}(B_{k})=0\). The measure \(\xi _{A}\) is said to be compact if, by definition, its support \(\varLambda (\xi _{A})\) is compact in \(\mathbb {C}\). It turns out that for compact spectral measures, \(\sigma (A)=\varLambda (\xi _{A})\). That answers our question.

We close this technical discussion with the concept of the distribution of observables.

Let A be an observable, i.e., a self adjoint operator on a Hilbert space H. Let \(\rho \) be a density matrix, i.e., a positive operator on H with unit trace. Let \(\xi \) be the spectral measure of A. Let \(\mu :\mathscr {B}(\mathbb {R})\rightarrow [0,1]\) be given by \(\mu (B)=tr(\rho \xi (B))\). Then \(\mu (\text {.})\) is a probability measure, called the “law” or probability distribution of the observable A under the state \(\rho \).

The interpretation is this: the distribution of A in the quantum framework is the same as the usual probability distribution of a random variable on \((\mathbb {R},\mathscr {B}(\mathbb {R}),\mu )\).

Kolmogorov’s theory of probability is a special case of quantum probability: a commutative theory within an arbitrary (commutative or not) theory. Each random variable \(X:(\varOmega ,\mathscr {A},P)\rightarrow \mathbb {R}\) is identified with multiplication by it, acting on \(L^{2}(\varOmega ,\mathscr {A},P)\), i.e., a special self adjoint operator, and multiplication operators commute; whereas in quantum uncertainty analysis, observables are arbitrary self adjoint operators which might not commute. Among other things, the noncommutativity of observables (meaning that they cannot be observed simultaneously) is characteristic of quantum modeling in applications, such as finance.
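
The contrast between the two calculi can be seen already in dimension 2 (toy matrices of our own choosing): multiplication operators, here diagonal matrices, always commute, while generic self adjoint matrices need not:

```python
import numpy as np

# Classical random variables act as multiplication (diagonal) operators
# and always commute.
X = np.diag([1.0, 2.0])
Y = np.diag([3.0, 5.0])
print(np.allclose(X @ Y, Y @ X))       # the classical picture is commutative

# Two self adjoint (Pauli-type) quantum observables that do not commute.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([[1.0, 0.0],
              [0.0, -1.0]])
print(np.allclose(A @ B, B @ A))       # False: AB - BA is nonzero
```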

5 Quantum Modeling and Probability Calculus for Econometrics

Like the attempts of econophysicists to use statistical physics to model and analyze financial time series, an obvious rationale for using the quantum mechanical formalism is in the force driving their fluctuations. Specifically, the Hamiltonian of a quantum dynamical system (the observable of the total energy) controls the time evolution of the system. The Black-Scholes equation in option pricing can be converted into a quantum system with a given Hamiltonian (see, e.g., [1]). The idea is to model, say, financial time series as quantum dynamical systems to gain new insights into the behavior of financial markets, for prediction among other purposes.

The heart of the statistical analysis of time series data is models. Usually, having in mind just one theory of probability, namely Kolmogorov’s (in fact, one calculus of probabilities), all models are based on it. In particular, joint or conditional distributions and correlations among variables are derived, including through the use of copulas. It is perhaps time to ask “Are we using the right calculus of probabilities in financial data analysis?”. Note that Kolmogorov’s probability theory has no problem at all in games of chance! “All models are wrong, but some are useful” (G. Box) has a neat interpretation in quantum mechanics: the Schrodinger equation is just our best guess of how nature behaves (as verified by experiments). But how to find the “useful models”? Of course, that is the main task of statisticians, using all their statistical tools, such as model fitting on data, cross validation methods, etc.

Now, with the knowledge of quantum mechanics, which not only provides us with a sense of dynamics (what causes financial data to fluctuate?) but also a way to conduct uncertainty analysis based on a new calculus of probabilities (nonadditive and noncommutative), we could reexamine, where appropriate, the ways econometrics has been done so far. For example, in analyzing the factors which affect the fluctuations of a financial time series, we could look for a Hamiltonian driving these fluctuations and then examine whether we are in a quantum context. When that seems to be the case, we have found a “useful” model: a quantum model for a financial data set. The familiar follow-up tasks then involve the use of the quantum probability calculus.