1 Description of the Problem

Need to Represent a Random Gain by a Single Number. In economics, the outcomes of a decision are usually known only with uncertainty. Based on previous experience, for each possible decision, we can estimate the probabilities of different gains m. Thus, each possible decision can be characterised by a probability distribution on the set of possible gains m. This probability distribution can be described by a probability density function \(\rho (m)\), or by the cumulative distribution function \(F(n){\mathop {=}\limits ^\textrm{def}}\textrm{Prob}(m\le n).\)

To select the best decision, we need to be able to compare every two possible decisions – and for this purpose, we need to represent each possible decision by a single number.

Problem: What Should this Number be? How can we select this number?

According to decision theory, the decisions of a rational person are equivalent to selecting the alternative that leads to the largest possible mean value of this person’s utility (see, e.g., [3, 4, 7, 8, 9, 10, 11]), and, to a first approximation, utility is proportional to the gain. According to this logic, we should select a decision that leads to the largest possible value of the mean gain.

What We Do in this Paper. In this paper, we show that it is more appropriate to select the decision with the largest possible value of an appropriate quantile. This provides an additional explanation for the fact that in econometrics, an appropriate quantile – known as the Value at Risk (VaR) (see, e.g., [2]) – is an accepted measure of the investment’s volatility (for other explanations, see, e.g., [1]).

2 Analysis of the Problem and Its Resulting Formulation in Precise Terms

Suppose that we represent a decision by a number n. Since the outcomes are random, the actual gain m will be, in general, different from n. How will this difference affect the decision maker?

Case When We Gained More than Expected. Let us first consider the case when the actual gain m is larger than n. In this case, we can use the unexpected surplus \(m-n\). For example, a person can take a trip, a company can buy some new equipment, etc. However, the value of this additional amount to the user is somewhat decreased by the fact that this amount was unexpected. For example, if a user plans a trip well in advance, it is much cheaper than buying it at the last minute. If a company plans to buy equipment some time ahead, it can negotiate a better price. In all these cases, in comparison to the user’s value of each dollar of the expected amount n, each dollar of the unexpected additional amount \(m-n\) has a somewhat lower value: its value is reduced by some amount \(\alpha _+>0\).

The loss of value for each dollar above the expected value is \(\alpha _+\). Thus, the overall loss corresponding to the whole unexpected amount \(m-n\) is equal to \(\alpha _+\cdot (m-n)\). So, to get the overall user’s value v(m) of the gain m, we need to subtract this loss from m:

$$\begin{aligned} v(m)=m-\alpha _+\cdot (m-n). \end{aligned}$$
(1)

Case When We Gained Less than Expected. What if the actual gain m is smaller than n? In this case, not only do we lose the difference \(n-m\) in comparison to what we expected, but we also lose some more. For example, since we expected the gain n, we may have already made some purchases for which we planned to pay from this amount. Since we did not get as much money as we expected, we need to borrow the missing amount, and since borrowing money comes with interest, we lose some additional amount, e.g., on this interest. In other words, for each dollar that we did not receive, we lose some additional amount; let us denote this additional per-dollar loss by \(\alpha _->0\).

The overall additional loss to the user caused by the difference \(n-m\) can be obtained by multiplying this difference by the per-dollar loss \(\alpha _-\), so this loss is equal to \(\alpha _-\cdot (n-m)\). The overall user’s value v(m) corresponding to the gain m can thus be obtained by subtracting this loss from the monetary amount m:

$$\begin{aligned} v(m)=m-\alpha _-\cdot (n-m). \end{aligned}$$
(2)

Resulting Optimization Problem. As we have mentioned, a rational agent should maximize the expected value of the user’s value v(m), i.e., a rational agent should select the value n that maximizes the expression

$$V{\mathop {=}\limits ^\textrm{def}}\int _{-\infty }^{\infty } \rho (m)\cdot v(m)\,dm$$
$$\begin{aligned} =\,\int _{-\infty }^n \rho (m)\cdot (m-\alpha _-\cdot (n-m))\,dm+\int _n^{\infty } \rho (m)\cdot (m-\alpha _+\cdot (m-n))\,dm. \end{aligned}$$
(3)
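To make the objective (3) concrete, here is a minimal Python sketch that evaluates the piecewise value from formulas (1) and (2) and approximates V by a sample average. The function names `value` and `expected_value`, and the idea of replacing the integral by an average over sampled gains, are our illustrative assumptions and are not part of the original formulation.

```python
import numpy as np

def value(m, n, alpha_plus, alpha_minus):
    """Piecewise value v(m) from formulas (1) and (2), for a representative number n."""
    m = np.asarray(m, dtype=float)
    surplus = np.maximum(m - n, 0.0)    # m - n when m > n, otherwise 0
    shortfall = np.maximum(n - m, 0.0)  # n - m when m < n, otherwise 0
    return m - alpha_plus * surplus - alpha_minus * shortfall

def expected_value(samples, n, alpha_plus, alpha_minus):
    """Sample-average approximation of the integral V in formula (3)."""
    return float(np.mean(value(samples, n, alpha_plus, alpha_minus)))
```

For example, with the hypothetical coefficients \(\alpha _+=0.1\) and \(\alpha _-=0.4\) and with \(n=10\), a gain of 12 is valued at \(12-0.1\cdot 2=11.8\), while a gain of 7 is valued at \(7-0.4\cdot 3=5.8\).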

3 Solving the Resulting Optimization Problem

Solving the Problem. Differentiating the expression (3) with respect to the unknown value n and equating the resulting derivative to 0, we conclude that

$$\begin{aligned} -\alpha _-\cdot \int _{-\infty }^n \rho (m)\,dm+\alpha _+\cdot \int _n^{\infty } \rho (m)\,dm=0. \end{aligned}$$
(4)
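For readers who want the intermediate step, here is a sketch of the differentiation, under the assumption that the density \(\rho \) is continuous at n, so that the Leibniz rule for an integral with a variable limit applies. Each of the two integrals in (3) contributes a boundary term and an integral term:

$$\begin{aligned} \frac{d}{dn}\int _{-\infty }^n \rho (m)\cdot (m-\alpha _-\cdot (n-m))\,dm&=\rho (n)\cdot n-\alpha _-\cdot \int _{-\infty }^n \rho (m)\,dm,\\ \frac{d}{dn}\int _n^{\infty } \rho (m)\cdot (m-\alpha _+\cdot (m-n))\,dm&=-\rho (n)\cdot n+\alpha _+\cdot \int _n^{\infty } \rho (m)\,dm. \end{aligned}$$

The boundary terms \(\pm \rho (n)\cdot n\) cancel when the two lines are added, which leaves exactly the left-hand side of (4). Differentiating once more, we get

$$\frac{d^2V}{dn^2}=-(\alpha _- +\alpha _+)\cdot \rho (n)\le 0,$$

so V is a concave function of n, and a point at which (4) holds is indeed a maximum and not a minimum.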

Here, by definition of the probability density:

$$\int _{-\infty }^n \rho (m)\,dm=\textrm{Prob}(m\le n)=F(n)$$

and

$$\int _n^{\infty } \rho (m)\,dm=\textrm{Prob}(m\ge n)=1-F(n).$$

Thus, the equality (4) takes the form

$$-\alpha _-\cdot F(n)+\alpha _+\cdot (1-F(n))=0,$$

so

$$(\alpha _- +\alpha _+)\cdot F(n)=\alpha _+$$

and hence

$$F(n)=\frac{\alpha _+}{\alpha _- +\alpha _+}.$$

This means that n is exactly the quantile of level

$$\begin{aligned} \tau =\frac{\alpha _+}{\alpha _- +\alpha _+}. \end{aligned}$$
(5)

Thus we arrive at the following conclusion.

Conclusion. In the above natural optimization problem, the optimal value n representing the random variable described by a cumulative distribution function F(m) is the quantile whose level \(\tau \) is given by the formula (5).

This explains why quantiles (i.e., VaR) indeed work well in econometrics.
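The conclusion can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the normally distributed gains, the coefficients \(\alpha _+=0.1\) and \(\alpha _-=0.4\), the sample size, and the grid search are all hypothetical choices of ours. It compares the value n that maximizes a sample-average version of (3) over a grid with the empirical quantile of level \(\tau \) from formula (5).

```python
import numpy as np

def optimal_level(alpha_plus, alpha_minus):
    """Quantile level tau from formula (5)."""
    return alpha_plus / (alpha_minus + alpha_plus)

def expected_value(samples, n, alpha_plus, alpha_minus):
    """Sample average of v(m), combining formulas (1) and (2)."""
    surplus = np.maximum(samples - n, 0.0)
    shortfall = np.maximum(n - samples, 0.0)
    return float(np.mean(samples - alpha_plus * surplus - alpha_minus * shortfall))

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
gains = rng.normal(loc=100.0, scale=20.0, size=20_000)  # hypothetical gain distribution
alpha_plus, alpha_minus = 0.1, 0.4                      # hypothetical per-dollar losses

tau = optimal_level(alpha_plus, alpha_minus)  # 0.1 / (0.4 + 0.1)
n_quantile = float(np.quantile(gains, tau))   # empirical tau-level quantile

# Brute-force maximization of the sample version of V over a grid of candidates n.
grid = np.linspace(gains.min(), gains.max(), 1001)
values = [expected_value(gains, n, alpha_plus, alpha_minus) for n in grid]
n_grid = float(grid[int(np.argmax(values))])

print(f"tau = {tau:.3f}, quantile = {n_quantile:.2f}, grid maximizer = {n_grid:.2f}")
```

Since the sample version of V is concave in n, the grid maximizer should agree with the empirical quantile up to about one grid step. Here, \(\tau =0.2\), i.e., when a shortfall costs four times more per dollar than a surplus, the representative number is the 20% quantile, well below the mean.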

An Interesting Observation. In econometrics, quantiles are used not only to describe the risk of different investments, but also to describe the dependence between different random variables, by describing how the quantile of the dependent variable m depends on the quantiles of the corresponding independent variables \(x_1,\ldots ,x_n\). In this technique, known as quantile regression (see, e.g., [5, 6]), for each value \(\tau \in (0,1)\), the \(\tau \)-level quantile n of the random variable m is determined by minimizing the expression

$$\begin{aligned} I{\mathop {=}\limits ^\textrm{def}}(\tau -1)\cdot \int _{-\infty }^n \rho (m)\cdot (m-n)\,dm+\tau \cdot \int _n^{\infty } \rho (m)\cdot (m-n)\,dm. \end{aligned}$$
(6)

By comparing the formulas (3) and (6) for the value \(\tau \) determined by the formula (5), we see that

$$\begin{aligned} V=\int _{-\infty }^{\infty } \rho (m)\cdot m\,dm-(\alpha _+ +\alpha _-)\cdot I. \end{aligned}$$
(7)

The first integral in the expression (7) is just the expected value of the gain; it does not depend on n at all. Since \(\alpha _+ +\alpha _->0\), maximizing V is thus equivalent to minimizing the expression I.

So, the objective function that is formally optimized in quantile regression actually has a precise meaning: it is linearly related to the expected utility of the user.
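The linear relation (7) can also be verified on sampled data. In the following Python sketch, the integrals in (3) and (6) are replaced by sample averages; the function names and this sample-average setup are our assumptions, used only for illustration. The function `gap` returns the difference between the two sides of (7), which should be zero up to rounding error for every n.

```python
import numpy as np

def expected_value(samples, n, alpha_plus, alpha_minus):
    """Sample version of V from formula (3)."""
    surplus = np.maximum(samples - n, 0.0)
    shortfall = np.maximum(n - samples, 0.0)
    return float(np.mean(samples - alpha_plus * surplus - alpha_minus * shortfall))

def quantile_objective(samples, n, tau):
    """Sample version of I from formula (6): weight tau - 1 below n, weight tau above n."""
    diff = samples - n
    weights = np.where(diff < 0, tau - 1.0, tau)
    return float(np.mean(weights * diff))

def gap(samples, n, alpha_plus, alpha_minus):
    """Left-hand side of (7) minus its right-hand side, with tau taken from formula (5)."""
    tau = alpha_plus / (alpha_minus + alpha_plus)
    lhs = expected_value(samples, n, alpha_plus, alpha_minus)
    rhs = float(np.mean(samples)) - (alpha_plus + alpha_minus) * quantile_objective(samples, n, tau)
    return lhs - rhs
```

The identity holds sample by sample, so it does not depend on the distribution of the gains or on the particular values of \(\alpha _+\) and \(\alpha _-\).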