Until the beginning or even the middle of the 1900s, mathematical finance (or financial mathematics) remained confined to a supporting role in business accounting, especially in debt and credit relations. It was used for computing, quickly and without controversy, interest and discounts on basic operations: the exchange of a sum lent today against the repayment, at a certain future time, of the same amount plus interest (accumulation), or, conversely, the early repayment of a debt, reduced at an appropriate rate (discounting).

In the more distant past, between the sixteenth and the eighteenth centuries, experts in the insurance world, borrowing from probability calculus and demographic statistics, had developed tools to compute the fair premium of annuity insurance and related quantities. Evaluating the probability of survival required sophisticated statistical analysis, and consequently, even in the early twentieth century, actuarial science (the theoretical counterpart of the flourishing insurance industry) was far more advanced than banking in its use of sophisticated mathematical tools. Even though some “stock exchanges” were already operating according to modern canons as early as the eighteenth century [the Paris stock exchange was born in 1724 and the legendary New York Stock Exchange (NYSE) in 1792], there was still a lack of studies on the evolution of stock prices and, more generally, on the functioning of stock markets.

In this setting, in the early twentieth century, two exceptionally important PhD theses were defended. In Paris, in 1900, Louis Bachelier (1870–1946) defended a thesis entitled Théorie de la spéculation (The theory of speculation) [3], while the thesis by Filip Lundberg (1876–1965), entitled Approximerad framställning af sannolikhetsfunktionen—Återförsäkring af kollektivrisker (Approximations of the probability function—reinsurance of collective risks), was defended in Uppsala in 1903 [11]. Both involved the use of advanced mathematics: stochastic processes, then a genuinely revolutionary tool. A stochastic process is a family \(X_t\) of random variables indexed by the time parameter t; it is thus a tool for describing the time evolution of a random phenomenon. The simplest model of a continuous-time stochastic process is the standard Wiener process, a one-dimensional arithmetic Brownian motion, started from the initial condition \(W(0) = 0\) and characterised by increments \(W(t + h) - W(t)\) that are independent over disjoint intervals and normally distributed with mean 0 and variance h (hence standard deviation \(h^{1/2}\)). It follows that the generic variable \(W(t)\) of the process is normally distributed with mean 0 and variance t. The notation dW indicates the random differential of the standard Wiener process, an extension of the notion of the differential of a function of a real variable: it is a normal random variable with mean 0 and variance dt.

In his description of the evolution of the cumulative gain process \(G_t\) that can be obtained over time by investing in an equity asset, Bachelier used the differential equation \(dG = s\,dW\), in which an asset-specific constant s appears, the so-called volatility of the asset. Given the initial condition \(G(0) = 0\), the solution of this equation is the process \(G(t)\) of normal variables with mean 0 and variance \(s^2 t\). This means that a stock investment is a financial transaction that, over a time interval of length t, generates a gain normally distributed with expected value zero and variance proportional to the time elapsed (with proportionality constant the square of the volatility). Despite the keenness of his insight, Bachelier’s contribution did not receive the attention it deserved. His thesis, forgotten for decades, was rediscovered and brought to the attention of the world of finance only in the 1950s by Paul Samuelson (1915–2009).
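To make this concrete, here is a minimal Python sketch (not part of Bachelier's treatment, and with purely hypothetical parameter values) that simulates the gain process by accumulating independent normal increments \(s\,\Delta W\) and checks that the gain at the horizon has mean close to 0 and variance close to \(s^2 t\):

```python
import numpy as np

rng = np.random.default_rng(0)

s = 0.2           # volatility of the asset (hypothetical value)
t = 1.0           # time horizon
n_steps = 200     # time steps per path
n_paths = 50_000  # number of simulated paths
dt = t / n_steps

# dG = s dW: accumulate independent normal increments with variance s^2 * dt
dG = s * rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
G_t = dG.sum(axis=1)   # cumulative gain at the horizon, path by path

print("sample mean of G_t    :", G_t.mean())   # close to 0
print("sample variance of G_t:", G_t.var())    # close to s**2 * t
print("theoretical variance  :", s**2 * t)
```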

For his part, Lundberg applied stochastic processes to the study of the solvency of an insurance firm, identifying the level of initial capital required to ensure that the probability of the firm being ruined falls within acceptable limits. Lundberg assumed that the firm’s capital evolves over time according to a random model described by the equation \(X(t) = x_0 + ct - S_t\), where \(x_0\) is the initial allocation, c is the constant instantaneous rate of an inflow of net premiums whose total value in the time period (0, t) is ct, and \(S_t\) is the cumulative value up to time t of the disbursements made to pay claims. Denoting by γ the constant loading coefficient of the fair premium for each insured risk, we have \(ct = (1 + \gamma)E(S_t)\), or \(c = (1 + \gamma)E(S_1)\). The key point of Lundberg’s approach is that the arrival of claims is described by a compound Poisson process \(S_t = \sum_{h=1}^{N(t)} Y_h\), where N(t), the random number of claims in the interval (0, t), is assumed to follow a Poisson process of parameter λ, constant over time [so that the average number of claims in each interval (0, t) is λt], while the random variables \(Y_h\), the sizes of the reimbursements deriving from the individual claims, are identically distributed, with mean μ and variance \(\nu^2\), and independent of each other and of N(t). Thus, the trajectories of the process X(t) alternate between segments of linear growth, with slope c, and downward jumps that are random in both timing and size. The time τ of ruin of the firm is the first instant at which, because of one of these jumps, \(X(\tau) < 0\); formally, \(\tau = \inf\{t : X(t) < 0\}\). Lundberg was able to prove that in this scenario the asymptotic ruin probability, that is, the probability of the event \(\tau < +\infty\), is approximately equal to \(\exp(-x_0\,(2m/\sigma^2))\), a function of the parameters \(m = \gamma E(S_1)\) (expected value of the gain per time unit) and \(\sigma^2 = \lambda(\mu^2 + \nu^2)\) (variance of this gain) of the firm’s risk portfolio.
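A small simulation can illustrate the quality of this approximation. The sketch below (hypothetical parameters; claim sizes assumed exponential with mean μ, so that \(\nu^2 = \mu^2\); a long but finite horizon used as a proxy for \(\tau < +\infty\)) estimates the ruin frequency of the surplus process \(X(t) = x_0 + ct - S_t\) and compares it with \(\exp(-x_0\,(2m/\sigma^2))\):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameters of the risk portfolio
lam   = 1.0   # claim arrival intensity (Poisson)
mu    = 1.0   # mean claim size; exponential claims, so the variance nu^2 equals mu^2
gamma = 0.2   # loading coefficient on the fair premium
x0    = 5.0   # initial capital
c     = (1 + gamma) * lam * mu   # premium rate: c = (1 + gamma) E(S_1)

T_max, n_paths = 300.0, 2_000    # finite horizon used as a proxy for "ruin ever"

ruined = 0
for _ in range(n_paths):
    t, claims = 0.0, 0.0
    while True:
        t += rng.exponential(1.0 / lam)      # waiting time until the next claim
        if t > T_max:
            break
        claims += rng.exponential(mu)        # size of the claim that has just arrived
        if x0 + c * t - claims < 0:          # surplus X(t) drops below zero: ruin
            ruined += 1
            break

m      = gamma * lam * mu          # expected gain per unit time
sigma2 = lam * (mu**2 + mu**2)     # variance of the gain per unit time (nu^2 = mu^2 here)
print("simulated ruin frequency   :", ruined / n_paths)
print("Lundberg-type approximation:", np.exp(-x0 * 2 * m / sigma2))
```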

Lundberg’s way of posing the problem was very successful and became known in actuarial circles as “collective risk theory”; for a long time it inspired the strategies for controlling the solvency of insurance companies in order to protect the community of policyholders. It placed the emphasis on the role of free capital in averting the failure of the firm, but it did not draw consistent conclusions about dividend policy (completely neglected in the model) or the company’s value. A reconciliation of collective risk theory with a consistent company-oriented viewpoint had to wait until 1957 and the 15th International Congress of Actuaries in New York, where Bruno de Finetti (1906–1985) gave a talk in which he set out an optimal dividend strategy aimed at maximising the company’s value, defined as the expected present value of the dividends themselves [8].

Reworking the insights of Bachelier and Lundberg, and exploiting the powerful results achieved in the meantime in probability theory, in particular by Paul Lévy (1886–1971) and Kiyosi Itô (1915–2008), quantitative finance has given stochastic processes a dominant role since 1970.

A variant of arithmetic Brownian motion, the geometric Brownian motion described by the differential equation \(dA/A = m\,dt + s\,dW\), was used by Samuelson in 1965 and, a few years later, in 1973, both by Fischer Black (1938–1995) and Myron Scholes (b. 1941) [4], and by Robert C. Merton (b. 1944) [14], to describe the behaviour over time of the instantaneous rate of return dA/A of an asset, typically a share. In addition to the random component \(s\,dW\), the right-hand side includes a deterministic term \(m\,dt\), whose coefficient m is called the drift parameter of the process. It is easy to deduce that dA/A has a normal distribution with mean \(m\,dt\) and variance \(s^2 dt\). Taking into account the initial condition \(A(0) = A_0\), it can be proved that the solution of the differential equation is the process \(A_t = A_0 \exp(\mu t + sW_t)\), where \(\mu = m - s^2/2\). Since the natural logarithm of \(A_t/A_0\), that is \(\mu t + sW_t\), is a normal variable with mean μt and variance \(s^2 t\), the process is said to be log-normal. In the simplest model, the parameters m and s are constant in every time/state combination \((t, A_t)\), while in other cases they may be deterministic functions or random variables.
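The following sketch (hypothetical parameters) simulates the terminal value of the geometric Brownian motion and checks that \(\ln(A_t/A_0)\) is indeed normal with mean μt and variance \(s^2 t\):

```python
import numpy as np

rng = np.random.default_rng(2)

A0, m, s, t = 100.0, 0.08, 0.25, 2.0   # hypothetical parameters
n_paths = 100_000

mu  = m - s**2 / 2                                 # drift of the logarithm
W_t = rng.normal(0.0, np.sqrt(t), size=n_paths)    # W_t ~ N(0, t)
A_t = A0 * np.exp(mu * t + s * W_t)                # geometric Brownian motion at time t

log_ret = np.log(A_t / A0)
print("mean of ln(A_t/A_0)    :", log_ret.mean(), " theory:", mu * t)
print("variance of ln(A_t/A_0):", log_ret.var(),  " theory:", s**2 * t)
print("mean of A_t            :", A_t.mean(),     " theory:", A0 * np.exp(m * t))
```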

Brownian motions with drift were also proposed to explain the random movements of the instantaneous spot interest rate. Oldřich Vašíček (b. 1942) was the pioneer in these applications; slightly different variants were proposed a few years later, in 1985, by Cox, Ingersoll and Ross (with what is now known as the CIR model) [7] and by Hull and White in 1993 [10].

A common feature of all these variants of Brownian motion is that the trajectories of the process are always continuous. An influential advocate of introducing mixed diffusion/jump processes when studying the evolution of random variables in financial applications was again Merton, in a celebrated 1976 article [16]. In his model we have \(A_t = A_0 \exp(L_t)\), with \(L_t\) a jump-diffusion process in which the parameters of the diffusion component are m and \(s^2\) and the jump component is described by a compound Poisson process with arrival intensity λ and jumps \(Y_h\) that are independent of each other and of \(N_t\), and identically distributed (normal with mean ν and variance \(\sigma^2\)); that is, \(L_t = mt + sW_t + \sum_{h=1}^{N(t)} Y_h\). The first two moments of the distribution of \(L_1\) are \(E(L_1) = m + \lambda\nu\) and \(V(L_1) = s^2 + \lambda(\nu^2 + \sigma^2)\). It is hardly necessary to point out how brilliant (three quarters of a century in advance!) the insights of Bachelier and Lundberg proved to be.
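As an illustration, the sketch below (hypothetical parameter values) simulates \(L_t\) at t = 1 by drawing the diffusion part and the compound Poisson part separately, and compares the sample mean and variance with the formulas just given:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical parameters of the jump-diffusion
m, s       = 0.05, 0.20    # drift and diffusion coefficient
lam        = 0.5           # intensity of the Poisson arrivals of jumps
nu, sigma  = -0.10, 0.15   # mean and standard deviation of each jump Y_h
t, n_paths = 1.0, 200_000

W_t = rng.normal(0.0, np.sqrt(t), size=n_paths)    # diffusion part
N_t = rng.poisson(lam * t, size=n_paths)           # number of jumps by time t
# Given N_t, the sum of the jumps is normal with mean N_t*nu and variance N_t*sigma^2
jump_sum = rng.normal(N_t * nu, sigma * np.sqrt(N_t))

L_t = m * t + s * W_t + jump_sum
print("E(L_1) simulated:", L_t.mean(), " theory:", m + lam * nu)
print("V(L_1) simulated:", L_t.var(),  " theory:", s**2 + lam * (nu**2 + sigma**2))
```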

Towards the middle of the twentieth century, another big name in quantitative finance emerged: Harry Markowitz (b. 1927) with his portfolio theory (see [12, 13]).

To introduce Markowitz’s portfolio theory, consider an individual, or a representative of an institution, who has to invest an amount W of wealth in n risky assets whose random rates of return over a given time span are \(R_i\), i = 1, …, n. Denoting by \(W_i\) the part of the wealth invested in asset i and by \(x_i = W_i/W\) its fraction, we call the vector \(x = (x_1, x_2, \ldots, x_i, \ldots, x_n)\) of the fractions a portfolio. The return of the portfolio over the given time span is \(R(x) = \sum_{i=1}^n x_i R_i\), the combination of the returns of the single assets weighted by their respective fractions. We have the constraint \(\sum_{i=1}^n x_i = 1\), but, where necessary, other restrictions can be imposed, in particular \(x \ge 0\), the non-negativity of the fractions, which reflects the prohibition (in force in some institutional settings) on taking short positions in any asset.

In his work in the 1950s, Markowitz addressed the problem of how to choose the portfolio, solving it by proposing the mean–variance approach. In his view, any rational investor pursues two objectives: maximising gain and minimising risk. The objectives are, however, conflicting: maximising the expected gain would mean concentrating the investment in one or very few assets (those with the highest expected gain), which would be accompanied by a (too) high level of risk. Mediating between the two requirements, the practice of the day empirically suggested dividing the portfolio into more or less equal parts among a number (from a dozen to twenty) of the assets with the highest expected return. It was thought that this, besides ensuring a sufficiently high expected return, would reduce, or virtually eliminate, the risk, thanks to diversification. Markowitz was not content with this simplistic approach and decided to investigate further the conditions under which diversification is effective. In a random setting, he identified as a measure of the gain the expected return of the portfolio, \(E(R(x)) = \sum_{i=1}^n x_i E(R_i)\), and as a measure of the risk its variance, written in terms of its components as \(V(R(x)) = \sum_i \sum_j x_i x_j \sigma_{ij}\), where \(\sigma_{ij} = \rho_{ij}\sigma_i\sigma_j\) denotes the covariance between \(R_i\) and \(R_j\), the product of the coefficient of linear correlation between the two assets’ returns and their standard deviations (note that for j = i the covariance reduces to the variance of \(R_i\)).

The next step was consistent with Pareto’s concept of multicriteria decisions. It was necessary to find an algorithm that would generate the set of Pareto-efficient portfolios: formally, the admissible portfolios x for which no admissible portfolio y exists with \(E(R(y)) \ge E(R(x))\) and \(V(R(y)) \le V(R(x))\), with at least one strict inequality. Taking advantage of the then-recent results of constrained optimisation theory (due to Dantzig, Kuhn and Tucker), Markowitz found that the set of efficient portfolios could be obtained by solving, for each target expected return E, the problem \(\min_x V = x^{\mathsf T} C x\) subject to the constraints \(x^{\mathsf T} m \ge E\), \(x^{\mathsf T}\mathbf{1} = 1\), where C denotes the covariance matrix of the returns and m the vector of their expected values.
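A minimal sketch of this computation is given below. It solves the equality-constrained version of the problem (target expected return met exactly, full investment, short positions allowed) through the first-order conditions of the Lagrangian, rather than Markowitz's original critical-line algorithm; the expected returns and the covariance matrix are hypothetical:

```python
import numpy as np

# Hypothetical expected returns and covariance matrix C of three risky assets
m = np.array([0.06, 0.09, 0.12])
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

def efficient_portfolio(target_return):
    """Minimise x'Cx subject to m'x = target_return and 1'x = 1
    (short positions allowed), via the Lagrangian first-order conditions."""
    n = len(m)
    ones = np.ones(n)
    A = np.zeros((n + 2, n + 2))
    A[:n, :n] = 2 * C                   # stationarity block: 2Cx + l1*m + l2*1 = 0
    A[:n, n], A[:n, n + 1] = m, ones
    A[n, :n], A[n + 1, :n] = m, ones    # the two linear constraints
    b = np.concatenate([np.zeros(n), [target_return, 1.0]])
    x = np.linalg.solve(A, b)[:n]
    return x, x @ C @ x                 # weights and portfolio variance

for E in (0.07, 0.09, 0.11):
    x, var = efficient_portfolio(E)
    print(f"target E = {E:.2f}   weights = {np.round(x, 3)}   variance = {var:.4f}")
```

Sweeping the target return traces the frontier in the (V, E) plane; the efficient portfolios are those lying above the global minimum-variance point.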

Markowitz described the geometrical properties of the solution both in the space (a simplex) of portfolios and in the variance–mean plane (V, E), where the efficient set turns out to be the northwest (efficient) boundary of the set of admissible portfolios. This geometric representation became the starting point of reflections on the functioning of capital markets (roughly speaking, the stock markets) developed in the following years by James Tobin (1918–2002) and William F. Sharpe (b. 1934)—who were awarded the Nobel Prize in Economics in 1981 and 1990 respectively—which led to the so-called Capital Asset Pricing Model (CAPM).

Tobin had the idea of adding to the menu of assets with random returns a further risk-free asset with rate of return \(r_f\) over the period considered (a pure discount bond with fixed face value, payable without any risk of insolvency at the end of the time period considered by the model). In this new scenario, every efficient portfolio could be obtained as a combination of that asset and a single, completely random portfolio M. Such a portfolio, called the market portfolio, was characterised by the property of maximising, over the set of all eligible, completely random portfolios A, the ratio \((E_A - r_f)/\sigma_A\). Ultimately, then, all efficient portfolios were combinations of the market portfolio and the risk-free asset (in proportions depending on the risk–return trade-off preferred by the individual investor).

Reflecting on this model, in 1964 Sharpe introduced the equation that characterises, in this abstract equilibrium, all mean–variance efficient portfolios P: \(E_P = r_f + \bigl((E_M - r_f)/\sigma_M\bigr)\,\sigma_P\) [20]. It decomposes the expected return \(E_P\) of each efficient portfolio into the sum of the certain return \(r_f\), the so-called price of time, and an additional return called the overall price of risk. The latter, in turn, is the product of the unit price of risk, \((E_M - r_f)/\sigma_M\), and the number of units of risk carried by the efficient portfolio, measured by \(\sigma_P\), that is, by its standard deviation.

This model suggested to Sharpe another equation, valid in general for efficient portfolios as well as for individual assets and inefficient portfolios: \(E_P = r_f + (E_M - r_f)(\sigma_{PM}/\sigma_M^2)\). It states that the expected return of any asset or portfolio is the sum of the price of time and an additional compensation for the exposure to the unavoidable risk factor, represented by the return of the market portfolio, whose unit price is \(E_M - r_f\). The measure of this exposure is the so-called beta coefficient of the asset (or of the portfolio), \(\beta_{PM} = \sigma_{PM}/\sigma_M^2\). This coefficient, geometrically interpreted as the slope of the regression line of \(R_P\) on \(R_M\), measures the sensitivity of the return of the portfolio to changes in the market return. It can be proved that the beta coefficient of a portfolio is the combination of the beta coefficients of its assets, weighted by the respective fractions; in particular, the market portfolio has β = 1, while β = 0 for the risk-free asset. Portfolios with β > 1 are said to be aggressive, those with β < 1 defensive. Aggressive portfolios amplify the reward when the market performs better than expected, but reduce it when the market disappoints; the opposite holds for defensive portfolios.
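The sketch below estimates a beta coefficient as \(\sigma_{PM}/\sigma_M^2\) from a pair of return series; the series are simulated here with a known "true" beta, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical monthly return series: a simulated "market" and a portfolio that
# responds to it with a true beta of 1.3 plus idiosyncratic noise
n   = 120
R_M = rng.normal(0.01, 0.04, size=n)                     # market returns
R_P = 0.002 + 1.3 * R_M + rng.normal(0.0, 0.02, size=n)  # portfolio returns

# beta = sigma_PM / sigma_M^2, the slope of the regression of R_P on R_M
beta = np.cov(R_P, R_M, ddof=1)[0, 1] / np.var(R_M, ddof=1)
print("estimated beta:", round(beta, 3))
print("aggressive portfolio" if beta > 1 else "defensive portfolio")
```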

In the decades that followed, the explanatory power of this model generated a huge literature of attempts at theoretical generalisation on the one hand and empirical verification on the other. Of particular note are the proposals by Ross with the Arbitrage Pricing Theory (APT, 1976) [17], Merton with the Intertemporal Capital Asset Pricing Model (ICAPM, 1973) [15] and Breeden with the Consumption Capital Asset Pricing Model (CCAPM, 1979) [5].

Today, to verify the efficiency of their asset management, all financial intermediaries, such as open- and closed-end funds, pension funds, insurance companies, commercial banks and investment banks, rely on techniques and methods borrowed directly or indirectly from the theories of Markowitz and Sharpe. These theories have had, and continue to have, a major influence on the functioning of capitalism at its current stage of development.

Another theory, developed in the last quarter of the twentieth century, has assumed an importance equal to or even greater than that of portfolio theory: option theory. The year 1973 saw both the circulation of the Black–Scholes formula, which seemed to provide a basis for computing unequivocally and precisely the price of European call and put options of standard type, and the opening of the first modern exchange specialised in the field (the Chicago Board Options Exchange). Subsequently, the development of the theory encouraged the spread of “exotic” options, often negotiated over the counter, that is, directly between the parties outside an exchange. The dissemination of these instruments through familiar banking channels, by giving a friendly veneer to techniques that are very deceptive and difficult to understand even for experts, certainly contributed to polluting the financial markets and was a major cause of the severe economic crisis that, starting in 2007 in the USA, overran the developed economies, creating an unprecedented crisis in the real economy.

In an option there are two parties, one in a long position (the holder) and the other in a short position (the writer). The option gives the holder the right, but not the obligation, to exercise it at a fixed date T (European options) or at any time t before the deadline (American options). The object of the exercise is a predetermined amount of an underlying asset (usually a financial asset, such as a share or a bond, a physical commodity, such as oil, wheat or cotton, or even an interest rate). Exercising allows the holder to buy (“call” option) or sell (“put” option) the underlying asset, not at the current price A but at a fixed price K, the exercise price. In the absence of taxes and transaction frictions, the value of the option at maturity is the net gain obtained when the holder adopts the optimal strategy: \(\max(A_T - K;\, 0)\) for the call and \(\max(0;\, K - A_T)\) for the put. Note that, in the absence of frictions, the option cannot have a negative value: in the worst case, one can simply let it expire unexercised, which involves no expense. In the absence of maintenance costs, non-negativity also holds at any time before maturity. More generally, we have the so-called fundamental equation of option theory: \(A_T + P_T - C_T - K_T = 0\), where the plus and minus signs denote, respectively, the values of long and short positions, and the identity means that the relationship is valid whatever the value at T of the underlying asset; of course, \(P_T\) and \(C_T\) are the values of options that share the maturity date T, the underlying asset A and the exercise price \(K = K_T\). In the identity, K can also be interpreted as the redemption value at the maturity date T of a zero-coupon bond. At time T, it is therefore equivalent (a) to hold a portfolio consisting of long positions in a zero-coupon bond with maturity date T and redemption value \(K = K_T\) and in a call option on the underlying asset A, with maturity date T and exercise price K, or (b) to hold a portfolio consisting of long positions in the underlying asset A and in the “twin” put option (with the same underlying asset, maturity and exercise price) of that call.

The fundamental identity also remains valid at any time t < T, provided the four assets at stake generate neither costs nor profits in the interval between t and T (in particular, the share pays no dividends, the bond pays no coupons and the options are European). In this case it always holds (that is, for any value \(A_t\) of the underlying asset) that \(A_t + P_t - C_t - K_t = 0\), where \(K_t\) is the value at t of the bond maturing at T (and also the discounted value of the exercise price). This identity is the basis of the so-called put-call parity relations for European options, \(P_t = C_t + K_t - A_t\) and \(C_t = A_t + P_t - K_t\), expressed in this precise algebraic form by Stoll in 1969 [21], but well known in the financial world since the seventeenth century (Joseph de La Vega covered it in his 1688 Confusion of Confusions).
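A quick numerical illustration of the parity relation, with hypothetical data and continuous discounting of the exercise price (one possible convention for \(K_t\)), is the following:

```python
import math

# Hypothetical data: current asset price, exercise price, risk-free rate,
# residual life of the options, and a call price assumed to be observed in the market
A_t = 100.0
K   = 105.0
r   = 0.03
tau = 0.5                        # T - t, in years
C_t = 4.10                       # observed (hypothetical) European call price

K_t = K * math.exp(-r * tau)     # value at t of the zero-coupon bond, i.e. the discounted strike
P_t = C_t + K_t - A_t            # put-call parity: P_t = C_t + K_t - A_t
print("discounted exercise price K_t:", round(K_t, 4))
print("implied twin put price P_t   :", round(P_t, 4))
```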

The parity relations are in turn the basis for the decomposition of the value of a European option into two parts, intrinsic value and time value. We have

$$P_t = (K_T - A_t) + \bigl(C_t - (K_T - K_t)\bigr); \qquad C_t = (A_t - K_T) + \bigl(P_t + (K_T - K_t)\bigr),$$

where in each case the first bracket is the intrinsic value (the payoff from exercising immediately at the current price \(A_t\)) and the second is the time value; \(K_T - K_t\) is the discount on the exercise price over the residual life of the option.

The decomposition helps to explain why (as highlighted for the first time in a 1969 paper by Samuelson and Merton [19]) it is never profitable to exercise early (before maturity) an American call option on an underlying asset that pays no dividends, or one carrying an equivalent dividend-protection clause. Indeed, exercising an American option prematurely means deciding to shorten its maturity from T to t. For a call option, this means giving up the time value \(P_t + (K_T - K_t)\), which is never negative. The reasoning is less straightforward for the American put option, because its time value, \(C_t - (K_T - K_t)\), does not have an unequivocal sign.

The arguments we have seen so far, based on relatively weak economic assumptions, have made it possible to establish a put-call parity relation that allows us to find the value of the call option when the value of the twin put is known (and vice versa). However, they do not enable us to determine either of the two values independently of the other. Achieving this goal, one of the great intellectual challenges in the history of finance, became possible only in 1973 through the (independent) work of Black and Scholes [4], and of Merton [14]. It is for this work that Scholes and Merton were awarded the Nobel Prize in Economics in 1997.

To price a European call option on a dividend-free underlying stock, Black, Scholes and Merton wisely used instruments and results already available: geometric Brownian motion with drift (suggested, as we have seen, by Samuelson in 1965 [18]) for the process of the underlying asset, and the idea of using the possibilities offered by a complete market to replicate the value of the option at maturity whatever the path followed by the underlying asset between the creation of the option and its maturity. In this regard, the standard had been set by a 1953 paper [1] by Arrow (1921–2017), in which he proved that, to replicate any desired basket of payments at time T, it was necessary and sufficient to have a number of elementary assets (the so-called “Arrow-Debreu assets” or “state-contingent assets”) equal to the number of possible states of the world at T. Each elementary asset pays one monetary unit if the corresponding state of the world occurs, and nothing otherwise. In the same paper, Arrow also neutralised the objection that the existence of a number of elementary assets matching the enormous number of states of the world was an utterly unfeasible academic utopia: he showed that the market could alternatively be completed (even with a limited number of assets) by making the trading dates more frequent. Pushing this strategy to the limit, that is, passing to continuous time, Black and Scholes proved that just two assets, the risk-free asset and the random underlying asset, continuously rebalanced, could replicate the maturity value of a European call option. Thus they obtained, by solving a second-order partial differential equation, the well-known formula

$$C(t) = A\,N(d_1) - K\exp\bigl(-r(T-t)\bigr)\,N(d_2),$$

where \(N(x) = \mathrm{Prob}(N(0,1) \le x)\) is the cumulative distribution function of the standard normal distribution evaluated at x,

$$d_1 = \frac{\ln(A/K) + (r + s^2/2)(T-t)}{s\sqrt{T-t}}$$

and

$$d_2 = d_1 - s\sqrt{T-t} = \frac{\ln(A/K) + (r - s^2/2)(T-t)}{s\sqrt{T-t}}.$$
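A direct implementation of the formula is straightforward; the sketch below (hypothetical inputs) uses the error function to evaluate the standard normal cumulative distribution:

```python
import math

def norm_cdf(x):
    """Cumulative distribution function of the standard normal, N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(A, K, r, s, tau):
    """Black-Scholes value of a European call; tau = T - t is the residual life."""
    d1 = (math.log(A / K) + (r + s**2 / 2) * tau) / (s * math.sqrt(tau))
    d2 = d1 - s * math.sqrt(tau)
    return A * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

# Hypothetical parameters
print(round(black_scholes_call(A=100.0, K=105.0, r=0.03, s=0.20, tau=0.5), 4))
```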

Five parameters appear in the formula: A, r, K, T − t and s; the drift parameter m of the differential equation of the underlying asset does not. A and r are observable in the market, K and T − t are specified by the contract, while the volatility s of the underlying asset is the only non-observable parameter. In the Black–Scholes model it is assumed to be constant; its estimation has been the subject of great debate in view of practical applications. One can resort to the historical volatility or to the so-called implied volatility, obtained by solving the Black–Scholes formula for the unknown volatility, starting from a set of market prices of similar options. Of course, countless models with time-dependent (deterministic or random) volatility were proposed later. Even though none of them has come anywhere near the prominent position of the Black–Scholes model, many market operators follow volatility-arbitrage strategies inspired by these models.
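As an illustration of the notion of implied volatility, the following sketch backs out the volatility from a hypothetical market quote by bisection, relying on the fact that the Black–Scholes call price is increasing in the volatility (and assuming the quote lies between the model prices at the two bracketing volatilities):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(A, K, r, s, tau):
    d1 = (math.log(A / K) + (r + s**2 / 2) * tau) / (s * math.sqrt(tau))
    return A * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d1 - s * math.sqrt(tau))

def implied_vol(quote, A, K, r, tau, lo=1e-4, hi=5.0, tol=1e-8):
    """Solve the Black-Scholes formula for the unknown volatility by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bs_call(A, K, r, mid, tau) < quote:
            lo = mid          # call price too low: raise the volatility
        else:
            hi = mid          # call price too high: lower the volatility
    return (lo + hi) / 2

# Hypothetical market quote for a European call
print(round(implied_vol(4.20, A=100.0, K=105.0, r=0.03, tau=0.5), 4))
```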

Unfortunately, the use in the proof of sophisticated tools of stochastic calculus, such as Itô’s lemma, did not make the approach easy to understand, so the paper by Black and Scholes waited a long time before being given the green light by referees. Cox, Ross and Rubinstein (b. 1944) remedied the situation in a masterly 1979 article [6], where they proved that a multi-period binomial market, with only two elementary assets available at each trading date, generates, as the intervals between successive dates approach zero, a log-normal distribution of the price of the underlying asset and, in the absence of arbitrage, a price for the call option that converges to the Black–Scholes formula.
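This convergence can be checked numerically. The sketch below prices the call in an n-step binomial market with the usual Cox, Ross and Rubinstein parameterisation (a standard choice, used here as an assumption, with hypothetical data) and compares the result with the Black–Scholes value as n grows:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(A, K, r, s, tau):
    d1 = (math.log(A / K) + (r + s**2 / 2) * tau) / (s * math.sqrt(tau))
    return A * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d1 - s * math.sqrt(tau))

def crr_call(A, K, r, s, tau, n):
    """European call in an n-step binomial market: up factor u = exp(s*sqrt(dt)),
    down factor d = 1/u, risk-neutral probability q of an up move."""
    dt = tau / n
    u, d = math.exp(s * math.sqrt(dt)), math.exp(-s * math.sqrt(dt))
    q = (math.exp(r * dt) - d) / (u - d)
    disc = math.exp(-r * dt)
    # terminal payoffs of the call on the n + 1 final nodes
    values = [max(A * u**j * d**(n - j) - K, 0.0) for j in range(n + 1)]
    # backward induction: discounted risk-neutral expectation at each earlier node
    for _ in range(n):
        values = [disc * (q * values[j + 1] + (1 - q) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]

A, K, r, s, tau = 100.0, 105.0, 0.03, 0.20, 0.5   # hypothetical data
for n in (10, 100, 500):
    print(f"n = {n:4d}   binomial price = {crr_call(A, K, r, s, tau, n):.4f}")
print(f"Black-Scholes limit  = {bs_call(A, K, r, s, tau):.4f}")
```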

In subsequent years, the impact of the Black–Scholes formula was strongly accentuated by another consequence of Arrow’s logic: the extraordinary (and, in retrospect, we might say deceptive) simplicity of the technology of computing prices as present values of expected values in a risk-neutral world. If we consider, in a complete market, the set of current prices \(p_s\) of the elementary assets, and denote by \(B = 1/(1 + r_f)\) the current price of a zero-coupon bond that pays 1 with certainty at time T, the absence of arbitrage requires that \(B = \sum_{s=1}^S p_s\), or \(1 = B(1 + r_f) = (1 + r_f)\sum_{s=1}^S p_s\). If we now set \(\pi_s = p_s(1 + r_f)\), we obtain \(\sum_{s=1}^S \pi_s = (1 + r_f)\sum_{s=1}^S p_s = 1\). Thus the numbers \(\pi_s\), which are non-negative and sum to 1, have the properties of a probability distribution. Their role can be understood by considering the current price p(X) of a random asset X that at T pays \(x_s\) conditionally on the occurrence of state s. To avoid arbitrage we must have:

$$p(X) = \sum_{s=1}^{S} p_s x_s = (1 + r_f)^{-1}\sum_{s=1}^{S} \pi_s x_s = (1 + r_f)^{-1} E_{\Pi}(X).$$

Thus, p(X) turns out to be the present value, at the risk-free interest rate, of the expected value of the variable X under the probability distribution Π. If Π were the real-world probability distribution, this would be the price of the random asset X only in a market of risk-neutral agents; if, on the other hand, the agents are risk-averse, the \(\pi_s\) are not the real probabilities, but probabilities adjusted according to the risk aversion attached to the various states. For this reason, the \(\pi_s\) are called risk-neutral probabilities. It was Jacques Drèze (b. 1929) who in 1970 highlighted this aspect, which is fundamental for applications [9]. Subsequently, probabilists formalised this approach, coining the terminology of equivalent martingale measures. Within a few years, the ability to estimate the prices of elementary assets with relative (and illusory) ease made this pricing technique the new mainstream of theoretical and applied finance. Options were also extended to the sector of government and corporate bonds, and a world of applications of complex products emerged, whose logic eventually got out of hand, even for their creators. The financial engineering excesses that we now deplore are a perverse consequence of Arrow’s studies on the optimal allocation of risk (for which he was awarded the Nobel Prize in Economics in 1972). Arrow himself remarked in a 1965 paper [2] that this is how a divorce could occur between productive activities (the real economy) and risk undertaking/allocation activities (which he understood as the core of finance). This work should therefore be considered the true birth certificate of financial engineering.
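A toy example of this pricing technique, with a hypothetical three-state market and hypothetical state prices, is sketched below; the state prices determine the risk-free rate and the risk-neutral probabilities, and the two ways of computing p(X) coincide by construction:

```python
import numpy as np

# Hypothetical current prices of the elementary (Arrow-Debreu) assets, one per state
p = np.array([0.30, 0.45, 0.20])

B   = p.sum()                  # price of the bond paying 1 in every state at T
r_f = 1.0 / B - 1.0            # risk-free rate implied by the bond price
pi  = p * (1.0 + r_f)          # risk-neutral probabilities: non-negative, summing to 1

x = np.array([120.0, 100.0, 60.0])   # state-contingent payoff of a random asset X at T

price_state   = p @ x                       # sum over s of p_s * x_s
price_neutral = (pi @ x) / (1.0 + r_f)      # discounted expectation under Pi
print("risk-free rate r_f         :", round(r_f, 4))
print("risk-neutral probabilities :", np.round(pi, 4))
print("price via state prices     :", round(price_state, 4))
print("price via E_Pi/(1 + r_f)   :", round(price_neutral, 4))
```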

Translated from the Italian by Daniele A. Gewurz.