1 Introduction

The use of stock options for risk reduction and return enhancement has expanded at a considerable rate over the last several decades. In 1973, the Chicago Board Options Exchange was established; it brought liquidity to option trading through public listing and the standardization of option contracts. In the same year, the most famous option pricing model, the Black–Scholes model, was proposed, and it became the industry standard (Lee et al. 2013a).

Black and Scholes (1973) and Merton (1973) used stochastic calculus to derive option pricing models. Rendleman and Bartter (1979) and Cox et al. (1979) used the binomial distribution to derive the Black–Scholes model. Over the following decades, a number of new models that relax some of the restrictive assumptions of the Black–Scholes model have been proposed.

The first group of studies developed models that allow important parameters, such as the interest rate and/or volatility, to be stochastic. Scott (1987), Wiggins (1987), Hull and White (1987), Melino and Turnbull (1990, 1995), Stein and Stein (1991) and Heston (1993) generalized the Black–Scholes model to allow stochastic variance, while Amin and Jarrow (1992) extended the Black–Scholes model to allow a stochastic interest rate. Furthermore, some studies proposed generalizations that allow the interest rate and volatility to be stochastic at the same time; examples can be found in Bailey and Stulz (1989), Amin and Ng (1993), and Bakshi and Chen (1997a, b). Similarly, Lee et al. (1991) extended the binomial option pricing model to the case where the up and down percentage changes of stock prices are stochastic. They proved that assuming stochastic parameters in the discrete-time binomial option pricing model is analogous to assuming stochastic volatility in continuous-time option pricing.

The second group of studies introduced a jump-diffusion process into the Black–Scholes model and extended the original model. Several jump-diffusion models were proposed by Bates (1991), Kou (2002), and Kou and Wang (2004). Psychoyios et al. (2010) closely approximated the time series behavior of the VIX index by a mean-reverting logarithmic diffusion process with jumps. Based on the empirical results, they derived closed-form valuation models for European options written on the spot and forward VIX, respectively. For more complicated cases, Bates (1996) introduced a jump-diffusion process into the stochastic-volatility model, and Scott (1997) attempted to price options in a jump-diffusion model with stochastic volatility and interest rates.

Besides these two large categories of models, a large number of other option pricing models have also been proposed and validated using real-world data. Examples include: (a) the constant elasticity of variance (CEV) models (Cox and Ross 1976; Beckers 1980; Davydov and Linetsky 2001; Lee et al. 2004; Chen et al. 2009); (b) the Markovian models (Rubinstein 1994; Aït-Sahalia and Lo 1996); (c) the GARCH models (Duan 1995; Heston and Nandi 2000; Wu 2006); and (d) the models based on Lévy processes (Geman et al. 2001; Carr and Wu 2004).

In more recent years, considerably more complex cases have been considered in developing new models, and these models have been shown to describe reality more precisely. Chen and Palmon (2005) proposed an empirically based, non-parametric option pricing model and used it to evaluate S&P 500 index options. As their model was derived under the real measure, an equilibrium asset pricing model, rather than a no-arbitrage model, was assumed. Costabile et al. (2014) proposed a binomial approach for option pricing in which the parameters governing the underlying asset process follow a regime-switching model. Lin et al. (2014) developed a currency option pricing model that incorporates high-variance and low-variance regimes as well as the jump nature of exchange rates, and showed that their model performed better than both the traditional regime-switching model and the Black–Scholes model.

Overall, the Black–Scholes option pricing model has been extensively studied by different researchers, and the models discussed above are by no means exhaustive. Further details can be found in Hull (2014).

The Black–Scholes model, an important development in option valuation theory that relies on far fewer assumptions, shed new light on the valuation process. Subsequently, the growing popularity of the option concept has been evidenced by its application to the valuation of other, more abstract assets, including lease contracts (Grenadier 1995) and real estate agreements (Williams 1991; Buetow and Albert 1998).

Besides asset valuation, option theory is also widely used in the field of risk management. The most important observation in Merton (1974) is that the firm’s equity can be regarded as a call option on the firm’s assets with an exercise price equal to its liabilities. If the firm’s assets fall below its liabilities, then the firm is in danger of bankruptcy. Under the Black–Scholes model, the probability of bankruptcy is simply the probability that the market value of assets is less than the face value of the liabilities (Hillegeist et al. 2004). Based on option pricing models, several commercial vendors provide default probabilities, with KMV, LLC being the best known.

In a similar vein, Merton (1977, 1978) first discussed the relationship between deposit insurance and a put option. If a bank’s assets cannot cover the amount of its deposits, the bank is insolvent. In that case, all remaining assets belong to the depositors, and the deposit insurer must pay the difference between the deposits and the bank’s assets. The deposit insurance contract can therefore be viewed as a put option written on the bank’s assets with a strike price equal to the deposits. Marcus and Shaked (1984) used Merton’s model to price fair insurance premiums with constant proportional dividends, and found that the FDIC overcharged deposit insurance premiums in practice.

As discussed so far, it is very important to better understand the pricing theory and mechanism of option contracts, because the applications of the theory are so wide. It is well known that the binomial approach, the lognormal distribution approach, and the Itô stochastic differential approach can each be used to derive an option pricing model. (a) The binomial option model assumes that the stock price either goes up or down in each period. In the absence of arbitrage opportunities, a risk-free portfolio is constructed that produces the same return in every state over each investment period. The binomial model can then be generalized to n periods. (b) In the lognormal distribution approach, the most important assumption is that the stock price return follows a lognormal distribution. Using the properties of the normal and lognormal distributions and the relations between them, the Black–Scholes model can be derived without using stochastic differentials. (c) Black and Scholes used two alternative methods to derive the well-known stochastic differential equation. By introducing boundary constraints and making variable substitutions, this equation is transformed into the heat-transfer equation of physics. The Fourier transform is then used to solve the heat-transfer equation under the boundary condition, which yields the closed-form solution known as the Black–Scholes formula.

In this paper, we give an overall review and comparison of the alternative methods for deriving the option pricing model. We show that the main methodologies used to derive the Black–Scholes model are the binomial distribution, the lognormal distribution, and differential and integral calculus. We also show that if we assume risk neutrality, then we do not need stochastic calculus to derive the Black–Scholes model. This paper can help statisticians and mathematicians understand how alternative methods can be used to derive the Black–Scholes option model.

The rest of this paper proceeds as follows. In Sect. 2, we briefly review three different approaches for deriving the option pricing model. In Sect. 3, we discuss the relationship between the binomial OPM and the Black–Scholes OPM. In Sect. 4, we compare the Cox et al. method and the Rendleman and Bartter method for deriving the Black–Scholes OPM. In Sect. 5, we discuss the lognormal distribution method for deriving the Black–Scholes OPM. In Sect. 6, we present the stochastic calculus for deriving the Black–Scholes model. Finally, in Sect. 7, we summarize the paper. In the Appendix, we use the de Moivre–Laplace theorem to prove that the best fit between the binomial and normal distributions occurs when the binomial probability is \( \frac{1}{2} \).

2 A brief review of alternative approaches for deriving option pricing model

The binomial model, the lognormal distribution approach, and the Black–Scholes model can all be used to price an option. Similar results can be obtained from any of them if we make some additional assumptions.

2.1 Binomial model (Footnote 1)

The binomial option pricing model derived by Rendleman and Bartter (1979) and Cox et al. (1979) is one of the most used models to price options.

In the binomial model setting, the stock price S in each period either goes up by an increase factor (u) to reach uS or goes down by a decrease factor (d) to reach dS, where u = 1 + percentage of increase and d = 1 − percentage of decrease.

Let i = interest rate; r = 1 + i; \( C_{u} = \max[uS - X, 0] \), the call option price after the stock price increases; \( C_{d} = \max[dS - X, 0] \), the call option price after the stock price decreases.

To intuitively grasp the underlying concept of option pricing, we set up a risk-free portfolio, that is, a combination of assets that produces the same return in every state of the world over the investment horizon. The investment horizon here is assumed to be one period. We buy h shares of the stock and sell the call option at its current price of C to set up the portfolio (Footnote 2). Moreover, we choose the value of h such that our portfolio after one period will yield the same payoff whether the stock goes up or down, which is shown as follows.

$$ h(uS) - C_{u} = h(dS) - C_{d} $$
(1)

Solving for h, we obtain the number of shares of stock we should buy for each call option we sell, as the following equation shows.

$$ h = \frac{{C_{u} - C_{d} }}{(u - d)S} $$
(2)

Here h is called the hedge ratio. Because our portfolio yields the same return under either of the two possible states for the stock price without risk, then it should yield the risk-free rate of return, which is equal to the risk-free borrowing and lending rate. This condition must hold; otherwise, there would be a chance to earn a risk-free profit, which is known as an arbitrage opportunity. Therefore, the ending portfolio value must be equal to r (1 + risk-free rate) times the beginning portfolio value, as defined in the following equation.

$$ r(hS - C) = h(uS) - C_{u} = h(dS) - C_{d} $$
(3)

Note that S and C represent the stock price and the option price at period 0, respectively.

Substituting h from Eq. (2) into Eq. (3) and solving for C, we get the following expression for the call option value.

$$ C = \frac{1}{r}\left[ {\left( {\frac{r - d}{u - d}} \right)C_{u} + \left( {\frac{u - r}{u - d}} \right)C_{d} } \right] $$
(4)
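The intermediate algebra behind Eq. (4) can be written out as follows. This is a worked sketch that uses only the symbols already defined: the first equality in Eq. (3) is solved for C, and the hedge ratio h from Eq. (2) is then inserted.

$$ \begin{aligned} C & = hS - \frac{h(uS) - C_{u}}{r} = \frac{hS(r - u) + C_{u}}{r} \\ & = \frac{1}{r}\left[ {\frac{(C_{u} - C_{d})(r - u)}{u - d} + C_{u}} \right] \\ & = \frac{1}{r}\left[ {\left( {\frac{r - d}{u - d}} \right)C_{u} + \left( {\frac{u - r}{u - d}} \right)C_{d}} \right] \\ \end{aligned} $$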

To simplify this equation, we set

$$ p = \frac{r - d}{u - d}\, $$
(5)

Therefore, we have

$$ 1 - p = \frac{u - r}{u - d} $$
(6)

Thus we can get the option’s value with one period to expiration as Eq. (7).

$$ C = [pC_{u} + (1 - p)C_{d} ]/r $$
(7)

This is the binomial call option valuation formula in its most basic form. It prices the call option with one period to expiration. In this formula, p can be viewed as the (risk-neutral) probability of a stock price increase, while 1 − p is the corresponding probability of a stock price decrease.

To derive the option’s price with two periods to go, it is helpful as an intermediate step to derive the values of \( C_{u} \) and \( C_{d} \) with one period to expiration when the stock price is either uS or dS, respectively.
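The one-period valuation can be checked numerically. The sketch below is only an illustration: the function name `one_period_call` and the input values are assumptions made here, not taken from the paper. It computes the hedge ratio of Eq. (2), the weight p of Eq. (5) and the call value of Eq. (7), and then confirms that the hedged portfolio of Eq. (3) is worth the same in both states.

```python
def one_period_call(S, X, u, d, r):
    """Price a one-period call with Eqs. (2), (5) and (7).

    r is one plus the one-period risk-free rate, as in the text.
    Returns (C, h, p): call value, hedge ratio, and the weight p.
    """
    if not d < r < u:
        raise ValueError("need d < r < u to rule out arbitrage")
    C_u = max(u * S - X, 0.0)          # payoff after an up move
    C_d = max(d * S - X, 0.0)          # payoff after a down move
    h = (C_u - C_d) / ((u - d) * S)    # Eq. (2)
    p = (r - d) / (u - d)              # Eq. (5)
    C = (p * C_u + (1 - p) * C_d) / r  # Eq. (7)
    return C, h, p


# Illustrative inputs (hypothetical, not from the paper).
S, X, u, d, r = 100.0, 100.0, 1.1, 0.9, 1.05
C, h, p = one_period_call(S, X, u, d, r)

# Eq. (3): the hedged portfolio grows at the risk-free rate in both states.
grown = r * (h * S - C)
payoff_up = h * u * S - max(u * S - X, 0.0)
payoff_down = h * d * S - max(d * S - X, 0.0)
print(C, h, p, grown, payoff_up, payoff_down)
```

The three portfolio values printed at the end should agree up to rounding, which is exactly the no-arbitrage condition used to obtain Eq. (4).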

$$ C_{u} = {{[pC_{uu} + (1 - p)C_{ud} ]}/r} $$
(8)
$$ C_{d} = {{[pC_{du} + (1 - p)C_{dd} ]}/r} $$
(9)

Equation (8) tells us that if the value of the option after one period is \( C_{u} \), the option will be worth either \( C_{uu} \) (if the stock price goes up) or \( C_{ud} \) (if the stock price goes down) after one more period (at its expiration date). \( C_{uu} \) and \( C_{ud} \) are determined by \( C_{uu} = \max[u^{2}S - X, 0] \) and \( C_{ud} = \max[udS - X, 0] \).

Similarly, Eq. (9) shows that if the value of the option is \( C_{d} \) after one period, the option will be worth either \( C_{du} \) or \( C_{dd} \) at the end of the second period, where \( C_{du} = \max[duS - X, 0] \) and \( C_{dd} = \max[d^{2}S - X, 0] \).

Replacing \( C_{u} \) and \( C_{d} \) in Eq. (4) with their expressions in Eqs. (8) and (9), respectively, we can simplify the resulting equation to yield the two-period equivalent of the one-period binomial pricing formula, which is

$$ C = \frac{1}{{r^{2} }}\left[ {p^{2} C_{uu} + 2p(1 - p)C_{ud} + (1 - p)^{2} C_{dd} } \right] $$
(10)

In Eq. (10), we used the fact that \( C_{ud} = C_{du} \) because the price will be the same in either case.

If we assume that r, u, and d will remain constant over time, deriving the option’s fair value with two or more periods to maturity is a relatively simple process of working backwards from the possible maturity values. Using the same procedure, we can extend the two-period model to a three-period model as in Eq. (11).

$$ C = \frac{1}{{r^{3} }}\left[ {p^{3} C_{uuu} + 3p^{2} (1 - p)C_{uud} + 3p(1 - p)^{2} C_{udd} + (1 - p)^{3} C_{ddd} } \right] $$
(11)
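The backward procedure can be sketched in a few lines. The code below is an illustration under assumed inputs; the function names are hypothetical. `call_by_recursion` applies the one-period formula of Eq. (7) repeatedly from the expiration payoffs, and `call_three_period` evaluates Eq. (11) directly, so the two should agree for three periods.

```python
def call_by_recursion(S, X, u, d, r, periods):
    """Apply Eq. (7) repeatedly, working back from the expiration payoffs."""
    if periods == 0:
        return max(S - X, 0.0)
    p = (r - d) / (u - d)
    up = call_by_recursion(u * S, X, u, d, r, periods - 1)
    down = call_by_recursion(d * S, X, u, d, r, periods - 1)
    return (p * up + (1 - p) * down) / r


def call_three_period(S, X, u, d, r):
    """Evaluate Eq. (11) directly."""
    p = (r - d) / (u - d)
    q = 1 - p
    C_uuu = max(u**3 * S - X, 0.0)
    C_uud = max(u**2 * d * S - X, 0.0)
    C_udd = max(u * d**2 * S - X, 0.0)
    C_ddd = max(d**3 * S - X, 0.0)
    return (p**3 * C_uuu + 3 * p**2 * q * C_uud
            + 3 * p * q**2 * C_udd + q**3 * C_ddd) / r**3


# Illustrative inputs (hypothetical, chosen only for this sketch).
args = dict(S=100.0, X=95.0, u=1.2, d=0.85, r=1.04)
recursive_value = call_by_recursion(periods=3, **args)
formula_value = call_three_period(**args)
print(recursive_value, formula_value)
```

The recursion visits every path, so its cost doubles with each added period; it is meant to mirror the derivation rather than to be efficient.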

A graphical interpretation of Eq. (11) is presented in Figs. 1 and 2. Figure 1 presents the stock price binomial decision tree and Fig. 2 presents the call option decision tree.

Fig. 1 Three-period binomial decision tree of stock price

Fig. 2 Three-period binomial decision tree of call option

In Fig. 1, S represents stock price per share in period 0. In period 1, stock price can either go up (uS) or go down (dS).

Similarly, in period 2, the stock price can be \( u^{2}S \), udS, duS or \( d^{2}S \) (Footnote 3). In period 3, the stock price has eight possible cases, as presented in detail in Fig. 1.

In period 3, the highest possible value of the stock price under our assumptions is \( u^{3}S \). We get this value by first multiplying the stock price S at period 0 by u to get uS in period 1. Then, we multiply the stock price in period 1 by u to get \( u^{2}S \) in period 2. Finally, we multiply the stock price in period 2 by u to get \( u^{3}S \) in period 3. Similarly, the lowest possible value of the stock price is \( d^{3}S \).

In Fig. 2, the eight nodes in the right column represent the values of the call option in the eight possible cases for the stock price. Under the three-period binomial tree setting, period 3 is the maturity date. At that point, the value of the call option is determined by the relationship between the stock price and the exercise price X. Take \( C_{uud} \), the value of the call option when the stock price is \( u^{2}dS \), as an example. If the stock price \( u^{2}dS \) exceeds the exercise price X, then the call option value is \( C_{uud} = u^{2}dS - X \). Otherwise, exercising would produce a negative payoff, so the option is left unexercised and its value is 0. Together, these give the value of the option in period 3 as \( C_{uud} = \max[u^{2}dS - X, 0] \). The option values for the other stock prices at the expiration date are determined in the same way.

Similarly, the binomial model can be generalized into n periods. Lee et al. (2013b) defined the pricing of a call option in a binomial OPM with n periods as Eq. (12).

$$ C = \frac{1}{{r^{n} }}\sum\limits_{j = 0}^{n} {\frac{n!}{j!(n - j)!}p^{j} (1 - p)^{n - j} } \hbox{max} [(u)^{j} (d)^{n - j} S - X,0] $$
(12)

We can rewrite Eq. (12) as:

$$ C = S\left[ {\sum\limits_{j = a}^{n} {\frac{n!}{j!(n - j)!}p^{j} (1 - p)^{n - j} \frac{{u^{j} d^{n - j} }}{{r^{n} }}} } \right] - \frac{X}{{r^{n} }}\left[ {\sum\limits_{j = a}^{n} {\frac{n!}{j!(n - j)!}p^{j} (1 - p)^{n - j} } } \right] $$
(13)

where a denotes the minimum integer value of j for which \( u^{j} d^{n - j} S - X \) is positive.

It is easy to observe that the second term in brackets in Eq. (13) is a cumulative binomial distribution with parameters n and p. If we define \( p' \equiv (u/r)p \) and \( 1 - p' \equiv (d/r)(1 - p) \), then the first term in brackets also becomes a cumulative binomial distribution with parameters n and \( p' \), as shown in Eq. (14).

$$ p^{j} (1 - p)^{n - j} \frac{{u^{j} d^{n - j} }}{{r^{n} }} = p^{{{\prime }j}} (1 - p^{{\prime }} )^{n - j} $$
(14)

Therefore, Eq. (13) can be simplified as

$$ C = SB_{1} (a;n,p^{{\prime }} ) - \frac{X}{{r^{n} }}B_{2} (a;n,p) $$
(15)

where

$$ B_{1} (a;n,p^{{\prime }} ) = \sum\limits_{j = a}^{n} {{}_{n}C_{j} } p^{{{\prime }j}} (1 - p^{{\prime }} )^{n - j} $$
(15a)
$$ B_{2} (a;n,p) = \sum\limits_{j = a}^{n} {{}_{n}C_{j} } p^{j} (1 - p)^{n - j} $$
(15b)
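The equivalence of Eqs. (12) and (15) can be verified numerically. The following sketch uses hypothetical function names and illustrative inputs; it finds a by scanning j upward, which works because the terminal stock price increases with j when u > d.

```python
from math import comb


def call_eq12(S, X, u, d, r, n):
    """n-period call value from Eq. (12)."""
    p = (r - d) / (u - d)
    total = sum(comb(n, j) * p**j * (1 - p)**(n - j)
                * max(u**j * d**(n - j) * S - X, 0.0)
                for j in range(n + 1))
    return total / r**n


def min_up_moves(S, X, u, d, n):
    """Smallest j with u^j d^(n-j) S > X, or n + 1 if the call never pays."""
    for j in range(n + 1):
        if u**j * d**(n - j) * S > X:
            return j
    return n + 1


def upper_tail(a, n, q):
    """Binomial sum over j = a, ..., n, as in Eqs. (15a) and (15b)."""
    return sum(comb(n, j) * q**j * (1 - q)**(n - j) for j in range(a, n + 1))


def call_eq15(S, X, u, d, r, n):
    """n-period call value from Eq. (15)."""
    p = (r - d) / (u - d)
    p_prime = (u / r) * p
    a = min_up_moves(S, X, u, d, n)
    return S * upper_tail(a, n, p_prime) - X / r**n * upper_tail(a, n, p)


# Illustrative inputs (assumed for this sketch).
S, X, u, d, r, n = 30.0, 32.0, 1.1, 0.9, 1.03, 4
a = min_up_moves(S, X, u, d, n)
value_12 = call_eq12(S, X, u, d, r, n)
value_15 = call_eq15(S, X, u, d, r, n)
print(a, value_12, value_15)
```

Note that both sums in Eq. (15) run from j = a to n, so they are upper-tail binomial probabilities rather than the usual lower-tail cumulative distribution.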

2.2 Black–Scholes model

Black and Scholes (1973) and Merton (1973) used stochastic Itô calculus to derive an option pricing model. However, under the assumption of risk neutrality, Lee et al. (2013b) proposed a lognormal distribution approach to derive the Black–Scholes model. We discuss the lognormal distribution approach in detail in Sect. 5.

The most famous option pricing model is the Black–Scholes option pricing model which can be used to price European options.

The Black–Scholes model for a European call option is:

$$ C = SN(d_{1} ) - Xe^{ - rT} N(d_{2} ) $$
(16)

where \( d_{1} = \frac{{\ln ({S/X}) + \left(r + \frac{{\sigma^{2} }}{2}\right)T}}{\sigma \sqrt{T} } \); \( d_{2} = d_{1} - \sigma \sqrt{T} \); C = call price; S = stock price; X = exercise price; r = risk-free interest rate (continuously compounded); T = time to maturity of the option in years; N(·) = cumulative standard normal distribution function; σ = stock volatility.

This model prices a call option; the price of the corresponding put option can be derived from the following put-call parity:

$$ P = C + Xe^{ - rT} - S $$
(17)

where P = put price, other notations are identical to those defined in Eq. (16).
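Equations (16) and (17) translate directly into code. The sketch below is a minimal illustration with hypothetical function names and assumed inputs; it relies on `statistics.NormalDist` from the Python standard library for N(·).

```python
from math import exp, log, sqrt
from statistics import NormalDist


def black_scholes_call(S, X, r, sigma, T):
    """European call value from Eq. (16); r is continuously compounded."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf  # cumulative standard normal distribution
    return S * N(d1) - X * exp(-r * T) * N(d2)


def put_from_parity(C, S, X, r, T):
    """European put value from the put-call parity in Eq. (17)."""
    return C + X * exp(-r * T) - S


# Illustrative inputs (hypothetical, not from the paper).
S, X, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
call = black_scholes_call(S, X, r, sigma, T)
put = put_from_parity(call, S, X, r, T)
print(round(call, 4), round(put, 4))
```

Note that r here is a continuously compounded rate, whereas in Sect. 2.1 the symbol r denotes one plus the per-period interest rate.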

In the following section, we will show the relationship between binomial and Black–Scholes option pricing models.

3 Relationship between binomial OPM and Black–Scholes OPM

Comparing the parameters of the two models, we find that the binomial model has an increase factor (u), a decrease factor (d), and a number of periods n that the Black–Scholes model does not have, while the Black–Scholes model has two parameters, σ and T, that do not appear in the binomial model. The parameters of the two models are linked and can be translated from one to the other. The derivations are as follows (Hull 2014).

As discussed in Sect. 2, in the binomial OPM setting the stock price S goes up with probability p to reach uS and goes down with probability 1 − p to reach dS. The expected stock price is:

$$ puS+(1-p)dS. $$

Assume each step is of length Δt, where \( \Delta t = \frac{T}{n} \), so that Δt → 0 as n → ∞. The expected return on the stock (in the risk-neutral world implied by p) is the risk-free rate r (continuously compounded). Therefore, over this small period of time, the expected price should be \( e^{r\Delta t} S \).

Therefore, the following equation holds.

$$ puS + (1 - p)dS = e^{{r\Delta t}} S $$
(18)

The volatility of the stock price is defined as σ; therefore, in the short time period Δt, the standard deviation of the stock return is \( \sigma \sqrt{\Delta t} \), i.e. the variance of the return is \( \sigma^{2}\Delta t \).

The variance of the stock price return is (Footnote 4):

$$ pu^{ 2} + ( 1 - p)d^{ 2} - (pu + ( 1 - p)d)^{ 2} . $$

Therefore, the following equation holds.

$$ pu^{2} + (1 - p)d^{2} - (pu + (1 - p)d)^{2} = \sigma^{2}\Delta t $$
(19)

Combining the two equations, we get

$$ e^{{r\Delta t}} (u + d) - ud - e^{{2r\Delta t}} = \sigma^{2}\Delta t $$
(20)
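The step from Eqs. (18) and (19) to Eq. (20) can be spelled out as follows (a worked sketch using only the symbols above). Solving Eq. (18) for p and substituting into the two terms of Eq. (19) gives

$$ \begin{aligned} p & = \frac{{e^{{r\Delta t}} - d}}{u - d}, \\ pu^{2} + (1 - p)d^{2} & = p(u^{2} - d^{2} ) + d^{2} = (e^{{r\Delta t}} - d)(u + d) + d^{2} = e^{{r\Delta t}} (u + d) - ud, \\ pu + (1 - p)d & = e^{{r\Delta t}} . \\ \end{aligned} $$

Subtracting the square of the last line from the second line yields the left-hand side of Eq. (20).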

Ignoring terms in \( \Delta t^{2} \) and higher powers of Δt, one solution of this equation is (Footnote 5):

$$ \begin{aligned} u & = e^{{\sigma \sqrt{\Delta t} }} \\ d & = e^{{ - \sigma \sqrt{\Delta t} }} \\ \end{aligned} $$
(21)

These are the values of u and d proposed by Cox et al. (1979). To summarize the main relations between the parameters of the two alternative OPMs, we have the following important equations to link the two models.

$$ \begin{aligned}\Delta t & = \frac{T}{n} \\ R & = e^{{r\Delta t}} \\ u & = e^{{\sigma \sqrt{\Delta t} }} \\ d & = e^{{ - \sigma \sqrt{\Delta t} }} \\ \end{aligned} $$
(22)
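The links in Eq. (22) can be checked against Eqs. (18) and (19). The sketch below is illustrative: the function name `crr_parameters` and the inputs (σ = 0.2, r = 0.03, T = 1, n = 4) are assumptions made for this example. The mean condition holds exactly by construction of p, while the variance condition holds only up to terms of higher order in Δt.

```python
from math import exp, sqrt


def crr_parameters(sigma, r, T, n):
    """Translate Black-Scholes inputs into binomial inputs via Eq. (22)."""
    dt = T / n
    R = exp(r * dt)
    u = exp(sigma * sqrt(dt))
    d = exp(-sigma * sqrt(dt))
    p = (R - d) / (u - d)  # implied by Eq. (18)
    return dt, R, u, d, p


# Illustrative inputs (assumed for this sketch).
sigma, r, T, n = 0.2, 0.03, 1.0, 4
dt, R, u, d, p = crr_parameters(sigma, r, T, n)

mean = p * u + (1 - p) * d                       # left side of Eq. (18), per unit S
variance = p * u**2 + (1 - p) * d**2 - mean**2   # left side of Eq. (19)
print(u, d, p)
print(mean, R)
print(variance, sigma**2 * dt)
```

With these inputs the gap between the two variance figures is small but not zero, which reflects the neglected higher-order terms in Δt.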

If n gets very large, the binomial OPM value will get close to the Black–Scholes OPM value. Benninga and Czaczkes (2000) demonstrated that the binomial value will be close to Black–Scholes when the parameter n exceeds 500.

There are two alternative methods to show how binomial OPM can be converted to the Black–Scholes OPM. These two methods are:

  1. Theoretical methods proposed by Cox et al. (1979) and Rendleman and Bartter (1979). Lee and Lin (2010) have shown how these two different methods are related. Cox et al. (1979) used the Lyapunov condition to show how the binomial OPM reduces to the Black–Scholes OPM. Alternatively, Rendleman and Bartter (1979) used the limiting relationship between the binomial and normal distributions to show how the binomial OPM can be converted to the Black–Scholes OPM. Understanding this approach does not require advanced probability theory, as we will point out in the next section.

  2. The Excel approach proposed by Lee (2001). Lee used an Excel program to show that the binomial model value approaches the Black–Scholes model value when n approaches 500. This approach is similar to the concept and theory used by Rendleman and Bartter (1979).

Here we demonstrate how to use the binomialBS_OPM.xls Excel file proposed by Lee (2001) to create the decision trees for the call option price as an illustrative example (Footnote 6). We assume the stock price is 30, the strike price is 32, the increase factor (u) is 1.1, and the decrease factor (d) is 0.9. We construct a four-period binomial option pricing model with a risk-free rate of 3%. The decision tree of the call option produced by the binomial model is shown in Fig. 3.

Fig. 3 Call option pricing decision tree

31 calculations were required to create a decision tree that has four periods. Therefore, the Excel file did 31 × 3 = 93 calculations to create the three decision trees for stock price, call option value, and put option value.
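The count of 31 can be reproduced with a small sketch, assuming that it refers to the nodes of a non-recombining four-period tree (1 + 2 + 4 + 8 + 16) and that the 3 % rate enters as r = 1.03. The function name `call_tree` is hypothetical, and the code is an independent illustration rather than a description of the Excel file.

```python
def call_tree(S, X, u, d, r, n):
    """Return call values level by level for a non-recombining n-period tree.

    Level k is a list with 2**k entries, one per path of up/down moves.
    """
    p = (r - d) / (u - d)
    prices = [[S]]
    for _ in range(n):
        # Each node spawns an up child (index 2i) and a down child (index 2i + 1).
        prices.append([s * f for s in prices[-1] for f in (u, d)])
    levels = [[max(s - X, 0.0) for s in prices[-1]]]  # payoffs at expiration
    for _ in range(n):
        nxt = levels[0]
        # Discounted expectation of the two children, as in Eq. (7).
        levels.insert(0, [(p * nxt[2 * i] + (1 - p) * nxt[2 * i + 1]) / r
                          for i in range(len(nxt) // 2)])
    return levels


levels = call_tree(S=30.0, X=32.0, u=1.1, d=0.9, r=1.03, n=4)
nodes = sum(len(level) for level in levels)
print(nodes, 3 * nodes, levels[0][0])
```

The first two printed numbers are the per-tree and three-tree node counts; the third is the period-0 call value implied by these inputs.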

We also use the Excel program to calculate the binomial and Black–Scholes call values illustrated previously. If we set T = 1 and σ = 0.2, the increase factor (u) and decrease factor (d) are adjusted to u = 1.105 and d = 0.905. Figure 4 shows the decision tree approximation of the Black–Scholes call pricing model under these parameters.

Fig. 4 Decision tree approximation of Black–Scholes call pricing

Notice that in Fig. 4 the binomial OPM value does not agree exactly with the Black–Scholes OPM value, but the two are close. The binomial OPM value gets very close to the Black–Scholes OPM value once the number of periods n becomes very large. Benninga and Czaczkes (2000) demonstrated that the binomial value will be close to Black–Scholes when the number of periods n is larger than 500. Here, we use a Johnson and Johnson call option as a real example for a practical illustration. The parameters are as follows: S = 93.45, X = 92.5, T = 0.3589, r = 2.75%, and σ = 13.01%, where σ is estimated from JNJ stock’s daily returns. The observed market price of the call is C = 3.65.

For Black–Scholes OPM, we can get:

$$ N(d_{1} ) = 0.6175,\quad N(d_{2} ) = 0.5875 $$

The theoretical value for the call option from Black–Scholes model should be:

$$ C = SN(d_{1} ) - e^{ - rT} XN(d_{2} ) = 3.90 $$

Table 1 shows how the binomial OPM value converges to the Black–Scholes OPM as n gets larger, with increase factor (u) and decrease factor (d) adjusted accordingly.

Table 1 Binomial OPM estimates of different numbers of periods

The previous examples show that the Excel program can be used to demonstrate the binomial option pricing model, and that the binomial value converges to the Black–Scholes value as the number of periods approaches infinity.
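The same convergence can be examined outside Excel. The sketch below is an independent illustration with hypothetical function names; it prices the Johnson and Johnson call quoted above with Eq. (16) and with a recombining binomial lattice whose parameters follow Eq. (22), for a few choices of n picked here for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def black_scholes_call(S, X, r, sigma, T):
    """Eq. (16); r is the continuously compounded risk-free rate."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf
    return S * N(d1) - X * exp(-r * T) * N(d2)


def binomial_call(S, X, r, sigma, T, n):
    """n-period binomial call using the parameter links in Eq. (22)."""
    dt = T / n
    R = exp(r * dt)
    u = exp(sigma * sqrt(dt))
    d = exp(-sigma * sqrt(dt))
    p = (R - d) / (u - d)
    # Payoffs at expiration, indexed by the number of up moves j.
    values = [max(u**j * d**(n - j) * S - X, 0.0) for j in range(n + 1)]
    # Roll back one period at a time with the one-period formula, Eq. (7).
    for step in range(n, 0, -1):
        values = [(p * values[j + 1] + (1 - p) * values[j]) / R
                  for j in range(step)]
    return values[0]


# Johnson and Johnson inputs quoted in the text.
S, X, T, r, sigma = 93.45, 92.5, 0.3589, 0.0275, 0.1301
bs = black_scholes_call(S, X, r, sigma, T)
for n in (4, 20, 100, 500):
    print(n, round(binomial_call(S, X, r, sigma, T, n), 4), round(bs, 4))
```

The lattice is rolled back node by node instead of summing Eq. (12) directly, a design choice that avoids very large binomial coefficients when n is in the hundreds.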

4 Compare Cox et al. and Rendleman and Bartter methods to derive OPM

Both Cox et al. (1979) and Rendleman and Bartter (1979) employ Eq. (13) to show how the binomial model reduces to the Black–Scholes model when the number of periods n approaches infinity. In this section, we briefly discuss the methods employed by these two papers.

4.1 Cox et al. method

Cox et al. (1979) used the following binomial option pricing model to derive the Black–Scholes model.

$$ \begin{aligned} C & = S\left[ {\sum\limits_{j = a}^{n} {\frac{n!}{j!(n - j)!}p^{j} (1 - p)^{n - j} \frac{{u^{j} d^{n - j} }}{{\hat{r}^{n} }}} } \right] - X\hat{r}^{ - n} \left[ {\sum\limits_{j = a}^{n} {\frac{n!}{j!(n - j)!}p^{j} (1 - p)^{n - j} } } \right] \\ & = SB_{1} (a;n,p^{{\prime }} ) - X\hat{r}^{ - n} B_{2} (a;n,p) \\ \end{aligned} $$
(23)

where

$$ \begin{aligned} B_{1} (a;n,p^{{\prime }} ) & = \sum\limits_{j = a}^{n} {{}_{n}C_{j} } p^{{{\prime }j}} (1 - p' )^{n - j} \\ B_{2} (a;n,p) & = \sum\limits_{j = a}^{n} {{}_{n}C_{j} } p^{j} (1 - p)^{n - j} \\ p' & \equiv ({u/{\hat{r}}})p\;\;\;{\text{and}}\;\;\;1 - p' \equiv ({d/{\hat{r}}})(1 - p) \\ \hat{r} & = 1+ {\text{interest rate over one period}} \\ \end{aligned} $$

a is the minimum number of upward stock movements necessary for the option to terminate in the money. In other words, a is the minimum integer value of j for which \( u^{j} d^{n - j} S - X > 0 \) holds.

In order to show the limiting result that the binomial option pricing formula converges to the continuous version of the Black–Scholes option pricing formula, we assume that h represents the elapsed time between successive stock price changes. Thus, if t is the fixed length of calendar time to expiration, and n is the total number of periods each with length h, then \( h = \frac{t}{n} \). As the trading frequency increases, h will get closer to zero. Letting h → 0 is equivalent to letting n → ∞.

\( \hat{r} \) is one plus the interest rate over a trading period of length h. We not only want \( \hat{r} \) to depend on n, but want it to depend on n in a particular way, so that as n changes, the total return \( \hat{r}^{n} \) remains the same. We denote by r one plus the rate over a fixed unit of calendar time; then over time t, the total return should be \( r^{t} \). We then have the following equation:

$$ \hat{r}^{n} = r^{t} $$
(24)

for any choice of n. Therefore, \( \hat{r} = r^{\frac{t}{n}} \). Let \( S^{*} \) be the stock price at the end of the nth period with the initial price S. If there are j upward moves, then the generalized expression should be:

$$ \log ({{S^{*} }/S}) = j\log u + (n - j)\log d = j\log ({u/d}) + n\log d $$
(25)

Therefore, j is the realization of a binomial random variable with probability of a success being p. We have the expectation of \( \log ({{S^{*} }/S}) \) as

$$ E(\log ({{S^{*} }/S})) = [p\log ({u/d}) + \log d]n \equiv \tilde{\mu }n $$
(26)

and its variance

$$ {\text{var}}(\log ({{S^{*} }/S})) = [\log ({u/d})]^{2} p(1 - p)n \equiv \tilde{\sigma }^{2} n $$
(27)
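Equations (26) and (27) can be confirmed by brute force. The following sketch, with a hypothetical function name and assumed inputs, computes the mean and variance of \( \log ({{S^{*} }/S}) \) directly from the binomial distribution of j in Eq. (25) and compares them with the closed forms.

```python
from math import comb, log


def log_return_moments(u, d, p, n):
    """Mean and variance of log(S*/S) from the binomial law of j, Eq. (25)."""
    pmf = [comb(n, j) * p**j * (1 - p)**(n - j) for j in range(n + 1)]
    x = [j * log(u) + (n - j) * log(d) for j in range(n + 1)]
    mean = sum(w * v for w, v in zip(pmf, x))
    var = sum(w * (v - mean)**2 for w, v in zip(pmf, x))
    return mean, var


# Illustrative inputs (assumed for this sketch).
u, d, p, n = 1.1, 0.9, 0.65, 12
mean, var = log_return_moments(u, d, p, n)
mu_tilde = p * log(u / d) + log(d)             # per-period mean, Eq. (26)
sigma_tilde_sq = log(u / d)**2 * p * (1 - p)   # per-period variance, Eq. (27)
print(mean, mu_tilde * n)
print(var, sigma_tilde_sq * n)
```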

We divide the original time period t into many shorter subperiods of length h so that t = nh. Our procedure calls for making n larger while keeping the original time period t fixed. As n → ∞, we would at least like the mean and the variance of the continuously compounded rate of return of the assumed stock price movement to coincide with those of the actual stock price. Label the actual empirical values of \( \tilde{\mu }n \) and \( \tilde{\sigma }^{2} n \) as μt and \( \sigma^{2} t \), respectively. Then we want to choose u, d, and p so that \( \tilde{\mu }n \to \mu t \) and \( \tilde{\sigma }^{2} n \to \sigma^{2} t \) as n → ∞.

A little algebra shows that we can accomplish this by letting

$$ \begin{aligned} u & = e^{{\sigma \sqrt{\frac{t}{n}} }} ,\quad d = e^{{ - \sigma \sqrt{\frac{t}{n}} }} \\ p & = \frac{1}{2} + \frac{1}{2}\left( {\frac{\mu }{\sigma }} \right)\sqrt{\frac{t}{n}} \\ \end{aligned} $$
(28)
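The algebra can be sketched as follows. With the choices in Eq. (28), \( \log ({u/d}) = 2\sigma \sqrt{\frac{t}{n}} \) and \( \log d = - \sigma \sqrt{\frac{t}{n}} \), so that Eqs. (26) and (27) give

$$ \begin{aligned} \tilde{\mu }n & = [p\log ({u/d}) + \log d]n = (2p - 1)\sigma \sqrt{\frac{t}{n}} \,n = \mu t, \\ \tilde{\sigma }^{2} n & = [\log ({u/d})]^{2} p(1 - p)n = 4\sigma^{2} \frac{t}{n}\left[ {\frac{1}{4} - \frac{1}{4}\left( {\frac{\mu }{\sigma }} \right)^{2} \frac{t}{n}} \right]n = \sigma^{2} t - \frac{{\mu^{2} t^{2} }}{n} \to \sigma^{2} t. \\ \end{aligned} $$

The mean therefore matches exactly for every n, while the variance matches in the limit n → ∞.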

At this point, in order to proceed further, we need the Lyapunov condition of the central limit theorem, stated as follows (Ash and Doleans-Dade 1999; Billingsley 2008).

Lyapunov’s Condition

Suppose \( X_{1}, X_{2}, \ldots \) are independent and uniformly bounded with \( E(X_{i}) = 0 \), \( Y_{n} = X_{1} + \cdots + X_{n} \), and \( s_{n}^{2} = E(Y_{n}^{2}) = {\text{Var}}(Y_{n}) \).

If \( \lim_{n \to \infty } \sum\nolimits_{k = 1}^{n} {\frac{1}{{s_{n}^{2 + \delta } }}E\left| {X_{k} } \right|^{2 + \delta } } = 0 \) for some δ > 0, then the distribution of \( \frac{{Y_{n} }}{{s_{n} }} \) converges to the standard normal distribution as n → ∞.

Theorem

If

$$ \frac{{p\left| {\log u - \tilde{\mu }} \right|^{3} + (1 - p)\left| {\log d - \tilde{\mu }} \right|^{3} }}{{\tilde{\sigma }^{3} \sqrt{n} }} \to 0\;\; {\text{ as }}\;\; n \to \infty $$
(29)

then

$$ \Pr \left[ {\frac{{\log \left( {\frac{{S^{*} }}{S}} \right) - \tilde{\mu }n}}{{\tilde{\sigma }\sqrt{n} }} \le z} \right] \to N(z) $$
(30)

where N(z) is the cumulative standard normal distribution function.

Proof

Since

$$ p\left| {\log u - \tilde{\mu }} \right|^{3} = p\left| {\log u - p\log \frac{u}{d} - \log d} \right|^{3} = p(1 - p)^{3} \left| {\log \frac{u}{d}} \right|^{3} $$

and

$$ (1 - p)\left| {\log d - \tilde{\mu }} \right|^{3} = (1 - p)\left| {\log d - p\log \frac{u}{d} - \log d} \right|^{3} = p^{3} (1 - p)\left| {\log \frac{u}{d}} \right|^{3},$$

we have

$$ p\left| {\log u - \tilde{\mu }} \right|^{3} + (1 - p)\left| {\log d - \tilde{\mu }} \right|^{3} = p(1 - p)[(1 - p)^{2} + p^{2} ]\left| {\log \frac{u}{d}} \right|^{3}. $$

Thus

$$ \begin{aligned} & \frac{{p\left| {\log u - \tilde{\mu }} \right|^{3} + (1 - p)\left| {\log d - \tilde{\mu }} \right|^{3} }}{{\tilde{\sigma }^{3} \sqrt{n} }} \hfill \\ & \quad = \frac{{p(1 - p)[(1 - p)^{2} + p^{2} ]\left| {\log \frac{u}{d}} \right|^{3} }}{{\left( {\sqrt{p(1 - p)} \log \left( \frac{u}{d} \right)} \right)^{3} \sqrt{n} }} \hfill \\ & \quad= \frac{{(1 - p)^{2} + p^{2} }}{{\sqrt{np(1 - p)} }} \hfill \\ \end{aligned} $$

Recall that \( p = \frac{{\hat{r} - d}}{u - d} \) with \( \hat{r} = r^{\frac{t}{n}} \), \( u = e^{{\sigma \sqrt{\frac{t}{n}} }} \), \( d = e^{{ - \sigma \sqrt{\frac{t}{n}} }} \), we have:

$$ \begin{aligned} p & = \frac{{e^{{\frac{t}{n}\log r}} - e^{{ - \sigma \sqrt{\frac{t}{n}} }} }}{{e^{{\sigma \sqrt{\frac{t}{n}} }} - e^{{ - \sigma \sqrt{\frac{t}{n}} }} }} \\ & = \frac{{1 + \frac{t}{n}\log r - \left[ {1 - \sigma \sqrt{\frac{t}{n}} + \frac{1}{2}\sigma^{2} \frac{t}{n}} \right] + O\left( {n^{{ - \frac{3}{2}}} } \right)}}{{1 + \sigma \sqrt{\frac{t}{n}} - \left[ {1 - \sigma \sqrt{\frac{t}{n}} } \right] + O\left( {n^{{ - \frac{3}{2}}} } \right)}} \\ & = \frac{1}{2} + \frac{1}{2}\left[ {\frac{{\log r - \frac{1}{2}\sigma^{2} }}{\sigma }} \right]\sqrt{\frac{t}{n}} + O(n^{ - 1} ) \\ \end{aligned} $$

Therefore, \( \frac{{(1 - p)^{2} +\,p^{2} }}{{\sqrt{np(1 - p)} }} \to 0\,{\text{ as }}\,n \to \infty \).
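A numerical check makes the limit concrete. In the sketch below (function names and inputs are assumptions for illustration), `p_exact` evaluates p from its definition, `p_expansion` evaluates the leading terms of the expansion above, and `lyapunov_ratio` evaluates the simplified ratio, which should shrink roughly like \( 1/\sqrt{n} \).

```python
from math import exp, log, sqrt


def p_exact(r, sigma, t, n):
    """p = (r_hat - d) / (u - d) with r_hat = r**(t/n) and u, d as in Eq. (28)."""
    h = t / n
    u, d = exp(sigma * sqrt(h)), exp(-sigma * sqrt(h))
    return (r**h - d) / (u - d)


def p_expansion(r, sigma, t, n):
    """Leading terms of the expansion of p, dropping the O(1/n) remainder."""
    return 0.5 + 0.5 * (log(r) - 0.5 * sigma**2) / sigma * sqrt(t / n)


def lyapunov_ratio(p, n):
    """Left-hand side of Eq. (29) after simplification."""
    return ((1 - p)**2 + p**2) / sqrt(n * p * (1 - p))


# Illustrative inputs (assumed): r is one plus the rate per unit of calendar time.
r, sigma, t = 1.05, 0.2, 1.0
for n in (10, 100, 1000, 10000):
    p = p_exact(r, sigma, t, n)
    print(n, p, p_expansion(r, sigma, t, n), lyapunov_ratio(p, n))
```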

Hence the condition for the theorem to hold as stated in Eq. (29) is satisfied. It is noted that the condition (29) is a special case of Lyapunov’s condition where δ = 1. Next, we will show that the binomial option pricing model as given in Eq. (23) will indeed coincide with the Black–Scholes option pricing formula. We can see that there are apparent similarities in Eq. (23). In order to show the limiting result, we need to show that:

$$ {\text{As}}\;\;n \to \infty ,\quad B_{1} (a;n,p^{{\prime }} ) \to N(x)\quad {\text{and}}\quad B_{2} (a;n,p) \to N(x - \sigma \sqrt{t} ) $$

In this section we show only the second convergence result, since the same argument applies to the first. From the definition of \( B_{2} (a;n,p) \), it is clear that

$$ \begin{aligned} 1 - B_{2} (a;n,p) & = \Pr (j \le a - 1) \\ \, & = \Pr \left( {\frac{j - np}{{\sqrt{np(1 - p)} }} \le \frac{a - 1 - np}{{\sqrt{np(1 - p)} }}} \right) \\ \end{aligned} $$
(31)

Recall that we consider a stock to move from S to uS with probability p and dS with probability 1 − p. The mean and variance of the continuously compounded rate of return for this stock are \( \tilde{\mu }_{p} \) and \( \tilde{\sigma }_{p}^{2} \) where

$$ \tilde{\mu }_{p} = p\log \left( \frac{u}{d} \right) + \log d\quad {\text{and}}\quad \tilde{\sigma }_{p}^{2} = \left[ {\log \left( \frac{u}{d} \right)} \right]^{2} p(1 - p) $$
(32)

From Eq. (25) and the definitions for \( \tilde{\mu }_{p} \) and \( \tilde{\sigma }_{p}^{2} \), we have

$$ \frac{j - np}{{\sqrt{np(1 - p)} }} = \frac{{\log \left( {\frac{{S^{*} }}{S}} \right) - \tilde{\mu }_{p} n}}{{\tilde{\sigma }_{p} \sqrt{n} }} $$
(33)

Also, from the binomial option pricing formula we have

$$ a - 1 = \frac{{\log \left( {\frac{X}{{Sd^{n} }}} \right)}}{{\log \left( {\frac{u}{d}} \right)}} - \varepsilon = \frac{{\log \frac{X}{S} - n\log d}}{{\log \left( {\frac{u}{d}} \right)}} - \varepsilon $$
(34)

where ɛ is a real number between 0 and 1.

From the definitions of \( \tilde{\mu }_{p} \) and \( \tilde{\sigma }_{p}^{2} \), it is easy to show that

$$ \frac{a - 1 - np}{{\sqrt{np(1 - p)} }} = \frac{{\log \left( \frac{X}{S} \right) - \tilde{\mu }_{p} n - \varepsilon \log \left( \frac{u}{d} \right)}}{{\tilde{\sigma }_{p} \sqrt{n} }} $$
(35)

Thus from Eq. (31) we have

$$ 1 - B_{2} (a;n,p) = \Pr \left( {\frac{{\log \frac{{S^{*} }}{S} - \tilde{\mu }_{p} n}}{{\tilde{\sigma }_{p} \sqrt{n} }} \le \frac{{\log \frac{X}{S} - \tilde{\mu }_{p} n - \varepsilon \log \left( \frac{u}{d} \right)}}{{\tilde{\sigma }_{p} \sqrt{n} }}} \right) $$
(36)

We have checked the condition given by Eq. (29) in order to apply the central limit theorem. In addition, we have to evaluate \( \tilde{\mu }_{p} n \), \( \tilde{\sigma }_{p}^{2} n \) and \( \log \left( \frac{u}{d} \right) \) as n → ∞. \( \tilde{\mu }_{p} n \to \left( {\log r - \frac{1}{2}\sigma^{2} } \right)t \), which can be derived from the property of the lognormal distribution that \( \log E({{S^{*} }/S}) = \mu_{p} t + \frac{1}{2}\sigma^{2} t \), and \( E({{S^{*} }/S}) = [pu + (1 - p)d]^{n} = \hat{r}^{n} = r^{t} \). It is also clear that \( n\tilde{\sigma }_{p}^{2} \to \sigma^{2} t \) and \( \log \left( \frac{u}{d} \right) \to 0 \).

Hence, in order to evaluate the asymptotic probability in Eq. (30), we have

$$ \frac{{\log \left( \frac{X}{S} \right) - \tilde{\mu }_{p} n - \varepsilon \log \left( \frac{u}{d} \right)}}{{\tilde{\sigma }_{p} \sqrt{n} }} \to z = \frac{{\log \left( \frac{X}{S} \right) - \left( {\log r - \frac{1}{2}\sigma^{2} } \right)t}}{\sigma \sqrt{t} } $$
(37)

Using the fact that 1 − N(z) = N(−z), we have, as n → ∞, \( B_{2} (a;n,p) \to N( - z) = N(x - \sigma \sqrt{t} ) \), where \( x = \frac{{\log \left( {\frac{S}{{Xr^{ - t} }}} \right)}}{\sigma \sqrt{t} } + \frac{1}{2}\sigma \sqrt{t} \). A similar argument holds for \( B_{1} (a;n,p^{{\prime }} ) \), which completes the proof that the binomial option pricing formula given in Eq. (23) includes the Black–Scholes option pricing formula as a limiting case.
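The limiting result can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' code: it takes the binomial formula to have the standard form \( SB_{1} (a;n,p^{\prime }) - Xr^{ - t} B_{2} (a;n,p) \) with \( p^{\prime } = (u/\hat{r})p \), treats r as one plus the interest rate per unit of time, and uses hypothetical function names and input values.

```python
import math

def norm_cdf(z):
    """Standard normal CDF N(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_tail(a, n, p):
    """Pr(j >= a) for j ~ Binomial(n, p); each term is built in log space."""
    if a > n:
        return 0.0
    total = 0.0
    for j in range(max(a, 0), n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                    + j * math.log(p) + (n - j) * math.log(1.0 - p))
        total += math.exp(log_term)
    return total

def binomial_call(S, X, r, sigma, t, n):
    """n-step binomial call price with u = exp(sigma*sqrt(t/n)) and d = 1/u."""
    dt = t / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    r_hat = r ** dt
    p = (r_hat - d) / (u - d)
    p_prime = (u / r_hat) * p
    # Smallest integer a with S * u**a * d**(n - a) > X.
    a = math.floor((math.log(X / S) - n * math.log(d)) / math.log(u / d)) + 1
    return S * binomial_tail(a, n, p_prime) - X * r ** (-t) * binomial_tail(a, n, p)

def black_scholes_call(S, X, r, sigma, t):
    """Limiting formula S N(x) - X r^(-t) N(x - sigma sqrt(t))."""
    x = math.log(S / (X * r ** (-t))) / (sigma * math.sqrt(t)) + 0.5 * sigma * math.sqrt(t)
    return S * norm_cdf(x) - X * r ** (-t) * norm_cdf(x - sigma * math.sqrt(t))

if __name__ == "__main__":
    S, X, r, sigma, t = 100.0, 100.0, 1.05, 0.2, 1.0  # illustrative values only
    for n in (10, 50, 500):
        print(n, binomial_call(S, X, r, sigma, t, n))
    print("limit", black_scholes_call(S, X, r, sigma, t))
```

With these inputs the binomial prices approach the limiting value as n grows, although the approach need not be monotone in n.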

Lyapunov’s condition requires that \( X_{1} ,X_{2} , \ldots \) are independent and uniformly bounded with \( E(X_{i} ) = 0 \), \( Y_{n} = X_{1} + \cdots + X_{n} \), and \( s_{n}^{2} = E(Y_{n}^{2} ) = {\text{Var}}(Y_{n} ) \). However, rates of return are generally not independent over time and are not necessarily uniformly bounded as the condition requires. This is a potential limitation of the proof by Cox et al. (1979). We find that the derivation method proposed by Rendleman and Bartter (1979), which is discussed in the next section, is less restrictive than the proof discussed in this section.

4.2 Rendleman and Bartter method

In Rendleman and Bartter (1979), a stock price can either advance or decline during the next period. Let \( H_{T}^{ + } \) and \( H_{T}^{ - } \) represent the returns per dollar invested in the stock if the price rises (the + state) or falls (the − state), respectively, from time T − 1 to time T (maturity of the option). Let \( W_{T}^{ + } \) and \( W_{T}^{ - } \) denote the corresponding end-of-period values of the option.

Let R be the riskless interest rate. They showed that the price of the option can be written recursively as:

$$ W_{T - 1} = \frac{{W_{T}^{ + } \left(1 + R - H_{T}^{ - } \right) + W_{T}^{ - } \left(H_{T}^{ + } - 1 - R\right)}}{{(H_{T}^{ + } - H_{T}^{ - } )(1 + R)}} $$
(38)

Equation (38) can be applied at any time T − 1 to determine the price of the option as a function of its value at time T. Footnote 7 By using recursive substitution as discussed in Sect. 2.1, they derived the binomial option pricing model defined in Eq. (39).Footnote 8

$$ W_{0} = S_{0} B_{1} (a;T,\varphi ) - \frac{X}{{(1 + R)^{T} }}B_{2} (a;T,\phi ) $$
(39)

where pseudo probabilities φ and ϕ are defined as:

$$ \varphi = \frac{{(1 + R - H^{ - } )H^{ + } }}{{(1 + R)(H^{ + } - H^{ - } )}} $$
(40)
$$ \phi = \frac{{(1 + R - H^{ - } )}}{{(H^{ + } - H^{ - } )}} $$
(41)

Please note that ϕ and φ are identical to p and \( p^{{\prime }} \), which are defined as \( p = \frac{r - d}{u - d}\, \) and \( p^{{\prime }} \equiv ({u/r})p \) in Sect. 2.1.

Here a denotes the minimum integer value of i for which \( S_{0} (H^{ + } )^{i} (H^{ - } )^{T - i} > X \) is satisfied. This value is given byFootnote 9:

$$ a = 1 + INT\left[ {\frac{{\ln (X/S_{0}) - T\ln (H^{ - } )}}{{\ln H^{ + } - \ln H^{ - } }}} \right] $$
(42)

where INT[·] is the integer operator.

\( B_{1} (a;T,\varphi ) \) and \( B_{2} (a;T,\phi ) \) are cumulative binomial probabilities: each is the probability that the number of successes falls between a and T after T trials, where φ and ϕ are the respective probabilities of a success in one trial.
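The equivalence between repeated application of the one-period recursion and the closed-form binomial expression can be sketched in code. This is a hedged illustration only: the function names and input values are hypothetical, the returns \( H^{ + } \) and \( H^{ - } \) are assumed constant across periods, and INT[·] is implemented as a floor (with a clamped at zero), which is an assumption about the integer operator.

```python
import math

def one_step_value(w_up, w_down, h_up, h_down, R):
    """One-period recursion: option value at T-1 from its two possible values at T."""
    return (w_up * (1 + R - h_down) + w_down * (h_up - 1 - R)) / ((h_up - h_down) * (1 + R))

def price_by_recursion(S0, X, h_up, h_down, R, T):
    """Start from call payoffs at maturity and apply the recursion T times."""
    # values[i] is the option value at the node with i up-moves.
    values = [max(S0 * h_up ** i * h_down ** (T - i) - X, 0.0) for i in range(T + 1)]
    for _ in range(T):
        values = [one_step_value(values[i + 1], values[i], h_up, h_down, R)
                  for i in range(len(values) - 1)]
    return values[0]

def binomial_tail(a, T, prob):
    """Probability that the number of successes lies between a and T."""
    return sum(math.comb(T, i) * prob ** i * (1 - prob) ** (T - i)
               for i in range(max(a, 0), T + 1))

def price_by_closed_form(S0, X, h_up, h_down, R, T):
    """W0 = S0 * B1(a; T, varphi) - X / (1 + R)^T * B2(a; T, phi)."""
    varphi = (1 + R - h_down) * h_up / ((1 + R) * (h_up - h_down))
    phi = (1 + R - h_down) / (h_up - h_down)
    # INT[.] taken as floor here (an assumption).
    a = 1 + math.floor((math.log(X / S0) - T * math.log(h_down))
                       / (math.log(h_up) - math.log(h_down)))
    return S0 * binomial_tail(a, T, varphi) - X / (1 + R) ** T * binomial_tail(a, T, phi)

if __name__ == "__main__":
    args = (100.0, 100.0, 1.10, 0.95, 0.02, 5)  # S0, X, H+, H-, R, T (illustrative)
    print(price_by_recursion(*args), price_by_closed_form(*args))
```

Both routes should return the same price up to floating-point rounding, since the closed form is the recursion unrolled.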

In each period, the stock price rises with probability θ. We assume that the distribution of returns generated after T periods follows a log-binomial distribution. Then the mean of the stock price return is:

$$ \mu = T[h^{ + } \theta + h^{ - } (1 - \theta )] = T[(h^{ + } - h^{ - } )\theta + h^{ - } ] $$
(43)

And the variance of stock price return is:

$$ \sigma^{2} = T(h^{ + } - h^{ - } )^{2} \theta (1 - \theta ) $$
(44)

where: θ = probability that the price of the stock will rise

$$ h^{ + } = \ln (H^{ + } ) $$
(45)
$$ h^{ - } = \ln (H^{ - } ) $$
(46)

Please note that Cox et al. (1979) assume a log-binomial distribution with mean μt and variance \( \sigma^{2} t \). Apparently, Rendleman and Bartter (1979) assumed that t = 1. Therefore, the Black–Scholes model derived by them is not exactly identical to the original Black–Scholes model. The implied values of \( H^{ + } \) and \( H^{ - } \) are then determined by solving Eqs. (43)–(46), and are shown as Eqs. (47) and (48), respectively.

$$ H^{ + } = \exp \left( {{\mu/T} + ({\sigma/{\sqrt{T} }})\sqrt{\frac{(1 - \theta )}{\theta }} } \right) $$
(47)
$$ H^{ - } = \exp \left( {{\mu/T} - ({\sigma/{\sqrt{T} }})\sqrt{\frac{\theta }{(1 - \theta )}} } \right) $$
(48)
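As a quick consistency check, the implied returns can be substituted back into the mean and variance expressions of the log-binomial distribution. The sketch below uses hypothetical function names and illustrative parameter values; it is not taken from Rendleman and Bartter (1979).

```python
import math

def implied_returns(mu, sigma, theta, T):
    """H+ and H- implied by a log-binomial distribution with mean mu and variance sigma^2."""
    step = sigma / math.sqrt(T)
    h_up = math.exp(mu / T + step * math.sqrt((1 - theta) / theta))
    h_down = math.exp(mu / T - step * math.sqrt(theta / (1 - theta)))
    return h_up, h_down

def log_return_moments(h_up, h_down, theta, T):
    """Mean and variance of the T-period log return, given per-period returns."""
    hp, hm = math.log(h_up), math.log(h_down)
    mean = T * ((hp - hm) * theta + hm)
    variance = T * (hp - hm) ** 2 * theta * (1 - theta)
    return mean, variance

if __name__ == "__main__":
    mu, sigma, theta, T = 0.08, 0.25, 0.6, 50  # illustrative values only
    h_up, h_down = implied_returns(mu, sigma, theta, T)
    print(h_up, h_down, log_return_moments(h_up, h_down, theta, T))
```

The recovered mean and variance should equal μ and \( \sigma^{2} \) for any θ strictly between 0 and 1, which is what makes θ a free parameter in this construction.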

As T becomes larger, the cumulative binomial distribution function can be approximated by the cumulative normal distribution function. As T → ∞, the approximation becomes exact, and Eq. (39) evolves to Eq. (49).Footnote 10

$$ W_{0} \sim S_{0} N\left( {Z_{1} ,Z_{1}^{{\prime }} } \right) - \frac{X}{{(1 + R)^{T} }}N\left( {Z_{2} ,Z_{2}^{{\prime }} } \right) $$
(49)

In this equation, \( N(Z,Z^{{\prime }} ) \) is the probability that a random variable from a standard normal distribution will take on values between a lower limit Z and an upper limit \( Z^{{\prime }} \). According to the property of binomial probability distribution function, we have:

$$ Z_{1} = \frac{a - T\varphi }{{\sqrt{T\varphi (1 - \varphi )} }},\quad Z_{1}^{{\prime }} = \frac{T - T\varphi }{{\sqrt{T\varphi (1 - \varphi )} }} $$
$$ Z_{2} = \frac{a - T\phi }{{\sqrt{T\phi (1 - \phi )} }},\quad Z_{2}^{{\prime }} = \frac{T - T\phi }{{\sqrt{T\phi (1 - \phi )} }} $$

Thus, the price of the option when the two-state process evolves continuously is:

$$ W_{0} = S_{0} N\left( {\mathop {\lim }\limits_{T \to \infty } Z_{1} ,\mathop {\lim }\limits_{T \to \infty } Z_{1}^{{\prime }} } \right) - \frac{X}{{\mathop {\lim }\limits_{T \to \infty } (1 + R)^{T} }}N\left( {\mathop {\lim }\limits_{T \to \infty } Z_{2} ,\mathop {\lim }\limits_{T \to \infty } Z_{2}^{{\prime }} } \right) $$
(50)

Let \( 1 + R = e^{r/T} \) reflect the continuous compounding of interest; then \( \lim_{T \to \infty } (1 + R)^{T} = e^{r} \). It is obvious that \( \lim_{T \to \infty } Z_{1}^{\prime } = \lim_{T \to \infty } Z_{2}^{\prime } = \infty \); therefore, all that needs to be determined in the derivation of the two-state model in continuous time is \( \lim_{T \to \infty } Z_{1} \) and \( \lim_{T \to \infty } Z_{2} \). Substituting \( H^{ + } \) and \( H^{ - } \) from Eqs. (47) and (48) into Eq. (42), we have: \( a = 1 + INT\left[ {\frac{{\ln (X/S_{0}) - \mu + \sigma \sqrt{T} \sqrt{\frac{\theta }{(1 - \theta )}} }}{{\sigma /\sqrt{T\theta (1 - \theta )} }}} \right] \).

Then Eq. (51) follows.

$$ Z_{1} = \frac{a - T\varphi }{{\sqrt{T\varphi (1 - \varphi )} }} = \frac{{1 + INT\left[ {\frac{{\ln (X/S_{0}) - \mu + \sigma \sqrt{T} \sqrt{\frac{\theta }{1 - \theta }} }}{{\frac{\sigma }{{\sqrt{T\theta (1 - \theta )} }}}}} \right] - T\varphi }}{{\sqrt{T\varphi (1 - \varphi )} }} $$
(51)

In the limit, the term 1 + INT[·] simplifies to [·]. Therefore, \( Z_{1} \) can be restated as:

$$ Z_{1} \sim \frac{{\ln (X/S_{0}) - \mu }}{{\sigma \sqrt{\frac{\varphi (1 - \varphi )}{\theta (1 - \theta )}} }} + \frac{\sqrt{T} (\theta - \varphi )}{{\sqrt{\varphi (1 - \varphi )} }} $$
(52)

Substituting \( H^{ + } \) and \( H^{ - } \) from Eqs. (47) and (48) and \( 1 + R = e^{r/T} \) into Eq. (40), we have:

$$ \begin{aligned} \varphi & = \frac{{(1 + R - H^{ - } )H^{ + } }}{{(1 + R)(H^{ + } - H^{ - } )}} = \frac{{\left( {e^{{\tfrac{r}{T}}} - e^{{\tfrac{\mu }{T} - (\sigma/\sqrt{T})\sqrt{\frac{\theta }{1 - \theta }} }} } \right)e^{{\tfrac{\mu }{T} + (\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} }}{{e^{{\tfrac{r}{T}}} \left( {e^{{\tfrac{\mu }{T} + (\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{\tfrac{\mu }{T} - (\sigma/\sqrt{T}) \sqrt{\frac{\theta }{1 - \theta }} }} } \right)}} \\ & = \frac{{e^{{(\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{\tfrac{\mu }{T} - \tfrac{r}{T} + (\sigma/\sqrt{T})\left( {\sqrt{\frac{1 - \theta }{\theta }} - \sqrt{\frac{\theta }{1 - \theta }} } \right)}} }}{{e^{{(\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{ - (\sigma/\sqrt{T})\sqrt{\frac{\theta }{1 - \theta }} }} }} \\ \end{aligned} $$
(53)

Now, we expand in a Taylor seriesFootnote 11 in \( \frac{1}{\sqrt{T} } \), and obtain:

$$ \begin{aligned} \varphi & = \frac{{(\sigma /\sqrt{T} )\sqrt{\frac{1 - \theta }{\theta }} - \frac{\mu - r}{T} - (\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} - \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}}{{(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}} \\ & = \frac{{(\sigma /\sqrt{T} )\sqrt{\frac{\theta }{1 - \theta }} + o\left( {\frac{1}{\sqrt{T} }} \right)}}{{(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}} \\ \end{aligned} $$
(54)

where \( o(\frac{1}{\sqrt{T} }) \) denotes a function tending to zero more rapidly than \( \frac{1}{\sqrt{T} } \).

It can be shown that:

$$ \mathop {\lim }\limits_{T \to \infty } \varphi = \frac{{\sqrt{\frac{\theta }{1 - \theta }} }}{{\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} }} = \frac{{\sqrt{\frac{\theta }{1 - \theta }} }}{{\frac{1 - \theta + \theta }{{\sqrt{\theta (1 - \theta )} }}}} = \theta $$
(55)

Similarly, we have:

$$ \begin{aligned} \sqrt{T} (\theta - \varphi ) &= \sqrt{T} \left( {\theta - \frac{{e^{{(\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{\tfrac{\mu }{T} - \tfrac{r}{T} + (\sigma/\sqrt{T})\left( {\sqrt{\frac{1 - \theta }{\theta }} - \sqrt{\frac{\theta }{1 - \theta }} } \right)}} }}{{e^{{(\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{ - (\sigma/\sqrt{T})\sqrt{\frac{\theta }{1 - \theta }} }} }}} \right) \hfill \\ &= \frac{{\theta \sqrt{T} \left( {e^{{(\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{ - (\sigma/\sqrt{T})\sqrt{\frac{\theta }{1 - \theta }} }} } \right) - \sqrt{T} \left( {e^{{(\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{\tfrac{\mu }{T} - \tfrac{r}{T} + (\sigma/\sqrt{T})\left( {\sqrt{\frac{1 - \theta }{\theta }} - \sqrt{\frac{\theta }{1 - \theta }} } \right)}} } \right)}}{{e^{{(\sigma/\sqrt{T})\sqrt{\frac{1 - \theta }{\theta }} }} - e^{{ - (\sigma/\sqrt{T})\sqrt{\frac{\theta }{1 - \theta }} }} }} \hfill \\ \end{aligned} $$
(56)

Expanding again in a Taylor series in \( \frac{1}{\sqrt{T} } \), we obtain:

$$ \begin{aligned} \sqrt{T} (\theta - \varphi ) & = \frac{{\theta \sqrt{T} \left[ {(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + \frac{1}{2}(\sigma /\sqrt{T} )^{2} \left( {\frac{1 - \theta }{\theta } - \frac{\theta }{1 - \theta }} \right)} \right]}}{{(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}} \\ & \quad - \frac{{\sqrt{T} \left[ {(\sigma /\sqrt{T} )\sqrt{\frac{1 - \theta }{\theta }} + \frac{1}{2}(\sigma /\sqrt{T} )^{2} \left( {\frac{1 - \theta }{\theta }} \right) - \left( {\frac{\mu - r}{T} + (\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} - \sqrt{\frac{\theta }{1 - \theta }} } \right)} \right)} \right]}}{{(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}} \\ & \quad + \frac{{\frac{1}{2}\sqrt{T} (\sigma /\sqrt{T} )^{2} \left( {\sqrt{\frac{1 - \theta }{\theta }} - \sqrt{\frac{\theta }{1 - \theta }} } \right)^{2} + o\left( {\frac{1}{\sqrt{T} }} \right)}}{{(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}} \\ & = \frac{{\frac{1}{2}\theta \frac{{\sigma^{2} }}{\sqrt{T} }\left( {\frac{1 - \theta }{\theta } - \frac{\theta }{1 - \theta }} \right) + \frac{\mu - r}{\sqrt{T} } + \frac{1}{2}\frac{{\sigma^{2} }}{\sqrt{T} }\left( {\frac{\theta }{1 - \theta } - 2} \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}}{{(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}} \\ \end{aligned} $$
(57)

Therefore, we have:

$$ \begin{aligned} & \mathop {\lim }\limits_{T \to \infty } \sqrt{T} (\theta - \varphi ) \\ & \quad = \mathop {\lim }\limits_{T \to \infty } \frac{{\frac{1}{2}\theta \frac{{\sigma^{2} }}{\sqrt{T} }\left( {\frac{1 - \theta }{\theta } - \frac{\theta }{1 - \theta }} \right) + \frac{\mu - r}{\sqrt{T} } + \frac{1}{2}\frac{{\sigma^{2} }}{\sqrt{T} }\left( {\frac{\theta }{1 - \theta } - 2} \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}}{{(\sigma /\sqrt{T} )\left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right) + o\left( {\frac{1}{\sqrt{T} }} \right)}} \\ & \quad = \frac{{\frac{1}{2}\theta \sigma^{2} \left( {\frac{1 - \theta }{\theta } - \frac{\theta }{1 - \theta }} \right) + \mu - r + \frac{1}{2}\sigma^{2} \left( {\frac{\theta }{1 - \theta } - 2} \right)}}{{\sigma \left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right)}} \\ & \quad = \frac{{\mu - r - \frac{1}{2}\sigma^{2} }}{{\sigma \left( {\sqrt{\frac{1 - \theta }{\theta }} + \sqrt{\frac{\theta }{1 - \theta }} } \right)}} = \frac{{\left( {\mu - r - \frac{1}{2}\sigma^{2} } \right)\sqrt{\theta (1 - \theta )} }}{\sigma } \\ \end{aligned} $$
(58)
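The two limits, \( \varphi \to \theta \) and the limit of \( \sqrt{T} (\theta - \varphi ) \), can be checked numerically by evaluating φ for a large number of periods. The following sketch assumes \( 1 + R = e^{r/T} \) as in the text; the function names and parameter values are illustrative assumptions.

```python
import math

def varphi(mu, sigma, r, theta, T):
    """Pseudo probability varphi evaluated with the implied H+ and H- and 1 + R = exp(r/T)."""
    step = sigma / math.sqrt(T)
    h_up = math.exp(mu / T + step * math.sqrt((1 - theta) / theta))
    h_down = math.exp(mu / T - step * math.sqrt(theta / (1 - theta)))
    gross = math.exp(r / T)  # 1 + R
    return (gross - h_down) * h_up / (gross * (h_up - h_down))

def limit_scaled_gap(mu, sigma, r, theta):
    """Claimed limit of sqrt(T) * (theta - varphi) as T grows."""
    return (mu - r - 0.5 * sigma ** 2) * math.sqrt(theta * (1 - theta)) / sigma

if __name__ == "__main__":
    mu, sigma, r, theta = 0.10, 0.20, 0.05, 0.6  # illustrative values only
    for T in (10 ** 2, 10 ** 4, 10 ** 6):
        phi_T = varphi(mu, sigma, r, theta, T)
        print(T, phi_T, math.sqrt(T) * (theta - phi_T))
    print("limit", limit_scaled_gap(mu, sigma, r, theta))
```

For very large T the difference θ − φ becomes small enough that floating-point cancellation starts to matter, so the check is only meaningful over a moderate range of T.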

Now substitute \( \lim_{T \to \infty } \varphi \) for φ and \( \lim_{T \to \infty } \sqrt{T} (\theta - \varphi ) \) for \( \sqrt{T} (\theta - \varphi ) \) in Eq. (52). Then Eq. (59) follows.

$$ \begin{aligned} \mathop {\lim }\limits_{T \to \infty } Z_{1} & = \frac{{\ln (X/S_{0}) - \mu }}{{\sigma \sqrt{\frac{\theta (1 - \theta )}{\theta (1 - \theta )}} }} - \frac{{\sqrt{\theta (1 - \theta )} \left( {r - \mu + \frac{1}{2}\sigma^{2} } \right)}}{{\sigma \sqrt{\theta (1 - \theta )} }} \\ & = \frac{{\ln (X/S_0) - r - \frac{1}{2}\sigma^{2} }}{\sigma } \\ \end{aligned} $$
(59)

Similarly, we can also prove that

$$ \mathop {\lim }\limits_{T \to \infty } Z_{2} = \frac{{\ln (X/S_0) - r + \frac{1}{2}\sigma^{2} }}{\sigma } $$
(60)

According to the property of the normal distribution, N(Z, ∞) = N(−∞, −Z). Letting \( d_{1} = - \lim_{T \to \infty } Z_{1} \) and \( d_{2} = - \lim_{T \to \infty } Z_{2} \), the continuous-time version of the two-state model is obtained:

$$ \begin{aligned} W_{0} & = S_{0} N( - \infty ,d_{1} ) - Xe^{ - r} N( - \infty ,d_{2} ) \\ & = S_{0} N(d_{1} ) - Xe^{ - r} N(d_{2} ) \\ d_{1} & = \frac{{\ln \left( {\frac{{S_{0} }}{X}} \right) + r + \frac{1}{2}\sigma^{2} }}{\sigma } \\ d_{2} & = d_{1} - \sigma \\ \end{aligned} $$
(61)

Equation (61) is not exactly identical to the original Black–Scholes model because of the assumed log-binomial distribution with mean μ and variance \( \sigma^{2} \). If one instead assumes a log-binomial distribution with mean μt and variance \( \sigma^{2} t \), then \( d_{1} \) and \( d_{2} \) should be rewritten as:

$$ \begin{aligned} d_{1} & = \frac{{\ln \left(\frac{{S_{0} }}{X}\right) + \left( {r + \frac{1}{2}\sigma^{2} } \right)t}}{\sigma \sqrt{t} } \\ d_{2} & = d_{1} - \sigma \sqrt{t} \\ \end{aligned} $$

Lee and Lin (2010) have theoretically compared these two derivation methods. Based upon (a) mathematical and probability theory knowledge, (b) assumptions and (c) advantages and disadvantages, the comparison results are listed in Table 2. The main differences in assumptions between the two approaches are as follows. Under the Cox et al. (1979) method, the stock price’s increase factor and decrease factor are expressed as \( u = e^{{\sigma \sqrt{\frac{t}{n}} }} \) and \( d = e^{{ - \sigma \sqrt{\frac{t}{n}} }} \), respectively, which implies that the restriction ud = 1 holds. Under the Rendleman and Bartter (1979) method, the increase factor and decrease factor are \( H^{ + } = \exp \left( {{\mu/T} + (\sigma/\sqrt{T})\sqrt{\frac{(1 - \theta )}{\theta }} } \right) \) and \( H^{ - } = \exp \left( {{\mu/T} - (\sigma/\sqrt{T})\sqrt{\frac{\theta }{(1 - \theta )}} } \right) \), respectively. In the Rendleman and Bartter (1979) setting, time to maturity is set to 1. As the number of periods T → ∞, the expressions are similar to those of the Cox et al. (1979) method, but they still carry the adjustment factors \( \sqrt{\frac{(1 - \theta )}{\theta }} \) and \( \sqrt{\frac{\theta }{(1 - \theta )}} \) multiplying \( \frac{\sigma }{\sqrt{T} } \) in the exponents of the increase factor and decrease factor. Under the Rendleman and Bartter (1979) method, \( H^{ + } H^{ - } \ne 1 \) in general.

Table 2 Comparison between Rendleman and Bartter’s and Cox et al.’s approaches

Hence, as indicated in Table 2, the Cox et al. method is easy to follow if one has advanced knowledge of probability theory, but the assumptions on the model parameters limit its applications. On the other hand, the Rendleman and Bartter model is intuitive and does not require higher-level knowledge of probability theory. However, its derivation is more complicated and tedious. In “Appendix”, we show that the best fit between the binomial distribution and the normal distribution occurs when the binomial probability is 0.5.

5 Lognormal distribution approach to derive Black–Scholes model

The presentation and derivation of this section follow Garven (1986), Lee et al. (2013a, b).

To derive the option pricing model in terms of the lognormal distribution, we begin by assuming that the stock price follows a lognormal distribution (Lee et al. 2013b). Denote the current stock price by S and the stock price at the end of the tth period by \( S_{t} \). Then \( \frac{{S_{t} }}{{S_{t - 1} }} = \exp (K_{t} ) \) is a random variable with a lognormal distribution, where \( K_{t} \) is the rate of return in the tth period and is assumed to be a normally distributed random variable. Assume that \( K_{t} \) has the same expected value \( \mu_{k} \) and variance \( \sigma_{k}^{2} \) in each period. Then \( K_{1} + K_{2} + \cdots + K_{T} \) is a normal random variable with expected value \( T\mu_{k} \) and variance \( T\sigma_{k}^{2} \).

Property of lognormal distribution

If a continuous random variable y is normally distributed, then the continuous variable x defined in Eq. (62) is lognormally distributed.

$$ x = e^{y} $$
(62)

If the variable y has mean μ and variance \( \sigma^{2} \), then the mean \( \mu_{x} \) and variance \( \sigma_{x}^{2} \) of variable x are given, respectively, by:

$$ \mu_{x} = e^{{\mu \, + \,1 /2\sigma^{2} }} $$
(63)
$$ \sigma_{x}^{2} = e^{{2\mu + \sigma^{2} }} \left( {e^{{\sigma^{2} }} - 1} \right) $$
(64)
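These two moment formulas can be verified by integrating against the normal density of y. The sketch below uses a plain trapezoidal rule with a truncated integration range; the helper names, the truncation at twelve standard deviations, and the parameter values are assumptions made for illustration.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of a normal variable with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def integrate(func, lower, upper, steps=20000):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h

def lognormal_moments_numeric(mu, sigma):
    """Mean and variance of x = exp(y), computed by integrating over y."""
    lo, hi = mu - 12 * sigma, mu + 12 * sigma  # truncation of the real line (assumption)
    mean = integrate(lambda y: math.exp(y) * normal_pdf(y, mu, sigma), lo, hi)
    second = integrate(lambda y: math.exp(2 * y) * normal_pdf(y, mu, sigma), lo, hi)
    return mean, second - mean ** 2

def lognormal_moments_closed_form(mu, sigma):
    """Closed-form mean and variance of a lognormal variable."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    variance = math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)
    return mean, variance

if __name__ == "__main__":
    print(lognormal_moments_numeric(0.05, 0.3))
    print(lognormal_moments_closed_form(0.05, 0.3))
```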

Following the property, we then can define the expected value of \( \frac{{S_{T} }}{S} = \exp (K_{1} + K_{2} + \cdots + K_{T} ) \) as:

$$ E\left( {\frac{{S_{T} }}{S}} \right) = \exp \left( {T\mu_{k} + \frac{{T\sigma_{k}^{2} }}{2}} \right) $$
(65)

Under the assumption of a risk-neutral investor, the expected return \( E\left( {\frac{{S_{T} }}{S}} \right) \) is assumed to be \( e^{rT} \) (where r is the riskless rate of interest). In other words, the following equality holds.

$$ \mu_{k} = r - {{\sigma_{k}^{2} }/2} $$
(66)

The call option price C can be determined by discounting the expected value of the terminal option price by the risk-free rate.

$$ C = e^{ - rT} E[Max(S_{T} - X,0)] $$
(67)

Note that in Eq. (67):

$$ Max(S_{T} - X,0) =\left\{\begin{array}{ll} S_{T} - X & \quad {\text{for }}\,\,S_{T} > X \\ 0 & \quad{\rm otherwise }\\ \end{array}\right. $$

where T is the time of expiration and X is the exercise price.

Let \( x = \frac{{S_{T} }}{S} \) be lognormally distributed. Then we have:

$$ \begin{aligned} C & = e^{ - rT} \,E[Max(S_{T} - X,0)] \\ & = e^{ - rT} \int_{\frac{X}{S}}^{\infty } {S\left[ {x - \frac{X}{S}} \right]} \;g(x)dx \\ & = e^{ - rT} S\;\int_{\frac{X}{S}}^{\infty } {xg(x)dx} - e^{ - rT} S\frac{X}{S}\int_{\frac{X}{S}}^{\infty } {g(x)dx} \\ \end{aligned} $$
(68)

where g(x) is the probability density function of \( x = \frac{{S_{T} }}{S} \).

Here, we will use properties of the normal distribution, the lognormal distribution, and their mutual relations to derive the Black–Scholes model. We continue with the variable settings in Eq. (62), where y is normally distributed and x is lognormally distributed.

The PDF of x is:

$$ f(x) = \frac{1}{{x\sigma \sqrt{2\pi } }}\exp \left[ { - \frac{1}{{2\sigma^{2} }}(\ln x - \mu )^{2} } \right],\quad x > 0 $$
(69)

The PDF of y can be defined as:

$$ f(y) = \frac{1}{{\sigma \sqrt{2\pi } }}\exp \left[ { - \frac{1}{{2\sigma^{2} }}(y - \mu )^{2} } \right],\quad - \infty < y < \infty $$
(70)

By comparing the PDF of normal distribution and the PDF of lognormal distribution, we know that

$$ f(x) = \frac{f(y)}{x} $$
(71)

In addition, it can be shown thatFootnote 13

$$ dx = xdy $$
(72)

The CDF of lognormal distribution can be defined as

$$ \int_{a}^{\infty } {f(x)dx} $$
(73)

If we transform variable x in Eq. (73) into variable y, then the upper and lower limits of integration for the new variable are ∞ and ln a, respectively. Then the CDF for the lognormal distribution can be written in terms of the CDF for the normal distribution as

$$ \int_{a}^{\infty } {f(x)dx} = \int_{\ln\,a}^{\infty } {\left( {\frac{f(y)}{x}} \right)x\;dy} = \int_{\ln (a)}^{\infty } {f(y)dy} $$
(74)

We can rewrite Eq. (74) in standard normal form by a change of variable.

$$ \int_{a}^{\infty } {f(x)dx} = \int_{\ln (a)}^{\infty } {f(y)dy = N(d)} $$
(75)

where \( d = \frac{\mu - \ln (a)}{\sigma } \).

Similarly, the mean of a lognormal variable can be defined as:

$$ \int_{0}^{\infty } {xf(x)dx} = e^{{\mu \, + \,1/2\sigma^{2} }} $$
(76)

If the lower bound a is greater than 0, then the partial mean of x can be shown to beFootnote 14:

$$ \int_{a}^{\infty } {xf(x)dx} = \int_{\ln (a)}^{\infty } {f(y)e^{y} dy} = e^{{\mu + \sigma^{2} /2}} N(d) $$
(77)

where \( d = \frac{\mu - \ln (a)}{\sigma } + \sigma \).
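The tail probability and the partial mean can both be checked by numerical integration over y from ln a upward. This is a hedged sketch: the helper names, the truncation of the upper limit at twelve standard deviations above μ, and the sample values of a, μ and σ are assumptions made for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal CDF N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def integrate(func, lower, upper, steps=20000):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h

def tail_probability_closed_form(a, mu, sigma):
    """Integral of f(x) from a to infinity: N(d) with d = (mu - ln a) / sigma."""
    return norm_cdf((mu - math.log(a)) / sigma)

def partial_mean_closed_form(a, mu, sigma):
    """Integral of x f(x) from a to infinity: exp(mu + sigma^2/2) N(d), d shifted by sigma."""
    return math.exp(mu + 0.5 * sigma ** 2) * norm_cdf((mu - math.log(a)) / sigma + sigma)

def tail_probability_numeric(a, mu, sigma):
    return integrate(lambda y: normal_pdf(y, mu, sigma), math.log(a), mu + 12 * sigma)

def partial_mean_numeric(a, mu, sigma):
    return integrate(lambda y: math.exp(y) * normal_pdf(y, mu, sigma),
                     math.log(a), mu + 12 * sigma)

if __name__ == "__main__":
    a, mu, sigma = 1.1, 0.03, 0.3  # illustrative values only
    print(tail_probability_numeric(a, mu, sigma), tail_probability_closed_form(a, mu, sigma))
    print(partial_mean_numeric(a, mu, sigma), partial_mean_closed_form(a, mu, sigma))
```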

Substituting \( \mu = r - \sigma^{2} /2 \) and \( a = \frac{X}{S} \) into Eq. (75), we obtain:

$$ \int_{\frac{X}{S}}^{\infty } {g(x)dx = N(d_{2} )} $$
(78)

where \( d_{2} = \frac{{r - (1 /2)\sigma^{2} - \ln \frac{X}{S}}}{\sigma } \).

Similarly, substituting \( \mu = r - \sigma^{2} /2 \) and \( a = \frac{X}{S} \) into Eq. (77), we obtain:

$$ \int_{\frac{X}{S}}^{\infty } {xg(x)dx} = e^{r} N(d_{1} ) $$
(79)

where \( d_{1} = \frac{{r - (1 /2)\sigma^{2} - \ln \frac{X}{S}}}{\sigma } + \sigma \).

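Substituting Eqs. (78) and (79) into Eq. (68), and restoring a general time to maturity T (so that the log return has mean \( (r - \sigma^{2} /2)T \) and variance \( \sigma^{2} T \)), we obtain Eq. (80), which is identical to the Black–Scholes formula.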

$$ \begin{aligned} C &= SN(d_{1} ) - Xe^{ - rT} N(d_{2} ) \hfill \\ d_{1} &= \frac{{\ln \left( \frac{S}{X} \right) + \left( {r + \frac{1}{2}\sigma^{2} } \right)T}}{\sigma \sqrt{T} } \hfill \\ d_{2} &= \frac{{\ln \left( \frac{S}{X} \right) + \left( {r - \frac{1}{2}\sigma^{2} } \right)T}}{\sigma \sqrt{T} } = d_{1} - \sigma \sqrt{T} \hfill \\ \end{aligned} $$
(80)
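The closed-form price can be compared with a direct numerical evaluation of the discounted expected payoff under the lognormal assumption. The sketch below is illustrative only: it assumes a continuously compounded rate r, a log return with mean \( (r - \sigma^{2} /2)T \) and variance \( \sigma^{2} T \), and hypothetical function names and inputs.

```python
import math

def norm_cdf(z):
    """Standard normal CDF N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(S, X, r, sigma, T):
    """C = S N(d1) - X exp(-rT) N(d2)."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d2)

def discounted_expected_payoff(S, X, r, sigma, T, steps=20000):
    """exp(-rT) E[max(S_T - X, 0)], integrating over y = ln(S_T / S) by the trapezoidal rule."""
    mean = (r - 0.5 * sigma ** 2) * T
    std = sigma * math.sqrt(T)
    lower, upper = math.log(X / S), mean + 12 * std  # upper limit truncated (assumption)

    def integrand(y):
        density = math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))
        return (S * math.exp(y) - X) * density

    h = (upper - lower) / steps
    total = 0.5 * (integrand(lower) + integrand(upper))
    for i in range(1, steps):
        total += integrand(lower + i * h)
    return math.exp(-r * T) * total * h

if __name__ == "__main__":
    S, X, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0  # illustrative values only
    print(black_scholes_call(S, X, r, sigma, T))
    print(discounted_expected_payoff(S, X, r, sigma, T))
```

Agreement between the two numbers reflects only the risk-neutral lognormal assumption used in this section; no stochastic calculus is involved.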

In this section, we have shown that the Black–Scholes model can be derived by differential and integral calculus without using stochastic calculus. However, it should be noted that the derivation in this section assumes risk neutrality rather than risk aversion.

6 Using stochastic calculus to derive Black–Scholes model

Black and Scholes (1973) used two alternative approaches to derive the well-known partial differential equation defined in Eq. (81)Footnote 15:

$$ \frac{1}{2}\sigma^{2} S^{2} C_{SS} \left( {t,S} \right) + rSC_{S} \left( {t,S} \right) - rC\left( {t,S} \right) + C_{t} \left( {t,S} \right) = 0 $$
(81)

where t = passage of time; S = stock price, which is a function of time t; C(t, S) = call price, which is a function of time t and stock price S; \( C_{t} (t,S) \) is the first-order partial derivative of C(t, S) with respect to t; \( C_{S} (t,S) \) is the first-order partial derivative of C(t, S) with respect to S; \( C_{SS} (t,S) \) is the second-order partial derivative of C(t, S) with respect to S; r = risk-free interest rate; σ = stock volatility.

We rewrite it in a simpler way, as shown in Eq. (82).

$$ \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac{1}{2}\sigma^{2} S^{2} \frac{{\partial^{2} C}}{{\partial S^{2} }} = rC $$
(82)

where \( \frac{\partial C}{\partial t} = C_{t} (t,S) \); \( \frac{\partial C}{\partial S} = C_{S} (t,S) \); \( \frac{{\partial^{2} C}}{{\partial S^{2} }} = C_{SS} (t,S) \) in Eq. (81).
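One way to see that the Black–Scholes formula is consistent with this differential equation is to approximate the partial derivatives by central finite differences and evaluate the residual. The sketch below is an illustration under stated assumptions: r is continuously compounded, the step sizes dS and dt are arbitrary choices, and the function names are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_price(t, S, X, r, sigma, T):
    """Black-Scholes call value at time t < T for stock price S."""
    tau = T - t
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(t, S, X, r, sigma, T, dS=1e-2, dt=1e-4):
    """C_t + r S C_S + (1/2) sigma^2 S^2 C_SS - r C, using central differences."""
    c = call_price(t, S, X, r, sigma, T)
    c_t = (call_price(t + dt, S, X, r, sigma, T) - call_price(t - dt, S, X, r, sigma, T)) / (2 * dt)
    c_up = call_price(t, S + dS, X, r, sigma, T)
    c_down = call_price(t, S - dS, X, r, sigma, T)
    c_s = (c_up - c_down) / (2 * dS)
    c_ss = (c_up - 2 * c + c_down) / dS ** 2
    return c_t + r * S * c_s + 0.5 * sigma ** 2 * S ** 2 * c_ss - r * c

if __name__ == "__main__":
    # Illustrative point (t, S) and parameters; the residual should be close to zero.
    print(pde_residual(0.25, 105.0, 100.0, 0.05, 0.2, 1.0))
```

The residual is not exactly zero because of truncation and rounding error in the finite differences, so it should be read as a plausibility check rather than a proof.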

To derive the Black–Scholes model, we need to solve this differential equation under the boundary condition:

$$ C(S,T) = \left\{\begin{array}{ll} {S - X} \hfill &\quad {{\text{if}}\,S \ge X} \hfill \\ 0 \hfill &\quad {\text{otherwise}} \hfill \\ \end{array}\right. $$
(83)

where T is the maturity date of the option, and X is the exercise price.

By introducing boundary constraints and making variable substitutions, they obtained a differential equation that is the heat-transfer equation of physics (Joshi 2003). They used the Fourier transformation to solve the heat-transfer equation under the boundary condition, and obtained the solution. Here we demonstrate the main steps needed to obtain the heat-transfer equation, and then derive the closed-form solution under the boundary condition.

Let Z = ln S. Using the chain rule for partial derivatives, we have:

$$ \frac{\partial C}{\partial S} = \frac{\partial C}{\partial Z}\frac{\partial Z}{\partial S} = \frac{\partial C}{\partial Z}\frac{1}{S} $$
(84)
$$ \begin{aligned} \frac{{\partial^{2} C}}{{\partial S^{2} }} & = \frac{{\partial (\frac{\partial C}{\partial Z})}}{\partial S}\frac{1}{S} - \frac{\partial C}{\partial Z}\frac{1}{{S^{2} }} \\ & = \frac{{\partial^{2} C}}{{\partial Z^{2} }}\frac{1}{{S^{2} }} - \frac{\partial C}{\partial Z}\frac{1}{{S^{2} }} \\ \end{aligned} $$
(85)

Then Eq. (82) becomes Eq. (86).

$$ \frac{\partial C}{\partial t} + \left( {r - \frac{1}{2}\sigma^{2} } \right)\frac{\partial C}{\partial Z} + \frac{1}{2}\sigma^{2} \frac{{\partial^{2} C}}{{\partial Z^{2} }} = rC $$
(86)

Let τ = T − t, so that \( \frac{\partial C}{\partial t} = - \frac{\partial C}{\partial \tau } \). Then:

$$ \frac{\partial C}{\partial \tau } - \left( {r - \frac{1}{2}\sigma^{2} } \right)\frac{\partial C}{\partial Z} - \frac{1}{2}\sigma^{2} \frac{{\partial^{2} C}}{{\partial Z^{2} }} = - rC $$
(87)

Let \( D = e^{r\tau } C \), i.e. \( C = e^{ - r\tau } D \), and re-express the three partial derivatives in Eq. (87). We have:

$$ \frac{\partial C}{\partial \tau } = - re^{ - r\tau } D + e^{ - r\tau } \frac{\partial D}{\partial \tau } = - rC + e^{ - r\tau } \frac{\partial D}{\partial \tau } $$
(88)
$$ \frac{\partial C}{\partial Z} = e^{ - r\tau } \frac{\partial D}{\partial Z} $$
(89)
$$ \frac{{\partial^{2} C}}{{\partial Z^{2} }} = e^{ - r\tau } \frac{{\partial^{2} D}}{{\partial Z^{2} }} $$
(90)

If we substitute Eqs. (88), (89), and (90) into Eq. (87), we obtain:

$$ \frac{\partial D}{\partial \tau } - \left( {r - \frac{1}{2}\sigma^{2} } \right)\frac{\partial D}{\partial Z} - \frac{1}{2}\sigma^{2} \frac{{\partial^{2} D}}{{\partial Z^{2} }} = 0 $$
(91)

We introduce a new variable Y to replace Z, then we have:

$$ Y = \ln ({S/X}) + \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau = Z + \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau - \ln X $$
(92)

Since \( D = e^{r\tau } C \) is a function of Z and τ, we explicitly write D as \( D^{Z} (Z,\tau ) \). Equation (92) implies that D is also a function of Y and τ. We define \( D^{Z} (Z,\tau ) \) and \( D^{Y} (Y,\tau ) \) as follows.

$$ D^{Z} (Z,\tau ) = D^{Z} \left( {Y - \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau + \ln X,\tau } \right) $$
(93)
$$ D^{Y} (Y,\tau ) = D^{Y} \left( {Z + \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau - \ln X,\tau } \right) $$
(94)

Taking the partial derivative of \( D^{Z} (Z,\tau ) \) with respect to Z, we obtain:

$$ \frac{{\partial D^{Z} (Z,\tau )}}{\partial Z} = \frac{{\partial D^{Y} (Y,\tau )}}{\partial Y}\frac{\partial Y}{\partial Z} = \frac{{\partial D^{Y} (Y,\tau )}}{\partial Y} $$
(95)

Similarly, we have:

$$ \frac{{\partial^{2} D^{Z} (Z,\tau )}}{{\partial Z^{2} }} = \frac{{\partial^{2} D^{Y} (Y,\tau )}}{{\partial Y^{2} }} $$
(96)
$$ \begin{aligned} \frac{{\partial D^{Z} (Z,\tau )}}{\partial \tau } & = \frac{{\partial D^{Y} (Y,\tau )}}{\partial Y}\frac{\partial Y}{\partial \tau } + \frac{{\partial D^{Y} (Y,\tau )}}{\partial \tau } \\ & = \frac{{\partial D^{Y} (Y,\tau )}}{\partial Y}\frac{{\partial \left( {Z + \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau - \ln X} \right)}}{\partial \tau } + \frac{{\partial D^{Y} (Y,\tau )}}{\partial \tau } \\ & = \frac{{\partial D^{Y} (Y,\tau )}}{\partial Y}\left( {r - \frac{1}{2}\sigma^{2} } \right) + \frac{{\partial D^{Y} (Y,\tau )}}{\partial \tau } \\ \end{aligned} $$
(97)

Substituting Eqs. (95), (96), and (97) into Eq. (91), we get:

$$ \frac{{\partial D^{Y} }}{\partial \tau } - \frac{1}{2}\frac{{\partial^{2} D^{Y} }}{{\partial Y^{2} }}\sigma^{2} = 0 $$
(98)

Equation (98) is close to the heat-transfer equation used by Black and Scholes.

Let \( u = \frac{2}{{\sigma^{2} }}\left( {r - \frac{1}{2}\sigma^{2} } \right)Y,\,\,v = \frac{2}{{\sigma^{2} }}\left( {r - \frac{1}{2}\sigma^{2} } \right)^{2} \tau \), and write D(u, v) for D regarded as a function of u and v.

$$ \frac{{\partial D^{Y} (Y,\tau )}}{\partial Y} = \frac{\partial D(u,v)}{\partial u}\frac{\partial u}{\partial Y} = \frac{\partial D(u,v)}{\partial u}\frac{2}{{\sigma^{2} }} \left(r - \frac{1}{2}\sigma^{2}\right) $$
(99)
$$ \frac{{\partial^{2} D^{Y} (Y,\tau )}}{{\partial Y^{2} }} = \frac{{\partial^{2} D(u,v)}}{{\partial u^{2} }}\left( {\frac{2}{{\sigma^{2} }}\left( {r - \frac{1}{2}\sigma^{2} } \right)} \right)^{2} $$
(100)
$$ \begin{aligned} \frac{{\partial D^{Y} (Y,\tau )}}{\partial \tau } & = \frac{\partial D(u,v)}{\partial u}\frac{\partial u}{\partial \tau } + \frac{\partial D(u,v)}{\partial v}\frac{\partial v}{\partial \tau } = \frac{\partial D(u,v)}{\partial v}\frac{\partial v}{\partial \tau } \\ & = \frac{\partial D(u,v)}{\partial v}\frac{2}{{\sigma^{2} }}\left( {r - \frac{1}{2}\sigma^{2} } \right)^{2} \\ \end{aligned} $$
(101)

Substituting Eqs. (100) and (101) into Eq. (98), we finally reach the heat transfer equation derived by Black and Scholes.

$$ \frac{\partial D(u,v)}{\partial v} = \frac{{\partial^{2} D(u,v)}}{{\partial u^{2} }} $$
(102)
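The cancellation behind this step can be written out explicitly. The following is a sketch in which the shorthand \( a = r - \frac{1}{2}\sigma^{2} \) is introduced here for brevity (it is not part of the original notation) and \( a \ne 0 \) is assumed. Inserting the expressions for \( \partial^{2} D^{Y} /\partial Y^{2} \) and \( \partial D^{Y} /\partial \tau \) in terms of u and v into \( \frac{\partial D^{Y}}{\partial \tau } - \frac{1}{2}\sigma^{2} \frac{\partial^{2} D^{Y}}{\partial Y^{2}} = 0 \) gives:

$$ \frac{\partial D(u,v)}{\partial v}\frac{2a^{2} }{{\sigma^{2} }} - \frac{1}{2}\sigma^{2} \frac{{\partial^{2} D(u,v)}}{{\partial u^{2} }}\frac{4a^{2} }{{\sigma^{4} }} = \frac{2a^{2} }{{\sigma^{2} }}\left( {\frac{\partial D(u,v)}{\partial v} - \frac{{\partial^{2} D(u,v)}}{{\partial u^{2} }}} \right) = 0 $$

Dividing by the nonzero factor \( 2a^{2} /\sigma^{2} \) leaves the heat transfer equation in u and v.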

Equation (102) is identical to Eq. (10) of Black and Scholes (1973). In terms of the Black–Scholes notation, Eq. (102) can be written as

$$ y_{2} = y_{11} $$
(102′)

Now, we need to find a function D(u, v) that satisfies both the boundary conditions and the partial differential equation in Eq. (102) (see Footnote 16). The general solution given in Churchill (1963) is as follows.

If:

$$ v_{t} (x,t) = kv_{xx} (x,t)\,\,\,\,\,\,\,\,( - \infty < x < \infty ,t > 0) $$
(103)
$$ v(x,0) = f(x)\quad ( - \infty < x < \infty ) $$
(104)

Then the general solution for v(x, t) is (see Footnote 17):

$$ v(x,t) = {1/{\sqrt{\pi} }}\int_{ - \infty }^{\infty } {f(x + 2\eta \sqrt{kt} )e^{{ - \eta^{2} }} d\eta } $$
(105)
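As a numerical sanity check of this integral representation, the following minimal sketch evaluates it with a trapezoidal rule for the test function f(x) = exp(ax), for which the integral has the closed form exp(ax + a²kt). The function name `heat_solution`, the truncation bound `eta_max`, and the test values are illustrative assumptions, not part of the original derivation.

```python
import math


def heat_solution(f, x, t, k=1.0, n=4000, eta_max=8.0):
    """Approximate (1/sqrt(pi)) * integral of f(x + 2*eta*sqrt(k*t)) * exp(-eta^2)
    over eta, using the composite trapezoidal rule on [-eta_max, eta_max]."""
    h = 2.0 * eta_max / n
    scale = 2.0 * math.sqrt(k * t)
    total = 0.0
    for i in range(n + 1):
        eta = -eta_max + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * f(x + scale * eta) * math.exp(-eta * eta)
    return total * h / math.sqrt(math.pi)


# Test case (assumed for illustration): f(x) = exp(a*x).
# Completing the square gives the exact value exp(a*x + a^2*k*t).
a, x, t, k = 0.7, 0.3, 0.5, 1.0
approx = heat_solution(lambda s: math.exp(a * s), x, t, k)
exact = math.exp(a * x + a * a * k * t)
print(approx, exact)
```

The two printed values should agree to many decimal places. The exact expression also satisfies the partial differential equation and the initial condition directly, since its t-derivative is a²k times itself and its second x-derivative is a² times itself.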

In our notation, D(u, v) = v(x, t) with x = u and t = v, and k = 1, which makes Eq. (103) equivalent to the partial differential equation in Eq. (102). Moreover, we have the boundary condition:

$$ C(S,T) =\left\{ \begin{array}{ll} {S - X} \hfill &\quad {{\text{if}}\;\;S \ge X} \hfill \\ 0 \hfill &\quad {\text{otherwise}} \hfill \\ \end{array}\right. $$

At the maturity date, t = T and v = 0, so we have D(u, 0) = C(S, T). The function f(u) must be chosen so that Eq. (104) is satisfied. Black and Scholes choose:

$$ f(u) = \left\{\begin{array}{ll} {X\left( {e^{{u{{\left( {\frac{1}{2}\sigma^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\frac{1}{2}\sigma^{2} } \right)} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}} \right. \kern-0pt} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}}} - 1} \right)} \hfill &\quad if\,\, u \ge 0 \hfill \\ 0 &\quad if\,\,u < 0 \hfill \\ \end{array}\right. $$
(106)

Note that, when t = T, \( u = \frac{2}{{\sigma ^{2} }}\left(r - \frac{1}{2}\sigma ^{2}\right)\ln (S/X)\), then we have:

$$ D(u,0) = \left\{\begin{array}{ll} {X(e^{{\ln ({S/X})}} - 1) = S - X} \hfill &\quad {if\,u \ge 0} \hfill \\ 0 \hfill &\quad {if\,u < 0} \hfill \\ \end{array}\right. $$
(107)

which is identical to the boundary condition. Therefore, this choice of f(u) makes Eq. (104) hold.

Now that Eqs. (103) and (104) are satisfied, the solution to the differential equation is given by (see Footnote 18):

$$ D(u,v) = {1 /{\sqrt{\pi} }}\int_{{ - {u /{2\sqrt{v} }}}}^{\infty } {X\left( {e^{{(u + 2\eta \sqrt{v} ){{\left( {\frac{1}{2}\sigma^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\frac{1}{2}\sigma^{2} } \right)} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}} \right. \kern-0pt} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}}} - 1} \right)e^{{ - \eta^{2} }} d\eta } $$
(108)

Let \( \eta = q/\sqrt{2} \), and substitute it and \( C = e^{ - r\tau } D \) into Eq. (108); then we have:

$$ C(S,t) = e^{ - r\tau } \frac{1}{{\sqrt{2\pi } }}\int_{{ - {u/{\sqrt{2v} }}}}^{\infty } {X\left( {e^{{(u + q\sqrt{2v} ){{\left( {\frac{1}{2}\sigma^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\frac{1}{2}\sigma^{2} } \right)} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}} \right. \kern-0pt} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}}} - 1} \right)e^{{ - {{q^{2} }/2}}} dq} $$
(109)
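The integral in Eq. (109) can be evaluated numerically before any further simplification. The sketch below does this with a trapezoidal rule, using the definitions of u and v given earlier. The function name `call_price_by_integral`, the truncation bound `q_max`, and the sample inputs are assumptions made for illustration. The code also assumes r > σ²/2, so that √(2v) carries the sign used in the derivation.

```python
import math


def call_price_by_integral(S, X, r, sigma, tau, n=20000, q_max=12.0):
    """Evaluate the integral form of the call price, Eq. (109), by the
    composite trapezoidal rule on [-u/sqrt(2v), q_max]."""
    m = r - 0.5 * sigma ** 2
    if m <= 0:
        # Assumption of this sketch: the change of variables is sign-sensitive.
        raise ValueError("this sketch assumes r > sigma^2 / 2")
    u = (2.0 / sigma ** 2) * m * (math.log(S / X) + m * tau)
    v = (2.0 / sigma ** 2) * m ** 2 * tau
    ratio = 0.5 * sigma ** 2 / m          # (sigma^2 / 2) / (r - sigma^2 / 2)
    root = math.sqrt(2.0 * v)
    lower = -u / root                     # lower limit of integration
    h = (q_max - lower) / n
    total = 0.0
    for i in range(n + 1):
        q = lower + i * h
        weight = 0.5 if i in (0, n) else 1.0
        payoff = X * (math.exp((u + q * root) * ratio) - 1.0)
        total += weight * payoff * math.exp(-0.5 * q * q)
    return math.exp(-r * tau) * total * h / math.sqrt(2.0 * math.pi)


price = call_price_by_integral(S=100.0, X=100.0, r=0.05, sigma=0.2, tau=1.0)
print(round(price, 4))
```

For these sample inputs the result is approximately 10.45, which is the value the closed-form expression reached at the end of the derivation also produces.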

Note that \( - {u/{\sqrt{2v} }} = - \frac{{\ln ({S/X}) + \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau }}{\sigma \sqrt{\tau} } = - d_{2} \). Therefore, Eq. (109) can be rewritten as:

$$ C(S,t) = Xe^{{ - r\tau }} \frac{1}{{\sqrt {2\pi } }}\int_{{ - d_{2} }}^{\infty } {e^{{\left[ {(u + q\sqrt {2v} ){{\left( {\frac{1}{2}\sigma ^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\frac{1}{2}\sigma ^{2} } \right)} {\left( {r - \frac{1}{2}\sigma ^{2} } \right)}}} \right. \kern-\nulldelimiterspace} {\left( {r - \frac{1}{2}\sigma ^{2} } \right)}}} \right] - q^{2} /2}} dq} - Xe^{{ - r\tau }} \frac{1}{{\sqrt {2\pi } }}\int_{{ - d_{2} }}^{\infty } {e^{{ - q^{2} /2}} dq} $$
(110)

Consider first the second term in Eq. (110). Recall that the cumulative standard normal distribution function is defined as:

$$ N(x) = \int_{ - \infty }^{x} {\frac{1}{{\sqrt{2\pi } }}} e^{{-\tfrac{{t^{2} }}{2}}} dt $$
(111)

Therefore, the second term of Eq. (110) is:

$$ Xe^{ - r\tau } \int_{{ - d_{2} }}^{\infty } {\frac{1}{{\sqrt{2\pi } }}e^{{ - {{q^{2} }/2}}} dq} = Xe^{ - r\tau } (1 - N( - d_{2} )) = Xe^{ - r\tau } N(d_{2} ) $$
(112)

Deriving the first term in Eq. (110) is more tedious. Recall the expressions for u and v: \( u = \frac{2}{{\sigma^{2} }}(r - \frac{1}{2}\sigma^{2} )(\ln ({S/X}) + (r - \frac{1}{2}\sigma^{2} )\tau ) \) and \( v = \frac{2}{{\sigma^{2} }}(r - \frac{1}{2}\sigma^{2} )^{2} \tau \). Therefore, Eqs. (113) and (114) hold.

$$ u{{\left( {\frac{1}{2}\sigma ^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\frac{1}{2}\sigma ^{2} } \right)} {\left( {r - \frac{1}{2}\sigma ^{2} } \right)}}} \right. \kern-\nulldelimiterspace} {\left( {r - \frac{1}{2}\sigma ^{2} } \right)}} = \left( {\ln (S/X) + \left( {r - \frac{1}{2}\sigma ^{2} } \right)\tau } \right) $$
(113)
$$ q\sqrt{2v} {{\left( {\frac{1}{2}\sigma^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\frac{1}{2}\sigma^{2} } \right)} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}} \right. \kern-0pt} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}} = q\sigma \sqrt{\tau} $$
(114)

Therefore, the first term in Eq. (110) is:

$$ \begin{aligned} & Xe^{ - r\tau } \frac{1}{{\sqrt{2\pi } }}\int_{{ - d_{2} }}^{\infty } {e^{{\left[ {(u + q\sqrt{2v} ){{\left( {\frac{1}{2}\sigma^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\frac{1}{2}\sigma^{2} } \right)} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}} \right. \kern-0pt} {\left( {r - \frac{1}{2}\sigma^{2} } \right)}}} \right] - {{q^{2} }/2}}} dq} \\ & \quad = Xe^{ - r\tau } e^{{\ln ({S/X})}} \frac{1}{{\sqrt{2\pi } }}\int_{{ - d_{2} }}^{\infty } {e^{r\tau } e^{{ - \tfrac{1}{2}(q^{2} - 2q\sigma \sqrt{\tau} + \sigma^{2} \tau )}} } dq \\ & \quad = S\frac{1}{{\sqrt{2\pi } }}\int_{{ - d_{2} }}^{\infty } {e^{{ - \tfrac{1}{2}(q - \sigma \sqrt{\tau} )^{2} }} } dq \\ \end{aligned} $$
(115)
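The step from the first line to the second relies on completing the square in the exponent. As a short worked check, the exponent of the integrand, after using the two identities for the terms in u and in \( q\sqrt{2v} \), can be rearranged as:

$$ \ln ({S/X}) + \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau + q\sigma \sqrt{\tau} - \frac{1}{2}q^{2} = \ln ({S/X}) + r\tau - \frac{1}{2}\left( {q^{2} - 2q\sigma \sqrt{\tau} + \sigma^{2} \tau } \right) = \ln ({S/X}) + r\tau - \frac{1}{2}\left( {q - \sigma \sqrt{\tau} } \right)^{2} $$

The factor \( e^{r\tau } \) then cancels against \( e^{ - r\tau } \), and \( Xe^{{\ln ({S/X})}} = S \).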

Here, we again apply a change of variable. Let \( q^{{\prime }} = q - \sigma \sqrt{\tau} \), so that \( dq^{{\prime }} = dq \). Eq. (115) then becomes:

$$ S\frac{1}{{\sqrt{2\pi } }}\int_{{ - d_{2} - \sigma \sqrt{\tau} }}^{\infty } {e^{{ - \tfrac{1}{2}q^{{{\prime }2}} }} } dq^{{\prime }} $$
(116)

Let \( d_{1} = d_{2} + \sigma \sqrt{\tau} \), then we obtain:

$$ S\frac{1}{{\sqrt{2\pi } }}\int_{{ - d_{1} }}^{\infty } {e^{{ - \tfrac{1}{2}q^{{{\prime }2}} }} } dq^{{\prime }} = S\int_{{ - d_{1} }}^{\infty } {\frac{1}{{\sqrt{2\pi } }}e^{{ - \tfrac{1}{2}q^{{{\prime }2}} }} } dq^{{\prime }} = S(1 - N( - d_{1} )) = SN(d_{1} ) $$
(117)

Finally, combining the first and second terms in Eq. (110), as simplified in Eqs. (117) and (112) respectively, we reach the Black–Scholes formula.

$$ C(S,t) = SN(d_{1} ) - Xe^{ - r\tau } N(d_{2} ) $$
(118)

where

$$ \begin{aligned} d_{1} & = \frac{{\ln ({S/X}) + \left( {r + \frac{1}{2}\sigma^{2} } \right)\tau }}{\sigma \sqrt{\tau} } \\ d_{2} & = \frac{{\ln ({S/X}) + \left( {r - \frac{1}{2}\sigma^{2} } \right)\tau }}{\sigma \sqrt{\tau} } = d_{1} - \sigma \sqrt{\tau} \\ \tau & = T - t \\ \end{aligned} $$
(119)
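The closed-form result in Eqs. (118) and (119) translates directly into code. The following is a minimal sketch using only the Python standard library; the function names `norm_cdf` and `black_scholes_call` and the sample inputs are illustrative choices, and N(x) is computed from the error function through the identity N(x) = (1 + erf(x/√2))/2.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution function N(x), via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(S, X, r, sigma, tau):
    """European call price C = S*N(d1) - X*exp(-r*tau)*N(d2), tau = T - t > 0."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)


# Sample inputs (assumed for illustration): at-the-money call, one year to maturity.
price = black_scholes_call(S=100.0, X=100.0, r=0.05, sigma=0.2, tau=1.0)
print(round(price, 4))
```

With these inputs, d1 = 0.35 and d2 = 0.15, and the price is approximately 10.45. The sketch requires τ > 0 and σ > 0, since both appear in the denominator of d1.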

7 Summary and concluding remarks

In this paper, we have reviewed three alternative approaches to deriving option pricing models. We have discussed in detail how the binomial model can be used to derive the Black–Scholes model. In addition, we have shown how an Excel program based on a decision tree can be used to demonstrate empirically that the binomial model converges to the Black–Scholes model as the number of observations approaches infinity. Under an assumption of risk neutrality, we have shown that the Black–Scholes formula can be derived using only differential and integral calculus and a basic knowledge of normal and lognormal distributions. In “Appendix”, we use the de Moivre–Laplace theorem to prove that the best fit between the binomial and normal distributions occurs when the binomial probability is \( \frac{1}{2} \). Overall, this paper can help statisticians and mathematicians better understand how alternative methods can be used to derive the Black–Scholes option pricing model.