Abstract
We introduce a new system of stochastic differential equations which models dependence of market beta and unsystematic risk upon size, measured by market capitalization. We fit our model using size deciles data from Kenneth French’s data library. This model is somewhat similar to generalized volatility-stabilized models. The novelty of our work is twofold. First, we take into account the difference between price and total returns (in other words, between market size and wealth processes). Second, we work with actual market data. We study the long-term properties of this system of equations, and reproduce observed linearity of the capital distribution curve. In the “Appendix”, we analyze size-based real-world index funds.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
1.1 Size effect and the capital asset pricing model
The size of a stock is measured by its market capitalization, or market cap: current stock price multiplied by the number of shares. For a stock portfolio, its market cap is defined as weighted sum of market caps of constituent stocks, with weights equal to the portfolio weights. The size is a very important fundamental characteristic of a stock or a portfolio.
It is observed that small stocks have higher returns but higher risk than large stocks. An explanation is that small companies are in their dynamic growth phase and they have higher growth potential relative to large mature companies, but small companies are more vulnerable to failure and bankruptcy. Some researchers claim that even after adjusting for risk, small stocks have higher returns than large stocks. This adjustment can be made rigorous within the framework of the Capital Asset Pricing Model (CAPM). Take a portfolio of stocks with total returns (including dividends, not inflation-adjusted) Q(t) during time t. Here, we operate in a discrete-time setting. Compare it with risk-free returns from short-term Treasury bills R(t). An investor deserves premium reward for taking risk and investing in stocks rather than in safe Treasury bills. We calculate this equity premium P(t) by subtracting \(P(t) = Q(t) - R(t)\). Next, we compute this equity premium \(P_0(t)\) for a market portfolio, used as a benchmark. An example of such benchmark is the Standard & Poor (S&P) 500, a widely used benchmark for large U.S. stocks. Run a linear regression:
The parameter \(\beta \) shows market exposure, how much risk the portfolio is exposed to because of fluctuations in the benchmark. The parameter \(\alpha \) shows excess return, how much can one earn from this portfolio on top of this market return. They are often called by their Greek names: beta and alpha. The residual \(\varepsilon (t)\) is called unsystematic risk which can be eliminated by diversification. According to the CAPM, \(\alpha = 0\) and the only risk which deserves rewards is the systematic risk (due to market exposure) since other risk can be diversified away. The CAPM was proposed in the classic article Sharpe (1964). Subsequent research cast doubt on the consistency of CAPM with actual market data. In particular, Banz (1981) found that taking a portfolio of small stocks generates positive \(\alpha \). That is, small stocks have higher returns than large stocks even after adjusting for market exposure (which is greater than 1 for small stocks). Subsequent classic article Fama and French (1993) confirmed this. Further research on the size effect can be found in Semenov (2015) and van Dijk (2011) and references therein. See also critique of CAPM in Fama and French (2004).
1.2 Our model
We study dependence of \(\alpha , \beta , \sigma \) (the standard deviation of \(\varepsilon (t)\)) on the size, measured by the market cap S(t), or, more precisely, by relative size to \(S_0(t)\):
We would like to find functions \(\alpha , \beta , \sigma \) of C such that for standardized white noise terms Z(t), with \(\mathbb E[Z(t)] = 0\) and \({\mathbb {E}}[Z^2(t)] = 1\):
This allows us to quantify how exactly \(\alpha \) and \(\beta \) (and \(\sigma \) the standard deviation of unsystematic risk) depend on the relative size measure. We then consider a version of the Eq. (3) in which equity premia are replaced by price returns, that is, returns due to price changes (or, equivalently, market cap changes). That is, we replace P(t) and \(P_0(t)\) with \(\ln (S(t+1)/S(t))\) and \(\ln (S_0(t+1)/S_0(t))\), ignoring for now risk-free rate and dividends. This gives us:
with C(t) from (2). This Eq. (4) includes only market caps of benchmark \(S_0(t)\) and the portfolio S(t). This time series equation or its continuous-time version, a stochastic differential equation, allows us to model S(t) and \(S_0(t)\) separately from dividends and risk-free returns. On top of this, we add the Eq. (3). Surprisingly, market data gives us almost the same functions \(\alpha , \beta , \sigma \) in (3) and (4), but with differing coefficients. Moreover, the white noise terms in (3) and (4) are almost perfectly (more than 99%) correlated. This does not follow from any theoretical considerations, and seems simply a piece of good luck which simplifies analysis.
We also adapt (3) and (4) for continuous time: Equity premium P(t) becomes then \(\mathrm {d}\ln V(t)\), where V(t) is the wealth process adjusted for risk-free returns. More precisely, \(V(t) = U(t)/U^*(t)\), where U(t) is the wealth accumulated from investing \(U(0) = 1\) in stock portfolio and reinvesting dividends, while \(U_*\) is a similar wealth process from investing in Treasury bills. Then (3) takes the form
where \(V_0\) is the adjusted wealth process for the benchmark, and W is a Brownian motion: A real-valued continuous process with \(W(t) - W(s) \sim {\mathcal {N}}(0, t-s)\) independent of \(W(u),\, 0 \le u \le s\), for all \(0 \le s < t\). This Brownian motion can be viewed as a zoomed out random walk with very small but very frequent steps. For simplicity, we assume that \(\ln V_0(t)\) is also a Brownian motion with positive drift (which captures the tendency of long-term stock returns to be greater than risk-free returns). Although equity premia have heavy tails and thus are not well-described by the Gaussian distribution, the Brownian motion provides a simple first approximation. Similarly, Eq. (4) becomes
with C(t) from (2). By the above remark, we can assume that the driving Brownian motion W from (5) and (6) is the same.
1.3 Dependence upon the size measure
What is the dependence of market exposure \(\beta \), excess return \(\alpha \), and standard deviation of unsystematic risk \(\sigma \) upon relative size?
Our statistical analysis does not yield conclusive results. The white noise tests for Z unfortunately fail. Thus we cannot claim that a model (3), (4) passes goodness of fit tests. The most resonable guess for \(\alpha , \beta , \sigma \) seems to be:
Here, \(\alpha _{\pm }, \beta _{\pm }, \sigma _{\pm }, c_{\pm }\) are positive constants. Our data analysis does not allow us to suggest these functions in a neighborhood of zero. This is due to the fact that observed C(t) in our data do not come very close to zero.
1.4 Stochastic portfolio theory
The continuous-time version is useful because we can use stochastic calculus and immerse these models in Stochastic Portfolio Theory (SPT). This is a framework for stock market modeling which does not depend on particular models. It suggests overweighing small stocks and continuous rebalancing. This means investing in small stocks in proportion greater than their market cap would dictate. One example of this is taking the equal-weighted portfolio. The central result of SPT: under mild conditions, diversity (no stock dominates the entire market) and sufficient intrinsic volatility, such portfolios outperform the market portfolio, which invests in each stock in proportion to its market cap; see Banner and Fernholz (2008), Fernholz and Karatzas (2005), Fernholz et al. (2005). This theory has solid theoretical basis, and is consistent with observed data. For references, see the book Fernholz (2002) and a more recent survey Fernholz and Karatzas (2009). SPT is based on the same observation as above: small stocks have higher return and risk than large stocks.
Although we mentioned above that SPT is model-independent, there are some SPT models which attempt to capture this observation: competing Brownian particles, where logarithms of market caps evolve as Brownian motions with drift and diffusion coefficients dependent on their current ranks relative to other particles, Banner et al. (2005), Banner et al. (2011), Karatzas and Sarantsev (2016); their generalizations with jumps, or with dependence on both name and rank (so-called second-order models): see articles Banner et al. (2011), Barnes and Sarantsev (2020), Shkolnikov (2011); volatility-stabilized models, where \(\ln S(t)\) are modeled by stochastic differential equations (SDE) with volatility inversely proportional to S(t), Pal (2011), and their generalizations, Pickova (2014). As the number of stocks tends to infinity, the limiting behavior of competing Brownian particles and volatility-stabilized models is studied in Chatterjee and Pal (2010), Jourdain and Reygner (2015), Shkolnikov (2012), Shkolnikov (2013).
In this article, we recognize the difference between price and total returns, and model them separately as (3) and (4) for discrete time or (5) and (6) for continuous time. SPT is much more developed for continuous-time diffusive models based on SDE than for discrete-time, see Pal and Wong (2016). Thus it is reasonable to switch to continuous time in (5) and (6).
Stochastic Portfolio Theory deals with diversification benefits in the form of excess growth rate, and functionally generated portfolios. The standard assumption in SPT is that there are no dividends; that is, price and total returns are the same. But here this is no longer true. To model separately price returns (which drive the market capitalization processes) and total returns (which drive the wealth processes) requires an extension of the SPT, in particular the concept of functionally generated portfolios. This is left for future research.
1.5 Data analysis
We take real U.S. market data from Kenneth French’s Data Library online: market cap, price and total returns for equal-weighted portfolios made from size deciles (stocks split into top 10%, next 10%, etc. according to their size). This library contains data processed from original raw data from the Center for Research in Securities Prices (CRSP) at the University of Chicago, July 1926–June 2020 (84 years), monthly data.
We study models of n portfolios and the benchmark with market caps \(S_0, \ldots , S_n\), relative size measures \(C_0, \ldots , C_n\), and wealth processes \(V_0, \ldots , V_n\). The corresponding Brownian motions \(W_1, \ldots , W_n\) from (6) and (5) (recall that these two equations have the same Brownian motions) are assumed to be i.i.d. for simplicity (although our analysis shows this to be inconsistent with actual data). Consider market weights:
representing the proportion of the ith stock in the overall market. Small stocks have smaller market weights. The market weight vector \({\mu } = (\mu _0, \ldots , \mu _n)\) is a Markov process on the n-dimensional simplex \(\triangle _n\). If this market weight vector converges to a unique stationary distribution as \(t \rightarrow \infty \) in the total variation norm (see definitions in Sect. 4), then we call the system stable. In this article, we state and prove that our market model is stable under certain conditions on \(\alpha , \beta , \sigma \). We also investigate whether collisions: \(S_i(t) = S_j(t)\) happen (when small stocks grow and overtake large stocks). Finally, at each time rank weights from top to bottom:
The capital distribution curve is the plot of ranked weights vs their ranks on the double logarithmic scale:
For the actual market data, this plot is linear, except at the endpoint. See Fernholz (2002), Figure 5.1 for the capital distribution curve for the CRSP stock universe for 8 days, chosen to be the last days (December 31) of the eight decades – December 31, 1929; December 31, 1939; ...December 31, 1999. Strikingly, these 8 curves almost completely coincide. In other words, this curve is stable in time. Previously mentioned competing Brownian particles and volatility-stabilized models reproduce this feature, see Chatterjee and Pal (2010) and Pal (2011) respectively. In this article, we establish this property for our model using both theoretical analysis and simulations.
1.6 Our contributions
We propose a new model within SPT which is consistent with CAPM and captures the size effect. This model is consistent with long-term market data (although not fully consistent) and captures its important features. We study its properties: long-term stability, collisions of particles, market weights, and the capital distribution curve.
1.7 Organization of the article
In Sect. 2, we describe our data in detail and analyze it. This provides a motivation for continuous-time modeling. In Sect. 3, we introduce a system of SDE as in (6) and (5), and prove existence and uniqueness of the solution. We also discuss applicability of SPT, since now size and wealth processes are different. In Sect. 4, we show stability under certain conditions and see how they are compared with data analysis in Sect. 2. In Sect. 5, we replicate the linear capital distribution curve. Finally, Sect. 6 is devoted to conclusions and suggestions for future research. The “Appendix” contains some data analysis for existing exchange-traded funds based on size. The code and data can be found on GitHub: asarantsev/CAPM-SPT.
2 Data analysis
2.1 Data description
Our main data source, as mentioned in the Introduction, is the CRSP database at the University of Chicago, taken from Kenneth French’s online data library. We take from this library price and total monthly returns for equally-weighted portfolios made of stock in each decile from July 1926 to June 2020. We analyze only the top 8 deciles, corresponding to large-cap, mid-cap, and small-cap stocks. The two bottom deciles are micro-cap stocks which we exclude. Deciles are created based on the market cap at the end of each June. After a year, these deciles are reconstituted. The data library also has average market capitalizations for the end of each month for each decile. Risk-free monthly returns are computed as \(R(t) = \ln (1 + r(t)/12)\), where r(t) are monthly data for short-term Treasury bills taken from the Federal Reserve Economic Data (FRED) website: January 1934–June 2020 series TB3MS and July 1926–December 1933 (discontinued) series M1329AUSM193NNBR.
Stock and portfolio returns can be computed in two ways: arithmetic A and geometric G, which are related as follows: \(G = \ln (1 + A)\). Arithmetic returns are quoted regularly for their practicality in computing portfolio returns such as in the data library which we used as our data source. The arithmetic return of a portfolio is equal to the weighted average of arithmetic returns of constituent stocks, but here we convert arithmetic returns to their geometric versions according to the above formula.
The advantage of geometric over arithmetic returns for our research is apparent when combining compound interest rates. For example, 20% and 30% arithmetic returns combined gives 56% returns, whereas geometric returns combine for 50% as expected.
Our time unit is equal to a month, of total \(T = 12\cdot (2020-1926) = 1128\) months, \(t = 0\) corresponding to June 1926, \(t = T\) corresponding to June 2020.
For a decile k (with \(k = 1\) for the top decile, \(k = 8\) for the bottom decile), its average market cap at end of month t is denoted by \(S_k(t)\), the price returns are \(Q_k(t)\), and the equity premium (total returns including dividends minus risk-free returns) is \(P_k(t)\).
2.2 Beta analysis for price returns
Split \(T = 1128\) months into \(N = 47\) two-year, \(K = 24\)-month time periods. For each period \(n = 1, \ldots , N\) and each decile \(k = 1, \ldots , 8\), except the top one, which we use as the benchmark (this top decile roughly corresponds to Standard & Poor 500 constituent stocks), regress
where t is in this period, and \(\alpha _k(n), \beta _k(n)\) are intercept and slope (excess return and market exposure), found using ordinary least squares; and \(\delta _k(t)\) are residuals. Thus we compute beta \(\beta _k(n)\) for each decile from 2nd to 8th and each of 47 two-year periods. For size measure of the kth decile vs top decile, take
where \(S_1(Kn)\) and \(S_k(Kn)\) are average market capitalizations for the beginning of nth period. Then we plot \(\beta _k(n) - 1\) vs \(C_k(n)\) and \((\beta _k(n) - 1)/C_k(n)\) in Fig. 1. We plot \(\beta _k(n) - 1\) instead of \(\beta _k(n)\) since our benchmark for beta is 1: The beta for the top decile (which coincides with the benchmark) is 1.
2.3 Model suggestions for price returns
From Fig. 1(a), we get the suggestion that
\(\beta _k(n) - 1\) is proportional to \(C_k(n)\). This is confirmed by Fig. 1(b). This transformation helps make variance constant. We still cannot claim that these normalized quantities are i.i.d. white noise, since they fail white noise tests. Still, let us extract the trend in \(\beta _k(n)\) by replacing it with \(1 + \gamma C_k(n)\) for some coefficient. Next, comparing with (8), consider the residual \(Q_k(t) - (1 + \gamma C_k(n))Q_1(t)\). We make it dependent on n (the overall two-year period), not individual months t in this period, by taking the sum of geometric price returns \(Q_k(t)\) over t in this two-year period. This sum is equal to overall price returns \({\overline{Q}}_k(t)\) in this two-year period. Then we let:
We then plot \(\varepsilon _k(n)\) vs \(C_k(n)\) in Fig. 2(a), together with \(\varepsilon _k(n)/C_k(n)\) vs \(C_k(n)\) in Fig. 2(b). We see again in Fig. 2(a) that \(\varepsilon _k(n)\) depends on \(C_k(n)\) linearly, and in Fig. 2(b) that the variance becomes constant. Again, we cannot claim that \(\varepsilon _k(n)/C_k(n)\) are i.i.d. since white noise tests fail. But if we model this as i.i.d. \({\mathcal {N}}(\mu , \rho ^2)\), we get:
where \(Z_k(n)\) are i.i.d. standard normal random variables. A generalization of this could be:
i.i.d. multivariate normal with mean zero vector and (not identity) covariance matrix with units on the main diagonal (that is, \(\mathbb E[Z_k(n)] = 0\) and \({\mathbb {E}}[Z_k^2(n)] = 1\)). Combining (9) and (10), we get:
We can estimate \(\gamma = 0.0045, \mu = 0.0069, \rho = 0.052\). This gives us (7), the top row.
2.4 Bottom decile as benchmark
If we repeat this analysis in previous subsection with benchmark 8th decile (which in our research is the bottom decile since we ignore the 9th and 10th deciles), we get: For \(\beta _k(n)\) in (8), we plot \(\beta _k(n) - 1\) vs \(C_k(n)\) in Fig. 3(a). In this case, dependence is also linear. Next, take \(\gamma \), the mean of these quantities. Create residuals similarly to (9):
The plot of these residuals vs \(C_k(n)\), normalized by dividing by \(\sqrt{|C_k(n)|}\), is shown in Fig. 3(b). Making white noise test for \(\varepsilon _k(n)\), we fail to reject this hypothesis. Assume \(\varepsilon _k(n) \sim \mathcal N(\mu , \rho ^2)\). Combining this with (13), we get:
This verifies (7), the bottom row. From (14), we see that linear functions from (12) are not the only possible and reasonable functions for \(\beta \). In Sect. 4, we state and prove results for general \(\alpha , \beta , \rho \). Our coefficients are \(\gamma = 0.12, \mu = 0.0055, \rho = 0.090\).
2.5 Data analysis for equity premia
Repeating similar analysis for equity premia instead of price returns, we get similar functions for both cases: top and bottom deciles as benchmarks. Moreover, point estimates for price returns and equity premia are close; \(\gamma = 0.045, \mu = 0.0017, \rho = 0.052\) for the top decile benchmark, and \(\gamma = 0.12, \mu = 0.0024, \rho = 0.088\) for the 8th decile benchmark. Thus functions \(\beta \) and \(\sigma \) for price returns and equity premia are the same. Unfortunately, for \(\alpha \) this is not true. We will have separate functions for price returns and equity premia. As noted in Sect. 1, white noise terms \(Z_k(n)\) for price returns and equity premia are almost perfectly correlated, since the Pearson correlation coefficient is greater than 99%. However, price returns and equity premia for the benchmark are not perfectly correlated. Thus noise terms for size and wealth processes are different.
2.6 Conclusions
Data analysis in previous subsections implies that excess return for small stocks is positive, and market exposure for small stocks is greater than 1. Large stocks have negative excess return and market exposure less than 1. In other words, small stocks are riskier than larger stocks, but they have higher returns even after adjusting for market exposure. We will accommodate this for continuous time in the next section.
3 A continuous-time model
3.1 Model description
Take a filtered probability space \((\Omega , {\mathcal {F}}, (\mathcal F_t)_{t \ge 0}, {\mathbb {P}})\) with the filtration satisfying the usual conditions (each \({\mathcal {F}}_t\) contains all \({\mathbb {P}}\)-null sets; that is, \(A \in {\mathcal {F}}_t\) with \({\mathbb {P}}(A) = 0\) and \(B \subseteq A\) implies \(B \in {\mathcal {F}}_t\); and the filtration is right-continuous, that is, \({\mathcal {F}}_t = \cap _{s > t}{\mathcal {F}}_s\) for all \(t \ge 0\)). All stochastic processes \(X = (X(t),\, t \ge 0)\) below are adapted, that is, X(t) is \(\mathcal F_t\)-measurable for every \(t \ge 0\). We say that a stochastic process \(B = (B(t),\, t \ge 0)\) is a standard Brownian motion if \(B(t) - B(s) \sim {\mathcal {N}}(0, t-s)\) is independent of \(\mathcal F_s\) for every \(0 \le s < t\).
Replace white noise terms \(Z_k(n)\) in (12) or (14) by standard Brownian motions \(W_k(t)\), or, more exactly, its differential \(\mathrm {d}W_k(t)\). Replace price returns \(Q_k(n)\) by \(\mathrm {d}\ln S_k(t)\). Recall that \(S_k(t)\) is the average market capitalization of the kth portfolio at time t. Replace equity premia \(P_k(n)\) with \(\mathrm {d}\ln V_k(t)\), where \(V_k(t)\) is the wealth (including reinvested dividends) of the kth decile, starting from \(V_k(0) = 1\), divided by wealth similarly computed from risk-free returns. Set three functions: \(\alpha , \beta , \sigma \) of \(C_k(t)\) (alpha, beta, and standard deviation of unsystematic risk). Set a separate function \(\alpha _*\) of \(C_k(t)\) for equity premia instead of price returns. We shall not set separate functions for \(\beta \) and \(\sigma \) since from our data analysis we found that these functions are the same for price returns and equity premia. We have n portfolios indexed by \(1, \ldots , n\), and the benchmark indexed with 0 (that is, total \(n+1\) portfolios). We write separate functions for price returns and equity premia (which we denote by asterisks). Consider the following system of SDE:
where \(C_k(t)\) is the relative size measure defined in the Introduction:
We also model \((S_0, V_0)\) using two-dimensional geometric Brownian motion: \((\ln S_0, \ln V_0)\) has drift vector \((g_S, g_V)\) and covariance matrix
That is, for \(t > s\) we have: \(\ln S_0(t) - \ln S_0(s) \sim \mathcal N(g_S(t-s), \sigma _S^2(t-s))\) and \(\ln V_0(t) - \ln V_0(s) \sim {\mathcal {N}}(g_V(t-s), \sigma ^2_V(t-s))\); here, \(\rho _0\) is the correlation between these two increments. Thus
where \(W_S\) and \(W_V\) are (correlated) standard Brownian motions. We assume \(W_1, \ldots , W_n\) are independent of \((B_S, B_V)\). Note that \(W_1, \ldots , W_n\) can be dependent of each other standard Brownian motions. We assume that \(W = (W_1, \ldots , W_n)\) is an n-dimensional Brownian motion with zero drift vector and covariance matrix \(\varvec{\Sigma }_W\) with units on the main diagonal.
Definition 1
For functions \(\alpha , \beta , \sigma : {\mathbb {R}} \rightarrow {\mathbb {R}}\), real numbers \(g_S, g_V\), \(2\times 2\) covariance matrix \(\varvec{\Sigma }_{(S, V)}\), and \(n\times n\) correlation matrix \(\varvec{\Sigma }_W\), the system of Eqs. (15), (16), (17), (18) is called a CAPM-size market model of \(N+1\) portfolios, indexed by \(0, \ldots , N\). The process \(S_k\) is called market size, or market cap, of the kth portfolio; and the process \(V_k\) is called the wealth process for this kth portfolio. The portfolio indexed by \(k = 0\) is called the benchmark. The functions \(\alpha \) and \(\beta \) are called by their Greek names. The function \(\sigma \) is called the standard error (of the diversifiable risk).
3.2 Existence and uniqueness
We can rewrite (15) using (17):
Remark 1
Each kth Eq. (15) is independent of other equations: Does not contain \(S_l\) for other \(l = 1, \ldots , n\). It depends only on \(S_k\) and \(S_0\).
Definition 2
Define the explosion time as follows:
where \({\mathcal {T}}_k\) is the explosion time for (15), or, equivalently, for (19).
On \([0, {\mathcal {T}})\), Eqs. (15) and (16) have a unique solution.
Theorem 1
-
(a)
If \(\alpha , \beta , \sigma , \alpha _* : {\mathbb {R}} \rightarrow {\mathbb {R}}\) are measurable and locally bounded, there exists a unique strong solution to the system of SDE (15), (16), (17), until the explosion time \({\mathcal {T}}\).
-
(b)
If, in addition, we have the following linear bound:
$$\begin{aligned} |\alpha (c)| + |\beta (c)| + |\sigma (c)| + |\alpha _*(c)| \le K(1 + |c|), \end{aligned}$$then the explosion time is infinite: \({\mathcal {T}} = \infty \).
Proof
(a) First, let us show existence and uniqueness in (19). The diffusion part in this equation is given by \({\tilde{\sigma }}(C_k(t))\,\mathrm {d}{\tilde{W}}_k(t)\), where
and \({\tilde{W}}_k\) is a standard Brownian motion which is a combination of \(W_S\) and \(W_k\). The drift part is given by \({\tilde{\gamma }}(C_k(t))\,\mathrm {d}t\), where
Thus we can rewrite (19) as follows:
These functions \({\tilde{\gamma }}\) and \({\tilde{\sigma }}\) are measurable, and we apply Zvonkin (1974) to prove strong existence and pathwise uniqueness of the solution \(C_k\) to this SDE, at least until the explosion time \({\mathcal {T}}_k\). Here we use Remark 1: the system of n equations \(C_1, \ldots , C_n\) consists of n independent one-dimensional SDE. Next, we can reconstruct \(S_1, \ldots , S_n\) from \(C_1, \ldots , C_n\) and \(S_0\): \(S_k(t) = S_0(t)e^{-C_k(t)}\). Since \(S_0\) is well-defined for infinite time horizon (as geometric Brownian motion), the strong solution \(S_k\) for (15) exists and is pathwise unique, too, at least until the explosion time \({\mathcal {T}}_k\). Finally, strong existence and pathwise uniqueness for (16) can be shown as follows. Let
Since \(\alpha _*, \beta , \sigma \) are locally bounded, they are bounded on \([-m, m]\). We can rewrite (16) as
with \({\overline{W}}_k\) a standard Brownian motion, and
These two functions from (24) are bounded on \([-m, m]\). Rewriting (23) as
we see that (strong) solution until \(t < {\mathcal {T}}_{k, m}\) is well-defined and pathwise unique, because the integrals (both usual and stochastic) are finite. As \(m \rightarrow \infty \), we have: \(\mathcal T_{k, m} \uparrow {\mathcal {T}}_k\) almost surely. Thus (16) have a pathwise unique strong solution until \({\mathcal {T}}_k\). Combining this with (20), we complete the proof of (a).
Next, (b) follows from (19) and (23) and estimates of linear growth for \({\tilde{\gamma }}\) and \({\tilde{\sigma }}\):
following from similar estimates for \(\alpha , \alpha _*, \beta , \sigma \). \(\square \)
4 Main results
4.1 Long-term stability results
Recall the definition of market weights.
Definition 3
For the market model in (15), we define market weights as follows:
These market weights sum up to 1 for every \(t \ge 0\), and are in one-to-one correspondence with \(C_1(t), \ldots , C_n(t)\): There exists a bijection \(\Phi : \triangle _n \rightarrow {\mathbb {R}}^n\) such that
The process \({\mathbf {C}} = (C_1, \ldots , C_n)\), as its each component, is a Markov process. The same is true for the market weight vector \(\mu = (\mu _0, \ldots , \mu _n)\).
Definition 4
We call a probability measure \(\pi _{\mu }\) on \(\triangle _n\) a stationary distribution if \(\mu (0) \sim \pi _{\mu }\) implies \(\mu (t) \sim \pi _{\mu }\) for all \(t \ge 0\). The market is called stable if this market weight vector has a unique stationary distribution \(\pi _{\mu }\), and when we start from another initial distribution \(\mu (0)\), then the distribution of \(\mu (t)\) converges to \(\pi _{\mu }\) as \(t \rightarrow \infty \) in the total variation (TV) norm:
We can similarly (and equivalently) define stability for the process \({\mathbf {C}} = (C_1, \ldots , C_n)\), which is \({\mathbb {R}}^n\)-valued. These definitions of stability are equivalent, since there is a one-to-one continuous mapping between C and \(\mu \).
Theorem 2
Under assumptions of Theorem 1, suppose
Then the market system is stable.
Proof
We apply the method of Lyapunov functions. As a Lyapunov function, take the following infinitely differentiable function \(V : {\mathbb {R}} \rightarrow [0, \infty )\):
This function can be constructed by smoothing kernel and convolution. The generator for each \(C_k\) in (15) is given by
Recall (21): \(\varlimsup \limits _{c \rightarrow \infty }{\tilde{\gamma }}(c) < 0\) and \(\varliminf \limits _{c \rightarrow -\infty }{\tilde{\gamma }}(c) > 0\). Combining this with (25), we get:
From the articles Meyn and Tweedie (1993a, 1993b), we get tightness of each \(C_k\). That is, \(\sup _{t \ge 0}{\mathbb {P}}(|C_k(t)| \ge c) \rightarrow 0\) as \(c \rightarrow \infty \). The same is true for the vector \({\mathbf {C}} = (C_1, \ldots , C_n)\). Combining this observation with the following property: for each i, \({\mathbb {P}}(a< C_i(t) < b\mid C_i(0) = x) > 0,\quad x, a, b \in {\mathbb {R}}\), we complete the proof. \(\square \)
Remark 2
We also have convergence
In particular, we have \({\mathbb {E}}[C_k(t)] \rightarrow m_C\), where \(m_C\) is the mean of the distribution \(\pi _C\).
Example 1
If \(\beta (c) = 1 + \gamma c\), \(\alpha (c) = \mu c\), then \(\alpha (c) + g_S(\beta (c) - 1) = (\mu + g_S\gamma )c\), and (25) is equivalent to
Example 2
From the data analysis in Sect. 2, let
Here, \(\alpha _{\pm }, \beta _{\pm }, \gamma _{\pm }, \sigma _{\pm } > 0\). Let us find conditions for (25): For the first condition, if \(\gamma _+ < 1\), we have \(g_S\beta _+ > 0\); if \(\gamma _+ = 1\), we have \(\alpha _+ + g_S\beta _+ > 0\); if \(\gamma _+ > 1\), we have \(\alpha _+ > 0\). Similarly for the second condition in (25). The actual estimates from Sect. 2 satisfy these conditions.
4.2 Hitting times
We wish to allow \(S_i\) and \(S_0\) to exchange ranks. That is, we want to allow for \(S_i(0) < S_0(0)\) but \(S_i(t) > S_0(t)\) for some \(t > 0\), or vice versa. This is consistent with real world market behavior, when portfolios exchange ranks based on size. In terms of relative size measures \(C_i\), we wish that \(C_i\) can move from positive half-line to negative half-line. In particular, it must hit zero with positive probability.
This is not true for Example 1 if \(\sigma (c) = \rho c\) for \(\rho > 0\). Indeed, then \(C_i\) is a geometric Brownian motion with drift \(-\varGamma \) and thus converges to 0 almost surely as \(t \rightarrow \infty \). Thus the limiting stationary distribution is the delta measure at the origin, \(\delta _{(0, \ldots , 0)}\). The corresponding limiting distribution for \(\mu \), the market weight vector is \(\delta _{(1/(n+1), \ldots , 1/(n+1))}\). If \(\sigma (c)\) is bounded away from zero, this changes the behavior of \({\mathbf {C}}\).
Theorem 3
Under conditions of Theorem 2, assume
-
(a)
For every \(c \in {\mathbb {R}}\) with positive probability there exists a \(t > 0\) such that \(C_i(t) = c\).
-
(b)
The stationary distribution \(\pi _C\) for \({\mathbf {C}} = (C_1, \ldots , C_n)\) has support on \({\mathbb {R}}^n\). The stationary distribution \(\pi _{\mu }\) for \(\mu \) has support on \(\Delta _n\).
Proof
Return again to the Eq. (22), which is a simplified Eq. (19).
-
(a)
Compute the scale function s for the diffusion \(C_k\): Its derivative is
$$\begin{aligned} s'(c) = \exp \left[ -2\int _0^c\frac{{\tilde{\gamma }}(u)}{{\tilde{\sigma }}^2(u)}\,\mathrm {d}u\right] . \end{aligned}$$There exist \(\gamma _* > 0\) and \(c_* > 0\) such that
$$\begin{aligned} {\tilde{\gamma }}(u) \le -\gamma _*,\, c \ge c_*;\quad {\tilde{\gamma }}(u) \ge \gamma _*,\, c \le -c_*. \end{aligned}$$Moreover, \({\tilde{\sigma }}(u) \ge \sigma (u) \ge \sigma _*\) for all \(u \in {\mathbb {R}}\). Thus for \(c \ge c_*\),
$$\begin{aligned} s'(c) \ge s'(c_*)\exp \left[ 2\int _{c_*}^c\frac{\gamma _*}{\sigma _*^2}\,\mathrm {d}c\right] = s'(c_*)\exp \left[ 2(c-c_*)\frac{\gamma _*}{\sigma _*^2}\right] . \end{aligned}$$(28)A similar estimate is true for \(c \le -c_*\):
$$\begin{aligned} s'(c) \le s'(-c_*)\exp \left[ 2(|c|-c_*)\frac{\gamma _*}{\sigma _*^2}\right] . \end{aligned}$$(29)The speed measure has bounded Lebesgue density \(1/(s'(c)\sigma ^2(c))\), as shown in (28), (29), (27). Apply Feller’s test from Karatzas and Shreve (1998), Chapter 5, Section 5 to complete the proof.
-
(b)
The statement for \(\pi _C\) follows from ellipticity of the elliptic partial differential equation governing the Lebesgue density of this stationary distribution. The statement for \(\pi _{\mu }\) follows from one-to-one mapping between \({\mathbf {C}}\) and \(\mu \).
\(\square \)
It seems to us that (27) is a reasonable assumption, since we would want to allow for exchange of ranks of portfolios. This does not contradict our statistical analysis, since we can observe only \(C_i(t) > c_+\) or \(C_i(t) < -c_-\). Our suggested functions \(\sigma (c)\) do give \(\sigma (0) = 0\) if we extend them to \([-c_-, c_+]\) as is. But we could not observe \(C_i(t)\) in a neighborhood of zero, thus we can extend it as a piecewise function.
Under assumptions of Theorem 3, the stationary distribution for each \(C_i\) has (after normalization) density as above: \(1/(s'\sigma ^2)\), which is supported on the whole real line but is bounded. This is different from the case \(\sigma (c) = \sigma _0c\) discussed above in Example 1, when the stationary distribution is concentrated at one point, but the components of the stationary distribution for the overall vector C are not independent since the SDE for individual relative size measures are dependent.
4.3 Lack of propagation of chaos
For interacting particle systems, sometimes dependence (as a process itself, or stationary distribution) vanishes as the number of particles tends to infinity. This phenomenon is called propagation of chaos, since the system becomes less interdependent and more chaotic. Such results were shown for competing Brownian particles and volatility-stabilized models (see citations in the Introduction). But it is unreasonable to expect this for the current system since all particles are dependent. The limiting density, if it exists, will likely be a solution to a stochastic partial differential equation. To derive this large system limit is left for future research.
5 Capital distribution curve
5.1 Modified plots
Let us study the capital distribution curve \((\ln k, \ln \mu _{(k)}(t))\) in this model. We solve the system of stochastic differential equations explicitly. We use the values of \(C_k(t)\) to plot the capital distribution curve:
Thus the ranking of \(C_k(t)\) reverts the ranking of ranked market weights:
Thus we can plot the modified curve \((\ln k, C_{(k)}(t))\). If this curve is linear, the same can be said for the original capital distribution curve. Now we shall study the system of stochastic differential equations (19) and plot \((\ln k, C_{(k)}(t))\) for fixed t.
5.2 Degenerate case
Even if the system is stable, under conditions of Theorem 2, capital distribution curve can be degenerate, equal to one point: \((1/(n+1), \ldots , 1/(n+1))\). This is the case when \(C_k(t) \rightarrow 0\) a.s. as \(t \rightarrow \infty \) for each \(k = 1, \ldots , n\). Indeed, in this case \(\mu _k(t) \rightarrow 1/(n+1)\) a.s. as \(t \rightarrow \infty \). In particular, this is true in Example 1 with \(\sigma (c) = \rho c\) for \(\rho > 0\). Below, we consider the case when the capital distribution curve is not trivial.
5.3 Linear case
This is the case when
Then the system (19) is linear:
As shown in Karatzas and Shreve (1998), Chapter 5, Section 6.C we can solve this system explicitly:
Below, we show the simulation results for \(n = 100\), with initial conditions \(C_k(0) = 0\), \(k = 1, \ldots , n\). We take estimated values \(\mu = 0.0069\), \(\gamma = 0.0045\). For \(\rho \), we take the value 0.1, which is consistent with estimates. Finally, estimates for mean \(g_S\) and standard deviation \(\sigma _S\) of monthly price returns for the benchmark (top decile) are given by \(g_S = 0.0044\) and \(\sigma _S = 0.0541\). We simulate until \(t = 100\), assuming all \(W_1, \ldots , W_n, W_S\) are independent Brownian motions.
6 Conclusions
We developed a model in this article which can be viewed as an enhancement of the Capital Asset Pricing Model, which stresses dependence of stock portfolios upon the overall market. The portfolios are based on size (market cap), and the quantities \(\alpha \), \(\beta \), and standard error \(\sigma \) (of diversifiable idiosyncratic risk) depends on size (more precisely, size of portfolio relative to the size of benchmark). Thus we wrote a system of stochastic differential equations.
We write separately systems of equations for equity premia (total returns, including dividends, minus risk-free returns), and for market size (that is, price returns). They are very similar and the idiosyncratic risk can be taken the same. The equity premium and price returns of the benchmark are driven by different random processes (although correlated) (Fig. 4).
Using CRSP 1926–2020 monthly data of size deciles, we find reasonable guesses for \(\alpha \), \(\beta \), \(\sigma \) as functions of relative size. Our statistical analysis is not fully rigorous, because it fails white noise tests. However, we do find some reasonable results.
On the theoretical side, we prove long-term stability results: under some conditions on \(\alpha , \beta , \sigma \), the vector of relative size measures converges as \(t \rightarrow \infty \) to a stationary distribution.
Finally, an important feature of real-world markets: stability and linearity of the capital distribution curve, is reproduced in our model by numerical simulation.
Future research can include making more sophisticated time series models which take into account autocorrelations, or non-Gaussian fluctuations of the market. It seems important to develop the SPT for the case of dividends, when price and total returns (and by extension market capitalizations and wealth processes) are different. Finally, we would like to derive a large system limit.
References
Banner, A.D., Fernholz, D.: Short-term relative arbitrage in volatility-stabilized markets. Ann Finance 4(4), 445–454 (2008)
Banner, A.D., Fernholz, E.R., Karatzas, I.: Atlas models of equity markets. Ann Appl Probab 15(4), 2330–2996 (2005)
Banner, A.D., Fernholz, E.R., Ichiba, T., Karatzas, I., Papathanakos, V.: Hybrid Atlas models. Ann Appl Probab 21(2), 609–644 (2011)
Banz, R.W.: The relationship between return and market value of common stocks. J Financ Econ 9(1), 3–18 (1981)
Barnes, C., Sarantsev, A.: A note on jump Atlas models. Braz J Probab Stat 34(4), 844–857 (2020)
Chatterjee, S., Pal, S.: A phase transition behavior for Brownian motions interacting through their ranks. Probab Theory Rel Fields 147(1), 123–159 (2010)
Fama, E.F., French, K.R.: Common risk factors in the returns on stocks and bonds. J Financ Econ 33(1), 3–56 (1993)
Fama, E.F., French, K.R.: The capital asset pricing model: theory and evidence. J Econ Perspect 18(3), 25–46 (2004)
Fernholz, E.R.: Stochastic Portfolio Theory. Applications of Mathematics, vol. 48. New York: Springer (2002)
Fernholz, E.R., Karatzas, I.: Relative arbitrage in volatility-stabilized markets. Ann Finance 1(2), 149–177 (2005)
Fernholz, E.R., Karatzas, I.: Stochastic Portfolio Theory: An Overview. Mathematical Modeling and Numerical Methods in Finance, Handbook of Numerical Analysis, pp. 89–168 (2009)
Fernholz, E.R., Karatzas, I., Kardaras, C.: Diversity and relative arbitrage in equity markets. Finance Stoch 9(1), 1–27 (2005)
Jourdain, B., Reygner, J.: Capital distribution and portfolio performance in the mean-field Atlas model. Ann Finance 11(2), 151–198 (2015)
Karatzas, I., Sarantsev, A.: Diverse market models of competing Brownian particles with splits and mergers. Ann Appl Probab 26(3), 1329–1361 (2016)
Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics, vol. 113. New York: Springer (1998)
Meyn, S.P., Tweedie, R.L.: Stability of Markovian processes II: continuous-time processes and sampled chains. Adv Appl Probab 25(3), 487–517 (1993)
Meyn, S.P., Tweedie, R.L.: Stability of Markovian processes III: Foster–Lyapunov criteria for continuous-time processes. Adv Appl Probab 25(3), 518–548 (1993)
Pal, S.: Analysis of market weights under volatility-stabilized market models. Ann Appl Probab 21(3), 1180–1213 (2011)
Pal, S., Wong, T.-K.L.: The geometry of relative arbitrage. Math Financ Econ 10(3), 263–293 (2016)
Pickova, R.: Generalized volatility-stabilized processes. Ann Finance 10(1), 101–125 (2014)
Semenov, A.: The small-cap effect in the predictability of individual stock returns. Int Rev Econ Finance 38, 178–197 (2015)
Sharpe, W.F.: Capital asset prices: a theory of market equilibrium under conditions of risk. J Finance 19(3), 425–442 (1964)
Shkolnikov, M.: Competing particle systems evolving by interacting Lévy processes. Ann Appl Probab 21(5), 1911–1932 (2011)
Shkolnikov, M.: Large systems of diffusions interacting through their ranks. Stoch Process Appl 122(4), 1730–1747 (2012)
Shkolnikov, M.: Large volatility-stabilized markets. Stoch Process Appl 123(1), 212–228 (2013)
van Dijk, M.A.: Is size dead? A review of the size effect in equity returns. J Bank Finance 35(12), 3263–3274 (2011)
Zvonkin, A.K.: A transformation of the phase space of a diffusion process that will remove the drift. Math Sbornik 22(1), 129–149 (1974)
Acknowledgements
We thank the Department of Mathematics and Statistics at our University of Nevada, Reno, for welcoming and supportive atmosphere and for fostering research collaboration between faculty and students (undergraduate and graduate). We thank the referees for useful remarks and positive responses.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Statistical analysis of size-based index funds
Appendix: Statistical analysis of size-based index funds
These size deciles of the CRSP universe are not directly investable. But there exist size-based funds available for individual investors. Among many of them, let us take JKJ, JKG, JKD: iShares Morningstar Small-Cap, Mid-Cap, and Large-Cap exchange-traded funds. These are based on largest 70%, next 20%, and next 7% of the total universe of stocks. In other words, Large-Cap corresponds to Deciles 1–7 weighted by their market capitalizations, Mid-Cap corresponds to Deciles 8–9 weighted by their market capitalizations, and Small-Cap corresponds to the top 7% of the bottom Decile 10.
Monthly total arithmetic returns for these funds are taken from BlackRock web site, July 2004 – August 2020. For risk-free returns, we take 1-month Treasury Constant Maturity Rate from Federal Reserve Economic Data web site, observed at the last day of each month June 2004 – July 2020. We compute geometric versions of these returns. From each such rate r, we obtain geometric total monthly returns for the next month \(\ln (1 + r/1200)\). Then we compute equity premia \(P_S, P_M, P_L\) for these funds. Regress the first two upon the third:
The quantile–quantile plots, Shapiro–Wilk and Jarque–Bera normality tests, and autocorrelation function plots allow us to assume that each series of residuals can be modeled by i.i.d. normal distribution. Thus we can apply standard Student tests for regression coefficients. The 95% confidence intervals for each of \(\alpha _S\) and \(\alpha _M\) contain zero. Thus we can assume that \(\alpha _S = \alpha _M = 0\), but the confidence intervals for \(\beta _S\) and \(\beta _M\) do not contain 1. Point estimates of these coefficients are: \(\beta _S = 1.27\), \(\beta _M = 1.15\). Estimates for standard errors for residuals, and cross-correlation between residuals are: \(\sigma _S = 0.026\), \(\sigma _M = 0.019\), \(\rho = 0.83\). The \(R^2\) values for each regression are 87% and 80%. Thus we see that the CAPM works for actual traded size-based funds.
Rights and permissions
About this article
Cite this article
Flores, B., Ofori-Atta, B. & Sarantsev, A. A stock market model based on CAPM and market size. Ann Finance 17, 405–424 (2021). https://doi.org/10.1007/s10436-021-00390-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10436-021-00390-8
Keywords
- Capital asset pricing model
- Stochastic differential equations
- Capital distribution curve
- Stochastic stability
- Market weight