Investing under uncertainty

The best approach to portfolio optimisation depends on the investment environment. In an uncertain world where predictive power is lacking, the best strategy is to hold assets in equal amounts. That is old wisdom, advocated in ancient times to households in Babylon (Footnote 1) and since shown to be sound by researchers, e.g., Bera and Park (2008). In an environment where all stochastic variables are uniformly distributed, so that all outcomes are equally likely, equal exposure to all of them is optimal. In that situation, the vulnerability to unfavourable events is minimised. Note that risk refers to unforeseen circumstances in this context and that optimality is reached when the invested capital is best protected against them.

Interestingly, equal weighting is not considered risk optimal by the standards of modern finance theory. An equally weighted portfolio generally does not offer the return-to-risk trade-off needed to lie on the efficient frontier (Sharpe 1964). According to Markowitz’ (1952) modern portfolio theory (MPT), the risk-minimal portfolio is the fully invested portfolio with the lowest price variance, which tends to be a concentrated portfolio, as, e.g., Clarke et al. (2011) make apparent. Note that no allusion is made to unforeseen circumstances here and that optimality is reached when the price variance of the portfolio is minimal.

The divide between the two understandings of risk optimality is colossal. As Pola (2016), among others, explains, the divide is marked by uncertainty, which in concrete terms means that tomorrow’s asset price volatility and correlation are not known for certain. MPT provides no framework for dealing with uncertainty. The theory holds in a world where asset prices strictly obey predefined laws. We thus observe that under complete certainty mean–variance efficiency should be the portfolio optimisation objective, whereas under complete uncertainty diversification must be sought in the portfolio.

The two propositions, perfect foresight on model parameters and none at all, are both highly hypothetical. It is more plausible to think of today’s investment markets as an environment where risks are partially foreseeable. Asset prices move in line with model parameters, or they do not when paradigms change, giving rise to both foreseeable and unforeseen price shocks. An investor seeks in principle to minimise losses due to unfavourable price shocks, foreseen or not, and should therefore, we argue, pursue both risk objectives: seek protection as far as unforeseeable risk goes and invest efficiently where risk is foreseeable.

We develop this argument in this article. We relax the assumption of risk certainty that is central to MPT and derive a more general formulation of the portfolio problem with a double risk objective. We find that this objective captures what a series of portfolio construction techniques, developed as alternatives to mean–variance optimisation, aim to address. They include entropy-based optimisation, Bayesian optimisation, stability-adjusted optimisation, risk parity, resampling, robust optimisation and covariance shrinkage. Our demonstration provides new theoretical backing for those techniques.

Generalising modern portfolio theory

To focus thought, we first develop the special case of the portfolio optimisation problem where the primary objective is to minimise risk rather than to seize performance opportunities. An investor wants to limit the probability of incurring a loss while fully invested. Such passive investment strategies are commonly pursued in practice. Without an explicit return objective, optimality is attained according to MPT when the price variance of the portfolio is minimal.

Whether a low price variance actually serves the investment purpose in this setting is questionable. Qian (2011), among others, raises the point with the following concrete example. For an allocation over equities and bonds, for which he assumes volatility levels of 15% and 5%, respectively, and a correlation of 0.2, the MPT optimum lies at 5–95. It goes against intuition that such a concentrated portfolio is risk optimal, he notes, and a lesser concentration would feel less risky.
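The 5–95 figure can be checked with the closed-form minimum-variance weight for two assets. The subscripts E (equities) and B (bonds) and the symbol ρ for the correlation are notation introduced here for illustration only.

$$x_{E} = \frac{{\sigma_{B}^{2} - \rho \sigma_{E} \sigma_{B} }}{{\sigma_{E}^{2} + \sigma_{B}^{2} - 2\rho \sigma_{E} \sigma_{B} }} = \frac{0.0025 - 0.0015}{0.0225 + 0.0025 - 0.0030} = \frac{0.0010}{0.0220} \approx 4.5\% ,\quad x_{B} = 1 - x_{E} \approx 95.5\%$$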

We formalise this contention. We generalise Markowitz’ definition of utility U, which is based on price variance, to the one given in Eq. (1). As far as foreseeable risk goes, optimality continues to be defined by the variance of the portfolio (x), specified by the first term in the equation, where V denotes the covariance matrix of asset prices. As to unforeseen risk, optimality is defined by diversification, which is specified by the second term. Diversification is measured in (1) by the sum of the logarithms of the portfolio weights, e being a vector of ones. The reader may verify that, for a fully invested portfolio, this sum is largest when all (N) weights are held in equal amounts, 1/N, which is why the term enters the utility with a positive sign.

$$\hbox{max} .\quad U\left( {x,\theta } \right):\quad - x^{T} Vx + \theta \cdot e^{T} \ln \left( x \right)$$
(1)

The importance given to diversification over price variance is set by θ. This parameter expresses the level of uncertainty, called the entropy in the sciences. It can be zero (complete certainty), in which case the function collapses back to the initial MPT setting with a single risk objective, or θ can go to infinity (total uncertainty), at which point the optimisation objective reduces to one of diversification only. From an investor’s point of view, θ expresses a preference for protection against the unknown over risk efficiency as defined by Markowitz.
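The two limiting cases can be illustrated numerically on the equity-bond example. The sketch below is a minimal illustration, not the author's procedure: it assumes a fully invested two-asset portfolio and a utility of the form U = −xᵀVx + θ·eᵀln(x), that is, with the log-weight term counted so that more evenly spread weights raise utility. The function names and the bisection routine are hypothetical helpers.

```python
# Two-asset example from the text: equities (15% vol), bonds (5% vol), correlation 0.2.
VOL_EQ, VOL_BD, CORR = 0.15, 0.05, 0.2
A = VOL_EQ ** 2                 # equity variance
B = VOL_BD ** 2                 # bond variance
C = CORR * VOL_EQ * VOL_BD      # covariance


def utility_slope(w, theta):
    """Slope of U(w) = -variance(w) + theta * (ln w + ln(1 - w)) in the equity weight w."""
    variance_slope = 2 * A * w + 2 * C * (1 - 2 * w) - 2 * B * (1 - w)
    log_slope = 1 / w - 1 / (1 - w)
    return -variance_slope + theta * log_slope


def optimal_equity_weight(theta, tol=1e-12):
    """Bisection on the slope: U is concave in w, so its slope is decreasing."""
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if utility_slope(mid, theta) > 0:
            lo = mid      # utility still rising: optimum lies to the right
        else:
            hi = mid      # utility falling: optimum lies to the left
    return 0.5 * (lo + hi)


for theta in (0.0, 0.001, 0.01, 1.0):
    print(theta, round(optimal_equity_weight(theta), 4))
```

Under these assumptions, θ = 0 reproduces the minimum-variance weight, and the equity weight drifts towards one half as θ grows.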

For an intermediate level of uncertainty, 0 < θ < ∞, both objectives come into play. Roncalli (2013) proves that in that situation the optimum is reached when the portfolio is in risk parity, that is, when all holdings contribute equally to the overall price variance. His proof is similar to the one Scherer (2007) gives for the maximum Sharpe portfolio. Scherer shows that a portfolio attains the highest return-to-risk ratio when the marginal contribution to risk equals the marginal contribution to return for all holdings. In the same manner, a portfolio attains the highest diversification-to-risk ratio when the marginal contributions to risk equal the marginal contributions to diversification and thus when the total contributions equalise (c is a positive scalar). Formally,

$$\frac{{\partial x^{T} V x}}{\partial x} = Vx \triangleq \frac{{\partial e^{T} \ln \left( x \right)}}{\partial x} = x^{ - 1} \Leftrightarrow \left( {Vx} \right) \cdot \left( x \right) = c$$
(2)
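The parity condition in Eq. (2) can be solved numerically. The following is a sketch under stated assumptions rather than the method used in the cited works: it applies cyclical coordinate descent to the auxiliary problem of minimising ½·xᵀVx − Σ ln(xᵢ), whose first-order condition is xᵢ·(Vx)ᵢ = 1 for every holding, and then rescales the solution to be fully invested. The function names are hypothetical.

```python
import math


def risk_parity_weights(cov, sweeps=500):
    """Cyclical coordinate descent on 0.5 * x'Vx - sum(ln x_i), then rescale to sum to one."""
    n = len(cov)
    x = [1.0 / math.sqrt(cov[i][i]) for i in range(n)]  # start from inverse volatilities
    for _ in range(sweeps):
        for i in range(n):
            # First-order condition in x_i: V_ii * x_i^2 + b * x_i - 1 = 0; take the positive root.
            b = sum(cov[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (-b + math.sqrt(b * b + 4.0 * cov[i][i])) / (2.0 * cov[i][i])
    total = sum(x)
    return [xi / total for xi in x]


def risk_contributions(cov, w):
    """Total contribution of each holding to portfolio variance: w_i * (V w)_i."""
    n = len(cov)
    return [w[i] * sum(cov[i][j] * w[j] for j in range(n)) for i in range(n)]


# Equity-bond example from the text: volatilities 15% and 5%, correlation 0.2.
cov = [[0.15 ** 2, 0.2 * 0.15 * 0.05],
       [0.2 * 0.15 * 0.05, 0.05 ** 2]]
weights = risk_parity_weights(cov)
print(weights)
print(risk_contributions(cov, weights))
```

Rescaling multiplies every contribution by the same factor, so the equality of contributions survives the normalisation.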

The importance of this finding for portfolio theory seems to have gone unremarked in the literature. It answers the question Lee (2011) raises: “What exactly does a risk parity portfolio try to achieve? (…) The objective function (or lack thereof), ex ante, and the performance evaluation, ex post, are inconsistent”. And he states: “To date we have not identified one theory that predicts, ex ante, that (…) risk-based portfolios should be more efficient than other portfolios. If such a theory indeed exists, it would represent a profound finding”. We offer a lead: risk parity is the solution to the Markowitz optimisation problem where the certainty hypothesis is relaxed.

In fact, when Qian (2006, 2011), among others, introduced risk parity investing in the mid-2000s, he gave the intuition of minimising financial loss that lies behind it. Though the arguments and empirical evidence he gave are convincing, his argumentation lacks formal proof, as Lee (2011) rightly points out. In the investment example Qian put forward, he shows the risk parity portfolio to be more reasonable than the MPT optimum of 5–95. The risk parity portfolio is 25% invested in equities and 75% in bonds since \(\left( {15\% \cdot 0.25} \right)^{2} = \;\left( {5\% \cdot 0.75} \right)^{2}\), as the reader can verify.

Note that imposing risk parity produces one particular portfolio that is optimal with respect to (1), the tangency portfolio. That is to say, risk parity is not the only solution to the generalised MPT problem. First, there is a continuum of optimal portfolios lying on a frontier spanned in diversification and risk space. Second, diversification can be specified in many ways, each leading to a different frontier of solutions. Adopting Rao’s (1982) squared entropy measure, for example, as do Carmichael et al. (2015a), who explore an entropy-based optimisation approach, gives the following problem objective:

$$\hbox{max} .\quad U\left( {x,\theta } \right):\quad - x^{T} Vx - \theta \cdot x^{T} x$$
(3)

The covariance shrinkage approach introduced by Ledoit and Wolf (2003) can be recognised in this formulation. They minimise the portfolio variance measured by a matrix in which the covariance levels are shrunk relative to the variance levels, i.e., (V + θ·I), where I is the identity matrix and θ > 0. We thus make evident that covariance shrinkage yields an optimal solution to the generalised MPT problem as well. In a broader sense, all methods that play down covariance levels and thereby build more diversified portfolios fall into this category. We elaborate on this in the literature review in the next section.
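The link between Eq. (3) and shrinkage can be made concrete: maximising −xᵀVx − θ·xᵀx under full investment is the same as minimising xᵀ(V + θ·I)x, whose solution is proportional to (V + θ·I)⁻¹e. The sketch below is a minimal illustration assuming only the full-investment constraint (short positions are not ruled out); the helper names are hypothetical, and the linear solver is a plain Gauss-Jordan elimination included to keep the example self-contained.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for k in range(col, n + 1):
                    m[r][k] -= f * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def shrunk_min_variance(cov, theta):
    """Fully invested minimiser of x'(V + theta*I)x, i.e. x proportional to (V + theta*I)^-1 e."""
    n = len(cov)
    shrunk = [[cov[i][j] + (theta if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    raw = solve(shrunk, [1.0] * n)
    total = sum(raw)
    return [r / total for r in raw]


# Equity-bond example from the text: volatilities 15% and 5%, correlation 0.2.
cov = [[0.0225, 0.0015],
       [0.0015, 0.0025]]
for theta in (0.0, 0.01, 0.1, 1e6):
    print(theta, [round(w, 4) for w in shrunk_min_variance(cov, theta)])
```

With θ = 0 the function returns the minimum-variance portfolio; as θ grows, the added diagonal dominates and the weights approach equal weighting.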

We now reintroduce into the objective function the return objective that was left aside at the beginning of the section. Inserting it into Eq. (1) results in the objective function given in (4), where, as can be noted, Markowitz’ risk-aversion parameter λ reappears.

$$\hbox{max} .\quad U\left( {x,\lambda ,\theta } \right):\;\;R^{T} x - \lambda \cdot x^{T} Vx + \theta \cdot e^{T} \ln \left( x \right)$$
(4)

The frontier of optimal solutions to this problem has a three-dimensional convex form. Again, under complete certainty \(\left( {\theta = 0} \right)\), the equation collapses back to the Markowitz problem, and under complete uncertainty \(\left( {\theta \to \infty } \right)\), equal weighting is optimal. The tangency portfolio that can be derived corresponds to the return–risk parity solution, for which the holdings contribute equally in terms of return, variance and diversification. Let us see how this portfolio compares with the MPT optimum in the equity-bond allocation example discussed above. Supposing an annual yield of 6% for equities and 2% for bonds, the parity allocation is close to 40–60, as the reader may verify, and the MPT optimum lies at 25–75. The lesser concentration of the parity portfolio should lead to a better preservation of portfolio value, we infer.
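The 25–75 figure for the MPT optimum can be reproduced if it is read as the maximum Sharpe portfolio with a zero risk-free rate, which is an assumption made here for the check. The weights are then proportional to \(V^{ - 1} R\), with V built from the volatilities of 15% and 5% and the correlation of 0.2:

$$x \propto V^{ - 1} R = \frac{1}{\det V}\left( {\begin{array}{*{20}c} {0.0025} & { - 0.0015} \\ { - 0.0015} & {0.0225} \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {0.06} \\ {0.02} \\ \end{array} } \right) = \frac{1}{\det V}\left( {\begin{array}{*{20}c} {0.00012} \\ {0.00036} \\ \end{array} } \right)\quad \Rightarrow \quad x = \left( {25\% ,\;75\% } \right)$$

Both assets carry the same return-to-volatility ratio in this example (6/15 = 2/5 = 0.4), which is why the result coincides with the weights that are inversely proportional to volatility.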

Asness et al. (2012) show this to be the case. They compare an equity-bond allocation that is in parity with one that is mean–variance optimal over a long period and find that the former outperforms the latter. To us, what their tests reveal is that uncertainty, or entropy, is not nil in the capital markets but lies close to the level where risk parity is the optimal investment strategy. The authors report that investors prefer a 60–40 equity-bond allocation in practice, which is closer to risk parity than to the mean–variance optimum. They explain this preference by investors’ aversion to taking leverage. Investors prefer assets with lower risk-adjusted returns, in their view.

We note that the auxiliary diversification objective is relevant for passive long (Footnote 2) investments, where the objective is to preserve capital value while reaping a general risk premium. The objective becomes irrelevant as soon as convictions get involved. For active investors who want to place tactical bets, the purpose of the portfolio optimisation is to implement those bets efficiently while taking the minimum of unintended risk. It is coherent for those investors to assume complete certainty about the risk parameters and make standard Markowitz optimisations, or alternatively take a Black–Litterman (1992) approach, in which uncertainty about returns can be expressed and dealt with.

Portfolio theory literature

Awareness that mean–variance portfolio optimisation falls short in certain circumstances is long-standing; however, it seems to remain unclear what exactly the matter is. One of the most repeated criticisms in the literature is that Markowitz optimisation is vulnerable to estimation error. Bawa and Klein (1976) made the point that portfolios which are risk-minimised ex ante may not be so ex post if the risk estimates turn out to be wrong. Michaud (1989) went as far as calling portfolio optimisers “error maximisers”. He showed that small errors in the risk estimates may provoke big changes in the optimised portfolio.

But is estimation error the heart of the matter? Using the term suggests that the shortcomings of modern portfolio theory lie in its application, when the theory is put into practice. We rather view the absence of uncertainty from the theory as the key issue. The postulate that asset price movements are stationary processes that obey predefined laws is very strong, and qualifying deviation from those laws as error is, to us, a misconception. Taking the view that price movements cannot be fully anticipated ex ante and integrating that view into the definition of investment risk changes the concept. The heart of the problem is, to us, the absence of uncertainty, which is not to be confused with estimation error.

Relaxing the certainty hypothesis is the starting point of the philosophy behind the Bayesian optimisation approach developed by Brown (1976). In this approach, the risk parameters are estimated so as to limit the loss from the sub-optimality that would arise if the estimates turned out to be wrong. The parameters are Bayesian-adjusted towards a prior with respect to which sub-optimality is defined. This methodology has been tested by many and shown to work well empirically, by Jorion (1986) among others, and can be associated with the generalised modern portfolio theory that we put forward in this article.

Scherer (2007) shows that the portfolios produced through Bayesian optimisation tend to be less concentrated than their certainty equivalents built through mean–variance optimisation. Adjusting risk parameters towards a prior, provided it is well chosen, leads de facto to lower correlation parameters and thereby to more diversified portfolios, he explains. Incorporating a sense of loss into the utility function directly, as we do, is a less circuitous way to arrive at the same point.

The way we define the portfolio optimisation problem in Eq. (4) has already been proposed in the literature in different formats, namely by Bera and Park (2008) and by DeMiguel et al. (2009). The former maximise diversification measured by entropy under the constraint of attaining a minimum level of risk-adjusted return. The latter do the inverse, in that they maximise risk-adjusted return while constraining the diversification measured by a portfolio norm. Both measures, entropy and norm, are defined on the portfolio weights. Without citing each other’s work, they come to essentially the same point, we note.

A number of optimisation methods have been proposed in the literature that augment portfolio diversification in an ad hoc way, such as the resampling technique (Michaud 1998), stability-adjusted optimisation (Kritzman and Turkington 2016) and robust optimisation (Tütüncü and Koenig 2004). Resampling means taking average weights over a set of portfolios that are optimised on different risk estimates. Stability adjusting means taking average parameter estimates while varying the time window and frequency of the data. In the same spirit, in a robust optimisation a portfolio is optimised not with respect to one plausible risk model but with respect to a set of models that are equally plausible. In all three methods, the assumption of risk certainty is relaxed in a sense, in an indirect manner.
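The idea of averaging weights over portfolios optimised on different risk estimates can be sketched as follows. This is a deliberately simplified toy, not Michaud's procedure: it assumes that alternative risk estimates are generated by perturbing the volatilities and the correlation of the equity-bond example with uniform noise, and that each portfolio is a long-only two-asset minimum-variance portfolio. The function names, the noise level and the number of draws are all hypothetical choices.

```python
import random


def min_variance_equity_weight(vol_eq, vol_bd, corr):
    """Closed-form minimum-variance weight of the first asset in a two-asset portfolio."""
    cov = corr * vol_eq * vol_bd
    return (vol_bd ** 2 - cov) / (vol_eq ** 2 + vol_bd ** 2 - 2 * cov)


def resampled_equity_weight(vol_eq, vol_bd, corr, noise=0.3, draws=2000, seed=0):
    """Average the long-only optimum over randomly perturbed risk estimates."""
    rng = random.Random(seed)   # fixed seed so the illustration is reproducible
    total = 0.0
    for _ in range(draws):
        ve = vol_eq * (1 + rng.uniform(-noise, noise))
        vb = vol_bd * (1 + rng.uniform(-noise, noise))
        rho = max(-0.9, min(0.9, corr + rng.uniform(-noise, noise)))
        w = min_variance_equity_weight(ve, vb, rho)
        total += min(1.0, max(0.0, w))   # clip to the long-only range
    return total / draws


point_estimate = min_variance_equity_weight(0.15, 0.05, 0.2)
resampled = resampled_equity_weight(0.15, 0.05, 0.2)
print(round(point_estimate, 4), round(resampled, 4))
```

Setting `noise=0.0` switches the perturbation off, in which case the average coincides with the point-estimate optimum.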

We underline the importance of defining diversification on the basis of intrinsic asset characteristics such as weights, expressly not on risk parameters. It is easy to understand why. As soon as prices deviate from the modelled parameters, diversification becomes sub-optimal by its own definition and the protection less effective with it. The purpose of diversification in the context of portfolio optimisation as we define it is expressly to counter non-modelled price behaviour and must for that reason not be defined upon it.

The diversification measure proposed by Meucci (2009), based on the eigenvectors of the covariance matrix, would be inappropriate to use in a portfolio optimisation context, we argue, as is the diversification ratio (DR), given in Eq. (5), introduced by Choueifaty and Coignard (2008). Carmichael et al. (2015b) make the same observation and suggest using the squared entropy measure instead in the numerator. Doing that leads to the objective function which the risk parity investment strategy aims to optimise, as demonstrated in the previous section.

$$\hbox{max} .\quad DR:\;\frac{{\sigma^{T} x}}{{\sqrt {x^{T} V\,x} }}\quad {\text{where }}\sigma {\text{ is a vector of asset price volatilities.}}$$
(5)
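To make the dependence of Eq. (5) on the risk parameters explicit, the ratio can be evaluated for a few allocations of the equity-bond example used earlier. The sketch below is a minimal illustration; the function name and the choice of allocations are hypothetical. Note that every input other than the weights is a modelled risk parameter, which is the property the text objects to.

```python
import math


def diversification_ratio(weights, vols, corr):
    """Weighted-average volatility divided by portfolio volatility, as in Eq. (5)."""
    n = len(weights)
    weighted_vol = sum(vols[i] * weights[i] for i in range(n))
    variance = sum(weights[i] * weights[j] * corr[i][j] * vols[i] * vols[j]
                   for i in range(n) for j in range(n))
    return weighted_vol / math.sqrt(variance)


# Equity-bond example from the text: volatilities 15% and 5%, correlation 0.2.
vols = [0.15, 0.05]
corr = [[1.0, 0.2], [0.2, 1.0]]
for equity_weight in (0.05, 0.25, 0.50, 1.00):
    w = [equity_weight, 1.0 - equity_weight]
    print(equity_weight, round(diversification_ratio(w, vols, corr), 4))
```

A single-asset portfolio has a ratio of one by construction, and among the allocations listed the 25–75 portfolio scores highest.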

Sensitivity to the problem parameters

In this section, we discuss how the generalisation of the portfolio optimisation problem changes the importance of the risk parameters that are set.

In the Markowitz optimisation setting, the risk parameters, both correlation and volatility, strongly determine the optimisation outcome. Schematically, if correlation is positive (negative) between two assets, their relative weighting will be negative (positive) as that diversifies risk, whereby the amounts invested are inversely proportional to the price variances. This schema no longer applies when the problem is generalised. On the contrary, as Maillard et al. (2010) demonstrate, the global correlation level becomes irrelevant, whether it be positive, nil or negative. Only relative deviations from the general correlation level continue to determine the outcome, whereby the amounts invested are inversely proportional to the volatilities, not the variances (Footnote 3).
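The irrelevance of the global correlation level can be seen in a short derivation, assuming a single correlation ρ between all pairs of N assets (with ρ > −1/(N − 1), so that V is positive definite) and weights that are inversely proportional to volatility, \(x_{i} = k/\sigma_{i}\). The symbols ρ, k and \(\sigma_{i}\) are notation introduced here for illustration.

$$x_{i} \left( {Vx} \right)_{i} = x_{i} \sigma_{i} \left( {\sigma_{i} x_{i} + \rho \sum\limits_{j \ne i} {\sigma_{j} x_{j} } } \right) = k \cdot \left( {k + \rho \left( {N - 1} \right)k} \right) = k^{2} \left( {1 + \left( {N - 1} \right)\rho } \right)$$

The right-hand side is the same for every holding i, so these weights satisfy the parity condition for any admissible ρ. The correlation level changes the size of the common contribution but not the weights.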

The optimisation outcome thus becomes less sensitive to the input parameters, which is to be expected when relaxing the hypothesis of risk certainty. Lesser sensitivity is welcome, in particular for parameters that tend to be unstable. The correlation level between equity and bond prices is a good example of an unstable parameter. Its instability is the result of conflicting market forces. On the one hand, prices move together as the valuation of both asset classes depends on the state of the economy, while on the other hand a direct substitution effect drives towards negative correlation. It is a good example of high entropy, for which, we believe, relaxing certainty is a sensible approach.

A word of caution is in order about the effectiveness of the protection that can be gained from diversification. Much depends on how diversification is measured. Behind the simple measure used in Eqs. (1) and (3) is the assumption that unforeseen price shocks are uniformly distributed over the assets and that the financial consequences of those shocks are equal. Considerations such as size or the systemic risks inherent to the firms issuing the assets are ignored. Protection will be more effective the more closely the assumption of uniformity is respected. It is for that reason sensible to build risk parity portfolios on an aggregate level where the asset groups that are defined are of comparable calibre.

We suggest that these thoughts lie behind the top-down approach that is generally pursued when making strategic allocation decisions for globally invested funds. Rather than engaging in a full optimisation over the entire investment universe, the allocation is decided layer by layer, over countries and asset groups before individual assets. On the aggregate level, where risk parameters tend to be least certain, risk parity rules are not uncommon, while at the other end of the same continuum, on the level of individual assets, mean–variance optimisation is the norm. In the light of our generalised optimisation theory, proceeding in this way is rational and coherent.

Conclusion

In this article, we define the portfolio optimisation problem in a wider setting than Markowitz (1952) had done. Rather than staying with a model where the risks in a given market are predefined by statistical laws, the possibility that asset prices escape from those laws is considered part of the investment problem. Opening up and relaxing the definition of investment risk leads to a portfolio optimisation framework that is sensible and, as it appears, coincides with investment practice.

We contribute to the long-standing and ongoing debate on the Markowitz optimisation methodology. Many, including Harry Markowitz himself, have argued over the years that the methodology should be used with moderation. It is suited to times of low entropy, where risks are reasonably predictable, but less so to periods where paradigms change. It is interesting to note that in recent turbulent times more reference seems to be made to the old wisdom of diversification, not to put all eggs in one basket.