1 Introduction

The field of portfolio selection is an active research area that combines elements and methodologies from various analytical disciplines, such as optimization, decision analysis, risk management, and data science, among others. The seminal mean–variance model of Markowitz (Markowitz 1952) was the cornerstone of this area and set the basis for the adoption of quantitative approaches to financial decision-making. Since then, the field has evolved significantly and numerous advances have been made in the theory and practice of portfolio selection and management (Kolm et al. 2014), covering issues related to risk measurement (Ortobelli et al. 2005; Szegö 2002), modeling issues (Aouni et al. 2018; Mansini et al. 2014), algorithmic developments (Maringer 2007), and implementations in decision support systems (Xidonas et al. 2011), among others.

While many approaches rely on a deterministic setting based on point risk-return estimates derived from historical data, considerable emphasis has also been put on the modeling and treatment of the inherent uncertainty in asset returns, which is a major issue for the success of portfolio selection models. To this end, various approaches have been proposed, such as stochastic optimization in a single and multi-objective setting (Abdelaziz et al. 2007; Abdelaziz and Masmoudi 2014; Pagnoncelli et al. 2009; Popescu 2007). An overview of such approaches was presented by Masmoudi and Abdelaziz (2018). Other relevant approaches include fuzzy optimization (Dzuche et al. 2020; Lin and Hsieh 2004; Arenas Parra et al. 2001) and possibilistic mathematical programming (Giove et al. 2006; Liu et al. 2020; Zhang et al. 2012). A comparison of these frameworks (i.e., stochastic versus fuzzy-possibilistic programming) can be found in Buckley (1990) and Inuiguchi and Ramík (2000).

As a general tool for optimization under uncertainty, robust optimization (RO) has also attracted strong interest among operations researchers (Ben-Tal et al. 2009; Gabrel et al. 2014; Kouvelis and Yu 1997). In contrast to traditional portfolio optimization approaches, which rely on point estimates for the modeling parameters (i.e., asset returns and covariances), usually specified through historical data, the framework of RO acknowledges that the inputs are subject to errors and thus belong to an uncertainty set around their nominal estimates. RO is closely related to other methods for optimization under uncertainty, such as stochastic programming. However, the latter assumes that uncertainty is stochastic and can be described by properly defined probability distributions (Ben-Tal and Nemirovski 1998), thus leading to solutions that are feasible in a probabilistic sense. On the other hand, RO provides solutions that are feasible for all realizations of the uncertain parameters in a pre-specified set (Bertsimas et al. 2011). Moreover, as noted by Vladimirou and Zenios (1997), while stochastic optimization approaches focus on the expectation of the objective function, RO additionally covers higher moments.

RO models have been widely used for portfolio optimization purposes as indicated by the reviews of Fabozzi et al. (2010) and Xidonas et al. (2020). For instance, Goldfarb and Iyengar (2003) presented robust formulations for the mean–variance model, the Sharpe ratio, and value-at-risk (VaR), using ellipsoidal uncertainty sets, whereas Tütüncü and Koenig (2004) considered interval uncertainty sets for the assets’ mean returns and covariance matrix. Robust mean–variance models were also examined by Garlappi et al. (2007) who also consider model uncertainty (i.e., the uncertainty of the return-generating model), as well as by Fliege and Werner (2014) who reformulated mean–variance RO models in the context of multi-objective optimization. Beyond the mean–variance model, Ghaoui et al. (2003), Zhu and Fukushima (2009), and Zymler et al. (2013) presented RO models for VaR-based risk measures, and Kakouris and Rustem (2014) presented similar models with a copula formulation, whereas Xidonas et al. (2017) used a multi-objective minimax regret approach in the context of the mean-absolute deviation model. RO approaches have also been developed for other portfolio performance measures such as the Omega ratio (Kapsos et al. 2014a) as well as portfolio selection in the context of expected utility maximization (Caçador et al. 2021).

Beyond the theoretical advances in the development of RO portfolio construction models, the empirical evaluation of their performance is also important. In the portfolio optimization literature, many studies can be found presenting comparative results for various nominal models in different settings (see, for instance, Angelelli et al. 2008; DeMiguel et al. 2007; Gilli and Schumann 2011; Pavlou et al. 2019; Xidonas and Mavrotas 2014). On the other hand, comparison studies for robust models in a unifying and standardized framework are rather lacking. For instance, Guastaroba et al. (2011) compared two robust formulations of conditional value-at-risk (CVaR) to the nominal CVaR model using data from the London Stock Exchange during the period 1995–2005. Their out-of-sample tests showed that the nominal and robust portfolios achieved similar performance on five indicators (mean and median return, standard deviation, semi-standard deviation, Sortino ratio). Kim et al. (2013) considered robust mean–variance portfolios obtained with box and ellipsoid uncertainty sets, to examine their characteristics with respect to the fundamental factors of the Fama-French model. Similar mean–variance robust models were also considered in the comparative analysis of Kim et al. (2018).

Previous studies, such as the ones described above, have focused on specific types of models and/or uncertainty sets. Thus, their results do not allow for “cross-model” comparisons, i.e., they do not show for which models RO performs better. Motivated by this finding, the objective of this paper is to present a more comprehensive empirical examination of the performance of RO portfolio construction models, covering different risk measures and performance metrics. Moreover, issues not previously examined, like the composition of the portfolios and their diversification, are also considered in the analysis. The empirical evaluation is based on data involving the stocks of the Dow Jones Index over the period 2005–2020. A rolling-window testing approach is employed to examine the performance of the considered models, focusing on their results in out-of-sample tests.

The rest of the paper is organized as follows. Section 2 presents the nominal models considered in the analysis, as well as their robust counterparts. Section 3 presents data and the experimental setup, whereas Sect. 4 focuses on the results of the analysis. Finally, Sect. 5 concludes the paper and discusses some future research directions.

2 Portfolio optimization models

For the purposes of the analysis, three popular portfolio optimization models are considered, namely the mean–variance model, conditional value-at-risk, and the Omega ratio. Apart from their popularity as tools for portfolio construction, these approaches were also selected for their different underlying principles, which allows us to analyze methodologies that are fundamentally different instead of just being simple variants of each other. On the one hand, the mean–variance approach is based on a deviation risk measure. On the other hand, conditional value-at-risk relies on a tail risk measure, whereas the Omega ratio is a risk-reward performance measure that incorporates all the higher moments of the returns distribution. In the following sub-sections, we begin with the presentation of the nominal portfolio optimization models based on the above approaches and then discuss their robust counterparts that were used in the analysis. In all cases, it is assumed that short-sales are not allowed (i.e., the weights of the assets in the portfolios are non-negative).

2.1 Nominal models

2.1.1 Mean–variance model

The mean–variance (MV) model was introduced by Markowitz (1952) and is the most basic and widely used approach for portfolio optimization. Assuming a set of N assets with expected (mean) returns \(\hat{{\varvec{\mu }}}=({\hat{\mu }}_1,\ldots ,{\hat{\mu }}_N)\) and covariance matrix of returns \({\varvec{{\hat{\Sigma }}}}\), the MV model finds the asset weights \({\mathbf {x}}=(x_1, \ldots ,x_N)\ge {\mathbf {0}}\) that solve the following quadratic programming problem:

$$\begin{aligned} \begin{array}{ll} \min &{} {\mathbf {x}}^\top {\varvec{{\hat{\Sigma }}}}{\mathbf {x}}\\ \text{ s.t. }&{} {\mathbf {1}}{\mathbf {x}}=1\\ &{} \hat{{\varvec{\mu }}}{\mathbf {x}}\ge \mu ^*\\ &{} {\mathbf {x}}\ge {\mathbf {0}} \end{array} \end{aligned}$$
(1)

where \({\mathbf {1}}\) is a vector of ones and \(\mu ^*\) is the minimum expected level of return for the portfolio.
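As an illustration of problem (1), the following minimal sketch solves a three-asset instance with SciPy's general-purpose SLSQP routine (a substitute chosen for brevity, not the solver used in this study). The function name `min_variance_weights` and the numerical inputs, expressed in percent per day, are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def min_variance_weights(mu_hat, sigma_hat, mu_star):
    """Sketch of problem (1): min x' Sigma x  s.t.  1x = 1, mu x >= mu*, x >= 0."""
    n = len(mu_hat)
    constraints = [
        {"type": "eq", "fun": lambda x: np.sum(x) - 1.0},         # budget constraint
        {"type": "ineq", "fun": lambda x: mu_hat @ x - mu_star},  # minimum return
    ]
    result = minimize(
        lambda x: x @ sigma_hat @ x,
        x0=np.full(n, 1.0 / n),                # start from equal weights
        jac=lambda x: 2.0 * sigma_hat @ x,
        bounds=[(0.0, 1.0)] * n,               # no short sales
        constraints=constraints,
        method="SLSQP",
        options={"ftol": 1e-12},
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


# Hypothetical inputs (percent per day), for illustration only.
mu_hat = np.array([0.04, 0.06, 0.09])
sigma_hat = np.array([[1.0, 0.2, 0.1],
                      [0.2, 2.0, 0.3],
                      [0.1, 0.3, 4.0]])
weights = min_variance_weights(mu_hat, sigma_hat, mu_star=0.06)
print(weights.round(4), float(weights @ sigma_hat @ weights))
```

Repeating the call for a grid of \(\mu ^*\) values traces out an efficient frontier under these assumptions.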

2.1.2 Conditional value-at-risk

Despite the fundamental contributions of the MV framework in finance and decision theory, its reliance on the first two moments (mean, variance) of the returns distribution is often inadequate to describe the investment preferences and risk attitude of the investors. Moreover, departures from a normal returns distribution call into question the use of variance as a proper risk measure.

Since the 1990s, much attention has been given to tail risk measures, which enable the consideration of extreme losses. Value-at-risk (VaR) is the best-known such measure. At a given confidence level \(\beta \), VaR\(_\beta \) is the maximum expected loss over a time period. Despite its widespread use, VaR has been criticized as an incoherent risk measure (Artzner et al. 1999). Moreover, the use of VaR in an optimization context is a difficult task, because it requires the solution of a non-convex problem (Babat et al. 2018). As an alternative, Rockafellar and Uryasev (2000) introduced conditional VaR (CVaR), defined as the conditional expectation of the loss of a portfolio that is at least equal to VaR. Formally, for a continuous probability distribution of asset returns, the CVaR at the \(\beta \) confidence level for a portfolio with composition \({\mathbf {x}}\) is defined as:

$$\begin{aligned} \text{ CVaR}_\beta ({\mathbf {x}})=\frac{1}{1-\beta }\int _{f({\mathbf {x}},{\mathbf {r}})\ge \alpha _\beta ({\mathbf {x}})}{f({\mathbf {x}},{\mathbf {r}})p({\mathbf {r}})\mathrm {d}{\mathbf {r}}} \end{aligned}$$
(2)

where \({\mathbf {r}}\) is the vector of random assets’ returns, \(p({\mathbf {r}})\) is the associated probability density function, \(f({\mathbf {x}},{\mathbf {r}})\) denotes the portfolio loss function, and \(\alpha _\beta ({\mathbf {x}})\) denotes the VaR\(_\beta \) threshold for the portfolio weights \({\mathbf {x}}\).

Since the VaR function \(\alpha _\beta ({\mathbf {x}})\) does not have an analytical expression, Rockafellar and Uryasev (2000) showed that it is simpler to employ the definition:

$$\begin{aligned} \text{ CVaR}_\beta ({\mathbf {x}})=\min _\alpha \left\{ \alpha + \frac{1}{1-\beta }\int _{{\mathbf {r}}}{[f({\mathbf {x}},{\mathbf {r}})-\alpha ]^+p({\mathbf {r}})\mathrm {d}{\mathbf {r}}}\right\} =\min _\alpha \;F_\beta ({\mathbf {x}},\alpha ) \end{aligned}$$
(3)

where \([z]^+=\max \{0, z\}\).
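To make definition (3) concrete, the sketch below evaluates its sample analogue for a handful of hypothetical losses, replacing the integral with an average over observations. Because the resulting function of \(\alpha \) is convex and piecewise linear with kinks at the observed losses, it suffices to evaluate it at those points. The function name and data are illustrative assumptions.

```python
def empirical_cvar(losses, beta):
    """Sample analogue of (3): min over alpha of alpha + mean([L - alpha]^+) / (1 - beta)."""
    t = len(losses)

    def f_beta(alpha):
        excess = sum(max(0.0, loss - alpha) for loss in losses)
        return alpha + excess / ((1.0 - beta) * t)

    # The minimum of a convex piecewise-linear function lies at one of its kinks.
    return min(f_beta(alpha) for alpha in losses)


# Hypothetical portfolio losses (positive numbers are losses).
losses = [-0.02, -0.01, 0.0, 0.01, 0.05]
cvar_80 = empirical_cvar(losses, beta=0.80)
print(cvar_80)  # close to 0.05: the worst 20% of five observations is the single largest loss
```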

Besides being a coherent risk measure, CVaR is easy to use for portfolio optimization, as it only requires the solution of a linear programming problem. More specifically, denoting by \({\mathbf {R}}\) a \(T\times N\) matrix with T realizations of asset returns (e.g., over T time periods), the portfolio composition \({\mathbf {x}}\) that optimizes CVaR at the \(\beta \) confidence level, while having a minimum level of expected return \(\mu ^*\), is obtained through the solution of the following linear program:

$$\begin{aligned} \begin{array}{ll} \text{ min } &{} \quad \alpha +\frac{1}{(1-\beta )T}{\mathbf {1}}{\mathbf {y}}\\ \text{ s.t. } &{} \quad \alpha +{\mathbf {y}}+\mathbf {Rx}\ge {\mathbf {0}}\\ &{} \quad \hat{{\varvec{\mu }}}{\mathbf {x}}\ge \mu ^*\\ &{} \quad {\mathbf {1}}{\mathbf {x}}=1\\ &{} \quad {\mathbf {x}}, {\mathbf {y}} \ge {\mathbf {0}},\alpha \in {\mathbb {R}} \end{array} \end{aligned}$$
(4)

where \(\alpha \) is a decision variable representing VaR\(_\beta \) and \({\mathbf {y}}=(y_1,\ldots ,y_T)\) is a vector of decision variables representing the amounts by which the realized losses exceed VaR\(_\beta \) (zero for realizations whose loss does not exceed VaR\(_\beta \)).
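Linear program (4) can be passed almost verbatim to an off-the-shelf LP solver. The sketch below uses `scipy.optimize.linprog` with the HiGHS backend on simulated returns; the function name, the simulated data, and the choice of \(\mu ^*\) are assumptions made for illustration and do not reproduce the implementation used in this study.

```python
import numpy as np
from scipy.optimize import linprog


def min_cvar_weights(returns, mu_star, beta=0.95):
    """Sketch of LP (4). Decision vector z = [x (N weights), y (T excesses), alpha]."""
    t, n = returns.shape
    mu_hat = returns.mean(axis=0)
    cost = np.concatenate([np.zeros(n), np.full(t, 1.0 / ((1.0 - beta) * t)), [1.0]])
    # alpha + y + R x >= 0   is rewritten as   -R x - y - alpha <= 0
    a_loss = np.hstack([-returns, -np.eye(t), -np.ones((t, 1))])
    # mu_hat x >= mu_star    is rewritten as   -mu_hat x <= -mu_star
    a_ret = np.concatenate([-mu_hat, np.zeros(t), [0.0]])[None, :]
    a_ub = np.vstack([a_loss, a_ret])
    b_ub = np.concatenate([np.zeros(t), [-mu_star]])
    a_eq = np.concatenate([np.ones(n), np.zeros(t), [0.0]])[None, :]  # 1x = 1
    bounds = [(0, None)] * (n + t) + [(None, None)]                   # alpha is free
    res = linprog(cost, A_ub=a_ub, b_ub=b_ub, A_eq=a_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.x[-1], res.fun  # weights, VaR estimate, CVaR


# Simulated daily returns for four hypothetical assets over 250 days.
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(250, 4))
weights, var_level, cvar = min_cvar_weights(returns, mu_star=returns.mean(axis=0).min())
print(weights.round(4), var_level, cvar)
```

Setting \(\mu ^*\) to the smallest asset mean makes the return constraint non-binding, so this call yields the minimum-CVaR portfolio for the simulated sample.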

2.1.3 Omega ratio

The Omega ratio is a risk-reward ratio, first proposed by Shadwick and Keating (2002). In the context of the Omega ratio, risk represents the expected losses that exceed a given threshold \(\tau \), whereas reward is the expected gain over the threshold. Formally, the Omega ratio can be defined as follows (Kapsos et al. 2014a; Shadwick and Keating 2002):

$$\begin{aligned} \varOmega _\tau ({\mathbf {x}})=\frac{\int _{\tau }^{+\infty }{[1-F({\mathbf {r}})]\mathrm {d}{\mathbf {r}}}}{\int _{-\infty }^{\tau }{F({\mathbf {r}})\mathrm {d}{\mathbf {r}}}}=\frac{\hat{{\varvec{\mu }}}{\mathbf {x}}-\tau }{{\mathbb {E}}([\tau -\mathbf {Rx}]^+)}+1 \end{aligned}$$
(5)

where \(F({\mathbf {r}})\) is the cumulative distribution function of the portfolio returns and \({\mathbb {E}}([\tau -\mathbf {Rx}]^+)\) represents the expected loss above the threshold \(\tau \).

Compared to other risk-reward performance indicators, such as the Sharpe ratio, Omega takes into consideration all the higher moments of the returns distribution and it is particularly attractive for strongly asymmetric distributions (Balder and Schweizer 2017), as it distinguishes between upside gains and downside risk.
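For a finite sample of portfolio returns, the right-hand side of (5) reduces to simple averages. The following sketch (function name and data are hypothetical) computes it and checks the familiar equivalent reading of Omega as total gains over total losses relative to \(\tau \).

```python
def omega_ratio(portfolio_returns, tau=0.0):
    """Sample version of (5): (mean - tau) / E[(tau - r)^+] + 1."""
    t = len(portfolio_returns)
    mean_return = sum(portfolio_returns) / t
    expected_shortfall = sum(max(0.0, tau - r) for r in portfolio_returns) / t
    if expected_shortfall == 0.0:
        return float("inf")  # no observation falls below the threshold
    return (mean_return - tau) / expected_shortfall + 1.0


# Hypothetical portfolio returns.
sample = [0.02, -0.01, 0.03, -0.02]
omega = omega_ratio(sample)

# Equivalent view: gains above tau divided by losses below tau.
gains = sum(max(0.0, r) for r in sample)
shortfalls = sum(max(0.0, -r) for r in sample)
print(omega, gains / shortfalls)
```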

Given the above definition of the Omega ratio, its optimization in the context of portfolio selection appears to be a non-linear and non-convex problem. Mausser et al. (2006) presented a linear programming formulation based on the transformation proposed by Charnes and Cooper (1962) for solving fractional optimization problems. This approach works when the Omega ratio exceeds one, whereas, for other cases, the authors discuss alternatives based on non-linear programming, integer programming, and optimization heuristics. To overcome these limitations, Kapsos et al. (2014b) showed that Omega ratio maximization can be considered in the form of the following parametric optimization problem:

$$\begin{aligned} \max _{{\mathbf {x}}\ge {\mathbf {0}}}\left\{ \delta (\hat{{\varvec{\mu }}}{\mathbf {x}}-\tau )-(1-\delta ){\mathbb {E}}[(\tau -\mathbf {Rx})^+]\right\} \end{aligned}$$
(6)

Solving this problem for different values of the parameter \(\delta \in [0, 1]\) and keeping the solution that yields the highest objective function value, provides the portfolio weights that maximize the Omega ratio. It is worth noting that for a discrete probability distribution, problem (6) is expressed in linear programming form. In the present study, we employ this approach with \(\tau =0\).

2.2 Robust models

RO extends the framework of traditional optimization models, by incorporating uncertainty as a parameter of the problem. RO models can be considered as worst-case formulations of the original problem as far as the deviations of the parameters from their nominal values are concerned. In the context of RO, the uncertain parameters are assumed to belong to an uncertainty set, defined based on domain knowledge and information regarding the probability distributions of the parameters.

Formally, given an objective function \(f({\mathbf {x}})\) to minimize subject to constraints \(g_i({\mathbf {x}}, {\mathbf {u}}_i) \le {\mathbf {0}}\) with uncertain parameters \({\mathbf {u}}_i\) taking values in uncertainty sets \({\mathcal {U}}_i\) (\(i=1,\ldots ,m\)), the general RO formulation is:

$$\begin{aligned} \begin{array}{lll} \min &{} f({\mathbf {x}})\\ \hbox {s.t.} &{} g_i({\mathbf {x}},{\mathbf {u}}_i)\le {\mathbf {0}},&{} \forall \, {\mathbf {u}}_i \in {\mathcal {U}}_i,\quad i=1,\ldots ,m \end{array} \end{aligned}$$
(7)

In such a RO model, the goal is to identify an optimal solution among the solutions which are feasible for all realizations of the parameters \({\mathbf {u}}_i\in {\mathcal {U}}_i\). If the uncertainty sets \({\mathcal {U}}_i\) are continuous, then (7) has an infinite number of constraints, which implies that the RO model offers some “feasibility protection” when the parameters are not known exactly (Bertsimas et al. 2011). It may seem, however, that expressing a standard optimization problem in the framework of RO leads to a substantial increase in computational complexity. Nevertheless, most RO models are in fact of similar complexity as their nominal counterparts. For instance, in the context of portfolio optimization Goldfarb and Iyengar (2003) point out that a wide range of RO problems can be cast as second-order cone programs (SOCPs) that can be easily solved with existing algorithms (Lobo et al. 1998; Nesterov and Nemirovskii 1994; Sturm 1999).

The subsequent sub-sections present the RO counterparts of the portfolio construction models described in Sect. 2.1.

2.2.1 Mean–variance with box uncertainty

The simplest approach to incorporate the uncertainty that arises due to the estimation of parameters of the MV model, is to impose protection if the estimated assets’ returns \({\hat{\mu }}_1, \ldots , {\hat{\mu }}_N\) are not significantly different from their true values \(\mu _1, \ldots , \mu _N\). To this end, the following uncertainty set can be defined (Fabozzi et al. 2007):

$$\begin{aligned} {\mathcal {U}} = \{{\varvec{\mu }}\;|\;|\mu _i - {\hat{\mu }}_i|\le \varepsilon _i,\quad i=1,\ldots ,N\} \end{aligned}$$
(8)

where \(\varepsilon _i\) is the bound that defines the maximum acceptable discrepancy between the estimated and the actual returns. Following Fabozzi et al. (2007), we adopt the specification \(\varepsilon _i = 1.96s_{i}/\sqrt{T}\), where T is the sample size used in the estimation and \(s_{i}\) is the estimated standard deviation of the returns of asset i. This specification defines \(\varepsilon _i\) on the basis of a 95% confidence interval around the mean.
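The specification \(\varepsilon _i = 1.96s_{i}/\sqrt{T}\) is straightforward to compute from a return sample. The sketch below assumes that \(s_i\) is the sample standard deviation (with the \(T-1\) divisor); the function name and the return series are hypothetical.

```python
import math
import statistics


def box_half_widths(returns_by_asset, z=1.96):
    """Half-width of the interval in (8) for each asset: z * s_i / sqrt(T)."""
    return [
        z * statistics.stdev(series) / math.sqrt(len(series))  # stdev uses the T - 1 divisor
        for series in returns_by_asset
    ]


# Hypothetical return histories for two assets (T = 4 observations each).
history = [
    [0.01, -0.02, 0.03, 0.00],
    [0.002, 0.001, 0.003, 0.002],
]
eps = box_half_widths(history)
print(eps)  # the more volatile first asset receives the wider interval
```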

With the above uncertainty set (8), we consider a variant of the RO model with box uncertainty (MVBU) presented by Fabozzi et al. (2007), assuming no short sales, as follows:

$$\begin{aligned} \begin{array}{lll} \min &{} {\varvec{\varepsilon }}{\mathbf {x}} + {\mathbf {x}}^\top \hat{{\varvec{\Sigma }}}{\mathbf {x}} \\ \hbox {s.t.} &{} \hat{{\varvec{\mu }}}{\mathbf {x}}\ge \mu ^*\\ &{} \mathbf {1x}=1\\ &{} {\mathbf {x}}\ge {\mathbf {0}} \end{array} \end{aligned}$$
(9)

As explained in Fabozzi et al. (2007), in this model assets having a larger estimation error \(\varepsilon _i\) are penalized in the objective, thus leading to lower portfolio allocations for these assets.
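The origin of the penalty term can be seen from a short worst-case argument (a sketch under the no-short-sales assumption, not necessarily the derivation given by Fabozzi et al. 2007). Since the box (8) constrains each \(\mu _i\) separately and \({\mathbf {x}}\ge {\mathbf {0}}\), the lowest portfolio return over \({\mathcal {U}}\) is attained when every asset return sits at its lower bound:

$$\begin{aligned} \min _{{\varvec{\mu }}\in {\mathcal {U}}}{\varvec{\mu }}{\mathbf {x}}=\sum _{i=1}^{N}\min _{|\mu _i-{\hat{\mu }}_i|\le \varepsilon _i}\mu _i x_i=\sum _{i=1}^{N}({\hat{\mu }}_i-\varepsilon _i)x_i=\hat{{\varvec{\mu }}}{\mathbf {x}}-{\varvec{\varepsilon }}{\mathbf {x}} \end{aligned}$$

Hence \({\varvec{\varepsilon }}{\mathbf {x}}\) is the largest possible shortfall of the portfolio return relative to its nominal estimate, which is the quantity added to the variance in the objective of (9).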

2.2.2 Mean–variance with ellipsoidal uncertainty

Despite its simplicity, the box uncertainty set (8) does not consider the correlations among the assets’ returns. Ellipsoidal uncertainty sets overcome this limitation and they have been widely used for applications of RO models in portfolio construction (Fabozzi et al. 2007; Goldfarb and Iyengar 2003; Kolbert and Wormald 2010; Scherer 2007; Ye et al. 2012). In accordance with previous studies, we consider the following ellipsoidal uncertainty set:

$$\begin{aligned} {\mathcal {U}} = \Big \{{\varvec{\mu }}\;|\;({\varvec{\mu }} - \hat{{\varvec{\mu }}})^\top {\varvec{\Sigma }}_\mu ^{-1}({\varvec{\mu }} - \hat{{\varvec{\mu }}})\le \varepsilon ^2\Big \} \end{aligned}$$
(10)

where \({\varvec{\Sigma }}_{\mu }\) represents the covariance matrix of the errors in the estimation of the expected returns. Setting \(\varepsilon ^2\) equal to the \(\alpha \)-th percentile of the \(\chi ^2\) distribution with N degrees of freedom implies that \(\alpha \)% of the estimated expected returns will lie inside \({\mathcal {U}}\).

With the above uncertainty set, the RO portfolio selection model is expressed as follows (model MVEU):

$$\begin{aligned} \begin{array}{lll} \min &{} {\mathbf {x}}^\top \hat{{\varvec{\Sigma }}}{\mathbf {x}} + \varepsilon \sqrt{{\mathbf {x}}^\top {\varvec{\Sigma }}_{\mu }{\mathbf {x}}} \\ \hbox {s.t.} &{} \hat{{\varvec{\mu }}}{\mathbf {x}}\ge \mu ^*\\ &{} \mathbf {1x}=1\\ &{} {\mathbf {x}}\ge {\mathbf {0}} \end{array} \end{aligned}$$
(11)

In comparison to the nominal MV model (1), the objective function of the robust counterpart MVEU incorporates the penalty term \(\varepsilon \sqrt{{\mathbf {x}}^\top {\varvec{\Sigma }}_{\mu }{\mathbf {x}}}\), which accounts for the estimation risk, with \(\varepsilon \) serving as a risk aversion parameter. In the present analysis, \({\varvec{\Sigma }}_{\mu }\) is specified as a diagonal matrix with the variances of the assets’ returns on the diagonal. Despite its non-linear nature, the MVEU model can be solved efficiently as a second-order cone programming problem (Goldfarb and Iyengar 2003; Kolbert and Wormald 2010).
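A small numerical sketch of model (11) follows. For brevity it relies on SciPy's general non-linear SLSQP routine rather than a dedicated second-order cone solver, and it assumes that \(\varepsilon ^2\) is taken as the 95% quantile of the \(\chi ^2\) distribution with N degrees of freedom. The function names and the inputs (in percent per day) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2


def mveu_objective(x, sigma_hat, sigma_mu, eps):
    """Objective of (11): variance plus the estimation-risk penalty."""
    return x @ sigma_hat @ x + eps * np.sqrt(x @ sigma_mu @ x)


def mveu_weights(mu_hat, sigma_hat, sigma_mu, mu_star, eps):
    n = len(mu_hat)
    constraints = [
        {"type": "eq", "fun": lambda x: np.sum(x) - 1.0},
        {"type": "ineq", "fun": lambda x: mu_hat @ x - mu_star},
    ]
    result = minimize(
        mveu_objective,
        x0=np.full(n, 1.0 / n),
        args=(sigma_hat, sigma_mu, eps),
        bounds=[(0.0, 1.0)] * n,
        constraints=constraints,
        method="SLSQP",              # generic NLP solver used here instead of an SOCP solver
        options={"ftol": 1e-12},
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


# Hypothetical inputs (percent per day), for illustration only.
mu_hat = np.array([0.04, 0.06, 0.09])
sigma_hat = np.array([[1.0, 0.2, 0.1],
                      [0.2, 2.0, 0.3],
                      [0.1, 0.3, 4.0]])
sigma_mu = np.diag(np.diag(sigma_hat))            # diagonal matrix of return variances
eps = float(np.sqrt(chi2.ppf(0.95, df=len(mu_hat))))  # assumption: eps^2 is the chi-square quantile
weights_robust = mveu_weights(mu_hat, sigma_hat, sigma_mu, mu_star=0.06, eps=eps)
print(weights_robust.round(4))
```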

2.2.3 Worst-case CVaR

In the present study, the worst-case CVaR approach of Zhu and Fukushima (2009) is used as a robust counterpart of the CVaR model. With a probability density function of the portfolio return \(p\in {\mathcal {P}}\), Zhu and Fukushima (2009) defined the worst-case CVaR (WCVaR) as follows:

$$\begin{aligned} \text {WCVaR}_{\beta }({\mathbf {x}}) = \sup _{p\in {\mathcal {P}}}\text {CVaR}_{\beta }({\mathbf {x}}) \end{aligned}$$
(12)

where \({\mathcal {P}}\) is the set of probability distributions to which the uncertain (random) portfolio returns belong. Following the setting of Zhu and Fukushima (2009), a mixture distribution uncertainty set is employed, which is defined as a convex combination of M distributions \(p_1,\ldots ,p_M\). With such a specification for the uncertainty set, WCVaR is given by:

$$\begin{aligned} \text {WCVaR}_{\beta }({\mathbf {x}}) = \min _\alpha \max _{k=1,\ldots ,M} F_\beta ^k({\mathbf {x}}, \alpha ) \end{aligned}$$
(13)

where \(F_\beta ^k({\mathbf {x}}, \alpha )\) is given as in (3) with probability distribution \(p_k\). The above optimization problem can be expressed in a linear programming fashion as follows:

$$\begin{aligned} \begin{array}{lll} \min &{} \theta \\ \hbox {s.t.} &{} \alpha + \frac{1}{(1-{\beta })T^k}{\mathbf {1}}{\mathbf {y}}^k \le \theta \quad k=1, \ldots , M\\ &{} \alpha +{\mathbf {y}}^k+{\mathbf {R}}^k {\mathbf {x}} \ge {\mathbf {0}} \quad k=1, \ldots , M\\ &{} \hat{{\varvec{\mu }}}_k{\mathbf {x}}\ge \mu ^* \quad k=1, \ldots , M\\ &{} \mathbf {1x}=1\\ &{} {\mathbf {x}}, {\mathbf {y}}^1, \ldots , {\mathbf {y}}^M\ge {\mathbf {0}}, \quad \alpha , \theta \in {\mathbb {R}} \end{array} \end{aligned}$$
(14)
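The structure of (14) can be illustrated with `scipy.optimize.linprog`. In the sketch below, each scenario k is assumed to be represented by a \(T^k\times N\) matrix of return realizations \({\mathbf {R}}^k\), with \(\hat{{\varvec{\mu }}}_k\) taken as its column means; the function name and the simulated data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def min_worst_case_cvar(scenario_returns, mu_star, beta=0.95):
    """Sketch of LP (14). Decision vector z = [x, y^1, ..., y^M, alpha, theta]."""
    n = scenario_returns[0].shape[1]
    sizes = [r.shape[0] for r in scenario_returns]
    total = sum(sizes)
    n_var = n + total + 2
    i_alpha, i_theta = n + total, n + total + 1
    rows, rhs = [], []
    offset = n
    for r_k, t_k in zip(scenario_returns, sizes):
        # alpha + 1 y^k / ((1 - beta) T^k) - theta <= 0
        row = np.zeros(n_var)
        row[offset:offset + t_k] = 1.0 / ((1.0 - beta) * t_k)
        row[i_alpha], row[i_theta] = 1.0, -1.0
        rows.append(row)
        rhs.append(0.0)
        # alpha + y^k + R^k x >= 0  is rewritten as  -R^k x - y^k - alpha <= 0
        block = np.zeros((t_k, n_var))
        block[:, :n] = -r_k
        block[:, offset:offset + t_k] = -np.eye(t_k)
        block[:, i_alpha] = -1.0
        rows.extend(block)
        rhs.extend([0.0] * t_k)
        # mu_k x >= mu_star  is rewritten as  -mu_k x <= -mu_star
        row = np.zeros(n_var)
        row[:n] = -r_k.mean(axis=0)
        rows.append(row)
        rhs.append(-mu_star)
        offset += t_k
    a_eq = np.zeros((1, n_var))
    a_eq[0, :n] = 1.0                                   # 1x = 1
    cost = np.zeros(n_var)
    cost[i_theta] = 1.0                                 # minimise theta
    bounds = [(0, None)] * (n + total) + [(None, None)] * 2
    res = linprog(cost, A_ub=np.array(rows), b_ub=np.array(rhs),
                  A_eq=a_eq, b_eq=[1.0], bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.x[i_alpha], res.x[i_theta]


# Four simulated scenarios (e.g., quarters) of 60 daily returns for three hypothetical assets.
rng = np.random.default_rng(1)
scenarios = [rng.normal(0.0005, 0.01, size=(60, 3)) for _ in range(4)]
mu_floor = min(s.mean(axis=0).min() for s in scenarios)  # keeps every return constraint feasible
w_wc, alpha_wc, theta_wc = min_worst_case_cvar(scenarios, mu_star=mu_floor)
print(w_wc.round(4), theta_wc)
```

At the optimum, \(\theta \) coincides with the largest of the M scenario values \(F_\beta ^k({\mathbf {x}},\alpha )\), which is the min-max structure of (13).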

2.2.4 Worst-case Omega ratio

As introduced in Sect. 2.1.3, the Omega ratio relies on the probability distribution of asset returns. To account for inexact knowledge of this distribution in the context of RO, we employ the concept of the worst-case Omega ratio (WCOR) as defined by Kapsos et al. (2014a). Formally, using a discrete analog of (5), WCOR is defined as follows:

$$\begin{aligned} \text {WCOR}_\tau ({\mathbf {x}}) \equiv \displaystyle \inf _{\small p \in {\mathcal {P}}} \frac{\hat{{\varvec{\mu }}}_p{\mathbf {x}}-\tau }{{\mathbb {E}}_p[(\tau -\mathbf {Rx})^+]} \end{aligned}$$
(15)

where \(\hat{{\varvec{\mu }}}_p\) denotes the expected returns vector estimated under probability distribution p, and \({\mathcal {P}}\) is the set of probability distributions. Thus, the RO counterpart of the Omega ratio optimization problem (6) is:

$$\begin{aligned} \max _{{\mathbf {x}}\ge {\mathbf {0}}}\min _{p\in {\mathcal {P}}}\left\{ \delta (\hat{{\varvec{\mu }}}_p{\mathbf {x}}-\tau )-(1-\delta ){\mathbb {E}}_p[(\tau -\mathbf {Rx})^+]\right\} \end{aligned}$$
(16)

Similarly to the WCVaR model, a mixture distribution is employed for modeling uncertainty. With this type of uncertainty set, problem (16) can be stated as follows:

$$\begin{aligned} \begin{array}{lll} \max &{} \theta \\ \hbox {s.t.} &{} \delta (\hat{{\varvec{\mu }}}_k{\mathbf {x}}-\tau )-(1-\delta ){\mathbb {E}}_k[(\tau -{\mathbf {R}}^k{\mathbf {x}})^+]\ge \theta ,&{} \quad k=1,\ldots ,M\\ &{} \mathbf {1x}=1\\ &{} {\mathbf {x}}\ge {\mathbf {0}},\quad \theta \in {\mathbb {R}} \end{array} \end{aligned}$$
(17)

3 Experimental setup

The nominal and robust models described in the previous section are compared through an extensive empirical evaluation of the resulting portfolios, in terms of their characteristics and performance. This section describes the setting of the comparative analysis (i.e., the data and evaluation approach).

3.1 Data and evaluation approach

The analysis is based on a data set that comprises 30 stocks of the Dow Jones Index (the index constituents as of the end of 2020), starting from January 1, 2005 up to June 30, 2020. Daily returns data for the stocks were obtained from Yahoo Finance. The period of the analysis allows the consideration of periods characterized by different patterns (i.e., both bullish and bearish periods), thus enabling the investigation of the performance of portfolio optimization models under varying conditions in the financial markets.

To test the performance of the models, we follow a rolling-window approach similar to the one used by Gilli and Schumann (2011), with all tests done on a quarterly basis. More specifically, starting from the end of 2005, data for one year (2005q1–2005q4) are used as inputs to solve the optimization models and construct optimal portfolios (estimation period), whereas the performance of the portfolios is evaluated over the subsequent quarter (out-of-sample testing in 2006q1). The procedure is then repeated moving the time window one quarter ahead at each repetition of the process (e.g., the second run uses 2005q2–2006q1 for portfolio construction and 2006q2 for testing). Overall, 58 test cases are examined through this procedure. This rolling-window approach enables the testing of the models under different market conditions, including cases where portfolios are constructed during a bullish market and tested in a bearish environment (and vice versa).
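The rolling-window schedule described above can be enumerated with a few lines of code. The sketch below represents quarters as (year, quarter) pairs; the function name and this representation are illustrative choices.

```python
def rolling_windows(first=(2005, 1), last_test=(2020, 2), window=4):
    """Return (estimation_quarters, test_quarter) pairs, moving one quarter at a time."""
    quarters = []
    year, quarter = first
    while (year, quarter) <= last_test:
        quarters.append((year, quarter))
        year, quarter = (year, quarter + 1) if quarter < 4 else (year + 1, 1)
    # Each run uses `window` consecutive quarters for estimation and the next one for testing.
    return [(quarters[i:i + window], quarters[i + window])
            for i in range(len(quarters) - window)]


windows = rolling_windows()
first_estimation, first_test = windows[0]
last_estimation, last_test_quarter = windows[-1]
print(len(windows), first_estimation, first_test, last_test_quarter)
```

With a four-quarter estimation window starting in 2005q1 and the last test quarter in 2020q2, this enumeration yields the 58 test cases mentioned above.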

In each run of the above process, the models described in Sect. 2 are applied, namely the nominal MV, CVaR, and Omega (OR) optimization models, the MV model with box and ellipsoidal uncertainty sets (MVBU and MVEU, respectively), as well as the worst-case CVaR and worst-case Omega ratio approaches (WCVaR and WCOR, respectively). For the latter two models (WCVaR, WCOR), four mixture distributions are used (i.e., \(M=4\)) corresponding to the data for the four quarters in each estimation period. All CVaR models and results are based on a 95% confidence level. While the Omega ratio models (OR and WCOR) lead to the construction of a single portfolio at each replication of the rolling-window approach, the nominal and robust MV and CVaR models are used to construct 50 portfolios on the Pareto frontier (efficient portfolios). However, the resulting efficient frontiers are not directly comparable in terms of their risk-return patterns. To overcome this issue, five portfolios are selected from the universe of 50 portfolios constructed by each model at each test. The selection is based on matching the portfolios from the different models in terms of their risk profile. This matching is performed based on the (in-sample) return of the portfolios, as well as their standard deviation and CVaR.

The above testing procedure and all models were implemented in MATLAB, using the Gurobi optimization solver.

3.2 Performance metrics

The evaluation of the portfolios derived through the considered models is based on various indicators, grouped into two main categories. The first refers to the characteristics of the portfolios regarding their composition, whereas the second involves commonly used portfolio performance metrics.

For the analysis of the composition of the portfolios, first, we consider the number of stocks included in each portfolio (NoAssets). The more assets in a portfolio, the more difficult its management may be. On the other hand, investing in more assets improves portfolio diversification. Thus, the second indicator focuses on the diversification of the portfolios, defined through the Herfindahl–Hirschman index (HHI, Kim et al. 2013; Van Horne et al. 1975):

$$\begin{aligned} HHI=\sum _{i=1}^{N}{x_i^2} \end{aligned}$$
(18)

The HHI ranges in [1/N, 1], with lower values indicating more diversified portfolios, whereas the case \(HHI=1\) corresponds to a portfolio consisting of a single stock.
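Both composition indicators are simple functions of the weight vector, as the sketch below shows. The tolerance used to decide whether a stock counts as held is an assumption introduced here, since solvers rarely return exact zeros.

```python
def hhi(weights):
    """Herfindahl-Hirschman index (18): sum of squared weights."""
    return sum(w * w for w in weights)


def number_of_assets(weights, tol=1e-6):
    """Count the stocks with a non-negligible weight (tol is a hypothetical cut-off)."""
    return sum(1 for w in weights if w > tol)


equal_weighted = [0.25, 0.25, 0.25, 0.25]
concentrated = [1.0, 0.0, 0.0, 0.0]
print(hhi(equal_weighted), number_of_assets(equal_weighted))  # lower bound 1/N for N = 4
print(hhi(concentrated), number_of_assets(concentrated))      # upper bound: a single stock
```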

As noted above, the composition of a portfolio also affects the way it is managed. To take this aspect into consideration, a turnover indicator is also used, in accordance with DeMiguel et al. (2007):

$$\begin{aligned} \text {Turnover} = \sum _{i=1}^{N}|x_i^{t+1}-x_i^{t}| \end{aligned}$$
(19)

where \(x_i^t\) and \(x_i^{t+1}\) denote the weights of stock i in the portfolios in two successive time periods. Lower values for this turnover ratio indicate that the composition of a portfolio remains stable over time, thus leading to lower transaction costs.
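As a brief illustration of (19), the sketch below computes the turnover between two successive portfolios; the weights are hypothetical.

```python
def turnover(weights_t, weights_next):
    """Turnover (19): sum of absolute weight changes between two successive periods."""
    if len(weights_t) != len(weights_next):
        raise ValueError("both portfolios must cover the same N stocks")
    return sum(abs(b - a) for a, b in zip(weights_t, weights_next))


# Moving a quarter of the portfolio from stock 1 to stock 3.
change = turnover([0.50, 0.50, 0.00], [0.25, 0.50, 0.25])
print(change)  # 0.5, because the shifted weight is counted on both the sell and the buy side
```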

Regarding the investment performance of the portfolios, the optimization criteria considered in the selected models are used, namely expected return, standard deviation of returns, 95% CVaR, and the Omega ratio. Moreover, two additional risk-adjusted performance indicators are employed, namely the Sharpe and Sortino ratios:

$$\begin{aligned} \text {Sharpe} = \frac{\mu }{\sigma }\qquad \text {Sortino} = \frac{\mu }{\sigma _D} \end{aligned}$$
(20)

where \(\mu \) is the expected return of a portfolio, \(\sigma \) the corresponding standard deviation, and \(\sigma _D\) is the standard deviation of negative returns.
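The two ratios in (20) can be computed from a series of portfolio returns as sketched below. Two choices here are assumptions rather than details stated in the text: the sample standard deviation (with the \(T-1\) divisor) is used, and \(\sigma _D\) is computed over the subset of strictly negative returns.

```python
import statistics


def sharpe_ratio(returns):
    """Sharpe ratio as in (20): mean return over standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)


def sortino_ratio(returns):
    """Sortino ratio as in (20): mean return over the standard deviation of negative returns."""
    negative = [r for r in returns if r < 0]
    if len(negative) < 2:
        raise ValueError("need at least two negative returns to estimate sigma_D")
    return statistics.mean(returns) / statistics.stdev(negative)


# Hypothetical daily portfolio returns.
daily = [0.01, -0.02, 0.015, -0.005, 0.02]
print(sharpe_ratio(daily), sortino_ratio(daily))
```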

4 Results

This section presents the results of the analysis. We start with the analysis of the composition of the portfolios and then proceed with the examination of their performance, on the basis of the metrics presented in the previous section.

4.1 Composition of the portfolios

The results on the composition of the portfolios are summarized in Tables 1 and 2. The former Table involves the optimization models that provide a set of efficient portfolios, whereas the second Table refers to models that provide a single optimal portfolio, namely the two models based on the Omega ratio (nominal-OR and robust-WCOR), as well as results obtained from the minimum variance portfolio (MV\(^*\)) and minimum CVaR portfolio (CVaR\(^*\)). In both Tables, the reported results are averages over all tests of the rolling-window approach.

Table 1 Statistics on the composition of the portfolios (multi-portfolio frontiers, averages over all tests)
Table 2 Statistics on the composition of the portfolios (single portfolio models, averages over all tests)

It is evident that, among the various approaches, the MV model with box uncertainty leads to the sparsest portfolios, consisting (on average) of about 2.5 stocks, followed by the CVaR and OR models, as well as the nominal MV model. The MV model with ellipsoidal uncertainty leads to the largest portfolios in terms of their number of assets (i.e., 20–22 assets). Of course, portfolios consisting of fewer assets are less diversified, as is evident from the HHI of the MVBU model, which is the highest among all the approaches in the comparison, whereas the MVEU portfolios are the most diversified. The turnover indicator is also affected by the number of assets in the portfolios, with more diversified portfolios generally having a lower turnover. However, the results show that this is not always the case, as the minimum variance and minimum CVaR portfolios (MV\(^*\) and CVaR\(^*\)) have lower turnover even compared to the portfolios obtained with MVEU. On the other hand, the CVaR/WCVaR and MVBU portfolios are the ones whose composition varies more over time (i.e., higher turnover).

Overall, the conclusions from this part of the analysis can be summarized as follows:

  1. MVEU provides diversification benefits compared to the portfolios in the efficient frontier of the MV model,
  2. WCVaR and WCOR provide similar results to their nominal counterparts,
  3. The minimum risk portfolios obtained with nominal models (MV\(^*\) and CVaR\(^*\)) exhibit more stable compositions over time.

4.2 Portfolio performance results

The results on the performance metrics of the considered approaches over all out-of-sample tests in the rolling-window approach are summarized in Table 3. More specifically, the Table presents the daily mean return, standard deviation, CVaR (at the 95% level), as well as the Omega, Sharpe, and Sortino ratios. Results are reported for the five approaches that provide a set of efficient portfolios (MV, CVaR, MVBU, MVEU, WCVaR) and the four single portfolio models, namely the minimum variance portfolio (MV\(^*\)), the minimum CVaR portfolio (CVaR\(^*\)), the maximum Omega ratio portfolio (OR), and the portfolio that maximizes the worst-case Omega ratio (WCOR).
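For reference, the six performance metrics can be computed from a series of daily returns along the following lines. This is a sketch under stated assumptions, not the exact procedure of the study: CVaR is taken as the mean loss over the worst (1 − level) share of observations, the Omega threshold, the risk-free rate, and the Sortino target all default to zero, and the function names and sample returns are hypothetical.

```python
import math
import statistics


def cvar(returns, level=0.95):
    """Empirical CVaR: mean loss over the worst (1 - level) share of observations."""
    losses = sorted((-r for r in returns), reverse=True)
    # round() guards against floating-point noise such as 20 * 0.05 > 1.
    k = max(1, math.ceil(round(len(losses) * (1 - level), 9)))
    return sum(losses[:k]) / k


def omega_ratio(returns, threshold=0.0):
    """Total gains above the threshold divided by total shortfalls below it."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    shortfalls = sum(max(threshold - r, 0.0) for r in returns)
    return gains / shortfalls


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation."""
    return (statistics.mean(returns) - risk_free) / statistics.stdev(returns)


def sortino_ratio(returns, target=0.0):
    """Mean excess return over the target divided by the downside deviation."""
    downside = math.sqrt(
        sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns)
    )
    return (statistics.mean(returns) - target) / downside


# Hypothetical daily returns (illustration only, not data from the study).
daily = [0.02, -0.01, 0.03, -0.02, 0.01]

print(statistics.mean(daily), statistics.stdev(daily))
print(cvar(daily, level=0.95))
print(omega_ratio(daily), sharpe_ratio(daily), sortino_ratio(daily))
```

With so few observations the 95% CVaR reduces to the single worst loss; on the long out-of-sample series used in the analysis the tail average covers many observations.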

Table 3 Summary of performance metrics on the out-of-sample tests

Regarding the approaches that lead to a set of Pareto optimal portfolios, the worst-case CVaR model (WCVaR) leads to the portfolios with the highest return under all three schemes for matching the efficient frontiers of the models. WCVaR is followed by CVaR and MVEU, whereas MV and MVBU have lower returns. Among the single optimal portfolios, those constructed with the worst-case Omega ratio (WCOR) achieve the highest return, followed by OR and CVaR\(^*\).

In terms of risk, as measured by the daily standard deviation of returns and CVaR, the MV and CVaR efficient portfolios are less risky than those of the robust models MVBU, MVEU, and WCVaR when the efficient frontiers are matched by the return of the portfolios. However, when the efficient sets are matched by the risk criteria (standard deviation and CVaR), the robust models MVEU and WCVaR outperform their nominal counterparts. On the other hand, MVBU performs similarly to MV, whereas WCOR does not bring improvements compared to its nominal counterpart (OR model). Nevertheless, the minimum variance and minimum CVaR portfolios are the ones that achieve the lowest levels of risk in the out-of-sample tests, with a standard deviation of around 0.8% and a CVaR of approximately 1.7–1.8%.

As far as the risk-return performance measures are concerned (Omega, Sharpe, Sortino), the robust models MVEU, WCVaR, and WCOR consistently outperform their nominal counterparts, whereas MVBU provides inferior results compared to the standard MV approach. Moreover, the MVEU and WCVaR models perform similarly to MV\(^*\) and CVaR\(^*\), respectively.

To obtain a more concrete understanding of the relative performance of the robust models versus their nominal counterparts, the statistical significance of the differences was assessed using the paired samples t-test. The comparisons were performed for each performance measure and matching scheme. Table 4 summarizes the results of the comparisons for the full period and separately for crises. Two crisis periods are identified during the period under examination, the first covering 2007q3–2009q1 (global financial crisis) and the second covering the first quarter of 2020 (stock market crash due to the COVID-19 pandemic). Moreover, for the comparisons involving the Pareto fronts derived with the MV and CVaR models, as well as their robust variants (MVBU, MVEU, WCVaR), Table 4 reports results for the three matching schemes, i.e., matching by return (column labeled as “Ret.”), matching by standard deviation (column labeled as “Std”), and matching by CVaR. In each comparison scenario, a robust model is compared to its nominal counterpart with the paired samples t-test (at the 5% significance level) across the six performance indicators considered in the analysis, namely return, standard deviation, CVaR, the Omega ratio, the Sharpe ratio, and the Sortino ratio. From these comparisons, we compute the difference between the number of performance criteria on which a robust model performs significantly better than its nominal variant (wins) and the number of criteria on which it performs significantly worse (losses). Table 4 presents these “wins–losses” differences, separately for the risk-return criteria (return, standard deviation, CVaR) and the risk-adjusted criteria (Omega, Sharpe, Sortino ratios).
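The wins–losses counting described above can be sketched as follows. This is only an illustration under stated assumptions: the paired t statistic is computed directly and compared with a two-sided critical value `t_crit` supplied by the caller (about 1.96 for large samples at the 5% level, and the Student-t quantile with n − 1 degrees of freedom for small ones), rather than through a p-value. All names and the sample figures are hypothetical.

```python
import math
import statistics


def paired_t_statistic(robust, nominal):
    """t statistic of the paired differences (robust minus nominal)."""
    diffs = [r - n for r, n in zip(robust, nominal)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        # Degenerate case: all paired differences are identical.
        return 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    return mean_diff / (sd / math.sqrt(len(diffs)))


def wins_and_losses(robust, nominal, higher_is_better, t_crit=1.96):
    """Count criteria where the robust model is significantly better or worse."""
    wins = losses = 0
    for name, better_high in higher_is_better.items():
        t = paired_t_statistic(robust[name], nominal[name])
        if abs(t) <= t_crit:
            continue  # difference not significant: neither a win nor a loss
        if (t > 0) == better_high:
            wins += 1
        else:
            losses += 1
    return wins, losses


# Hypothetical per-test results for the three risk-return criteria.
robust = {
    "return": [0.0012, 0.0015, 0.0011, 0.0014, 0.0013, 0.0016],
    "std": [0.012, 0.013, 0.011, 0.014, 0.012, 0.013],
    "cvar": [0.020, 0.025, 0.018, 0.024, 0.021, 0.022],
}
nominal = {
    "return": [0.0010, 0.0012, 0.0009, 0.0011, 0.0011, 0.0013],
    "std": [0.010, 0.012, 0.009, 0.012, 0.011, 0.011],
    "cvar": [0.021, 0.024, 0.019, 0.023, 0.022, 0.021],
}
# Higher return is better; lower standard deviation and CVaR are better.
direction = {"return": True, "std": False, "cvar": False}

# 2.571 is the two-sided 5% critical value of Student's t with 5 degrees of freedom.
wins, losses = wins_and_losses(robust, nominal, direction, t_crit=2.571)
print(wins - losses)
```

A value of zero can therefore arise either because no difference is significant or because wins and losses cancel out, which is worth keeping in mind when reading the reported differences.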

Table 4 Comparison results (wins–losses) for robust versus nominal models by period and matching scheme

According to the results of Table 4 for the full period of the analysis, the MVBU model is not found to bring any improvements compared to the standard MV approach. In fact, in most cases, the MV model outperforms MVBU (i.e., the reported wins–losses differences are mostly negative). The other robust models perform worse than their nominal variants on the risk-return criteria when the adopted matching scheme is based on return, but they provide superior results (i.e., positive differences) under the matching schemes that are based on the two risk criteria (standard deviation and CVaR). In terms of the risk-adjusted performance criteria, all robust models (except for MVBU) perform at least as well as their nominal counterparts.

Focusing on the results of the comparative tests on the crisis periods, it is evident that all robust approaches perform at least equivalently to their nominal variants as far as the risk-adjusted criteria are concerned. For instance, WCVaR outperforms CVaR on all three risk-adjusted indicators, irrespective of how the Pareto fronts of the two models are matched. MVBU also performs better than the nominal MV. For the MVBU and WCVaR approaches, similar conclusions are also drawn when focusing the comparisons on the risk-return criteria. The comparison of MVEU against MV shows that MVEU yields inferior results on the individual risk-return criteria, but its performance is similar to MV when considering the risk-adjusted metrics. Finally, no statistically significant differences are found for the comparison of WCOR against its nominal variant (OR model).

Table 5 provides further details on the comparisons between the robust and nominal models, reporting the mean differences between the two classes of models for the six performance criteria. The reported results are averages over all matching schemes, with positive figures indicating superior performance of the robust approaches over their nominal variants. The results are in line with those discussed above. More specifically, for the full period, the MVBU model provides worse results compared to MV, but it is consistently superior when focusing only on the tests involving periods of stock market crisis. On the other hand, the MVEU model outperforms MV in five of the six indicators over the full period, but its performance compared to MV is worse in crisis conditions. Nevertheless, in terms of the risk-adjusted performance criteria, the MVEU–MV differences are not statistically significant at the 5% level. The WCVaR–CVaR comparison leads to consistent results under both the full period and the crisis periods, with WCVaR performing better in most of the considered performance measures. It is worth noting that the differences in favor of WCVaR are higher during the crisis periods. The same observation also applies to the comparison of WCOR against OR, even though the differences in crisis periods are not found to be significant (see Footnote 1).

Table 5 Mean differences between robust and nominal models for the performance indicators

5 Conclusions

The handling of uncertainty is a crucial issue for successful portfolio optimization and management. RO has attracted much interest over the years as a powerful tool to account for estimation errors in this field. Beyond theoretical developments on robust portfolio optimization, empirical assessments are also important to understand the benefits and limitations of RO approaches.

This study presented a comprehensive comparative analysis of various RO models for portfolio optimization. Using a data set from the US market, different approaches were considered not focusing solely on the mean–variance framework, but also covering alternative measures of portfolio risk and performance, such as CVaR and the Omega ratio. The results showed that although in terms of their portfolio composition features (e.g., diversification and turnover) robust models have limited differences compared to their nominal counterparts, improvements were observed in portfolio performance. The improvements were larger for risk-adjusted performance measures and under crisis conditions. Minimum risk portfolios (i.e., minimum variance and CVaR portfolios) were also found to be competitive and provide robust performance.

Future research could extend in several directions. First, alternative uncertainty sets could be considered, such as the multiobjective scheme of Fliege and Werner (2014), as well as improved estimation procedures for the modeling of RO approaches. Multiobjective formulations could also incorporate additional portfolio selection criteria, such as environmental, social, and governance (ESG) factors (Ballestero et al. 2012; Utz et al. 2014), thus advancing the risk-based RO framework to a more general context. Extensions to other portfolio selection contexts are also worth investigating, such as cardinality constrained portfolio problems, index tracking, and dynamic portfolio management.