
1 Introduction

The mean-variance (MV) portfolio selection model proposed by Markowitz (1952) laid the foundation of modern investment theory. It suggests balancing profit and risk in portfolio decisions. Following the spirit of Markowitz's MV model, the framework of mean-risk portfolio analysis has been extended in various directions; see, e.g., Li et al. (2006), Kolm et al. (2014), Gao and Li (2013) and the references therein. However, using variance as the risk measure has a drawback: it penalizes the profit and the loss of the random return symmetrically. Since variance is not an ideal risk measure, a large number of new risk measures have been proposed since the development of the MV portfolio selection model. Among these risk measures, the Value-at-Risk (VaR), defined as the quantile of the loss at a specified exceedance probability, has been popular in the financial industry since the mid-90s. However, the VaR fails to satisfy the axioms of coherent risk measures proposed by Artzner et al. (1999), and it leads to nonconvex portfolio optimization problems. On the other hand, the conditional Value-at-Risk (CVaR), defined as the expected value of the loss exceeding the VaR (Rockafellar and Uryasev 2000, 2002), possesses several good properties, such as convexity, monotonicity and positive homogeneity, and has been proved to belong to the class of coherent risk measures (Pflug 2000; Artzner et al. 1999). Rockafellar and Uryasev (2000, 2002) developed an equivalent formulation for computing the CVaR which leads to a convex optimization problem. Due to these nice properties, CVaR has been widely applied in portfolio selection and risk management, e.g., derivative portfolios (Alexander et al. 2006), credit risk optimization (Andersson et al. 2001), and robust portfolio management (Zhu and Fukushima 2009).

Although mean-risk portfolio optimization models have been studied extensively in the academic community, translating these models into useful tools for real-world financial practice is not a trivial task. Even for the classical MV portfolio selection model, it is well known that estimating the expected return and the covariance matrix is not easy, especially when the size of the portfolio is large (see, e.g., Merton 1980; DeMiguel et al. 2009a,b). Closely related to the estimation of the return statistics is the stability of the out-of-sample performance of the portfolio optimization model. DeMiguel et al. (2009b) examined several portfolio construction methods rooted in the MV portfolio selection formulation and found that these models cannot significantly or consistently outperform the naive strategy which allocates wealth evenly across all assets. As for the mean-CVaR portfolio optimization model, since CVaR measures only a small portion of the whole distribution, a large number of samples is needed to guarantee statistical stability. Takeda and Kanamori (2009) and Kondor et al. (2007) showed that the mean-CVaR portfolio optimization model suffers from more serious out-of-sample instability than the MV model. Recently, Lim et al. (2011) reported similar results: the portfolio produced by the mean-CVaR decision model is extremely unreliable due to estimation errors, and this problem is even worse when the return distribution has a heavy tail. Several methods have been proposed to deal with the unstable out-of-sample performance of the mean-CVaR portfolio optimization model. Gotoh and Takeda (2011) introduced norm regularization into the mean-risk portfolio decision model to promote sparsity of the portfolio decision. Gotoh et al. (2013) further adopted a robust mean-CVaR portfolio optimization technique to overcome this instability problem.

Motivated by the above research (Lim et al. 2011; Gotoh and Takeda 2011; Gotoh et al. 2013), we propose to use a sparse portfolio and multiple risk measures to mitigate the fragility of the CVaR-based data-driven portfolio selection model. More specifically, we add the l 1-norm penalty of the portfolio decision vector and the variance of the portfolio return to the mean-CVaR portfolio selection model. To enhance the sparsity of the solution, we also adopt the reweighted-l 1 norm method, computing the weights iteratively. Our numerical experiments show that the resulting out-of-sample performance is significantly enhanced compared with the traditional data-driven mean-CVaR (DDMC) portfolio optimization model.

This paper is organized as follows. The alternative formulations of the DDMC portfolio optimization problems are proposed in Sect. 10.2. The out-of-sample performance of these different models is evaluated by using the simulation approach in Sect. 10.3. The paper is concluded in Sect. 10.4.

2 The Data-Driven Mean-Risk Portfolio Optimization

We consider a portfolio constructed from n candidate risky assets, whose random returns are denoted by \(\mathbf{R} \in \mathbb{R}^{n}\). Let \(\mathbf{x} = (x_{1},\cdots \,,x_{n})^{{\prime}}\in \mathbb{R}^{n}\) be the portfolio decision vector, which represents the fraction of wealth allocated to each security. Let f(x, R) be the portfolio loss associated with x and R; e.g., we can simply set f(x, R) = b − R′x, where b is a benchmark return. To define the CVaR of the loss f(x, R) for a given confidence level β (e.g., β = 95%), we need the cumulative distribution function of f(x, R),

$$\displaystyle\begin{array}{rcl} \Psi (y) = \mathbb{P}(f(\mathbf{x},\mathbf{R}) \leq y),& & {}\\ \end{array}$$

for \(y \in \mathbb{R}\). The corresponding β-tail distribution for the given confidence level β is

$$\displaystyle\begin{array}{rcl} \Psi _{\beta }(y) = \left \{\begin{array}{ll} 0, &\mbox{ if}\ y <\text{VaR}_{\beta }, \\ \frac{\Psi (y)-\beta } {1-\beta },&\mbox{ if}\ y \geq \text{VaR}_{\beta }, \end{array} \right.& &{}\end{array}$$
(10.1)

where \(\text{VaR}_{\beta } =\inf \{ z\ \vert \ \Psi (z) \geq \beta \}\). The CVaR of the loss function f(x, R) is then given by

$$\displaystyle\begin{array}{rcl} \text{CVaR}[f(\mathbf{x},\mathbf{R})]:=\int _{y\geq \text{VaR}_{\beta }}y\,d\Psi _{\beta }(y),& &{}\end{array}$$
(10.2)

where the integration should be understood as a summation when R is a discrete random vector. Note that the above definition of CVaR is for a general distribution of the loss f(x, R); see, e.g., Rockafellar and Uryasev (2002) for some subtle differences in the definition of CVaR between the cases of discrete and continuous random variables. Rockafellar and Uryasev (2000, 2002) showed that CVaR[f(x, R)] can be computed by solving a simple convex optimization problem.

Lemma 2.1

The CVaR of the loss f( x , R ) can be computed as follows:

$$\displaystyle\begin{array}{rcl} \mathit{CVaR}[f(\mathbf{x},\mathbf{R})] =\min _{\alpha }\Big\{\alpha + \frac{1} {1-\beta }\text{E}\big[(f(\mathbf{x},\mathbf{R})-\alpha )^{+}\big]\Big\},& & {}\\ \end{array}$$

where α is an auxiliary variable and (y) + := max { y, 0 }.
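For an empirical distribution with m equally weighted loss samples, the minimizing α in Lemma 2.1 is attained at one of the sample points, so the CVaR can be computed by a direct scan. A minimal sketch in Python (the function name is ours):

```python
import numpy as np

def empirical_cvar(losses, beta=0.95):
    """CVaR of equally weighted loss samples via the Rockafellar-Uryasev
    minimization in Lemma 2.1; the optimal alpha is one of the samples."""
    losses = np.asarray(losses, dtype=float)
    m = len(losses)
    vals = [a + np.maximum(losses - a, 0.0).sum() / (m * (1.0 - beta))
            for a in losses]
    return min(vals)
```

For large m one would locate the β-quantile directly instead of scanning all candidates, but the scan makes the correspondence with the lemma explicit.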

Let D = {r 1, r 2, ⋯ , r m } be the data set of historical returns, where \(\mathbf{r}_{i} \in \mathbb{R}^{n}\) is the i-th sample of the returns and m is the number of samples we can observe. Without loss of generality, we assume the samples r i , i = 1, ⋯ , m, to be mutually independent. The data set D can also be regarded as m realizations of the random return R. From Lemma 2.1, if we fix the loss function as f(x, R) = b − R′x, the data-driven mean-CVaR portfolio optimization model is given as follows:

$$\displaystyle\begin{array}{rcl} (\mathcal{P}_{1})\ \ \min _{\mathbf{x},\alpha }& \ \ \alpha + \frac{1} {m(1-\beta )}\sum _{i=1}^{m}(b -\mathbf{r}_{i}^{{\prime}}\mathbf{x}-\alpha )^{+}&{}\end{array}$$
(10.3)
$$\displaystyle\begin{array}{rcl} \text{Subject to}&:& \sum _{i=1}^{n}x_{ i} = 1,{}\end{array}$$
(10.4)
$$\displaystyle\begin{array}{rcl} & & \frac{1} {m}\sum _{i=1}^{m}\mathbf{r}_{ i}^{{\prime}}\mathbf{x} \geq d,{}\end{array}$$
(10.5)

where d is a pre-specified target return level. By introducing auxiliary variables, problem \((\mathcal{P}_{1})\) can be reformulated as a linear programming problem. To mitigate the instability of the out-of-sample performance of the DDMC model \((\mathcal{P}_{1})\), we propose the following model \((\mathcal{P}_{2}(\omega ))\) with a given weighting vector \(\omega \in \mathbb{R}^{n}\),

$$\displaystyle\begin{array}{rcl} & (\mathcal{P}_{2}(\omega )):\ \ \min _{\mathbf{x},\alpha }\ \ \alpha + \frac{1} {m(1-\beta )}\sum _{i=1}^{m}(b -\mathbf{r}_{i}^{{\prime}}\mathbf{x}-\alpha )^{+} +\| \mathbf{x}\|_{ 1}^{\omega },& \\ & \text{Subject to}:\ \ \mathbf{x}\ \text{satisfies}\ (\mbox{ 10.4})\ \text{and}\ (\mbox{ 10.5}),&{}\end{array}$$
(10.6)

where ω = (ω 1, ⋯ , ω n )′ with ω i  ≥ 0 for i = 1, ⋯ , n, and

$$\displaystyle\begin{array}{rcl} \|\mathbf{x}\|_{1}^{\omega }:=\sum _{ i=1}^{n}\omega _{ i}\vert x_{i}\vert.& & {}\\ \end{array}$$

When ω is the vector with all elements equal to 1, the weighted l 1-norm reduces to the l 1-norm, which is denoted by ∥ x ∥ 1. Using the l 1 norm as a penalty to promote sparsity of the solution is a standard routine in data analysis. The ideal sparsity penalty is the l 0 norm, defined as ∥ x ∥ 0 =  ∑ i = 1 n | Sign(x i ) | with Sign(a) = 1 if a > 0, Sign(a) = −1 if a < 0 and Sign(a) = 0 if a = 0. However, the l 0 norm is highly nonconvex and hard to optimize directly. It has been proved that the l 1 norm of x, ∥ x ∥ 1, is the convex envelope of ∥ x ∥ 0 (see Zhao and Li 2012). Thus, it is reasonable to use the l 1 norm as a surrogate of the l 0 norm to promote sparsity. In model \((\mathcal{P}_{2}(\omega ))\), we prefer the weighted-l 1 norm, which further enhances sparsity by varying the choice of the vector ω. Note that problem (\(\mathcal{P}_{2}(\omega )\)) can be reformulated as a linear programming problem,

$$\displaystyle\begin{array}{rcl} (\bar{\mathcal{P}}_{2}(\omega )):\ \ \min _{\mathbf{x},\alpha,\tau,\phi }& & \alpha + \frac{1} {m(1-\beta )}\sum _{i=1}^{m}\tau _{i} +\sum _{j=1}^{n}\phi _{j}, {}\\ \text{Subject to}&:& \ \ \tau _{i} \geq 0,\ i = 1,\cdots \,,m, {}\\ & & \ \ b -\mathbf{r}_{i}^{{\prime}}\mathbf{x} -\alpha \leq \tau _{i},\ \ i = 1,\cdots \,,m, {}\\ & & \ \ \omega _{j}x_{j} \leq \phi _{j},\ \ j = 1,\cdots \,,n, {}\\ & & \ \ \omega _{j}x_{j} \geq -\phi _{j},\ \ j = 1,\cdots \,,n, {}\\ & & \ \ \sum _{j=1}^{n}x_{j} = 1, {}\\ & & \ \ \frac{1} {m}\sum _{i=1}^{m}\mathbf{r}_{i}^{{\prime}}\mathbf{x} \geq d, {}\\ \end{array}$$

where τ i for i = 1, ⋯ , m and ϕ j for j = 1, ⋯ , n are auxiliary decision variables.
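This linear program can be assembled directly for an off-the-shelf LP solver. The following sketch uses `scipy.optimize.linprog` with the stacked variable z = [x, α, τ, φ]; the function name and interface are ours, and setting ω = 0 recovers problem \((\mathcal{P}_{1})\):

```python
import numpy as np
from scipy.optimize import linprog

def solve_p2(returns, b=0.0, beta=0.95, d=0.0, omega=None):
    """LP sketch of (P2(omega)) on an (m x n) matrix of return samples.
    Variable vector z = [x (n), alpha (1), tau (m), phi (n)]."""
    m, n = returns.shape
    omega = np.zeros(n) if omega is None else np.asarray(omega, dtype=float)
    c = np.concatenate([np.zeros(n), [1.0],
                        np.full(m, 1.0 / (m * (1.0 - beta))),
                        np.ones(n)])
    # b - r_i'x - alpha <= tau_i   <=>   -r_i'x - alpha - tau_i <= -b
    A1 = np.hstack([-returns, -np.ones((m, 1)), -np.eye(m), np.zeros((m, n))])
    # |omega_j x_j| <= phi_j, split into two one-sided constraints
    A2 = np.hstack([np.diag(omega), np.zeros((n, 1 + m)), -np.eye(n)])
    A3 = np.hstack([-np.diag(omega), np.zeros((n, 1 + m)), -np.eye(n)])
    # target-return constraint (10.5): -(1/m) sum_i r_i'x <= -d
    A4 = np.hstack([-returns.mean(axis=0)[None, :], np.zeros((1, 1 + m + n))])
    A_ub = np.vstack([A1, A2, A3, A4])
    b_ub = np.concatenate([-b * np.ones(m), np.zeros(2 * n), [-d]])
    # budget constraint (10.4): sum_j x_j = 1
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1 + m + n))])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds)
    return res.x[:n], res.fun
```

In the experiments reported below a commercial solver (CPLEX) is used; this open-source sketch is only meant to make the reformulation concrete.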

In this work, we also consider integrating the variance of the portfolio return into model \((\mathcal{P}_{2}(\omega ))\) to further enhance the stability of the out-of-sample performance, i.e.,

$$\displaystyle\begin{array}{rcl} (\mathcal{P}_{3}(\omega )):\ \ \min _{\mathbf{x},\alpha }& & \alpha + \frac{1} {m(1-\beta )}\sum _{i=1}^{m}(b -\mathbf{r}_{i}^{{\prime}}\mathbf{x}-\alpha )^{+} +\| \mathbf{x}\|_{ 1}^{\omega } +\rho \,\mathbf{x}^{{\prime}}F\mathbf{x} \\ \text{Subject to}&:& \ \ \mathbf{x}\ \text{satisfies}\ (\mbox{ 10.4})\ \text{and}\ (\mbox{ 10.5}), {}\end{array}$$
(10.7)

where \(F \in \mathbb{R}^{n\times n}\) is the sample covariance matrix of the asset returns and ρ > 0 is a weighting parameter. Note that, similar to problem \((\mathcal{P}_{2}(\omega ))\), problem \((\mathcal{P}_{3}(\omega ))\) can be reformulated as a convex quadratic programming problem, which can be solved efficiently by a commercial solver such as IBM CPLEX (IBM 2015).

3 Evaluation and Discussion

3.1 Evaluation Methods

To evaluate the out-of-sample performance of the three portfolio optimization models \((\mathcal{P}_{1})\), \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\), we mainly adopt a simulation approach with all parameters estimated from real historical price data of a stock index. The main reason for using this approach is as follows. The number of historical monthly-return observations available in real portfolio management is very limited, so it is hard to carry out extensive tests using only true market data. On the other hand, the simulation approach can generate different types of test data sets, which gives us more freedom to evaluate the performance of the three models under different situations. More specifically, we adopt the following procedure.

  (a)

    Data Generation: Generate a data set of returns D sample = {r 1, ⋯ , r m } of size m according to some distribution of the returns.Footnote 1 For example, if we assume the random returns follow a mixture of a multivariate normal distribution and an exponential distribution with given mean vector and covariance matrix, we generate m samples of the returns according to this distribution.

  (b)

    Optimization: Solve all three problems \((\mathcal{P}_{1})\), \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\) on the data set D sample to obtain the portfolio decisions x 1, x 2 and x 3, respectively. If necessary, we can vary the target return level d in the three models to obtain the portfolio policy x i(d), i = 1, 2, 3, for different levels of d.

  (c)

    Evaluation: Generate 50 data sets D test (i), i = 1, ⋯ , 50, according to the same distribution used in the Data Generation step, each of size m. For each test set D test (i), we implement the portfolio policy x i(d), i = 1, 2, 3, and compute the corresponding empirical expected return and CVaR.

In the Evaluation step, we actually perform 50 trials of out-of-sample tests and record the resulting empirical expected return and CVaR. In each trial, we use IBM CPLEX (IBM 2015) as the solver for the corresponding linear programming and convex quadratic programming problems of \((\mathcal{P}_{1})\), \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\).
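The out-of-sample evaluation in steps (a)-(c) can be sketched as follows; `solve` stands for any of the three portfolio models, the test sets are arrays of sampled returns, and all helper names are hypothetical:

```python
import numpy as np

def evaluate(solve, train, test_sets, beta=0.95, b=0.0):
    """Fit a portfolio on `train` and record, for each test set, the
    empirical mean return and the empirical CVaR of the loss b - r'x."""
    x = solve(train)                           # step (b): fit the portfolio
    stats = []
    for D in test_sets:                        # step (c): out-of-sample trials
        losses = b - D @ x                     # realized losses b - r'x
        var = np.quantile(losses, beta)        # empirical VaR at level beta
        cvar = var + np.maximum(losses - var, 0.0).mean() / (1.0 - beta)
        stats.append(((D @ x).mean(), cvar))
    return np.array(stats)                     # rows: (mean return, CVaR)
```

The spread of the returned (mean, CVaR) pairs across the 50 test sets is exactly the instability measure discussed in the experiments below.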

3.2 Data Generation

In this paper, we use the 48 industry portfolios constructed by Fama and French as the basic data set for our tests.Footnote 2 We estimate the mean vector and covariance matrix of the monthly returns using historical monthly returns from Jan 1998 to Dec 2015. Note that there are only 216 samples of the returns, yet we need to estimate 1176 unknown parameters in the covariance matrix,Footnote 3 which implies that the sample covariance matrix may be singular. To overcome this difficulty, we adopt the shrinkage estimator of the covariance matrix proposed by Ledoit and Wolf (2003) with the shrinkage coefficient set to 0.1. Given the sample mean vector of the returns, \(\hat{\mathbf{R}}:= (\hat{R}_{1},\cdots \,,\hat{R}_{n})^{{\prime}}\), and the covariance estimate \(\hat{\Sigma }:=\{\hat{ \Sigma }_{i,j}\}_{i=1,j=1}^{n,n}\), we use the following method to generate the samples. Adopting a setting similar to that of Lim et al. (2011), we construct a hybrid distribution combining a multivariate normal distribution and an exponential distribution. Let B(η) be a Bernoulli random variable with parameter η, i.e., B(η) = 1 with probability η and B(η) = 0 with probability 1 − η. Let z be an exponential random variable with cumulative distribution function

$$\displaystyle\begin{array}{rcl} \mathbb{P}(z < a) =\int _{ 0}^{a}\lambda e^{-\lambda s}ds.& & {}\\ \end{array}$$

In this paper, we simply fix λ = 10. Suppose the random vector \(Y \in \mathbb{R}^{n}\) follows the multivariate normal distribution with mean vector \(\hat{\mathbf{R}}\) and covariance matrix \(\hat{\Sigma }\). We assume the random return is captured by the hybrid distribution as follows:

$$\displaystyle\begin{array}{rcl} \mathbf{R} \sim -B(\eta )\big(z\mathbf{1} + \mathbf{c}\big) +\big (1 - B(\eta )\big)Y,& & {}\\ \end{array}$$

where c: = (c 1, ⋯ , c n )′ with \(c_{i}:=\hat{ R}_{i} -\sqrt{\hat{\Sigma }_{ii}}\) for i = 1, ⋯ , n, and \(\hat{\Sigma }_{ii}\) is the i-th diagonal element of \(\hat{\Sigma }\). Note that the parameter η controls the tail loss of the distribution: the larger η is, the heavier the tail of the distribution. Figure 10.1 plots the distribution of one entry of R for different values of η.
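Under the stated assumptions, the hybrid return distribution can be sampled as follows (the function name is ours; with probability η a common downward exponential shock replaces the normal draw):

```python
import numpy as np

def sample_hybrid_returns(mu, sigma, eta=0.1, lam=10.0, m=200, seed=None):
    """Draw m samples of R ~ -B(eta)(z*1 + c) + (1 - B(eta)) Y, where
    z ~ Exp(lam), c_i = mu_i - sqrt(Sigma_ii) and Y ~ N(mu, Sigma)."""
    rng = np.random.default_rng(seed)
    c = mu - np.sqrt(np.diag(sigma))          # c_i = R_i - sqrt(Sigma_ii)
    tail = rng.random(m) < eta                # B(eta) indicators
    z = rng.exponential(1.0 / lam, size=m)    # Exp(lam): mean 1/lam
    normal = rng.multivariate_normal(mu, sigma, size=m)
    shock = -(z[:, None] + c[None, :])        # common downward shock
    return np.where(tail[:, None], shock, normal)
```

Note that in a shock scenario the same exponential draw z hits all assets, which is what produces the joint heavy left tail controlled by η.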

Fig. 10.1
figure 1

The empirical distribution of the hybrid random return R 1 for different values of η

3.3 Re-Weighted Method for Sparse Solution

In portfolio optimization models \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\), we use the weighted-l 1 norm to promote sparsity of the solution. However, since the objective function is a weighted sum of the CVaR and the weighted-l 1 norm of the portfolio weights, we need to choose the weighting parameter ω carefully. If ∥ ω ∥ is too large, the optimality of the CVaR will be jeopardized. On the other hand, if ∥ ω ∥ is too small, the resulting solution will not be sparse enough. To overcome this difficulty, we adopt the iterative reweighted l 1 norm method to enhance the sparsity of the solution (see, e.g., Zhao and Li 2012). More specifically, we apply the following iterative procedure to update the weighting parameter ω dynamically and adaptively. Let \(\omega ^{(k)} \in \mathbb{R}^{n}\) and x (k) be the weighting vector and portfolio decision vector in the k-th iteration, respectively. We repeat the following steps.

  (1)

    For a given ω (k), solve problem \(\mathcal{P}_{2}(\omega ^{(k)})\) (or problem \((\mathcal{P}_{3}(\omega ^{(k)}))\)), which gives the solution x (k). If the stopping criterion is satisfied, e.g., the sparsity of x (k) no longer changes, stop the iteration. Otherwise, go to step (2).

  (2)

    Use x (k) to construct the new weighting vector ω (k+1), set k = k + 1, and go to step (1).
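The two-step loop above can be sketched as a short driver; `solve(omega)` returns the optimal portfolio of \((\mathcal{P}_{2}(\omega ))\) or \((\mathcal{P}_{3}(\omega ))\) and `update(x)` builds the next weight vector (both hypothetical callables):

```python
import numpy as np

def reweighted_l1(solve, update, omega0, max_iter=20, zero_tol=1e-8):
    """Iterate solve/reweight until the sparsity pattern of the
    solution no longer changes (the stopping criterion above)."""
    omega, pattern = np.asarray(omega0, dtype=float), None
    for _ in range(max_iter):
        x = solve(omega)
        new_pattern = np.abs(x) > zero_tol     # current support of x
        if pattern is not None and np.array_equal(new_pattern, pattern):
            break
        pattern = new_pattern
        omega = update(x)
    return x
```

Since LP/QP solutions are rarely exactly zero, the support is read off with a small threshold `zero_tol`, which is our own implementation choice.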

There are several ways to construct the new weighting vector \(\omega ^{(k+1)} =\big (\omega _{1}^{(k+1)},\omega _{2}^{(k+1)},\cdots \,,\omega _{n}^{(k+1)}\big)^{{\prime}}\) from the information in \(\mathbf{x}^{(k)} =\big (x_{1}^{(k)},\cdots \,,x_{n}^{(k)}\big)^{{\prime}}\). Motivated by Zhao and Li (2012) and based on our numerical experiments, we select the following three methods, which perform relatively better than the others. Let ε > 0 be a small positive number.

  (a)

    Method I: Let \(\omega _{j}^{(k+1)} = 1/(\vert x_{j}^{(k)}\vert +\epsilon )\) for j = 1, ⋯ , n.

  (b)

    Method II: Let \(\omega _{j}^{(k+1)} = 1/(\vert x_{j}^{(k)}\vert +\epsilon )^{1-p}\) for j = 1, ⋯ , n with p ∈ (0, 1).

  (c)

    Method III: Let \(\omega _{j}^{(k+1)} = \big(p + (\vert x_{j}^{(k)}\vert +\epsilon )^{1-p}\big)/\big((\vert x_{j}^{(k)}\vert +\epsilon )^{1-p}\big[\vert x_{j}^{(k)}\vert +\epsilon +(\vert x_{j}^{(k)}\vert +\epsilon )^{p}\big]\big)\) for j = 1, ⋯ , n with p ∈ (0, 1).

It is not hard to see that when x j (k) is small, the corresponding weighting coefficient ω j (k+1) will be large, which drives x j (k+1) to be even smaller in the next round of optimization.
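Methods I and II can be implemented in a few lines (Method III follows the same pattern); the function name and interface are ours:

```python
import numpy as np

def reweight(x, method="I", eps=1e-3, p=0.5):
    """Weight updates omega^(k+1) from x^(k) for Methods I and II."""
    ax = np.abs(np.asarray(x, dtype=float))
    if method == "I":      # omega_j = 1/(|x_j| + eps)
        return 1.0 / (ax + eps)
    if method == "II":     # omega_j = 1/(|x_j| + eps)^(1-p)
        return 1.0 / (ax + eps) ** (1.0 - p)
    raise ValueError("unknown method")
```

As the text notes, both updates assign large weights to near-zero entries, pushing them further toward zero in the next solve.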

3.4 Comparison of the Global Mean-CVaR Portfolio

In this section, we compare the out-of-sample performance of the three models \((\mathcal{P}_{1})\), \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\) in the special case of finding the global minimum-CVaR portfolio. More specifically, we consider the three problems with constraint (10.5) ignored. Following the evaluation procedure in Sect. 10.3.1, we generate one data set D sample to compute the corresponding portfolio weights and apply the resulting portfolio decisions to 50 testing data sets D test (j), j = 1, ⋯ , 50, as out-of-sample tests. We consider three sample sizes for D sample and D test (j): m = 200, m = 300 and m = 400.

Figures 10.2, 10.3 and 10.4 plot the empirical mean value and CVaR of the global minimum-CVaR portfolio return generated from the 50 out-of-sample tests. We can observe that the empirical mean-CVaR pairs of model \((\mathcal{P}_{1})\) spread over a quite large range. By using our proposed models \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\), the range of the resulting empirical mean-CVaR pairs is significantly reduced. Table 10.1 records the details of these experiments. The columns 'min', 'max' and 'range' show the minimum value, the maximum value and the range (i.e., 'max' − 'min') of the corresponding data set, respectively. For the case m = 200, the resulting mean and CVaR of model \((\mathcal{P}_{1})\) range from −0.0037 to 0.0541 and from 0.2634 to 0.5943, respectively. That is, the ranges of the out-of-sample CVaR and mean value are 0.33 and 0.0578 for model \((\mathcal{P}_{1})\). In the same row of m = 200, we can observe that these ranges are reduced to 0.1756 and 0.0343 for model \((\mathcal{P}_{2}(\omega ))\) and to 0.1394 and 0.0292 for model \((\mathcal{P}_{3}(\omega ))\). From Table 10.1, we can also see that as the sample size increases, e.g., in the cases m = 300 and m = 400, the variation of the resulting empirical mean and CVaR of models \((\mathcal{P}_{1})\), \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\) is reduced. Nevertheless, models \((\mathcal{P}_{2}(\omega ))\) and \((\mathcal{P}_{3}(\omega ))\) still outperform model \((\mathcal{P}_{1})\).

Fig. 10.2
figure 2

The out-of-sample performance of three models with sample size m = 200 and ε = 0.1

Fig. 10.3
figure 3

The out-of-sample performance of three models with sample size m = 300 and ε = 0.1

Fig. 10.4
figure 4

The out-of-sample performance of three models with sample size m = 400 and ε = 0.1

Table 10.1 The empirical mean value and CVaR of portfolio returns for the global minimum-CVaR problem generated by different models with η = 0.1

Table 10.2 and Figs. 10.5, 10.6 and 10.7 show the detailed results of the comparison between the three models when η = 0.2. As illustrated in Sect. 10.3.2, the parameter η controls the shape of the tail of the return distribution. In this case, the stock returns have heavier tails compared with the previous case of η = 0.1. Nevertheless, a similar pattern can be observed: formulations (\(\mathcal{P}_{2}(\omega )\)) and (\(\mathcal{P}_{3}(\omega )\)) better control the variation of the empirical mean return and CVaR.

Table 10.2 The empirical mean value and CVaR of portfolio returns for the global minimum-CVaR problem generated by different models with η = 0.2
Fig. 10.5
figure 5

The out-of-sample performance of three models with sample size m = 200 and ε = 0.2

Fig. 10.6
figure 6

The out-of-sample performance of three models with sample size m = 300 and ε = 0.2

Fig. 10.7
figure 7

The out-of-sample performance of three models with sample size m = 400 and ε = 0.2

3.5 Comparison of the Empirical Efficient Frontiers

In this section, we compare the mean-CVaR efficient frontiers generated by the three models (\(\mathcal{P}_{1}\)), (\(\mathcal{P}_{2}(\omega )\)) and \((\mathcal{P}_{3}(\omega ))\). The efficient frontiers are generated by varying the target return d from 0.01 to 0.1 in all models. Figures 10.8, 10.9 and 10.10 plot the out-of-sample empirical mean-CVaR efficient frontiers for 50 trials of simulation with η = 0.1. Table 10.3 shows the detailed statistics of the comparison. In Table 10.3, the columns 'min dev', 'max dev' and 'mean dev' represent the minimum, maximum and average deviations of the out-of-sample CVaR and expected return.Footnote 4 Note that these deviations are computed over all values of d in the 50 trials of simulation. In all of these tests, we can observe that the proposed formulations (\(\mathcal{P}_{2}(\omega )\)) and \((\mathcal{P}_{3}(\omega ))\) perform better than the traditional model \((\mathcal{P}_{1})\). For example, in the row m = 200 of Table 10.3, the maximum deviations of the three models are 19.98%, 17.35% and 12.54%, respectively, and the average deviations are 5.04%, 2.99% and 2.68%, respectively. A similar pattern can be observed when we make the tail of the return distribution heavier. Figures 10.11, 10.12 and 10.13 and Table 10.4 provide the details of the improvement in this case.

Fig. 10.8
figure 8

The out-of-sample performance of three models with sample size m = 200 and ε = 0.2

Fig. 10.9
figure 9

The out-of-sample performance of three models with sample size m = 300 and ε = 0.2

Fig. 10.10
figure 10

The out-of-sample performance of three models with sample size m = 400 and ε = 0.2

Fig. 10.11
figure 11

The out-of-sample performance of three models with sample size m = 200 and ε = 0.2

Fig. 10.12
figure 12

The out-of-sample performance of three models with sample size m = 300 and ε = 0.2

Fig. 10.13
figure 13

The out-of-sample performance of three models with sample size m = 400 and ε = 0.2

Table 10.3 Comparison of mean-CVaR efficient frontiers for different models with η = 0.1
Table 10.4 Comparison of mean-CVaR efficient frontiers for different models with η = 0.2

4 Conclusion

In this work, we proposed methods to mitigate the instability of the out-of-sample performance of the mean-CVaR portfolio optimization model. More specifically, we suggest adding the weighted l 1 norm as a sparsity penalty on the portfolio decision and adding the variance term in the objective function to control the total variation in the mean-CVaR portfolio formulation. To balance the sparsity and optimality of the solution, the reweighted l 1 norm method is adopted to adjust the weighting coefficients. Our simulation-based experiments show that the proposed methods significantly reduce the variation of the empirical mean value and the CVaR of the portfolio return in out-of-sample tests. However, our experiments also reveal some limitations. When the size of the portfolio is large, e.g., n = 500, our methods alone may not control the out-of-sample variation to a desired level. A possible remedy in this case is to increase the number of samples by using statistical resampling methods such as the bootstrap. Another important issue is the computational burden of the proposed methods when n and m are large. For example, for problem \((\mathcal{P}_{2}(\omega ))\), the linear programming reformulation (given in Sect. 10.2) has roughly m + 2n decision variables and 2(m + n) constraints. In the literature, Künzi-Bay and Mayer (2006) showed that using the dual formulation and a decomposition approach may enhance the efficiency of the solution procedure. All the models considered in this work belong to the static portfolio optimization formulation, which gives buy-and-hold portfolio policies. Studying the stability of out-of-sample tests for the multiperiod mean-CVaR portfolio optimization problem is an interesting and challenging topic.