1 Introduction

The seminal work of Markowitz (1952) changed the landscape of asset allocation problems, which up to that point were usually tackled in an ad-hoc fashion, see Kolm et al. (2014). By casting portfolio selection as a well-defined optimization problem, he established the risk-return paradigm which is still the fundamental reference framework used by most investment professionals. In essence, he created the conceptual structure that gave birth to modern portfolio theory.

While Markowitz’s contribution cannot be overstated, the direct application of the method proposed in the original paper proved problematic in practice for a variety of reasons. First, the sensitivity of the solution to the inputs (expected returns and the covariance matrix) is considerable, and given that estimation errors are always present, the resulting portfolio is often deemed unreliable (see Kolm et al. 2014 and references therein). Second, it has been observed that in many cases the solution is not fully diversified. It is widely accepted that diversification is highly desirable in practice since it is an efficient way of protecting investors against unexpected events (e.g. situations of extreme volatility or geopolitical turmoil). However, forcing diversification in an ad-hoc fashion can lead to poor-quality solutions due to an overly constrained feasible set. In Green and Hollifield (1992) the authors derived theoretical conditions under which diversification occurs for the mean-variance problem.

In addition, using the variance as a risk measure brings both practical and theoretical limitations. Being symmetric, the variance is sensitive to extreme values on both ends of the distribution. In essence, it mixes both “positive” and “negative” events, offering a distorted assessment of the true risk the investor is exposed to. In the last 60 years there has been a significant amount of research into different ways of capturing risk (see the excellent surveys by Krokhmal et al. 2011; Mansini et al. 2014), and we make use of some of those risk measures in the present work.

Markowitz’s optimization model assumes the parameters are known exactly, which, of course, is not the case in practice. Several alternatives have been proposed to overcome this issue: popular choices include the assumption that the parameters follow a known distribution function, or that they belong to a specific uncertainty set. In the first case, a widely used methodology is the sample average approximation (SAA) approach described in Shapiro (2003). Given a distribution function for the returns, a deterministic approximation of the original stochastic problem can be constructed by drawing samples from the given distribution. Under certain conditions, it can be shown that, as the sample size increases, the optimal solution and optimal value of the SAA problem converge to their exact counterparts in the original stochastic problem. More traditional methods in stochastic programming often make assumptions regarding the underlying distribution of the problem. SAA is flexible, and it has been shown that even for problems with an astronomical number of scenarios good candidate solutions can be obtained by sampling a few thousand of those scenarios (e.g. Linderoth et al. 2006).

Another popular approach is robust optimization (RO) and its variations. First proposed by Soyster (1973) and later developed in Bertsimas and Sim (2004), RO pursues a worst-case approach: it attempts to protect the decision maker against all possible realizations of the random parameters by considering an uncertainty set. Thus, the decision maker solves a deterministic problem to obtain a solution that offers protection against all possible realizations within this set. RO also helps mitigate estimation errors, generating portfolios that are less sensitive to parameter changes. Examples of RO applications in finance include DeMiguel and Nogales (2009), Fernandes et al. (2016), Goldfarb and Iyengar (2003), Kawas and Thiele (2011), Wang et al. (2016) and Quaranta and Zaffaroni (2008).

Building on these methods, more recent approaches consider risk-averse SAA formulations that combine machine learning and regularization schemes to obtain diversified portfolios with good out-of-sample performance. Machine learning tools, which are widely used in other areas such as regression analysis (see Bishop 2006) and data mining (see Witten et al. 2011), have recently been making their way into portfolio management. Techniques developed in this discipline, such as cross-validation and classification algorithms, can be of great value for estimation procedures and as decision-support tools.

Regularization techniques, commonly used in high-dimensional regression problems (e.g. Belloni and Chernozhukov 2013; Candes and Tao 2007), have also made an entrance in portfolio optimization due to their ability to cope with numerical stability issues (e.g. Tikhonov 1963) or to induce sparsity of the solution (e.g. Lasso, Tibshirani 1996). In Brodie et al. (2009) the authors reformulated the mean-variance problem as a constrained least-squares regression and added an \(\ell _1\) penalty to the objective function that encourages sparse portfolios. The \(\ell _1\) regularization is the most common approach to stabilize solutions of portfolio problems; see Corsaro and De Simone (2019) and Dai and Wen (2018) for additional examples. In Fastrich et al. (2015) the authors build on \(\ell _1\) (Lasso) regularization and propose new non-convex regularization terms that exhibit good performance when applied to large data sets, using cross-validation to estimate the regularization parameter. Using tools from statistical learning theory, Still and Kondor (2010) use the \(\ell _2\) norm of the weight vector to induce diversification and achieve stability in out-of-sample experiments.

Recently, Ban et al. (2016) proposed a method called performance-based regularization (PBR) which focuses on both sides of the optimization problem: it acknowledges parameter estimation difficulties, and aims at generating solutions that are more diversified, which is key in practice. The central idea is to exclude solutions that are in-sample optimal, but have potentially high out-of-sample variability. The authors considered the problem of minimizing two risk measures, the variance and the Conditional Value-at-Risk (CVaR), subject to having returns higher than some threshold, in addition to regularization constraints.

In this paper, we formulate an optimization problem based on a maximization-of-return approach (instead of a minimization-of-risk approach), since we believe it is more intuitive from a practical standpoint. With that as background, our objective is twofold.

First, we consider three different risk measures to cast the constraints of the portfolio optimization problem: (i) integrated chance constraints, proposed by Haneveld (1986); (ii) quantile deviation (see Cotton and Ntaimo 2015); and (iii) absolute semi-deviation, proposed in Ogryczak and Ruszczyński (2002). We prove certain properties of these measures that are key to constructing the corresponding PBR version of the relevant constraints.

Second, an extensive numerical comparison between a pure SAA approach and a PBR-based approach is presented using the three risk measures just described, in addition to the CVaR. In light of the mounting evidence in favor of passive strategies vis-à-vis active portfolio selection strategies, we have chosen to cast the optimization problem based on indices rather than individual stocks or bonds (see Arnott et al. 2000; Bogle 1995; Elton and Blake 1996; Malkiel 1995, 1996), and we rebalance the portfolio only once a year. The experiments cover the 10-year period between January 2003 and December 2012, and thus they include the subprime crisis, the most challenging market environment of the last 50 years. The strategy is based on a rolling horizon scheme, in which past data are used to estimate the portfolio optimization parameters, and the performance of the optimal solution is tested with samples obtained from a parameterized model of returns. Across different samples and risk measures, the results show that the formulation with PBR constraints yields higher levels of diversification than the solutions obtained with the SAA approach. Nevertheless, in periods of relative market stability, SAA outperforms PBR because the solutions with more variability end up being the ones with higher expected returns.

The rest of the paper is organized as follows. Section 2 describes common regularized mean-risk formulations, including PBR, as well as our modification based on a max-return framework. In Sect. 3 we derive the relevant expressions needed to use the three risk measures mentioned above within the context of PBR. Section 4 shows the numerical simulation results, and Sect. 5 presents our conclusions.

2 Regularized mean-risk formulations

Consider the following mean-risk formulation:

$$\begin{aligned} \min _{w \in {\mathbb {R}}^{p}} \quad&\mathrm {Risk}(w^{T}X), \nonumber \\ s.t. \quad&w^{T}{\mathbf {1}}_{p} = 1,\nonumber \\&w^{T}\mu = R, \nonumber \\&(w \ge 0), \end{aligned}$$
(1)

where \(w \in {\mathbb {R}}^{p}\) is the investor’s portfolio, \({\mathbf {1}}_{p} \in {\mathbb {R}}^{p}\) is a vector of ones, \(X \in {\mathbb {R}}^{p}\) is a random vector representing the returns of the p assets, \(\mu = {\mathbb {E}}(X)\) is the vector of expected values of the components of X and \(\mathrm {Risk}: {\mathcal {X}} \rightarrow {\mathbb {R}}\) is a risk measure defined on some space of random variables, e.g. the \(L^1\) space. The parentheses around the last constraint indicate that it is optional (it corresponds to a no-short-selling requirement). For most risk measures, and most distribution functions, it is not possible to solve problem (1) explicitly. The SAA formulation associated with problem (1) is given by

$$\begin{aligned} \min _{w \in {\mathbb {R}}^{p}} \quad&\widehat{\mathrm {Risk}}(w^{T}{\mathsf {X}}),\nonumber \\ s.t. \quad&w^{T}{\mathbf {1}}_{p} = 1, \nonumber \\&w^{T}{\hat{\mu }} = R, \nonumber \\&(w \ge 0), \end{aligned}$$
(2)

where \({\mathsf {X}} = (X_1,\ldots ,X_n)\) is the sample of n observed return vectors, \(\widehat{\mathrm {Risk}}(w^{T}{\mathsf {X}})\) is the sample estimator of \(\mathrm {Risk}(w^{T}X)\) and \({\hat{\mu }} = (1/n)\sum _{i=1}^{n} X_{i}\) is the vector of sample averages based on the n observations of the random vector X.
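
As a concrete illustration, the following is a minimal sketch of the SAA problem (2) when the risk measure is the variance (the classical mean-variance case discussed below). The data are simulated and cvxpy is used only for brevity; the experiments in Sect. 4 rely on Gurobi.

```python
# Minimal sketch of the SAA mean-variance instance of problem (2).
# Illustrative data only; not the setup used in the experiments.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, p, R = 250, 13, 0.06                      # observations, assets, target return
X = rng.normal(0.06, 0.15, size=(n, p))      # n observed return vectors X_1, ..., X_n

mu_hat = X.mean(axis=0)                      # sample mean returns
Xc = X - mu_hat                              # centered observations

w = cp.Variable(p)
risk_hat = cp.sum_squares(Xc @ w) / (n - 1)  # sample variance of w'X_i, i.e. w' Sigma_n w
prob = cp.Problem(cp.Minimize(risk_hat),
                  [cp.sum(w) == 1,           # budget constraint
                   mu_hat @ w == R,          # target expected return
                   w >= 0])                  # optional no-short-selling constraint
prob.solve()
print(np.round(w.value, 4))
```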

It is well-known that the solution of problem (2) can be highly unreliable due to estimation errors. This happens when \(\mathrm {Risk}(w^{T}X) = w^{T}\Sigma w\) (the mean-variance problem) as shown in Best and Grauer (1991), Broadie (1993), Chopra and Ziemba (1993), Frankfurter et al. (1971), Frost and Savarino (1986), Frost and Savarino (1988) and Michaud (1989), and also when the risk measure is the CVaR (Lim et al. 2011). Recall that the CVaR at a confidence level \(\alpha\) is defined as

$$\begin{aligned} \mathrm {CVaR}_{\alpha }(w^{T}X) := \min _{\eta \in {\mathbb {R}}} \eta + \frac{1}{1-\alpha }{\mathbb {E}}\left[ \big (w^{T}X - \eta \big )^{+}\right] , \end{aligned}$$

where \((a)^{+}\) denotes the maximum between \(a \in {\mathbb {R}}\) and 0. To overcome this problem, Ban et al. (2016) propose performance-based regularization (PBR) to control the instability of the SAA solution in terms of its out-of-sample behavior. The main idea behind their method is to constrain the variances of the estimators \(\widehat{\mathrm {Risk}}(w^{T}{\mathsf {X}})\) and \(w^{T}{\hat{\mu }}\) in order to move the SAA solution away from portfolios with high variability, which tend to be less diversified and, in many cases, yield poor out-of-sample performance. This regularization method imposes two additional constraints on problem (2), as follows:

$$\begin{aligned} \mathrm {SV}\left[ \widehat{\mathrm {Risk}}(w^{T}{\mathsf {X}}) \right]&\le U_{1}, \end{aligned}$$
(3)
$$\begin{aligned} \mathrm {SV}\left[ w^{T}{\hat{\mu }}\right]&\le U_{2}, \end{aligned}$$
(4)

where \(\mathrm {SV}(\cdot )\) is the standard unbiased sample estimator of the variance \(\mathrm {Var}(\cdot )\), and \(U_1\) and \(U_2\) are real numbers obtained using cross-validation on past data; the values selected are those whose solution yields the highest Sharpe ratio. Most importantly, the cross-validation procedure automatically defines the values of \(U_1\) and \(U_2\), so they are not inputs that need to be specified by the investor. Note that constraint (4) does not depend on the choice of the risk measure and can easily be converted to a quadratic constraint by noting that

$$\begin{aligned} \mathrm {Var}\left[ w^{T}{\hat{\mu }} \right] = w^{T}\Sigma w, \end{aligned}$$

where \(\Sigma\) is the covariance matrix of X, which gives

$$\begin{aligned} \mathrm {SV}\left[ w^{T}{\hat{\mu }} \right] = w^{T}\Sigma _{n} w, \end{aligned}$$

where \(\Sigma _{n}\) is the sample estimator of \(\Sigma\).
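
As a quick sanity check of the identity above (on illustrative data), the quadratic form \(w^{T}\Sigma _{n}w\) coincides with the unbiased sample variance of the projected returns \(w^{T}X_{i}\):

```python
# Numerical check (illustrative data): w' Sigma_n w equals the unbiased
# sample variance of the projected observations w'X_i.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.1, size=(200, 13))     # n = 200 observations of p = 13 returns
w = rng.dirichlet(np.ones(13))               # a random long-only portfolio

Sigma_n = np.cov(X, rowvar=False)            # unbiased sample covariance estimator
print(w @ Sigma_n @ w)                       # quadratic form
print(np.var(X @ w, ddof=1))                 # sample variance of w'X_i: same value
```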

Expression (3) is more involved; when the risk measure is the variance or the CVaR, Ban et al. (2016) derived explicit expressions for constraint (3). Recall that one of the purposes of this paper is to derive closed-form expressions for constraint (3) when using the three risk measures mentioned before. Our first step is to consider a mean-risk formulation different from problem (2), as follows:

$$\begin{aligned} \max _{w \in {\mathbb {R}}^{p}} \quad&w^{T}{\hat{\mu }}, \nonumber \\ s.t. \quad&w^{T}{\mathbf {1}}_{p} = 1, \nonumber \\&\widehat{\mathrm {Risk}}(w^{T}{\mathsf {X}}) \le k, \nonumber \\&\mathrm {SV}\left[ \widehat{\mathrm {Risk}}(w^{T}{\mathsf {X}}) \right] \le U_{1}, \nonumber \\&w^{T}\Sigma _{n}w \le U_{2}, \nonumber \\&(w \ge 0), \end{aligned}$$
(5)

where k is a risk-tolerance parameter selected by the investor. We believe that formulating the optimization problem in reference to a return-maximization framework is more intuitive from a practical viewpoint. Additionally, we think it is easier to select a priori an appropriate risk parameter than a desired return. (Returns, unlike risk tolerance levels, can be, at least in theory, unbounded). In the next section we derive explicit convex formulations of problem (5) for the three different risk measures.

3 Extension to different risk measures

We start by defining the three risk measures we work with in this paper; a short numerical sketch of their sample estimators is given right after the list. In all cases, let \(X \in {\mathbb {R}}^{p}\) be a random vector and \(w \in {\mathbb {R}}^{p}\) a vector that represents decisions. All proofs of propositions and lemmas are relegated to the appendix.

  • Integrated chance constraints (ICC): In Haneveld (1986) the integrated chance constraints (ICC) are defined as

    $$\begin{aligned} {\mathbb {E}}_{\omega }\Big [ \big (h(\omega ) - w^{T}X(\omega ) \big )^{+}\Big ], \end{aligned}$$

    where \(h(\omega ) \in {\mathbb {R}}\) is a random benchmark (e.g. an index) and \((x)^{+} = \max \{x,0\}\). The ICC imposes that the average shortfall with respect to the target be bounded by a given threshold, which here will be the risk-tolerance parameter k. Let \({\mathsf {Y}} = (Y_{1}, \ldots , Y_{n})\) be a random vector with \(Y_{i} = (X_{i,1}, \ldots , X_{i,p}, h_{i})\), where \(X_{i,j}\) is the return of asset j in scenario i and \(h_{i}\) is the value of the stochastic benchmark \(h(\omega )\) in scenario i. We write \(X_{i} = (X_{i,1},\ldots , X_{i,p})\) for the vector representing the i-th sample of asset returns, and \({\mathsf {X}} = (X_{1}, \ldots , X_{n})\) is the random vector of returns, as in formulation (2). The sample estimator of \(\mathrm {ICC}\) is

    $$\begin{aligned} \widehat{\mathrm {ICC}}(w;{\mathsf {Y}}) = \widehat{\mathrm {ICC}}(w,h;{\mathsf {X}}) = \frac{1}{n} \sum _{i=1}^{n} (h_{i} - w^{T}X_{i})^{+}. \end{aligned}$$
    (6)
  • Absolute semideviation (ASD): In Ogryczak and Ruszczyński (2002) the absolute semideviation (ASD) is defined as

    $$\begin{aligned} \mathrm {ASD}(w) = {\mathbb {E}}\Big [ \big ( w^{T}X - w^{T}\mu \big )^{+} \Big ], \end{aligned}$$

    where \(\mu = {\mathbb {E}}[X]\) is the vector of mean returns. ASD measures the average one-sided excess with respect to the mean. The sample estimator of ASD is

    $$\begin{aligned} \widehat{\mathrm {ASD}}(w; {\mathsf {X}}) = \frac{1}{n} \sum _{i=1}^{n} \big (w^{T}X_{i} - w^{T}{\hat{\mu }} \big )^{+}, \end{aligned}$$
    (7)

    where \({\hat{\mu }} = \frac{1}{n} \sum _{i=1}^{n} X_{i}\).

  • Quantile deviation (QDEV): Let \(f:{\mathbb {R}}^{p} \times \Omega \rightarrow {\mathbb {R}}\) be a function such that \({\mathbb {E}}\big [|f(w, X)|\big ] < \infty\) for every \(w \in {\mathbb {R}}^{p}\). Let \(\alpha \in (0,1)\). In Ogryczak and Ruszczyński (2002), quantile deviation (QDEV) is defined as

    $$\begin{aligned} \mathrm {QDEV}_{\alpha }(w) = {\mathbb {E}}\Big [(1-\alpha ) \big ( \kappa _{\alpha } - f(w,X) \big )^{+} + \alpha \big ( f(w,X) - \kappa _{\alpha } \big )^{+} \Big ], \end{aligned}$$

    where \(\kappa _{\alpha }\) is the \(\alpha\)-quantile of the distribution of \(f(w,X)\). Similarly to the CVaR, \(\mathrm {QDEV}_{\alpha }(w)\) is shown in Ruszczyński and Shapiro (2006) to be equivalent to the optimal value of the following minimization problem:

    $$\begin{aligned} \mathrm {QDEV}_{\alpha }(w) = \min _{\eta \in {\mathbb {R}}} {\mathbb {E}}\Big [\epsilon _{1} \big ( \eta - w^{T}X \big )^{+} + \epsilon _{2} \big ( w^{T}X - \eta \big )^{+} \Big ], \end{aligned}$$
    (8)

    with \(f(w,X) = w^{T}X\), \(\alpha = \frac{\epsilon _{2}}{\epsilon _{1} + \epsilon _{2}}\), and \(\epsilon _{1}, \epsilon _{2} > 0\). The sample estimator of QDEV based on expression (8) is

    $$\begin{aligned} \widehat{\mathrm {QDEV}}_{\alpha }(w; {\mathsf {X}}) = \min _{\eta \in {\mathbb {R}}} \frac{1}{n} \sum _{i=1}^{n} \epsilon _{1}\big (\eta - w^{T}X_{i} \big )^{+} + \epsilon _{2}\big (w^{T}X_{i} - \eta \big )^{+}. \end{aligned}$$
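
The following numpy sketch (not taken from the paper; data and parameter values are illustrative) evaluates the three sample estimators for a given portfolio w. For QDEV it exploits the fact that the objective of the inner minimization is piecewise linear in \(\eta\), so the minimum is attained at one of the sample points \(w^{T}X_{i}\); this observation is made precise in Lemma 2 below.

```python
# Sample estimators of ICC (6), ASD (7) and QDEV for a given portfolio w.
# Illustrative sketch with simulated data.
import numpy as np

def icc_hat(w, X, h):
    """Eq. (6): average shortfall of w'X_i below the benchmark h_i."""
    return np.mean(np.maximum(h - X @ w, 0.0))

def asd_hat(w, X):
    """Eq. (7): average one-sided excess of w'X_i over the sample mean w'mu_hat."""
    z = X @ w
    return np.mean(np.maximum(z - z.mean(), 0.0))

def qdev_hat(w, X, eps1, eps2):
    """Sample QDEV: the inner problem is piecewise linear in eta, so it suffices
    to evaluate the objective at the sample points w'X_i (cf. Lemma 2)."""
    z = X @ w
    return min(np.mean(eps1 * np.maximum(eta - z, 0.0)
                       + eps2 * np.maximum(z - eta, 0.0)) for eta in z)

rng = np.random.default_rng(0)
X = rng.normal(0.06, 0.15, size=(100, 13))   # 100 sampled yearly returns, 13 indices
w = np.full(13, 1 / 13)                      # equal-weight portfolio
print(icc_hat(w, X, h=0.055), asd_hat(w, X), qdev_hat(w, X, eps1=1.0, eps2=9.0))
```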

Before presenting the expressions for \(\mathrm {SV}\) of the risk measures defined above, we introduce a useful definition and a lemma.

Definition 1

Let \(n \in {\mathbb {N}}\), \(n \ge 2\). We define \(\Omega _{n}\) as the matrix

$$\begin{aligned} \Omega _{n} := \frac{1}{n-1}\big [I_{n} - n^{-1}1_{n}1_{n}^{T}\big ], \end{aligned}$$

where \(I_{n}\) is the \(n \times n\) identity matrix and \(1_{n} = (1, 1, \ldots , 1) \in {\mathbb {R}}^{1 \times n}\).

Lemma 1

Let \({\mathsf {z}} = (z_{1}, \ldots , z_{n})\) be a sample from a given distribution \(F(\cdot )\). Then the sample variance is

$$\begin{aligned} \mathrm {SV}[ {\mathsf {z}} ] = {\mathsf {z}}^{T}\Omega _{n}{\mathsf {z}}. \end{aligned}$$
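
A quick numerical check of Definition 1 and Lemma 1 on an arbitrary simulated sample:

```python
# z' Omega_n z coincides with the unbiased sample variance of z (Lemma 1).
import numpy as np

n = 50
rng = np.random.default_rng(1)
z = rng.normal(size=n)

Omega_n = (np.eye(n) - np.ones((n, n)) / n) / (n - 1)   # Definition 1
print(z @ Omega_n @ z)                                   # quadratic form
print(np.var(z, ddof=1))                                 # unbiased sample variance: same value
```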

3.1 Regularized ICC constraint

For the ICC we have the following result:

Proposition 1

Let \(Y_{1}, \ldots , Y_{n} \overset{\mathrm {i.i.d}}{\sim }F\), where the cumulative distribution F has finite second moment, and let \(\widehat{\mathrm {ICC}}(w;{\mathsf {Y}})\) be defined as in (6). Then

$$\begin{aligned} \mathrm {Var} \left[ \widehat{\mathrm {ICC}}(w;{\mathsf {Y}}) \right] = \frac{1}{n} \mathrm {Var}\big [(h - w^{T}X)^{+}\big ]. \end{aligned}$$

Corollary 1

Under the assumptions of Proposition 1 we have

$$\begin{aligned} \mathrm {SV}\left[ \widehat{\mathrm {ICC}}(w;{\mathsf {Y}})\right] = \frac{1}{n}z^{T}\Omega _{n}z, \end{aligned}$$

where \(z = (z_{1}, \ldots , z_{n})\) and \(z_{i} = (h_{i} - w^{T}X_{i})^{+}\).

Proof

Direct application of Lemma 1 for \({\mathsf {z}} = (z_{1}, \ldots , z_{n})\). \(\square\)

3.2 Regularized ASD constraint

Similarly, for the ASD we have the following proposition.

Proposition 2

Let \(X_{1},\ldots ,X_{n} \overset{\mathrm {i.i.d}}{\sim }F\), where the cumulative distribution F has finite second moment, and let \(\widehat{\mathrm {ASD}}(w; {\mathsf {X}})\) be defined as in (7). Then

$$\begin{aligned} \mathrm {SV}\Big [\widehat{\mathrm {ASD}}(w; {\mathsf {X}}) \Big ] = \frac{1}{n} z^{T}\Omega _{n}z, \end{aligned}$$

where \(z = (z_{1}, \ldots , z_{n}), z_{i} = (w^{T}X_{i} - w^{T} {\hat{\mu }})^{+}\) and \({\hat{\mu }} = (1/n)\sum _{i=1}^{n} X_{i}\) is the vector of sample averages of \({\mathsf {X}}\).

3.3 Regularized QDEV constraint

Finally, for QDEV we have

Proposition 3

Let \(X_{1}, \ldots , X_{n} \overset{\mathrm {i.i.d}}{\sim }F\), where the cumulative distribution F has finite second moments. Let \(\widehat{\mathrm {QDEV}}_{\alpha }(w; {\mathsf {X}})\) be as above and let \(\eta ^{*} \in {\mathbb {R}}\) be such that

$$\begin{aligned} \widehat{\mathrm {QDEV}}_{\alpha }(w; {\mathsf {X}}) = \frac{1}{n} \sum _{i=1}^{n} \epsilon _{1}\big (\eta ^{*} - w^{T}X_{i} \big )^{+} + \epsilon _{2}\big (w^{T}X_{i} - \eta ^{*}\big )^{+}. \end{aligned}$$

Then

$$\begin{aligned} \mathrm {Var}\Big [\widehat{\mathrm {QDEV}}_{\alpha }(w; {\mathsf {X}}) \Big ] = \frac{1}{n} \Bigg \{\mathrm {Var}\Bigg [ \epsilon _{1}\big (\eta ^* - w^{T}X_{i} \big )^{+} + \epsilon _{2} \big (w^{T}X_{i} - \eta ^* \big )^{+} \Bigg ]\Bigg \}. \end{aligned}$$

Note that

$$\begin{aligned} 1-\alpha = 1- \frac{\epsilon _{2}}{\epsilon _{1} + \epsilon _{2}} = \frac{\epsilon _{1}}{\epsilon _{1} + \epsilon _{2}}. \end{aligned}$$
(9)

For simplicity, write \(Z_i = w^{T}X_i\). From expression (9), and using \(\epsilon _{1}, \epsilon _{2} > 0\) we have

$$\begin{aligned} \widehat{\mathrm {QDEV}}_{\alpha }(w; {\mathsf {X}}) = (\epsilon _{1} + \epsilon _{2}) \min _{\eta \in {\mathbb {R}}} \frac{1}{n} \sum _{i=1}^{n} (1-\alpha )\big (\eta - Z_{i} \big )^{+} + \alpha \big (Z_{i} - \eta \big )^{+}. \end{aligned}$$
(10)

The next lemma is essential to prove Proposition 3.

Lemma 2

Let \(p = \lceil n\alpha \rceil - n\alpha\). Following the notation above, if \(p > 0\), then \(\eta ^{*} = Z_{(\lceil n\alpha \rceil -1)}\) is the unique minimizer of problem (10). Otherwise, if \(p = 0\), then \(\eta ^{*} = Z_{(\lceil n\alpha \rceil )}\) is one of the minimizers of problem (10).
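
As a small numerical illustration of Lemma 2 (with simulated data; the code below selects the order statistic with zero-based indexing, which we take as the indexing convention of the lemma), the selected sample point attains the minimum of the piecewise-linear objective in (10):

```python
# The minimizer of (10) can be taken to be an order statistic of Z_i = w'X_i (Lemma 2).
# Zero-based indexing of the order statistics is assumed here.
import numpy as np

rng = np.random.default_rng(2)
n, alpha = 41, 0.9
Z = rng.normal(size=n)                               # Z_i = w'X_i

def obj(eta):                                        # inner objective of (10)
    return np.mean((1 - alpha) * np.maximum(eta - Z, 0.0)
                   + alpha * np.maximum(Z - eta, 0.0))

eta_star = np.sort(Z)[int(np.ceil(n * alpha)) - 1]   # candidate order statistic
brute = min(obj(eta) for eta in Z)                   # exact minimum over all breakpoints
print(np.isclose(obj(eta_star), brute))              # True
```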

Corollary 2

Under the assumptions of Proposition 3 we have

$$\begin{aligned} \mathrm {SV}\left[ \widehat{\mathrm {QDEV}}_{\alpha }(w;{\mathsf {X}})\right] = \frac{1}{n}z^{T}\Omega _{n}z, \end{aligned}$$

where \(z = (z_{1}, \ldots , z_{n})\) and \(z_{i} = \epsilon _1(\eta (p)-w^TX_i)^+ + \epsilon _2(w^TX_i - \eta (p))^+\), with \(\eta (p)\) denoting the minimizer \(\eta ^{*}\) of problem (10) identified in Lemma 2.

Proof

Direct application of Lemma 1 for \({\mathsf {z}} = (z_{1}, \ldots , z_{n})\). \(\square\)

Note that in all three cases the resulting PBR constraints are quadratic, which makes the corresponding optimization problems amenable to off-the-shelf convex commercial solvers. In the next section we use the expressions presented in Propositions 1, 2 and 3 to test the performance of PBR in practice.
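
For illustration, the sketch below assembles problem (5) for the ICC case with gurobipy. It is not the authors’ implementation: the data, the benchmark h and the values of k, \(U_1\) and \(U_2\) are placeholders (in the experiments, \(U_1\) and \(U_2\) come from cross-validation), and the positive parts are modeled exactly through Gurobi general MAX constraints, which is only one possible modeling choice and turns the sketch into a mixed-integer quadratically constrained program.

```python
# Sketch of problem (5) with the ICC risk measure and the PBR constraint of Corollary 1.
# Illustrative data and bounds; not the code used for the experiments in Sect. 4.
import numpy as np
import gurobipy as gp
from gurobipy import GRB

rng = np.random.default_rng(0)
n, p = 100, 13                              # scenarios and assets (13 indices, cf. Table 1)
X = rng.normal(0.06, 0.15, size=(n, p))     # sampled yearly returns (placeholder data)
h = np.full(n, 0.055)                       # benchmark h(omega) = 5.5%, as in Sect. 4.1
k, U1, U2 = 0.02, 1e-4, 0.02                # risk tolerance and PBR bounds (illustrative)

mu_hat = X.mean(axis=0)
Sigma_n = np.cov(X, rowvar=False)           # unbiased sample covariance

m = gp.Model("pbr_icc")
w = m.addVars(p, lb=0.0, name="w")              # long-only portfolio weights
d = m.addVars(n, lb=-GRB.INFINITY, name="d")    # d_i = h_i - w'X_i
z = m.addVars(n, lb=0.0, name="z")              # z_i = (h_i - w'X_i)^+
zbar = m.addVar(lb=0.0, name="zbar")            # mean of the z_i

m.addConstr(gp.quicksum(w[j] for j in range(p)) == 1.0)
for i in range(n):
    m.addConstr(d[i] == h[i] - gp.quicksum(X[i, j] * w[j] for j in range(p)))
    m.addGenConstrMax(z[i], [d[i]], constant=0.0)            # exact positive part
m.addConstr(n * zbar == gp.quicksum(z[i] for i in range(n)))

# Risk constraint: ICC-hat(w) <= k, cf. (6)
m.addConstr(gp.quicksum(z[i] for i in range(n)) <= k * n)
# PBR constraint (3): SV[ICC-hat] = (1/n) z' Omega_n z <= U1 (Corollary 1)
m.addConstr(gp.quicksum((z[i] - zbar) * (z[i] - zbar) for i in range(n))
            <= U1 * n * (n - 1))
# PBR constraint (4): w' Sigma_n w <= U2
m.addConstr(gp.quicksum(Sigma_n[i, j] * w[i] * w[j]
                        for i in range(p) for j in range(p)) <= U2)

m.setObjective(gp.quicksum(mu_hat[j] * w[j] for j in range(p)), GRB.MAXIMIZE)
m.optimize()
if m.Status == GRB.OPTIMAL:
    print(np.round([w[j].X for j in range(p)], 4))
```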

4 Numerical results

Due to the recent boom in passive investments, in addition to mounting evidence that passive (index-based) approaches tend to outperform active investment strategies, we have designed our experiment using indices instead of individual assets. An additional advantage of choosing an investment strategy based on indices is that it offers a high degree of diversification while keeping the size of the optimization problem more manageable. In our study, following Walden (2015), we have selected thirteen indices that offer a wide exposure to the most popular asset classes, namely, stocks, bonds, real estate and commodities. The indices are described in Table 1. The period considered goes from January 2000 until December 2012; thus, it includes the subprime crisis, a time period of significant market turmoil, which we judge essential to assess the virtues of any investment strategy.

Table 1 List of indices

4.1 Design of experiments

We use monthly returns from a 3-year period to cast the optimization problem directly, with no parametric estimation, and then test the performance on year 4. In order to perform extensive computations for year 4, we need a parametric assumption on returns. To this end, we assume that the vector of yearly returns r follows a multivariate normal distribution \(N(\mu ,\Sigma )\), and estimate the corresponding parameters using past data from the 3-year window. Other parametric models could have been selected, as long as samples can easily be drawn from them. For each year starting in 2003 and for each risk measure, we sample 100 yearly returns and evaluate the optimal solutions obtained by the SAA and PBR formulations, using the same sample in both cases. Such a parametric approach allows for a more exhaustive and robust assessment of each method.

Thus, we initially start with [2000, 2001, 2002] and test our results with actual returns in 2003. We end with the window [2009, 2010, 2011], testing on 2012, for a total of 10 comparison years in a rolling horizon fashion. In each case (each 3-year window) we solve the data-driven optimization problem (no parametric model is needed in this step) using the four metrics (the three presented in Sect. 3, plus the CVaR, based on the results derived in Ban et al. (2016)), for both the SAA and the PBR-based approach.
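
One plausible reading of this experimental loop is sketched below. The helper functions are hypothetical placeholders (the paper does not specify how the yearly parametric model is fitted to the data in each window), so the skeleton only illustrates the structure of the rolling scheme.

```python
# Hypothetical skeleton of the rolling-horizon experiment in Sect. 4.1.
# fit_yearly_normal and solve_portfolio are placeholders, not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

def fit_yearly_normal(window):
    """Placeholder: fit N(mu, Sigma) for yearly returns from data in the window."""
    return np.full(13, 0.06), 0.15 ** 2 * np.eye(13)

def solve_portfolio(X, pbr=False):
    """Placeholder for solving problem (5), with or without the PBR constraints."""
    return np.full(13, 1 / 13)

for test_year in range(2003, 2013):              # [2000-2002] -> 2003, ..., [2009-2011] -> 2012
    window = (test_year - 3, test_year - 1)      # 3-year estimation window
    mu, Sigma = fit_yearly_normal(window)
    X = rng.multivariate_normal(mu, Sigma, size=100)   # 100 sampled yearly returns
    w_saa = solve_portfolio(X, pbr=False)
    w_pbr = solve_portfolio(X, pbr=True)
    # ... record diversification and realized returns of both portfolios for test_year
```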

For the CVaR and QDEV we use \(\alpha =0.9\), which means we want to control average tail losses (beyond the \(1-0.9=0.1\) quantile) given that a loss has occurred. For the ICC we use \(h(\omega ) = 5.5\%\), a benchmark for yearly returns. Finally, the values of k for the experiments are the smallest numbers such that problem (5) is feasible for each risk measure, which explains why they vary from experiment to experiment.

The code was written in Python 2.7.13 and the problems were solved using Gurobi (version 7.0.2) on a MacBook Pro with a 2 GHz Intel Core i5 and 8 GB of RAM. In Table 2 we report, for SAA and PBR, the time (in seconds) to simulate and solve the 100 problems and to compute the statistics of interest for the 10 years under study. The difference in computational times between the two methods is remarkable. This is to be expected, considering that the cross-validation procedure that selects \(U_1\) and \(U_2\) involves solving several auxiliary optimization problems. Since both methods are implemented as passive investment strategies, the computational times do not prevent the implementation of PBR in practice.

Table 2 Computational times for each experiment

4.2 Diversification

From a practical viewpoint, a fundamental aspect of the resulting portfolio is its degree of diversification. Following Woerheide and Persson (1992), we use a normalized version of the complement of Herfindahl’s diversification index (DI) to measure this property:

$$\begin{aligned} {\text {DI}} = \frac{1 - \sum _{i=1}^{n} w_{i}^{2}}{1 - 1/n}, \end{aligned}$$

where \(w_{i}\) is the weight of index i and \(n>1\) is the number of assets available for investment. According to this measure, a concentrated portfolio (a single position among the n assets) has a DI of zero, while a portfolio of equally weighted assets (the so-called 1/n portfolio) has a DI of one, which corresponds to maximum diversification.
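
In code, the index is a one-liner, and the two extreme cases mentioned above are easy to verify:

```python
# Diversification index DI as defined above.
import numpy as np

def diversification_index(w):
    n = len(w)
    return (1.0 - np.sum(np.asarray(w, dtype=float) ** 2)) / (1.0 - 1.0 / n)

print(diversification_index([1, 0, 0, 0]))   # 0.0: fully concentrated portfolio
print(diversification_index([0.25] * 4))     # 1.0: the equally weighted 1/n portfolio
```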

Figure 1 shows the average DI values, for the SAA and PBR portfolios, for each year between 2003 and 2012. The results indicate that the PBR portfolios are more diversified, and often the difference is significant. Moreover, in several cases the SAA portfolios have a \({\text {DI}} = 0\) for all samples, whereas PBR portfolios always have a positive average DI (in all years and for all four risk measures). Figure 1b (0.9-QDEV) reveals an extreme case—in 7 out of 10 years SAA portfolios have a \({\text {DI}}=0\). The reason is that in cases where regularization is not present, there was one asset with a high return, and being fully invested in this asset did not violate the risk constraint. PBR constraints are designed to avoid that: solutions that have high returns are often infeasible because their variability is too high.

It is also interesting to note that diversification is persistent over time: the average \({\text {DI}}\) of the PBR portfolios is not only higher than that of the SAA portfolios, but also more stable over the 10-year period under study. Table 3 shows the area below the SAA and PBR diversification trajectories displayed in Fig. 1. Total diversification (the 1/n portfolio) would have an area of 10. Thus, the entries in Table 3 can be thought of as the number of years in which the strategy corresponds to complete diversification. We observe that the PBR values are roughly between 2 and 3 times those of the SAA portfolios, indicating, again, a much higher degree of diversification.

Fig. 1

Average \({\text {DI}}\) values for SAA and PBR simulated portfolios, using different risk measures and values of k. The red line corresponds to SAA and the blue line to PBR

Table 3 Area below diversification trajectories

4.3 Risk and returns

We now turn our attention to the performance of both methods with respect to returns and variability. The comparison will take place in three time windows: pre-crisis (2003–2007), crisis (2008) and post-crisis period (2011–2012). Results in the recovery years (2009–2010) were similar between SAA and PBR and are available from the authors upon request.

4.3.1 Pre-crisis period

Table 4 shows summary statistics for realized returns during the pre-crisis years (2003–2007). All in all, SAA portfolios outperform the PBR portfolios, with differences as high as 8% in a given year. This is not surprising, since the stability provided by the PBR algorithm comes at the expense of ruling out high-variability solutions, which, in turn, are the ones that yield the highest out-of-sample returns. It is also noteworthy that the differences in performance between the SAA and PBR portfolios are fairly consistent across all risk measures, which validates the robustness of both methods.

Table 4 Average realized returns of optimal portfolios determined by SAA and PBR, using four different risk measures, during the pre-crisis years (2003–2007)

4.3.2 The crisis of 2008

During the subprime crisis, the PBR portfolios have smaller losses, not only in average terms, but also when the maximum and minimum returns are considered, as shown in Table 5. During the crisis, PBR portfolios outperform SAA portfolios in every aspect and for all four risk measures. It should be noted that realized returns are very poor with both approaches because the parameter estimation was based on data that could not anticipate the crisis. It is certainly beyond the scope of this study to identify or propose indicators that could predict crises, but it suffices to say that not using regularization techniques can magnify the losses in those scenarios.

Table 5 Average realized returns of optimal portfolios determined by SAA and PBR, and using four different risk measures in 2008. The numbers in parentheses correspond to the minimum and maximum realized returns, respectively

4.3.3 Post-crisis period

Let us now compare the two methods in 2011 and 2012. As shown in Tables 6 and 7, the difference in performance is extraordinary, with the SAA portfolios exhibiting significantly more dispersion. Moreover, the PBR portfolios offer more protection against extreme losses, without suffering a noticeable reduction in terms of either average or maximum returns.

Table 6 Average realized returns of optimal portfolios determined by SAA and PBR, and using four different risk measures in 2011
Table 7 Average realized returns of optimal portfolios determined by SAA and PBR, and using four different risk measures in 2012

4.4 Discussion

For an investor dealing with an asset allocation problem it is by no means clear, unless additional information is provided, which risk measure will best fit his/her interests. The results suggest that the ASD produces more diversified portfolios, albeit with lower but more stable returns. On the other hand, the ICC results in the least diversified portfolios (lowest \({\text {DI}}\) values), combined with the highest realized returns, which, not surprisingly, come with the highest variability. The CVaR portfolios are somewhere in between. Their returns are higher than those obtained by the ASD portfolios, but not as high as those of the ICC portfolios. Diversification for the CVaR is on average higher, but it is the only risk measure that generates completely concentrated portfolios (\({\text {DI}}=0\)) for some, but never all, of the sampled returns in some years.

Lastly, QDEV exhibits a much more complex behavior, and its effects depend greatly on the value of \(\alpha\). Our experiments suggest that as this parameter approaches 0 or 1, the portfolios tend to be more concentrated and riskier, but also more rewarding. Interestingly, when \(\alpha\) is closer to 0.5, the solutions behave similarly to those produced by the ASD.

We close this section with a comment regarding the effect of the right-hand side constants, \(U_1\) and \(U_2\), on the resulting portfolio. The constraint controlled by \(U_2\) [variance of returns, constraint (4)] is the one responsible for diversification, while \(U_1\) [variance of the risk estimator, constraint (3)] induces minor changes in the portfolio allocations. The former is binding more often, and low values of \(U_2\) are a common cause of infeasibilities. It is therefore possible to infer that controlling the variance of returns excludes unreliable solutions, while controlling the variance of the risk measure improves the out-of-sample performance of the portfolio.

An important distinction must be made between the variability constraints defined by \(U_1\) and \(U_2\), and the risk constraint defined by k. If infeasibilities are caused by a value of k that is too small, then no combination of the indices will yield a portfolio with an acceptable level of risk; the only course of action is simply to increase the value of k until a solution is found. Infeasibilities induced by the \(U_2\) constraint are completely different. First, since this parameter is set via a machine learning procedure, as described in Section 5.4 of Ban et al. (2016), we have no room to maneuver. Second, and more importantly, the lack of feasible solutions should be taken as a warning: it means that the observed returns being used exhibit high variability, which should serve as an alert to reframe the optimization problem using more data.

5 Conclusions

Since the publication of Markowitz’s seminal paper, the trade-off between risk and return within an optimization framework has attracted the attention of academics and investors alike. Nevertheless, the practical implementation of solutions obtained from such models has been marred by estimation errors and has often resulted in poorly diversified portfolios. Several tools have been developed to overcome these shortcomings, and in this work we studied a performance-based regularization (PBR) scheme, a novel regularization tool that incorporates machine learning to find the parameters that produce better out-of-sample performance. Building on Ban et al. (2016), we developed explicit convex expressions to test the PBR formulation in combination with three risk measures: integrated chance constraints, absolute semi-deviation and quantile deviation.

Our numerical results show that PBR is capable of delivering portfolios that are more diversified than those of SAA, and also more stable over time. Additionally, the experiments show that PBR can effectively protect investors from portfolios with poor out-of-sample performance. In particular, during times of crisis, PBR’s performance was superior in terms of maximum, average and minimum observed returns for the simulated portfolios. In more stable years, SAA outperforms PBR, since the elimination of solutions with high variability, precisely the ones that perform better in those years, can hurt returns when market conditions are favorable. Finally, the right-hand sides of the regularized constraints are defined via cross-validation, freeing the investor from the problematic task of having to specify those parameters.

Our findings show that pure SAA techniques are not suitable for practical portfolio selection problems. It is critical to impose some regularization on the problem, and our work shows that PBR is a viable and tractable choice, especially for mid- to long-term investments. Future work should focus on comparing PBR-based methods with robust or distributionally robust optimization, in combination with machine learning techniques. Another avenue of research is to explore regularization schemes in the context of stochastic dynamic (multistage) portfolio problems. It would be interesting to compare different multistage frameworks that have been studied lately, such as Expected Conditional Risk measures (Homem-de-Mello and Pagnoncelli 2016), nested risk measures (Kozmík and Morton 2015) and Expected Conditional Stochastic Dominance (Escudero et al. 2018), and to understand the effect of including PBR in each case.