1 Introduction

Interval-based stochastic dominance (ISD) was proposed by Liu et al. (2021) as a viable approach to extend canonical integer-order stochastic dominance (SD) principles to a theoretically continuous dominance ordering. In that article we established, over a restricted support, an equivalence between second-order stochastic dominance (SSD) and a tail risk measure called interval-based conditional value-at-risk (ICVaR), which includes the conditional value-at-risk (CVaR) studied by Rockafellar and Uryasev (2002) as a special case. This article builds upon that early contribution to propose a multistage mean-risk portfolio model that uses the ICVaR as its risk measure and whose properties are analysed in close relationship with the underlying stochastic dominance and more general ISD conditions. To this purpose, several methodological steps are required. They lead to the following contributions, which we list before positioning this article within the state of the art:

  • We extend the analysis on the relationship between the ICVaR risk measure and ISD principles, to be understood as a generalization of classical integer-SD partial orders, from a one-period to a multi-period framework.

  • Exploiting the dependence of the ICVaR on a reference point \(\beta \), as further explained in Sect. 2, we formulate a class of dynamic mean-risk portfolio optimization problems that generalize classical mean-CVaR formulations.

  • Based on a canonical multistage scenario-based formulation we show under which conditions the proposed mean-ICVaR problem solution is sufficient to enforce ISD over several stages.

  • Genuine multistage SD formulations are well known to be computationally very expensive and hardly solvable due to the curse of dimensionality. We show that the mean-ICVaR model helps overcome that computational constraint.

The key motivation of this article is thus to propose a multi-period mean-risk portfolio optimization model which generalizes the classical mean-CVaR model while considering its stochastic dominance implications. From a financial perspective, the problem formulation is grounded in the concept of relative, rather than absolute, portfolio optimization in which, given an exogenous benchmark, a portfolio manager seeks a strategy outperforming that benchmark. The adoption of stochastic dominance principles in this context is not new. The interplay between mean-risk portfolio models and stochastic dominance principles was first investigated by Ogryczak and Ruszczynski (1999), with a focus on the class of semideviation risk measures, and in Ogryczak and Ruszczyński (2002), with a dual characterization of stochastic dominance conditions. Levy (2006), in his volume on SD and investment decision making, provides solid foundations for the adoption of SD as a decision paradigm in portfolio theory. Dentcheva and Ruszczyński (2010) introduce the concept of robust SD in relationship with risk-averse optimization, which finds a natural application in a financial context. Along this stream of research, Longarela (2016) proposes a characterization of portfolio efficiency based on SSD in a one-period model, Kallio and Hardoroudi (2018) develop a detailed computational analysis of SSD-constrained portfolio optimization models, and Post and Kopa (2017) analyse a portfolio selection problem employing third-order stochastic dominance (TSD) criteria. More recently, still in a one-period framework, Malavasi et al. (2021) compare optimal mean-variance portfolio efficiency results to SSD-efficiency. The computational results presented along this stream of applied research are primarily based on one-period, static portfolio models. A specific class of mean-risk models is based on the popular Conditional Value-at-Risk (Rockafellar & Uryasev, 2002).

The above class of mean-risk models, whose extension to a multistage model was studied by Pflug and Ruszczynski (2005) and Pflug and Pichler (2014), includes the one presented here next. SD criteria have been generally adopted to model the risk preferences of a decision-maker in multistage models focusing on a variety of application domains (Dentcheva and Ruszczyński 2008; Kopa et al. 2018; Escudero et al. 2018).

An SSD-constrained dynamic optimization problem was proposed by Dentcheva and Ruszczyński (2008), where however the problem formulation relied on a univariate SD order and the value function optimization was based on a recursive discounting process. A similar approach was more recently applied to a financial context by Mei et al. (2022). An early application of SSD criteria in a multistage asset and liability management (ALM) problem is due to Yang et al. (2010), with a focus on risk control at specific stages. More recently, still enforcing SSD constraints at individual stages under an independence assumption, Consigli et al. (2020) solve an individual ALM problem over a long-term horizon. A similar approach was previously adopted by Kopa et al. (2018) to solve an optimal pension allocation problem based on a multi-criteria optimization problem formulation with SSD constraints at an intermediate and at the final stage. In general, outside the canonical Markovian assumptions adopted in stochastic dynamic programming, the formulation of multistage stochastic problems may rely, as done in this work, not on discounting future payoffs but on an objective function based on the terminal wealth (Moriggia et al., 2019), a final cost function (Singh & Dharmaraja, 2020), or an expected shortfall (Haskell & Jain, 2013). The extension of stochastic dominance principles to a multi-period dynamic framework requires the definition of dynamic risk preferences, typically through utility functions. A recent review on dynamic risk measures in financial optimization can be found in Chen et al. (2017), clarifying the distinction between terminal, additive and recursive risk measures, consistent with, respectively, terminal, stage-dependent, or time-consistent nested (Dentcheva et al., 2022) SD-constrained formulations. For example, the dynamic CVaR is consistent with the dynamic SSD criterion (Pflug & Ruszczynski, 2005; Pflug & Pichler, 2014; Chen et al., 2017), and the dynamic SD risk-averse measure is defined by exploiting the so-called expected conditional stochastic dominance (Escudero et al., 2018). This paper falls in this rather rich research line with a contribution specifically associated with the introduction of a dynamic ICVaR measure, whose definition is based on the dynamic ISD criterion.

The extension of SD principles to ISD, from a financial perspective, was already motivated in Liu et al. (2021), still in a one-period setting, by the evidence of hardly feasible first-order stochastic dominance (FSD) portfolio problems and the possibility, through a partition of the portfolio returns domain, to significantly improve otherwise SSD-efficient portfolios. The introduction of partial orders other than FSD, SSD and TSD also has its own rationale in decision theory, following the early works of Fishburn (1980) on continua of stochastic dominance orderings; more recently, Baucells and Heukamp (2006) propose the concept of prospect stochastic dominance, consistently with prospect theory (Kahneman & Riepe, 1998). More relevant in our context are the works by Müller et al. (2017) on first to second-order SD and by Tsetlin et al. (2015) on generalized almost stochastic dominance (GASD) as possible extensions of integer SD criteria. ISD principles were established in our early work in relationship with those contributions, but motivated primarily in a financial context. In this article we mainly focus on the ISD implications of the proposed mean-risk portfolio model, without explicitly introducing any SD constraints in the formulation of the optimization problem, and within a multi-period rather than a static decision problem. We do so by introducing a set of lower bounds that help characterize the dependence over time of the ICVaR measure on first or second-order portfolio and benchmark distributions. We provide a comprehensive set of computational results to support and validate the above claims.

The article proceeds as follows. In Sect. 2 the mathematical and probabilistic properties of the ICVaR are established on their own and in relationship with ISD principles. In Sect. 3 we focus on this relationship, which is extended through a set of lower bounds to span both first and second-order stochastic dominance-based problems. In Sect. 4 we formulate a multistage asset allocation problem based on a mean-ICVaR trade-off whose solution is expected to enforce SD conditions without explicitly introducing multistage stochastic dominance constraints. In the final Sect. 5 we analyse an extended set of results from the US market to validate the proposed model in-sample and to assess its effectiveness in terms of both stochastic dominance results relative to a benchmark and out-of-sample performance.

2 Interval-based conditional value-at-risk

We summarize a few results from Liu et al. (2021) that help characterize the risk measure.

Consider two random variables, say W and Y: then W interval stochastically dominates (ISD) Y to the kth-order if, for a given \(\beta \in \mathbb {R}\) the following inequalities hold:

$$\begin{aligned} F_k(W,\eta ) \le F_k(Y,\eta ),&\quad \forall \eta \le \beta , \end{aligned}$$
(1)
$$\begin{aligned} F_{k+1}(W,\eta ) \le F_{k+1}(Y,\eta ),&\quad \forall \beta \le \eta , \end{aligned}$$
(2)

where \({F_{k}}(W,\eta )=\frac{\mathbb {E}[(\eta -W)_{+}^{k-1}]}{(k-1)!}\) if \(k>1\) and \({F_{1}}(W,\eta )=\mathbb {P}[W \le \eta ]\). Based on the reference point \(\beta \), we denote this stochastic dominance order by \(W\succeq _{(k,\beta )} Y\). Following (1) and (2), below the reference point \(\beta \) we adopt the stronger kth-order SD to describe the dominance relation, and above \(\beta \) the weaker (\(k+1\))th-order SD. For notational simplicity, we denote the ISD dominance relationship of order k with the benchmark \(\beta \) as ISD-\(k.q_{\beta }\), where \(q_{\beta }\) is the survival value of \(\beta \) with respect to the benchmark variable Y, i.e., \(q_{\beta }=\mathbb {P}_Y(y\ge \beta )\), where y is a realization of Y. With this notation, we approximate a continuous ordering scheme between the traditional integer SD orders: FSD corresponds to ISD-1.0, SSD to ISD-2.0, TSD to ISD-3.0, and ISD-\(k.q_{\beta }\) for different \(q_{\beta }\) values spans the interval between the integer orders. The ISD generalization of canonical SD theory is well established across a range of optimal decision problems (Dentcheva & Ruszczynski, 2003; Müller et al., 2017; Tsetlin et al., 2015; Levy, 2006) and it was shown in Liu et al. (2021) to be particularly meaningful when tackling an optimal portfolio selection problem in which a benchmark investment policy is considered. The ability to discriminate between, say, first and second-order stochastic dominance over a partition of the portfolio return distribution allows a great deal of flexibility in terms of risk control and performance enhancement.

We define the ICVaR and then recall, in Proposition 1, the key result linking this risk measure to second-order stochastic dominance. Over one period, the ICVaR of a random variable W with tolerance \(\alpha \) and reference point \(\beta \) is defined as:

$$\begin{aligned} \rho _{\alpha ,\beta }( W) = \sup _{\eta \le \beta } \{ \eta -\frac{1}{1-\alpha } \mathbb {E}[\eta -W]_+ \},\;\alpha \in [0,1). \end{aligned}$$
(3)
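As an illustration only, Eq. (3) can be estimated on a finite set of equally likely scenarios. The sketch below is a minimal Python example under that assumption; the function name `icvar` and the sample values are hypothetical and not part of the original formulation. It uses the fact that the objective in Eq. (3) is concave and piecewise linear in \(\eta \).

```python
def icvar(w, alpha, beta):
    """Sample ICVaR: sup over eta <= beta of eta - E[(eta - W)_+] / (1 - alpha).

    Assumes equally weighted scenarios w and alpha in [0, 1).
    """
    n = len(w)

    def objective(eta):
        shortfall = sum(max(eta - x, 0.0) for x in w) / n
        return eta - shortfall / (1.0 - alpha)

    # The objective is concave and piecewise linear in eta, with kinks at the
    # sample points, so the supremum over (-inf, beta] is attained either at
    # beta or at a sample point not exceeding beta.
    candidates = [x for x in w if x <= beta] + [beta]
    return max(objective(eta) for eta in candidates)


returns = [-0.3, -0.1, 0.0, 0.1, 0.2]  # illustrative sample, not real data
high_reference = icvar(returns, alpha=0.6, beta=0.05)
low_reference = icvar(returns, alpha=0.6, beta=-0.2)
print(high_reference, low_reference)
```

On this toy sample the two calls return approximately -0.2 and -0.25: lowering the reference point restricts the supremum and can only reduce the measure.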

We see from Eq. (3) that the ICVaR can be understood as a generalization of Conditional Value-at-Risk (CVaR). The main difference between the two is that the supremum in the ICVaR is taken over \({(-\infty ,\beta ]}\) rather than over \(\mathbb {R}\). When considering \(k=2\) in Eq. (1), on the left-hand side of the reference point, the following result links the ICVaR concept to second-order interval stochastic dominance in the domain \(\eta \le \beta \), as proven in that early article.

Proposition 1

(Liu et al. (2021)) The constraint

$$\begin{aligned} F_2(W,\eta )\le F_2(Y,\eta ),\quad \forall \;\eta \le \beta , \end{aligned}$$

is equivalent to

$$\begin{aligned} \rho _{\alpha ,\beta }( W)\ge \rho _{\alpha ,\beta }( Y ),\quad \forall \;\alpha \in [0,1). \end{aligned}$$

The ICVaR measure is thus in strict relationship with the concept of interval stochastic dominance and indeed the very denomination of this risk measure intends to emphasize that connection: an evidence inspiring this contribution. The ICVaR preserves the risk aversion induced by second-order ISD on the left tail \((-\infty ,\beta ]\).

Proposition 2

(Liu et al. (2021)) For \(\beta \ge \textrm{VaR}_\alpha (W)\),

$$\begin{aligned} \rho _{\alpha ,\beta }( W)= \textrm{CVaR}_\alpha (W); \end{aligned}$$

while for \(\beta \le \textrm{VaR}_\alpha (W)\),

$$\begin{aligned} \rho _{\alpha ,\beta }( W)= \beta -\frac{1}{1-\alpha } \mathbb {E}[\beta -W]_+. \end{aligned}$$

It is worth noting that the CVaR can be defined on the loss or the return functions. Here, we use the return-based definition following Dentcheva and Ruszczyński (2006), written with the same scaling as Eq. (3), \(\textrm{CVaR}_\alpha (W)=\sup _{\eta \in \mathbb {R}} \{ \eta -\frac{1}{1-\alpha } \mathbb {E}[\eta -W]_+ \}\), to remain consistent with the introduced notation.
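The two regimes of Proposition 2 can be checked numerically on a small sample. The following Python sketch assumes equally likely scenarios and the \(1/(1-\alpha )\) scaling of Eq. (3) for both measures; the helper names `tail_objective`, `cvar` and `icvar` and the data are hypothetical.

```python
def tail_objective(w, alpha, eta):
    """Value of eta - E[(eta - W)_+] / (1 - alpha) on an equally weighted sample."""
    return eta - sum(max(eta - x, 0.0) for x in w) / (len(w) * (1.0 - alpha))


def cvar(w, alpha):
    # Unconstrained supremum: attained at one of the sample points.
    return max(tail_objective(w, alpha, eta) for eta in w)


def icvar(w, alpha, beta):
    # Supremum restricted to eta <= beta: attained at beta or at a sample point below it.
    candidates = [x for x in w if x <= beta] + [beta]
    return max(tail_objective(w, alpha, eta) for eta in candidates)


w = [-0.3, -0.1, 0.0, 0.1, 0.2]  # illustrative sample, not real data
alpha = 0.7                      # on this sample the maximizer (VaR) is -0.1

above = icvar(w, alpha, beta=0.05)   # beta >= VaR: coincides with the CVaR
below = icvar(w, alpha, beta=-0.2)   # beta <= VaR: supremum attained at beta
print(above, cvar(w, alpha), below, tail_objective(w, alpha, -0.2))
```

With the reference point above the quantile the restriction is not binding, while below the quantile the supremum sits at \(\beta \) itself, in line with the two cases of the proposition.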

Proposition 2 implies that \(\rho _{\alpha ,\beta }( W)\) is always smaller than or equal to \(\textrm{CVaR}_\alpha (W)\) and that the ICVaR includes the CVaR as a special case. When \(\textrm{VaR}_\alpha (W)\) is smaller than or equal to the preset benchmark \(\beta \), the investor would just use the CVaR to measure the risk. When \(\textrm{VaR}_\alpha (W)\) is greater than \(\beta \), the investor would just focus on the loss beyond the benchmark. Essentially, through \(\beta \) we may specify different shortfall distributions and the ICVaR applies to those losses larger than both the benchmark \(\beta \) and the quantile estimate \(\textrm{VaR}_\alpha (W)\). In this article, slightly abusing notation, all tail risk concepts will actually refer to returns- rather than values-at-risk. Figure 1 shows two cases: \( \rho _{\alpha ,\beta _1}( W)= \beta _1-\frac{1}{1-\alpha } \mathbb {E}[\beta _1-W]_+\), where \(\beta _1\) is smaller than \(\textrm{VaR}_\alpha (W)\); and \(\rho _{\alpha ,\beta _2}(W)=\textrm{CVaR}_\alpha (W)\), where \(\beta _2\) is larger than \(\textrm{VaR}_\alpha (W)\).

Fig. 1 ICVaR and CVaR

The following proposition establishes several axiomatic properties of the risk measure.

Proposition 3

For any \(\alpha \in [0,1)\) and \(\beta \in \mathbb {R}\), \( \rho _{\alpha ,\beta }( W)\) is

  • monotone increasing: \(\rho _{\alpha ,\beta }( W)\le \rho _{\alpha ,\beta }(Y)\) for any two random variables \(W\le Y\) a.s.;

  • concave: \(\rho _{\alpha ,\beta }(\lambda W + (1-\lambda ) Y) \ge \lambda \rho _{\alpha ,\beta }(W) + (1-\lambda ) \rho _{\alpha ,\beta }(Y)\) for any random variables W, Y and constant \(\lambda \in [0,1]\);

  • positive homogeneous: \( \rho _{\alpha ,\beta }( k W ) = k\rho _{\alpha ,\beta /k}( W )\) for any \(k\in \mathbb {R}_{++}\);

  • cash additive: \( \rho _{\alpha ,\beta }( W + c) = \rho _{\alpha ,\beta -c}( W ) + c\) for any \(c\in \mathbb {R}\).

The proof of these properties can be found in Appendix A.
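The four properties of Proposition 3 can be verified numerically on scenario data. The Python sketch below is only a sanity check under the assumption of equally likely, scenario-wise paired samples; `icvar` is a hypothetical helper and the numbers are illustrative.

```python
def icvar(w, alpha, beta):
    """Sample ICVaR of Eq. (3) on equally weighted scenarios."""
    def objective(eta):
        return eta - sum(max(eta - x, 0.0) for x in w) / (len(w) * (1.0 - alpha))
    return max(objective(eta) for eta in [x for x in w if x <= beta] + [beta])


w = [-0.3, -0.1, 0.0, 0.1, 0.2]    # illustrative scenarios
y = [-0.2, 0.0, 0.05, 0.1, 0.3]    # y >= w in every scenario
alpha, beta, k, c, lam = 0.7, -0.05, 2.0, 0.1, 0.5

# monotonicity: W <= Y scenario by scenario implies rho(W) <= rho(Y)
monotone = icvar(w, alpha, beta) <= icvar(y, alpha, beta)

# concavity along the scenario-wise mixture lam * W + (1 - lam) * Y
mix = [lam * a + (1.0 - lam) * b for a, b in zip(w, y)]
concave = icvar(mix, alpha, beta) >= (
    lam * icvar(w, alpha, beta) + (1.0 - lam) * icvar(y, alpha, beta) - 1e-12
)

# positive homogeneity: rho_{alpha,beta}(k W) = k * rho_{alpha,beta/k}(W)
homogeneity_gap = icvar([k * x for x in w], alpha, beta) - k * icvar(w, alpha, beta / k)

# cash additivity: rho_{alpha,beta}(W + c) = rho_{alpha,beta-c}(W) + c
cash_gap = icvar([x + c for x in w], alpha, beta) - (icvar(w, alpha, beta - c) + c)

print(monotone, concave, homogeneity_gap, cash_gap)
```

The last two quantities should vanish up to floating-point error, which makes explicit that the reference point has to be rescaled or shifted together with the position.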

Compared with the classical CVaR measure, the ICVaR maintains the same monotonicity and concavity properties, thus ensuring that the risk measure orders portfolios consistently, based on their risk-aversion attitude. We also observe that when \(\beta =+\infty \), so that the supremum in Eq. (3) is taken over \(\mathbb {R}\), \( \rho _{\alpha ,\beta }( W)\) is a coherent risk measure and degenerates to \(\textrm{CVaR}_{\alpha }(W)\).

The positive homogeneity and cash additivity of the ICVaR are associated with corresponding adjustments of the benchmark. By Proposition 3, the ICVaR of a position scaled by a factor k equals k times the ICVaR of the original position evaluated at the rescaled reference point \(\beta /k\). Similarly, if a constant amount c is added to the portfolio value, the ICVaR increases by c, with the reference point correspondingly reduced to \(\beta -c\).

Notice that the ICVaR cash-additivity rules out translation invariance, which is an important property of variance as a risk measure and relevant in mean-variance portfolio optimization. Indeed, cash-additivity implies that ICVaR-based risk assessment depends linearly on variations of wealth. The translation invariance of the variance ensures a stable and absolute measure of risk, which is fully consistent, as a decision paradigm, with a portfolio problem in which an investor intends to control the risk for a given expected return. Here, however, we consider an investor primarily concerned with outperforming a benchmark strategy, hence the introduction of the ICVaR as reference risk measure. Furthermore, like the CVaR, and indeed as proposed in Liu et al. (2021) for the one-period problem, this risk measure may very well be adopted in data-driven, nonparametric problems. We elaborate further on this point in Sect. 4.3.

In the context of relative portfolio optimization, based on a benchmark portfolio such as a market index Y, we capture the risk of portfolio W with respect to Y with the ICVaR \(\rho _{\alpha , \beta }(W_Y)\), where \(W_Y{:}{=}W-Y\). The risk measure will then focus on the excess tail risk, where the tail depends on a previously specified \(\beta \). We further elaborate on these concepts in Sect. 3.2.

In this article we extend the risk measure to a multi-period setting. When considering a random wealth process \(W_{1,T}{:}{=}\{W_t\}_{t=1}^{T}\) over T periods, we focus on the tail risk exceeding both the quantile and a pre-specified process \(\{\beta _t\}_{t=1}^{T}\), which may be assumed to reflect the investor's risk preferences. In what follows we will consider a constant \(\beta \) over a short investment horizon and focus on the tail risk exposure at the end of the investment period, with the ICVaR measure \(\rho _{\alpha ,\beta _T}(W_{1,T}){:}{=} \rho _{\alpha ,\beta }(W_T)\). The terminal ICVaR is monotone increasing and concave; moreover, it is cash additive and positive homogeneous when the benchmark is simultaneously adjusted (Liu et al., 2021; Chen et al., 2017).

By defining the ICVaR with respect to the returns’ distribution at the end of the planning horizon, given normalized unit portfolio values for say \(W_0\) and \(Y_0\), we just require returns to be compounded over the problem stages. Then \(W_T-Y_T\) will lead to \(\rho _{\alpha , \beta }( W_T - Y_{T} )\).
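The terminal construction can be sketched as follows: stage returns are compounded along each scenario path from unit initial values, and the ICVaR is then applied to the terminal difference \(W_T-Y_T\). The Python example below is a hypothetical illustration that assumes equally likely paths (a scenario tree would in general carry path probabilities); the paths, the helper names and the parameter values are invented for the example.

```python
def terminal_value(stage_returns):
    """Compound stage returns along one scenario path, starting from unit value."""
    value = 1.0
    for r in stage_returns:
        value *= 1.0 + r
    return value


def icvar(w, alpha, beta):
    """Sample ICVaR of Eq. (3) on equally weighted scenarios."""
    def objective(eta):
        return eta - sum(max(eta - x, 0.0) for x in w) / (len(w) * (1.0 - alpha))
    return max(objective(eta) for eta in [x for x in w if x <= beta] + [beta])


# hypothetical equally likely scenario paths over T = 2 stages
portfolio_paths = [[0.02, 0.01], [0.02, -0.03], [-0.01, 0.02], [-0.01, -0.04]]
benchmark_paths = [[0.01, 0.01], [0.01, -0.02], [-0.02, 0.01], [-0.02, -0.03]]

# terminal excess wealth W_T - Y_T on each path
excess = [terminal_value(p) - terminal_value(b)
          for p, b in zip(portfolio_paths, benchmark_paths)]

terminal_risk = icvar(excess, alpha=0.75, beta=0.0)
print(excess, terminal_risk)
```

Only the terminal distribution enters the measure in this sketch; intermediate stages matter solely through compounding.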

3 ICVaR and stochastic dominance

We generalize to generic ISD principles the relationship between the ICVaR measure and ISD-2 established in Proposition 1, first in a one-period and then in a multi-period setting.

3.1 Gap function of order k

Following from above, for \(k=1,2\), the ISD condition in Eq. (1) is equivalent to having

$$\begin{aligned} \inf _{\eta \le \beta } \left[ F_{k}(Y,\eta )- F_{k}(W,\eta ) \right] \ge 0. \end{aligned}$$

On these grounds, primarily for the cases \(k=1,2\), we introduce the function:

$$\begin{aligned} \mathscr {H}_{k}(Y,W,\beta ) {:}{=} \inf _{\eta \le \beta } \left[ F_{k}(Y,\eta )- F_{k}(W,\eta ) \right] . \end{aligned}$$
(4)

We refer to \(\mathscr {H}_{k}(Y,W,\beta )\) as the gap function of order k in the domain \(\eta \le \beta \). Thus, given \(\beta \), the ISD constraint in Eq. (1) is equivalent to having \( \mathscr {H}_{k}(Y,W,\beta )\ge 0\), \(k=1,2\).
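For scenario data, the gap function in Eq. (4) reduces to a finite minimization. The Python sketch below assumes equally likely scenarios; the names `F` and `gap` and the samples are hypothetical. It uses the fact that the difference \(F_k(Y,\eta )-F_k(W,\eta )\) is a step function for \(k=1\) and piecewise linear for \(k=2\), and that it vanishes below all sample points.

```python
def F(k, sample, eta):
    """Empirical F_k for k = 1 (cdf) or k = 2 (expected shortfall below eta)."""
    n = len(sample)
    if k == 1:
        return sum(1 for x in sample if x <= eta) / n
    return sum(max(eta - x, 0.0) for x in sample) / n


def gap(k, y, w, beta):
    """Empirical gap function H_k(Y, W, beta) = inf over eta <= beta of F_k(Y) - F_k(W)."""
    # Breakpoints of the difference are the sample points; below all of them the
    # difference equals 0, so 0 is always among the attained values.
    points = [x for x in list(y) + list(w) if x <= beta] + [beta]
    return min([0.0] + [F(k, y, eta) - F(k, w, eta) for eta in points])


y = [-0.2, 0.0, 0.2]    # illustrative benchmark scenarios
w = [-0.1, 0.1, 0.3]    # portfolio scenarios, shifted up by 0.1

print(gap(1, y, w, 0.5), gap(2, y, w, 0.5))   # W against benchmark Y
print(gap(1, w, y, 0.5), gap(2, w, y, 0.5))   # roles reversed
```

On bounded samples the gap function can never be strictly positive, so the condition \(\mathscr {H}_{k}(Y,W,\beta )\ge 0\) amounts to the infimum being exactly zero.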

To capture the implications of the gap function of order \(k=1,2\), as the reference point \(\beta \) varies, consider the following example.

Example 1

In a security market, there exists a market index Y and a portfolio W with the following return distributions:

  • Y follows a uniform distribution on \([-1,1]\);

  • W follows a piecewise uniform distribution on \([-1,1]\) with density

    $$\begin{aligned} p(x)= \left\{ \begin{array}{ll} {1}/{8},&{} x\in [-1,-0.2],\\ 2,&{} x\in (-0.2,0.1],\\ {1}/{3},&{} x\in (0.1,1]. \end{array} \right. \end{aligned}$$

Fig. 2 Gap function of order 1 over different domains

Figure 2 displays the distribution functions of W and Y and, for the two sub-cases \(\beta =0.6\) and \(\beta =0\), how the gap function of order 1 is computed. The thin dotted blue lines between the two distribution functions (from the dashed line to the solid line) show \(F_{1}(Y,\eta )- F_{1}(W,\eta )\) for different values of \(\eta \) in the domain \(\eta \le \beta \). The gap function from Y to W of order 1 is the minimal value of \(F_{1}(Y,\eta )- F_{1}(W,\eta )\) in the domain \(\eta \le \beta \). For the case \(\beta =0.6\) on the left: \(\mathscr {H}_{1}(Y,W,\beta )=-0.15\); while for the case \(\beta =0\) on the right: \(\mathscr {H}_{1}(Y,W,\beta )=0\).

Fig. 3 Gap function of order 2

Figure 3 shows the \(F_2\) functions of W and Y. For any \(\beta \ge 0\), we have \( \mathscr {H}_{2}(Y,W,\beta )=0\) and W dominates Y in the ISD-2 sense. When \(\beta =0.6\), we do not have ISD-1 because of a negative \( \mathscr {H}_{1}(Y,W,\eta )\) for \(0<\eta \le 0.6\). When \(\beta =0\), however, \(\mathscr {H}_{1}(Y,W,\eta )\ge 0\) for any \(\eta \le 0\) and we have FSD over \((-\infty ,\beta )\). A nonnegative gap function is a necessary and sufficient condition for ISD-1 dominance over the benchmark.
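The values quoted in Example 1 can be reproduced from the closed-form distribution functions. The Python sketch below evaluates both gaps on a uniform grid, so it is a numerical approximation; the grid resolution and the function names are assumptions made for the illustration. The order-2 gap is obtained by integrating the order-1 gap, since \(F_2\) is the integral of \(F_1\).

```python
def cdf_y(eta):
    # Y uniform on [-1, 1]
    return min(max((eta + 1.0) / 2.0, 0.0), 1.0)


def cdf_w(eta):
    # W piecewise uniform with densities 1/8, 2 and 1/3 as in Example 1
    if eta <= -1.0:
        return 0.0
    if eta <= -0.2:
        return (eta + 1.0) / 8.0
    if eta <= 0.1:
        return 0.1 + 2.0 * (eta + 0.2)
    if eta <= 1.0:
        return 0.7 + (eta - 0.1) / 3.0
    return 1.0


def gaps(beta, steps_per_unit=1000):
    """Grid approximations of H_1 and H_2 on [-1, beta]; both gaps vanish below -1."""
    n = int(round((beta + 1.0) * steps_per_unit))
    h = 1.0 / steps_per_unit
    etas = [-1.0 + i * h for i in range(n + 1)]
    diff1 = [cdf_y(e) - cdf_w(e) for e in etas]
    # running trapezoid integral of the order-1 gap gives the order-2 gap
    diff2 = [0.0]
    for i in range(1, n + 1):
        diff2.append(diff2[-1] + 0.5 * h * (diff1[i - 1] + diff1[i]))
    return min(diff1), min(diff2)


h1_wide, h2_wide = gaps(0.6)
h1_narrow, h2_narrow = gaps(0.0)
print(h1_wide, h2_wide, h1_narrow, h2_narrow)
```

On this grid the order-1 gap is about -0.15 for \(\beta =0.6\) and about 0 for \(\beta =0\), while the order-2 gap stays at zero in both cases.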

The function, furthermore, preserves the ISD order, in the sense of the following proposition.

Proposition 4

If two random variables \(W_1\) and \(W_2\) satisfy \(W_{1}\succeq _{(k,\beta )} W_2\), then, for \(k=1,2\), \(\mathscr {H}_{k}(Y,W_1,\beta ) \ge \mathscr {H}_{k}(Y,W_2,\beta )\).

Proof

Since \(W_{1}\succeq _{(k,\beta )} W_2\), for all \(\eta \le \beta \) we have \(F_k(W_{1},\eta ) \le F_k(W_{2},\eta )\) and thus \(F_k(Y,\eta )- F_k(W_{1},\eta ) \ge F_k(Y,\eta )- F_k(W_{2},\eta )\). Taking the infimum on both sides, we have

$$\begin{aligned} \mathscr {H}_{k}(Y,W_1,\beta ) \ge \mathscr {H}_{k}(Y,W_2,\beta ), \end{aligned}$$

for \(k=1,2\). \(\square \)

As a result of Eq. (4) and the example, we can thus establish an equivalence between the ISD definition in Eqs. (1) and (2) and the gap function for \(k=1,2\) as specified in Table 1. The term \(\infty \) in \(\mathscr {H}_2(Y,W,.)\) means that the infimum in the definition of the gap function (4) is taken for \(\eta \in \mathbb {R}\).

Table 1 Implication of the nonnegativity of the gap function

A nonnegative second-order gap function between W and Y guarantees second-order stochastic dominance (ISD-2.0) of W with respect to Y. Together with a nonnegative first-order gap function it implies a first-order ISD with benchmark \(\beta \).

3.2 Bounds on the ICVaR function

Let now \(W_Y=W-Y\) be the random variable associated with the difference between the portfolio return W and a benchmark return Y. In Sect. 4 we propose a risk-reward model based on the ICVaR that, depending on the adopted trade-off between risk and reward, as explained later, may as a by-product enforce the stochastic dominance relationship on the tail \((-\infty ,\beta )\). This is consistent with Propositions 5 and 6, where we show that the ICVaR of the difference between two random variables defines a lower bound on the gap function \(\mathscr {H}_{k}\) in Eq. (4), for \(k=1,2\). The proofs and technical details are in Appendix A.

Proposition 5

For \(0 \le \alpha < 1\) and \(\beta \le \textrm{VaR}_{\alpha }(W_Y)\), we have:

$$\begin{aligned} \rho _{\alpha ,\beta }(W_Y) \le \beta + \frac{1}{(1-\alpha )}\mathscr {H}_{2}(Y+\beta ,W,\infty ). \end{aligned}$$
(5)

The condition \(\beta \le \textrm{VaR}_\alpha (W_Y)\) in the proposition implies \( \rho _{\alpha ,\beta }( W_Y)= \beta -\frac{1}{1-\alpha } \mathbb {E}[\beta -(W-Y)]_+\), by Proposition 2. The conclusion then follows from the elementary property \((a)_+-(b)_+\ge -(b-a)_+\) of the positive part, applied with \(a=\eta -Y-\beta \) and \(b=\eta -W\) and taking expectations. We remark that the infinity term \(\infty \) means that the infimum of the gap function (4) is taken for all \(\eta \in \mathbb {R}\).

Remark

The inequality (5) involves three terms: the ICVaR function, the parameter \(\beta \) and the second-order gap function \(\mathscr {H}_{2}\). Solving it for \(\mathscr {H}_{2}\) gives a lower bound for \(\mathscr {H}_{2}(Y+\beta ,W,\infty )\) based on the ICVaR and on \(\beta \). This lower bound is given by

$$\begin{aligned} (1-\alpha )(\rho _{\alpha ,\beta }(W_Y) -\beta ). \end{aligned}$$
(6)

Thus a nonnegative lower bound implies, and is implied by, an ISD-2.0 dominance of W over the translated benchmark \(Y+\beta \).
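The bound (6) can be checked on scenario data. The Python sketch below assumes equally likely, scenario-wise paired returns and a reference point below the sample quantile of \(W_Y\), as required by Proposition 5; all names and numbers are hypothetical.

```python
def icvar(w, alpha, beta):
    """Sample ICVaR of Eq. (3) on equally weighted scenarios."""
    def objective(eta):
        return eta - sum(max(eta - x, 0.0) for x in w) / (len(w) * (1.0 - alpha))
    return max(objective(eta) for eta in [x for x in w if x <= beta] + [beta])


def f2(sample, eta):
    return sum(max(eta - x, 0.0) for x in sample) / len(sample)


def gap2_all(y, w):
    """Empirical H_2(Y, W, inf): infimum over all eta of F_2(Y, eta) - F_2(W, eta)."""
    # The difference is piecewise linear with kinks at the sample points, equals 0
    # below all of them and is constant above them, so the kinks and 0 suffice.
    kinks = list(y) + list(w)
    return min([0.0] + [f2(y, eta) - f2(w, eta) for eta in kinks])


w = [0.05, -0.02, 0.03, -0.06, 0.01]   # illustrative portfolio returns
y = [0.02, -0.01, 0.04, -0.03, 0.00]   # illustrative benchmark returns
alpha, beta = 0.7, -0.02               # beta lies below the sample VaR of W - Y

excess = [a - b for a, b in zip(w, y)]
rho = icvar(excess, alpha, beta)
lower_bound = (1.0 - alpha) * (rho - beta)        # expression (6)
h2 = gap2_all([b + beta for b in y], w)           # H_2(Y + beta, W, inf)
print(lower_bound, h2)
```

In this toy case both quantities are close to -0.002, so the bound happens to be tight; in general it only holds as an inequality.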

Proposition 5 establishes a relationship between the ICVaR and SSD. A connection with first-order ISD can also be established through \(\mathscr {H}_1(.,W,\beta )\).

Proposition 6

For \(0 \le \alpha < 1\) and \(\beta \) nonpositive satisfying \( \beta \le \textrm{VaR}_{\alpha }(W_Y)\), we have a lower bound for the first-order gap function

$$\begin{aligned} \rho _{\alpha ,\beta }(W_Y) \le e(Y,\beta )+\beta +\frac{|\beta |}{1- \alpha } \mathscr {H}_{1}(Y+2\beta ,W,\beta ). \end{aligned}$$
(7)

Here, the error function \(e(Y,\beta )\) depends only on the benchmark Y and the parameter \(\beta \), and it is defined as

$$\begin{aligned} e(Y,\beta )= \frac{1}{1-\alpha } \sup _{\eta \le 0}\left[ F_2(Y+\beta ,\eta ) -|\beta | F_1(Y+\beta ,\eta ) \right] . \end{aligned}$$

We detail the proof in Appendix A. The proof of this result relies on Chebyshev’s inequality, from which we have that \(F_2(W,\eta )\ge |\beta | F_1(W,\eta +\beta )\) for any \(\eta \in \mathbb {R}\). Since Chebyshev’s inequality considers non-negative numbers, this explains why we include the absolute value of \(\beta \) in the inequality.
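The inequality \(F_2(W,\eta )\ge |\beta | F_1(W,\eta +\beta )\) used in the proof holds for any distribution when \(\beta \le 0\), including empirical ones, since every outcome with \(W\le \eta +\beta \) contributes at least \(|\beta |\) to the shortfall \((\eta -W)_+\). A minimal Python check on hypothetical, equally likely scenarios:

```python
def f1(sample, eta):
    """Empirical cdf F_1."""
    return sum(1 for x in sample if x <= eta) / len(sample)


def f2(sample, eta):
    """Empirical F_2, the expected shortfall below eta."""
    return sum(max(eta - x, 0.0) for x in sample) / len(sample)


w = [0.05, -0.02, 0.03, -0.06, 0.01]   # illustrative returns
beta = -0.02                           # nonpositive reference point

# slack of the inequality F_2(W, eta) - |beta| * F_1(W, eta + beta) on a grid of eta
grid = [-0.10 + 0.005 * i for i in range(41)]
slack = [f2(w, eta) - abs(beta) * f1(w, eta + beta) for eta in grid]
print(min(slack))
```

The slack is nonnegative on the whole grid up to rounding, which is the property the error function \(e(Y,\beta )\) builds on.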

Remarks

  (i) A nonnegative lower bound will guarantee that the constraint in Eq. (1) holds on \((-\infty ,\beta )\) for the translated benchmark \(Y+2 \beta \). In particular, if the distribution W stochastically dominates \(Y+2 \beta \) to the second-order and if the first-order lower bound with respect to \(\beta \) is nonnegative, then this is sufficient to have \(W \succeq _{1. \beta }( Y+ 2\beta )\). However if only the second-order lower bound is nonnegative, we cannot guarantee an ISD-1 dominance.

  (ii) For given \(\alpha \), consider a \(\beta \) approaching 0 from the left, then the term \( |\beta | \mathscr {H}_{1}(Y+2\beta ,W,\beta )\) in Eq. (7) will tend to 0 and at the same time the error function e(Y, 0) will diverge and depend only on \(F_2(Y,\eta )\). In this case the lower bound on \(\mathscr {H}_1(Y+2\beta ,W,\beta )\) will also diverge and hardly provide any information on first-order stochastic dominance.

The inequality (7) involves four terms: the ICVaR, the parameter \(\beta \), the error function \(e(Y,\beta )\) and the first-order gap function \(\mathscr {H}_{1}\). By solving the inequality, we obtain a lower bound for \(\mathscr {H}_{1}\) that depends on the ICVaR function, on \(\beta \), and on the error \(e(Y,\beta )\):

$$\begin{aligned} \frac{(1- \alpha )}{|\beta |} (\rho _{\alpha ,\beta }(W_Y) - e(Y,\beta )-\beta ). \end{aligned}$$
(8)

Equations (7) and (8) include the error term \(e(Y,\beta )\). By definition \(F_1(Y,\cdot )\) is just the cdf of the benchmark return distribution and \(F_2(Y,\cdot )\) is its primitive. For given \(\beta \), an error close to 0 then identifies the portion of the support over which the two functions agree; outside it the error function increases. A tight bound on the first-order gap function requires a small error function. By recursively increasing \(\beta \) we can thus infer, as further explained in Sect. 3.3, the prevailing ISD order.

3.3 Error function: from one to multi-period

We next consider a numerical example to clarify the behaviour of the error function and its implications for second and first-order stochastic dominance. In particular, let the error function \(e(Y,\beta )\) be estimated on weekly returns of the S&P 500 index from January 3, 2019 to December 25, 2022 for different values of \(\beta \), assuming \(\alpha =0.95\). In Fig. 4, we estimate the error \(e(Y,\beta )\) for \(\beta \in [-0.1,0]\).

Fig. 4 The error term \(e(Y,\beta )\) for \(\beta \in [- 0.1,0]\)

We see, in this case, that the error \(e(Y,\beta )\) increases rapidly as \(\beta \) approaches 0 from the left. A tolerable error of less than \(1\%\), and thus an effective lower bound, requires \(\beta \le -0.03\). In Sect. 5, we will provide evidence on the ICVaR maximization problem as \(\beta \) varies between \(-0.03\) and 0 and show that, within this range, the problem solution leads consistently to ISD-1 conditions under several problem specifications.

When \(\beta =\textrm{VaR}_{\alpha }(W_Y)\) the ICVaR will coincide with the CVaR. Thus a \(\beta =\textrm{VaR}_{\alpha }(W_Y)\) close to zero would imply a weak lower bound on \(\mathscr {H}_1\). In this case a strong and tight bound on \(\mathscr {H}_{1}(Y+2\beta ,W,\beta )\) will depend crucially on the selection of \(\beta \): we examine this issue further in Sect. 5, devoted to computational evidence. From Proposition 5, however, we see that a tight lower bound for SSD requires a \(\beta \) close to 0. The error function does indeed depend on the distance between the benchmark second and first-order distributions. Through the ICVaR, as \(\beta \) approaches 0, we are then sure to control SSD and ISD-2, while the enforcement of ISD-1 conditions is not guaranteed, as it depends on the selected \(\beta \) and the associated error function, for a given optimal portfolio distribution.

We extend the definition of the error function in Proposition 6 to several stages and summarize the implications of the lower bounds’ evolution on a multi-period ICVaR problem formulation. Consider the benchmark process \(Y_t\) evaluated in \(t=1,2,\cdots ,T\), then, for each t, we define \(e(Y_t,\beta )\) as a straightforward generalization of \(e(Y,\beta )\):

$$\begin{aligned} e(Y_t,\beta )= \frac{1}{1-\alpha } \sup _{\eta \le 0}\left[ F_2(Y_t+\beta ,\eta ) -|\beta | F_1(Y_t+\beta ,\eta ) \right] . \end{aligned}$$
(9)

In Eq. (9) we just specify the error, stage-wise, by taking the associated distributions \(F_2(Y_t,.)\) and \(F_1(Y_t,.)\), which will be applied in Sect. 4.1 to construct a boundary function and measure the performance of an optimal dynamic portfolio.
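A stage-wise estimate of Eq. (9) on scenario data may be sketched as follows. The Python example assumes equally likely benchmark scenarios at each stage; the function name `error_term`, the stage samples and the parameter values are hypothetical and are not the S&P 500 data used for Fig. 4.

```python
def error_term(y, alpha, beta):
    """Empirical e(Y, beta): sup over eta <= 0 of F_2(Y+beta, eta) - |beta| F_1(Y+beta, eta),
    divided by (1 - alpha)."""
    z = sorted(x + beta for x in y)   # scenarios of the shifted benchmark Y + beta
    n = len(z)

    def f2(eta):
        return sum(max(eta - v, 0.0) for v in z) / n

    best = 0.0  # value of the bracket for eta below every shifted scenario
    # Between consecutive jump points F_1 is constant and F_2 is nondecreasing, so
    # the supremum is approached just before a jump (left limit of F_1) or at eta = 0.
    for x in z:
        if x > 0.0:
            break
        below = sum(1 for v in z if v < x) / n
        best = max(best, f2(x) - abs(beta) * below)
    at_zero = f2(0.0) - abs(beta) * sum(1 for v in z if v <= 0.0) / n
    return max(best, at_zero) / (1.0 - alpha)


# hypothetical benchmark scenarios at stages t = 1, 2
stage_benchmark = {1: [-0.1, 0.1], 2: [-0.15, -0.02, 0.04, 0.12]}
stage_errors = {t: error_term(y_t, alpha=0.5, beta=-0.05)
                for t, y_t in stage_benchmark.items()}
print(stage_errors)
```

For the two-point sample at stage 1 the estimate is about 0.1. The error at each stage depends only on that stage's benchmark distribution and on \(\beta \), not on the portfolio.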

It is worth summarizing the set of relationships adopted in the sequel to support first or second-order ISD by maximizing the ICVaR function, as detailed in Sect. 4.

  • The pair of functions \(\mathscr {H}_k(Y,W,\beta )\), \(k=1,2\), provides relevant information on first (\(k=1\)) and second (\(k=2\)) order ISD: \(W\succeq _{k,\beta } Y\).

  • For \(k=2\), as \(\beta \rightarrow 0-\), the maximization of ICVaR \(\rho _{\alpha ,\beta }(W_Y)\) will force \(\mathscr {H}_2(Y,W,\beta )\rightarrow 0\) and thus surely enforce SSD and possibly ISD-\(1.\beta \) dominance conditions.

  • The convergence to 0 of \(\mathscr {H}_1(Y,W,\beta )\), which is necessary to establish FSD, on the other hand, due to the behaviour of the error function, may not be attained through the maximization of the ICVaR.

  • For the ISD condition \(W \succeq _{1.\beta } Y\) to be established, however, it is sufficient that \(\mathscr {H}_k(Y,W,\beta )\ge 0\) for both \(k=1,2\), through a careful selection of \(\beta \). Notice that in the multistage model, the \(\mathscr {H}_k\) are defined stage-wise and do not depend on the problem dynamics.

In what follows, we apply these results to the solution of a multistage portfolio problem, in which \(\beta \) is defined as a financial return.

4 Portfolio selection with ICVaR

The solution of a static, one period, portfolio problem based on ISD principles was shown in Liu et al. (2021) to extend earlier first (FSD) and second (SSD) order stochastic dominance results to a richer set of risk preferences. The stated equivalence in Liu et al. (2021) between ISD-2 and the ICVaR allows the formulation of a decision problem based on the canonical risk-return trade-off criterion, where the risk is captured by the (ISD-2 consistent) ICVaR measure and the reward by the expected portfolio return.

Several risk-reward models including a stochastic dominance relationship are proposed by Ogryczak and Ruszczynski (1999, 2001, 2002). In this article we consider the mean-ICVaR trade-off problem and extend the modeling framework to a multi-period setting.

We assume a security market consisting of m risky assets, a risk-free asset and a market index to be taken as a benchmark. The introduction of stochastic dominance constraints in the formulation of a portfolio selection model follows naturally in the context of relative, as opposed to absolute, performance optimization: the investor looks for a portfolio strategy that outperforms the market index in a number of states of the world that depends on the SD order. The introduction of ISD criteria helps generalize such a preference order through a reference point which spans the domain of the market index and the portfolio distribution. Relative portfolio optimization is thus associated with the so-called passive, as opposed to active, portfolio management principle common in the fund management industry. The latter is typically, even if not necessarily, associated with mean-variance optimization. We show in Sect. 4, however, that the multiperiod mean-ICVaR optimal portfolios display robust out-of-sample performance in terms of risk-adjusted returns. In order to outperform a market index or any other benchmark strategy, portfolio managers rely on an investment universe that would typically include a subset of the index-constituent assets plus other assets that may be negatively correlated with the benchmark. We further discuss this point in Sect. 4.

We formulate a multistage mean-ICVaR portfolio problem with a discrete and finite planning horizon \(t=1, \cdots ,T\). The decision process is specified in terms of portfolio allocations at time t, denoted by \(x_{i,t}\) for the risky assets \(i=1,2,\ldots ,m\), and of buying \(x^+_{i,t}\) or selling \(x^-_{i,t}\) decisions on asset i at time t. For \(i=0\), \(x_{0,t}\) represents a risk-free allocation, which, in this context, corresponds to cash. We assume a unit initial wealth, i.e., \(\sum _{i=0}^{m} {x}_{i,0}=W_0=1\). We denote the random returns of the risky assets at time t by \(r_{i,t}\) for \(i=1,2,\ldots ,m\), as they enter Eq. (10f), and assume a null return on the investment in the risk-free asset. We present in Sect. 3.3 a simple scenario generation method adopted to support the dynamic problem formulation (Dupačová et al., 2000). Consistent with canonical arbitrage-free conditions as established by Klaassen (1998, 2002), we generate the tree process for both the assets’ returns in the investment universe and, following the same tree structure, for the benchmark returns. The benchmark distribution relevant for the ISD analysis is thus exogenous and enters the problem formulation only through the ICVaR measure.

In the dynamic formulation, as previously stated, we consider a terminal ICVaR measure: \(\rho _{\alpha ,\beta }(W_T-Y_T)\), corresponding to the terminal portfolio value \(W_T\) and the terminal value of the market index \(Y_T\).

The following motivations support the adoption of a multistage, dynamic problem formulation:

  • The evaluation of the extra-return generated by a multi-period relative to a one period, myopic optimal portfolio policy. The ICVaR optimization with respect to a benchmark allows in particular the control of excess tail risk exposure over several stages.

  • By solving a multistage instance, we intend to infer, for \(k=1,2\), the ISD-k partial order induced by the solution of problem (10). The evidence is of particular relevance due to the high computational costs of a genuine multi-period ISD-k formulation.

  • The validation of the multi-period ICVaR problem formulation, in general as a classical mean-risk model and specifically against a classical multistage CVaR problem formulation, under different risk-reward trade-offs.

We denote a multistage mean-ICVaR optimization instance by \(\mathscr {L}(\lambda ,\beta ,T)\), where \(\beta \) defines the terminal ICVaR risk measure, which controls the risk on the left tail at the end of the planning horizon.

$$\begin{aligned}&\max \limits _{x, x^+, x^-}&(1-\lambda ) \mathbb {E}[W_{T}] + \lambda \rho _{\alpha , \beta }( W_T - Y_{T} ) \end{aligned}$$
(10a)
$$\begin{aligned}&\mathrm{s.t.}&W_{t}= \sum _{i=0}^m x_{i,t}, \hspace{0.5cm} \ {t=1,\cdots ,T,} \end{aligned}$$
(10b)
$$\begin{aligned}{} & {} x_{0,0}=\hat{x}_{0} +(1-c_s)\sum _{i=1}^m x_{i,0}^- -(1+c_b)\sum _{i=1}^m x_{i,0}^+,\hspace{0.5cm} \end{aligned}$$
(10c)
$$\begin{aligned}{} & {} x_{i,0}=\hat{x}_{i,0}+x_{i,0}^+ - x_{i,0}^-, \hspace{0.5cm} {i=1,\cdots ,m,} \end{aligned}$$
(10d)
$$\begin{aligned}{} & {} x_{0, t}=x_{0, t-1} +(1-c_s)\sum _{i=1}^m x_{i, t}^- -(1+c_b)\sum _{i=1}^m x_{i, t}^+,\ \hspace{0.5cm} {t=1,\cdots ,T,} \qquad \end{aligned}$$
(10e)
$$\begin{aligned}{} & {} x_{i, t}=x_{i, t-1} (1+r_{i, t})+x_{i, t}^+ - x_{i, t}^-,\hspace{0.5cm} \ {i=1,\cdots ,m,\ t=1,\cdots ,T,} \qquad \end{aligned}$$
(10f)
$$\begin{aligned}{} & {} x^+_{i,T}=x^-_{i,T}=0, \ \hspace{0.5cm} {i=1,\cdots ,m}, \qquad \end{aligned}$$
(10g)
$$\begin{aligned}{} & {} x_{i,t}, x_{i,t}^+, x_{i,t}^-\ge 0, \ \hspace{0.5cm} {i=1,\cdots ,m,\ t=0,1,\cdots ,T-1.} \end{aligned}$$
(10h)

Here \(\{W_t\}_{t=1}^T\) in (10b) is the wealth process and \(W_T{:}{=}\sum _{i=0}^m x_{i,T}\) is the wealth at the terminal stage T. \(\{Y_t\}_{t=1}^T\) is the benchmark value process, obtained if all wealth were invested in the market index. The portfolio evolution is captured by Eqs. (10d) and (10f) for the initial portfolio allocation and subsequent buying and selling decisions, also referred to as rebalancing decisions. No rebalancing is allowed at the end of the finite planning horizon T, as stated in (10g). The variable \(x_{0,t}\) in (10c) and (10e) specifies the cash balance at time t as a result of an initial cash position \(\hat{x}_{0}\) and subsequent buying and selling decisions, in this case accounting for transaction costs \(c_s\) and \(c_b\) upon selling and buying, respectively. At \(t=0\) we also specify an input portfolio \(\hat{x}_{i,0}\), if any. The optimal root node allocation, the only one under full uncertainty, will be determined according to (10d).
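The balance equations (10c) to (10f) can be illustrated along a single scenario path. The following is a minimal sketch, not the implementation used in this study: the function name `simulate_path` and its argument layout are hypothetical, the trading decisions are taken as given inputs rather than optimized, and condition (10g) is assumed to be respected by the caller.

```python
def simulate_path(x_hat_cash, x_hat, buys, sells, returns, c_b=0.0, c_s=0.0):
    """Roll holdings forward along one scenario path, following (10c)-(10f).

    x_hat_cash : initial cash position (x_hat_0)
    x_hat      : list of the m initial risky holdings (x_hat_{i,0})
    buys/sells : lists of length T+1; buys[t][i] plays the role of x^+_{i,t}
    returns    : list of length T+1; returns[t][i] is r_{i,t} (returns[0] unused)
    Returns the list of wealth values W_0, ..., W_T.
    """
    m = len(x_hat)
    # Root node: (10c) for cash, (10d) for the risky assets.
    cash = x_hat_cash + (1 - c_s) * sum(sells[0]) - (1 + c_b) * sum(buys[0])
    x = [x_hat[i] + buys[0][i] - sells[0][i] for i in range(m)]
    wealth = [cash + sum(x)]
    for t in range(1, len(returns)):
        # (10f): holdings grow at the realized return, then are rebalanced.
        x = [x[i] * (1 + returns[t][i]) + buys[t][i] - sells[t][i] for i in range(m)]
        # (10e): cash earns a null return and absorbs trades net of costs.
        cash = cash + (1 - c_s) * sum(sells[t]) - (1 + c_b) * sum(buys[t])
        # (10b): wealth is the sum of cash and risky holdings.
        wealth.append(cash + sum(x))
    return wealth


# Illustrative two-asset, two-stage path with 1% proportional costs.
example_wealth = simulate_path(
    x_hat_cash=1.0,
    x_hat=[0.0, 0.0],
    buys=[[0.5, 0.3], [0.0, 0.0], [0.0, 0.0]],
    sells=[[0.0, 0.0], [0.05, 0.0], [0.0, 0.0]],  # no trading at T, as in (10g)
    returns=[[0.0, 0.0], [0.10, -0.05], [0.0, 0.02]],
    c_b=0.01,
    c_s=0.01,
)
```

With non-zero costs the wealth recorded at the root is slightly below the initial cash, since the costs of the first trades are paid immediately.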

The objective function (10a) employs the mean-risk model with a trade-off determined by \(\lambda \) varying between 0 and 1. Here the reward is represented by the expected terminal wealth and the risk measure by the ICVaR. For given \(\alpha \) and \(\beta \), the parameter \(\lambda \) helps span alternative risk-reward trade-offs in terms of convex combinations between the two measures. As \(\beta \) varies, however, different shortfall distributions will be considered. Following the definition of the risk measure, an increasing \(\beta \) would restrict the shortfall domain progressively. We are particularly interested in the case \(\lambda =1\) to validate the resulting stochastic dominance orders of W with respect to Y. Based on this formulation, after solution, we may assess, as time evolves, the resulting performance per unit of tail risk, the latter represented by the ICVaR, or per unit of volatility risk, as common in classical portfolio analysis, on the generated probability distributions. As already motivated, the adoption of the ICVaR is related to its relationship with SSD. We clarified in Sect. 2 and will provide supporting numerical evidence in Sect. 4 that a mean-CVaR problem formulation can be established just by equating, for given \(\alpha \), the \(\beta \) to the \(VaR_{\alpha }(.)\).

Problem (10) simplifies naturally to the one period case for \(T=1\), by considering only the root node investment decision and no rebalancing thereafter. The one-period problem is a static mean-ICVaR problem, denoted by \(\mathscr {L}(\lambda ,\beta ,1)\). We can define the multi-period mean-CVaR model \(\mathscr {G}(\lambda , T)\) as the solution of \(\mathscr {L}(\lambda , VaR_{\alpha },T)\). We also denote a one-period mean-CVaR model by \(\mathscr {G}(\lambda ,1){:}{=}\mathscr {L}(\lambda , VaR_{\alpha },1)\). Both the mean-CVaR model (Rockafellar & Uryasev, 2002) and the mean-ICVaR model can be formulated as linear programming problems when the return rate vector r and the benchmark y are discretely distributed.

Following the stochastic program (10), for \(\lambda =1\), given a tolerance \(\alpha \), as \(\beta \) increases the portfolio manager will look for the portfolio composition that maximizes the ICVaR of the difference between the portfolio and the benchmark returns. In the computational section we present an extended set of results for different \(\lambda \) and \(\beta \) values. The case \(\lambda =0\), as we will see, is of limited interest resulting simply in an optimal corner solution with all the wealth invested in the asset with highest expected return. The equivalence between the CVaR\(_{\alpha }\) and the \(\rho _{\alpha ,VaR_{\alpha }}\) problems will be validated numerically.

4.1 Performance measurement based on lower boundary functions

Following the problem formulation (10), we introduce in Eqs. (11) and (12) two functions that are instrumental to the solution analysis developed in Sect. 4.

Let, in particular, \(\mathscr {L}(\lambda ,\beta ,T)\) be an instance of the multistage problem (10) with risk-return trade-off parameter \(\lambda \), reference point \(\beta \) and investment horizon T. \(\{W_t\}_{t=1}^{T}\) is the optimal wealth process generated by the solution of \(\mathscr {L}(\lambda ,\beta ,T)\) and \(\{Y_t\}_{t=1}^{T}\) is the benchmark value process. For \(t=1,\cdots ,T\), we define two boundary functions of the second and first-order between \(W_t\) and \(Y_t\), respectively, as:

$$\begin{aligned} \zeta _{2,t}(\lambda ,\beta ,T)&{:}{=}&(1-\alpha )(\rho _{\alpha ,\beta }(W_t-Y_t) -\beta ), \end{aligned}$$
(11)
$$\begin{aligned} \zeta _{1,t}(\lambda ,\beta ,T)&{:}{=}&\frac{(1- \alpha )}{|\beta |} (\rho _{\alpha ,\beta }(W_t-Y_t) - e(Y_t,\beta )-\beta ) . \end{aligned}$$
(12)

The two functions are clearly inspired by the assumptions of Propositions 5 and 6, and are characterized by: (i) the trade-off parameter \(\lambda \in [0,1]\); (ii) the reference point \(\beta \) associated with the shortfall distribution; (iii) \(t \le T\) to specify the stage to which the bounds refer, with T the end of the investment horizon; (iv) \(k=1,2\) in \(\zeta _{k,t}\) to denote the ISD order, and (v) the wealth \(W_t\) in stage t generated by the optimal solution of \(\mathscr {L}(\lambda ,\beta ,T)\) and \(Y_t\) the comparative value in stage t of an investment in the market index.
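Once the stage-t ICVaR is available, Eqs. (11) and (12) are direct evaluations. The sketch below is only illustrative: the function name `boundary_functions` is hypothetical, and the error term \(e(Y_t,\beta )\) of Eq. (9) is assumed to be computed elsewhere and passed in as a number.

```python
def boundary_functions(rho, beta, alpha, error_term):
    """Evaluate zeta_2 (Eq. 11) and zeta_1 (Eq. 12) at one stage.

    rho        : ICVaR rho_{alpha,beta}(W_t - Y_t) at that stage
    beta       : reference point
    alpha      : ICVaR tolerance
    error_term : e(Y_t, beta) from Eq. (9), supplied by the caller (assumption)
    """
    zeta2 = (1 - alpha) * (rho - beta)
    if beta == 0:
        # Eq. (12) divides by |beta|, so zeta_1 is left undefined here.
        zeta1 = None
    else:
        zeta1 = (1 - alpha) / abs(beta) * (rho - error_term - beta)
    return zeta2, zeta1


# Illustrative inputs only; these numbers are not taken from the paper.
zeta2, zeta1 = boundary_functions(rho=-0.01, beta=-0.02, alpha=0.95, error_term=0.004)
```

Returning `None` for \(\beta =0\) is a design choice of this sketch, made to avoid a division by zero; the paper does not prescribe how that case should be handled numerically.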

Relying on the two boundary functions in Eqs. (11) and (12), we can now reconsider in Table 2, the summary evidence on the theoretical relationships established so far.

Table 2 Implication of the non-negativity of the lower bound function \(\zeta _{k,t}(\lambda ,\beta , T)\) for \(k=1,2\)

Table 2 summarizes the following evidence:

  (i) A nonnegative \(\zeta _{2,t}(\lambda ,\beta , T)\) guarantees second-order stochastic dominance (ISD-2.0) of \(W_t\) with respect to \(Y_t+\beta \). In particular, by continuity, when \(\beta \) is small, a nonnegative function \(\zeta _{2,t}(\lambda ,\beta , T)\) enforces the ISD-2.0 condition over the benchmark \(Y_t\).

  (ii) If the ISD-2.0 condition holds and \(\zeta _{1,t}(\lambda ,\beta , T) \ge 0\), then \(W_t \succeq _{1. \beta }( Y_t+ 2\beta )\). In the last row of the table, the positivity of \(\zeta _{2,t}\) is used only to guarantee the ISD-2.0 condition.

  (iii) It may occur that \(\zeta _{1,t}(\lambda ,\beta , T)\) is nonnegative, even if close to 0, while ISD-2.0 conditions cannot be guaranteed, in which case ISD-1 conditions cannot be guaranteed either.

From the evidence in Tables 1 and 2 focusing respectively on the gap functions and the boundary functions, we see that in particular for \(k=2\), after solving the optimization problem (10), the non-negativity of \(\mathscr {H}_2(Y_t,W_t,\infty )\) and of \(\zeta _{2,t}(\lambda ,\beta ,T)\) should guarantee ISD-2.0 (SSD) and possibly ISD-\(1.q_{\beta }\) orders between the portfolio and the benchmark. Even if negative, the greater the value of \(\zeta _{2,t}(\lambda ,\beta ,T)\), the closer the performance of portfolio \(W_t\) to that of benchmark \(Y_t\) according to SSD order. In this respect the boundary function may be interpreted as a performance measure of the optimal portfolio relative to the benchmark. By just maximizing the ICVaR in (10), based on the introduced mathematical relationship, we should then induce as a by-product at least ISD-2 and maybe ISD-1, which explains the rationale behind the objective function in (10). In what follows, after introducing a scenario based formulation of problem (10), we will develop the computational section relying on the functions discussed so far.

4.2 Scenario-based formulation

To solve the optimization problem, we use the scenario tree approach. Let T still be the investment horizon. We denote the scenario tree nodes in stage t by \(n \in \mathscr {N}_t, \ t \le T\). Every non-root node n has a unique ancestor node \(n- \in \mathscr {N}_{t-1}\). For each non-leaf node \(n\in \mathscr {N}_{t}\), \(t\le T-1\), we denote the set of its children nodes by \(n+ \subseteq \mathscr {N}_{t+1}\). For each node \(n \in \mathscr {N}_{T}\), a scenario is a path \(n, {n-}\), \(n{-}{-}\), \(\cdots , {n_0}\) where \(n_0\) is the root node. The number of possible scenarios is equal to \(|\mathscr {N}_{T}|\). Every node carries a probability of occurrence given by \(p_n\), such that \(\sum _{n\in \mathscr {N}_t} p_n=1\) and for every non-terminal node \(n \in \mathscr {N}_{t}\), \(t\le T-1\) satisfies \(p_n=\sum _{j \in n+} p_j\). For the node \(n \in \mathscr {N}_t\), the realization of the asset returns is denoted by

$$\begin{aligned} r_n=(r_{1,n}, r_{2,n}, \cdots , r_{m,n}), \end{aligned}$$

where \(r_{i,n}\) is the return of the i-th asset in node n. In the model specification, let the input portfolio position at the root node \(n_{0}\) be denoted by \( [ \hat{x}_{1,n_{0}}, \hat{x}_{2,n_{0}}, \cdots , \hat{x}_{m,n_{0}}]\). Problem (10) can be written as the following linear programming problem.

$$\begin{aligned}&\max \limits _{x,x^+,x^-,\eta , \phi }&(1-\lambda ) \sum _{n \in \mathscr {N}_{T}} \ p_n\ W_n + \lambda \eta -\frac{\lambda }{1-\alpha }\sum _{ n \in \mathscr {N}_{T}} \ p_n \ \phi _{n} \end{aligned}$$
(13a)
$$\begin{aligned}&\mathrm{s.t.}&W_n= \sum _{i=0}^m x_{i,n},\ {\forall n \in \mathscr {N}_{T},} \end{aligned}$$
(13b)
$$\begin{aligned}{} & {} x_{0,n_0}=\hat{x}_{0,n_0} +(1-c_s)\sum _{i=1}^m x_{i,n_0}^- - (1+c_b)\sum _{i=1}^m x_{i,n_0}^+, \end{aligned}$$
(13c)
$$\begin{aligned}{} & {} x_{0,n}=x_{0,n-} +(1-c_s)\sum _{i=1}^m x_{i,n}^- - (1+c_b)\sum _{i=1}^m x_{i,n}^+,\ {\forall n \in \mathscr {N}_{t},\ t=1,\cdots , T,} \nonumber \\ \end{aligned}$$
(13d)
$$\begin{aligned}{} & {} x_{i,n_{0}}=\hat{x}_{i,n_{0}}+x^+_{i,n_{0}}-x^-_{i,n_{0}},\ {i=1,\cdots ,m,} \end{aligned}$$
(13e)
$$\begin{aligned}{} & {} x_{i,n}=x_{i,n-} (1+r_{i,n})+x_{i,n}^+ - x_{i,n}^-,\ {\forall n \in \mathscr {N}_{t},\ t=1,\cdots , T,\ i=1,\cdots ,m,} \nonumber \\\end{aligned}$$
(13f)
$$\begin{aligned}{} & {} x_{i,n}, x_{i,n}^+, x_{i,n}^- \ge 0, \quad \forall n \in \mathscr {N}_{t},\ t=0,1,\cdots , T-1,\ i=1,\cdots ,m, \end{aligned}$$
(13g)
$$\begin{aligned}{} & {} \beta _T \ge \eta , \end{aligned}$$
(13h)
$$\begin{aligned}{} & {} \phi _{n} \ge \eta - (W_n-Y_{n}), \quad \forall n \in \mathscr {N}_{T}, \end{aligned}$$
(13i)
$$\begin{aligned}{} & {} \phi _{n} \ge 0, \quad x_{i,n}^+= x_{i,n}^- =0, \quad \forall n \in \mathscr {N}_{T},\ i=1,\cdots ,m. \end{aligned}$$
(13j)

The two inequalities \(\beta _T \ge \eta \) in (13h) and \(\phi _{n} \ge \eta - (W_n-Y_{n})\) in (13i) are defined at the end of the planning horizon and enforce the ICVaR optimization relative to the benchmark distribution.
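For a fixed terminal wealth vector, the part of (13) involving \(\eta \) and \(\phi _n\) can be solved in closed form, which gives a simple way to evaluate the ICVaR term of (13a) on a discrete distribution. The sketch below reads the representation \(\max _{\eta \le \beta }\{\eta -\frac{1}{1-\alpha }\sum _n p_n \max (\eta -(W_n-Y_n),0)\}\) directly off (13a), (13h) and (13i), taking \(\beta _T=\beta \); the function names are hypothetical and a proper probability vector with \(\alpha <1\) is assumed.

```python
def icvar(x, p, alpha, beta):
    """Maximize eta - E[(eta - X)^+] / (1 - alpha) over eta <= beta.

    x, p : scenario values of X = W - Y and their probabilities
    """
    def value(eta):
        shortfall = sum(pn * max(eta - xn, 0.0) for xn, pn in zip(x, p))
        return eta - shortfall / (1.0 - alpha)

    # The function is concave and piecewise linear in eta, so its maximum over
    # eta <= beta is attained at a scenario value not exceeding beta, or at
    # beta itself when the cap lies below the largest scenario value.
    candidates = [xn for xn in x if xn <= beta]
    if beta < max(x):
        candidates.append(beta)
    return max(value(eta) for eta in candidates)


def mean_icvar_objective(w, y, p, lam, alpha, beta):
    """Objective (13a) for given terminal wealth w and benchmark y."""
    expected_wealth = sum(pn * wn for wn, pn in zip(w, p))
    excess = [wn - yn for wn, yn in zip(w, y)]
    return (1 - lam) * expected_wealth + lam * icvar(excess, p, alpha, beta)


# Five equiprobable terminal scenarios (illustrative numbers only).
x = [-0.04, -0.02, 0.0, 0.01, 0.03]
p = [0.2] * 5
unconstrained = icvar(x, p, alpha=0.5, beta=1.0)   # cap (13h) not binding
capped = icvar(x, p, alpha=0.5, beta=-0.02)        # cap (13h) binding
```

When \(\beta \) is large enough for (13h) not to bind, the value coincides with the Rockafellar and Uryasev (2002) expression, in line with the mean-CVaR special case discussed above; a binding cap can only lower it.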

Consider the case \(\lambda =1\) in Eq. (13a): based on the two inequalities (13h) and (13i), the decision maker will maximise in expectation the difference \(W_n-Y_n\) at the end of the planning horizon. This should be sufficient, according to Propositions 5 and 6, to enforce second or first-order ISD, without explicitly including stochastic dominance constraints in the problem formulation. In this setting, due to the scenario formulation of the problem, the stochastic order between the probability distributions of \(W_n\) and \(Y_n\) would be defined for \(n \in \mathscr {N}_t\) in each stage \(t=1,2,\ldots ,T\). For this reason, this approach is referred to as stage-wise ISD ordering, to be distinguished from the case in which the SD conditions are evaluated conditionally in every sub-tree of the multistage problem.

The remaining constraints, from (13b) to (13g), are the scenario tree counterparts of the corresponding constraints introduced in problem (10). The instance \(\mathscr {L}(0,\beta ,T)\) reduces to a simple expected terminal wealth maximization problem, or growth model, relevant for a risk-neutral investor. Under the given assumptions, the optimal portfolio strategy takes the form of a tree process or optimal contingency plan, whose first stage, root-node decision defines the implementable optimal here-and-now portfolio allocation.

We complete this section by summarizing the scenario generation algorithm adopted to support the multistage formulation.

4.3 Scenario generation

Let \(r_{i,n}\) be the return of the i-th asset in node n. Given an initial state \(r_{i,0}\), we assume a rather simple mean-reverting auto-regressive return model (Campbell et al., 1997) to be estimated relying on OLS:

$$\begin{aligned} r_{i,n}=r_{i,n-}+\alpha _{i}({\hat{r}_i}-r_{i,n-})\Delta t_{n-} +\sigma _i \sum _{j=1}^{m} c_{i,j} e_{j,n}, \end{aligned}$$
(14)

where \(\Delta t_{n-}= t_n-t_{n-}\), with \(t_n\) denoting the time associated with node n, and the matrix \(C=\{c_{i,j}\}_{1 \le i, j \le m}\) is the Cholesky factor of an estimated correlation matrix. The \(e_{j,n}\) are then independent samples from a standard normal distribution. Under these assumptions we are considering a Gaussian vector return process, which may clearly be rather simplistic in general and, as we see in Sect. 4, specifically for ETFs, but which we assume sufficient to establish the properties and evidence central to this study. Alternative and more advanced market models may be employed following, for instance, Campbell et al. (1997), Valle et al. (2017) and Consigli et al. (2020).
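One draw of a child-node return vector from Eq. (14) can be sketched as follows. This is a minimal illustration under stated assumptions, not the generator used in the study: the function names are hypothetical, `speed` stands for the mean reversion coefficients \(\alpha _i\) (renamed to avoid confusion with the ICVaR tolerance), `r_bar` for the equilibrium returns \(\hat{r}_i\), and C is taken as the lower-triangular Cholesky factor of the correlation matrix.

```python
import math
import random


def cholesky(corr):
    """Lower-triangular C with C C^T = corr (Cholesky-Banachiewicz scheme)."""
    m = len(corr)
    c = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sum(c[i][k] * c[j][k] for k in range(j))
            if i == j:
                c[i][j] = math.sqrt(corr[i][i] - s)
            else:
                c[i][j] = (corr[i][j] - s) / c[j][j]
    return c


def child_return(r_parent, speed, r_bar, sigma, chol, dt, rng):
    """Sample the return vector of a child node given its parent, as in Eq. (14)."""
    m = len(r_parent)
    e = [rng.gauss(0.0, 1.0) for _ in range(m)]  # independent N(0,1) shocks
    out = []
    for i in range(m):
        shock = sum(chol[i][j] * e[j] for j in range(m))  # correlated shock
        drift = speed[i] * (r_bar[i] - r_parent[i]) * dt  # mean reversion term
        out.append(r_parent[i] + drift + sigma[i] * shock)
    return out


# Illustrative two-asset example with a fixed seed; all numbers are made up.
rng = random.Random(0)
C = cholesky([[1.0, 0.5], [0.5, 1.0]])
sample = child_return([0.01, -0.02], [0.5, 0.2], [0.002, 0.001], [0.02, 0.03], C, 1.0, rng)
```

Calling `child_return` once per child of every node, and then applying the arbitrage-free check, would populate the tree stage by stage.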

The coefficients \(\alpha _i, \hat{r}_i\) and \(\sigma _i\) define the mean reversion coefficient, the return equilibrium and the standard deviation, respectively, of each return process, to be estimated from historical data. Observe that Eq. (14) can be rewritten as

$$\begin{aligned} \Delta r_{i,n-}=\alpha _{i} \hat{r}_i \Delta t_{n-} - \alpha _{i} r_{i,n-} \Delta t_{n-}+\sigma _{i} \, \sum _{j=1}^{m} c_{i,j} e_{j,n}, \end{aligned}$$
(15)

where \(\Delta r_{i,n-} = r_{i,n}-r_{i,n-}\). Thus, with \( a_{i}=\alpha _{i} \hat{r}_i \Delta t_{n-}, \ \ b_{i}=-\alpha _{i} \Delta t_{n-}\), Eq. (15), for each i, takes the form

$$\begin{aligned} \Delta r_{i,n-}= a_i+b_i r_{i,n-}+ \sigma _i \, \sum _{j=1}^{m} c_{i,j} e_{j,n}. \end{aligned}$$
(16)

Equation (16) can then be estimated through a linear regression model with error term \(\sum _{j=1}^{m} c_{i,j} e_{j,n}\). Following Eqs. (14) and (15), due to the assumption on the residuals \(e_{j,n}\), we consider a Gaussian model for the ETFs adopted in the case-study. We will see in Sect. 4.1 that the stylised fact that ETF returns are hardly Gaussian is confirmed in our setting, and the same holds for the market portfolio. In Sect. 4, however, we present evidence that, despite this simple statistical assumption, the model in practice effectively enforces SD conditions with respect to a market portfolio. It may also be argued that, specifically when considering a partial order between probability distributions, the Gaussian assumption has clear limitations. Our focus, however, is on the comparison of an exogenous benchmark distribution with the portfolio return distribution generated by the solution of the optimization problem.
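The regression form (16) maps back to the parameters of Eq. (14) through \(\alpha _i=-b_i/\Delta t\) and \(\hat{r}_i=-a_i/b_i\). Since C is the Cholesky factor of a correlation matrix, each of its rows has unit norm, so the error term has unit variance and the residual standard deviation estimates \(\sigma _i\). The sketch below applies plain OLS to one return series; the function name and the constant time step are assumptions made for illustration.

```python
import math


def estimate_mean_reversion(r, dt=1.0):
    """OLS fit of Eq. (16) on one return series r[0], ..., r[K].

    Returns (speed, r_bar, sigma), i.e. estimates of alpha_i, r_hat_i, sigma_i.
    """
    x = r[:-1]                                        # regressor r_{i,n-}
    y = [r[k + 1] - r[k] for k in range(len(r) - 1)]  # response Delta r_{i,n-}
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope, b_i = -alpha_i * dt
    a = y_mean - b * x_mean       # intercept, a_i = alpha_i * r_hat_i * dt
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    sigma = math.sqrt(sum(e * e for e in resid) / (n - 2))
    return -b / dt, -a / b, sigma


# Noise-free series generated from Eq. (14) with sigma_i = 0 (illustrative).
series = [0.02]
for _ in range(10):
    series.append(series[-1] + 0.3 * (0.002 - series[-1]))
speed, r_bar, sigma = estimate_mean_reversion(series)
```

On the noise-free series the fit recovers the generating parameters up to rounding, which is a convenient sanity check before moving to historical data.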

The asset return vector process must satisfy so-called arbitrage-free conditions. Following Klaassen (1997, 1998, 2002), these can be enforced along the tree in each node n by checking recursively through the simulation process the dual variables associated with the children nodes \(n+\) in every sub-tree: for every \(t \le T-1\) and \(n \in \mathscr {N}_t\), we verify the existence of a strictly positive solution \(v_{s}\) to the system:

$$\begin{aligned} \sum _{s\in n+} v_{s}(1+ r_{i,s})=1, \quad \forall \ 1 \le i \le m. \end{aligned}$$

We see that for a compatible system of equations, to validate the arbitrage-free condition, we require a number of arcs at least equal to the number of assets in the portfolio (Geyer et al., 2010). We apply the algorithm proposed in Barro et al. (2022) to generate the scenarios and check for the absence of arbitrage. We denote the branching structure of a symmetric 4-stage scenario tree by \(\left[ S_1-S_2-S_3-S_4\right] \), where \(S_{t+1}\) defines the number of children of each node \(n \in \mathscr {N}_t\). We do not go into further details here and refer the reader to the references quoted above.

In Sect. 4 we present a set of results based on a rich 4-stage scenario tree with branching degree \(\left[ 40-8-6-6\right] \), resulting in 11520 scenarios at the end of the investment horizon. This scenario structure represents a good compromise between:

  • The computational tractability of the resulting multistage stochastic program which is subject to the curse of dimensionality and

  • the generation stage by stage of a sufficiently well defined benchmark distribution whose ISD conditions we wish to assess.

We provide the required numerical evidence in Sect. 4.
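The size of a symmetric tree follows from its branching structure alone: with \([40-8-6-6]\) the stages contain 40, 320, 1920 and 11520 nodes. As a small sketch, the tree can be stored through ancestor pointers, in the notation of Sect. 3.2; the function names are hypothetical and equiprobable children are an assumption made purely for illustration.

```python
def build_symmetric_tree(branching):
    """Return (parent, stage, prob) lists for a symmetric tree; node 0 is the root."""
    parent, stage, prob = [None], [0], [1.0]
    frontier = [0]
    for t, k in enumerate(branching, start=1):
        new_frontier = []
        for n in frontier:
            for _ in range(k):
                parent.append(n)           # unique ancestor n-
                stage.append(t)
                prob.append(prob[n] / k)   # equiprobable children (assumption)
                new_frontier.append(len(parent) - 1)
        frontier = new_frontier
    return parent, stage, prob


def scenario_path(leaf, parent):
    """Path n, n-, n--, ..., n_0 from a leaf back to the root."""
    path = [leaf]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


parent, stage, prob = build_symmetric_tree([40, 8, 6, 6])
n_scenarios = sum(1 for s in stage if s == 4)   # number of leaf nodes
last_path = scenario_path(len(parent) - 1, parent)
```

By construction the probabilities of the children add up to that of their parent, so the node probabilities sum to one in every stage.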

5 Computational evidence

We present an extended set of results to analyse the main implications of adopting the proposed multistage ICVaR model (13) for portfolio selection. This section includes:

  • Sect. 4.1: the definition of the dataset adopted in the project, together with an outline of the analysis developed in the following sections.

  • Sect. 4.2: the analysis of the evidence emerging from the solution of one instance of a multistage ICVaR problem and the associated ISD evidence.

  • Sect. 4.3: the extension of the results to validate their consistency over 2 years, from 2021 to 2022, thus completing the model in-sample validation.

  • Sect. 4.4: the evidence collected over those 2 years in terms of out-of-sample results.

5.1 Data input and experimental set-up

We present an extended set of results for a 4-stage problem with scenario branching \([40-8-6-6]\) resulting in 11520 scenarios. This specific tree structure, with a high first stage branching degree and a rich set of scenarios, serves two purposes. On the one hand, it aims at deriving sufficiently reliable stochastic dominance results at the end of the first stage, when comparing the one period against the multi-period solutions: we refer to this as the SD requirement. On the other hand, it aims at preserving computational tractability and in-sample stability when solving the multistage problems (Dempster et al., 2011): we refer to this as the computational requirement. Following the evidence in Liu et al. (2021), we consider a minimum of 40 possible realizations of the benchmark portfolio to be sufficient to evaluate ISD conditions. Given this, we wish to determine the number of stages and the associated number of scenarios.

As for the computational requirement, consider in Table 3 the evidence from different instances of the mean-ICVaR problem as the planning horizon increases \(T=1,2,3,4\) and 5. We consider here the case \(\mathscr {L}(\lambda ,\beta ,T)\) with \(\lambda =1\) and \(\beta =0\). Every instance is solved on a laptop with Intel i5 9400 4.1 GHz processor and 16 GB of RAM. The implementation was done in Python version 3.8 with the Gurobi version 9.0.3 solver and the adoption of the dual simplex algorithm.

Table 3 Scenario tree structure and computational evidence, \(\mathscr {L}(1, 0, T)\) with \(T=1,\cdots ,5\)

We see that the 5-stage problem, when maintaining the same tree expansion scheme, would lead to a very large and unsolvable stochastic program. The column \(L_2\)-norm refers to the Euclidean distance between (a) the first four moments of the generated weekly returns’ distributions for every asset class plus the benchmark and (b) the same moments collected from past data and displayed in Table 4.
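The moment-matching distance can be sketched for a single asset as below. The exact convention behind the \(L_2\)-norm column is not spelled out here, so this is an assumption-laden illustration: the four statistics are taken as mean, standard deviation, skewness and kurtosis in their population form, the function names are hypothetical, and the aggregation across assets is left out.

```python
import math


def four_moments(sample):
    """Mean, standard deviation, skewness and kurtosis (population versions)."""
    n = len(sample)
    mean = sum(sample) / n
    dev = [v - mean for v in sample]
    std = math.sqrt(sum(d * d for d in dev) / n)
    skew = sum(d ** 3 for d in dev) / n / std ** 3
    kurt = sum(d ** 4 for d in dev) / n / std ** 4
    return [mean, std, skew, kurt]


def moment_distance(simulated, historical):
    """Euclidean distance between the two vectors of four moments."""
    a = four_moments(simulated)
    b = four_moments(historical)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


# Two made-up return samples, for illustration only.
gap = moment_distance([-0.02, 0.0, 0.01, 0.03], [-0.03, 0.0, 0.02, 0.02])
```

Because skewness and kurtosis are scale-free while the mean and standard deviation of weekly returns are small, the higher moments tend to dominate such a distance unless the components are rescaled.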

The branching degree in the second to the fourth stages needs to account for the arbitrage-free condition discussed at the end of Sect. 3.3, resulting in a number of branches at least equal to the number of assets (Geyer et al., 2010). This is the key motivation for the relatively small set of investment opportunities considered in this case study. The investment universe and the number of stages may be increased by reducing the root node branching degree. This, however, would worsen the approximation of the wealth and the benchmark distributions needed at the end of the first stage to validate the ISD-based partial order. The adoption of an importance sampling approach would be desirable in this context.

The asset universe in this application includes the following five exchange traded funds (ETFs). The first two are representative of the energy (XLE) and the technology (XLK) sectors within the S&P500. These two ETFs are based on the industry partition of SPY, adopted as the benchmark in the optimization problem. Three other ETFs, poorly or negatively correlated with the benchmark, are also considered to facilitate portfolio diversification: the SPDR gold shares (GLD), which tracks the performance of gold bullion, an ETF for long-term (7 to 10 years) treasury bond investments (IEF), and finally an ETF constructed to track the US dollar performance (USDU). In addition, we have a cash account with null return. We assume no transaction costs in this section, so all results are generated with \(c_b=c_s=0\) in the problem specification. The benchmark is the S&P500 market index (SPY).

Table 4 provides a set of descriptive statistics of weekly returns over the 2019-2022 period. The columns refer to the assets listed above. In the rows, Mean is the historical average weekly return of the asset class, Std the weekly standard deviation of the returns, Max and Min the maximum and minimum weekly return over the period, Skewness and Kurtosis the third and fourth moments, and the Sharpe ratio is the ratio between the weekly average return and the standard deviation. The same notation applies to Tables 5 and 6, and the same asset labels are adopted in the following tables of this section.

Table 4 Statistics of assets’ weekly return rates between 01/01/2019 and 31/12/2022

The rationale for including the ETFs of precious metals (GLD), medium term bonds (IEF) and the US currency (USDU: relative to a basket of convertible currencies) comes from their correlation with the benchmark portfolio SPY, as shown in Table 5, resulting in an anti-cyclical behaviour and a greater diversification potential of a dynamic strategy. Following the remark in Sect. 3.3, we see in Table 4 that, consistently with canonical financial evidence, the historical data of the ETFs as well as of the benchmark are not Gaussian, as assumed in the statistical model. We address this issue in two ways: first, by showing in Table 6 the error induced by such an assumption in the \(L_2\) norm and by relying on the Kolmogorov-Smirnov (KS) test; second, by presenting in the final section in- and out-of-sample evidence of the performance of optimal investment strategies under the given statistical assumptions. A more accurate statistical calibration and model development would likely lead to improved financial performance and a more effective risk control. We will see, however, that even under such a simplifying assumption, the core contribution of this work will stand.

Table 5 Correlation matrix of weekly return rates between 01/01/2019 and 31/12/2022

To further motivate the adopted scenario tree, as for the SD requirement, we show in Table 6 the outcome of the scenario generation for every asset. Given the adopted Gaussian assumptions, at least the first two moments of the historical distributions are sufficiently well approximated by the simulated distributions. The aggregate evidence on the \(L_2\)-norm in Table 3 is now decomposed for each asset.

Table 6 Simulated weekly asset returns distributions at the last stage. Scenario tree \([40-8-6-6]\)

A null value of the KS leads to accepting the null hypothesis that the residuals come from a standard normal distribution. We are comparing the returns’ standardised historical distribution and the standardised simulated returns distribution.

From the evidence in Tables 4 and 6 we can anticipate that the returns of the ETFs XLE and XLK show a higher standard deviation (volatility) than the S&P500, and in this restricted asset universe may be considered risky assets. On the other hand, GLD shows similar volatility and Sharpe ratio, while the ETFs IEF and USDU may be considered the least risky investments, and are also the least correlated with the benchmark.

As a compromise between SD and computational requirements, we will thus focus on a set of 4-stage mean-ICVaR problem instances. We recall that every instance is specified by the trade-off parameter, the reference point \(\beta \) and the planning horizon, as \(\mathscr {L}(\lambda ,\beta ,4)\), where the mean-CVaR problems, for \(\beta =0\), will sometimes be denoted as \(\mathscr {G}(\lambda , T)\). Following the evidence in Fig. 4, the parameter \(\beta \), specified in terms of weekly returns, is assumed to be greater than or equal to \(-0.03\). In every experiment, the case \(\beta =0.01\) was indeed found to be equivalent to the classical CVaR formulation, which holds for \(\beta =0\). In all instances, we set \(\alpha =0.95\) to define the ICVaR tolerance.

Section 4.2 is structured in two parts. The first discusses the financial properties and main evidence emerging from the optimal solution of problem instances \(\mathscr {L}(\lambda ,\beta ,T)\) for different specifications of the arguments. The second focuses mainly on the evidence linking the solution of the ICVaR problem to the ISD conditions. These are assessed ex-post from the solution of problem (13) by estimating the \(k.q_{\beta }\) for which \(W \succeq _{k.q_{\beta }} Y\).

In what follows we refer to in-sample validation as including those analyses aimed at validating computationally the set of properties laid down in Sects. 2 and 2.2, in particular with reference to the ISD-based formulation and the classical mean-CVaR problem formulation proposed by Rockafellar and Uryasev (2002), Consigli et al. (2016) and Chen et al. (2016). We also wish to verify the advantages, if any, of undertaking a dynamic approach. Out-of-sample validation will instead simply refer to the results collected when replacing the random asset returns with those actually realized in the market, mainly to assess the effectiveness of the optimal portfolios in terms of market performance and risk control.

5.2 In-sample model validation

We consider in this section only one instance of an optimal portfolio problem defined at the beginning of January 2021 to collect qualitative information on a four stage problem \(\mathscr {L}(\lambda ,\beta ,4)\) (\(T=4\) weeks). By varying \(\lambda \) and \(\beta \), we collect a rich set of results including, for \(\beta =0\), the mean-CVaR solutions. In Sect. 4.2.1, we examine the diversification of the root portfolio and some key statistical evidence. In Sect. 4.2.2, we validate the established relationship between the ICVaR measure and stochastic dominance by examining the gap function and bound functions established in Sects. 2.2 and 3.1.

5.2.1 ICVaR model validation

We first analyse the evidence on the optimal root node portfolios of a 4-stage mean-ICVaR problem \(\mathscr {L}(\lambda ,\beta ,4)\), mainly to assess their diversification properties. As a comparison, we also study, as a special case, the multistage mean-CVaR solution of \(\mathscr {G}(\lambda ,4)\). We present results for \(\lambda =\{1,0.75,0.5,0.25,0\}\), which rules the trade-off between the expected wealth and the ICVaR measure in the objective function, and for \(\beta =\{-0.03,-0.02,-0.019,-0.01,0\}\), which specifies the shortfall distribution in the tail. The rationale for \(\beta =-0.019\) will be given below.

Table 7 shows the optimal root-node solution of \(\mathscr {L}(\lambda ,\beta ,4)\) and \(\mathscr {G}(\lambda ,4)\) for each problem instance. In the last two columns we display the values of the Herfindahl-Hirschman index (HHI) and the Shannon entropy (SE) associated with the optimal portfolio in node \(n_0\). For \(\lambda =0\), as expected, the optimal portfolio is always a fully concentrated corner solution, with all the wealth allocated to XLK, the asset with the highest expected return: this evidence is independent of \(\beta \).
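To make the two diversification measures concrete, the following minimal Python sketch computes them for a vector of portfolio weights. It assumes that the weights are nonnegative and sum to one, that the HHI is the sum of squared weights and that the SE uses the natural logarithm; the function names and the two example allocations are hypothetical and are not taken from Table 7.

```python
import math

def hhi(weights):
    """Herfindahl-Hirschman index: sum of squared portfolio weights."""
    return sum(w * w for w in weights)

def shannon_entropy(weights):
    """Shannon entropy sum(-w * ln w), with the convention 0 * ln 0 = 0."""
    return sum(-w * math.log(w) for w in weights if w > 0)

# Hypothetical root-node allocations over five assets (not the paper's data).
corner = [1.0, 0.0, 0.0, 0.0, 0.0]   # all wealth in a single asset
uniform = [0.2] * 5                  # equally weighted portfolio

print(hhi(corner), shannon_entropy(corner))
print(hhi(uniform), shannon_entropy(uniform))
```

Under these conventions a fully concentrated corner solution attains the maximum HHI of 1 and zero entropy, while an equally weighted portfolio over \(n\) assets attains the minimum HHI of \(1/n\) and the maximum entropy \(\ln n\).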

Table 7 Optimal root node portfolio of \(\mathscr {L}(\lambda ,\beta ,4)\) for different \(\lambda \) and \(\beta \)

We can summarize the following evidence from Table 7.

  • For any \(\lambda \), when \(\beta =0\) the optimal solution fully agrees with the mean-CVaR solutions denoted by \(\mathscr {G}(\lambda ,4)\). We can also notice that, for every \(\lambda \), as \(\beta \rightarrow 0-\) from below the portfolio composition converges to the mean-CVaR one and displays a higher diversification.

  • As \(\lambda \) decreases to 0, as a result of a decreasing relevance of the ICVaR in the objective function, we see a progressive reduction of the optimal portfolio diversification and an increasing concentration in the XLK asset, which carries the highest expected return as shown in Table 6. This is the main reason for limiting the analysis in Table 8 to the cases with \(\lambda \ge 0.5\).

  • For \(\lambda \le 0.5\), furthermore, we see that the root node solution is largely insensitive to \(\beta \), with minimal variations of the optimal portfolio composition.

  • The set of results for \(\lambda =1\) is in our context of specific interest: the focus is entirely on the terminal ICVaR measure estimated on the portfolio return distribution and the S&P500 distribution. We see that in this case the root node portfolio is well diversified and relatively stable, with a high weight of the S&P500 industry subsectors. We show in Table 8 that the case \(\lambda =1\) is also the one that leads to the strongest in-sample ISD-1 order.

In Table 8, we present a set of results for each problem instance, now however restricting the evidence to \(\lambda =\{1,0.75,0.5\}\), \(\beta =\{-0.03,-0.02,-0.019,-0.01,0\}\) and considering jointly the one stage and the multistage problems: \(T=\{1,4\}\).

The set of instances with \(T=1\) implies the solution of a one period problem based on the first branching only, thus with 40 leaf nodes (a scenario fan). The following notation is adopted in the Table: for each \(\lambda \) we denote with \(W_1, \ W_{1,4}\) and \(\bar{W}_4\), respectively, the portfolio statistics in \(T=1\) when solving \(\mathscr {L}(\lambda ,\beta ,1)\), in \(t=1\) (end of the first stage) when solving \(\mathscr {L}(\lambda ,\beta ,4)\), and finally the weekly statistics over 4 stages, again from the solution of the 4-stage problem. Here \(\bar{W}_4\) is specified as a geometric mean wealth, to account for stage-by-stage compounding effects. The numerical evidence in Table 8 is thus all based on one homogeneous weekly stage with the only exception, discussed below, of the ISD information at \(T=4\). We report the average E(W), the standard deviation \(\sigma (W)\), their ratio, which defines the popular Sharpe ratio (SR), and the \(95\%\) CVaR. When \(\lambda =1\) we add in every section evidence on the ISD-\(k.q_{\beta }\) estimated on the portfolio against the benchmark distributions after the problem solution.
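The statistics listed above can be illustrated on a discrete scenario set. The sketch below is only indicative and rests on several assumptions that are not spelled out in the text: the CVaR is taken as the probability-weighted average of the worst \(1-\alpha \) fraction of outcomes, the ratio is computed on whatever quantity is passed in (wealth or return), and the weekly geometric mean is the fourth root of terminal over initial wealth. Function names and the sample data are hypothetical.

```python
import math

def summary_stats(wealth, probs, alpha=0.95):
    """Return (mean, std, mean/std, CVaR) of a discrete distribution."""
    mean = sum(p * w for w, p in zip(wealth, probs))
    std = math.sqrt(sum(p * (w - mean) ** 2 for w, p in zip(wealth, probs)))
    # Assumed convention: CVaR is the probability-weighted average of the
    # worst (1 - alpha) fraction of outcomes, so lower values are worse.
    tail = 1.0 - alpha
    remaining, acc = tail, 0.0
    for w, p in sorted(zip(wealth, probs)):
        take = min(p, remaining)
        acc += take * w
        remaining -= take
        if remaining <= 1e-12:
            break
    return mean, std, mean / std, acc / tail

def weekly_geometric_mean(terminal_wealth, stages=4, initial_wealth=1.0):
    """Per-stage wealth factor implied by compounding over `stages` weeks."""
    return (terminal_wealth / initial_wealth) ** (1.0 / stages)

# Hypothetical 40-leaf scenario fan with equal probabilities (not the paper's data).
wealth = [0.90, 0.95] + [1.00] * 38
probs = [1.0 / 40] * 40
mean, std, ratio, cvar = summary_stats(wealth, probs)
print(mean, std, ratio, cvar)
print(weekly_geometric_mean(1.04060401))
```

With 40 equiprobable leaves and \(\alpha =0.95\), the tail contains exactly the two worst scenarios, which is why a fan of this size is the coarsest one on which the \(95\%\) CVaR averages more than a single outcome.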

Table 8 Instances \(\mathscr {L}(\lambda ,\beta ,4)\): key evidence, comparison with one period solution

Table 8 provides the core information we rely upon to motivate the multistage formulation from a financial as well as an SD-related perspective. As already mentioned, all the statistics refer to weekly stages and we always have an initial endowment of 1 monetary unit. The S&P500 benchmark portfolio also carries a normalised unit value in \(t=0\).

  • Consider the first set of evidence for \(\lambda =1\).

    • For \(T=1\), as \(\beta \) increases the risk-adjusted returns (Sharpe ratio) also increase. Furthermore, when checking the stochastic dominance of the portfolio against the benchmark distribution at the end of the first stage, we see as expected that second-order stochastic dominance is guaranteed as \(\beta \rightarrow 0-\), and for \(\beta =-0.019\) we have the strongest ISD-1 order.

    • For \(T=4\), here but also for \(\lambda =0.75\) and 0.5, we see that the first stage statistics based on \(W_{1,4}\) confirm the \(T=1\) evidence, but once we consider the average risk-adjusted returns and the ISD partial order computed in \(T=4\) the results improve significantly relative to the cases \(T=1\). Furthermore as \(\beta \) increases to 0 we see that the ISD order decreases. We show in the next subsection that such evidence is consistent with the properties of the lower bounds.

  • As \(\lambda \) decreases to 0.5, as expected, we observe that:

    • For \(T=1\) the expected return increases, as do the risk-adjusted returns, while the \(\textrm{CVaR}_{95\%}\) worsens.

    • For \(T=4\), similar evidence as before with the first stage statistics mostly confirming those collected when \(T=1\) and a significant improvement of weekly statistics for \(\bar{W}_4\).

  • Essentially for any \(\lambda \) and \(\beta \), when comparing the evidence of the second (\(W_{1,4}\)) and third (\(\bar{W}_4\)) subsections, we see that the extension to a multi-period model leads jointly to a higher financial performance per subperiod and to stronger ISD orders. This is the primary motivation to consider a multistage rather than a one period problem formulation.

In the next Sect. 4.2.2 we will concentrate on the ICVaR-ISD relationship for \(\lambda =1\), then in Sect. 4.3 we verify the general consistency of the evidence analysed so far.

5.2.2 ICVaR problem solution and ISD evidence

The purpose of this section, based on the solution of problem (13), is to verify the implications of the ICVaR maximization on the stochastic dominance \(W_t \succeq _{k.q_{\beta }} Y_t\), relying on the introduced gap and boundary functions. In Sect. 2 the non-negativity of the gap function was put in direct relationship with SD partial orders. Furthermore the lower bound \(\zeta _{2,t}\) was linked to the ICVaR through Eq. (6), so that ISD-2 dominance should come up as a by-product when solving the optimization problem.

The analysis below relies on the following variables, that we summarize briefly.

  • We introduce for the gap functions \(\mathscr {H}_k(Y_t,W_t,\beta )\) a new notation to ease the comparisons and to associate the analysis more explicitly with the problem instance: namely for \(k=1,2\) the gap functions are now denoted by \(\delta _{k,t}(\lambda ,\beta , T)\). This notation includes \(t \le T\) based on the benchmark \(Y_t\) and portfolio evolution \(W_t\) evaluated in t and generated by the solution of problem \(\mathscr {L}(\lambda ,\beta , T)\). We recall from Sect. 2 that the gap functions capture the divergence between the portfolio and benchmark distributions. Through the dynamic extension, we can monitor the evolution in each stage \(t\le T\) of the distance between those distributions.

  • To verify the implication on the ISD order of alternative specifications of \(\beta \), we compute \(\zeta _{k,t}(1,\beta , T)\) for \(k=1,2\) and derive ISD-\(k.q_{\beta }\) information from the solution of \(\mathscr {L}(1,\beta ,T)\) with \(\beta =\{-0.03\), \(- 0.02,- 0.019, - 0.01, 0\}\).
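As a point of reference for the integer orders only, the following Python sketch checks classical first and second-order dominance between two discrete distributions through the empirical functions \(F_1(x)=P(X\le x)\) and \(F_2(x)=E[(x-X)^+]\). It is not the \(\beta \)-dependent gap function \(\mathscr {H}_k\) of Sect. 2, whose definition is not reproduced here; the function names, the numerical tolerance and the sample data are assumptions made for illustration.

```python
def f1(sample, probs, x):
    """First-order distribution function P(X <= x)."""
    return sum(p for s, p in zip(sample, probs) if s <= x)

def f2(sample, probs, x):
    """Second-order distribution function E[(x - X)^+]."""
    return sum(p * max(x - s, 0.0) for s, p in zip(sample, probs))

def min_gap(w, pw, y, py, order):
    """Smallest value of F_k(Y)(x) - F_k(W)(x) over the joint support."""
    f = f1 if order == 1 else f2
    grid = sorted(set(w) | set(y))
    return min(f(y, py, x) - f(w, pw, x) for x in grid)

def dominates(w, pw, y, py, order, tol=1e-12):
    """True if W dominates Y at the given integer order (1 or 2)."""
    return min_gap(w, pw, y, py, order) >= -tol

# Hypothetical benchmark Y and portfolio W with equal scenario probabilities.
y, py = [0.9, 1.1], [0.5, 0.5]
w, pw = [1.0, 1.0], [0.5, 0.5]
print(dominates(w, pw, y, py, order=1), dominates(w, pw, y, py, order=2))
```

For discrete distributions it is enough to evaluate the differences on the union of the two supports, since \(F_1\) is a step function and \(F_2\) is piecewise linear with kinks only at support points.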

In Table 9, given \(\lambda =1\) and \(T=4\) we show the numerical results for different values of \(\beta \) as t increases to T.

Table 9 Bounds behaviour and ISD conditions for several problems specification

Table 9 allows several remarks. For our purposes it does provide the key information we are after: the solution of problem \(\mathscr {L}(1,0,4)\) guarantees a very tight lower bound to SSD, a null difference \(F_2(Y_t)-F_2(W_t)\) for every \(t=1,2,3,4\) and it is sufficient to lead to first-order ISD conditions in each stage.

  • The lower bound \(\zeta _{2,t}(1,\beta , 4 )\) decreases when t increases, suggesting that the bounds at terminal stage \(t=4\) control the lower bounds over the previous stages.

  • The function \(\delta _{2,t}(1,\beta , 4 )\) is mostly null for any t and \(\beta \). This shows that SSD conditions are effectively enforced in a multi-period framework.

  • The ISD order improves generally when moving from \(t=1\) to \(t=4\). This suggests that in a multi-period model, the stochastic dominance order is refined over the stages.

  • The previous results show that the ISD-2 relationship holds through the nonnegativity condition of the function \(\delta _{2,t}\). For the first-order case \(k=1\), we see that the \(\delta _{1,t}\) are generally nonpositive for \(t=1,2\) but nonnegative afterwards, and always nonnegative for \(\beta =-0.03\): this is a sufficient condition for \(W_t \succeq _{1.q_{\beta }} Y_t\), which is indeed confirmed in the lowest part of the table for \(t=3,4\).

To complement this analysis we plot in Fig. 5 the ISD-\(k.q_{\beta }\) order after solving a sequence of optimization problems \(\mathscr {L}(1,\beta ,T)\) as \(\beta \) varies when \(T=\{1,4\}\). For \(T=1\) we evaluate the ISD order for \(\beta =\left[ -0.03,0;0.003\right] \) with 0.003 steps. Observe that the SSD condition mostly holds over the specified domain and for \(\beta \in (-0.019,-0.017)\) the strongest ISD-1 order is attained, which motivates the inclusion of \(\mathscr {L}(\lambda ,-0.019,T)\) as a problem instance in several results. Interestingly, for \(T=4\), when increasing \(\beta \) in \(\mathscr {L}(1,\beta ,4)\), here with \(\beta =\left[ -0.03,0; 0.0015 \right] \), the strongest ISD is attained around the same values. Surprisingly, as \(\beta \rightarrow 0-\) the ISD-1 remains very low. The evidence confirms that the error functions give sufficient but not necessary conditions for ISD-1.

Fig. 5

ISD-order associated with two ICVaR models. Left: \(\mathscr {L}(1,\beta ,1)\) and right: \(\mathscr {L}(1,\beta ,4)\). Here, for \(\beta \), we consider on the left 100 evaluations and on the right 20 evaluations, with sample points equally spaced in \([-0.03,0]\)

We wish to further substantiate the claim that, when extending the investment horizon, the first stage SD conditions will not be jeopardized: consider for this purpose Fig. 6 and the behaviour of \(\zeta _{2,1}(1,\beta ,T)\) and \(\delta _{2,1}(1,\beta ,T)\) for increasing \(\beta \) and \(T=1,2,3,4\). The left plot clearly shows that the first-stage lower bound is increasingly tight as \(\beta \) increases to 0 and that this result does not depend on T. The right plot, furthermore, shows that for \(\beta \ge -0.019\) SSD is surely supported by the solution at the end of the first stage, and that for \(T=4\) it does not even depend on \(\beta \). This shows that when increasing the investment horizon, the ICVaR maximization enforces SSD, and actually, as shown in Table 9, may lead to ISD-1 conditions already at the end of the first stage.

Fig. 6

Estimates of \(\zeta _{2,1}(1,\beta ,T)\) (left) and \(\delta _{2,1}(1,\beta ,T)\) (right) for \(\beta =\{-0.03,-0.02,-0.019,-0.01,0\}\) and \(T=\{1,2,3,4\}\). Here, the branching structure for \(T=1\) is [40], for \(T=2\) is \([40-8]\), for \(T=3\) is \([40-8-6]\) and for \(T=4\) is \([40-8-6-6]\)

Finally in Fig. 7 we plot the second and first-order probability distributions induced by the solution of problems \(\mathscr {L}(1,- 0.03,T)\) for \(T=1,2,3,4\) at the end of the first period and, respectively, at the horizon. The plots on the left thus allow the comparison of the second-order SD, always in \(t=1\), as the planning horizon is extended. Both second and first-order distributions in the first row refer to the one period case \(T=1\), for which from Table 8 we have ISD-\(k.q_{\beta }=1.6125\). For \(T=2,3,4\) we show row-wise the second and first-order distributions of, respectively, \(W_{1,T}\) and \(W_T\). For each pair we see that essentially as the investment horizon is extended the ISD orders strengthen, and when \(T=4\) we have ISD-\(k.q_{\beta }=1.913\) on the left and ISD-\(k.q_{\beta }=1.373\) on the right, at the horizon. In general the end of first stage stochastic order is mostly preserved as T increases and the stochastic order at the horizon improves.

Fig. 7

Second-order (left column) distributions for the first stage and first-order (right column) cumulative distributions for the \(t=T\) stage, varying \(T=1,2,3,4\), for the model \(\mathscr {L}(1,-0.019,T)\)

5.3 Consistency analysis, multistage problem solution

In this section we select a set of sub-problems and present evidence on the consistency over time of the key conclusions reached in Sects. 4.2.1 and 4.2.2. We wish to support the main results presented in Sect. 4.2, specifically devoted to the financial and the ISD properties of the solutions. To this aim we develop a rolling window procedure based on 3 years (152 weeks) of data for statistical model estimation and scenario generation, and solve the optimization problem (13) over the following T weeks, for \(T=\{1,4\}\). Starting from the first sample based on 2018/01/07-2021/01/03 data, we repeat the process with weekly steps and fixed 152-week rolling windows to derive and test a sequence of 104 optimal solutions, always based on a 4-stage scenario tree with structure \([40-8-6-6]\).
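The bookkeeping of the rolling window procedure can be sketched as follows. The snippet only generates the estimation windows and counts the leaves of the scenario tree; model estimation, scenario generation and the solution of problem (13) on each window are not shown. The function name and the indexing convention (integer week indices, end exclusive) are assumptions made for illustration.

```python
import math

def rolling_windows(n_obs, window=152, n_solves=104):
    """Return (start, end) index pairs, end exclusive, moving one week per step."""
    pairs = []
    for k in range(n_solves):
        if k + window > n_obs:
            break
        pairs.append((k, k + window))
    return pairs

# Number of leaf scenarios of a tree with branching structure [40-8-6-6].
branching = [40, 8, 6, 6]
n_leaves = math.prod(branching)

# 104 weekly re-estimations need 152 + 103 weekly observations in total.
windows = rolling_windows(n_obs=152 + 103)
print(len(windows), windows[0], windows[-1], n_leaves)
```

Each window would feed one estimation and one optimization run, so the 104 test problems share the same tree structure but differ in the estimated return process.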

5.3.1 Risk-reward analysis and portfolio diversification

Consider the following problem instances: \(\mathscr {L}(\lambda ,\beta ,T)\) for \(\lambda =\{0.5,1\}, \beta =\{- 0.03,0\}\) and \(T=\{1,4\}\). When computing the one-stage models, we take the 40 branches in the first stage of the scenario tree as input data.

For each test-problem, we display in this case the time-averages of the variables already introduced in Table 8. In summary: for \(T=1\) we derive the end-of-the-week wealth distribution and statistics thereupon, while for the multistage models \(\mathscr {L}(\lambda ,\beta ,4)\) and \(\mathscr {G}(\lambda ,4)\) we compute end-of-the-first-week evidence, denoted by \(W_{1,4}\), and the weekly average wealth \(\bar{W}_4\) and statistics thereupon. In the last section of the table, based on the 4-stage weekly mean \(\bar{W}_4\), the ISD-\(k.q_{\beta }\) evidence refers to the end of the month: jointly with ISD-\(k.q_{\beta }(W_{1,4})\), it helps assess the advantages of considering the multistage formulation. In general ISD estimates are only considered when \(\lambda =1\). We focus on a subset of problem instances, namely \(\beta =\{-0.03,0\}, \ \lambda =\{0.5,1\}\) and \(T=\{1,4\}\), sufficient for our purposes, and compute the ISD conditions over the entire dataset and conditional on the gap functions \(\delta _{2,t}\ge 0\) for \(t=1,4\).

Table 10 Time-average statistics of optimal portfolios from 104 test-problems, January 2021 to December 2022

For \(\lambda =1, T=4\) and \(\beta =\{- 0.03,0\}\) we also show in Fig. 8 the HHI values associated with the optimal root node portfolios, to confirm the good diversification properties of these model instances over the 2021–2022 period.

Fig. 8

HHI index associated with the root node portfolio of \(\mathscr {L}(1,-0.03,4)\) model (left) and \(\mathscr {G}(1,4)\) model (right) over 104 weeks from January 2021 until December 2022

The evidence in Table 10 essentially confirms the results collected in Sect. 4.2.1. We recall that we are considering results averaged over 2 years of weekly data:

  • For given T, reducing \(\lambda \) and increasing \(\beta \) leads to higher Sharpe ratios on average and lower diversification. From Fig. 8 we see that indeed over the 2 years the optimal portfolios’ HHI remains above 0.3.

  • When increasing T, the Sharpe ratios increase both at the end of period 1 and on average, for any \(\lambda \) and \(\beta \).

  • For \(\lambda =1\) as T increases the ISD-\(k.q_{\beta }\) improves significantly and the strongest degree is reached ex-post on average for \(\beta =-0.03\).

  • When conditioning on problem instances for which \(\delta _{2,1}\ge 0\), furthermore, for \(T=1\) or 4 we see first that the condition holds with higher frequency when extending the planning horizon and then that the ISD evidence improves significantly.

  • In the 4-stage problem, furthermore, we see that the gap function nonnegativity condition is met with higher frequency and indeed the resulting average ISD orders are close to FSD under either \(\beta =-0.03\) or \(\beta =0\).

We analyse further the relationship between the ICVaR maximization and the ISD in the following section.

5.3.2 ICVaR-ISD consistency

Figures 9 and 10 help extending a relevant set of remarks raised in Sect. 4.2.2 for a single problem instance to the several instances of the 2021–2022 period. We also complement the evidence discussed in Sect. 4.3.1. In particular we assess the relationship between the evolution of \(\delta _{2,t}\) for \(t=\{1,4\}\) and the ISD conditions estimated after the problem solution. We limit the evidence to the instances \(\mathscr {L}(1,\beta ,4)\) for \(\beta =\{-0.03,0\}\) and display the probability distributions of \(\delta _{2,1}\) and \(\delta _{2,4}\) in Fig. 9 and of ISD-\(k.q_{\beta }(W_{1,4})\) and \((\bar{W}_4)\) in Fig. 10.

Fig. 9

Cumulative distribution over 2021–2022, 104 weeks for \(\delta _{2,1}\) (blue) and \(\delta _{2,4}\) (red), different ICVaR models

The left and right plots of \(\delta _{2,t}\) in Fig. 9 differ only in the reference point: \(\beta =-0.03\) on the left and \(\beta =0\) on the right. In this latter case we are thus considering the \(CVaR_{0.95}\) function. The blue lines are associated with the end of stage 1 and the red lines with the end of stage 4. For \(\beta =-0.03\) we see that, given the problem solution, after 1 week the second-order distributions agree in more than \(50\%\) of the cases, a percentage that increases to \(70\%\) for \(t=T=4\). The probability that the two distributions differ by more than \(1.5\%\) at the horizon is null, as is the probability that they differ by more than \(7\%\) after 1 stage. Similar evidence is found when \(\beta =0\). In either case we see that the condition for SSD and possibly ISD-1 is met at the end of the planning horizon and improves significantly stage by stage. Consider now Fig. 10: we find a similar pattern, with the left plot slightly better than the right.

Fig. 10

Cumulative distributions over 2021–2022, 104 weeks, for ISD-\(k.q_{\beta }(W_{1,4})\) (blue) and ISD-\(k.q_{\beta }(W_4)\) (red) for different models

For \(\beta =-0.03\) we see that ISD-1 is attained at the horizon in \(60\%\) of the cases and with probability \(50\%\) the solution leads to \(W\succeq _{1.2} Y\). After 1 period with probability \(50\%\) we have \(W\succeq _{2.0} Y\). On the left we see that out of all the experiments in roughly \(22\%\) of the cases we have TSD only in the case of \(\beta =-0.03\) and this probability increases to roughly \(30\%\) if \(\beta =0\). On the opposite side of the distribution when \(\beta =-0.03\) we see on the left plot that FSD is attained with \(25\%\) probability and for \(\beta =0\) with probability \(32\%\) at the horizon.

In summary, the ICVaR maximization induces ISD-1 dominance with high probability both at \(T=4\) and at the end of the first stage.

5.4 Out-of-sample analysis

The same rolling window approach described in Sect. 4.3 is adopted to derive a set of optimal root node portfolios whose performance is back-tested against actually realized market returns out-of-sample over the 2021–2022 period. In the US market these 2 years are characterized by the 2021 rebound and very positive trend after the 2020 pandemic-related crisis, and by the 2022 negative and volatile trend due to tight monetary policy, resulting in increasing interest rates as a response to growing inflationary pressures.

In Table 11, we show out-of-sample return statistics for a selected set of 1 and 4 period optimal portfolios, for \(\lambda =\{0.5,1\}\), \(\beta =\{-0.03,-0.019,0\}\) and for \(T=\{1,4\}\), again just to compare static, one period results with multistage results. We assume a unit initial wealth at the beginning of the out-of-sample period and compute the cumulative wealth process by compounding the out-of-sample portfolio returns. We show the terminal wealth of each model in column \(W_2\) of Table 11 and the associated portfolio dynamics in Fig. 11. Here, \(W_{0.5}\), \(W_1\), \(W_{1.5}\) and \(W_2 \) denote the portfolio wealth after half a year, at the end of the first year, after one and a half years and after two years, respectively. In the last three columns we display, over all weeks in the case study, in column WR the proportion of weeks in which the optimal portfolio outperforms the benchmark S&P500, and in \(\mathrm{E(ER)}_{+}\) and \(\mathrm{E(ER)}_{-}\) the weekly average out-of-sample excess returns above and below the benchmark S&P500, respectively. We rely on this last set of evidence to convey the statistical significance of the results.
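The back-testing quantities described above can be sketched in a few lines of Python. The sketch assumes that WR counts the weeks with a strictly positive excess return over the benchmark and that \(\mathrm{E(ER)}_{+}\) and \(\mathrm{E(ER)}_{-}\) are the averages of the positive and of the negative weekly excess returns, respectively; the function name and the four-week sample are hypothetical and unrelated to Table 11.

```python
def backtest(port_returns, bench_returns):
    """Return (wealth path, win rate, mean positive ER, mean negative ER)."""
    wealth, path = 1.0, []          # unit initial wealth
    for r in port_returns:
        wealth *= 1.0 + r           # compound the realized weekly return
        path.append(wealth)
    excess = [rp - rb for rp, rb in zip(port_returns, bench_returns)]
    pos = [e for e in excess if e > 0]
    neg = [e for e in excess if e < 0]
    win_rate = len(pos) / len(excess)
    mean_pos = sum(pos) / len(pos) if pos else 0.0
    mean_neg = sum(neg) / len(neg) if neg else 0.0
    return path, win_rate, mean_pos, mean_neg

# Hypothetical weekly returns of a portfolio and of its benchmark.
port = [0.01, -0.02, 0.03, 0.00]
bench = [0.00, -0.01, 0.01, 0.01]
path, wr, er_pos, er_neg = backtest(port, bench)
print(path[-1], wr, er_pos, er_neg)
```

Reading the win rate together with the two conditional averages separates how often a portfolio beats the benchmark from how much it gains or loses when it does.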

Table 11 Back testing results over 104 weeks from 2021/01/03 to 2022/12/25 for each problem instance (rows): first 5 columns on wealth evolution; second 5 columns on statistics of weekly returns and last 3 columns on corresponding performance statistics

The evidence from Table 11 (see also the plots in Fig. 11) is interesting and can be analysed by considering first the one period against the multistage results, then the risk-reward trade-offs and finally the reference point \(\beta \). In the bottom row we have the S&P500: roughly \(22\%\) return in the first year, which goes to 0 in the first semester of 2022 and then increases slightly during a volatile period:

  • In general the extension to 4 stages is beneficial on every model instance, and we see that the mean-CVaR optimal portfolios are those that track most closely the benchmark dynamics, both during the positive 2021 and during the negative 2022.

  • For \(T=1\) we see that in 2021 the optimal portfolios either track the S&P500, in a few cases, or do not exploit the positive market year, while in 2022 essentially none of them avoids the increasing volatility and negative trend. This evidence changes when considering the optimal first stage portfolios associated with \(T=4\): most optimal portfolios track the index in 2021 and a significant subset of them, even if volatile, do not lose during 2022.

  • For \(\lambda =0.5\) and decreasing \(\beta \) we see that in the multistage model the optimal portfolios significantly outperform the mean-CVaR solutions \(\mathscr {G}(0.5,4)\). The same holds for \(\lambda =1\), and the optimal portfolios from \(\mathscr {L}(\lambda ,-0.03,4)\) are those with the best performance and hedging effectiveness out-of-sample.

  • From the last three columns, we see that, unlike in the one period case, under the multistage formulation the control portfolio outperforms the benchmark in a proportion of weeks significantly above \(50\%\) in all instances, and of \(50\%\) when the mean-CVaR problem is solved. Furthermore, under any problem formulation the expected positive excess returns exceed in absolute value the expected negative excess returns.

Fig. 11

Out-of-sample evidence over 104 weeks from January 03 2021 until December 25 2022, several selected problem instances

6 Conclusion

This article provides an in-depth analysis of the relationship between a multistage mean-risk portfolio selection problem and the resulting stochastic dominance relationships between the optimal portfolio and a benchmark market portfolio. As a risk measure, we propose the terminal interval Conditional Value-at-Risk, whose relationship with interval stochastic dominance is exploited throughout to analyse in- and out-of-sample evidence and support the proposed modeling framework.

We show that without explicitly introducing stochastic dominance constraints in the problem formulation, thanks to the established theoretical relationships between the ICVaR and ISD orders, depending on the adopted mean-ICVaR problem formulation, we are able to enforce strong SD conditions in every stage by solving a multistage stochastic linear programming problem. This research is motivated from modelling and computational perspectives, and it does entail several steps that we wish here to summarize and frame in view of possible extensions and colleagues’ contributions.

  • We extend a classical mean-CVaR optimization problem to several stages and, through the ICVaR, to instances deeper in the tail, whose relationship with underlying stochastic dominance principles is analysed thoroughly and validated computationally. The advantages and implications of the multi-period extension are discussed in detail.

  • ISD ordering allows by definition a continuous spanning of partial orders from FSD to TSD and, in our context, it has been instrumental, after solution and depending on the problem formulation, to infer the resulting stage-dependent dominance relationship between the optimal portfolio and benchmark return distributions.

  • In a comprehensive computational study, the proposed lower bounds on the gap functions, consistently with the sufficient conditions proven in the article, are shown to enforce first and second-order ISD principles under several problem specifications.

  • The adopted discrete representation of the mean-ICVaR problem relies on a rich scenario tree with several thousand scenarios over a very short planning horizon, namely 1 month: this resulted in a very large-scale problem. The root node branching degree was kept rather high in our experiments, mainly to help study the SD implications in the first stages.

  • The results rely on a relatively standard scenario tree generation method applied to a mean-reverting Gaussian vector return process, adopted both for the assets’ return vector process and for the benchmark: these are inputs to the stochastic optimization problems, whose SD results do inevitably depend on the adopted stochastic assumptions. We have shown in the computational results, however, that such an assumption does not jeopardise the effectiveness of the mean-ICVaR trade-off paradigm.

Starting from this last point, in consideration of the statistical properties of the adopted data set, either a more advanced stochastic model or, as in Barro and G.Consigli (2019), a scenario generation based on a data-driven approach, possibly over more extended stages, would provide relevant ways forward in this context. The extension of the investment universe to further refine and calibrate the resulting portfolio distribution relative to the benchmark is also a desirable research path. In this respect, our project has been limited to just five asset classes and the S&P500 as the benchmark. A more extended investment universe would also allow the development of portfolio efficiency analysis based on expected wealth and ICVaR as relevant reward and risk measures, as in classical portfolio theory.

From a financial perspective, stochastic dominance principles are also relevant when taking as the benchmark a given pre-specified portfolio strategy or decision rule. From a modeling viewpoint, preserving the ISD-based stage-wise dominance approach, specifically in a multistage framework we have seen that the reference point \(\beta \) is indeed varying across stages resulting in different shortfall distributions, whose implications are worth further analysis.

Finally, from a theoretical perspective, the adoption of the terminal risk measure and stage-wise SD conditions in our context has proven sufficient to determine a rich set of evidence, but further analysis is needed to verify the time-consistency of the optimal strategy.