
1 Introduction: Context and Motivation for the Nested Affine Variational Inequalities Model

Nested affine variational inequalities provide a flexible modeling tool for many real-world applications, such as the renowned multi-portfolio selection problem (see, e.g., [5]). To introduce the general formulation of the model, we first briefly describe this specific instance, namely the multi-portfolio optimization problem.

Consider \(N\) accounts, indexed by \(\nu = 1, \ldots, N\). Each account \(\nu\)'s budget \(b^\nu \in \mathbb R_+\) is invested in \(K\) assets of a market. The decision variables \(y^\nu \in Y^\nu \subseteq \mathbb R^K\) stand for the fractions of \(b^\nu\) invested in each asset, where \(Y^\nu\) is a nonempty compact polyhedron containing the feasible portfolios, e.g., the standard simplex. Let \(r \in \mathbb R^K\) denote a random vector whose component \(r_k\) is the return on asset \(k \in \{1, \ldots, K\}\) over a single-period investment. We define \(\mu^\nu = \mathbb E^\nu(r) \in \mathbb R^K\) as the expectation of the assets' returns for account \(\nu\), as well as the positive semidefinite covariance matrix \(\Sigma^\nu = \mathbb E^\nu((r - \mu^\nu)(r - \mu^\nu)^\top)\). Taking the portfolio variance as the risk measure, we consider the following measures for portfolio income \(I_\nu\) and risk \(R_\nu\): \(I_\nu(y^\nu) \triangleq b^\nu (\mu^\nu)^\top y^\nu\), \(R_\nu(y^\nu) \triangleq \frac{1}{2} (b^\nu)^2 (y^\nu)^\top \Sigma^\nu y^\nu\).
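
As a small numerical illustration of these two measures (all data below are made up for the example; numpy only):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4                                # number of assets (illustrative)
b = 100.0                            # budget b^nu of a single account
mu = rng.uniform(0.01, 0.08, K)      # expected returns mu^nu
A = rng.normal(size=(K, K))
Sigma = A @ A.T / K                  # a positive semidefinite covariance Sigma^nu
y = np.full(K, 1.0 / K)              # equally weighted portfolio on the standard simplex

income = b * mu @ y                  # I_nu(y) = b^nu (mu^nu)^T y
risk = 0.5 * b**2 * y @ Sigma @ y    # R_nu(y) = (1/2) (b^nu)^2 y^T Sigma^nu y
print(f"income: {income:.3f}, risk: {risk:.3f}")
```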

When trades from multiple accounts are pooled for common execution, individual accounts can suffer the market impact stemming from a lack of liquidity. To account for this transaction cost effect, we introduce a positive semidefinite market impact matrix \(\Omega^\nu \in \mathbb R^{K \times K}\), whose entry at position \((i, j)\) is the impact of the liquidity of asset \(i\) on the liquidity of asset \(j\). For each account \(\nu\) we consider a linear unitary market impact cost function. The total transaction cost term for account \(\nu\) is:

The multi-portfolio problem can be formulated as the following affine variational inequality \(\text{AVI}(M^{\text{low}}, d^{\text{low}}, Y)\): find \(y \in Y \triangleq Y^1 \times \cdots \times Y^N\) such that

$$\displaystyle \begin{aligned} \left(M^{\text{low}}y + d^{\text{low}}\right)^\top (w - y) \ge 0 \quad \forall w \in Y, \end{aligned} $$
(1)

where \(d^{\text{low}} \in \mathbb R^{NK}\) is the block vector whose \(\nu\)-th block is \(-b^\nu \mu^\nu\), and \(M^{\text{low}} \in \mathbb R^{NK \times NK}\) is built from the risk terms \((b^\nu)^2 \Sigma^\nu\) and the market impact matrices \(\Omega^\nu\); its explicit block structure can be found in [5].

We assume the matrix \(M^{\text{low}}\) to be positive semidefinite and, in turn, \(\text{AVI}(M^{\text{low}}, d^{\text{low}}, Y)\) to be monotone: these properties can be guaranteed under mild assumptions, see [5, Theorem 3.3]. We denote by \(\text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\) the solution set of \(\text{AVI}(M^{\text{low}}, d^{\text{low}}, Y)\), which is a polyhedron (see [2, Theorem 2.4.13]). Note that \(\text{AVI}(M^{\text{low}}, d^{\text{low}}, Y)\) corresponds to an equivalent Nash equilibrium problem (NEP) in which the players' objective functions are convex and quadratic. Since the set \(\text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\) is not necessarily a singleton in the framework we consider, one has to discriminate among the solutions of \(\text{AVI}(M^{\text{low}}, d^{\text{low}}, Y)\) according to some further upper-level criterion. Thus, to model the resulting selection problem, we introduce the monotone nested affine variational inequality \(\text{AVI}(M^{\text{up}}, d^{\text{up}}, \text{SOL}(M^{\text{low}}, d^{\text{low}}, Y))\), that is, the problem of computing \(y \in \text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\) that solves

$$\displaystyle \begin{aligned} \left(M^{\text{up}}y + d^{\text{up}}\right)^\top (w-y) \ge 0,\quad \forall\,w\in {\text{SOL}}(M^{\text{low}},d^{\text{low}},Y), \end{aligned} $$
(2)

where \(M^{\text{up}} \in \mathbb R^{NK \times NK}\), \(M^{\text{up}} \succeq 0\), and \(d^{\text{up}} \in \mathbb R^{NK}\). Problem (2), which has a hierarchical structure, includes as a special instance the minimization over \(\text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\) of the convex quadratic objective function \(\frac{1}{2} y^\top M^{\text{up}} y + (d^{\text{up}})^\top y\), where \(M^{\text{up}}\) is symmetric. It is also worth mentioning the special instance where the \(N\) accounts form an upper-level (jointly convex) NEP to select over \(\text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\); in this case, \(M^{\text{up}}\) turns out to be nonsymmetric. We refer the reader to [1] for further information about NEPs.
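
To make the first special instance explicit: when \(M^{\text{up}}\) is symmetric, \(M^{\text{up}} y + d^{\text{up}}\) is the gradient of the quadratic objective, so (2) is precisely the first-order optimality condition for minimizing it over the convex set \(\text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\):

$$\displaystyle \begin{aligned} y \in \operatorname*{arg\,min}_{w \in \text{SOL}(M^{\text{low}},d^{\text{low}},Y)} \tfrac{1}{2} w^\top M^{\text{up}} w + (d^{\text{up}})^\top w \iff \left(M^{\text{up}} y + d^{\text{up}}\right)^\top (w - y) \ge 0 \quad \forall\, w \in \text{SOL}(M^{\text{low}},d^{\text{low}},Y), \end{aligned}$$

where the equivalence holds by the convexity of the objective (\(M^{\text{up}} \succeq 0\)) and of the solution set.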

Remark Convergent solution procedures have been devised in the literature (see, e.g., [3, 4]) to address monotone nested AVIs when \(M^{\text{up}}\) is positive semidefinite plus, i.e., \(M^{\text{up}}\) is positive semidefinite and \(y^\top M^{\text{up}} y = 0 \Rightarrow M^{\text{up}} y = 0\) (see [2, Ex. 2.9.24]). Requiring \(M^{\text{up}}\) to be positive semidefinite plus is restrictive: for example, taking \(NK = 2\), any matrix

$$\displaystyle \begin{aligned} M^{\text{up}} = \begin{pmatrix} m_{1} & 2 {\sqrt{m_{1} m_{2}}} + m_{3}\\ -m_{3} & m_{2} \end{pmatrix} \end{aligned}$$

with \(m_1, m_2\) nonnegative scalars and \(m_3 \neq -\sqrt{m_1 m_2}\), is positive semidefinite but not positive semidefinite plus. In fact, the class of positive semidefinite plus matrices is only "slightly" larger than the union of the classes of symmetric positive semidefinite and positive definite matrices.
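
A quick numerical sanity check of this example, with arbitrary admissible values of \(m_1, m_2, m_3\):

```python
import numpy as np

m1, m2, m3 = 1.0, 4.0, 5.0                   # m1, m2 >= 0 and m3 != -sqrt(m1*m2)
M = np.array([[m1, 2.0 * np.sqrt(m1 * m2) + m3],
              [-m3, m2]])

# Positive semidefinite: the symmetric part has nonnegative eigenvalues.
print(np.linalg.eigvalsh((M + M.T) / 2.0))   # here: [0., 5.]

# Not plus: y^T M y = 0 (y spans the kernel of the symmetric part) yet M y != 0.
y = np.array([np.sqrt(m2), -np.sqrt(m1)])
print(y @ M @ y, M @ y)                      # 0.0, [-7., -14.]
```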

Recently, a projected averaging Tikhonov-like algorithm has been proposed in [6] to cope with monotone nested VIs, allowing for matrices \(M^{\text{up}}\) that are not positive semidefinite plus.

We present a solution method for problem (2). We apply the results of [6] to the specific instance of monotone nested affine variational inequalities, taking full advantage of some strong properties AVIs enjoy, such as error bound results. This allows us to put forward an algorithm that addresses problems like multi-portfolio selection in a more general framework than the literature, where the upper-level operator is invariably assumed to be monotone plus (see, e.g., [5]).

2 The Tikhonov Approach

We require the following mild conditions to hold:

  1. (A1) \(M^{\text{up}}\) is positive semidefinite;

  2. (A2) \(M^{\text{low}}\) is positive semidefinite;

  3. (A3) \(Y\) is nonempty and compact.

Due to (A2) and (A3), the set \(\text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\) is nonempty, convex, compact and not necessarily a singleton, see e.g. [2, Section 2.3]. It follows that the feasible set of the nested affine variational inequality (2) is not necessarily a singleton. Moreover, since (A1) only makes the upper-level operator monotone, the solution set of (2) can itself include multiple points.

Let us introduce the Tikhonov operator:

$$\displaystyle \begin{aligned} \Phi_\tau (y) \triangleq M^{\text{low}} y + d^{\text{low}} + \frac{1}{\tau} \left(M^{\text{up}} y + d^{\text{up}}\right). \end{aligned}$$
For any τ > 0, by assumptions (A1) and (A2), Φτ is monotone and affine.

The following finite quantities will be useful in the forthcoming analysis:

$$\displaystyle \begin{aligned} H \triangleq \max_{y \in Y} \left\|M^{\text{up}} y + d^{\text{up}}\right\|{}_2, \qquad D \triangleq \max_{y, w \in Y} \|y - w\|{}_2, \end{aligned}$$

both of which are finite thanks to (A3).
We propose a Linear version of the Projected Averaging Tikhonov Algorithm (L-PATA) to compute solutions of (2).

Algorithm 1: Linear version of the Projected Averaging Tikhonov Algorithm (L-PATA)

Index \(i\) counts the outer iterations, which occur whenever the condition in step (S.4) is verified and correspond to the (approximate) solutions \(w^{i+1}\) of the AVI subproblems

$$\displaystyle \begin{aligned} \Phi_\tau (y)^\top(w-y) \geq -\varepsilon_{\text{sub}}, \quad \forall \, w \in Y, \end{aligned} $$
(3)

with \(\varepsilon_{\text{sub}} = i^{-2}\) and \(\tau = i\). The sequence \(\{y^k\}\) includes all the points obtained by performing classical projection steps with the given diminishing stepsize rule, see step (S.2). The sequence \(\{z^k\}\) consists of the inner iterates needed to compute (approximate) solutions of the AVI subproblem (3); it is obtained by taking a weighted average of the points \(y^j\), see step (S.3). Index \(l\) makes the sequence of stepsizes restart at every outer iteration, so that only the points \(y^j\) belonging to the current subproblem enter the computation of \(z^{k+1}\). We remark that checking the condition in step (S.4) only requires the solution of a linear program (the minimization of a linear function over the polyhedron \(Y\)).
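
Since the formal listing of Algorithm 1 is quite compact, the following Python sketch may help fix the mechanics of the three indices \(k\), \(l\), and \(i\). It is only an illustration, not the authors' implementation: for concreteness we assume \(Y\) is a box, so that the projection in step (S.2) is a componentwise clip and the linear check in step (S.4) has a closed-form solution; the iteration caps are likewise illustrative.

```python
import numpy as np

def l_pata(M_low, d_low, M_up, d_up, lo=0.0, hi=1.0,
           n_outer=20, max_inner=200_000):
    """Illustrative sketch of L-PATA over the box Y = [lo, hi]^n."""
    n = d_low.size
    project = lambda x: np.clip(x, lo, hi)                 # P_Y for a box
    # min_{y in Y} c^T (y - z): take y_i = lo where c_i > 0, y_i = hi elsewhere.
    min_lin = lambda c, z: c @ (np.where(c > 0.0, lo, hi) - z)

    y, w = project(np.zeros(n)), None
    for i in range(1, n_outer + 1):                        # outer iterations
        tau, eps = float(i), float(i) ** -2                # tau = i, eps_sub = i^-2
        num, den = np.zeros(n), 0.0                        # restart the average (index l)
        for l in range(max_inner):                         # inner iterations
            gamma = 1.0 / (2.0 * np.sqrt(l + 1.0))         # diminishing stepsize
            num += gamma * y
            den += gamma
            z = num / den                                  # (S.3) weighted average of y^j
            Phi_y = (M_low @ y + d_low) + (M_up @ y + d_up) / tau   # Tikhonov operator
            y = project(y - gamma * Phi_y)                 # (S.2) projection step
            Phi_z = (M_low @ z + d_low) + (M_up @ z + d_up) / tau
            if min_lin(Phi_z, z) >= -eps:                  # (S.4) linear check
                w = z                                      # w^{i+1} approximately solves (3)
                break
    return w
```

As established in Theorem 1 below, the outer iterates \(w^{i+1}\) accumulate at solutions of (2).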

We now deal with the convergence properties of L-PATA. The following result relates (approximate) solutions of the AVI subproblem (3) with \(\varepsilon_{\text{sub}} \geq 0\) to approximate solutions of problem (2).

Proposition 1

Assume conditions (A1)–(A3) to hold, and let \(y \in Y\) satisfy (3) with \(\tau > 0\) and \(\varepsilon_{\mathit{\text{sub}}} \geq 0\). It holds that

$$\displaystyle \begin{aligned} \left(M^{\mathit{\text{up}}}y + d^{\mathit{\text{up}}}\right)^\top (w-y) \geq -\varepsilon_{\mathit{\text{up}}}, \quad \forall w \in \mathit{\text{SOL}}(M^{\mathit{\text{low}}},d^{\mathit{\text{low}}},Y), \end{aligned} $$
(4)

with \(\varepsilon_{\mathit{\text{up}}} = \varepsilon_{\mathit{\text{sub}}}\, \tau\), and

$$\displaystyle \begin{aligned} \left(M^{\mathit{\text{low}}}y + d^{\mathit{\text{low}}}\right)^\top (w-y) \geq -\varepsilon_{\mathit{\text{low}}}, \quad \forall w \in Y, \end{aligned} $$
(5)

with \(\varepsilon _{\mathit{\text{low}}} = \varepsilon _{\mathit{\text{sub}}} + \frac {1}{\tau } H D\).

Proof

We have, for all \(w \in \text{SOL}(M^{\text{low}},d^{\text{low}},Y)\):

$$\displaystyle \begin{aligned} -\varepsilon_{\text{sub}} \tau & \leq \left[\tau \left(M^{\text{low}}y + d^{\text{low}}\right) + \left(M^{\text{up}}y + d^{\text{up}}\right)\right]^\top(w-y) \\ & \leq \left[\tau \left(M^{\text{low}}w + d^{\text{low}}\right) + \left(M^{\text{up}}y + d^{\text{up}}\right)\right]^\top(w-y) \\ & \leq \left(M^{\text{up}}y + d^{\text{up}}\right)^\top(w-y), \end{aligned} $$

where the first inequality is due to (3), the second one comes from (A2), and the last one is true because \(y \in Y\) and \(w \in \text{SOL}(M^{\text{low}},d^{\text{low}},Y)\), whence \(\left(M^{\text{low}}w + d^{\text{low}}\right)^\top (y-w) \geq 0\). Hence, (4) is true.

Moreover, we have for all w ∈ Y :

$$\displaystyle \begin{aligned} \left(M^{\text{low}}y + d^{\text{low}}\right)^\top (w-y) & = \Phi_\tau (y)^\top (w-y) - \frac{1}{\tau} \left(M^{\text{up}}y + d^{\text{up}}\right)^\top (w-y) \\ & \geq -\varepsilon_{\text{sub}} - \frac{1}{\tau} H D, \end{aligned} $$

where the inequality is due to (3). Therefore, we get (5). □

We now state the convergence result for L-PATA.

Theorem 1

Assume conditions (A1)–(A3) to hold. Every limit point of the sequence \(\{w^i\}\) generated by L-PATA is a solution of problem (2).

Proof

First of all, we show that \(i \to \infty\). Assume by contradiction that this is false; hence, an index \(\bar k\) exists such that either \(\bar k = 0\) or the condition in step (S.4) is satisfied at iteration \(\bar k - 1\), and the condition in step (S.4) is violated for every \(k \geq \bar k\). In this case, \(i \to \bar \imath\), and then \(\tau^k = \bar \tau \triangleq \bar \imath\) for every \(k \ge \bar k\).

For every \(j \in [\bar k, k]\), and for any v ∈ Y , we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} \|y^{j+1} - v\|{}_2^2 & = & \|P_Y(y^j - \frac{1}{2(j-\bar k+1)^{0.5}} \Phi_{\bar \tau}(y^j)) - v\|{}_2^2\\ {} & \le & \|y^j - \frac{1}{2(j-\bar k+1)^{0.5}} \Phi_{\bar \tau}(y^j) - v\|{}_2^2\\ {} & = & \|y^j - v\|{}_2^2 + \frac{1}{4(j-\bar k+1)} \|\Phi_{\bar \tau}(y^j)\|{}_2^2 - \frac{1}{(j-\bar k+1)^{0.5}} \Phi_{\bar \tau}(y^j)^\top (y^j - v), \end{array} \end{aligned}$$

and, in turn,

$$\displaystyle \begin{aligned} \Phi_{\bar \tau}(y^j)^\top (v - y^j) \ge \frac{\|y^{j+1} - v\|{}_2^2 - \|y^j - v\|{}_2^2}{(j-\bar k+1)^{-0.5}} - \frac{1}{4(j-\bar k+1)^{0.5}} \|\Phi_{\bar \tau}(y^j)\|{}_2^2. \end{aligned}$$

Summing, we get

$$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}} \Phi_{\bar \tau}(y^j)^\top (v - y^j)}{\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}} & \ge & \frac{\displaystyle\sum_{j=\bar k}^{k} \left(\|y^{j+1} - v\|{}_2^2 - \|y^j - v\|{}_2^2 - \frac{1}{4(j-\bar k+1)} \|\Phi_{\bar \tau}(y^j)\|{}_2^2\right)}{2\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}} \\ {} & = & \frac{\displaystyle\left(\|y^{k+1} - v\|{}_2^2 - \|y^{\bar k} - v\|{}_2^2 - \displaystyle\sum_{j=\bar k}^{k} \frac{1}{4(j-\bar k+1)} \|\Phi_{\bar \tau}(y^j)\|{}^2_2\right)}{2\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}} \\ {} & \geq & - \frac{\displaystyle \left( \|y^{\bar k} - v\|{}_2^2 + \displaystyle\sum_{j=\bar k}^{k} \frac{1}{4(j-\bar k+1)} \|\Phi_{\bar \tau}(y^j)\|{}^2_2\right)}{2\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}}, \end{array} \end{aligned} $$
(6)

which implies

$$\displaystyle \begin{aligned} \begin{array}{rcl} \Phi_{\bar \tau}(v)^\top (v - z^k) & = & \frac{\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}} \Phi_{\bar \tau}(v)^\top (v - y^j)}{\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}}\\ {} & \ge & -\frac{\displaystyle\left(\|y^{\bar k} - v\|{}_2^2 + \sum_{j=\bar k}^{k} \frac{1}{4(j-\bar k+1)} \|\Phi_{\bar \tau}(y^j)\|{}^2_2\right)}{2\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}} \\ {} & & + \frac{\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}} (\Phi_{\bar \tau}(v) - \Phi_{\bar \tau}(y^j))^\top (v - y^j)}{\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}}\\ {} & \ge & -\frac{\displaystyle\left(\|y^{\bar k} - v\|{}_2^2 + \sum_{j=\bar k}^{k} \frac{1}{4(j-\bar k+1)} \|\Phi_{\bar \tau}(y^j)\|{}^2_2\right)}{2\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}}, \end{array} \end{aligned} $$
(7)

where the last inequality holds thanks to the monotonicity of \(\Phi_{\bar \tau}\). Denoting by \(z \in Y\) any limit point of the sequence \(\{z^k\}\), taking the limit \(k \to \infty\) in the latter relation and passing to a subsequence, the following inequality holds true:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \Phi_{\bar \tau}(v)^\top (v - z) & \ge & -\displaystyle\lim_{k \to \infty}\frac{\displaystyle\left(\|y^{\bar k} - v\|{}_2^2 + \sum_{j=\bar k}^{k} \frac{1}{4(j-\bar k+1)} \|\Phi_{\bar \tau}(y^j)\|{}^2_2\right)}{2\displaystyle\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}} = 0, \end{array} \end{aligned}$$

because \(\sum_{j=\bar k}^{\infty} \frac{1}{2(j-\bar k+1)^{0.5}} = +\infty\) and \(\left(\sum_{j=\bar k}^{k} \frac{1}{4(j-\bar k+1)}\right)/\left(\sum_{j=\bar k}^{k} \frac{1}{2(j-\bar k+1)^{0.5}}\right) \to 0\) as \(k \to \infty\), due to [6, Proposition 4], and then \(z\) is a solution of the dual problem

$$\displaystyle \begin{aligned} \Phi_{\bar \tau}(v)^\top (v - z) \ge 0, \enspace \forall v \in Y. \end{aligned}$$

Hence, the sequence \(\{z^k\}\) converges to a solution of problem (3) with \(\varepsilon_{\text{sub}} = 0\) and \(\tau = \bar \tau\), see e.g. [2, Theorem 2.3.5], in contradiction with \(\min_{y \in Y} \Phi_{\bar \tau}(z^{k+1})^\top (y - z^{k+1}) < - \varepsilon^k = - {\bar \imath}^{-2}\) holding for every \(k \ge \bar k\). Therefore, \(i \to \infty\).

Consequently, the algorithm produces an infinite sequence \(\{w^i\}\) such that \(w^{i+1} \in Y\) and

$$\displaystyle \begin{aligned}\Phi_{i} (w^{i+1})^\top(y-w^{i+1}) \geq -i^{-2}, \quad \forall \, y \in Y, \end{aligned}$$

that is, (3) holds at \(w^{i+1}\) with \(\varepsilon_{\text{sub}} = i^{-2}\) and \(\tau = i\). By Proposition 1, specifically from (4) and (5), we obtain

$$\displaystyle \begin{aligned}\left(M^{\text{up}}w^{i+1} + d^{\text{up}}\right)^\top (y-w^{i+1}) \geq -i^{-1}, \quad \forall y \in \text{SOL}(M^{\text{low}},d^{\text{low}},Y), \end{aligned}$$

and

$$\displaystyle \begin{aligned}\left(M^{\text{low}}w^{i+1} + d^{\text{low}}\right)^\top (y-w^{i+1}) \geq -i^{-1}(1+ H D), \quad \forall y \in Y. \end{aligned}$$

Taking the limit \(i \to \infty\), we get the desired convergence property for every limit point of \(\{w^i\}\). □

We consider the natural residual map for the lower-level \(\text{AVI}(M^{\text{low}}, d^{\text{low}}, Y)\):

$$\displaystyle \begin{aligned} V(y) \triangleq \left\|y - P_Y\left(y - \left(M^{\text{low}} y + d^{\text{low}}\right)\right)\right\|{}_2. \end{aligned} $$
(8)

The function \(V\) is continuous and nonnegative, as recalled in [4]. Also, \(V(y) = 0\) if and only if \(y \in \text{SOL}(M^{\text{low}},d^{\text{low}},Y)\). The condition

$$\displaystyle \begin{aligned} V(y) \leq \widehat \varepsilon_{\text{low}}, \end{aligned} $$
(9)

with \(\widehat \varepsilon_{\text{low}} \geq 0\), is an alternative to (5) for taking care of the feasibility of problem (2).
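
A minimal sketch of the check (9), under the same illustrative assumption that \(Y\) is a box so that \(P_Y\) is a componentwise clip:

```python
import numpy as np

def natural_residual(y, M_low, d_low, lo=0.0, hi=1.0):
    """V(y) = ||y - P_Y(y - (M_low y + d_low))||_2 for the box Y = [lo, hi]^n."""
    return np.linalg.norm(y - np.clip(y - (M_low @ y + d_low), lo, hi))

# y is an approximate lower-level solution when natural_residual(y, ...) <= eps_low_hat.
```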

Remark Since both the variational inequalities (1) and (2) are affine, \(\varepsilon_{\text{up}}\) and either \(\varepsilon_{\text{low}}\) or \(\widehat \varepsilon_{\text{low}}\) provide actual upper bounds, up to positive constants, on the distance of \(y\) from \(\text{SOL}(M^{\text{up}}, d^{\text{up}}, \text{SOL}(M^{\text{low}}, d^{\text{low}}, Y))\) and from \(\text{SOL}(M^{\text{low}}, d^{\text{low}}, Y)\), respectively.

Theorem 2 If \(y \in \mathit{\text{SOL}}(M^{\mathit{\text{low}}},d^{\mathit{\text{low}}},Y)\) satisfies (4), then there exists \(c_{\mathit{\text{up}}} > 0\) such that

$$\displaystyle \begin{aligned} \mathit{\text{dist}}_{\mathit{\text{SOL}}\Big(M^{\mathit{\text{up}}},d^{\mathit{\text{up}}},\mathit{\text{SOL}} (M^{\mathit{\text{low}}},d^{\mathit{\text{low}}},Y)\Big)}(y) \le c_{\mathit{\text{up}}} \varepsilon_{\mathit{\text{up}}}. \end{aligned}$$

If \(y \in Y\) satisfies (5), then there exists \(c_{\mathit{\text{low}}} > 0\) such that

$$\displaystyle \begin{aligned} \mathit{\text{dist}}_{\mathit{\text{SOL}}(M^{\mathit{\text{low}}},d^{\mathit{\text{low}}},Y)}(y) \le c_{\mathit{\text{low}}} \varepsilon_{\mathit{\text{low}}}. \end{aligned}$$

If \(y \in Y\) satisfies (9), then there exists \(\widehat c_{\mathit{\text{low}}} > 0\) such that

$$\displaystyle \begin{aligned} \mathit{\text{dist}}_{\mathit{\text{SOL}}(M^{\mathit{\text{low}}},d^{\mathit{\text{low}}},Y)}(y) \le \widehat c_{\mathit{\text{low}}} \widehat \varepsilon_{\mathit{\text{low}}}. \end{aligned}$$

Proof The claim follows from [2, Proposition 6.3.3] and [6, Proposition 3]. □

In view of Theorem 2, conditions (4) and either (5) or (9) define points that are approximate solutions of problem (2) also from a geometrical perspective. In particular, the smaller the values of \(\varepsilon_{\text{up}}\) and of either \(\varepsilon_{\text{low}}\) or \(\widehat \varepsilon_{\text{low}}\), the closer the point is to the solution set of the nested affine variational inequality (2).

We give an upper bound on the number of iterations needed to drive both the upper-level error \(\varepsilon_{\text{up}}\), given in (4), and the lower-level error \(\widehat \varepsilon_{\text{low}}\), given in (9), below a prescribed tolerance \(\delta\).

Theorem 3

Assume conditions (A1)–(A3) to hold and, without loss of generality, \(L_\Phi \triangleq \|M^{\mathit{\text{up}}}+M^{\mathit{\text{low}}}\|{}_2 < 1\). Consider L-PATA. Given a precision \(\delta \in (0, 1)\), let us define the quantity

Then the upper-level approximate problem (4) is solved for \(y = z^{k+1}\) with \(\varepsilon_{\mathit{\text{up}}} = \delta\), the lower-level approximate problem (9) is solved for \(y = z^{k+1}\) with \(\widehat \varepsilon_{\mathit{\text{low}}} = \delta\), and the condition in step (S.4) is satisfied in at most

iterations \(k\), where \(\eta > 0\) is a small number, and

(10)

Proof

See the proof of [6, Theorem 2]. □