1 Introduction

The aim of this paper is to solve the problem of maximising expected utility of terminal wealth for an investor facing proportional transaction costs in a Black–Scholes market. While this problem has been studied extensively in the literature, constructing optimal strategies and verifying their optimality has remained an open problem in the finite-horizon case. We close this gap.

Starting with the seminal contribution of Magill and Constantinides [15], the continuous-time optimal investment problem with proportional transaction costs has been studied extensively over the last decades, and three different approaches to tackle this type of problem have emerged: (1) The primal approach based on stochastic control and viscosity solution theory, in which one studies the Hamilton–Jacobi–Bellman (HJB) equation of the problem; (2) the dual approach based on shadow prices, in which one determines an auxiliary frictionless market with unfavourable price processes yielding the same optimal strategy as the original problem; and (3) asymptotics for vanishing costs.

For the problem of optimal consumption over an infinite horizon, the primal approach was utilised, among others, by Davis and Norman [23], Shreve and Soner [37], Akian et al. [1], Kabanov and Klüppelberg [32] and de Vallière et al. [25], whereas Kallsen and Muhle-Karbe [34], Choi et al. [13] and Herczegh and Prokaj [29] used the dual approach to solve this problem. Asymptotic optimality results were obtained by Janeček and Shreve [31] and Gerhold et al. [27]. Moreover, Akian et al. [3] use the primal approach and Gerhold et al. [28] and Gerhold et al. [26] the dual approach to determine closed-form solutions for the problem of maximising the asymptotic growth rate under (small) transaction costs.

In the present paper, we focus on the finite-horizon optimal terminal wealth problem without intermediate consumption in a Black–Scholes market, which was introduced in Akian et al. [2]. The HJB equation of this problem has also been studied in Davis et al. [24] in the context of utility indifference pricing. An adaptation of the results of Davis et al. [24] implies that the value function is a viscosity solution of the HJB equation, and uniqueness holds in the case of bounded utility functions. Belak et al. [9] extend the uniqueness result to more general utility functions including log and power utility. Dai and Yi [21] show that the HJB equation admits a classical solution in the case of log and power utility if the state space is reduced to positive stock holdings, and Dai et al. [20] and Chen et al. [12] extend this result to the problem with intermediate consumption and CARA utility, respectively. Moreover, Czichowsky et al. [19] and Czichowsky and Schachermayer [16, 17, 18] use the shadow price approach to establish existence of optimal strategies for general price processes extending beyond semimartingales. Numerical schemes in the Black–Scholes setting can be found in Kunisch and Sass [36], Dai and Zhong [22] and Herzog et al. [30]. Finally, Bichuch [11] in the Black–Scholes setting and Kallsen and Muhle-Karbe [35] and Kallsen and Li [33] for more general price processes solve the finite-horizon problem asymptotically for small transaction costs. Summing up the results on the finite-horizon problem, it is known that

1) the value function \(\mathcal{V}\) is the unique viscosity solution of the HJB equation;

2) there exists a classical solution \(V\) of the HJB equation if the state space is reduced to positive stock holdings; and

3) there exists a frictionless market in which the optimal strategy coincides with the optimal strategy in the transaction costs market.

It is, however, not immediate that the classical solution \(V\) coincides with the value function \(\mathcal{V}\), nor whether the optimal strategy obtained from the dual approach is a reflected diffusion in the no-trading region implied by the HJB equation. More precisely:

1) \(V\) is a classical solution on the reduced state space, whereas the value function \(\mathcal{V}\) is a viscosity solution on the entire state space. While it is not particularly difficult to verify that the classical solution \(V\) is also a viscosity solution on the reduced state space, this does not imply that \(V = \mathcal{V}\) on the reduced state space since the existing literature only provides uniqueness results for viscosity solutions on the entire state space. Thus, in order to rigorously conclude that \(V = \mathcal{V}\), one either needs to (a) prove uniqueness of viscosity solutions on the reduced state space, or (b) show that \(V\) extends to a viscosity solution on the entire state space. In both cases, a careful inspection of the behaviour of \(V\) and \(\mathcal{V}\) at the boundary of the reduced state space is necessary.

2) No link between the auxiliary frictionless market and the HJB equation is known. Hence, while existence of an optimal strategy is guaranteed by the dual approach, it is an open question whether optimal strategies are determined by the trading regions implied by the HJB equation. In particular, without establishing this link, it is not clear if the trading regions obtained from solving the HJB equation numerically determine an optimal strategy.

3) The classical approach of verifying optimality of candidate strategies via the primal approach requires a sufficiently smooth value function \(\mathcal{V}\) to justify the application of Itô’s formula along all controlled state processes (and hence on the entire state space). We shall see, however, that the value function is not of class \(C^{1,2}\) everywhere (see Theorem 4.14 for the precise statement), and thus there is a need for a verification argument requiring less regularity on \(\mathcal{V}\).

To show that \(V\) and \(\mathcal{V}\) coincide and that optimal strategies are determined by the HJB equation, we devise a novel verification procedure which only requires evaluating the candidate value function along the uncontrolled state process. Since the uncontrolled state process naturally avoids states in which the HJB equation degenerates (i.e., regularity fails), this puts Itô’s formula at our disposal to establish the claims. For this, we characterise the value function as the smallest continuous function which is superharmonic with respect to the uncontrolled state process and nondecreasing in the direction of transactions. More precisely, we proceed as follows:

1) We define trading regions in terms of the classical solution \(V\) of the HJB equation and show that for every initial state, there exists a trading strategy which turns the corresponding state process into a diffusion reflected at the boundaries between the trading regions.

2) We define a function \(h_{0}\) which maps the initial state to the expected utility obtained by following the trading strategy constructed in Step 1) and show that \(h_{0}\) is superharmonic and nondecreasing in the direction of transactions. A simple argument shows that \(h_{0}\) coincides with the classical solution \(V\) on the reduced state space, and by construction, \(h_{0}\) is dominated by the value function \(\mathcal{V}\).

3) We argue that every superharmonic function which is nondecreasing in the direction of transactions is a viscosity supersolution of the HJB equation. By the previous step and the comparison principle in Belak et al. [9], this implies that \(h_{0}\) dominates the value function. By Step 2), this shows that \(h_{0}\) and \(\mathcal{V}\) coincide, yielding that \(\mathcal{V}\) and \(V\) are equal on the reduced state space and the trading strategies constructed in Step 1) are optimal.

In particular, as we characterise the value function as the smallest superharmonic function, our approach may be seen as an alternative duality theory for singular control problems. Moreover, since our approach naturally avoids points of singularity of the infinitesimal generator of the underlying stochastic process (which are typically the points at which it is difficult to verify regularity of the value function), it is conceivable that the verification argument can be applied to other singular control problems as well.

Our verification argument is inspired by recent results of Christensen [14] and Belak et al. [8] in the context of stochastic impulse control. However, since optimal trading regions for singular control problems are given in terms of first-order derivatives of the value function, as opposed to the value function itself for impulse control problems, the mathematical analysis in the present paper differs significantly from the corresponding impulse control results.

Moreover, our superharmonic function approach can be seen as a version of the stochastic Perron method; see Bayraktar and Sîrbu [4, 5, 6] for early developments and Bayraktar and Zhang [7] for the case of a singular control problem with transaction costs. In contrast to the stochastic Perron method, we require the superharmonicity property along the uncontrolled state process together with the monotonicity in the direction of transactions, whereas for the stochastic Perron method, one would typically ask for superharmonicity along every controlled state process (or at least a subset of state processes containing a maximising sequence). As a consequence, it is more involved in our setting to argue that the superharmonic functions dominate the value function (we rely on viscosity arguments for this, whereas it is immediate in the setting of [7]). On the other hand, our definition makes verification significantly easier since we only need to verify superharmonicity for the uncontrolled state process. We note, however, that our setup implies superharmonicity along any state process obtained from a piecewise constant strategy (see Remark 4.2 below) and hence the two concepts coincide as soon as there exists a maximising sequence consisting of piecewise constant strategies.

The remainder of this article is structured as follows. In Sect. 2, we set up the market model, recall existing results from the literature, and discuss implications of our results. In Sect. 3, we construct the candidate optimal strategies as well as the corresponding reflected diffusions. Our main results can be found in Sect. 4, where we present the verification theorem to show that these candidate optimal strategies are indeed optimal, analyse the regularity of the value function in detail, and prove that the classical solution of Dai and Yi [21] coincides with the value function on the reduced state space.

2 Market model and problem formulation

2.1 The market model

We let \(W = (W(t))_{t\geq 0}\) be a standard Brownian motion defined on the canonical Wiener space \((\Omega ,\mathcal{F},\mathbb{P})\). For each \(t\geq 0\), we denote the augmented filtration generated by \((W(u)-W(t))_{u \geq t}\) by \(\mathbb{F}^{t} = (\mathcal{F}^{t}(u))_{u\geq t}\) and set \(\mathbb{F} := \mathbb{F}^{0}\). Moreover, we fix some terminal time \(T>0\) as well as some initial time \(t\in [0,T)\).

We consider a Black–Scholes market \((P^{0},P^{1}) = (P^{0}(u),P^{1}(u))_{u \in [t,T]}\) with

$$\begin{aligned} \mathrm{d}P^{0}(u) &= 0,\qquad u\in [t,T],\ P^{0}(t)=1, \\ \mathrm{d}P^{1}(u) &= \alpha P^{1}(u)\,\mathrm{d}u+\sigma P^{1}(u)\, \mathrm{d}W(u),\qquad u\in [t,T],\ P^{1}(t)=1. \end{aligned}$$

Here, \(\alpha >0\) and \(\sigma >0\) denote the excess return and volatility of the stock, respectively. With this, we assume that the investor buys shares of the stock at the ask price \((1+\lambda )P^{1}\), where \(\lambda >0\), and sells shares of the stock at the bid price \((1-\mu )P^{1}\), where \(\mu \in (0,1)\).

Next, to model trading strategies, we take \(\mathbb{F}^{t}\)-adapted, nondecreasing, càdlàg processes \(L=(L(u))_{u\in [t,T]}\) and \(M=(M(u))_{u \in [t,T]}\) with \(L(t-)=M(t-)=0\). Here, \(L\) and \(M\) represent the cumulative units of money used for purchases and sales of the stock, respectively. With this, we denote by \(B = B_{t,b}^{L,M} = (B_{t,b} ^{L,M}(u))_{u\in [t,T]}\) and \(S = S_{t,s}^{L,M} = (S_{t,s}^{L,M}(u))_{u \in [t,T]}\) the investor’s wealth invested in the bond and the stock, respectively. Assuming that the strategy \((L,M)\) is self-financing, the evolution of \(B\) and \(S\) can be written as

$$\begin{aligned} \mathrm{d}B(u) &= -(1+\lambda )\,\mathrm{d}L(u) + (1-\mu )\,\mathrm{d}M(u),\qquad u \in [t,T], \\ \mathrm{d}S(u) &= \alpha S(u)\,\mathrm{d}u + \sigma S(u)\,\mathrm{d}W(u) + \mathrm{d}L(u) - \mathrm{d}M(u),\qquad u\in [t,T], \end{aligned}$$

where the initial values are given by \(B(t-)=b\) and \(S(t-)=s\), respectively. The net wealth \(X = X_{t,b,s}^{L,M} = (X_{t,b,s}^{L,M}(u))_{u \in [t,T]}\) of the investor after liquidation of the stock position is then given by

$$ X(u) := B(u) + \min \{(1-\mu )S(u),(1+\lambda )S(u)\},\qquad u\in [t,T]. $$

We say that a trading strategy is admissible if the corresponding net wealth process is nonnegative. For this, we define the solvency cone

$$ \mathcal{S} := \{(b,s)\in \mathbb{R}^{2} : b+(1+\lambda )s>0, b+(1- \mu )s>0\}. $$

With this, whenever \((b,s)\in \overline{\mathcal{S}}\), the investor can liquidate her stock holdings to end up with nonnegative wealth. A trading strategy \((L,M)\) is therefore admissible for an initial position \((b,s)\in \overline{\mathcal{S}}\) if the corresponding pair \((B_{t,b}^{L,M},S_{t,s}^{L,M})\) takes values in \(\overline{ \mathcal{S}}\). The set of all admissible trading strategies of this form is denoted by \(\mathcal{A}(t,b,s)\).
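The bookkeeping behind the self-financing dynamics and the admissibility condition can be illustrated with a minimal sketch. It covers only a single lump-sum transaction (no diffusion part), and the function names `apply_trade` and `in_closed_solvency_cone` are hypothetical helpers introduced here for illustration.

```python
def apply_trade(b, s, dL, dM, lam, mu):
    """Apply one lump-sum transaction to the position (b, s).

    dL >= 0 is the amount of money moved into the stock (purchase),
    dM >= 0 the amount moved out of the stock (sale). The bond account
    pays the ask price factor (1 + lam) and receives the bid price
    factor (1 - mu), mirroring the jump part of the dynamics of (B, S).
    """
    if dL < 0 or dM < 0:
        raise ValueError("dL and dM must be nonnegative")
    b_new = b - (1 + lam) * dL + (1 - mu) * dM
    s_new = s + dL - dM
    return b_new, s_new


def in_closed_solvency_cone(b, s, lam, mu):
    """Check whether (b, s) lies in the closure of the solvency cone."""
    return b + (1 + lam) * s >= 0 and b + (1 - mu) * s >= 0
```

For instance, buying and immediately selling the same amount `x` reduces the bond account by `(lam + mu) * x`, which is the round-trip cost of the transaction.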

The objective of the investor is to maximise expected utility of the net terminal wealth after liquidation, i.e.,

$$ \mathcal{V}(t,b,s) := \sup _{(L,M)\in \mathcal{A}(t,b,s)} \mathbb{E} \bigl[ U_{p}\bigl(X_{t,b,s}^{L,M}(T)\bigr)\bigr], $$
(2.1)

where the utility function \(U_{p}:(0,\infty )\to \mathbb{R}\) is defined as

$$ U_{p}(x) := \textstyle\begin{cases} x^{p}/p &\quad \text{if }p< 1,p\neq 0, \\ \log x &\quad \text{if }p=0. \end{cases} $$

We extend \(U_{p}\) to \([0,\infty )\) by setting \(U_{p}(0):= \lim _{x\downarrow 0}U_{p}(x)\).
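To make the objective concrete, the following sketch evaluates the utility function, the liquidation value of a position, and a Monte Carlo estimate of the expected utility of the buy-and-hold strategy \(L = M = 0\). Since buy-and-hold is admissible for \(b,s\geq 0\), this estimate approximates a lower bound for the value function; it is not the optimal value. The function names, the sample size and the seed are assumptions made for illustration, and `horizon` stands for the remaining time \(T-t\).

```python
import math
import random


def utility(x, p):
    """Power (p < 1, p != 0) or logarithmic (p == 0) utility on [0, inf)."""
    if p >= 1:
        raise ValueError("p must be smaller than 1")
    if x < 0:
        raise ValueError("wealth must be nonnegative")
    if x == 0:
        # U_p(0) is defined as the limit from the right.
        return 0.0 if p > 0 else -math.inf
    return math.log(x) if p == 0 else x ** p / p


def liquidation_value(b, s, lam, mu):
    """Net wealth after closing the stock position at bid or ask."""
    return b + min((1 - mu) * s, (1 + lam) * s)


def buy_and_hold_utility(b, s, horizon, alpha, sigma, p, lam, mu,
                         n_paths=10_000, seed=0):
    """Monte Carlo estimate of E[U_p(X(T))] without intermediate trading.

    Assumes b >= 0 and s >= 0 so that the position stays solvent.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # Exact sample of the terminal stock holdings (geometric Brownian motion).
        s_T = s * math.exp((alpha - 0.5 * sigma ** 2) * horizon
                           + sigma * math.sqrt(horizon) * z)
        total += utility(liquidation_value(b, s_T, lam, mu), p)
    return total / n_paths
```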

2.2 Overview of existing results

As pointed out in Sect. 1, the portfolio problem defined in (2.1) has received considerable interest in the past. In this section, we briefly summarise the results which will be needed in the sequel.

First, it is easy to see that the value function \(\mathcal{V}\) defined in (2.1) is finite on \([0,T]\times \mathcal{S}\). More specifically, we have (see Belak et al. [9])

$$ U_{p}\big(b + \min \{(1-\mu )s,(1+\lambda )s\}\big) \leq \mathcal{V}(t,b,s) \leq \varphi _{p}(t,b,s), $$

where for \(\gamma \in (1-\mu ,1+\lambda )\) and \(K>1\), the function \(\varphi _{p}:[0,T]\times \overline{\mathcal{S}}\to \mathbb{R}\) is given by

$$ \varphi _{p}(t,b,s) := U_{p}\bigl( (b+\gamma s) f(t) \bigr) $$

with \(f:[0,T]\to \mathbb{R}\) given by

$$ f(t) := \exp \bigg(K\frac{1}{2(1-p)}\frac{\alpha ^{2}}{\sigma ^{2}}(T-t) \bigg). $$

Davis et al. [24] (with some adaptations) and Belak et al. [9, 10] establish that the value function \(\mathcal{V}\) is continuous and the unique viscosity solution of the HJB equation

$$ 0 = \min \{ \mathcal{L}^{\mathrm{nt}}\mathcal{V}(t,b,s), \mathcal{L} ^{\mathrm{buy}}\mathcal{V}(t,b,s), \mathcal{L}^{\mathrm{sell}} \mathcal{V}(t,b,s)\} $$
(2.2)

on \([0,T)\times \mathcal{S}\) with terminal condition

$$ \mathcal{V}(T,b,s) = U_{p}\big(b + \min \{(1-\mu )s,(1+\lambda )s\} \big), \qquad (b,s)\in \overline{\mathcal{S}}, $$

and boundary condition

$$ \mathcal{V}(t,b,s) = U_{p}(0), \qquad (t,b,s)\in [0,T]\times \partial \mathcal{S}. $$

The differential operators \(\mathcal{L}^{\mathrm{nt}}\), \(\mathcal{L} ^{\mathrm{buy}}\) and \(\mathcal{L}^{\mathrm{sell}}\) in (2.2) are given by

$$\begin{aligned} \mathcal{L}^{\mathrm{nt}}\mathcal{V}(t,b,s) &:= -\partial _{t} \mathcal{V}(t,b,s) - \alpha s \partial _{s}\mathcal{V}(t,b,s) - \frac{1}{2} \sigma ^{2} s^{2} \partial _{s}^{2}\mathcal{V}(t,b,s), \\ \mathcal{L}^{\mathrm{buy}}\mathcal{V}(t,b,s) &:= (1+\lambda )\partial _{b}\mathcal{V}(t,b,s) - \partial _{s}\mathcal{V}(t,b,s), \\ \mathcal{L}^{\mathrm{sell}}\mathcal{V}(t,b,s) &:= -(1-\mu )\partial _{b}\mathcal{V}(t,b,s) + \partial _{s}\mathcal{V}(t,b,s). \end{aligned}$$
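As a quick sanity check of the two gradient operators (a worked example added for illustration, not taken from the cited results), one can apply them to the upper bound \(\varphi _{p}\) introduced above. Note that \(b+\gamma s>0\) on \(\mathcal{S}\) because \(b+\gamma s\) is a convex combination of \(b+(1-\mu )s\) and \(b+(1+\lambda )s\). Using \(U_{p}'(x) = x^{p-1}>0\) and \(f>0\), a direct computation gives

$$\begin{aligned} \mathcal{L}^{\mathrm{buy}}\varphi _{p}(t,b,s) &= (1+\lambda -\gamma )\,f(t)\,U_{p}'\bigl((b+\gamma s)f(t)\bigr) > 0, \\ \mathcal{L}^{\mathrm{sell}}\varphi _{p}(t,b,s) &= \bigl(\gamma -(1-\mu )\bigr)\,f(t)\,U_{p}'\bigl((b+\gamma s)f(t)\bigr) > 0. \end{aligned}$$

Hence neither trading constraint is binding for \(\varphi _{p}\), and whether \(\varphi _{p}\) is a supersolution of (2.2) is decided by the sign of \(\mathcal{L}^{\mathrm{nt}}\varphi _{p}\) alone.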

The uniqueness of the value function is a consequence of the following comparison principle, obtained in Belak et al. [9, Theorem 4.4].

Theorem 2.1

Let \(u,v:[0,T]\times \overline{\mathcal{S}}\to \overline{\mathbb{R}}\) and fix \(\varepsilon >0\). Assume that \(u\) is an upper semicontinuous viscosity subsolution and \(v\) is a lower semicontinuous viscosity supersolution of (2.2) such that

$$ U_{p}\big(b+\min \{(1-\mu )s,(1+\lambda )s\}\big) \leq u(t,b,s),v(t,b,s) \leq \varphi _{p}(t,b,s). $$

If \(u(T,b,s)\leq v(T,b+\varepsilon ,s)\) and \(u(t,b,s)\leq U_{p}(0)\) for every \((b,s)\in \partial \mathcal{S}\), then \(u(t,b,s)\leq v(t,b+ \varepsilon ,s)\) on \([0,T]\times \overline{\mathcal{S}}\).

It is expected that the operators \(\mathcal{L}^{\mathrm{nt}}\), \(\mathcal{L}^{\mathrm{buy}}\) and \(\mathcal{L}^{\mathrm{sell}}\) determine the optimal strategies. To be more precise, given a smooth solution \(v\) of (2.2), we define

$$\begin{aligned} {\mathcal{R}}^{\mathrm{buy}}(v) &:= \{(t,b,s) \in [0,T)\times \mathcal{S}: {\mathcal{L}}^{\mathrm{buy}}v(t,b,s) = 0\}, \\ {\mathcal{R}}^{\mathrm{sell}}(v) &:= \{(t,b,s) \in [0,T)\times \mathcal{S}: {\mathcal{L}}^{\mathrm{sell}}v(t,b,s) = 0\}, \\ {\mathcal{R}}^{\mathrm{nt}}(v) &:= ([0,T)\times \mathcal{S})\setminus ({\mathcal{R}}^{\mathrm{buy}}\cup {\mathcal{R}}^{\mathrm{sell}}). \end{aligned}$$

We expect that the optimal strategy keeps the process \((B,S)\) inside the no-trading region \(\mathcal{R}^{\mathrm{nt}}(v)\) by reflecting \((B,S)\) at the boundary \(\partial \mathcal{R}^{\mathrm{nt}}(v)\). Note that we do not include the terminal time \(T\) in the definition of the trading regions since we require the investor to liquidate her holdings in the stock at time \(T\), and we do not include the boundary \(\partial \mathcal{S}\) since the only admissible, and hence optimal, strategy on \(\partial \mathcal{S}\) is to instantly close the stock position and refrain from further trading; see Lemma 3.1 below.

While the previous result completely characterises the value function \(\mathcal{V}\) (which can therefore be computed numerically), it does not suffice to construct the optimal strategies and verify their optimality. It is therefore necessary to study the HJB equation in more detail and to establish the existence of a regular solution.

We denote by \(\mathcal{S}_{0}\) and \(\overline{\mathcal{S}_{0}}\) the restrictions to positive stock holdings of the solvency region and of its closure, respectively, i.e.,

$$ \mathcal{S}_{0} := \{(b,s)\in \mathcal{S} : s > 0\} \qquad \text{and} \qquad \overline{\mathcal{S}_{0}} := \{(b,s)\in \overline{\mathcal{S}} : s > 0\}. $$

Dai and Yi [21, Theorem 5.1] show that the HJB equation admits a classical solution on the restricted solvency region:

Theorem 2.2

There exists a continuous function \(V:[0,T]\times \overline{ \mathcal{S}_{0}}\to \mathbb{R}\) such that \(V\in C^{1,2}(([0,T) \times \mathcal{S}_{0})\setminus F)\) and \(\partial _{t} V\leq 0\) which solves the HJB equation (2.2) in the classical sense. Here, the set \(F\) is given by

$$ F := \textstyle\begin{cases} \emptyset &\quad \textit{if }\pi _{M} < 1, \\ \{(t,b,s)\in [0,T)\times \mathcal{S}_{0}: b = 0\} &\quad \textit{if } \pi _{M} = 1, \\ \{(t,b,s)\in [0,T)\times \mathcal{S}_{0}: b = 0, t = t^{\mathrm{up}} \} &\quad \textit{if }\pi _{M} > 1, \end{cases} $$
(2.3)

where

$$ t^{\mathrm{up}} := T - \frac{\log (1+\lambda ) - \log (1-\mu )}{ \alpha -(1-p)\sigma ^{2}} $$
(2.4)

and \(\pi _{M} := \alpha /((1-p)\sigma ^{2})\) denotes the Merton fraction.

Let us emphasise here that the combination of the fact that \(\mathcal{V}\) is a viscosity solution of the HJB equation on \([0,T)\times \mathcal{S}\), the fact that \(V\) is a classical solution of the HJB equation on \([0,T)\times \mathcal{S}_{0}\), and the uniqueness result implied by the comparison principle in Theorem 2.1 does not imply that \(\mathcal{V} = V\) on \([0,T)\times \mathcal{S}_{0}\); some additional work is necessary to arrive at this conclusion. Indeed, while it is not too difficult to show that \(V\) is also a viscosity solution on \([0,T)\times \mathcal{S}_{0}\) (which is in particular immediate outside the set \(F\)), one would need to extend \(V\) to a continuous viscosity solution defined on the entire state space \([0,T)\times \mathcal{S}\) to apply the comparison theorem. Verifying that such an extension exists, however, requires additional work including a careful study of the behaviour of \(V\) (and its partial derivatives) as \(s\downarrow 0\). While we believe that it is possible to follow this direct approach, we shall nevertheless take a different route which has the advantage of additionally verifying optimality of our candidate optimal trading strategies. These candidate strategies are defined in terms of trading regions implied by the classical solution \(V\). More precisely, Theorem 2.2 allows us to define the reduced trading regions

$$\begin{aligned} \mathcal{R}_{0}^{\mathrm{buy}} &:= \{(t,b,s) \in [0,T)\times \mathcal{S}_{0}: \mathcal{L}^{\mathrm{buy}} V(t,b,s) = 0\}, \\ \mathcal{R}_{0}^{\mathrm{sell}} &:= \{(t,b,s) \in [0,T)\times \mathcal{S}_{0}: \mathcal{L}^{\mathrm{sell}} V(t,b,s) = 0\}, \\ \mathcal{R}_{0}^{\mathrm{nt}} &:= ([0,T)\times \mathcal{S}_{0}) \setminus (\mathcal{R}_{0}^{\mathrm{buy}}\cup \mathcal{R}_{0}^{ \mathrm{sell}}). \end{aligned}$$

Note that we must have \(\mathcal{L}^{\mathrm{nt}} V = 0\) on \(\mathcal{R}_{0}^{\mathrm{nt}}\). In order to construct the optimal strategies, it is important to determine the geometry of these sets and the location of the boundaries between them. Dai and Yi [21, Theorems 4.3, 4.5 and 4.7] provide the following characterisation of these free boundaries.

Theorem 2.3

1) There exist nonnegative, nonincreasing functions \(\underline{\pi }:[0,T)\to \mathbb{R}\) and \(\overline{\pi }:[0,T) \to \mathbb{R}\) with \(\underline{\pi }(t) < \overline{\pi }(t)\) for all \(t\in [0,T)\) such that

$$\begin{aligned} \mathcal{R}_{0}^{\mathrm{buy}} & = \bigg\{ (t,b,s) \in [0,T)\times \mathcal{S}_{0}: \frac{s}{b+s} \leq \underline{\pi }(t) \bigg\} , \\ \mathcal{R}_{0}^{\mathrm{sell}} & = \bigg\{ (t,b,s) \in [0,T)\times \mathcal{S}_{0}: \frac{s}{b+s} \geq \overline{\pi }(t) \bigg\} , \\ \mathcal{R}_{0}^{\mathrm{nt}} & = \bigg\{ (t,b,s) \in [0,T)\times \mathcal{S}_{0}: \underline{\pi }(t) < \frac{s}{b+s} < \overline{ \pi }(t)\bigg\} . \end{aligned}$$

Moreover, \(V\) is of class \(C^{\infty }\) on \(\mathcal{R}_{0}^{ \mathrm{nt}}\).

2) The function \(\underline{\pi }\) is continuous and satisfies

$$ \underline{\pi }(t) \textstyle\begin{cases} < 1 &\quad \textit{if }\pi _{M}\leq 1, \\ > 1 &\quad \textit{if }\pi _{M}>1, t < t^{\mathrm{up}}, \\ = 1 &\quad \textit{if }\pi _{M}>1, t = t^{\mathrm{up}}, \\ < 1 &\quad \textit{if }\pi _{M}>1, t > t^{\mathrm{up}}, \end{cases} $$

where \(t^{\mathrm{up}}\) is defined in (2.4). Furthermore, \(\underline{\pi }(t) = 0\) for \(t\in [t^{\mathrm{down}},T)\), where

$$ t^{\mathrm{down}} := T - \frac{\log (1+\lambda ) - \log (1-\mu )}{ \alpha }. $$

3) It holds that

$$ \overline{\pi }(t) \textstyle\begin{cases} < 1 &\quad \textit{if }\pi _{M}< 1, \\ = 1 &\quad \textit{if }\pi _{M}=1, \\ > 1 &\quad \textit{if }\pi _{M}>1, \end{cases} $$

and \(\overline{\pi }\in C^{\infty }([0,T))\).

Remark 2.4

A close inspection of the results in Dai and Yi [21] implies that

$$ \inf _{t\in [0,T)} |\overline{\pi }(t) - \underline{\pi }(t)| > 0. $$

Indeed, Dai and Yi [21, Sect. 5] show that the HJB equation (2.2) can be transformed into a double obstacle problem with obstacles given by \(1/(x+1+\lambda )\) (determining \(\underline{\pi }\)) and \(1/(x+1-\mu )\) (determining \(\overline{\pi }\)), respectively. Since \(V\) is continuous and the distance between the obstacles is strictly positive, this implies that the above infimum is also strictly positive.

Figures 1–3 below sketch the different scenarios for the location of the free boundaries. Note that \(t^{\mathrm{up}}\) is the time point at which the lower free boundary is equal to one, i.e., \(\underline{\pi }(t^{\mathrm{up}}) = 1\) (this may only happen if \(\pi _{M}>1\)), and \(t^{\mathrm{down}}\) is the time point from which onwards the lower free boundary is equal to zero, i.e., \(\underline{\pi }(t) = 0\) for all \(t\in [t^{\mathrm{down}},T)\).
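The Merton fraction and the two critical times are explicit in the model parameters, so they can be evaluated directly. The sketch below implements (2.4) and the formula for \(t^{\mathrm{down}}\) from Theorem 2.3; the function names are hypothetical, and restricting `t_up` to the case \(\pi _{M}>1\) is a choice made here because this is the only case in which \(t^{\mathrm{up}}\) is used.

```python
import math


def merton_fraction(alpha, sigma, p):
    """Merton fraction pi_M = alpha / ((1 - p) * sigma^2)."""
    return alpha / ((1.0 - p) * sigma ** 2)


def log_cost_spread(lam, mu):
    """log(1 + lam) - log(1 - mu), the numerator in t_up and t_down."""
    return math.log(1.0 + lam) - math.log(1.0 - mu)


def t_up(T, alpha, sigma, p, lam, mu):
    """Time at which the buy boundary equals one, cf. (2.4)."""
    denom = alpha - (1.0 - p) * sigma ** 2
    if denom <= 0.0:
        raise ValueError("t_up is only used when the Merton fraction exceeds 1")
    return T - log_cost_spread(lam, mu) / denom


def t_down(T, alpha, lam, mu):
    """Time from which onwards the buy boundary equals zero."""
    return T - log_cost_spread(lam, mu) / alpha
```

Both times can be negative when the horizon is short relative to the transaction costs. Since \(\alpha -(1-p)\sigma ^{2}<\alpha \), one always has \(t^{\mathrm{up}}<t^{\mathrm{down}}<T\) whenever \(\pi _{M}>1\).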

Fig. 1 The trading regions for \(\pi _{M}<1\)

For obvious reasons, we refer to \(\underline{\pi }\) and \(\overline{ \pi }\) as the buy and sell boundary, respectively. If our conjecture that the buy and sell boundaries characterise the optimal strategies is indeed correct (which will be rigorously proved in Sect. 4), Theorem 2.3 has the following implications:

1) If \(\pi _{M}<1\) (cf. Fig. 1), i.e., if borrowing is not optimal in the absence of costs, then it is also not optimal in the presence of costs.

2) If \(\pi _{M} = 1\) (cf. Fig. 2), i.e., if it is optimal to invest all money in the stock in the absence of transaction costs, then two cases must be distinguished in the presence of costs. If the initial position of the investor is such that \(b\leq 0\), then the bond position is closed and all money is kept in the stock (since \(\overline{\pi }=1\)). However, if the initial position is such that \(b>0\), then it is not optimal to close the bond position. This is because we force the investor to close the stock position at the terminal time \(T\), and hence it is too expensive to first buy shares of the stock at the initial time just to liquidate the stock position once the investment horizon is reached.

Fig. 2 The trading regions for \(\pi _{M}=1\)

3) If \(\pi _{M}>1\) (cf. Fig. 3), i.e., if borrowing is optimal without costs, we need to distinguish three cases. Since the investor never switches from borrowing to no-borrowing or vice versa after the initial transaction (\(\overline{\pi }>1\) and \(\underline{\pi }\) is nonincreasing), the initial transaction determines whether borrowing or no-borrowing is optimal:

(i) \(t^{\mathrm{up}} > 0\). In this case, borrowing is optimal since \(\underline{\pi}(0)>1\).

(ii) \(t^{\mathrm{up}} = 0\). If the initial position is such that \(b<0\), then borrowing is optimal. Otherwise the investor invests all her wealth in the stock (since \(\underline{\pi}(0) = 1\)).

(iii) \(t^{\mathrm{up}} < 0\). In this case, borrowing is optimal if \(b<0\) and no-borrowing is optimal if \(b\geq 0\). This is because \(\underline{\pi }(t)<1<\overline{\pi }(t)\) for all \(t\in [0,T)\).

Fig. 3 The trading regions for \(\pi _{M}>1\)

4) In any case, as soon as \(t\geq t^{\mathrm{down}}\), the investor refrains from buying shares of the stock since \(\underline{\pi }(t) = 0\).

5) If the initial position \((b,s)\) is such that \(s\leq 0\), then whenever \(\underline{\pi }(t)=0\), it is optimal to liquidate the stock position and refrain from further trading. Whenever \(\underline{ \pi }(t)>0\), the investor performs an initial transaction which takes her position to the boundary of the no-trading region. This is proved in Sect. 4, but intuitively this behaviour is clear: Since the excess return \(\alpha \) is positive and the investor has to liquidate her stock holdings at time \(T\), it should never be optimal to have a short position in the stock.

3 Construction of the optimal strategies

We proceed with the construction of the candidate optimal strategies. For this, we fix an arbitrary initial datum \((t_{0},b_{0},s_{0}) \in [0,T)\times \overline{\mathcal{S}}\). We first observe that if \((b _{0},s_{0})\in \partial \mathcal{S}\), then the only admissible and hence optimal strategy is to immediately close the position and refrain from further trading. The proof follows as in Shreve and Soner [37, Remark 2.1].

Lemma 3.1

Let \((b_{0},s_{0})\in \partial \mathcal{S}\). Then the only admissible strategy is to instantly jump to the position \((0,0)\) and remain there.

In what follows, we may thus assume \((b_{0},s_{0})\in \mathcal{S}\). For the construction of the optimal strategy, we need to prove the existence of nondecreasing processes \(L^{*} = (L^{*}(u))_{u \in [t,T]}\) and \(M^{*} = (M^{*}(u))_{u\in [t,T]}\) which turn the controlled wealth process \((B^{*},S^{*}) := (B^{L^{*},M^{*}}_{t_{0},b _{0}}, S^{L^{*},M^{*}}_{t_{0},s_{0}})\) into a diffusion reflected at the boundary of \(\mathcal{R}_{0}^{\mathrm{nt}}\). We first observe that we can without loss of generality assume that the initial position \((t_{0},b_{0},s_{0})\) is an element of the closure of the no-trading region \(\mathcal{R}_{0}^{\mathrm{nt}}\). Indeed, if \((t_{0},b_{0},s _{0})\not \in \overline{\mathcal{R}_{0}^{\mathrm{nt}}}\), then we can find \((b^{*},s^{*})\) and (minimal) \(\ell ,m\geq 0\) such that \((t_{0}, b^{*}, s^{*})\in \partial \mathcal{R}_{0}^{\mathrm{nt}}\) and

$$ b^{*} = b_{0} - (1+\lambda )\ell + (1-\mu )m,\qquad s^{*} = s_{0} + \ell - m. $$

With this, if \((L^{*},M^{*})\) is the candidate optimal strategy for \((t_{0},b^{*},s^{*})\), the pair \((L^{*} + \ell , M^{*} + m)\) is the candidate optimal strategy for \((t_{0},b_{0},s_{0})\). In other words, by a suitable initial transaction, we can always ensure that we start within the closure of the no-trading region.
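The initial transaction can be computed in closed form once the boundary values at time \(t_{0}\) are known. The following sketch assumes a position with \(s_{0}>0\) in the reduced solvency region and treats the buy and sell boundaries \(\underline{\pi }(t_{0})\) and \(\overline{\pi }(t_{0})\) as given numbers `pi_buy` and `pi_sell`; in practice they have to be obtained from the free boundaries of \(V\). It uses the characterisation of the trading regions via the fraction \(s/(b+s)\) from Theorem 2.3 and solves \(s^{*}/(b^{*}+s^{*}) = \underline{\pi }(t_{0})\) or \(s^{*}/(b^{*}+s^{*}) = \overline{\pi }(t_{0})\) for the transaction size. The function name is hypothetical.

```python
def initial_transaction(b, s, pi_buy, pi_sell, lam, mu):
    """Return (ell, m, b_star, s_star) moving (b, s) to the closure of
    the no-trading region at a fixed time.

    Assumes s > 0 and b + s > 0, and that pi_buy < pi_sell are the
    boundary values of the fraction s / (b + s) at that time.
    """
    if s <= 0 or b + s <= 0:
        raise ValueError("sketch requires s > 0 and b + s > 0")
    if not pi_buy < pi_sell:
        raise ValueError("buy boundary must lie below sell boundary")
    frac = s / (b + s)
    ell = 0.0
    m = 0.0
    if frac < pi_buy:
        # Buy region: solve (s + ell) = pi_buy * (b + s - lam * ell).
        ell = (pi_buy * (b + s) - s) / (1 + pi_buy * lam)
    elif frac > pi_sell:
        # Sell region: solve (s - m) = pi_sell * (b + s - mu * m).
        denom = 1 - pi_sell * mu
        if denom <= 0:
            raise ValueError("sell boundary not reachable by selling")
        m = (s - pi_sell * (b + s)) / denom
    b_star = b - (1 + lam) * ell + (1 - mu) * m
    s_star = s + ell - m
    return ell, m, b_star, s_star
```

At most one of `ell` and `m` is nonzero, which reflects the minimality of the transaction: buying and selling at the same time would only incur additional costs.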

Next, we observe that in the following, we can rule out all cases in which the investor liquidates either the bond or the stock position at time \(t_{0}\) and refrains from further transactions. Comparing with Figs. 1–3 and recalling that we may assume \((t_{0},b_{0},s_{0})\in \overline{\mathcal{R}_{0}^{\mathrm{nt}}}\), these cases are

(a) \(\pi _{M}\) arbitrary, \(t_{0}\geq t^{\mathrm{down}}\) and \(s_{0}=0\);

(b) \(\pi _{M} = 1\) with \(s_{0}>0\) and \(b_{0}= 0\);

(c) \(\pi _{M} > 1\) with \(s_{0}>0\), \(b_{0}=0\), and \(t_{0}\geq t ^{\mathrm{up}}\);

(d) \(\pi _{M} > 1\) with \(s_{0}=0\), \(b_{0}>0\), and \(t_{0} = t ^{\mathrm{up}}\).

The remaining cases are given by

(e) \(\pi _{M} < 1\) with \(s_{0},b_{0}>0\);

(f) \(\pi _{M} = 1\) with \(s_{0},b_{0}>0\);

(g) \(\pi _{M} > 1\) with \(s_{0},b_{0}>0\) and \(t_{0}>t^{ \mathrm{up}}\);

(h) \(\pi _{M} > 1\) with \(s_{0}>0\), \(b_{0}<0\).

The cases (e)–(g) are no-borrowing cases, whereas we expect borrowing to be optimal in the case (h). It turns out that for the construction of the reflected diffusions, it is advantageous to consider the change of variables \(s/b\) in the no-borrowing case and \(s/(-b)\) in the borrowing case. For this, we define

$$ {\mathcal{S}}_{+} := \left \{(b,s)\in \mathcal{S}: b>0, s>0\right \}, \qquad {\mathcal{S}}_{-} := \left \{(b,s)\in \mathcal{S}: b< 0, s>0\right \}. $$

In the sequel, we work on the reduced state space \([0,T]\times {\mathcal{S}} _{+}\) in the no-borrowing cases (e)–(g) and \([0,T]\times {\mathcal{S}} _{-}\) in the borrowing case (h).

3.1 Construction in the no-borrowing case (e)

The main idea for the construction of the optimal strategy is to find a suitable transformation of the state space so that the problem of constructing an obliquely reflected diffusion in an unbounded and time-dependent cone simplifies to normal reflection in a time-dependent interval. The transformation and construction are based on ideas from Gerhold et al. [26], and hence we keep the exposition to a minimum. We restrict ourselves to the case \(p<1\), \(p\neq 0\) (power utility) and remark that the construction for the case \(p=0\) (log utility) follows similarly.

In the situation of case (e), i.e., \(\pi _{M} < 1\) with \(s_{0},b_{0}>0\), we first observe that \(V\in C^{1,2}([0,T)\times \mathcal{S}_{0})\) by Theorem 2.2. We define

$$ \ell (t) := \frac{\underline{\pi }(t)}{1-\underline{\pi }(t)}\qquad \text{and}\qquad u(t) := \frac{\overline{\pi }(t)}{1-\overline{ \pi }(t)}, $$
(3.1)

and note that \(\ell \) and \(u\) constitute the buy and sell boundaries under the change of variables \((b,s)\mapsto s/b\), i.e.,

$$ \mathcal{R}_{0}^{\mathrm{nt}} = \bigg\{ (t,b,s)\in [0,T)\times \overline{ \mathcal{S}} : \ell (t) < \frac{s}{b} < u(t)\bigg\} . $$

By Theorem 2.3, we see that \(\ell < u\), \(\ell \in C([0,T))\) and \(u\in C^{\infty }([0,T))\).
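The change of variables from risky fractions to stock-to-bond ratios can be illustrated with a short sketch. It is only a numerical illustration at one fixed time point: the function names and the boundary values `pi_low`, `pi_high` are hypothetical and not taken from the paper, and the sketch assumes \(b>0\), \(s>0\) and risky fractions in \([0,1)\), as in the no-borrowing case (e).

```python
def ratio_from_fraction(pi):
    """Map a risky fraction pi in [0, 1) to the stock-to-bond ratio pi / (1 - pi)."""
    if not 0.0 <= pi < 1.0:
        raise ValueError("risky fraction must lie in [0, 1)")
    return pi / (1.0 - pi)


def in_no_trade_region(b, s, pi_low, pi_high):
    """Check ell < s / b < u for positive bond and stock holdings."""
    if b <= 0.0 or s <= 0.0:
        raise ValueError("this sketch only covers b > 0 and s > 0")
    ell = ratio_from_fraction(pi_low)   # buy boundary in the variable s / b
    u = ratio_from_fraction(pi_high)    # sell boundary in the variable s / b
    return ell < s / b < u


# Hypothetical boundary values at some fixed time t (assumed for illustration).
pi_low, pi_high = 0.25, 0.6
b, s = 50.0, 50.0  # risky fraction s / (b + s) = 0.5
print(in_no_trade_region(b, s, pi_low, pi_high))
```

Since \(\pi \mapsto \pi /(1-\pi )\) is increasing on \([0,1)\), testing \(\ell < s/b < u\) is equivalent to testing whether the risky fraction \(s/(b+s)\) lies strictly between the two boundary fractions.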

On the set \([0,T]\times \mathcal{S}_{+}\) we consider the transformation

$$ V(t,b,s) = b^{p}\exp \bigg( - p\int _{\log \frac{s}{bu(t)}}^{0} w(t,y)\, \mathrm{d}y \bigg) = b^{p}\exp \bigg( - p\int _{x}^{0} w(t,y)\, \mathrm{d}y \bigg), $$

where

$$ x = x(t,b,s) := \log \frac{s}{bu(t)}. $$

With this and using that \(V\) satisfies \(\mathcal{L}^{\mathrm{buy}}V \geq 0\) and \(\mathcal{L}^{\mathrm{sell}} V\geq 0\), we see that \(w\) satisfies

$$ 1-\mu \leq \frac{w(t,x)}{u(t)(1-w(t,x))e^{x}} \leq 1+\lambda , $$
(3.2)

and equality holds if and only if \(\mathcal{L}^{\mathrm{sell}} V = 0\) or \(\mathcal{L}^{\mathrm{buy}} V = 0\), respectively. Moreover, since \(\mathcal{L}^{\mathrm{nt}}V = 0\) whenever \(\mathcal{L}^{\mathrm{buy}} V > 0\) and \(\mathcal{L}^{\mathrm{sell}} V > 0\), we have

$$\begin{aligned} 0 &=\int _{x}^{0} \partial _{t}w(t,y)\,\mathrm{d}y - \bigg(\alpha -\frac{1}{2} \sigma ^{2} - \frac{u'(t)}{u(t)}\bigg)w(t,x) \\ & \phantom{=:}- \frac{1}{2}p\sigma ^{2}w(t,x)^{2} - \frac{1}{2}\sigma ^{2}\partial _{x}w(t,x) \end{aligned}$$

whenever \(w/(u(1-w)e^{x}) \not \in \{1-\mu ,1+\lambda \}\). Taking the derivative with respect to \(x\) in the last equation, we obtain

$$\begin{aligned} \frac{1}{2}\sigma ^{2}\partial _{x}^{2}w(t,x) & = - \partial _{t}w(t,x) - \bigg(\alpha - \frac{1}{2}\sigma ^{2} - \frac{u'(t)}{u(t)}\bigg) \partial _{x}w(t,x) \\ & \phantom{=:}- p\sigma ^{2}w(t,x)\partial _{x}w(t,x). \end{aligned}$$

Consider again the fraction in (3.2), i.e.,

$$ f(t,x) := \frac{w(t,x)}{u(t)(1-w(t,x))e^{x}}. $$

Since by (3.1) the points \(x = 0\) and \(x = \log ( \ell (t)/u(t))\) constitute the boundary points of the no-trading region in the new variables, we must have

$$\begin{aligned} f(t,x) &= 1-\mu \qquad \text{if }x\geq 0, \end{aligned}$$
(3.3)
$$\begin{aligned} f(t,x) &= 1+\lambda \qquad \text{if } x\leq \log \frac{\ell (t)}{u(t)}. \end{aligned}$$
(3.4)

Moreover, note that for the point \(x = \log (\ell (t)/u(t))\), these considerations are only valid for \(t\in [0,t^{\mathrm{down}})\) since otherwise \(\log (\ell (t)/u(t)) = -\infty \).
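The boundary conditions (3.3) and (3.4) determine \(w\) at the two boundary points explicitly: solving \(f = c\) for \(w\) gives \(w = k/(1+k)\) with \(k = c\,u(t)e^{x}\). The following sketch checks this rearrangement numerically. The cost parameters and the boundary values `ell`, `u` are hypothetical and chosen only for illustration.

```python
import math


def f(w, u, x):
    """The ratio f = w / (u * (1 - w) * exp(x)) appearing in (3.2)."""
    return w / (u * (1.0 - w) * math.exp(x))


def w_on_boundary(c, u, x):
    """Solve f(w, u, x) = c for w, assuming c * u * exp(x) > 0."""
    k = c * u * math.exp(x)
    return k / (1.0 + k)


# Hypothetical transaction costs and boundary values ell < u at a fixed time t.
lam, mu = 0.01, 0.01
ell, u = 0.5, 1.5

w_sell = w_on_boundary(1.0 - mu, u, 0.0)                # sell boundary, x = 0
w_buy = w_on_boundary(1.0 + lam, u, math.log(ell / u))  # buy boundary, x = log(ell / u)
print(w_sell, w_buy)
```

In this form, \(w(t,0) = (1-\mu )u(t)/(1+(1-\mu )u(t))\) and \(w(t,\log (\ell (t)/u(t))) = (1+\lambda )\ell (t)/(1+(1+\lambda )\ell (t))\), so both boundary values lie in \((0,1)\) whenever \(0<\ell (t)<u(t)<\infty \).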

Remark 2

We have

$$ f(t,x) = \frac{w(t,x)}{u(t)(1-w(t,x))e^{x}} \in [1-\mu ,1+\lambda ] $$

and \(f(t,x)\in \{1-\mu ,1+\lambda \}\) inside the buy and sell regions. This suggests that

$$ f\bigl(t,X^{*}(t)\bigr)P^{1}(t) \qquad \text{with} \qquad X^{*}(t) = \log \frac{S^{*}(t)}{B^{*}(t)u(t)} $$

(where \((B^{*},S^{*})\) is the optimally controlled portfolio process) is the shadow price in our problem. This can be confirmed as in Gerhold et al. [26].

The next step is to construct a reflected diffusion in the time-dependent interval \([\log (\ell /u),0]\).

Lemma 3

There exist a process \(\Psi = (\Psi (t))_{t\in [t_{0},T)}\) and nondecreasing processes \(L = (L(t))_{t\in [t_{0},T)}\) and \(M = (M(t))_{t \in [t_{0},T)}\) such that \(L\) is constant on \([t^{\mathrm{down}},T)\) and

$$ \mathrm{d}\Psi (t) = \bigg(\alpha -\frac{1}{2}\sigma ^{2} - \frac{u'(t)}{u(t)}\bigg)\,\mathrm{d}t + \sigma \,\mathrm{d}W(t) + \mathrm{d}L(t) - \mathrm{d}M(t), $$
(3.5)

with

$$ \Psi (t_{0}) = \log \frac{s_{0}}{b_{0}u(t_{0})}, $$

and such that \(\Psi \) is a diffusion reflected on the boundaries of the time-dependent interval \([\log (\ell /u),0]\).

Proof

This follows from Słomiński and Wojciechowski [38, Theorem 3.3] together with Remark 2.4. □
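To get a feeling for the objects in (3.5), one can simulate a discretised analogue. The sketch below is not the construction behind the cited existence result; it is a simple Euler step followed by projection onto the current interval, which is a common way to approximate two-sided reflection. The drift, the volatility, the boundary functions and the time `t_down` are hypothetical placeholders, and the lower boundary is set to \(-\infty \) from `t_down` onwards to mimic the situation in which \(L\) stays constant.

```python
import math
import random


def simulate_reflected(x0, drift, sigma, lower, upper, t0, T, n_steps, seed=0):
    """Euler scheme with projection for a diffusion reflected in [lower(t), upper(t)].

    Returns a list of tuples (t, x, L, M, y), where y is the unreflected driver
    x0 + int drift dt + sigma * W and L, M are the accumulated pushes at the
    lower and upper boundary, so that x = y + L - M at every grid point.
    """
    rng = random.Random(seed)
    dt = (T - t0) / n_steps
    t, x, y, L, M = t0, x0, x0, 0.0, 0.0
    path = [(t, x, L, M, y)]
    for _ in range(n_steps):
        inc = drift(t) * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
        y += inc
        x_free = x + inc
        t += dt
        lo, hi = lower(t), upper(t)
        if x_free < lo:      # push up at the lower boundary
            L += lo - x_free
            x = lo
        elif x_free > hi:    # push down at the upper boundary
            M += x_free - hi
            x = hi
        else:
            x = x_free
        path.append((t, x, L, M, y))
    return path


# Hypothetical inputs (assumed for illustration only).
alpha, sigma, t_down, n_steps = 0.05, 0.2, 0.5, 1000
drift = lambda t: alpha - 0.5 * sigma ** 2          # u'(t)/u(t) taken to be zero here
lower = lambda t: -1.0 if t < t_down else -math.inf
upper = lambda t: 0.0
path = simulate_reflected(-0.2, drift, sigma, lower, upper, 0.0, 1.0, n_steps)
print(path[-1])
```

In the discretisation, the increments of `L` and `M` are exactly the amounts by which the unreflected step leaves the interval, which mirrors the roles of \(\mathrm{d}L\) and \(\mathrm{d}M\) in (3.5).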

Let us now define a process \(N = (N(t))_{t\in [t_{0},T)}\) by \(N({t_{0}}) = s_{0}/P^{1}({t_{0}})\) and, for \(t\in [t_{0},T)\),

$$\begin{aligned} \mathrm{d}N(t) &= N(t)\biggl(1 - w\Bigl(t,\log \frac{\ell (t)}{u(t)} \Bigr)\biggr)\,\mathrm{d}L(t)- N(t)\bigl(1 - w(t,0)\bigr)\,\mathrm{d}M(t). \end{aligned}$$
(3.6)

Remark 4

Comparing with Gerhold et al. [26], we interpret \(ue^{\Psi }\) as the optimal stock-to-bond ratio and \(N\) as the optimal cumulative number of shares of the stock bought up to time \(t\). Furthermore, again as in Gerhold et al. [26], the function \(w\) can be interpreted as the optimal risky fraction, and hence \(1-w\) coincides with the optimal fraction of wealth invested into the bond, which gives a nice interpretation for all the terms occurring in (3.6).

With this, we now have all the tools at hand to construct the optimal strategies. Let us define a process \(S^{*} = (S^{*}(t))_{t\in [t_{0},T)}\) through \(S^{*}(t) := N(t)P^{1}(t)\). Then \(S^{*}(t_{0}) = N(t_{0})P^{1}(t_{0}) = s_{0}\) and

$$\begin{aligned} \mathrm{d}S^{*}(t) &= \alpha S^{*}(t)\,\mathrm{d}t + \sigma S^{*}(t)\, \mathrm{d}W(t) \\ & \phantom{=:}+ S^{*}(t)\biggl(1 - w\Bigl(t,\log \frac{\ell (t)}{u(t)} \Bigr)\biggr)\,\mathrm{d}L(t) \\ & \phantom{=:}- S^{*}(t)\bigl(1 -w(t,0)\bigr)\,\mathrm{d}M(t),\qquad t \in [t_{0},T). \end{aligned}$$

Also define \(B^{*} = (B^{*}(t))_{t\in [t_{0},T)}\) by \(B^{*}(t) := {S^{*}(t)}e^{-\Psi (t)}/{u(t)}\) so that \(B^{*}({t_{0}}) = S^{*}({t _{0}})/(u(t_{0})e^{\Psi ({t_{0}})}) = b_{0}\) and

$$ \mathrm{d}B^{*}(t) = -w\bigg(t,\log \frac{\ell (t)}{u(t)}\bigg)B^{*}(t)\, \mathrm{d}L(t) + w(t,0)B^{*}(t)\,\mathrm{d}M(t),\qquad t\in [t_{0},T). $$

Using the definition of \(B^{*}\), (3.3) and (3.4), we see that

$$ w(t,0)B^{*}(t) = \frac{w(t,0){S^{*}(t)}}{u(t)e^{0}} = (1-\mu )S^{*}(t) \bigl(1-w(t,0)\bigr), $$

and similarly

$$\begin{aligned} w\bigg(t,\log \frac{\ell (t)}{u(t)}\bigg)B^{*}(t) &= \frac{w(t,\log ( \ell (t)/u(t))){S^{*}(t)}}{u(t)e^{\log (\ell (t)/u(t))}} \\ &= (1+\lambda )S^{*}(t)\biggl(1-w\Bigl(t,\log \frac{\ell (t)}{u(t)} \Bigr)\biggr). \end{aligned}$$

So in total, the dynamics of \(B^{*}\) simplify to

$$\begin{aligned} \mathrm{d}B^{*}(t) &= -(1+\lambda )S^{*}(t)\biggl(1 - w\Bigl(t,\log \frac{ \ell (t)}{u(t)}\Bigr)\biggr)\,\mathrm{d}L(t) \\ & \phantom{=:}+ (1-\mu )S^{*}(t)\bigl(1 - w(t,0)\bigr)\,\mathrm{d}M(t),\qquad t \in [t_{0},T). \end{aligned}$$

Hence, if we define

$$\begin{aligned} \mathrm{d}L^{*}(t) &= N(t)P^{1}(t)\biggl(1 - w\Bigl(t,\log \frac{ \ell (t)}{u(t)}\Bigr)\biggr)\,\mathrm{d}L(t),\qquad t\in [t_{0},T), \\ \mathrm{d}M^{*}(t) &= N(t)P^{1}(t)\bigl(1 - w(t,0)\bigr)\,\mathrm{d}M(t),\qquad t \in [t_{0},T), \end{aligned}$$

with \(L^{*}({t_{0}}) = 0\), \(M^{*}({t_{0}}) = 0\) and if we set (liquidation at the terminal time)

$$ L^{*}(T) := L^{*}(T-), \qquad M^{*}(T) := M^{*}(T-) + S^{*}(T-), $$

then \((L^{*},M^{*})\in \mathcal{A}(t_{0},b_{0},s_{0})\), and \((B^{*},S^{*})\) is a diffusion reflected at \(\partial \mathcal{R}_{0} ^{\mathrm{nt}}\).

3.2 Construction in the other cases

The construction in the other cases (f)–(h) follows in a similar way as the construction in the case (e). Let us outline the critical differences.

Assume first that we are in one of the no-borrowing cases (f) or (g). That is, we either have \(\pi _{M} = 1\) with \(s_{0},b_{0}>0\), or \(\pi _{M} > 1\) with \(s_{0},b_{0}>0\) and \(t_{0}>t^{\mathrm{up}}\). While the construction here is similar to the case (e), we have to be more careful since the upper boundary in terms of the transformation \(s/b\) is now equal to infinity which does not allow us to consider the transformation \(x = \log (s/(bu(t)))\). However, since the upper boundary is now equal to infinity, we deal with one-sided reflection which simplifies matters again (we never have to sell shares of the stock!). As before, we define the lower boundary as \(\ell (t):= \underline{\pi}(t)/(1-\underline{\pi}(t))\) and consider the slightly different transformation

$$ V(t,b,s) = b^{p}\exp \bigg( - p\int _{\log (s/b)}^{0} w(t,y)\,\mathrm{d}y \bigg). $$

Setting \(x=\log (s/b)\) and arguing in a similar fashion as before, the existence of the candidate optimal strategy follows. Note, however, that the process \(\Psi \) now has to be constructed without the \(u'(t)/u(t)\) term in its drift (see (3.5)).

Let us now turn to the borrowing case (h). That is, assume \(\pi _{M}>1\), \(s_{0}>0\), \(b_{0}<0\) and \(t_{0}\in [0,T)\). Since \(b_{0}<0\) and we want the optimally controlled bond wealth \(B^{*}\) to satisfy \(B^{*}<0\), we have to consider a different transformation. More precisely, we consider the transformation \(s/(-b)\) instead. We first define the trading boundaries to be

$$ \ell (t) := -\frac{\overline{\pi }(t)}{1-\overline{\pi }(t)} \qquad \text{and} \qquad u(t) := -\frac{\underline{\pi }(t)}{1-\underline{\pi }(t)}. $$

Theorem 2.3 implies \(0<\ell (t)<u(t)\leq \infty \), \(\ell \in C^{\infty }([0,T))\) and \(u\in C([0,T))\) (note, however, that in this case the lower boundary \(\ell \) is defined by means of \(\overline{\pi }\) instead of \(\underline{\pi }\)). The transformation of the function \(V\) is then chosen to be

$$ V(t,b,s) = (-b)^{p}\exp \bigg( -p\int _{0}^{\log \frac{s}{-b\ell (t)}} w(t,y)\,\mathrm{d}y \bigg), $$

where we restrict \(V\) to the set \(\mathcal{S}_{-}\). This leads to similar calculations as in the case (e), but with the lower boundary \(\ell \) in place of the upper boundary \(u\) in the drift term of the process \(\Psi \) (see (3.5)).

4 Verification and value function regularity

We now proceed by verifying that the strategies constructed in the previous section are indeed optimal. Since \(V\) is only defined on \([0,T]\times \overline{\mathcal{S}_{0}}\), classical verification arguments are difficult (it will turn out that \(\mathcal{V}\) is not sufficiently regular everywhere, see Theorem 4.14). Instead, we adapt the approach introduced in Christensen [14] for impulse control problems to our setting. The idea is to show that the value function is the pointwise minimum of a suitable set of superharmonic functions.

More precisely, let \(\mathbb{H}\) be the set of all continuous functions \(h:[0,T]\times \overline{\mathcal{S}}\to \overline{\mathbb{R}}\) satisfying the following properties:

(i) \(h(T,b,s) \geq \mathcal{V}(T,b,s)\) on \(\{T\}\times \overline{ \mathcal{S}}\).

(ii) \(h\) is nonincreasing in the direction of transactions, i.e., for all \((t,b,s) \in [0,T]\times \overline{\mathcal{S}}\) and \(\ell ,m\geq 0\) with \((b - (1+\lambda )\ell + (1-\mu )m,s+\ell -m)\in \overline{ \mathcal{S}}\), it holds that

$$ h(t,b,s) \geq h\big(t,b - (1+\lambda )\ell + (1-\mu )m,s+\ell -m \big). $$

(iii) \(h\) is space-time superharmonic with respect to the uncontrolled portfolio process. More precisely, denote by \((B^{0},S ^{0}) = (B^{0}_{t,b},S^{0}_{t,s})\) the wealth process corresponding to the strategy \(L\equiv M\equiv 0\) and let \(\vartheta \) be the first hitting time of \(\partial \mathcal{S}\). Then \(h\) is called space-time superharmonic if

$$ h(t,b,s) \geq \mathbb{E}\big[ h\bigl(\tau \wedge \vartheta , B^{0} _{t,b}(\tau \wedge \vartheta ), S^{0}_{t,s}(\tau \wedge \vartheta ) \bigr) \big] $$

for every \([t,T]\)-valued stopping time \(\tau \).

(iv) \(h\) satisfies the bounds

$$ U_{p}\big(b + \min \{(1-\mu )s,(1+\lambda )s\}\big)\leq h(t,b,s) \leq \varphi _{p}(t,b,s). $$
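Property (ii) and the lower bound in (iv) fit together: the liquidation value \(b + \min \{(1-\mu )s,(1+\lambda )s\}\) cannot increase through a transaction, because buying and selling back loses the spread. The sketch below checks this on a small grid. It assumes the normalisation \(U_{p}(x) = x^{p}/p\) for power utility; the function names and all numerical values are hypothetical.

```python
def transact(b, s, buy, sell, lam, mu):
    """Bond and stock wealth after buying `buy` and selling `sell` of stock wealth."""
    return b - (1.0 + lam) * buy + (1.0 - mu) * sell, s + buy - sell


def liquidation_value(b, s, lam, mu):
    """b + min{(1 - mu) s, (1 + lam) s}: bond wealth after closing the stock position."""
    return b + min((1.0 - mu) * s, (1.0 + lam) * s)


def power_utility(x, p):
    """U_p(x) = x**p / p for x > 0 and p < 1, p != 0 (assumed normalisation)."""
    return x ** p / p


# Hypothetical costs, utility parameter and positions (long and short stock).
lam, mu, p = 0.01, 0.01, 0.5
b = 100.0
for s in (-20.0, 0.0, 30.0):
    before = liquidation_value(b, s, lam, mu)
    for buy in (0.0, 5.0, 40.0):
        for sell in (0.0, 5.0, 40.0):
            b_new, s_new = transact(b, s, buy, sell, lam, mu)
            after = liquidation_value(b_new, s_new, lam, mu)
            print(s, buy, sell, power_utility(before, p) - power_utility(after, p))
```

Each printed difference is nonnegative up to rounding, which is the monotonicity in the direction of transactions for the function \((b,s)\mapsto U_{p}(b + \min \{(1-\mu )s,(1+\lambda )s\})\).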

We expect that \(\mathcal{V}\) is the pointwise minimum of the elements of \(\mathbb{H}\). If this is true, we can prove the optimality of \((L^{*},M^{*})\) as follows:

1) Show that every \(h\in \mathbb{H}\) dominates \(\mathcal{V}\).

2) Define the function

$$ h_{0}(t,b,s) := \mathbb{E}\bigl[U_{p}\bigl(X_{t,b,s}^{L^{*},M^{*}}(T) \bigr)\bigr] $$

and show that \(h_{0}\in \mathbb{H}\).

It then follows that \(\mathcal{V} \leq h_{0}\), but \(h_{0} \leq \mathcal{V}\) since \((L^{*},M^{*})\) is admissible. Hence \(h_{0}= \mathcal{V}\) and \((L^{*},M^{*})\) is optimal.

The advantage of this approach is that \(h_{0}\) (and hence \(\mathcal{V}\)) need not be of class \(C^{1,2}\) everywhere for the above argument to work. Indeed, in order to show that \(h_{0}\) satisfies the monotonicity property (ii), we only require piecewise continuous differentiability in certain spatial directions for each fixed time point \(t\), and to verify the superharmonicity property (iii), we only require some regularity along the paths of the uncontrolled state process, which naturally avoid the points of degeneracy of the infinitesimal generator \(\mathcal{L}^{\mathrm{nt}}\) (and hence the points on the \(b\)- and \(s\)-axes where regularity fails).

In Lemma 4.1 below, we show that every \(h\in \mathbb{H}\) dominates \(\mathcal{V}\). Then we proceed by analysing the regularity of \(h_{0}\) and use these results to show that \(h_{0}\) is superharmonic (Proposition 4.11) and nonincreasing in the direction of transactions (Proposition 4.12). The optimality of \((L^{*},M ^{*})\) then follows in Theorem 4.13.

Lemma 1

Let \(h\in \mathbb{H}\). Then \(\mathcal{V}\leq h\).

Proof

We show that \(h\) is a viscosity supersolution of the HJB equation (2.2). By Theorem 2.1 (comparison principle), it then follows that for every \(\varepsilon >0\), we have \(\mathcal{V}(t,b,s) \leq h(t,b+\varepsilon ,s)\) everywhere, and by the continuity of \(h\), we can send \(\varepsilon \downarrow 0\) to conclude.

Let us therefore fix \((t_{0},b_{0},s_{0})\in [0,T)\times \mathcal{S}\) and let \(\varphi \in C^{1,2}([0,T)\times \mathcal{S})\) be such that \(\varphi \leq h\) and \(\varphi (t_{0},b_{0},s_{0}) = h(t_{0},b_{0},s _{0})\). We have to show that

$$ \min \{\mathcal{L}^{\mathrm{nt}}\varphi (t_{0},b_{0},s_{0}), \mathcal{L}^{\mathrm{buy}}\varphi (t_{0},b_{0},s_{0}), \mathcal{L} ^{\mathrm{sell}}\varphi (t_{0},b_{0},s_{0})\} \geq 0. $$

Let \(\ell >0\) be such that \((b_{0} - (1+\lambda )\ell , s_{0} + \ell ) \in \mathcal{S}\). Then

$$\begin{aligned} &\varphi (t_{0},b_{0},s_{0}) - \varphi \big(t_{0},b_{0} - (1+\lambda )\ell , s_{0} + \ell \big) \\ &\geq h(t_{0},b_{0},s_{0}) - h\big(t_{0},b_{0} - (1+\lambda )\ell , s_{0} + \ell \big) \geq 0 \end{aligned}$$

since \(\varphi (t_{0},b_{0},s_{0}) = h(t_{0},b_{0},s_{0})\), \(\varphi \leq h\) and \(h\) is nonincreasing in the direction of transactions. Now divide by \(\ell \) and send \(\ell \downarrow 0\) to obtain \(\mathcal{L}^{\mathrm{buy}}\varphi (t_{0},b_{0},s_{0}) \geq 0\). By similar arguments, we have \(\mathcal{L}^{\mathrm{sell}}\varphi (t _{0},b_{0},s_{0}) \geq 0\), and hence it only remains to show that \(\mathcal{L}^{\mathrm{nt}}\varphi (t_{0},b_{0},s_{0}) \geq 0\).

Suppose that on the contrary, we have \(\mathcal{L}^{\mathrm{nt}} \varphi (t_{0},b_{0},s_{0}) < 0\). Then there exist \(\varepsilon , \delta > 0\) such that \(t_{0}+\varepsilon < T\), \(\overline{B}_{\varepsilon }(b_{0},s_{0})\subseteq \mathcal{S}\) and \(\mathcal{L}^{\mathrm{nt}} \varphi (t,b,s) < -\delta \) for all \((t,b,s)\in [t_{0},t_{0}+\varepsilon ]\times \overline{B}_{\varepsilon }(b_{0},s_{0})\). Now define the stopping time

$$ \tau _{\varepsilon }:= \inf \big\{ u\geq t_{0}: \bigl(B^{0}_{t_{0},b _{0}}(u),S^{0}_{t_{0},s_{0}}(u)\bigr)\not \in \overline{B}_{\varepsilon }(b_{0},s_{0})\big\} \wedge (t_{0}+\varepsilon ). $$

Since \(h\) is space-time superharmonic and by Itô’s formula, we have

$$\begin{aligned} \varphi (t_{0},b_{0},s_{0}) &= h(t_{0},b_{0},s_{0}) \\ &\geq {\mathbb{E}}\bigl[h\bigl(\tau _{\varepsilon }, B^{0}_{t_{0},b _{0}}(\tau _{\varepsilon }),S^{0}_{t_{0},s_{0}}(\tau _{\varepsilon }) \bigr)\bigr] \\ & \geq {\mathbb{E}}\bigl[\varphi \bigl(\tau _{\varepsilon }, B^{0} _{t_{0},b_{0}}(\tau _{\varepsilon }),S^{0}_{t_{0},s_{0}}( \tau _{\varepsilon })\bigr)\bigr] \\ &= \varphi (t_{0},b_{0},s_{0}) - {\mathbb{E}}\bigg[\int _{t_{0}}^{ \tau _{\varepsilon }}\mathcal{L}^{\mathrm{nt}}\varphi \bigl(u, B^{0} _{t_{0},b_{0}}(u),S^{0}_{t_{0},s_{0}}(u)\bigr)\,\mathrm{d}u\bigg], \end{aligned}$$

i.e., \({\mathbb{E}}[ \int _{t_{0}}^{\tau _{\varepsilon }}\mathcal{L} ^{\mathrm{nt}}\varphi (u, B^{0}_{t_{0},b_{0}}(u),S^{0}_{t_{0},s_{0}}(u))\, \mathrm{d}u] \geq 0\). This, however, must imply that

$$ \max _{u\in [t_{0},t_{0}+\varepsilon ],\; (b,s)\in \overline{B}_{ \varepsilon }(b_{0},s_{0})} \mathcal{L}^{\mathrm{nt}}\varphi (u,b,s) \geq 0. $$

Sending \(\varepsilon \downarrow 0\) implies that \(\mathcal{L}^{ \mathrm{nt}}\varphi (t_{0},b_{0},s_{0}) \geq 0\) which is a contradiction. □

Remark 2

From the superharmonicity and the monotonicity in the direction of transactions, it is easily seen that \(h(\cdot ,B^{L,M},S^{L,M})\) is a supermartingale for any \(h\in \mathbb{H}\) and any piecewise constant trading strategy \((L,M)\) and thus

$$ h(t,b,s) \geq {\mathbb{E}}\bigl[U_{p}\bigl(X_{t,b,s}^{L,M}(T)\bigr) \bigr] \;\, \text{for any piecewise constant }(L,M)\in \mathcal{A}(t,b,s). $$

In the context of the closely related stochastic Perron method, one studies instead the set of all functions \(\bar{h}\) for which \(\bar{h}(\cdot ,B^{L,M},S^{L,M})\) is a supermartingale for all \((L,M)\in \mathcal{A}(t,b,s)\) and shows that the pointwise infimum of this set coincides with the value function as a means of obtaining the viscosity characterisation of \(\mathcal{V}\).

In Sect. 3 and Lemma 3.1, we have constructed candidate optimal strategies \((L^{*},M^{*}) = (L^{*} _{t,b,s}(u),M^{*}_{t,b,s}(u))_{u\in [t,T]}\) for every \((t,b,s)\in [0,T) \times \overline{\mathcal{S}}\). Moreover, it is obvious that a candidate optimal strategy \((L^{*}_{T,b,s}(u),M^{*}_{T,b,s}(u))\) is the strategy which merely liquidates the stock position \(s\). This allows us to define the function

$$ h_{0}(t,b,s) := \mathbb{E}\bigl[U_{p}\bigl(X_{t,b,s}^{L^{*},M^{*}}(T) \bigr)\bigr], \qquad (t,b,s)\in [0,T]\times \overline{\mathcal{S}}. $$
(4.1)

Our next aim is to show that \(h_{0}\in \mathbb{H}\) and hence \(h_{0} = \mathcal{V}\) and \((L^{*},M^{*})\) is optimal. As a first step, we show that \(h_{0}\) coincides with \(V\) on \([0,T]\times \mathcal{S}_{0}\).

Proposition 3

The function \(h_{0}\) defined in (4.1) coincides with the classical solution \(V\) of the HJB equation on the reduced state space \([0,T]\times \mathcal{S}_{0}\).

Proof

Let \((t,b,s)\in [0,T)\times \mathcal{S}_{0}\). If \((t,b,s)\) is such that we are in one of the liquidation cases (a), (b) or (c), then direct computations reveal that \(h_{0}(t,b,s)=V(t,b,s)\) since \(V\) is explicitly known at these points (cf. Dai and Yi [21, Proposition 3.2]). For example, assume that \(p=0\), \(\pi _{M}>1\) and \((t,b,s) = (t^{\mathrm{up}},0,s)\). Then Dai and Yi [21, Proposition 3.2] show that

$$\begin{aligned} V(t,b,s) &= \log s + \log (1-\mu ) + \bigg(\alpha -\frac{1}{2}\sigma ^{2}\bigg)(T-t) \\ &= {\mathbb{E}}\bigl[\log \bigl((1-\mu )S^{L^{*},M^{*}}_{t,s}(T) \bigr)\bigr] \\ &= {\mathbb{E}}\bigl[U_{0}\bigl(b + (1-\mu )S^{L^{*},M^{*}}_{t,s}(T) \bigr)\bigr] = h_{0}(t,b,s). \end{aligned}$$

We therefore exclude these cases in the sequel. For ease of notation, we denote the controlled processes \((B^{L^{*},M^{*}}_{t,b,s},S^{L^{*},M ^{*}}_{t,b,s})\) by \((B^{*},S^{*})\).

First, let us remark that \((t,b-(1+\lambda )L^{*}(t) + (1-\mu )M^{*}(t),s+L ^{*}(t)-M^{*}(t))\) is contained in \([0,T)\times \mathcal{S}_{0}\) since \((t,b,s) \in [0,T)\times \mathcal{S}_{0}\) and \(\mathcal{R}_{0}^{\mathrm{nt}}\subseteq [0,T) \times \mathcal{S}_{0}\). By the fundamental theorem of calculus for line integrals, we thus have

$$ V(t,b,s) = V\bigl(t,b-(1+\lambda )L^{*}(t) + (1-\mu )M^{*}(t),s+L^{*}(t)-M ^{*}(t)\bigr) $$

since \(\mathrm{d}L^{*}(t)\neq 0\) only if \((t,b,s)\in \mathcal{R}_{0} ^{\mathrm{buy}}\) and \(\mathrm{d}M^{*}(t)\neq 0\) only if \((t,b,s) \in \mathcal{R}_{0}^{\mathrm{sell}}\). For all \(n\in \mathbb{N}\), define a stopping time

$$ \tau _{n} := \inf \bigg\{ u\geq t: \int _{t}^{u} \bigl|\sigma S^{*}(r) \partial _{s}V\bigl(r,B^{*}(r),S^{*}(r)\bigr)\bigr|^{2}\,\mathrm{d}r \geq n\bigg\} \wedge T. $$

Note that \((\cdot ,B^{*},S^{*})\in ([0,T]\times \mathcal{S}_{0}) \setminus F\) after the initial transaction since we have ruled out the liquidation cases. Thus, using that \(V\) is in \(C^{1,2}(([0,T)\times \mathcal{S}_{0})\setminus F)\) and solves the HJB equation, we obtain

$$\begin{aligned} V(t,b,s) &= V\bigl(t,b-(1+\lambda )L^{*}(t) + (1-\mu )M^{*}(t),s+L ^{*}(t)-M^{*}(t)\bigr) \\ & = V\bigl(\tau _{n},B^{*}(\tau _{n}),S^{*}(\tau _{n})\bigr)- \int _{t} ^{\tau _{n}} \sigma S^{*}(u)\partial _{s}V\bigl(u,B^{*}(u),S^{*}(u) \bigr)\,\mathrm{d}W(u). \end{aligned}$$

Taking expectations on both sides shows by the definition of \(\tau _{n}\) that

$$ V(t,b,s) = \mathbb{E}\bigl[V\bigl(\tau _{n},B^{*}(\tau _{n}),S^{*}(\tau _{n})\bigr)\bigr]. $$

We are left with showing that

$$ \lim _{n\to \infty }\mathbb{E}\bigl[V\bigl(\tau _{n},B^{*}(\tau _{n}),S ^{*}(\tau _{n})\bigr)\bigr] = \mathbb{E}\bigl[U_{p}\bigl(X_{t,b,s}^{L ^{*},M^{*}}(T)\bigr)\bigr] = h_{0}(t,b,s). $$
(4.2)

We first note that there exist constants \(C_{1},C_{2} > 0\) such that

$$ C_{1}\mathcal{V}(t,b,s) \leq V(t,b,s) \leq C_{2} \varphi _{p}(t,b,s)\qquad \text{for all $(t,b,s)\in \overline{ \mathcal{R}_{0}^{\mathrm{nt}}}$}. $$
(4.3)

To see this, assume that \(p\neq 0\) (the case \(p=0\) is similar). Since \(V\), \(\mathcal{V}\) and \(\varphi _{p}\) are homogeneous of order \(p\), we can write

$$\begin{aligned} V(t,b,s) &= (b+s)^{p} V(t,1-\pi ,\pi ), \\ \mathcal{V}(t,b,s) &= (b+s)^{p} \mathcal{V}(t,1-\pi ,\pi ), \\ \varphi _{p}(t,b,s) &= (b+s)^{p} \varphi _{p}(t,1-\pi ,\pi ), \end{aligned}$$

where \(\pi = \pi (b,s) := s/(b+s)\). But \(\pi (b,s)\) is bounded on \(\overline{\mathcal{R}_{0}^{\mathrm{nt}}}\), and hence so are \(V(t,1-\pi ,\pi )\), \(\mathcal{V}(t,1-\pi ,\pi )\) and \(\varphi _{p}(t,1- \pi ,\pi )\), from which we infer (4.3).

Case 1: \(p\in (0,1)\). We claim that \((V(\tau _{n},B^{*}(\tau _{n}),S ^{*}(\tau _{n})))_{n\in \mathbb{N}}\) is uniformly integrable, in which case (4.2) holds. Let \(\varepsilon >0\) be such that \(p(1+\varepsilon )<1\). Then

$$\begin{aligned} 0 &\leq \mathbb{E}\big[\bigl|V\bigl(\tau _{n},B^{*}(\tau _{n}),S^{*}(\tau _{n})\bigr)\bigr|^{1+\varepsilon }\big] \\ &\leq (C_{2})^{1+\varepsilon }\mathbb{E}\big[\bigl|\varphi _{p}\bigl( \tau _{n},B^{*}(\tau _{n}),S^{*}(\tau _{n})\bigr)\bigr|^{1+\varepsilon } \big] \\ &= (C_{2})^{1+\varepsilon }\frac{1+\varepsilon }{p^{\varepsilon }} \mathbb{E}\bigl[\varphi _{p(1+\varepsilon )}\bigl(\tau _{n},B^{*}(\tau _{n}),S^{*}(\tau _{n})\bigr)\bigr] \\ &\leq (C_{2})^{1+\varepsilon } \frac{1+\varepsilon }{p^{\varepsilon }} \varphi _{p(1+\varepsilon )}(t,b,s). \end{aligned}$$

Here, we have used that \(U_{p}^{1+\varepsilon } = (1+\varepsilon )p^{-\varepsilon }U _{p(1+\varepsilon )}\) on \((0,\infty )\) for the equality in the third line, and the fact that \(\varphi _{p(1+\varepsilon )}(\cdot ,B^{*},S^{*})\) is a supermartingale (compare with the proof of Lemma 2.2 in Belak et al. [9]) to arrive at the last inequality.
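The rescaling identity behind the third line, \(U_{p}^{1+\varepsilon } = (1+\varepsilon )p^{-\varepsilon }U_{p(1+\varepsilon )}\) on \((0,\infty )\) for \(p\in (0,1)\), is elementary and can be checked numerically. The sketch assumes the normalisation \(U_{p}(x) = x^{p}/p\); the values of `p`, `eps` and the sample points are arbitrary choices with \(p(1+\varepsilon )<1\).

```python
import math


def U(p, x):
    """Power utility U_p(x) = x**p / p for x > 0 (assumed normalisation)."""
    return x ** p / p


p, eps = 0.4, 0.5  # p * (1 + eps) = 0.6 < 1
for x in (0.1, 1.0, 7.5):
    lhs = U(p, x) ** (1.0 + eps)
    rhs = (1.0 + eps) * p ** (-eps) * U(p * (1.0 + eps), x)
    print(x, lhs, rhs, math.isclose(lhs, rhs))
```

The identity follows from \((x^{p}/p)^{1+\varepsilon } = x^{p(1+\varepsilon )}/p^{1+\varepsilon }\) and \(U_{p(1+\varepsilon )}(x) = x^{p(1+\varepsilon )}/(p(1+\varepsilon ))\).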

Case 2: \(p<0\). We write

Since monotone convergence gives

we only have to show that . But this follows from admissibility of \((L^{*},M^{*})\) and monotone convergence since

Case 3: \(p=0\). This follows in a similar fashion as in Cases 1 and 2 by splitting \(V(\tau _{n},B^{*}(\tau _{n}),S^{*}(\tau _{n}))\) into its positive and negative parts and using that we have the estimate \(-x ^{-q}/q \leq \log x \leq x^{q}/q\) for every \(q\in (0,1)\) and \(x>0\). □

Proposition 4.3 proves the regularity of \(h_{0}\) on \([0,T)\times \mathcal{S}_{0}\). The following lemmas investigate the regularity of \(h_{0}\) for \(s\leq 0\).

Lemma 4

It holds that \(h_{0}\in C^{1,2}([0,t^{\mathrm{down}})\times ( \mathcal{S}\setminus \mathcal{S}_{0}))\). Moreover, for every \((t,b,s)\in [0,t^{\mathrm{down}})\times (\mathcal{S}\setminus \mathcal{S}_{0})\) and \(\ell ^{*}>0\) with \((t,b-(1+\lambda )\ell ^{*},s+ \ell ^{*})\in \mathcal{R}_{0}^{\mathrm{buy}}\) and \(s+\ell ^{*}>0\), we have

$$\begin{aligned} \partial _{t} h_{0}(t,b,s) &= \partial _{t} V(t,b,s)|_{(b,s) = (b-(1+ \lambda )\ell ^{*},s+\ell ^{*})}, \\ \partial _{b} h_{0}(t,b,s) &= \partial _{b} V(t,b,s)|_{(b,s) = (b-(1+ \lambda )\ell ^{*},s+\ell ^{*})}, \\ \partial _{s} h_{0}(t,b,s) &= \partial _{s} V(t,b,s)|_{(b,s) = (b-(1+ \lambda )\ell ^{*},s+\ell ^{*})}, \\ \partial _{s}^{2} h_{0}(t,b,s) &= \partial _{s}^{2} V(t,b,s)|_{(b,s) = (b-(1+\lambda )\ell ^{*},s+\ell ^{*})}. \end{aligned}$$

Proof

The idea is to bootstrap the regularity of \(h_{0}\) for nonpositive \(s\) from the regularity of \(h_{0}=V\) for positive \(s\) inside the buy region \(\mathcal{R}_{0}^{\mathrm{buy}}\); see Fig. 4.

Fig. 4: Bootstrapping the regularity of \(h_{0}\)

Fix \((t_{0},b_{0},s_{0})\in [0,T]\times (\mathcal{S}\setminus \mathcal{S}_{0})\) (in particular \(s_{0}\leq 0\)) with \(t_{0}< t^{ \mathrm{down}}\) and let \(\delta >0\) be such that \(t_{0}+\delta < t^{ \mathrm{down}}\) and \(B_{\delta }(b_{0},s_{0})\subseteq \mathcal{S}\). By making \(\delta \) smaller if necessary, we may furthermore assume that \([t_{0},t_{0}+\delta )\times B_{\delta }(b_{0},s_{0})\subseteq \mathcal{R}_{0}^{\mathrm{buy}}\). Now, for every \((t,b,s)\in [t_{0},t _{0}+\delta )\times B_{\delta }(b_{0},s_{0})\), there exists some \(\ell _{0}>0\) such that \((b-(1+\lambda )\ell _{0},s+\ell _{0})\in \partial \mathcal{R}_{0}^{\mathrm{nt}}\cap \partial \mathcal{R}_{0}^{ \mathrm{buy}}\) and \(s+\ell _{0} > 0\) (since \(t< t^{\mathrm{down}}\) and hence \(\underline{\pi }(t)>0\) by Theorem 2.3). Moreover, by the construction of \(h_{0}\),

$$ h_{0}(t, b, s) = h_{0}\big(t, b - (1+\lambda ) \ell ,s+\ell \big) $$

for every \(\ell \in [0,\ell _{0}]\). By the monotonicity of the buy boundary, making \(\delta \) even smaller if necessary, we can therefore find some \(\ell ^{*}\in (0,\ell _{0})\) such that the interval \([t _{0},t_{0}+\delta )\times B_{\delta }(b_{0}-(1+\lambda )\ell ^{*}, s _{0}+\ell ^{*})\) is contained in the interior of \(\mathcal{R}_{0}^{ \mathrm{buy}}\) and \(s > 0\) for all \((b,s)\in B_{\delta }(b_{0}-(1+ \lambda )\ell ^{*}, s_{0}+\ell ^{*})\). Note that by construction and Proposition 4.3, we have

$$ h_{0}(t,b,s) = h_{0}\big(t,b - (1+\lambda )\ell ^{*}, s+\ell ^{*}\big) = V\big(t,b - (1+\lambda )\ell ^{*}, s+\ell ^{*}\big) $$

for all \((t,b,s)\in [t_{0},t_{0}+\delta )\times B_{\delta }(b_{0},s _{0})\). Since

$$ V\in C^{1,2}\Big([t_{0},t_{0}+\delta )\times B_{\delta }\big(b_{0}-(1+ \lambda )\ell ^{*}, s_{0}+\ell ^{*}\big)\Big), $$

the result follows. □

Proposition 5

The function \(h_{0}\) satisfies

$$ \mathcal{L}^{\mathrm{nt}}h_{0}(t,b,s) \geq 0,\qquad \mathcal{L}^{ \mathrm{buy}}h_{0}(t,b,s) = 0\qquad \textit{and}\qquad \mathcal{L} ^{\mathrm{sell}}h_{0}(t,b,s) > 0 $$

in the classical sense on \([0,t^{\mathrm{down}})\times (\mathcal{S} \setminus \mathcal{S}_{0})\).

Proof

It follows immediately from Lemma 4.4 for all \((t,b,s)\in [0,t^{\mathrm{down}})\times (\mathcal{S}\setminus \mathcal{S}_{0})\) that

$$\begin{aligned} \mathcal{L}^{\mathrm{buy}}h_{0}(t,b,s) &= \mathcal{L}^{\mathrm{buy}} V \big(t,b-(1+\lambda )\ell ^{*},s+\ell ^{*}\big) = 0, \end{aligned}$$
(4.4)
$$\begin{aligned} \mathcal{L}^{\mathrm{sell}}h_{0}(t,b,s) &= \mathcal{L}^{ \mathrm{sell}} V\big(t,b-(1+\lambda )\ell ^{*},s+\ell ^{*}\big) > 0, \end{aligned}$$
(4.5)

for a suitable choice of \(\ell ^{*}\). From (4.4), we obtain

$$ \partial _{s}h_{0}(t,b,s) = (1+\lambda )\partial _{b}h_{0}(t,b,s). $$

Plugging this into (4.5) yields

$$ (1-\mu )\partial _{b}h_{0}(t,b,s) < \partial _{s}h_{0}(t,b,s) = (1+ \lambda )\partial _{b}h_{0}(t,b,s) $$

which implies \(\partial _{b}h_{0}(t,b,s)>0\) and thus \(\partial _{s}h _{0}(t,b,s)>0\) for all \((b,s)\in (\mathcal{S}\setminus \mathcal{S} _{0})\). It only remains to show that

$$ \mathcal{L}^{\mathrm{nt}}h_{0}(t,b,s) \geq 0. $$

Case 1: \(s=0\). Fix some \(\ell ^{*}>0\) such that \((t,b-(1+\lambda )\ell ^{*},\ell ^{*})\in \mathcal{R}_{0}^{\mathrm{buy}}\) so that

$$ \partial _{t} h_{0}(t,b,0) = \partial _{t}V\big(t,b-(1+\lambda )\ell ^{*},\ell ^{*}\big)\leq 0 $$

by Theorem 2.2. Therefore,

$$ \mathcal{L}^{\mathrm{nt}}h_{0}(t,b,0) = -\partial _{t}h_{0}(t,b,0) \geq 0. $$
(4.6)

Case 2: \(s<0\). For some suitable \(\ell ^{*}>0\), we have

$$\begin{aligned} \partial _{t} h_{0}(t,b,s) &= \partial _{t} V(t,b,s)|_{(b,s) = (b-(1+ \lambda )\ell ^{*},s+\ell ^{*})} = \partial _{t} h_{0}\big(t,b+(1+\lambda )s,0\big), \\ \partial _{s} h_{0}(t,b,s) &= \partial _{s} V(t,b,s)|_{(b,s) = (b-(1+ \lambda )\ell ^{*},s+\ell ^{*})} = \partial _{s} h_{0}\big(t,b+(1+\lambda )s,0\big), \\ \partial _{s}^{2} h_{0}(t,b,s) &= \partial _{s}^{2} V(t,b,s)|_{(b,s) = (b-(1+\lambda )\ell ^{*},s+\ell ^{*})} = \partial _{s}^{2} h_{0}\big(t,b+(1+ \lambda )s,0\big), \end{aligned}$$

where \((b+(1+\lambda )s,0)\) is the position reached from \((b,s)\) by buying \(-s>0\) units of stock wealth. Therefore,

$$\begin{aligned} \mathcal{L}^{\mathrm{nt}}h_{0}(t,b,s) &= -\partial _{t}h_{0}(t,b,s) - \alpha s\partial _{s}h_{0}(t,b,s) - \frac{1}{2}\sigma ^{2}s^{2}\partial _{s}^{2}h_{0}(t,b,s) \\ &= -\partial _{t}h_{0}\big(t,b+(1+\lambda )s,0\big) - \alpha s\partial _{s}h_{0}\big(t,b+(1+\lambda )s,0\big) \\ & \phantom{=:}- \frac{1}{2}\sigma ^{2}s^{2}\partial _{s}^{2}h_{0}\big(t,b+(1+ \lambda )s,0\big). \end{aligned}$$

By (4.6), we have

$$ -\partial _{t}h_{0}\big(t,b+(1+\lambda )s,0\big) = \mathcal{L}^{ \mathrm{nt}}h_{0}\big(t,b+(1+\lambda )s,0\big) \geq 0. $$

Moreover, since \(s<0\) and \(\partial _{s}h_{0}(t,b+(1+\lambda )s,0)>0\), we have

$$ - \alpha s\partial _{s}h_{0}\big(t,b+(1+\lambda )s,0\big) > 0, $$

and since \(\partial _{s}^{2}h_{0}(t,b+(1+\lambda )s,0)\leq 0\) (\(V\) is concave; see Dai and Yi [21, Remark 4.2]), we see that

$$ - \frac{1}{2}\sigma ^{2}s^{2}\partial _{s}^{2}h_{0}\big(t,b+(1+\lambda )s,0\big)\geq 0. $$

Putting the pieces together, we obtain \(\mathcal{L}^{\mathrm{nt}}h _{0}(t,b,s) > 0\). □

We have similar statements for the case \(t\geq t^{\mathrm{down}}\) with \(s<0\); the proofs are however significantly easier.

Lemma 6

We have \(h_{0}\in C^{\infty }([t^{\mathrm{down}},T)\times ( \mathcal{S}\setminus \{(b,s)\in \mathcal{S}: s\geq 0\}))\), and \(h_{0}\) is given explicitly as

$$ h_{0}(t,b,s) = U_{p}\big(b + (1+\lambda )s\big) $$

on \([t^{\mathrm{down}},T)\times (\mathcal{S}\setminus \{(b,s)\in \mathcal{S}: s> 0\})\).

Proof

This is an immediate consequence of the definition of \(h_{0}\) and \((L^{*},M^{*})\). Indeed, if \((t,b,s)\in [t^{\mathrm{down}},T)\times ( \mathcal{S}\setminus \{(b,s)\in \mathcal{S}: s> 0\})\), then \((L^{*},M^{*})\) is such that the stock position is immediately liquidated and the investor refrains from further trading. That is,

$$\begin{aligned} h_{0}(t,b,s) &= h_{0}\big(t, b + (1+\lambda )s,0\big) \\ &= \mathbb{E}\bigl[U_{p}\bigl(B_{t,b + (1+\lambda )s}^{L^{*},M^{*}}(T) + 0\bigr)\bigr] = U_{p}\big(b + (1+\lambda )s\big), \end{aligned}$$

from which the assertion of the lemma follows. □

Proposition 7

The function \(h_{0}\) satisfies

$$ \mathcal{L}^{\mathrm{nt}}h_{0}(t,b,s) > 0,\qquad \mathcal{L}^{ \mathrm{buy}}h_{0}(t,b,s) = 0\qquad \textit{and}\qquad \mathcal{L}^{\mathrm{sell}}h_{0}(t,b,s) > 0 $$

in the classical sense on \([t^{\mathrm{down}},T)\times (\mathcal{S} \setminus \{(b,s)\in \mathcal{S}: s\geq 0\})\).

Proof

This follows directly since \(h_{0}(t,b,s) = U_{p}(b + (1+\lambda )s)\) by Lemma 4.6. □

Corollary 8

For every \(t\in [t^{\mathrm{down}},T)\), we have \(h_{0}(t,\cdot ) \in C(\overline{\mathcal{S}})\).

Proof

It suffices to show that \((b,s)\mapsto h_{0}(t,b,s)\) is continuous at \((b,0)\) for every \(b>0\). By Lemma 4.6, \(h_{0}(t,b,0)\) is given explicitly as \(h_{0}(t,b,0) = U_{p}(b)\). Moreover, for \(s>0\), \(h_{0}(T,b,s) = U_{p}(b+(1-\mu )s)\) and \(\partial _{t} h_{0}\leq 0\) and thus

$$ h_{0}(t,b,s)\geq h_{0}(T,b,s) \geq h_{0}(t,b,0) = U_{p}(b)\qquad \text{for all }s>0, $$

which implies that \(h_{0}\) is at least lower semi-continuous at \((b,0)\) for every \(b>0\). By Theorems 2.2 and 2.3, \(h_{0}=V\) is \(C^{1,2}\) and satisfies

$$ \partial _{t}h_{0}(t,b,s) = - \alpha s \partial _{s} h_{0}(t,b,s) - \frac{1}{2}\sigma ^{2}s^{2}\partial _{s}^{2}h_{0}(t,b,s) \geq - \alpha s\partial _{s} h_{0}(t,b,s) $$
(4.7)

for \(t\in [t^{\mathrm{down}},T)\), \(b>0\) and \(s>0\) sufficiently small (so that \((t,b,s)\) is in the no-trading region) since \(\partial _{s}^{2} h _{0} \leq 0\). An argument as in Lemma 3.4 in Dai and Yi [21] shows that \(\partial _{s} h_{0}(t,\cdot )\) is bounded from above on \((b-\delta ,b+\delta )\times (0,\bar{s}]\) for every \(0<\delta <b\) and \(\bar{s} > 0\), uniformly in \(t\in [t^{ \mathrm{down}},T)\). Thus (4.7) implies that

$$ \liminf _{\hat{b}\to b,s\downarrow 0} \partial _{t}h_{0}(t,\hat{b},s) \geq 0 \qquad \text{uniformly in }t\in [t^{\mathrm{down}},T). $$

But since \(\partial _{t}h_{0} \leq 0\), we must have

$$ \lim _{\hat{b}\to b,s\downarrow 0} \partial _{t}h_{0}(t,\hat{b},s) = 0 \qquad \text{uniformly in }t\in [t^{\mathrm{down}},T). $$
(4.8)

Now choose \((b_{k},s_{k})_{k\in {\mathbb{N}}}\) with \(b_{k},s_{k}>0\) and \((b_{k},s_{k})\to (b,0)\). Then (4.8) implies that for every \(\varepsilon >0\), there exists some \(K\in {\mathbb{N}}\) such that

$$ h_{0}(t,b_{k},s_{k}) \leq h_{0}(T,b_{k},s_{k}) + \varepsilon = U_{p} \bigl(b_{k} + (1-\mu )s_{k}\bigr) + \varepsilon \qquad \text{for all }k\geq K, $$

from which we conclude that

$$ \limsup _{k\to \infty } h_{0}(t,b_{k},s_{k}) \leq \lim _{k\to \infty } U _{p}\big(b_{k} + (1-\mu )s_{k}\big) = U_{p}(b) = h_{0}(t,b,0), $$

i.e., \(h_{0}\) is also upper semi-continuous at \((b,0)\). □

Remark 9

We cannot expect more regularity of \(h_{0}\) at \(s=0\). Indeed, taking the derivative with respect to \(s\) in the equality in (4.7) and formally sending \(s>0\) to zero shows that \(\partial _{s}h_{0}(t,b,0+)\) solves

$$\begin{aligned} 0 &= -\partial _{t}\partial _{s}h_{0}(t,b,0+) - \alpha \partial _{s}h_{0}(t,b,0+), \\ \partial _{s}h_{0}(T,b,0+) &= \partial _{s}U_{p}\big(b+(1+\lambda )s\big)\big|_{s=0}, \end{aligned}$$

i.e., \(\partial _{s}h_{0}(t,b,0+) = e^{\alpha (T-t)}\partial _{s}U_{p}(b+(1+ \lambda )s)|_{s=0}\). This is in contrast to

$$ \partial _{s}h_{0}(t,b,0-) = \partial _{s}U_{p}\big(b+(1+\lambda )s \big)\big|_{s=0}, $$

suggesting that \(\partial _{s}h_{0}\) is not continuous at \(s=0\).
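The formal computation behind this remark can be spelled out as follows. It is only a heuristic sketch: it assumes that \(h_{0}\) is smooth enough in the no-trading region for the derivatives to be interchanged and that \(s\partial _{s}^{2}h_{0}\) and \(s^{2}\partial _{s}^{3}h_{0}\) vanish as \(s\downarrow 0\). The shorthand \(g(t) := \partial _{s}h_{0}(t,b,0+)\) is introduced here for illustration only. Differentiating the equality in (4.7) with respect to \(s\) gives

$$ \partial _{t}\partial _{s}h_{0}(t,b,s) = -\alpha \partial _{s}h_{0}(t,b,s) - (\alpha +\sigma ^{2})s\partial _{s}^{2}h_{0}(t,b,s) - \frac{1}{2}\sigma ^{2}s^{2}\partial _{s}^{3}h_{0}(t,b,s), $$

and letting \(s\downarrow 0\) leaves the linear equation \(g'(t) = -\alpha g(t)\). Consequently,

$$ \frac{\mathrm{d}}{\mathrm{d}t}\bigl(e^{\alpha t}g(t)\bigr) = e^{\alpha t}\bigl(g'(t)+\alpha g(t)\bigr) = 0, \qquad \text{so that}\qquad g(t) = e^{\alpha (T-t)}g(T). $$

If \(\alpha >0\) and \(g(T)>0\), this is strictly larger than \(g(T)\) for every \(t< T\), which is the source of the mismatch with the left derivative.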

We are now ready to prove that \(h_{0}\in \mathbb{H}\). By construction, we already know that \(h_{0}\leq \mathcal{V}\leq \varphi _{p}\). Moreover, it is clear from the above analysis that

$$ h_{0}(t,b,s) \geq U_{p}\big(b + \min \{(1-\mu )s,(1+\lambda )s\} \big) $$

since this is clearly satisfied for \(t = T\) and \(h_{0}\) is nonincreasing in \(t\).

We proceed in three steps. First we show that \(h_{0}\) is continuous, then we show that \(h_{0}\) is superharmonic, and finally we show that \(h_{0}\) is nonincreasing in the direction of transactions.

Proposition 10

The function \(h_{0}\) is continuous on \([0,T]\times \overline{\mathcal{S}}\).

Proof

By Proposition 4.3, Lemmas 4.4 and 4.6 and Corollary 4.8, it only remains to prove that \(h_{0}\) is continuous in \((t^{\mathrm{down}}, b, 0)\) for every \(b\geq 0\). Moreover, by Lemma 4.6 and Corollary 4.8, we can restrict attention to sequences \((t_{n},b_{n},s_{n})_{n\in {\mathbb{N}}}\) converging to \((t^{\mathrm{down}},b,0)\) with \(t_{n}< t^{\mathrm{down}}\) and \(s_{n}>0\) for all \(n\in {\mathbb{N}}\). Next, we observe that every such sequence is eventually contained in \(\mathcal{R}_{0}^{\mathrm{buy}} \cup \mathcal{R}_{0}^{\mathrm{nt}}\) since \(s_{n}\to 0\) and \(\overline{ \pi }>0\), and we may furthermore assume \(b_{n}>0\) for all \(n\in {\mathbb{N}}\). Now, if \((t_{n},b_{n},s_{n})\in \mathcal{R}_{0} ^{\mathrm{buy}}\), there is \(\ell _{n}>0\) with \((t_{n},b_{n}-(1+\lambda )\ell _{n},s_{n}+\ell _{n})\in \partial \mathcal{R}_{0}^{\mathrm{nt}}\) and

$$ h_{0}(t_{n},b_{n},s_{n}) = h_{0}\big(t_{n},b_{n}-(1+\lambda )\ell _{n},s _{n}+\ell _{n}\big), $$

which implies that we may restrict further to sequences in \(\overline{ \mathcal{R}_{0}^{\mathrm{nt}}}\). But because we have \(\mathcal{L}^{\mathrm{nt}}h_{0}(t,b,s) = \mathcal{L}^{\mathrm{nt}} V(t,b,s) = 0\) for all \((t,b,s)\in \overline{\mathcal{R}_{0}^{ \mathrm{nt}}}\) with \(s>0\), we can argue as in the proof of Corollary 4.8. □

Proposition 11

The function \(h_{0}\) is superharmonic.

Proof

Fix \((t,b,s)\in [0,T)\times \mathcal{S}\), let \(\tau \) be a \([t,T]\)-valued stopping time and \(\vartheta \) the first exit time of the uncontrolled portfolio process \((B^{0}_{t,b},S^{0}_{t,s})\) from \(\mathcal{S}\).

Case 1: \(s>0\). In this case, \(S^{0}_{t,s}(u)>0\) for all \(u\in [t, \tau \wedge \vartheta ]\) and hence \(h_{0}=V\) is \(C^{1,2}\) along the paths of \((u,B^{0}(u),S^{0}(u))\) by Proposition 4.3. Let \(\varepsilon >0\) and denote by \(\vartheta ^{\varepsilon }\) the first exit time of \((B^{0}_{t,b}+ \varepsilon ,S^{0}_{t,s})\) from \(\mathcal{S}\). Then clearly \(\vartheta ^{\varepsilon }\geq \vartheta \). For every \(n\in \mathbb{N}\), let us define a stopping time

$$ \tau _{n} := \inf \bigg\{ u\geq t: \int _{t}^{u}\bigl|\sigma S^{0}_{t,s}(r) \partial _{s}h_{0}\bigl(r, B ^{0}_{t,b}(r)+\varepsilon , S^{0}_{t,s}(r)\bigr) \bigr|^{2}\,\mathrm{d}r \geq n\bigg\} \wedge \tau \wedge \vartheta . $$

An application of Itô’s formula shows that

$$\begin{aligned} h_{0}(t,b+\varepsilon ,s) &= h_{0}\bigl(\tau _{n},B^{0}_{t,b}(\tau _{n})+ \varepsilon ,S^{0}_{t,s}(\tau _{n})\bigr) \\ & \phantom{=:}+ \int _{t}^{\tau _{n}}\mathcal{L}^{\mathrm{nt}}h_{0}\bigl(u, B^{0}_{t,b}(u)+\varepsilon , S^{0}_{t,s}(u)\bigr)\,\mathrm{d}u \\ & \phantom{=:} - \int _{t}^{\tau _{n}}\sigma S^{0}_{t,s}(u)\partial _{s}h_{0}\bigl(u, B^{0}_{t,b}(u)+\varepsilon , S^{0}_{t,s}(u)\bigr)\,\mathrm{d}W(u). \end{aligned}$$

Taking expectations yields

$$\begin{aligned} h_{0}(t,b+\varepsilon ,s) &= \mathbb{E}\bigg[h_{0}\bigl(\tau _{n},B ^{0}_{t,b}(\tau _{n})+\varepsilon ,S^{0}_{t,s}(\tau _{n})\bigr) \\ & \phantom{=:\mathbb{E}\bigg[}+ \int _{t}^{\tau _{n}}\mathcal{L}^{ \mathrm{nt}}h_{0}\bigl(u, B^{0}_{t,b}(u)+\varepsilon , S^{0}_{t,s}(u) \bigr)\,\mathrm{d}u\bigg]. \end{aligned}$$

By Proposition 4.3, \(h_{0}\) is a classical solution of the HJB equation on \([0,T]\times \mathcal{S}_{0}\), and so we see that

$$ \int _{t}^{\tau _{n}}\mathcal{L}^{\mathrm{nt}}h_{0}\bigl(u, B^{0}_{t,b}(u)+\varepsilon , S^{0}_{t,s}(u)\bigr)\,\mathrm{d}u \geq 0 $$

and hence

$$ h_{0}(t,b+\varepsilon ,s) \geq \mathbb{E}\bigl[h_{0}\bigl(\tau _{n},B ^{0}_{t,b}(\tau _{n})+\varepsilon ,S^{0}_{t,s}(\tau _{n})\bigr)\bigr]. $$
(4.9)

Next, since \(\partial _{t} h_{0}\leq 0\) and \(h_{0}(T,b,s) = V(T,b,s) = U_{p}(b + (1-\mu )s)\), we have \(h_{0}(t,b,s) \geq U_{p}(b+(1-\mu )s)\) and hence

$$ h_{0}\bigl(\tau _{n},B^{0}_{t,b}(\tau _{n})+\varepsilon ,S^{0}_{t,s}( \tau _{n})\bigr) \geq U_{p}(\varepsilon ). $$

We can therefore send \(n\to \infty \) in (4.9), use Fatou’s lemma and \(\partial _{b} h_{0}\geq 0\) to obtain

$$\begin{aligned} h_{0}(t,b+\varepsilon ,s) &\geq \mathbb{E}\bigl[h_{0}\bigl(\tau \wedge \vartheta ,B^{0}_{t,b}(\tau \wedge \vartheta )+\varepsilon ,S ^{0}_{t,s}(\tau \wedge \vartheta )\bigr)\bigr] \\ &\geq \mathbb{E}\bigl[h_{0}\bigl(\tau \wedge \vartheta ,B^{0}_{t,b}( \tau \wedge \vartheta ),S^{0}_{t,s}(\tau \wedge \vartheta )\bigr) \bigr]. \end{aligned}$$

Sending \(\varepsilon \downarrow 0\) and using the continuity of \(h_{0}\) then shows that \(h_{0}\) is superharmonic.

Case 2: \(s\leq 0\) and \(t\geq t^{\mathrm{down}}\). For simplicity, we write \(\tilde{\tau }:= \tau \wedge \vartheta \). Since \(t\geq t^{\mathrm{down}}\), the strategy \((L^{*},M^{*})\) performs an initial transaction from \((b,s)\) to \((b + (1+\lambda )s, 0)\) so that

$$ h_{0}(t,b,s) = h_{0}\big(t, b + (1+\lambda )s, 0\big) = U_{p}\big( b + (1+\lambda )s \big). $$
(4.10)

On the other hand, by the same arguments and Jensen’s inequality, we have

$$\begin{aligned} {\mathbb{E}}\bigl[h_{0}\bigl(\tilde{\tau }, B^{0}_{t,b}(\tilde{\tau }), S^{0}_{t,s}(\tilde{\tau })\bigr)\bigr] &= {\mathbb{E}}\bigl[h_{0} \bigl(\tilde{\tau }, B^{0}_{t,b}(\tilde{\tau })+ (1+\lambda ) S^{0} _{t,s}(\tilde{\tau }), 0\bigr)\bigr] \\ &= \mathbb{E}\bigl[U_{p}\bigl( B^{0}_{t,b}(\tilde{\tau })+ (1+\lambda ) S^{0}_{t,s}(\tilde{\tau }) \bigr)\bigr] \\ &\leq U_{p}\big(\mathbb{E}[ B^{0}_{t,b}(\tilde{\tau })+ (1+\lambda ) S^{0}_{t,s}(\tilde{\tau }) ]\big) \\ &= U_{p}\big(b + (1+\lambda )\mathbb{E}[ S^{0}_{t,s}(\tilde{\tau }) ] \big). \end{aligned}$$

Now since \(S^{0}_{t,s}\) is a supermartingale for every \(s\leq 0\), it follows that \(\mathbb{E}[ S^{0}_{t,s}(\tilde{\tau }) ] \leq s\) and hence

$$ {\mathbb{E}}\bigl[h_{0}\bigl(\tilde{\tau }, B^{0}_{t,b}(\tilde{\tau }), S^{0}_{t,s}(\tilde{\tau })\bigr)\bigr] \leq U_{p}\big(b+ (1+\lambda ) s\big) = h_{0}(t,b,s) $$

by the monotonicity of \(U_{p}\) and (4.10).
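A quick numerical sanity check of the inequality in Case 2 can be sketched as follows. The sketch is not part of the proof and rests on several assumptions that are not taken from the text: the stopping time \(\tilde{\tau }\) is replaced by a deterministic waiting time, the utility is taken to be \(U_{p}(x) = x^{p}/p\) with \(p = 1/2\), the bond position is held constant (no interest term appears in (4.7)), and all parameter values and function names are hypothetical.

```python
import math
import random


def power_utility(x, p=0.5):
    """Assumed utility U_p(x) = x**p / p (p = 0.5 is an illustrative choice)."""
    return x ** p / p


def expected_utility_after_waiting(b, s, lam, alpha, sigma, horizon, n_paths, seed=0):
    """Monte Carlo estimate of E[U_p(b + (1 + lam) * S(horizon))] for s <= 0.

    The uncontrolled stock position is modelled as geometric Brownian motion,
    S(u) = s * exp((alpha - sigma**2 / 2) * u + sigma * W(u)), while the bond
    position is assumed to stay at b.
    """
    rng = random.Random(seed)
    drift = (alpha - 0.5 * sigma ** 2) * horizon
    vol = sigma * math.sqrt(horizon)
    total = 0.0
    for _ in range(n_paths):
        stock = s * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        # Clamp at zero as a guard; insolvency is extremely unlikely for these inputs.
        wealth = max(b + (1.0 + lam) * stock, 0.0)
        total += power_utility(wealth)
    return total / n_paths


# Hypothetical parameters with a short stock position (s < 0).
b, s, lam, alpha, sigma = 10.0, -2.0, 0.01, 0.05, 0.2
waiting = expected_utility_after_waiting(
    b, s, lam, alpha, sigma, horizon=1.0, n_paths=20000
)
closing_now = power_utility(b + (1.0 + lam) * s)
print(waiting, closing_now)
```

For these inputs the estimate for waiting should come out below the utility of closing the short position immediately, in line with the combination of Jensen's inequality and the supermartingale property used above.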

Case 3: \(s\leq 0\) and \(t< t^{\mathrm{down}}\). We have

On \(\{\tilde{\tau }\geq t^{\mathrm{down}}\}\), we have as before that

$$ {\mathbb{E}}\bigl[h_{0}\bigl(\tilde{\tau }, B^{0}_{t,b}(\tilde{\tau }), S^{0}_{t,s}(\tilde{\tau })\bigr)\bigr] \leq U_{p}\big(b+(1+\lambda )s \big) = h_{0}(t,b,s); $$

so we may without loss of generality assume that \(\tilde{\tau }< t^{\mathrm{down}}\). However, since we know by Lemma 4.4 and Proposition 4.5 that \(h_{0}\) is \(C^{1,2}\) and satisfies the HJB equation in the classical sense, we obtain

$$ h_{0}(t,b,s) \geq \mathbb{E}\bigl[h_{0}\bigl(\tilde{\tau },B^{0}_{t,b}( \tilde{\tau }),S^{0}_{t,s}(\tilde{\tau })\bigr)\bigr] $$

as in the case \(s>0\). □

Proposition 12

The function \(h_{0}\) is nonincreasing in the direction of transactions.

Proof

Fix \((t,b,s)\in [0,T]\times \overline{\mathcal{S}}\) and let \(\ell ,m\geq 0\) be such that

$$ \big(b - (1+\lambda )\ell +(1-\mu )m,s+\ell -m\big)\in \overline{ \mathcal{S}}. $$

We have to show that

$$ h_{0}(t,b,s) \geq h_{0}\big(t,b - (1+\lambda )\ell +(1-\mu )m,s+ \ell -m\big). $$

However, by Proposition 4.10, \(h_{0}(t,\cdot )\) is continuous and satisfies

$$ \mathcal{L}^{\mathrm{buy}}h_{0}(t,b,s)\geq 0 \qquad \text{and} \qquad \mathcal{L}^{\mathrm{sell}}h_{0}(t,b,s)\geq 0 $$

for every \((t,b,s)\in ([0,T)\times \mathcal{S})\setminus \{(t,b,s) \in [0,T)\times \mathcal{S}: t\geq t^{\mathrm{down}}, s=0\}\) and \((t,b,s)\not \in F\) (defined in (2.3)) by Propositions 4.3, 4.5 and 4.7. Therefore, by the fundamental theorem of calculus for line integrals, we immediately obtain the claim. □
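The line-integral step can be written out as follows. The sketch assumes the operator forms \(\mathcal{L}^{\mathrm{buy}}h = (1+\lambda )\partial _{b}h - \partial _{s}h\) and \(\mathcal{L}^{\mathrm{sell}}h = -(1-\mu )\partial _{b}h + \partial _{s}h\), which are not restated in this section, and it treats a pure purchase and a pure sale along segments on which the derivatives exist. For a purchase of \(\ell \geq 0\) units,

$$\begin{aligned} h_{0}\big(t,b-(1+\lambda )\ell ,s+\ell \big) - h_{0}(t,b,s) &= \int _{0}^{\ell }\frac{\mathrm{d}}{\mathrm{d}r}h_{0}\big(t,b-(1+\lambda )r,s+r\big)\,\mathrm{d}r \\ &= -\int _{0}^{\ell }\mathcal{L}^{\mathrm{buy}}h_{0}\big(t,b-(1+\lambda )r,s+r\big)\,\mathrm{d}r \leq 0, \end{aligned}$$

and for a sale of \(m\geq 0\) units,

$$ h_{0}\big(t,b+(1-\mu )m,s-m\big) - h_{0}(t,b,s) = -\int _{0}^{m}\mathcal{L}^{\mathrm{sell}}h_{0}\big(t,b+(1-\mu )r,s-r\big)\,\mathrm{d}r \leq 0. $$

Where a segment meets one of the exceptional points, the integral can be split there, and the continuity of \(h_{0}(t,\cdot )\) is what allows the pieces to be joined.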

Combining Propositions 4.10–4.12 proves the optimality of \((L^{*},M^{*})\).

Theorem 13

We have \(h_{0}\in \mathbb{H}\), and thus \(h_{0} = \mathcal{V}\) and \((L^{*},M^{*})\) is optimal.

Since \(h_{0} = \mathcal{V}\), we furthermore have the following regularity result.

Theorem 14

The value function \(\mathcal{V}\) is continuous everywhere and (at least) of class \(C^{1,2}\) except for possibly the points \((t,b,s)\) for which one of the following statements is true:

1) \(b=0\) and \((t,b,s)\) is on the buy boundary.

2) \(\pi _{M} = 1\) and \(b=0\).

3) \(t=t^{\mathrm{down}}\) and \(s\leq 0\). However, \(\mathcal{V}(t^{\mathrm{down}},\cdot )\in C^{2}(\mathcal{S}\setminus \{(b,s)\in \mathcal{S}: s = 0\})\).

4) \(t\geq t^{\mathrm{down}}\) and \(s=0\). However, \(\mathcal{V}(\cdot ,b,0)\in C^{\infty }((t^{\mathrm{down}},T))\) for all \(b\geq 0\).

Moreover, \(\mathcal{V}\) is of class \(C^{\infty }\) on \(\mathcal{R}_{0}^{\mathrm{nt}}\).