1 Introduction

In this paper we consider a one dimensional stochastic impulse optimal control problem modeling the economic problem of irreversible investment with fixed adjustment cost.

Let \(X=\{X_t\}_{t\ge 0}\) be a real-valued positive process representing an economic indicator (such as the GDP of a country, the production capacity of a firm, and so on) on which a planner/manager can intervene. When no intervention is undertaken, it is assumed that the process X evolves autonomously according to a time-homogeneous Itô diffusion. On the other hand, the planner may act on this process, increasing its value, by choosing a sequence of intervention dates \(\{\tau _n\}_{n\ge 1}\) and of intervention amplitudes \(\{i_n\}_{n\ge 1}\), with \(i_n>0\). Hence, the control is represented by a sequence of couples \(\left\{ (\tau _n,i_n)\right\} _{n\ge 1}\): the first component represents the intervention time, the second component the size of the intervention. The goal of the controller is to maximize, over the set of all admissible controls, the expected total discounted income

$$\begin{aligned} \mathbb {E}\left[ \int _0^\infty e^{-\rho t} f\left( X_t\right) dt- \sum _{n\ge 1}e^{-\rho \tau _n} \left( c_0i_n+c_1\right) \right] , \end{aligned}$$

where f is a reward function, \(c_0>0\) and \(c_1>0\) represent, respectively, the proportional and the fixed cost of intervention, and \(\rho >0\) is a discount factor.
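Before proceeding, it may help to see the functional in action. The following minimal Python sketch estimates the expected total discounted income of one fixed strategy by Monte Carlo simulation; the geometric Brownian motion dynamics, the power reward \(f(x)=x^\gamma \), the trigger/target policy and all parameter values are purely illustrative assumptions made for this snippet, not part of the general model.

```python
import numpy as np

# Illustrative assumptions for this sketch only (not part of the general model):
# geometric Brownian motion between interventions, power reward f(x) = x**gamma,
# and a trigger/target policy: invest up to level S_big whenever X falls to s_small.
b, sigma, rho, gamma = 0.02, 0.3, 0.1, 0.5
c0, c1 = 1.0, 1.9                  # proportional and fixed intervention costs
s_small, S_big = 7.0, 22.0         # hypothetical trigger and target levels

def discounted_payoff(x0, T=100.0, dt=1e-2, rng=None):
    """Euler scheme for one trajectory of the controlled process; returns the
    discounted reward minus the discounted intervention costs, truncated at T."""
    rng = np.random.default_rng() if rng is None else rng
    x, J = x0, 0.0
    for k in range(int(T / dt)):
        disc = np.exp(-rho * k * dt)
        if x <= s_small:                       # impulse: jump to S_big, pay c0*i + c1
            J -= disc * (c0 * (S_big - x) + c1)
            x = S_big
        J += disc * x**gamma * dt              # running income f(X_t) dt
        x += b * x * dt + sigma * x * np.sqrt(dt) * rng.standard_normal()
        x = max(x, 1e-12)                      # keep the Euler iterate positive
    return J

rng = np.random.default_rng(0)
sample = [discounted_payoff(10.0, rng=rng) for _ in range(200)]
print("J(10, I) ~", np.mean(sample), "+/-", np.std(sample) / np.sqrt(len(sample)))
```

Of course, this only evaluates one fixed strategy; the object of the paper is to characterize the maximizing one.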

From the modeling side, our problem is the “extension” to the case \(c_1>0\) of the same problem already treated in the literature in the case \(c_1=0\) (see, e.g., [63, Ch. 4, Sec. 5]). In this respect, it applies to economic problems of capacity expansion, notably irreversible investment problems.

From the theoretical side, the introduction of a fixed cost of control is relevant, as it leads from a problem well posed (in the sense of existence of optimal controls) as a singular control problem to a problem well posed as an impulse control problem. Such a change does not come for free at the theoretical level. Indeed, the introduction of a fixed cost of control has two unpleasant effects. Firstly, it destroys the concavity of the objective functional even if the revenue function is concave. Secondly, when approaching the problem by dynamic programming techniques (as we do), the dynamic programming equation has a nonlocal term and takes the form of a quasi-variational inequality (QVI, hereafter), whereas it is a variational inequality in the singular control case.

1.1 Related literature

First of all, it is worth noticing that the stochastic impulse control setting has been widely employed in several applied fields: e.g., exchange and interest rates [20, 49, 56], portfolio optimization with transaction costs [34, 51, 57], inventory and cash management [12, 21, 27,28,29, 44, 45, 58, 62, 67, 68, 71], real options [47, 53], reliability theory [7]. More recently, games of stochastic impulse control have been investigated with application to pollution [38].

From a modeling point of view, the works closest to ours are [3, 6, 26, 35, 51]. On the theoretical side, starting from the classical book [17], several works have investigated QVIs associated with stochastic impulse optimal control in \(\mathbb {R}^n\). Among them, we mention the recent [43] in a diffusion setting and [14, 31] in a jump-diffusion setting. In particular, [17, Ch. 4] deals with Sobolev-type solutions, whereas [43] deals with viscosity solutions. These two works prove a \(W^{2,p}\)-regularity, with \(p<\infty \), for the solution of the QVI, which, by classical Sobolev embeddings, yields a \(C^1\)-regularity. However, it is typically not easy to extract from such regularity information on the structure of the so-called continuation and action regions, hence on the candidate optimal control. If this structure is established, then one can try to prove a verification theorem showing that the candidate optimal control is actually optimal. In a stylized one dimensional example, [43, Sec. 5] successfully employs this method by exploiting the regularity result proved in [43, Sec. 4] to depict the structure of the continuation and action regions for the problem at hand. Concerning verification, we mention the recent paper [15], which provides a non-smooth verification theorem in a quite general setting based on the stochastic Perron method to construct a viscosity solution to the QVI; in its last section, this paper also provides an application of the results to a one dimensional problem with an implementable solution. In dimension one, other approaches, based on excessive mappings and iterated optimal stopping schemes, have been successfully employed in the context of stochastic impulse control (see [3, 6, 35, 46]). More recently, these methods have been extended to Markov processes valued in metric spaces (see [25]); again, a complete description of the solution is shown in one dimensional examples.

1.2 Contribution

From the methodological side, our work is close to [43]. As in the latter, we follow a direct analytical method based on viscosity solutions and do not employ a guess-and-verify approach. Indeed, we directly provide necessary optimality conditions that, by uniqueness, fully characterize the solution. In particular, we do not postulate the smooth-fit principle, as is usually done in the guess-and-verify approach, but we prove it directly. To the best of our knowledge, a rigorous analytical treatment such as ours of the specific problem treated in this paper is still missing in the literature. It is important to notice that our analysis yields a complete and implementable characterization of the optimal control policy through the identification of the continuation and action regions. Since the aforementioned techniques based on excessive mappings seem to be perfectly applicable to our problem (even under weaker assumptions), it is worth pointing out that our contribution is methodological. As is well known, the (implementable) characterization of the optimal control in stochastic impulse control problems is a challenging task in dimension larger than one. Hence, it is important to have at hand an approach like ours that might be generalized to address impulse control problems in a multi-dimensional setting. In this regard, it is worth noticing the following.

  • To the best of our knowledge, the only study providing a complete picture of the solution in dimension two, through a two dimensional (S, s)-rule, is the recent paper [16] (see also [69] in a deterministic setting). The techniques used there are analytical and based on the study of QVIs. Unfortunately, the authors of [16] are able to provide a complete solution only in a very specific case.

  • In the presence of semiconvex data, our approach to proving the \(C^1\) regularity of the value function, based on semiconvexity jointly with the viscosity property (unlike [43]), might succeed in proving a directional regularity result just along nondegenerate directions (see [37] in a singular control context).

  • The directional regularity result mentioned above might be sufficient to derive the right optimality condition to solve the control problem (see again [37] in a singular control context).

1.3 Contents

In Sect. 2 we set up the problem. In Sect. 3 we state some preliminary results on the value function v; in particular, we show that it is semiconvex. In Sect. 4 we derive the QVI associated with v and show that v solves it in the viscosity sense. After that, we prove that v is of class \(C^2\) in the continuation region (the region where the differential part of the QVI holds with equality, see below) and of class \(C^1\) on the whole state space (Theorem 4.6, our first main result), hence proving the smooth-fit principle. We prove the latter result relying just on the semiconvexity of v and exploiting the viscosity supersolution property; unlike [43], this allows us to avoid the use of a deep theoretical result such as the Calderón–Zygmund estimate. So, with respect to the aforementioned reference, our method of proof is cheaper from a theoretical point of view; on the other hand, it heavily relies on assumptions guaranteeing the semiconvexity of v. In Sect. 5 we use the latter regularity to establish the structure of the continuation and action regions (the real unknowns of the problem), showing that they are both intervals. This allows us to express v explicitly up to the solution of a nonlinear algebraic system of three variables (Theorem 5.11, our second main result). In Sect. 6, relying on the results of the previous section, we construct an optimal control policy (Theorem 6.1, our third main result). The latter turns out to be based on the so-called (S, s)-rule: the controller acts whenever the state process reaches a minimum level s (the “trigger” boundary) and immediately brings the system to the level \(S>s\) (the “target” boundary). Finally, in Sect. 7, we provide a numerical illustration of the solution when X follows a geometric Brownian motion dynamics between intervention times, analyzing the sensitivity of the solution with respect to the volatility coefficient \(\sigma \) and to the fixed cost \(c_1\).

2 Problem formulation

We introduce some notation. We set

$$\begin{aligned} \mathbb {R}_+{:}{=}[0,+\infty ),\quad {\overline{\mathbb {R}}}_+{:}{=}[0,+\infty ], \quad \mathbb {R}_{++}{:}{=}(0,+\infty ). \end{aligned}$$

The set \(\mathbb {R}_{++}\) will be the state space of our control problem. Throughout the paper we adopt the conventions \(e^{-\infty }=0\) and \(\inf \emptyset =\infty \). Moreover, we simply use the symbol \(\infty \) in place of \(+\infty \) when positive quantities are involved and no confusion may arise. Finally, the symbol n will always denote a natural number.

Let \((\Omega ,\mathscr {F}, \{\mathscr {F}_t\}_{t\ge 0}, \mathbb {P})\) be a filtered probability space satisfying the usual conditions and supporting a one dimensional Brownian motion \(W=\{W_t\}_{t\ge 0}\). We denote \( \mathbb {F}{:}{=}\{\mathscr {F}_t\}_{t\in \overline{\mathbb {R}}_+}\), where we set \(\displaystyle {\mathscr {F}_\infty {:}{=}\bigvee _{t\in \mathbb {R}_+}\mathscr {F}_t}\). We take \(b,\sigma :\mathbb {R}\rightarrow \mathbb {R}\) satisfying the following

Assumption 2.1

\(b,\sigma :\mathbb {R}\rightarrow \mathbb {R}\) are Lipschitz continuous functions, with Lipschitz constants \(L_b,L_\sigma \), respectively, identically equal to 0 on \((-\infty ,0]\) and with \(\sigma >0\) on \(\mathbb {R}_{++}\). Moreover, \(b,\sigma \in C^1(\mathbb {R}_{+})\) and \(b^{\prime },\sigma ^{\prime }\) are Lipschitz continuous on \(\mathbb {R}_{++}\), with Lipschitz constants \(\tilde{L}_b, \tilde{L}_\sigma >0\), respectively.

Remark 2.2

The requirement that \(b^{\prime },\sigma ^{\prime }\) are Lipschitz continuous is typical when one wants to prove the semiconvexity/semiconcavity of the value function in stochastic optimal control problems (see, e.g., the classical reference [72, Ch. 4, Sec. 4.2] in the context of regular stochastic control; [14] in the context of impulse control). We use this assumption since, as outlined in the introduction, in our approach the proof of the semiconvexity of the value function is a crucial step towards the proof of the \(C^1\) regularity.

Let \(\tau \) be a (possibly not finite) \(\mathbb {F}\)-stopping time and let \(\xi \) be an \(\mathscr {F}_\tau \)-measurable random variable. By standard SDE theory with Lipschitz coefficients, Assumption 2.1 guarantees that there exists a unique (up to indistinguishability) \(\mathbb {F}\)-adapted process \(Z^{\tau ,\xi }=\{Z^{\tau ,\xi }_t\}_{t\ge 0}\) with continuous trajectories on \([\tau ,\infty )\), such that

$$\begin{aligned} Z^{\tau ,\xi }_t= \left\{ \begin{array}{ll} 0&{}\quad \text{ for } t\in [0,\tau )\\ \xi +\int _\tau ^t b(Z^{\tau ,\xi }_s)ds +\int _\tau ^t \sigma (Z^{\tau ,\xi }_s)dW_s &{}\quad \mathbb {P}\text{-a.s. }, \text{ for } t\ge \tau . \end{array}\right. \end{aligned}$$
(2.1)

Moreover, by a straightforward adaptation of [50, Sec. 5.2, Prop. 2.18] to random initial data, we obtain

$$\begin{aligned} \xi ,\eta \ \mathscr {F}_\tau \text{-measurable } \text{ random } \text{ variables, } \xi \le \eta \ \mathbb {P}\text{-a.s. } \ \Longrightarrow \ Z^{\tau ,\xi }_{t+\tau }\le Z^{\tau ,\eta }_{t+\tau } \ \mathbb {P}\text{-a.s. }, \ \forall t\ge 0. \end{aligned}$$
(2.2)

Now fix \(x\in \mathbb {R}_{++}\). By (2.2) and Assumption 2.1, it follows that \(Z^{0,x}\) takes values in \(\mathbb {R}_+\). Due to the nondegeneracy assumption on \(\sigma \) over \(\mathbb {R}_{++}\), as a consequence of the results of [50, Sec. 5.5.C], the process \(Z^{0,x}\) is a (time-homogeneous) regular diffusion on \(\mathbb {R}_{++}\); i.e., setting \( \tau _{x,y}{:}{=}\inf \left\{ t\ge 0:Z^{0,x}_t=y\right\} , \) one has

$$\begin{aligned} \mathbb {P}\{\tau _{x,y}<\infty \}>0 \quad \forall y\in \mathbb {R}_{++}. \end{aligned}$$

In “Appendix” we show that Assumption 2.1 guarantees that the boundaries 0 and \(+\infty \) are natural for \(Z^{0,x}\) in the sense of Feller’s classification.

We introduce now a set of admissible controls and their corresponding controlled process. As a set of admissible controls (i.e., feasible investment strategies) we consider the set \(\mathscr {I}\) of all sequences of couples \(I=\left\{ (\tau _n,i_n)\right\} _{n\ge 1}\) such that:

  1. (i)

    \(\{\tau _n\}_{n\ge 1}\) is an increasing sequence of \({\overline{\mathbb {R}}}_+\)-valued \(\mathbb {F}\)-stopping times such that \(\tau _n<\tau _{n+1}\) \(\mathbb {P}\)-a.s. over the set \(\{\tau _n<\infty \}\) and

    $$\begin{aligned} \lim _{n\rightarrow \infty }\tau _n =\infty \quad \mathbb {P}\text{-a.s. }; \end{aligned}$$
    (2.3)
  2. (ii)

    \(\{i_n\}_{n\ge 1}\) is a sequence of \(\mathbb {R}_{++}\)-valued random variables such that \(i_n\) is \(\mathscr {F}_{\tau _n}\)-measurable for every \(n\ge 1\);

  3. (iii)

    The following integrability condition holds:

    $$\begin{aligned} \sum _{n\ge 1} \mathbb {E} \left[ e^{-\rho \tau _n}(i_n+1) \right] <\infty . \end{aligned}$$
    (2.4)

For \(n\ge 1\), \(\tau _n\) represents an intervention time, whereas \(i_n\) represents the intervention size at the corresponding intervention time \(\tau _n\). Condition (2.3) ensures that, within a finite time interval, only a finite number of actions are executed. We allow the case \(\tau _n=\infty \) for all n large enough, meaning that only a finite number of actions are taken. Condition (2.4) ensures that the functional defined below is well defined. We call null control any sequence \(\{(\tau _n,i_n)\}_{n\ge 1}\) such that \(\tau _n=\infty \) for each \(n\ge 1\) and denote any of them by \(\emptyset \). Notice that using the same symbol \(\emptyset \) for all null controls is not ambiguous with regard to the control problem we are going to define, as any null control gives rise to the same payoff.

Given a control \(I\in \mathscr {I}\), an initial stopping time \(\tau \ge 0\) and an \(\mathscr {F}_\tau \)-measurable random variable \(\xi \) with \(\xi >0\) \(\mathbb {P}\)-a.s., we denote by \(X^{\tau ,\xi ,I}=\{X^{\tau ,\xi ,I}_t\}_{t\in [0,\infty )}\) the unique (up to indistinguishability) càdlàg process on \([\tau ,\infty )\) solving the SDE (in integral form)

$$\begin{aligned} X^{\tau ,\xi ,I}_t= \left\{ \begin{array}{ll} 0&{} \quad \text{ for } t\in [0,\tau )\\ \xi + \displaystyle \int _\tau ^t b\left( X^{\tau ,\xi ,I}_s\right) ds +\int _\tau ^t\sigma \left( X^{\tau ,\xi ,I}_s\right) dW_s+ \sum _{n\ge 1}\mathbf {1}_{[\tau ,t]}\left( \tau _n\right) \, i_n&{}\quad \text{ for } t\in [\tau ,\infty ). \end{array}\right. \end{aligned}$$
(2.5)

If \(\tau =0\) and \(\xi \equiv x\in \mathbb {R}_{++}\), then we denote \(X^{0,x,I}\) by \(X^{x,I}\). It is easily seen that, if \(\tau ^{\prime }\) is another stopping time such that \(\tau ^{\prime }\ge \tau \), then the following flow property holds true

$$\begin{aligned} X^{\tau ,\xi ,I}_t=X^{\tau ^{\prime },X^{\tau ,\xi ,I}_{\tau ^{\prime -}},I}_t \quad \forall t\ge \tau ^{\prime }, \quad \mathbb {P}\text{-a.s. } \end{aligned}$$
(2.6)

Note that, up to indistinguishability, we have \( X^{x,\emptyset }= Z^{0,x}\). Moreover, setting by convention \(\tau _0{:}{=}0\), \( i_0{:}{=}0\) and \(X_{0^-}{:}{=}x\), we have, recursively on \(n\in \mathbb {N}\),

$$\begin{aligned} X^{x,I}_t = Z^{\tau _n, X^{x,I}_{\tau _n} }_t \quad \forall t\in [\tau _n,\tau _{n+1}), \quad \mathbb {P}\text{-a.s. }. \end{aligned}$$

Then, by (2.2), we have the following monotonicity of the controlled process with respect to the initial data

$$\begin{aligned} X_t^{x,I}\le X_t^{x^{\prime },I}\ \mathbb {P}\text{-a.s. }, \quad \forall t\ge 0, \quad \forall I\in \mathscr {I}, \quad \forall x,x^{\prime }:0<x\le x^{\prime }. \end{aligned}$$
(2.7)

Next, we introduce the optimization problem. Given \(\rho >0\), \(f:\mathbb {R}_{++}\rightarrow \mathbb {R}_{++}\) measurable, \(c_0>0\), \(c_1>0\), we define the payoff functional J by

$$\begin{aligned} J(x,I){:}{=}\mathbb {E}\left[ \int _0^{\infty }e^{-\rho t} {f}\left( X^{x,I}_t\right) dt-\sum _{n\ge 1}e^{-\rho \tau _n}\left( c_0 i_n+c_1\right) \right] ,\quad \forall x\in \mathbb {R}_{+},\quad \forall I\in \mathscr {I}. \end{aligned}$$
(2.8)

We notice that (2.4) and the fact that f is bounded from below ensure that J(x,I) is well defined and takes values in \(\mathbb {R}\cup \{\infty \}\).

We will make use of the following assumption on f.

Assumption 2.3

\(f\in C^1(\mathbb {R}_{++};\mathbb {R}_+)\), \(f^{\prime }>0\), \(f^{\prime }\) is strictly decreasing and f satisfies the Inada condition at \(\infty \):

$$\begin{aligned} f^{\prime }(\infty ){:}{=}\lim _{x\rightarrow \infty }f^{\prime }(x)=0. \end{aligned}$$

Finally, without loss of generality, we assume that \(f(0^+){:}{=}{\displaystyle \lim _{x\rightarrow 0^+} f(x)=0}\).

Note that

$$\begin{aligned} M_b:=\left( \sup _{x\in \mathbb {R}_{++}} b^{\prime }(x)\right) ^+< \infty \end{aligned}$$
(2.9)

by Assumption 2.1. The following assumption will ensure finiteness for the problem (Proposition 3.2).

Assumption 2.4

\(\rho >M_b\).

Assumptions 2.1, 2.3 and 2.4 will be standing throughout the rest of the manuscript.

The optimal control problem that we address consists in maximizing the functional (2.8) over \(I\in \mathscr {I}\), i.e., for each \(x\in \mathbb {R}_+\), we consider the maximization problem

$$\begin{aligned} \sup _{I\in \mathscr {I}} J(x,I). \end{aligned}$$
(P)

Remark 2.5

The fact that \(c_1>0\) means that there is a fixed cost to be paid whenever an investment occurs. This ensures that (P) is well posed as an impulse control problem, i.e., optimal controls can be found within the class of impulse controls. If \(c_1=0\) (only a proportional intervention cost), the setting providing existence of optimal controls would be the more general singular control setting (see, e.g., [63, Ch. 4]). For a comparison between impulse and singular control we refer to [18]; for the relevance of the introduction of the fixed cost we refer to [61], where the asymptotics as \(c_1\rightarrow 0\) is investigated. In Sect. 7.1.2, we comment on this issue through the numerical outputs.

We also notice that one might consider more general intervention costs \(C:\mathbb {R}_{++}\rightarrow \mathbb {R}_+\), increasing and convex (e.g., \(C(i)=\alpha i^2+\beta i+c_1\) with \(\alpha ,c_1>0\) and \(\beta \ge 0\)). We believe that, at least for a suitable subclass of such cost functions, the solution would exhibit the same structure as the one we provide here in the affine case (i.e., \(C(i)=c_0i+c_1\)). On the other hand, we underline that at many points our proofs make use of the affine structure of the cost, and the generalization does not seem straightforward.

3 Preliminary results on the value function

In this section we introduce the value function associated with (P) and establish some basic properties of it. We define the value function v by

$$\begin{aligned} v(x){:}{=}\sup _{I\in \mathscr {I}}J(x,I), \quad \forall x\in \mathbb {R}_{++}. \end{aligned}$$
(3.1)

We notice that v is \(\overline{\mathbb {R}}_{+}\)-valued, as by Assumption 2.3

$$\begin{aligned} v(x)\ge J(x,\emptyset )=\hat{ v}(x){:}{=}\mathbb {E} \left[ \int _0^{\infty }e^{-\rho t} f(X^{x,\emptyset }_t)dt\right] \ge 0 \quad \forall x\in \mathbb {R}_{++}. \end{aligned}$$
(3.2)

Note that \(\hat{v}\) is nondecreasing, by \(f^{\prime }>0\) (Assumption 2.3) and (2.7).

Proposition 3.1

v is nondecreasing.

Proof

Let \(0<x\le x^{\prime }\). Since \(f^{\prime }>0\) (see Assumption 2.3), from (2.7) we get \(J(x;I)\le J(x^{\prime };I)\) for every \(I\in \mathscr {I}\). The claim follows by taking the supremum over \(I\in \mathscr {I}\). \(\square \)

We denote by \(f^*\) the Fenchel–Legendre transform of f on \(\mathbb {R}_{++}\):

$$\begin{aligned} f^*(\alpha ){:}{=}\sup _{x\in {\mathbb {R}_{++}}} \big \{ f(x)-\alpha x \big \},\quad \forall \alpha \in \mathbb {R}_{++}. \end{aligned}$$
(3.3)

Nonnegativity and continuity of f (see Assumption 2.3) and the condition \(f^{\prime }(\infty )=0\) (again Assumption 2.3) guarantee that \(0\le f^*(\alpha )<\infty \) for all \(\alpha \in \mathbb {R}_{++}\).
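For instance, for the power reward \(f(x)=x^\gamma \) with \(\gamma \in (0,1)\) (which satisfies Assumption 2.3), the supremum in (3.3) is attained at \(x^*=(\gamma /\alpha )^{1/(1-\gamma )}\), and a direct computation gives

$$\begin{aligned} f^*(\alpha )=(1-\gamma )\,\gamma ^{\frac{\gamma }{1-\gamma }}\,\alpha ^{-\frac{\gamma }{1-\gamma }}, \quad \forall \alpha \in \mathbb {R}_{++}, \end{aligned}$$

which makes the bound (3.4) below fully explicit in this benchmark case.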

Proposition 3.2

For all \(\alpha \in \left( 0,c_0\rho \right] \) we have

$$\begin{aligned} 0\le \hat{ v}(x)\le v(x)\le \frac{f^*(\alpha )}{\rho } + \frac{\alpha x}{\rho } , \quad \forall x\in \mathbb {R}_{++} \end{aligned}$$
(3.4)

and

$$\begin{aligned} \limsup _{x\rightarrow \infty }\frac{v(x)}{x}=0. \end{aligned}$$
(3.5)

Proof

The fact that \(0\le \hat{ v}\le v\) was already noticed in (3.2). We show the remaining inequality. Let \(x\in \mathbb {R}_{++}\) and \(I\in \mathscr {I}\). For \(R>0\), define the stopping time \( \hat{\tau }_R{:}{=}\inf \left\{ t\ge 0:X^{x,I}_t\ge R\right\} . \) Notice that, since \(b\in C^1(\mathbb {R}_{++};\mathbb {R})\) and \(b(0)=0\) by Assumption 2.1, the mean value theorem yields

$$\begin{aligned} b(\xi )\le b(0)+M_b \xi =M_b\xi , \quad \forall \xi \in \mathbb {R}_+, \end{aligned}$$
(3.6)

where \(M_b\) is defined in (2.9). Set \(\tau _0{:}{=}0\) and let \(t\in \mathbb {R}_{++}\). Applying Itô’s formula to \(\varphi (s,X^{x,I}_s){:}{=}e^{-\rho s}X^{x,I}_s\), \(s\in [0,\hat{\tau }_R)\), taking expectations after considering that \(X_s^{x,I}\in (0,R)\) for \(s\in [0,\hat{\tau }_R)\), summing up over \(n\in \mathbb {N}\) and using (3.6) and (2.4), we get

$$\begin{aligned} \mathbb {E}\left[ e^{-\rho t} X^{x,I}_{t\wedge \hat{\tau }_R}\right]&= x- \rho \int _0^t e^{-\rho s} \mathbb {E} \left[ \mathbf {1}_{[0, \hat{\tau }_R]}(s) X_s^{x,I}\right] ds+\int _0^te^{-\rho s} \mathbb {E} \left[ \mathbf {1}_{[0, \hat{\tau }_R]}(s) b(X^{x,I}_s) \right] ds \\&\quad + e^{-\rho t}\mathbb {E} \left[ \sum _{n\ge 1, \,\tau _n\le t\wedge \hat{\tau }_R} i_n \right] \\&\le x+(M_b-\rho )\int _0^t e^{-\rho s} \mathbb {E} \left[ \mathbf {1}_{[0, \hat{\tau }_R]}(s) X^{x,I}_s\right] ds + e^{-\rho t} \mathbb {E} \left[ \sum _{n\ge 1, \, \tau _n\le t\wedge \hat{\tau }_R} i_n \right] \\&\le x + e^{-\rho t} \mathbb {E} \left[ \sum _{n\ge 1, \, \tau _n\le t\wedge \hat{\tau }_R} i_n \right] . \end{aligned}$$

By Fatou’s lemma, letting \(R\rightarrow \infty \) and observing that \(\hat{\tau }_{R} \rightarrow \infty \) \(\mathbb {P}\)-a.s., we get

$$\begin{aligned} \mathbb {E}\left[ e^{-\rho t} X^{x,I}_t\right] \le x + e^{-\rho t} \mathbb {E} \left[ \sum _{n\ge 1, \, \tau _n\le t} i_n \right] . \end{aligned}$$
(3.7)

Integrating the second term on the right-hand side of (3.7) over \(t\in (0,\infty )\), we have, using the Fubini–Tonelli theorem (as all the integrands involved are nonnegative),

$$\begin{aligned} \mathbb {E} \left[ \int _0^\infty \left( e^{-\rho t} \sum _{n\ge 1,\, \tau _n\le t} i_n \right) dt \right]= & {} \mathbb {E} \left[ \sum _{n\ge 1} \left( \int _{\tau _n}^\infty e^{-\rho (t-\tau _n)} dt \right) e^{-\rho \tau _n} i_n \right] \nonumber \\= & {} \frac{1}{\rho } \mathbb {E} \left[ \sum _{n\ge 1} e^{-\rho \tau _n }i_n \right] . \end{aligned}$$
(3.8)

Therefore, taking into account (3.7), (3.8) and (2.4), we have

$$\begin{aligned} \mathbb {E}\left[ \int _0^\infty e^{-\rho t}X^{x,I}_t dt\right] \le \frac{1}{\rho } \left( x + \mathbb {E} \left[ \sum _{n\ge 1} e^{-\rho \tau _n}i_n \right] \right) <\infty . \end{aligned}$$
(3.9)

Now let \(\alpha >0\). By definition of \(f^*\) and by (3.9), we can write

$$\begin{aligned}&\mathbb {E} \left[ \int _0^\infty e^{-\rho t}f(X^{x,I}_t)dt - \sum _{n\ge 1}e^{-\rho \tau _n}(c_0i_n+c_1) \right] \\&\quad \le \mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left( f^* (\alpha )+\alpha X^{x,I}_t \right) dt - \sum _{n\ge 1}e^{-\rho \tau _n}(c_0i_n+c_1) \right] \\&\quad \le \frac{f^*(\alpha )}{\rho } + \frac{\alpha x}{\rho } + \left( \frac{\alpha }{\rho }-c_0 \right) \mathbb {E} \left[ \sum _{n\ge 1} e^{-\rho \tau _n}i_n \right] . \end{aligned}$$

By arbitrariness of \(I\in \mathscr {I}\), if \(\alpha \in \left( 0,c_0\rho \right] \), the latter provides the last inequality in (3.4).

Take now \(\alpha \in (0,c_0\rho ]\). By (3.4) we have

$$\begin{aligned} 0\le {\displaystyle {\limsup _{x\rightarrow \infty }\frac{v(x)}{x}}}\le \alpha \, {\displaystyle {\limsup _{x\rightarrow \infty }\frac{v(x)}{\alpha x}\le \alpha \,\limsup _{x\rightarrow \infty }\left\{ \frac{f^*(\alpha )}{\alpha \rho x} + \frac{1}{\rho }\right\} =\frac{\alpha }{\rho }}} \end{aligned}$$

By arbitrariness of \(\alpha \) we get (3.5). \(\square \)

Assumption 3.3

The following conditions hold true.

  1. (i)

    \(\rho >\max \big \{B_0, C_0\big \}\) where \(B_0,C_0\) are the constants defined in Lemma A.3.

  2. (ii)

    For each \(\beta >0\),

    $$\begin{aligned} \ M(\beta ){:}{=}\mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left( f^{\prime }\left( X_t^{\beta ,\emptyset }\right) \right) ^2 dt \right] <\infty . \end{aligned}$$
    (3.10)
  3. (iii)

    For each \(\eta >0\), the function f is semiconvex on \([\eta , \infty )\). Precisely, there exists a nonincreasing function \(K_0:\mathbb {R}_{++}\rightarrow \mathbb {R}_{++}\) such that

$$\begin{aligned} f(\lambda x+(1-\lambda )y)- \lambda f(x)-(1-\lambda )f(y) \le K_0(\eta ) \lambda (1-\lambda ) (y-x)^2, \quad \forall \lambda \in [0,1], \ \forall x,y\in [\eta ,\infty ). \end{aligned}$$
    (3.11)
  4. (iv)

    The function \(K_0\) in (iii) is such that, for each \(\beta >0\),

    $$\begin{aligned} \hat{ M}(\beta ){:}{=}\mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left( K_0\left( X_t^{\beta ,\emptyset }\right) \right) ^2 dt \right] <\infty . \end{aligned}$$
    (3.12)

Remark 3.4

Semiconvex functions are functions that can be written as the difference of a convex function and a quadratic one (see [26, Prop. 1.1.3] or [72, Ch. 4, Sec. 4.2]). Moreover, a function \(f\in C^2([\eta ,\infty );\mathbb {R})\) verifies (3.11) with \(K_0(\eta ){:}{=} -2\,\inf _{[\eta ,\infty )} f^{\prime \prime }\) (see again [26, Prop. 1.1.3]).

The following Proposition shows that power functions satisfy Assumption 3.3(ii)–(iv).

Proposition 3.5

Let \(f\in C^2(\mathbb {R}_{++};\mathbb {R})\) such that \(f^{\prime }>0\), \(f^{\prime \prime }<0\) and

$$\begin{aligned} f^{\prime }(\xi )\le C_0\left( 1+|\xi |^{\gamma -1}\right) , \quad f^{\prime \prime }(\xi )\ge -C_0\left( 1+|\xi |^{\gamma -2}\right) \quad \forall \xi \in \mathbb {R}_{++} \end{aligned}$$
(3.13)

for some \(C_0> 0\) and \(\gamma \in (0,1)\) and let \(\rho >L_b(1-\gamma )+\frac{1}{2}L_\sigma ^2(1-\gamma )(2-\gamma )\). Then f satisfies Assumption 3.3(ii)–(iv).

Proof

Let \(\beta \in \mathbb {R}_{++}\) and observe that, by Assumption 2.1, we have

$$\begin{aligned} |b(\xi )|\le L_b|\xi |, \quad |\sigma (\xi )|\le L_\sigma |\xi | \quad \forall \xi \in \mathbb {R}. \end{aligned}$$

With a localization procedure similar to the one of the proof of Proposition 3.2 (now keeping the process \(X^{\beta ,\emptyset }\) away from 0), we get from Itô’s formula

$$\begin{aligned}&\mathbb {E}\left[ e^{-\rho t}\big |X^{\beta ,\emptyset }_t\big |^{\gamma -1}\right] \\&\quad = |\beta |^{\gamma -1}+\mathbb {E}\left[ \int _0^t e^{-\rho s} \left[ -\rho \big |X^{\beta ,\emptyset }_s\big |^{\gamma -1}+(\gamma -1)\big |X^{\beta ,\emptyset }_s\big |^{\gamma -2} b\left( X_s^{\beta ,\emptyset }\right) \right. \right. \\&\qquad \left. \left. +\, \frac{1}{2} (\gamma -1)(\gamma -2)\big |X^{\beta ,\emptyset }_s\big |^{\gamma -3} \sigma ^2\left( X_s^{\beta ,\emptyset }\right) \right] ds\right] \\&\quad \le |\beta |^{\gamma -1}+\mathbb {E}\left[ \int _0^t e^{-\rho s}\left[ -\rho \big |X^{\beta ,\emptyset }_s\big |^{\gamma -1}+{L}_b(1-\gamma )\,\big |X^{\beta ,\emptyset }_s\big |^{\gamma -1}\right. \right. \\&\qquad \left. \left. +\, \frac{1}{2} {L}^2_\sigma (1-\gamma )(2-\gamma )\big |X^{\beta ,\emptyset }_s\big |^{\gamma -1} \right] ds\right] . \end{aligned}$$

Then Assumption 3.3(ii) follows from (3.13) and Gronwall’s Lemma applied to the inequality above.

Moreover, note that, since \(\xi \mapsto -C_0(1+|\xi |^{\gamma -2})\) is negative and increasing, by Remark 3.4 and (3.13) we obtain that f verifies Assumption 3.3(iii) with

$$\begin{aligned} K_0(\eta ):=-2\gamma (\gamma -1)\eta ^{\gamma -2} \quad \forall \eta \in \mathbb {R}_{++}. \end{aligned}$$
(3.14)

Finally, similarly as above, we have

$$\begin{aligned}&\mathbb {E}\left[ e^{-\rho t}\big |X^{\beta ,\emptyset }_t\big |^{\gamma -2}\right] \\&\quad = |\beta |^{\gamma -2}+\mathbb {E}\left[ \int _0^t e^{-\rho s} \left[ -\rho \big |X^{\beta ,\emptyset }_s\big |^{\gamma -2}+(\gamma -2)\big |X^{\beta ,\emptyset }_s\big |^{\gamma -3} b\left( X_s^{\beta ,\emptyset }\right) \right. \right. \\&\qquad \left. \left. +\, \frac{1}{2} (\gamma -2)(\gamma -3)\big |X^{\beta ,\emptyset }_s\big |^{\gamma -4} \sigma ^2\left( X_s^{\beta ,\emptyset }\right) \right] ds\right] \\&\quad \le |\beta |^{\gamma -2}+\mathbb {E}\left[ \int _0^t e^{-\rho s}\left[ -\rho \big |X^{\beta ,\emptyset }_s\big |^{\gamma -2}+{L}_b(1-\gamma )\,\big |X^{\beta ,\emptyset }_s\big |^{\gamma -1}\right. \right. \\&\qquad \left. \left. +\, \frac{1}{2} {L}_\sigma ^2 (1-\gamma )(2-\gamma )\big |X^{\beta ,\emptyset }_s\big |^{\gamma -1} \right] ds\right] . \end{aligned}$$

Then Assumption 3.3(iv) follows from Gronwall’s Lemma applied to the inequality above and from (3.14). \(\square \)
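As a numerical complement, condition (3.10) can also be explored by simulation. The sketch below estimates \(M(\beta )\) for the illustrative choices \(f(x)=x^\gamma \) and geometric Brownian motion coefficients \(b(x)=bx\), \(\sigma (x)=\sigma x\) (so that \(L_b=b\), \(L_\sigma =\sigma \)), a case covered by Proposition 3.5; all numerical values are hypothetical.

```python
import numpy as np

# Hypothetical parameters with rho > L_b*(1-gamma) + 0.5*L_sigma^2*(1-gamma)*(2-gamma)
b, sigma, rho, gamma, beta = 0.02, 0.3, 0.1, 0.5, 1.0

def estimate_M(n_paths=2000, T=150.0, dt=1e-2, seed=0):
    """Monte Carlo estimate of M(beta) = E[int_0^infty e^{-rho t} f'(X_t)^2 dt]
    in (3.10), with f'(x) = gamma*x**(gamma-1) and uncontrolled GBM dynamics
    (simulated exactly, so the paths stay strictly positive)."""
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, float(beta))
    total = np.zeros(n_paths)
    for k in range(int(T / dt)):
        fprime = gamma * x**(gamma - 1.0)
        total += np.exp(-rho * k * dt) * fprime**2 * dt
        x *= np.exp((b - 0.5 * sigma**2) * dt
                    + sigma * np.sqrt(dt) * rng.standard_normal(n_paths))
    return total.mean()

print("M(1.0) ~", estimate_M())   # finite, in line with Assumption 3.3(ii)
```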

Remark 3.6

Note that, if \(\rho \) satisfies Assumption 3.3(i), then it also satisfies the requirement of Proposition 3.5.

Proposition 3.7

Let Assumption 3.3 hold. Then v is semiconvex on \([\beta ,\infty )\) for each \(\beta >0\), i.e., for each \(\beta >0\) there exists \(K_1(\beta )>0\) such that

$$\begin{aligned} v(\lambda x+(1-\lambda )y)-\lambda v(x)-(1-\lambda ) v(y)\le K_1(\beta ) \lambda (1-\lambda )(x-y)^2\quad \forall \lambda \in [0,1],\ \forall x,y\in [\beta ,\infty ). \end{aligned}$$
(3.15)

Proof

Fix \(\beta >0\). Let \(x,y\in [\beta , \infty )\) with \(x\le y\) and \(I\in \mathscr {I}\). For each \(\lambda \in [0,1]\) set \(z_\lambda {:}{=}\lambda x+(1-\lambda )y\) and \(\Sigma ^{\lambda ,x,y,I}{:}{=}\lambda X^{x,I}+(1-\lambda )X^{y,I}\). We write

$$\begin{aligned} J(z_\lambda ,I)-\lambda J(x,I)-(1-\lambda ) J(y,I)&=\mathbb {E}\left[ \int _0^\infty e^{-\rho t}\left( f\left( X^{z_\lambda ,I}_t\right) -\lambda f\left( X^{x,I}_t\right) -(1-\lambda ) f\left( X^{y,I}_t\right) \right) dt\right] \le \mathbf {A}+\mathbf {B} \end{aligned}$$

(the intervention costs cancel, as the same control I appears in the three payoffs), where

$$\begin{aligned} \mathbf {A}{:}{=}\mathbb {E}\left[ \int _0^\infty e^{-\rho t}\left| f\left( X^{z_\lambda ,I}_t\right) -f\left( \Sigma ^{\lambda ,x,y,I}_t\right) \right| dt\right] ,\quad \mathbf {B}{:}{=}\mathbb {E}\left[ \int _0^\infty e^{-\rho t}\left( f\left( \Sigma ^{\lambda ,x,y,I}_t\right) -\lambda f\left( X^{x,I}_t\right) -(1-\lambda ) f\left( X^{y,I}_t\right) \right) dt\right] . \end{aligned}$$

Applying Hölder’s inequality, observing that \(X^{\beta ,\emptyset }\le X^{z_\lambda ,I}\wedge \Sigma ^{\lambda ,x,y,I}\), that \(f^{\prime }\) is decreasing, and using Assumption 3.3(i)–(ii) and Lemma A.3(ii), we write

$$\begin{aligned} \mathbf {A}&\le \mathbb {E} \left[ \int _0^\infty e^{-\rho t} f^{\prime }\left( X^{\beta ,\emptyset }_t\right) \left| X^{z_\lambda ,I}_t - \Sigma ^{\lambda ,x,y,I}_t \right| dt \right] \\&\le \left( \mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left( f^{\prime }\left( X^{\beta ,\emptyset }_t\right) \right) ^2 dt \right] \right) ^{1/2} \left( \mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left| X^{z_\lambda ,I}_t -\Sigma ^{\lambda ,x,y,I}_t \right| ^2 dt \right] \right) ^{1/2}\\&\le M(\beta )^{1/2} \left( \mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left| X^{z_\lambda ,I}_t -\Sigma ^{\lambda ,x,y,I}_t \right| ^2 dt\right] \right) ^{1/2}\\&\le M(\beta )^{1/2} \left( \int _0^\infty e^{-\rho t} A_0e^{B_0t} dt \right) ^{1/2} \lambda (1-\lambda )|x-y|^2 \\&= \frac{A_0^{1/2}M(\beta )^{1/2}}{(\rho -B_0)^{1/2}} \lambda (1-\lambda )|x-y|^2. \end{aligned}$$

Moreover, by Assumption 3.3(i), (iii), (iv), again using Hölder’s inequality and applying Lemma A.3(i), we have

$$\begin{aligned} \mathbf {B}&\le \lambda (1-\lambda ) \mathbb {E} \left[ \int _0^\infty e^{-\rho t} K_0\left( X^{\beta ,\emptyset }_t\right) \left| X_t^{y,I} - X_t^{x,I} \right| ^2 dt \right] \\&\le \lambda (1-\lambda ) \left( \mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left( K_0\left( X^{\beta ,\emptyset }_t\right) \right) ^2 dt \right] \right) ^{1/2} \left( \mathbb {E} \left[ \int _0^\infty e^{-\rho t} \left| X^{y,I}_t - X^{x,I}_t \right| ^4 dt \right] \right) ^{1/2}\\&\le \lambda (1-\lambda ) \hat{M}(\beta )^{1/2} \left( \int _0^\infty e^{-\rho t} e^{C_0t} dt \right) ^{1/2}|x-y|^2 \\&= \frac{ \hat{ M}(\beta )^{1/2}}{(\rho -C_0)^{1/2}}\lambda (1-\lambda )|x-y|^2. \end{aligned}$$

Now let \(\delta >0\) and let \(I\in \mathscr {I}\) be such that \(v(z_\lambda )\le J(z_\lambda ,I)+\delta \). The inequalities above provide

$$\begin{aligned} v(z_\lambda )-\delta -\lambda v(x)-(1-\lambda )v(y)&\le J(z_\lambda ,I) -\lambda J(x,I)-(1-\lambda )J(y,I)\\&\le K_1(\beta )\lambda (1-\lambda )|x-y|^2\quad \forall x,y\ge \beta , \ \forall \lambda \in [0,1], \end{aligned}$$

where \(K_1(\beta ){:}{=}\frac{ \hat{ M}(\beta )^{1/2}}{(\rho -C_0)^{1/2}}+\frac{ A_0^{1/2} M(\beta )^{1/2}}{(\rho -B_0)^{1/2}}\). We then obtain (3.15) by arbitrariness of \(\delta \). \(\square \)

In view of the fact that the results which follow rely on the semiconvexity of v, Assumption 3.3 will be standing for the remainder of this section and in Sects. 4, 5 and 6.

Define the space

$$\begin{aligned} {\text {Lip}}_{{\small {\mathrm{loc}}},c_0}(\mathbb {R}_{++}){:}{=}\left\{ u:\mathbb {R}_{++}\rightarrow \mathbb {R}\ \text{ locally } \text{ Lipschitz } \text{ continuous } \text{ on } \mathbb {R}_{++}, \ \text{ s.t. }\ \limsup _{x\rightarrow \infty } \frac{u(x)}{x}<c_0 \right\} . \end{aligned}$$
(3.16)

We recall that semiconvex functions on open sets are locally Lipschitz. So, by Propositions 3.2 and 3.7, we have \(v\in {\text {Lip}}_{{\small {\mathrm{loc}}},c_0}(\mathbb {R}_{++})\). The space \( {\text {Lip}}_{{\small {\mathrm{loc}}},c_0}(\mathbb {R}_{++})\) will be used in the next section.

4 Dynamic programming

The dynamic programming equation associated to our dynamic optimization problem is the quasi-variational inequality (see, e.g., [17])

$$\begin{aligned} \min \big \{\mathscr {L}u(x)-f(x),\ u(x)-\mathscr {M}u(x)\big \}=0, \quad x\in \mathbb {R}_{++}, \end{aligned}$$
(QVI)

where \(\mathscr {L}\) and \(\mathscr {M}\) are operators formally defined by

$$\begin{aligned} \mathscr {L}u(x)&{:}{=}&\rho u(x)-b(x) u^{\prime }(x)-\frac{1}{2}\sigma ^2(x)u^{\prime \prime }(x),\quad x\in \mathbb {R}_{++}, \end{aligned}$$
(4.1)
$$\begin{aligned} \mathscr {M}u(x)&{:}{=}&\sup _{i>0}\left\{ u(x+i)-c_0i-c_1\right\} ,\quad x\in \mathbb {R}_{++}. \end{aligned}$$
(4.2)

We note that \(\mathscr {L}\) is a differential operator, so it has a local nature, while \(\mathscr {M}\) is a functional operator having a nonlocal nature.
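To make the nonlocal character of \(\mathscr {M}\) concrete, here is a minimal sketch of its discrete counterpart on a grid, with the supremum in (4.2) restricted to grid-sized impulses; the test function u below is hypothetical. Truncating the grid is harmless in spirit: Lemma 4.1 below shows that, for functions in \({\text {Lip}}_{{\small {\mathrm{loc}}},c_0}(\mathbb {R}_{++})\), the supremum is attained on a bounded set of impulse sizes.

```python
import numpy as np

def intervention_operator(x_grid, u, c0, c1):
    """Discrete counterpart of (4.2): (Mu)(x_j) = max_{k>j} { u(x_k) - c0*(x_k - x_j) - c1 },
    i.e. the supremum over impulses i > 0 restricted to post-impulse states on the grid."""
    n = len(x_grid)
    Mu = np.full(n, -np.inf)
    for j in range(n - 1):
        k = np.arange(j + 1, n)                      # candidate targets x_k > x_j
        Mu[j] = np.max(u[k] - c0 * (x_grid[k] - x_grid[j]) - c1)
    return Mu

# hypothetical test function: concave with sublinear growth, so u lies in Lip_{loc,c0}
x = np.linspace(0.1, 50.0, 2000)
u = 9.877 * np.sqrt(x)
Mu = intervention_operator(x, u, c0=1.0, c1=1.9)
# For a generic u (not a value function) the inequality u >= Mu may well fail;
# points where Mu > u are exactly those where an impulse strictly improves on u.
print("Mu > u on", 100.0 * np.mean(Mu > u), "% of grid points")
```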

4.1 Continuation and action region

Here we define and study the first properties of the continuation and action region in the state space \(\mathbb {R}_{++}\).

Lemma 4.1

\({\mathscr {M}}\) maps \({\text {Lip}}_{{\small {\mathrm{loc}}},c_0}(\mathbb {R}_{++})\) into itself.

Proof

Let \(u\in {\text {Lip}}_{{\small {\mathrm{loc}}},c_0}(\mathbb {R}_{++})\). Then there exist \(\overline{x} ,\varepsilon >0\) such that

$$\begin{aligned} \frac{u(x)}{x}-c_0\le -\varepsilon \quad \forall x\ge \overline{x} . \end{aligned}$$
(4.3)

By (4.3), for all \(i>0\), \(x\ge \overline{x}\), we have

$$\begin{aligned} u(x+i)-(c_0i+c_1)=(x+i) \left( \frac{u(x+i)}{x+i}-c_0 \right) +c_0x-c_1\le (c_0-\varepsilon ) x. \end{aligned}$$

Hence, by taking the supremum over \(i>0\),

$$\begin{aligned} \frac{\mathscr {M}u(x)}{x}\le c_0-\varepsilon \quad \forall x\ge \overline{x}, \end{aligned}$$

which shows that \({\displaystyle \limsup _{x\rightarrow \infty }\frac{\mathscr {M}u(x)}{x}< c_0}\).

Now we show that \(\mathscr {M}u\) is Lipschitz continuous on \([M^{-1},M]\) for each \(M>0\). Using (4.3) one can show that

$$\begin{aligned} \limsup _{i\rightarrow +\infty }\sup _{x\in [M^{-1},M]} \big \{ u(x+i)-c_0i\big \}=-\,\infty . \end{aligned}$$
(4.4)

Set

$$\begin{aligned} U(x){:}{=}\sup \big \{i\in \mathbb {R}_{++}: u(x+i)-c_0i\ge u(x)-1\big \} \quad \forall x\in [M^{-1},M]. \end{aligned}$$

The limit (4.4) provides that there exists \(R>0\) such that

$$\begin{aligned} U(x)\le R \quad \forall x\in [M^{-1},M]. \end{aligned}$$

Hence, we have

$$\begin{aligned} \mathscr {M}u(x)=\sup _{i\in (0,R]} \{u(x+i)-c_0i-c_1\}\quad \forall x\in [M^{-1},M]. \end{aligned}$$
(4.5)

Now let \(\hat{L}\) be the Lipschitz constant of \(u|_{[M^{-1},M+R]}\). Then, if \(M^{-1}\le x<y\le M\), \(0<i\le R\), we can write

$$\begin{aligned} u(x+i)-(c_0i+c_1)- \hat{L}(y-x) \le u(y+i)-(c_0i+c_1)\le u(x+i)-(c_0i+c_1)+ \hat{L}(y-x).\nonumber \\ \end{aligned}$$
(4.6)

Now the claim follows by taking the supremum over \(i\in (0,R]\) on (4.6) and recalling (4.5). \(\square \)

By definition of v we have

$$\begin{aligned} v(x)\ge v(x+i)-c_0i-c_1 \quad \forall i>0, \end{aligned}$$
(4.7)

hence

$$\begin{aligned} v\ge \mathscr {M}v. \end{aligned}$$
(4.8)

We define the continuation region \(\mathscr {C}\) and the action region \(\mathscr {A}\) by

$$\begin{aligned}&\mathscr {C}{:}{=}\big \{ x\in \mathbb {R}_{++}:\ \mathscr {M}v(x)<v(x)\big \} \quad \text{(continuation } \text{ region) } \end{aligned}$$
(4.9)
$$\begin{aligned}&\mathscr {A}{:}{=}\mathbb {R}_{++} {\setminus } \mathscr {C}= \big \{ x\in \mathbb {R}_{++}:\ \mathscr {M}v(x)=v(x)\big \} \quad \text{(action } \text{ region) }. \end{aligned}$$
(4.10)

They will represent, respectively, the region where it will be convenient to let the system evolve autonomously and the region where it will be convenient to undertake an action by exercising an impulse. By Proposition 3.2 and Lemma 4.1, both sides of (4.8) are finite continuous functions. In particular, \(\mathscr {C}\) is open and \(\mathscr {A}\) is closed in \(\mathbb {R}_{++}\).

For \(x\in \mathscr {A}\), let us introduce the set

$$\begin{aligned} \Xi (x){:}{=}{\mathop {\hbox {argmax}}\limits _{i>0}}\,\big \{v(x+i)-c_0i-c_1\big \}. \end{aligned}$$

Clearly \(\Xi (x)\) is empty if \(x\in \mathscr {C}\). In principle \(\Xi (x)\) might be empty even if \(x\in \mathscr {A}\), but this is not the case as shown by the following.

Proposition 4.2

Let \(x\in \mathscr {A}\).

  1. (i)

    \(\Xi (x)\) is not empty.

  2. (ii)

    For all \(\xi \in \Xi (x)\), we have \(x+\xi \in \mathscr {C}.\)

Proof

(i) Let \(x\in {\mathscr {A}}\) and take a sequence \(\{i_n\}_{n\in \mathbb {N}{\setminus }\{0\}}\subset \mathbb {R}_{++}\) such that

$$\begin{aligned} {\mathscr {M}}v(x)\ge v\left( x+i_n\right) -c_0i_n-c_1\ge {\mathscr {M}}v(x)-\frac{1}{n}, \quad \forall n\in \mathbb {N}{\setminus }\{0\}. \end{aligned}$$
(4.11)

Then, considering that \({\displaystyle {\limsup _{i\rightarrow \infty }\frac{v(x+i)}{x+i}=0}}\) by Proposition 3.2 and that \(\mathscr {M}v(x)\) is finite, we easily see, arguing by contradiction, that, in order to fulfill (4.11), the sequence \(\{i_n\}_{n\in \mathbb {N}}\) must be bounded. Hence, by considering a subsequence if necessary, we have \(i_n\rightarrow i^*\in \mathbb {R}_+\). Let us show that \(i^*>0\). Indeed, assume by contradiction that \(i^*=0\). By (4.11), taking into account that v is continuous and that \(v(x)={\mathscr {M}}v(x)\) as \(x\in \mathscr {A}\), we obtain \(v(x)={\mathscr {M}}v(x)\le v(x)-c_1\), a contradiction. Then we have shown that \(i^*>0\). From (4.11) we obtain, by continuity, \({\mathscr {M}}v(x) = v(x+i^*)-c_0i^*-c_1\) and the claim follows.

(ii) This part of the proof closely follows the proof of [43, Prop. 2]. We omit it for brevity.

\(\square \)

Note that, as a consequence of Proposition 4.2, we have \(\mathscr {C}\ne \emptyset \). Indeed, either \(\mathscr {A}=\emptyset \), and thus \(\mathscr {C}=\mathbb {R}_{++}\); or \(\mathscr {A}\ne \emptyset \), and thus \(\mathscr {C}\ne \emptyset \) by Proposition 4.2(ii). Informally, Proposition 4.2 says that, if the system is at a position \(x\in \mathscr {A}\), then an optimal impulse size exists [part (i)] and this impulse places the system in \(\mathscr {C}\) [part (ii)]. We will verify this fact rigorously afterwards.

4.2 Dynamic programming principle and viscosity solutions

The rigorous connection between v and (QVI) passes through the dynamic programming principle (DPP).

Proposition 4.3

For every \(x>0\) and every \({\overline{\mathbb {R}}}_+\)-valued \(\mathbb {F}\)-stopping time \(\tau \),

$$\begin{aligned} v(x)=\sup _{I\in \mathscr {I}}\mathbb {E}\left[ \int _0^{\tau }e^{-\rho t} f\left( X^{x,I}_t\right) dt-\sum _{n\ge 1,\,\tau _n\le \tau }e^{-\rho \tau _n}\left( c_0i_n+c_1\right) +e^{-\rho \tau }v\left( X^{x,I}_{\tau }\right) \right] . \end{aligned}$$
(DPP)

Proof

We refer to [22] (for the finite horizon case; our formulation is the usual one for time homogeneous infinite horizon problems). \(\square \)

Here we study (QVI) by means of viscosity solutions.

Definition 4.4

(Viscosity solution) Let \(u\in {\text {Lip}}_{{\small {\mathrm{loc}}},c_0}(\mathbb {R}_{++})\).

  1. (i)

    u is a viscosity subsolution to (QVI) if for every \((x_0,\varphi )\in \mathbb {R}_{++}\times C^2(\mathbb {R}_{++})\) such that \(u-\varphi \) has a local maximum at \(x_0\) and \(u(x_0)=\varphi (x_0)\) we have

    $$\begin{aligned} \min \big \{\mathscr {L}\varphi (x_0)-f(x_0),u(x_0)-\mathscr {M}u(x_0)\big \}\le 0; \end{aligned}$$
  2. (ii)

    u is a viscosity supersolution to (QVI) if for every \((x_0,\varphi )\in \mathbb {R}_{++}\times C^2(\mathbb {R}_{++})\) such that \(u-\varphi \) has a local minimum at \(x_0\) and \(u(x_0)=\varphi (x_0)\) we have

    $$\begin{aligned} \min \big \{\mathscr {L}\varphi (x_0)-f(x_0),u(x_0)-\mathscr {M}u(x_0)\big \}\ge 0; \end{aligned}$$
  3. (iii)

    u is a viscosity solution to (QVI) if it is both a viscosity subsolution and a viscosity supersolution of (QVI).

Proposition 4.5

The value function v is a viscosity solution of (QVI).

Proof

Supersolution property. Let \(x_0\in \mathbb {R}_{++}\) and \(\varphi \in C^2(\mathbb {R}_{++})\) be such that \(v-\varphi \) has a local minimum at \(x_0\) and \(v(x_0)=\varphi (x_0)\). In particular, \(v\ge \varphi \) on \((x_0-\delta ,x_0+\delta )\) for a suitable \(\delta \in (0,x_0)\). By (4.8), we only need to show that \(\mathscr {L}\varphi (x_0)-f(x_0)\ge 0\). To this aim, consider the stopping time \(\tau {:}{=}\inf \left\{ t\ge 0:|X^{x_0,\emptyset }_t-x_0|>\delta \right\} \) and note that \(\mathbb {P}\{\tau >0\}=1\) by continuity of trajectories. Then, from (DPP) we get

$$\begin{aligned} v(x_0)\ge \mathbb {E}\left[ \int _0^{\tau \wedge \varepsilon } e^{-\rho t} {f}\left( X_t^{x_0,\emptyset }\right) dt +e^{-\rho (\tau \wedge \varepsilon )}v\left( X^{x_0,\emptyset }_{\tau \wedge \varepsilon }\right) \right] \quad \forall \varepsilon >0. \end{aligned}$$
(4.12)

From this we derive

$$\begin{aligned} \varphi (x_0)\ge \mathbb {E}\left[ \int _0^{\tau \wedge \varepsilon } e^{-\rho t} {f}\left( X_t^{x_0,\emptyset }\right) dt +e^{-\rho (\tau \wedge \varepsilon )}\varphi \left( X^{x_0,\emptyset }_{\tau \wedge \varepsilon }\right) \right] \quad \forall \varepsilon >0. \end{aligned}$$
(4.13)

By applying Dynkin’s formula, dividing by \(\varepsilon \), letting \(\varepsilon \rightarrow 0^+\) and considering that \(X^{x_0,\emptyset }\) is right-continuous at 0 and \(\mathbb {P}\{\tau >\varepsilon \}\rightarrow 1\) as \(\varepsilon \rightarrow 0^+\), we obtain the desired inequality.

Subsolution property. Let \(x_0\in \mathbb {R}_{++}\) and \(\varphi \in C^2(\mathbb {R}_{++})\) be such that \(v-\varphi \) has a local maximum at \(x_0\) and \(v(x_0)=\varphi (x_0)\). If \(v(x_0)={\mathscr {M}}v(x_0)\), then we are done. Assume then that \(v(x_0)\ge \xi + {\mathscr {M}}v(x_0)\) for some \(\xi >0\). In this case, we need to show that \( \mathscr {L}\varphi (x_0)-f(x_0)\le 0\). Assume by contradiction that \( \mathscr {L}\varphi (x_0)-f(x_0)\ge \varepsilon \) for some \(\varepsilon >0\). By continuity of \({\mathscr {L}}\varphi -f\) and of \(v-{\mathscr {M}}v\), and in view of the fact that \(v-\varphi \) has a local maximum at \(x_0\) with \(\varphi (x_0)=v(x_0)\), there exists \(\delta \in (0,x_0/2)\) such that

$$\begin{aligned} \forall x\in B(x_0,2\delta ]\quad \left\{ \begin{array}{ll} \text{(i) } &{} \mathscr {L}\varphi (x)-f(x)\ge \varepsilon /2\\ \text{(ii) } &{} \varphi (x) \ge v(x) \\ \text{(iii) } &{} v(x) - {\mathscr {M}}v(x) \ge \xi /2. \end{array}\right. \end{aligned}$$
(4.14)

Now define the stopping time \( \tau {:}{=}\inf \{t\ge 0:| X_t^{x_0,\emptyset }-x_0|>\delta \} \) and note that \(\mathbb {P}\{\tau >0\}=1\). In view of (4.14)(iii), undertaking an investment in the region \(B(x_0,2\delta ]\) is not optimal. Hence (DPP) can be rewritten restricting the range of I to the set of controls such that \(\tau _1>\tau \), yielding the equality

$$\begin{aligned} v(x_0)= \mathbb {E}\left[ \int _0^{\tau }e^{-\rho t} f\left( X^{x_0,\emptyset }_t\right) dt +e^{-\rho \tau } v \left( X_{\tau }^{x_0,\emptyset }\right) \right] . \end{aligned}$$
(4.15)

Finally, we have, by (4.15), Dynkin’s formula and (4.14)(i)–(ii),

$$\begin{aligned} \frac{\varepsilon }{2}\,\mathbb {E}\left[ \int _0^\tau e^{-\rho t}dt\right]&\le \mathbb {E}\left[ \int _0^\tau e^{-\rho t}\left( \mathscr {L}\varphi \left( X_t^{x_0,\emptyset }\right) -f\left( X_t^{x_0,\emptyset }\right) \right) dt \right] \nonumber \\&= \varphi (x_0)- \mathbb {E}\left[ \int _0^{\tau } e^{-\rho t}f\left( X^{x_0,\emptyset }_t\right) dt +e^{-\rho \tau } \varphi \left( X_{\tau }^{x_0,\emptyset }\right) \right] \nonumber \\&\le v(x_0)- \mathbb {E}\left[ \int _0^{\tau } e^{-\rho t}f\left( X^{x_0,\emptyset }_t\right) dt +e^{-\rho \tau } v \left( X_{\tau }^{x_0,\emptyset }\right) \right] =0. \end{aligned}$$
(4.16)

This provides a contradiction, as \(\mathbb {P}\left\{ \tau >0\right\} =1\) implies \(\mathbb {E}\left[ \int _0^\tau e^{-\rho t}dt\right] >0\). \(\square \)

4.3 Regularity of the value function

Here we establish the regularity properties of the value function. Precisely, exploiting the semiconvexity provided by Proposition 3.7 and the viscosity property provided by Proposition 4.5, we show that it is of class \(C^1\) on \(\mathbb {R}_{++}\) and of class \(C^2\) on \(\mathscr {C}\).

Theorem 4.6

\(v\in C^1(\mathbb {R}_{++};\mathbb {R})\,\bigcap \, C^2(\mathscr {C};\mathbb {R})\).

Proof

Let \(x_0\in \mathbb {R}_{++}\). As v is semiconvex in a neighborhood of \(x_0\) (Proposition 3.7), in such a neighborhood it can be written as the difference of a convex function and a quadratic one (see Remark 3.4). Hence, the one-sided derivatives \(v^{\prime }_+(x_0), v^{\prime }_-(x_0)\) exist and \(v^{\prime }_-(x_0)\le v^{\prime }_+(x_0)\). To show that v is differentiable at \(x_0\), we need to show that the previous inequality is indeed an equality. Assume, by contradiction, that \(v^{\prime }_-(x_0)< v^{\prime }_+(x_0)\). Then we can construct a sequence of functions \(\{\varphi _n\}_{n\in \mathbb {N}} \subset C^2(\mathbb {R}_{++})\) such that, for every \(n\in \mathbb {N}\),

$$\begin{aligned} \varphi _n(x_0)=v(x_0), \quad \varphi _n\le v, \quad \varphi ^{\prime }_n(x_0)=\frac{v^{\prime }_-(x_0)+v^{\prime }_+(x_0)}{2}, \quad \varphi _n^{\prime \prime }(x_0)\ge n. \end{aligned}$$

Then \({\mathscr {L}}\varphi _n(x_0)-f(x_0)\rightarrow -\infty \) as \(n\rightarrow \infty \), which is impossible as v is a viscosity supersolution to (QVI), by Proposition 4.5. Hence it must be \(v^{\prime }_-(x_0)=v^{\prime }_+(x_0)\). By arbitrariness of \(x_0\), this shows that v is differentiable on \(\mathbb {R}_{++}\). By semiconvexity we deduce that \(v\in C^1(\mathbb {R}_{++})\) (see [65, Theorem 25.5]).

The fact that \(v\in C^2({\mathscr {C}};\mathbb {R})\) follows from a standard localization argument: in each interval \((a,b)\subset {\mathscr {C}}\) the function v is a viscosity solution to the linear equation \({\mathscr {L}}u-f=0\) with boundary conditions \(u(a)=v(a)\) and \(u(b)=v(b)\). By uniform ellipticity of \({\mathscr {L}}\) over (a,b) (see, e.g., [36, Ch. 6]), this equation admits a unique solution in \(C^2((a,b);\mathbb {R})\), which must also be a viscosity solution. By uniqueness of viscosity solutions to the linear equation above with Dirichlet boundary conditions, we conclude that v coincides with the classical solution; hence \(v\in C^2((a,b);\mathbb {R})\). As \(\mathscr {C}\) is open, the claim follows by arbitrariness of (a,b). \(\square \)
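Incidentally, the localization argument in the second part of the proof also suggests a simple numerical scheme: on any interval \((a,b)\subset \mathscr {C}\) one can solve the Dirichlet problem \(\mathscr {L}u=f\), \(u(a)=v(a)\), \(u(b)=v(b)\) by finite differences. A minimal sketch, assuming (purely for illustration) geometric Brownian motion coefficients and a power reward, for which the boundary data below are taken from the explicit benchmark solution:

```python
import numpy as np

def solve_dirichlet(a, b_right, ua, ub, n, drift, vol, rho, f):
    """Central finite differences for rho*u - drift(x)*u' - 0.5*vol(x)^2*u'' = f(x)
    on (a, b_right), with Dirichlet data u(a)=ua, u(b_right)=ub."""
    x = np.linspace(a, b_right, n + 2)
    h = x[1] - x[0]
    xi = x[1:-1]
    lower = drift(xi) / (2 * h) - 0.5 * vol(xi) ** 2 / h**2    # coefficient of u_{j-1}
    diag = rho + vol(xi) ** 2 / h**2                           # coefficient of u_j
    upper = -drift(xi) / (2 * h) - 0.5 * vol(xi) ** 2 / h**2   # coefficient of u_{j+1}
    A = np.diag(diag) + np.diag(lower[1:], -1) + np.diag(upper[:-1], 1)
    rhs = f(xi)
    rhs[0] -= lower[0] * ua
    rhs[-1] -= upper[-1] * ub
    return x, np.concatenate([[ua], np.linalg.solve(A, rhs), [ub]])

# benchmark: for b(x)=b0*x, sigma(x)=s0*x, f(x)=x**gamma the exact solution with
# these boundary data is C*x**gamma, C = 1/(rho - b0*gamma - 0.5*s0^2*gamma*(gamma-1))
b0, s0, rho, gamma = 0.02, 0.3, 0.1, 0.5
C = 1.0 / (rho - b0 * gamma - 0.5 * s0**2 * gamma * (gamma - 1))
x, u = solve_dirichlet(1.0, 10.0, C * 1.0**gamma, C * 10.0**gamma, 400,
                       lambda x: b0 * x, lambda x: s0 * x, rho, lambda x: x**gamma)
print("max error:", np.max(np.abs(u - C * x**gamma)))   # decays like h^2
```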

Corollary 4.7

We have

  1. (i)

    \(v^{\prime }(x+\zeta )=c_0\), for every \(x\in \mathscr {A}\) and every \(\zeta \in \Xi (x)\).

  2. (ii)

    \(v^{\prime }(x) =c_0\), for every \(x\in \mathscr {A}.\)

Proof

The proof is the same as in [43, Lemma 5.2] and we skip it for the sake of brevity. \(\square \)

Corollary 4.7(i) will be used in the next section to characterize the optimal target point, i.e. the point in the continuation region where it is optimal to place the system when it reaches the action region.

5 Explicit expression of the value function

In this section we characterize \(\mathscr {C}\), \(\mathscr {A}\) and v up to the decreasing solution of the homogeneous ODE \(\mathscr {L}u=0\) and to the solution of a nonlinear system of three algebraic equations.

Lemma 5.1

\(\mathscr {A}\) does not contain any interval of the form \([a,\infty )\), with \(a>0\). In particular \(\mathscr {C}\ne \emptyset \).

Proof

Assume, by contradiction, that there exists \(a>0\) such that \(\mathscr {A}\supset [a,\infty )\). Then, due to Corollary 4.7(ii), we have

$$\begin{aligned} v(x)=c_0(x-a)+v(a), \quad \forall x\ge a, \end{aligned}$$

which contradicts Proposition 3.2. On the other hand we should also have

$$\begin{aligned} v(x)=\mathscr {M}v(x), \quad \forall x\ge a. \end{aligned}$$

So it must be

$$\begin{aligned} c_0(x-a)+v(a)=\sup _{i>0}\big \{{c_0(x+i-a)}+v(a)-c_0i-c_1\big \} \quad \forall x\ge a, \end{aligned}$$

which is impossible as \(c_1>0\). \(\square \)

The following assumption ensures that the action region is an interval.

Assumption 5.2

\(b|_{\mathbb {R}_+}\) is concave.

Lemma 5.3

Let Assumption 5.2 hold. Then \(\mathscr {A}\) is an interval.

Proof

Since \(\mathscr {A}\) is closed, it is sufficient to show that there do not exist points \(x_0,x_1\in \mathbb {R}_{++}\), with \(x_0<x_1\), such that \(x_0,x_1 \in \mathscr {A}\) and \((x_0,x_1)\subset \mathscr {C}\). Arguing by contradiction, we assume that such points exist. Given \(x\in (x_0,x_1)\), for every \(i>0\) set \(j{:}{=}i-(x_1-x)\).

Then, recalling that \(x\in \mathscr {C}\), so \(v(x)>\mathscr {M}v(x)\), and that \(x_1\in \mathscr {A}\), hence \(v(x_1)=\mathscr {M}v(x_1)\), we can write

$$\begin{aligned} v(x)&> \mathscr {M}v(x)=\sup _{i>0}\big \{v(x+i)-c_0i-c_1\big \}\ge \sup _{i>x_1-x}\big \{v(x+i)-c_0i-c_1\big \}\\&=\sup _{j>0}\big \{v(x_1+j)-c_0j-c_1\big \}+c_0(x-x_1)=v(x_1)+c_0(x-x_1), \quad \forall x\in (x_0,x_1). \end{aligned}$$

Therefore

$$\begin{aligned} v(x)-v(x_1)> c_0(x-x_1) \quad \forall x\in (x_0,x_1). \end{aligned}$$
(5.1)

Due to Proposition 4.2(i), we have, for some \(y_1>x_1\) with \(y_1\in \mathscr {C}\),

$$\begin{aligned} v(x_1)= v(y_1)-c_0(y_1-x_1)-c_1. \end{aligned}$$
(5.2)

On the other hand, \(v\ge \mathscr {M}v\) implies

$$\begin{aligned} v(x)\ge v(y_1)-c_0(y_1-x)-c_1 \quad \forall x\in (x_1,y_1). \end{aligned}$$
(5.3)

Combining (5.2) and (5.3) we get

$$\begin{aligned} v(x)-v(x_1)\ge c_0(x-x_1) \quad \forall x\in (x_1,y_1). \end{aligned}$$
(5.4)

Then (5.1) and (5.4) show that the function

$$\begin{aligned} \varphi (x)=v(x_1)+ c_0(x-x_1), \quad \ x\in \mathbb {R}_{++}, \end{aligned}$$

is such that \(\varphi (x_1)=v(x_1)\) and \(v-\varphi \) has a local minimum at \(x_1\). Since v is a viscosity supersolution to (QVI), this implies

$$\begin{aligned} \rho v(x_1)-c_0b(x_1)\ge f(x_1). \end{aligned}$$
(5.5)

Now, by (5.1), there exists \(\xi \in (x_0,x_1)\) such that \(v^{\prime }(\xi )<c_0\). Let

$$\begin{aligned} y_2{:}{=}\sup \left\{ x\in [x_0,\xi ):\ v^{\prime }(x)\ge c_0\right\} . \end{aligned}$$

The definition above is well posed as \(x_0\in \mathscr {A}\), so that by Corollary 4.7(ii) we have \(v^{\prime }(x_0)=c_0\). Moreover, by continuity of \(v^{\prime }\) and by definition of \(y_2\) we have

$$\begin{aligned} y_2<\xi<x_1, \quad v^{\prime }(y_2)=c_0, \quad v^{\prime }(x)<c_0 \quad \forall x\in (y_2,\xi ). \end{aligned}$$
(5.6)

Therefore, considering that v is twice differentiable in \((x_0,\xi )\) as this interval is contained in \(\mathscr {C}\), from (5.6) and by continuity of \(v^{\prime }\) we see that

$$\begin{aligned} v^{\prime }(y_2)=c_0,\ v^{\prime \prime }(y_2)\le 0. \end{aligned}$$
(5.7)

The equality \(\mathscr {L}v=f\) holds in the classical sense at \(y_2\); hence (5.7) entails

$$\begin{aligned} \rho v(y_2)- c_0b(y_2)\le f(y_2). \end{aligned}$$
(5.8)

Combining (5.5) with (5.8), we get

$$\begin{aligned} \rho (v(x_1)-v(y_2))- c_0(b(x_1)-b(y_2))\ge f(x_1)-f(y_2). \end{aligned}$$
(5.9)

On the other hand, considering (5.1) with \(x=y_2\) and then combining it with (5.9), we get

$$\begin{aligned} \rho c_0(x_1-y_2) -c_0(b(x_1)-b(y_2))> f(x_1)-f(y_2) \end{aligned}$$
(5.10)

Now, as \(x_1\in \mathscr {A}\), by (5.2) we have

$$\begin{aligned} v(y_1)-c_0(y_1-x_1)-c_1=\sup _{y>x_1} \big \{{v}(y)-c_0(y-x_1)-c_1\big \}. \end{aligned}$$
(5.11)

The function v is twice differentiable at \(y_1\) since \(y_1\in \mathscr {C}\), so (5.11) yields

$$\begin{aligned} v^{\prime }(y_1)=c_0, \ v^{\prime \prime }(y_1)\le 0. \end{aligned}$$

Therefore the equality \(\mathscr {L}v(y_1)=f(y_1)\) yields the inequality

$$\begin{aligned} \rho v(y_1)- c_0 b(y_1)\le f(y_1). \end{aligned}$$
(5.12)

Combining (5.12) with (5.5), we get

$$\begin{aligned} \rho (v(y_1)-v(x_1))- c_0(b(y_1)-b(x_1))\le f(y_1)-f(x_1). \end{aligned}$$
(5.13)

On the other hand, from (5.11) we get

$$\begin{aligned} v(y_1)-v(x_1)\ge c_0(y_1-x_1). \end{aligned}$$
(5.14)

So, from (5.13) and (5.14) we get

$$\begin{aligned} \rho c_0(y_1-x_1)-c_0(b(y_1)-b(x_1))\le f(y_1)-f(x_1). \end{aligned}$$
(5.15)

To conclude, note that (5.10) and (5.15) are not compatible with the strict concavity of

$$\begin{aligned} \mathbb {R}_{++}\rightarrow \mathbb {R},\ x \mapsto f(x)+c_0 b(x)-\rho c_0 x \end{aligned}$$

which follows from Assumptions 2.3 and 5.2. \(\square \)

Under Assumption 5.2, Lemma 5.1 and Lemma 5.3 provide

$$\begin{aligned} \begin{array}{ccl} \text{ either } &{} \text{(i) }\,\,\quad &{}\mathscr {C}= \mathbb {R}_{++}\\ \text{ or } &{} \text{(ii) }\,\,\quad &{} \exists \ r, s, \ \ 0\le r< s<\infty :\mathscr {C}=(0,r)\cup (s,\infty ). \end{array} \end{aligned}$$
(5.16)

Case (i) above corresponds to the case in which the continuation region covers the whole state space and it is never convenient to undertake an action. In case (ii) the action region is not empty and it is convenient to undertake an action when the system reaches this region.

Consider the homogeneous ODE

$$\begin{aligned} \mathscr {L}u=0 \quad \text{ on }\quad \mathbb {R}_{++}. \end{aligned}$$
(5.17)

By [19, Th. 16.69] its general solution is of the form

$$\begin{aligned} u=A\psi +B\varphi , \quad A,B\in \mathbb {R}, \end{aligned}$$

where \(\psi , \varphi \) are, respectively, the unique (up to a multiplicative constant) strictly increasing and strictly decreasing solutions to (5.17) and, as 0 and \(\infty \) are not accessible boundaries for the reference diffusion Z, these fundamental solutions satisfy the following boundary conditions

$$\begin{aligned} \psi (0^+){:}{=}\lim _{x\rightarrow 0^+}\psi (x)=0, \quad \varphi (0^+){:}{=}\lim _{x\rightarrow 0^+}\varphi (x)=+\infty , \ \ \ \lim _{x\rightarrow \infty } \psi (x)=+\infty , \quad \lim _{x\rightarrow \infty } \varphi (x)=0. \end{aligned}$$
(5.18)
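For instance, if the uncontrolled dynamics is a geometric Brownian motion, i.e. \(b(x)=bx\) and \(\sigma (x)=\sigma x\) with \(\sigma >0\) (the case of the numerical illustration of Sect. 7), then (5.17) reads \(\rho u-bxu^{\prime }-\frac{1}{2}\sigma ^2x^2u^{\prime \prime }=0\) and a direct computation shows that

$$\begin{aligned} \psi (x)=x^{m_+}, \quad \varphi (x)=x^{m_-}, \quad \text{ where } \ m_-<0<m_+ \ \text{ solve } \ \frac{1}{2}\sigma ^2 m(m-1)+bm-\rho =0, \end{aligned}$$

and these indeed satisfy the boundary conditions (5.18).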

Other properties of these functions can be found in [19, Sec. 16.11]. On the other hand, the function \(\hat{v}\) defined in (3.2) is the unique solution on \(\mathbb {R}_{++}\), within the class of functions having at most linear growth, to the nonhomogeneous ODE \(\mathscr {L}u=f\) (see [19, Th. 16.72]; actually, in the quoted result the function f is required to be bounded, but the proof works as well in our context within the class of functions having at most linear growth). It follows that every classical solution to

$$\begin{aligned} \mathscr {L}u=f, \quad \text{ over } \ \mathscr {I}\subset \mathbb {R}_{++}, \end{aligned}$$
(5.19)

where \(\mathscr {I}\) is an open interval, must have the form \(u=A\psi +B \varphi +\hat{v}\). Therefore, as by Proposition 4.5 and Theorem 4.6 the value function v solves (5.19) in the classical sense, according to the two possibilities of (5.16), in case (i) there must exist real numbers A, B such that

$$\begin{aligned} \begin{aligned}&v=\hat{v}+A\psi +B\varphi \quad \text{ on }\quad \mathbb {R}_{++};&\end{aligned} \end{aligned}$$
(5.20)

in case (ii) there must exist real numbers \(A_r,B_r,A_s,B_s\)

$$\begin{aligned} \begin{aligned}&\left\{ \begin{array}{ll} v=\hat{v}+A_r\psi +B_r\varphi &{} \quad \text{ on }\quad (0,r), \\ v=\hat{v}+A_s\psi +B_s\varphi &{} \quad \text{ on }\quad (s,\infty ). \end{array}\right.&\end{aligned} \end{aligned}$$
(5.21)

Proposition 5.4

Let Assumption 5.2 hold. According to the cases (i) and (ii) of (5.16) we have, respectively:

  • If case (i) holds, then \(v\equiv \hat{v}\), hence \(A=B=0\) in (5.20);

  • If case (ii) holds, then \({\displaystyle \lim _{x\rightarrow \infty }} (v(x)-\hat{v}(x))=0\) and \(A_s=B_r=0\), \(A_r,B_s\ge 0\) in (5.21).

Proof

Assume that case (i) holds. As \(\mathscr {L}v=f\) on \(\mathscr {C}=\mathbb {R}_{++}\), by a standard localization procedure we get (see, e.g., the proof of Proposition 3.2)

$$\begin{aligned} v(x) =\mathbb {E}\left[ \int _0^{t} e^{-\rho s}f\left( X^{x,\emptyset }_s\right) ds\right] +\mathbb {E}\left[ e^{-\rho t} v \left( X_t^{x,\emptyset }\right) \right] \quad \forall t\in \mathbb {R}_+. \end{aligned}$$
(5.22)

We pass to the limit \(t\rightarrow \infty \) in the first addend on the right-hand side by using the monotone convergence theorem. As for the second addend, we use (3.4) and (3.7) with \(I=\emptyset \) to write

$$\begin{aligned} 0\le \mathbb {E}\left[ e^{-\rho t} v \left( X_t^{x,\emptyset }\right) \right] \le e^{-\rho t}\frac{f^*(\alpha )}{\rho } + \frac{\alpha }{\rho } x \quad \forall \alpha \in (0,c_0\rho ]. \end{aligned}$$

Then

$$\begin{aligned} 0\le \limsup _{t\rightarrow \infty } \ \mathbb {E}\left[ e^{-\rho t} v \left( X_t^{x,\emptyset }\right) \right] \le \frac{\alpha }{\rho } x \quad \forall \alpha \in (0,c_0\rho ]. \end{aligned}$$

By arbitrariness of \(\alpha \) we conclude that

$$\begin{aligned} \lim _{t\rightarrow \infty } \mathbb {E}\left[ e^{-\rho t} v \left( X_t^{x,\emptyset }\right) \right] =0. \end{aligned}$$

Hence

$$\begin{aligned} v(x)=\mathbb {E}\left[ \int _0^\infty e^{-\rho s}f\left( X^{x,\emptyset }_s\right) ds\right] . \end{aligned}$$
(5.23)

By definition of \(\hat{v}\) and by the inequality \(v\ge \hat{ v}\), this proves the claim.

Now assume that case (ii) holds. For each \(x>s\) set \(\tau _{x}{:}{=}\inf \left\{ t\ge 0:X^{x,\emptyset }_t\le s\right\} \). As \(\infty \) is a natural boundary for \(Z^{0,x}=X^{x,\emptyset }\), by (A.2) we have

$$\begin{aligned} \lim _{x\rightarrow \infty }\mathbb {P}\{\tau _{x}\ge M\}=1 \quad \forall M>0. \end{aligned}$$
(5.24)

If \(0<x<x^{\prime }\), by (2.7) with \(I=\emptyset \) we get

$$\begin{aligned} \mathbb {P}\text{-a.s. },\ X^{x,\emptyset }_t\le X_t^{x^{\prime },\emptyset }\ \text{ for } \text{ all } t\ge 0, \end{aligned}$$

so we also have \(\tau _x\le \tau _{x^{\prime }}\) \(\mathbb {P}\)-a.s. If \(\{x_n\}_{n\in \mathbb {N}}\) is a sequence diverging to \(\infty \), we then have

$$\begin{aligned} \lim _{n\rightarrow \infty }\tau _{x_n}=\infty \quad \mathbb {P}\text{-a.s. } \end{aligned}$$
(5.25)

As \(\mathscr {L}v=f\) on \((s,\infty )\), arguing as for (5.22) we get

$$\begin{aligned} v(x_n)=\mathbb {E}\left[ \int _0^{\tau _{x_n}\wedge t} e^{-\rho \zeta }f\left( X^{x_n,\emptyset }_\zeta \right) d\zeta \right] + \mathbb {E}\left[ e^{-\rho (\tau _{x_n}\wedge t)} v \left( X^{x_n,\emptyset }_{t\wedge \tau _{x_n}}\right) \right] \quad \forall t\in \mathbb {R}_+,\ n\in \mathbb {N}. \end{aligned}$$
(5.26)

Therefore, splitting the second addend on the right-hand side over \(\{\tau _{x_n}< t\}\) and \(\{\tau _{x_n}\ge t\}\),

$$\begin{aligned} v(x_n)&=\mathbb {E}\left[ \int _0^{\tau _{x_n}\wedge t} e^{-\rho \zeta }f\left( X^{x_n,\emptyset }_\zeta \right) d\zeta \right] + \mathbb {E}\left[ \mathbf {1}_{\{\tau _{x_n}\ge t\}} e^{-\rho t} v \left( X^{x_n,\emptyset }_t\right) \right] \\&\quad +\mathbb {E}\left[ \mathbf {1}_{\{\tau _{x_n}<t\}}e^{-\rho (\tau _{x_n}\wedge t)} v \left( X^{x_n,\emptyset }_{t\wedge \tau _{x_n}}\right) \right] \\&\le \mathbb {E}\left[ \int _0^{\tau _{x_n}\wedge t} e^{-\rho \zeta }f\left( X^{x_n,\emptyset }_\zeta \right) d\zeta \right] + \mathbb {E}\left[ \mathbf {1}_{\{\tau _{x_n}\ge t\}} e^{-\rho t} v \left( X^{x_n,\emptyset }_t\right) \right] + \mathbb {E}[e^{-\rho \tau _{x_n}}\mathbf {1}_{\{\tau _{x_n}<t\}}]v(s). \end{aligned}$$

for all \(t\ge 0\). Now we pass to the limit \(t\rightarrow \infty \) by using the same arguments used to obtain (5.23), and we get

$$\begin{aligned} v(x_n) \le \mathbb {E}\left[ \int _0^{\tau _{x_n}} e^{-\rho \zeta }f\left( X^{x_n,\emptyset }_\zeta \right) d\zeta \right] +\mathbb {E}[e^{-\rho \tau _{x_n}} \mathbf {1}_{\{\tau _{x_n}<\infty \}}]v (s). \end{aligned}$$

Then, the definition of \(\hat{v}\) provides

$$\begin{aligned} v(x_n)-\mathbb {E}\left[ e^{-\rho \tau _{x_n}} \mathbf {1}_{\{\tau _{x_n}<\infty \}}\right] v (s) \le \hat{v}(x_n)-\mathbb {E}\left[ \mathbf {1}_{\{\tau _{x_n}<\infty \}}\int _{\tau _{x_n}}^{\infty } e^{-\rho \zeta }f\left( X_\zeta ^{x_n,\emptyset }\right) d\zeta \right] \le \hat{ v}(x_n). \end{aligned}$$

Using (5.25) and recalling that \(v \ge \hat{v}\), we get \( \displaystyle \lim _{n\rightarrow \infty } (v(x_n)-\hat{v}(x_n))=0\). Since the sequence \(\{x_n\}_{n\in \mathbb {N}}\) was arbitrary, we conclude

$$\begin{aligned} \displaystyle \lim _{x\rightarrow \infty }(v(x)-\hat{ v}(x))=0. \end{aligned}$$
(5.27)

From (5.18) and (5.27) we have \(A_s=0\) and \(B_s\ge 0\). Finally, since \(v\ge \hat{v}\) and v is finite in (0, r), from (5.18) we have \(A_r\ge 0\) and \(B_r=0\). \(\square \)

Set

$$\begin{aligned} \hat{v}^*(z){:}{=}\sup _{x>0} \big \{\hat{v}(x)-zx\big \}, \quad z\in \mathbb {R}_{++}. \end{aligned}$$

We are now going to introduce an assumption, requiring that \(c_1\) is not too large, which guarantees, at once, that the action region is not empty and that the continuation and action regions have the structure

$$\begin{aligned} \mathscr {A}=(0,s] \quad \text{ and } \quad \mathscr {C}=(s,\infty ) \quad \text{ for } \text{ some } \ s>0. \end{aligned}$$

Under this structure, it is convenient to undertake an action when the system lies below a given threshold and to let it evolve autonomously when it lies above this threshold. Henceforth, we will call this threshold the trigger boundary.

Assumption 5.5

\(c_1<{ \hat{ v}^*}(c_0)\).

The following result provides an explicit way to check the validity of Assumption 5.5.

Proposition 5.6

Let \(f(x)\ge K x^\gamma \) for every \(x\in \mathbb {R}_{++}\), for some \(K>0\) and \(\gamma \in (0,1)\), and set \(K^{\prime }:= \frac{\gamma K}{\rho +\gamma L_b+\frac{1}{2}\gamma (1-\gamma )L_\sigma ^2}\). Then

$$\begin{aligned} \hat{v}^*(c_0)\ge K^{\prime }\frac{1-\gamma }{\gamma }\left( \frac{c_0}{K^{\prime }}\right) ^{\frac{\gamma }{\gamma -1}}. \end{aligned}$$

Proof

Let \(x\in \mathbb {R}_{++}\). With a localization procedure similar to that of the proof of Proposition 3.2, we get from Itô’s formula

$$\begin{aligned}&\mathbb {E}\left[ e^{-\rho t}\big |X^{x,\emptyset }_t\big |^{\gamma }\right] \\&= x^{\gamma }+\mathbb {E}\left[ \int _0^t e^{-\rho s} \left[ -\rho \big (X^{x,\emptyset }_s\big )^{\gamma }+\gamma \big (X^{x,\emptyset }_s\big )^{\gamma -1} b(X_s^{x,\emptyset }) \right. \right. \\&\qquad \left. \left. +\, \frac{1}{2} \gamma (\gamma -1)\big (X^{x,\emptyset }_s\big )^{\gamma -2} \sigma ^2(X_s^{x,\emptyset }) \right] ds\right] \\&\quad \ge x^{\gamma }+\mathbb {E}\left[ \int _0^t e^{-\rho s}\left[ -\rho \big (X^{x,\emptyset }_s\big )^{\gamma }-\gamma {L}_b\,\big (X^{x,\emptyset }_s\big )^{\gamma }\right. \right. \\&\qquad \left. \left. -\, \frac{1}{2} {L}^2_\sigma \gamma (1-\gamma )\big (X^{x,\emptyset }_s\big )^{\gamma } \right] ds\right] . \end{aligned}$$

Then we get

$$\begin{aligned} \mathbb {E}\left[ e^{-\rho t}\big (X^{x,\emptyset }_t\big )^{\gamma }\right] \ge x^\gamma e^{-\left( \rho +\gamma L_b+\frac{1}{2}\gamma (1-\gamma )L_\sigma ^2\right) t}, \quad \forall t\in \mathbb {R}_+. \end{aligned}$$

From that and from the assumption on f, we obtain

$$\begin{aligned} \hat{v} (x)\ge \frac{K}{\rho +\gamma L_b+\frac{1}{2}\gamma (1-\gamma )L_\sigma ^2} x^\gamma =\frac{K^{\prime }}{\gamma } x^\gamma , \quad \forall x\in \mathbb {R}_{++}. \end{aligned}$$

Hence,

$$\begin{aligned} \hat{v}^*(c_0):=\sup _{x>0}\big \{\hat{v}(x)-c_0 x\big \} \ge \sup _{x>0}\Bigg \{\frac{K^{\prime }}{\gamma }x^\gamma -c_0 x\Bigg \}= K^{\prime }\frac{1-\gamma }{\gamma }\left( \frac{c_0}{K^{\prime }}\right) ^{\frac{\gamma }{\gamma -1}}. \end{aligned}$$

\(\square \)
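As a concrete illustration of how Proposition 5.6 can be used to check Assumption 5.5, the following sketch (in Python) evaluates the lower bound on \(\hat{v}^*(c_0)\). The parameter values are borrowed from the numerical study of Sect. 7.1 below, and the identifications \(K=1/\gamma \) (so that \(f(x)=x^\gamma /\gamma \ge Kx^\gamma \)), \(L_b=|\nu |\) and \(L_\sigma =\sigma \) are our illustrative assumptions, not prescribed by the statement.

```python
# Hedged numerical check of Assumption 5.5 via the bound of Proposition 5.6.
rho, nu, sigma, gamma = 0.08, -0.07, 0.25, 0.5   # illustrative parameters (Sect. 7.1)
c0, c1 = 1.0, 10.0
K = 1.0 / gamma                  # f(x) = x^gamma/gamma >= K x^gamma with K = 2 (assumption)
L_b, L_sigma = abs(nu), sigma    # assumed identifications of the constants L_b, L_sigma

K_prime = gamma * K / (rho + gamma * L_b + 0.5 * gamma * (1 - gamma) * L_sigma**2)
lower_bound = K_prime * (1 - gamma) / gamma * (c0 / K_prime) ** (gamma / (gamma - 1))

# Assumption 5.5 requires c1 < vhat*(c0); it certainly holds if c1 < lower_bound.
print(lower_bound, c1 < lower_bound)             # approx 66.30, True
```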

Proposition 5.7

Let Assumptions 5.2 and 5.5 hold. Then there exists \(s>0\) such that \(\mathscr {C}=(s,\infty )\) and, consequently, \(\mathscr {A}=(0,s]\).

Proof

First, notice that, since \(\hat{v}\) satisfies (3.4), \(\hat{v}^*\) is finite on \(\mathbb {R}_{++}\). Considering that \(v\ge \hat{v}\) and that \(\hat{v}\) is nondecreasing, we have

$$\begin{aligned} \lim _{x\rightarrow 0^+}v(x)&\ge \lim _{x\rightarrow 0^+}\mathscr {M}v(x) \ge \lim _{x\rightarrow 0^+}\mathscr {M}\hat{ v}(x)= \lim _{x\rightarrow 0^+}\sup _{i>0}\big \{\hat{ v}(x+i)-c_0i-c_1\big \}\nonumber \\&\ge \lim _{x\rightarrow 0^+} \sup _{i>0} \big \{\hat{v}(i)-c_0i-c_1\big \} =\hat{v}^*(c_0)-c_1>0. \end{aligned}$$
(5.28)

Now assume by contradiction that \((0,r)\subset \mathscr {C}\), for some \(r>0\). By Proposition 5.4 we have

$$\begin{aligned} v(x)=\hat{ v}(x)+A_r \psi (x), \quad x\in (0,r), \end{aligned}$$

for some \(A_r\ge 0\). Then, as \(\psi (0^+)=0\), we must have \(v(0^+)=\hat{v}(0^+)=0\). The latter contradicts (5.28), hence we conclude. \(\square \)

Under Assumptions 5.2 and 5.5, the structure of \(\mathscr {C}\) and \(\mathscr {A}\) established by Proposition 5.7 joined with Proposition 5.4 provides the following structure for v: for some \(B= B_s\ge 0\)

$$\begin{aligned} v(x)=\left\{ \begin{array}{ll} B\varphi (x)+\hat{v}(x),&{}\quad \text{ if } \ x\in (s,\infty ),\\ B\varphi (s)+\hat{v}(s)-c_0(s-x), &{}\quad \text{ if } \ x\in (0,s]. \end{array}\right. \end{aligned}$$
(5.29)

Lemma 5.8

Let Assumption 5.2 hold. Let \(a\ge 0\) and let \(u\in C^2((a,\infty );\mathbb {R})\) satisfy \(\mathscr {L}u=f\) on \((a,\infty )\). If \(x_0\in (a,\infty )\) is a local minimum point for \(u^{\prime }\), then \(u^{\prime }(x_0)> 0\) and there is no local maximum point for \(u^{\prime }\) in \((x_0,\infty )\).

Proof

As \(b,\sigma , {f}\in C^1(\mathbb {R}_{++};\mathbb {R})\), from

$$\begin{aligned} \rho u(x)=b(x)u^{\prime }(x)+\frac{1}{2}\sigma ^2(x)u^{\prime \prime }(x)+f(x), \quad \forall x\in (a,\infty ), \end{aligned}$$
(5.30)

we obtain \(u^{\prime \prime }\in C^1((a,\infty );\mathbb {R})\), i.e. \(u\in C^3((a,\infty );\mathbb {R})\). Differentiating (5.30) yields

$$\begin{aligned} \rho u^{\prime }(x)= b^{\prime }(x)u^{\prime }(x)+b(x)u^{\prime \prime }(x)+\frac{1}{2} \sigma ^2(x)u^{\prime \prime \prime }(x)+\sigma (x)\sigma ^{\prime }(x)u^{\prime \prime }(x)+f^{\prime }(x), \quad \forall x\in (a,\infty ). \end{aligned}$$
(5.31)

Let \(x_0\in (a,\infty )\) be a local minimum point for \(u^{\prime }\). Then \(u^{\prime \prime }(x_0)=0\) and \(u^{\prime \prime \prime }(x_0)\ge 0\) so, by (5.31), we have

$$\begin{aligned} \rho u^{\prime }(x_0)\ge b^{\prime }(x_0)u^{\prime }(x_0)+f^{\prime }(x_0). \end{aligned}$$
(5.32)

Note that from (5.32), using Assumptions 2.3 and 2.4, we obtain \(u^{\prime }(x_0)>0\). Now, arguing by contradiction, assume that \(x_1\in (x_0,\infty )\) is a local maximum point for \(u^{\prime }\). Then \(u^{\prime \prime }(x_1)=0\) and \(u^{\prime \prime \prime }(x_1)\le 0\), so, by (5.31), we have

$$\begin{aligned} \rho u^{\prime }(x_1)\le b^{\prime }(x_1)u^{\prime }(x_1)+f^{\prime }(x_1). \end{aligned}$$
(5.33)

Without loss of generality, we can assume that

$$\begin{aligned} u^{\prime }(x_0)\le u^{\prime }(x_1). \end{aligned}$$
(5.34)

Combining (5.32) and (5.33) and taking into account that \(f^{\prime }\) is strictly decreasing, we get

$$\begin{aligned} \left( \rho -b^{\prime }(x_1)\right) u^{\prime }(x_1)\le f^{\prime }(x_1)< f^{\prime }(x_0)\le \left( \rho -b^{\prime }(x_0)\right) u^{\prime }(x_0). \end{aligned}$$
(5.35)

Now, by Assumption 5.2 we have \(b^{\prime }(x_0)\ge b^{\prime }(x_1)\). So, the fact that \(u^{\prime }(x_0)> 0\) and (5.35) yield

$$\begin{aligned} \left( \rho -b^{\prime }(x_1)\right) u^{\prime }(x_1)< \left( \rho -b^{\prime }(x_1)\right) u^{\prime }(x_0). \end{aligned}$$
(5.36)

By Assumption 2.4, we have \(\rho -b^{\prime }(x_1)>0\). Hence, from (5.36) we obtain \(u^{\prime }(x_1)< u^{\prime }(x_0)\), contradicting (5.34). \(\square \)

Recall that a function \(\varphi :\mathscr {O}\rightarrow \mathbb {R}\), with \(\mathscr {O}\) an open interval, is said to be strictly quasiconcave if

$$\begin{aligned} \varphi \left( \lambda x+\left( 1-\lambda \right) x^{\prime }\right) > \min \left\{ \varphi (x),\varphi \left( x^{\prime }\right) \right\} \quad \forall x,x^{\prime }\in \mathscr {O},\ x\ne x^{\prime }, \ \forall \lambda \in (0,1). \end{aligned}$$

Strictly quasiconcave functions can be characterized as functions that are either strictly increasing, or strictly decreasing, or strictly increasing on the left of a point \(x^*\in \mathscr {O}\) and strictly decreasing on the right of \(x^*\).

Lemma 5.9

Let Assumption 5.2 hold. Let \(a\ge 0\), let \(u\in C^2((a,\infty );\mathbb {R})\) satisfy \(\mathscr {L}u=f\) on \((a,\infty )\) and assume that \(\displaystyle \liminf _{x\rightarrow \infty }u^{\prime }(x)\le 0\). Then \(u^{\prime }\) is strictly quasiconcave.

Proof

By virtue of [9, Proposition 3.24], it is sufficient to show that \(u^{\prime }\) does not admit any local minimum. Arguing by contradiction, assume that \(x_0\in (a,\infty )\) is a local minimum point for \(u^{\prime }\). The proof of Lemma 5.8 then shows that \(u^{\prime }(x_0)>0\). Hence, since \(\displaystyle \liminf _{x\rightarrow \infty }u^{\prime }(x)\le 0\), there must exist a local maximum point \(x_1\in (x_0,\infty )\). This contradicts Lemma 5.8 and we conclude. \(\square \)

Proposition 5.10

Let Assumptions 5.2 and 5.5 hold.

  (i) There exists a unique \(S\in \mathscr {C}=(s,\infty )\) such that \(v^{\prime }(S)=c_0\).

  (ii) There exists (a unique) \(x^*\in (s,S)\) such that \(v^{\prime }\) is strictly increasing in \((s,x^*]\) and strictly decreasing in \([x^*,\infty )\).

  (iii) \(\displaystyle \lim _{x\rightarrow \infty } v^{\prime }(x)= 0\).

Proof

(i) Corollary 4.7(i) and Proposition 4.2(i) yield the existence of \(S\in \mathscr {C}=(s,\infty )\) such that \(v^{\prime }(S)=c_0\). Regarding uniqueness, observe first that v satisfies the requirements of Lemma 5.9 (with v in place of u), with \(a=s\), where

$$\begin{aligned} \liminf _{x\rightarrow \infty }v^{\prime }(x)\le 0 \end{aligned}$$
(5.37)

holds by (3.4). Then the fact that \(v^{\prime }(s)=c_0\) by Corollary 4.7(ii) yields the uniqueness.

(ii) By (5.29) we have \(v^{\prime }(s)=c_0\). By (i) above we have \(v^{\prime }(S)=c_0\) and \(v^{\prime }(x)\ne c_0\) for each \(x\in (s,S)\). Then the claim follows by Lemma 5.9.

(iii) This follows immediately by monotonicity of \(v^{\prime }\) on \([x^*,+\infty )\), (5.37) and Proposition 3.1, which provides \(v^{\prime }\ge 0\). \(\square \)

Theorem 5.11

Let Assumptions 5.2 and 5.5 hold. The value function has the form

$$\begin{aligned} v(x)=\left\{ \begin{array}{ll} B\varphi (x)+\hat{v}(x), &{}\quad \text{ if } \ x\in (s,\infty ),\\ B\varphi (S)+\hat{v}(S)-c_0(S-x)-c_1, &{}\quad \text{ if } \ x\in (0,s], \end{array}\right. \end{aligned}$$
(5.38)

and the triple (BsS) is the unique solution in \(\mathbb {R}_+\times \mathbb {R}_{++}^2\) to the system

$$\begin{aligned} \left\{ \begin{array}{ll} {\text {(i)}} &{} B\varphi (s)+\hat{v}(s)=B\varphi (S)+\hat{v}(S)-c_0(S-s)-c_1,\\ {\text {(ii)}} &{} B\varphi ^{\prime }(s)+\hat{v}^{\prime }(s)=c_0,\\ {\text {(iii)}} &{} B\varphi ^{\prime }(S) +\hat{v}^{\prime }(S)=c_0. \end{array}\right. \end{aligned}$$
(5.39)

Proof

Consider (5.29). The expression of v over \((s,\infty )\) in (5.38) and (5.29) is the same. As for the expression of v over (0, s], we note that, by definition of \(\Xi (s)\), Proposition 4.2, Corollary 4.7 and Proposition 5.10(i), we have

$$\begin{aligned} 0<S-s={\mathop {\hbox {argmax}}\limits _{i>0}}\big \{v(s+i)-c_0i-c_1\big \}. \end{aligned}$$
(5.40)

Since \(s\in \mathscr {A}\), we have \(v(s)=[\mathscr {M}v](s)\); so, from (5.40) we get

$$\begin{aligned} v(s)=v(S)-c_0(S-s)-c_1, \end{aligned}$$

from which we get the expression of v over (0, s] in (5.38). Then the three equations of (5.39) follow, respectively, by imposing the continuity of v at s, the smooth-fit at s (as \(v\in C^1(\mathbb {R}_{++};\mathbb {R})\)) and the condition of Proposition 5.10(i) defining S.

To show that (5.39) has a unique solution in \(\mathbb {R}_+\times \mathbb {R}_{++}^2\), we consider the function

$$\begin{aligned} h\left( \hat{B},x\right) = \hat{B}\varphi (x)+\hat{v}(x), \quad (\hat{B},x)\in \mathbb {R}_{+}\times \mathbb {R}_{++}. \end{aligned}$$

For each \(\hat{B}\ge 0\), \(\mathscr {L}h(\hat{B},\cdot )=f\) in \(\mathbb {R}_{++}\) and \(\displaystyle \liminf _{x\rightarrow \infty }\,h_x(\hat{B},x)\le 0\) by (3.4) and (5.18). By Lemma 5.9, \(h_x(\hat{B},\cdot )\) is strictly quasiconcave; hence, there exist at most two solutions \(\hat{s},\hat{S}\) to \(h_x(\hat{B},\cdot )=c_0\) in \(\mathbb {R}_{++}\). If such solutions exist, we have \(h_x(\hat{B},\cdot )-c_0> 0\) on \((\hat{s}\wedge \hat{S},\hat{s}\vee \hat{S})\). Therefore, if \((\hat{B},\hat{s},\hat{S})\in {\mathbb {R}_+\times \mathbb {R}_{++}^2}\) solves (5.39), then (5.39)(i) yields

$$\begin{aligned} 0<c_1=\left[ \hat{B}\varphi (\hat{S})+\hat{ v}(\hat{S})\right] - \left[ \hat{B}\varphi (\hat{s})+\hat{ v}(\hat{s})\right] -c_0\left( \hat{S}-\hat{s}\right) =\int _{\hat{s}}^{\hat{S}} \left( h_x(\hat{B},r)-c_0\right) dr. \end{aligned}$$

This forces \(\hat{s}=\hat{s}\wedge \hat{S}\), \(\hat{S}=\hat{s}\vee \hat{S}\), \(\hat{s}\ne \hat{S}\). By the argument above we see that, if \((B_1,s_1,S_1)\) and \((B_2,s_2,S_2)\) are two different solutions to (5.39) in \(\mathbb {R}_+\times \mathbb {R}_{++}^2\), we need to have \(s_1<S_1\), \(s_2<S_2\) and \(B_1\ne B_2\).

Now assume, by contradiction, that \((B_1,s_1,S_1)\) and \((B_2,s_2,S_2)\) are two different solutions of (5.39) in \(\mathbb {R}_+\times \mathbb {R}_{++}^2\). Without loss of generality, we can assume \(B_1< B_2\). Recalling that \(\varphi \) is strictly decreasing, we have

$$\begin{aligned} h_x(B_1,\cdot )> h_x(B_2,\cdot ). \end{aligned}$$
(5.41)

The latter inequality, Lemma 5.9 and (5.39)(ii)–(iii) provide

$$\begin{aligned} (s_1,S_1)\supset (s_2,S_2), \quad h_x(B_1,\cdot )-c_0> 0 \ \text{ on } \ (s_1,S_1). \end{aligned}$$
(5.42)

We can then write, using (5.41)–(5.42) and (5.39)(i),

$$\begin{aligned} 0&=c_1-c_1 = \left( h(B_1,S_1)-h(B_1,s_1)- c_0(S_1-s_1) \right) \\&\quad - \left( h(B_2,S_2)-h(B_2,s_2)- c_0(S_2-s_2) \right) \\&= \int _{s_1}^{S_1} (h_x(B_1,\xi )-c_0)d\xi - \int _{s_2}^{S_2} \left( h_x(B_2,\xi )-c_0\right) d\xi \\&\ge \int _{s_2}^{S_2} \left( h_x(B_1,\xi )- h_x(B_2,\xi )\right) d\xi >0, \end{aligned}$$

which is a contradiction. \(\square \)

6 Optimal control

In this section, through Theorem 6.1, we describe the structure of an optimal control for our problem by means of a recursive rule. In the economic literature—see the stream of papers on stochastic impulse control mentioned at the beginning of the paragraph on the related literature in the Introduction and [16]—this rule is known as the (S, s)-rule. Informally, this rule, rigorously stated in Theorem 6.1 below, can be described as follows.

  • The point s works as an optimal trigger boundary: when the state variable is at level s or below such level (i.e., it is within the action region \(\mathscr {A}\)), the controller acts.

  • The point S works as an optimal target boundary: when the controller acts, she/he does that in such a way to place the state variable at the level \(S\in \mathscr {C}\).

  • When the state variable lies in the region \(\mathscr {C}\), the controller lets it evolve autonomously, without undertaking any action, until it exits from this region.

Such a rule is made rigorous by the following construction. Let \(x\in \mathbb {R}_{++}\) and consider the control \(I^*=\{(\tau _n,i_n)\}_{n\ge 1}\) defined as follows:

$$\begin{aligned} \left\{ \begin{array}{ll} \tau _1{:}{=}\left\{ \begin{array}{ll} 0 &{} \text{ if } \ x\le s,\\ \inf \left\{ t\ge 0:{Z}_t^{0,x}\le s \right\} &{} \text{ if } \ x>s, \end{array}\right. \\ i_1{:}{=}\left\{ \begin{array}{ll} S-x &{} \quad \text{ if } \ \tau _1=0 \ (\text{ i.e. }\ x\le s),\\ S-s &{} \quad \text{ if } \ \tau _1>0 \ (\text{ i.e. }\ x>s), \end{array}\right. \end{array}\right. \end{aligned}$$

and then, recursively for \(n\ge 1\),

$$\begin{aligned} \left\{ \begin{array}{ll} \tau _{n+1}{:}{=}\left\{ \begin{array}{ll} \tau _n + \inf \left\{ t>0:Z^{\tau _n,S}_{\tau _n+t}\le s\right\} &{} \text{ if } \tau _n<\infty \\ \infty &{}\text{ otherwise } \end{array}\right. \\ i_{n+1}{:}{=}S-s. \end{array}\right. \end{aligned}$$

Note that, for \(\mathbb {P}\)-a.e. \(\omega \in \{\tau _n<\infty \}\), by continuity of \(\mathbb {R}_+\rightarrow \mathbb {R},\ t\mapsto Z^{\tau _n,S}_{\tau _n+t}(\omega )\) and since \(S>s\), we have \(\tau _{n+1}(\omega )>\tau _n(\omega )\).
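To make the recursive construction above concrete, here is a minimal Euler–Maruyama sketch (in Python) of the (S, s)-rule; the drift b, the diffusion coefficient \(\sigma \), and the thresholds s, S are supplied by the user, the function name `simulate_Ss_rule` is ours, and the time grid only approximates the hitting times \(\tau _n\), so this illustrates the rule rather than its exact continuous-time definition.

```python
import numpy as np

def simulate_Ss_rule(x0, b, sigma, s, S, T=10.0, dt=1e-3, seed=0):
    """Sketch of the (S,s)-rule of Theorem 6.1: whenever the discretized
    state enters the action region (0, s], jump it to the target level S."""
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    x = np.empty(n_steps + 1)
    x[0] = x0
    taus, sizes = [], []                 # intervention times tau_n and sizes i_n
    if x0 <= s:                          # tau_1 = 0 and i_1 = S - x when x <= s
        taus.append(0.0); sizes.append(S - x0); x[0] = S
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        x[k + 1] = x[k] + b(x[k]) * dt + sigma(x[k]) * dW
        if x[k + 1] <= s:                # trigger boundary reached: act
            taus.append((k + 1) * dt); sizes.append(S - x[k + 1]); x[k + 1] = S
    return x, taus, sizes

# Example with the geometric Brownian dynamics of Section 7 (values from Sect. 7.1):
# path, taus, sizes = simulate_Ss_rule(20.0, lambda x: -0.07 * x,
#                                      lambda x: 0.25 * x, s=8.75, S=57.0)
```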

Theorem 6.1

(Optimal control) Let Assumptions 5.2 and 5.5 hold.

Let \(x\in \mathbb {R}_{++}\) and consider the control \(I^*=\{(\tau _n,i_n)\}_{n\ge 1}\) defined above. Then \(I^*\in \mathscr {I}\) and it is optimal for the problem starting at x, i.e., \(J(x,I^*)=v(x)\).

Proof

Admissibility. As noticed above, \(\tau _n< \tau _{n+1}\) \(\mathbb {P}\)-a.s. on \(\{\tau _n<\infty \}\). Moreover, for each \(n\ge 1\), \(i_n\) is constant; so, as a random variable, it is trivially \(\mathscr {F}_{\tau _n}\)-measurable.

Now, for fixed \(\varepsilon >0\) such that \(S-\varepsilon S^2>s\), define the auxiliary sequence \(\{\tau ^\varepsilon _n\}_{n\ge 1}\) of stopping times by

$$\begin{aligned} \tau ^\varepsilon _1{:}{=}\left\{ \begin{array}{ll} 0&{} \text{ if } x\le s\\ \inf \left\{ t\ge 0 :Z^{0,x}_t -\varepsilon \left( Z^{0,x}_t +t \right) ^2 \le s\right\} &{}\text{ if } x>s \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} \tau ^\varepsilon _{n+1}{:}{=}\tau ^\varepsilon _n+ \inf \left\{ t\ge 0 :Z^{\tau ^\varepsilon _n,S}_{ \tau ^\varepsilon _n+t} -\varepsilon \left( Z^{\tau ^\varepsilon _n,S}_{ \tau ^\varepsilon _n+t} +t \right) ^2 \le s\right\} \text{ for } n\ge 1. \end{aligned}$$

We notice that \(\tau ^\varepsilon _n\) is finite and \(\tau ^\varepsilon _{n+1}> \tau ^\varepsilon _n\) \(\mathbb {P}\)-a.s. Moreover, the random variables \(\{ \tau ^\varepsilon _{n+1}- \tau ^\varepsilon _n\}_{n\ge 1}\) are identically distributed and \( \tau ^\varepsilon _{n+1}- \tau ^\varepsilon _n\) is independent of \(\mathscr {F}_{ \tau ^\varepsilon _n}\). Finally, it can be verified by induction that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0^+} \tau ^\varepsilon _n=\tau _n\quad \mathbb {P}\text{-a.s. } \text{ on } \{\tau _n<\infty \}, \end{aligned}$$

from which we obtain

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0^+} e^{-\rho \tau ^\varepsilon _n}\ge e^{-\rho \tau _n}\quad \mathbb {P}\text{-a.s. }. \end{aligned}$$
(6.1)

Define \(Y^\varepsilon {:}{=}\inf \left\{ t\ge 0 :Z^{0,S}_t -\varepsilon \left( Z^{0,S}_t +t \right) ^2 \le s\right\} \). Then \(\tau ^\varepsilon _{n+1}-\tau ^\varepsilon _n\sim Y^\varepsilon \) for all \(n\ge 1\). Observe that \(Y^\varepsilon \) increases as \(\varepsilon \) tends to \(0^+\). Let \({\displaystyle {Y{:}{=}\lim _{\varepsilon \rightarrow 0^+}Y^\varepsilon }}\). Since \(S-\varepsilon S^2>s\) entails \(Y^\varepsilon >0\), we have in particular \(Y>0\). We can then write, using (6.1) and Fatou’s Lemma in the first inequality below,

$$\begin{aligned} \mathbb {E}\left[ e^{-\rho \tau _{n+1}}\right] \le \liminf _{\varepsilon \rightarrow 0^+} \mathbb {E}\left[ e^{-\rho \tau ^\varepsilon _{n+1}}\right] = \liminf _{\varepsilon \rightarrow 0^+} \mathbb {E}\left[ e^{-\rho \tau ^\varepsilon _{1}}\right] \left( \mathbb {E}\left[ e^{-\rho Y^\varepsilon }\right] \right) ^{n} \le \left( \mathbb {E}\left[ e^{-\rho Y}\right] \right) ^{n}, \quad \forall n\ge 1. \end{aligned}$$
(6.2)

Summing over \(n\ge 1\) and taking into account that \(\mathbb {E}[e^{-\rho Y}]<1\), from (6.2) we get

$$\begin{aligned} \mathbb {E} \left[ \sum _{n\ge 1}e^{-\rho \tau _{n+1}} \right] <\infty . \end{aligned}$$
(6.3)

Both conditions (2.3) and (2.4) follow from (6.3), so the control \(I^*\) is admissible.

Optimality. Set \(X^*{:}{=}X^{x,I^*}\). We observe that, by (3.4), (3.7) and (6.3), we have

$$\begin{aligned} \lim _{T\rightarrow \infty } \mathbb {E}\left[ e^{-\rho T}v(X^*_{T})\right] =0. \end{aligned}$$
(6.4)

Let \(T>0\) and set \(\tau _0{:}{=}0^-\). Observe that, by construction, \(X^*\) takes values in \([s,+\infty )\), and recall that \(\mathscr {L} v=f\) on \(\mathscr {C}=(s,\infty )\). For all \(n\in \mathbb {N}\) we apply Itô’s formula to \(v(X^*)\) on the interval \([\tau _n\wedge T,\tau _{n+1}\wedge T)\). Note that \(v^{\prime }\) is bounded on \([s,\infty )\) by Proposition 5.10, so

$$\begin{aligned} \mathbb {E}\left[ \int _{\tau _n\wedge T}^{\tau _{n+1}\wedge T} v^{\prime }(X^*_t)dW_t\right] =0\quad \forall n\in \mathbb {N}. \end{aligned}$$

Hence, taking the expectation in the Itô formula and taking into account that \(\mathscr {L}v(X^*)=f(X^*)\), we get

$$\begin{aligned}&\mathbb {E}\left[ e^{-\rho (\tau _{n+1}\wedge T)}v\left( X^*_{(\tau _{n+1}\wedge T)^-}\right) \right] - \mathbb {E}\left[ e^{-\rho (\tau _n\wedge T)}v\left( X^*_{\tau _n\wedge T}\right) \right] \nonumber \\&\quad = -\,\mathbb {E}\left[ \int _{\tau _n\wedge T}^{\tau _{n+1}\wedge T}e^{-\rho t}f(X^*_t)dt\right] , \ \ \forall n\in \mathbb {N}. \end{aligned}$$
(6.5)

Now fix for the moment \(\omega \in \Omega \), \(n\ge 1\) and assume that \(\tau _{n}(\omega )\le T\). By definition of \(i_n(\omega )\) and considering that \(X^{*}_{\tau _n^-}(\omega )\in \mathscr {A}\) we have (cf. also Corollary 4.7, Proposition 4.2(i) and the definition of S in Proposition 5.10(i))

$$\begin{aligned} i_n(\omega )= {\mathop {\hbox {argmax}}\limits _{i>0}} \left\{ v(X^*_{\tau _n^-}(\omega )+i)-c_0 i-c_1\right\} . \end{aligned}$$

Hence, considering that \(\mathscr {M} v(X^{*}_{\tau _n^-}(\omega ))=v(X^{*}_{\tau _n^-}(\omega ))\), we have

$$\begin{aligned} e^{-\rho \tau _{n}(\omega )}v\left( X^*_{\tau _{n}(\omega )} \right) - e^{-\rho \tau _{n}(\omega )}v\left( X^*_{\tau _{n}(\omega )^-} \right) =e^{-\rho \tau _{n}(\omega )}\left( c_0i_{n}(\omega )+c_1\right) . \end{aligned}$$
(6.6)

It follows that, for all \(n\ge 1\),

$$\begin{aligned}&\mathbb {E}\left[ e^{-\rho \left( \tau _{n}\wedge T\right) } \left( v\left( X^*_{\tau _{n}\wedge T} \right) -v\left( X^*_{\left( \tau _{n}\wedge T\right) ^-} \right) \right) \right] =\nonumber \\&\quad =\mathbb {E}\left[ e^{-\rho \left( \tau _{n}\wedge T\right) } \left( v\left( X^*_T \right) -v\left( X^*_{T^-}\right) \right) \mathbf {1}_{\{\tau _n>T\}} \right] + \mathbb {E}\left[ e^{-\rho \tau _{n}}\left( c_0i_{n}+c_1\right) \mathbf {1}_{\{\tau _{n} \le T\}}\right] .\nonumber \\ \end{aligned}$$
(6.7)

Using (6.5) and (6.7), we can then write, for \(N\ge 1\),

$$\begin{aligned}&\mathbb {E} \left[ e^{-\rho \left( \tau _{N+1}\wedge T\right) } v\left( X^*_{\tau _{N+1}\wedge T}\right) \right] - v(x) \\&\quad = \sum _{n=0}^N \mathbb {E} \left[ e^{-\rho \left( \tau _{n+1}\wedge T\right) } v\left( X^*_{\tau _{n+1}\wedge T}\right) - e^{-\rho \left( \tau _n\wedge T\right) } v\left( X^*_{\tau _n\wedge T}\right) \right] \\&\quad = \sum _{n=0}^N \mathbb {E} \left[ e^{-\rho (\tau _{n+1}\wedge T)} \left( v\left( X^*_{\tau _{n+1}\wedge T}\right) - v\left( X^*_{\left( \tau _{n+1}\wedge T\right) ^-}\right) \right) \right] \\&\qquad + \sum _{n=0}^N \mathbb {E} \left[ e^{-\rho (\tau _{n+1}\wedge T)} v\left( X^*_{ \left( \tau _{n+1}\wedge T \right) ^- }\right) - e^{-\rho \left( \tau _n\wedge T\right) } v\left( X^*_{\tau _n\wedge T}\right) \right] \\&\quad =\sum _{{{n=0}}}^N \left( \mathbb {E}\left[ e^{-\rho \left( \tau _{n+1}\wedge T\right) } \left( v\left( X^*_T \right) -v\left( X^*_{T^-} \right) \right) \mathbf {1}_{\{\tau _{n+1}>T\}} \right] \right. \\&\qquad \left. +\, \mathbb {E}\left[ e^{-\rho \tau _{n+1}}\left( c_0i_{n+1}+c_1\right) \mathbf {1}_{\{\tau _{n+1} \le T\}}\right] \right) \\&\qquad - \sum _{n=0}^N \mathbb {E}\left[ \int _{\tau _n\wedge T}^{\tau _{n+1}\wedge T}e^{-\rho t} f\left( X^*_t\right) dt \right] . \end{aligned}$$

By passing to the limit \(N\rightarrow \infty \) and using (2.3), we obtain

$$\begin{aligned}&\mathbb {E} \left[ e^{-\rho T} v\left( X^*_T\right) \right] - v(x) + \mathbb {E}\left[ \int _{0}^T e^{-\rho t} f\left( X^*_t\right) dt \right] \\&\quad = \sum _{{{n=0}}}^\infty \left( \mathbb {E}\left[ e^{-\rho (\tau _{n+1}\wedge T)} \left( v\left( X^*_T \right) -v\left( X^*_{T^-} \right) \right) \mathbf {1}_{\{\tau _{n+1}>T\}} \right] \right. \\&\qquad \left. +\, \mathbb {E}\left[ e^{-\rho \tau _{n+1}}\left( c_0i_{n+1}+c_1\right) \mathbf {1}_{\{\tau _{n+1} \le T\}}\right] \right) . \end{aligned}$$

We now take the \({\displaystyle \liminf _{T\rightarrow \infty }}\), using (6.4) for the first addend on the left-hand side, monotone convergence for the third addend on the left-hand side, and Fatou’s lemma on the right-hand side. We obtain

$$\begin{aligned} -v(x) +\mathbb {E} \left[ \int _{0}^\infty e^{-\rho t} f\left( X^*_t\right) dt \right] \ge \sum _{n=0}^\infty \mathbb {E} \left[ e^{-\rho \tau _{n+1}} \left( c_0i_{n+1}+c_1\right) \right] , \end{aligned}$$
(6.8)

Since (6.8) reads \(J(x,I^*)\ge v(x)\) and, by definition of the value function, \(v(x)\ge J(x,I^*)\), this shows that \(I^*\) is optimal (see Fig. 1). \(\square \)

Fig. 1: An illustrative picture of the value function and of the \(S-s\) rule

7 Numerical illustration in the linear case

In the previous sections we have characterized the solution of the dynamic optimization problem through the unique solution of the nonlinear algebraic system (5.39) in the triple (B, s, S). In this section we specialize the study to the case in which the reference process Z follows a geometric Brownian motion dynamics, i.e. when \( b(x){:}{=}\nu x\), \( \sigma (x){:}{=}\sigma x\), with \(\nu \in \mathbb {R}\), \(\sigma >0\), and when \(f(x)=\frac{x^\gamma }{\gamma }\), with \(0<\gamma <1\), assuming

$$\begin{aligned} \rho >\nu ^+. \end{aligned}$$
(7.1)

In this way, Assumptions 2.1, 2.3, 2.4, 3.3(ii)–(iv), 5.2 are satisfied.Footnote 7 In the present case we have

$$\begin{aligned} \varphi (x)= x^m, \end{aligned}$$

where m is the negative root of the characteristic equation

$$\begin{aligned} \rho -\nu m-\frac{1}{2}\sigma ^2 m(m-1)=0 \end{aligned}$$

associated with \(\mathscr {L}u=0\), i.e.

$$\begin{aligned} m= \left( \frac{1}{2}- \frac{\nu }{\sigma ^2}\right) - \sqrt{ \left( \frac{1}{2}- \frac{\nu }{\sigma ^2}\right) ^2+\frac{2\rho }{\sigma ^2}}, \end{aligned}$$
(7.2)

and

$$\begin{aligned} \hat{v}(x)= C_\gamma \frac{x^\gamma }{\gamma }, \quad C_\gamma {:}{=}\left( \rho -\nu \gamma +\frac{1}{2}\gamma (1-\gamma )\sigma ^2\right) ^{-1}. \end{aligned}$$
(7.3)
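For instance, with the parameter values \(\rho =0.08\), \(\nu =-0.07\), \(\sigma =0.25\) used in Sect. 7.1 below, (7.2) evaluates to

$$\begin{aligned} m=\left( \frac{1}{2}+\frac{0.07}{0.0625}\right) -\sqrt{\left( \frac{1}{2}+\frac{0.07}{0.0625}\right) ^2+\frac{0.16}{0.0625}}=1.62-\sqrt{5.1844}\approx -0.657. \end{aligned}$$

Moreover, as a quick sanity check, plugging (7.3) into the nonhomogeneous ODE \(\mathscr {L}u=f\), which in the present case reads \(\rho u=\nu x u^{\prime }+\frac{1}{2}\sigma ^2x^2u^{\prime \prime }+f\) (cf. (5.30)), confirms the choice of \(C_\gamma \):

$$\begin{aligned} \rho \hat{v}(x)-\nu x\hat{v}^{\prime }(x)-\frac{1}{2}\sigma ^2x^2\hat{v}^{\prime \prime }(x) =\frac{C_\gamma }{\gamma }\left( \rho -\nu \gamma +\frac{1}{2}\gamma (1-\gamma )\sigma ^2\right) x^\gamma =\frac{x^\gamma }{\gamma }=f(x). \end{aligned}$$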

The problem with no fixed cost, i.e. when \(c_1=0\), is investigated in the singular control setting (the right one to get existence of optimal controls, see Remark 2.5) in [63, Sec. 4.5]. In this case, the value function v and the optimal reflection boundary s are characterized in [63, Th. 4.5.7] through an algebraic system too. Such a system can be solved explicitly, providing, in our notation,

$$\begin{aligned} s=\left( \frac{c_0(m-1)}{C_\gamma (m-\gamma )}\right) ^{\frac{1}{\gamma -1}}, \quad B= \frac{C_\gamma (1-\gamma )}{m(m-1)} s^{\gamma -m}. \end{aligned}$$
(7.4)

We make Assumption 5.5; the latter in the present case reads as

$$\begin{aligned} c_1< C_\gamma ^{\frac{1}{1-\gamma }} c_0^{\frac{\gamma }{1-\gamma }} \left( \frac{1}{\gamma }-1 \right) . \end{aligned}$$
(7.5)
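For instance, with the parameter values used in Sect. 7.1 below (\(\rho =0.08\), \(\nu =-0.07\), \(\sigma =0.25\), \(\gamma =0.5\), \(c_0=1\)), one gets \(C_\gamma =(0.08+0.035+0.0078125)^{-1}\approx 8.14\) and, since \(\frac{1}{1-\gamma }=2\) and \(\frac{\gamma }{1-\gamma }=\frac{1}{\gamma }-1=1\), the right-hand side of (7.5) equals \(C_\gamma ^2\approx 66.3\); hence the fixed cost \(c_1=10\) used there satisfies Assumption 5.5.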

Moreover, Assumption 3.3(i) would read as

$$\begin{aligned} \rho >\max \left\{ 4|\nu |+6\sigma ^2,\ 2|\nu |+2\sigma ^2\right\} = 4|\nu |+6\sigma ^2. \end{aligned}$$

However, as we show below, in the linear-homogeneous case under consideration here, we do not need to make this assumption: we can exploit the linear dependence of the controlled process on the initial datum and the homogeneity of f to show the semiconvexity result stated, for the general case, in Proposition 3.7. Consequently, the other results of the paper hold under no further assumption. Indeed, observing that the terms \(\{i_n\}_{n\ge 1}\) enter the dynamics of \(X^{x,I}\) in additive form, we have

$$\begin{aligned} X_t^{x,I} - X_t^{y,I} = X_t^{x,\emptyset } - X_t^{y,\emptyset }=(x-y)e^{(\nu -\frac{\sigma ^2}{2})t+\sigma W_t}, \quad \forall I\in \mathscr {I}, \ \forall x,y\in \mathbb {R}_{++}, \end{aligned}$$
(7.6)

which we use to prove the following result.

Proposition 7.1

In the above framework we have, for every \(\lambda \in [0,1]\) and every \(x,y\ge \varepsilon >0\)

$$\begin{aligned} v(\lambda x+(1-\lambda )y) -\lambda v(x)-(1-\lambda )v(y) \le \lambda (1-\lambda )(1-\gamma )C_\gamma \varepsilon ^{\gamma -2}(y-x)^2. \end{aligned}$$

Proof

Let \(0< \xi \le \xi ^{\prime }\). Then, for suitable \(\eta ,\eta ^{\prime }\in [\xi ,\xi ^{\prime }]\) we have, by Lagrange’s Theorem,

$$\begin{aligned}&f\left( \lambda \xi +(1-\lambda )\xi ^{\prime }\right) - \lambda f(\xi )-(1-\lambda )f(\xi ^{\prime })=\nonumber \\&\quad = -\lambda \left[ f(\xi )-f\left( \xi +(1-\lambda )(\xi ^{\prime }-\xi )\right) \right] -(1-\lambda )\left[ f(\xi ^{\prime })-f\left( \xi ^{\prime }+\lambda (\xi -\xi ^{\prime })\right) \right] \nonumber \\&\quad =\lambda (1-\lambda ) f^{\prime }(\eta )(\xi ^{\prime }-\xi )-\lambda (1-\lambda )f^{\prime }(\eta ^{\prime })(\xi ^{\prime }-\xi )\nonumber \\&\quad =\lambda (1-\lambda ) \left( f^{\prime }(\eta )-f^{\prime }(\eta ^{\prime }) \right) (\xi ^{\prime }-\xi )\nonumber \\&\quad \le \lambda (1-\lambda ) |f^{\prime \prime }(\xi )|(\xi ^{\prime }-\xi )^2\nonumber \\&\quad = \lambda (1-\lambda ) (1-\gamma )\xi ^{\gamma -2} (\xi ^{\prime }-\xi )^2. \end{aligned}$$
(7.7)

Let now \(0<\varepsilon \le x\le y\), \(\lambda \in [0,1]\) and set \(z{:}{=}\lambda x+(1-\lambda )y\). Let \(\delta >0\) and let \(I_\delta \in {\mathscr {I}}\) be a \(\delta \)-optimal control for v(z). Then, using (7.7), the fact that \(X^{x,I}\ge X^{x,\emptyset }\), and recalling (7.6), we get

$$\begin{aligned}&v(\lambda x+(1-\lambda )y)-\delta -\lambda v(x)-(1-\lambda )v(y) \le J(z,I_\delta )- \lambda J(x,I_\delta )-(1-\lambda )J(y,I_\delta )\\&\quad = \mathbb {E}\left[ \int _0^{+\infty } e^{-\rho t} \left( f\left( X_t^{z,I_\delta }\right) - \lambda f\left( X_t^{x,I_\delta }\right) -(1-\lambda ) f\left( X_t^{y,I_\delta }\right) \right) dt\right] \\&\quad \le \lambda (1-\lambda )(1-\gamma )\mathbb {E}\left[ \int _0^{+\infty } e^{-\rho t} \left( X^{x,I_\delta }_t\right) ^{\gamma -2}\left( X^{y,I_\delta }_t-X^{x,I_\delta }_t\right) ^2 dt\right] \\&\quad \le \lambda (1-\lambda )(1-\gamma )\mathbb {E}\left[ \int _0^{+\infty } e^{-\rho t} (X^{x,\emptyset }_t)^{\gamma -2}\left( X^{y,\emptyset }_t-X^{x,\emptyset }_t\right) ^2 dt\right] \\&\quad = \lambda (1-\lambda )(1-\gamma )C_\gamma x^{\gamma -2}(y-x)^2 \le \lambda (1-\lambda )(1-\gamma )C_\gamma \varepsilon ^{\gamma -2}(y-x)^2, \end{aligned}$$

which proves the claim. \(\square \)

7.1 Numerical illustration

We perform a numerical analysis of the solution by solving the nonlinear system (5.39). In Fig. 2, we plot the value function and its derivative when the parameters are set as follows: \(\rho = 0.08, \ \nu = - 0.07, \ \sigma = 0.25, \ c_0 = 1, \ c_1 = 10, \ \gamma = 0.5.\) Solving (5.39) with these entries and with \(\varphi (x)=x^m\), where m is given by (7.2), yields

$$\begin{aligned} (B,s,S)= (97.0479, \ 8.7492, \ 56.9930). \end{aligned}$$
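The triple above can be reproduced in a few lines; the following sketch (in Python, using a standard root finder from SciPy; the function name `system_539` and the initial guess are ours and ad hoc) assembles \(\varphi (x)=x^m\) from (7.2) and \(\hat{v}\) from (7.3), and solves the system (5.39):

```python
import numpy as np
from scipy.optimize import fsolve

# Parameters of the numerical illustration above
rho, nu, sigma, c0, c1, gamma = 0.08, -0.07, 0.25, 1.0, 10.0, 0.5

# Negative root m of the characteristic equation, see (7.2)
a = 0.5 - nu / sigma**2
m = a - np.sqrt(a**2 + 2.0 * rho / sigma**2)

# C_gamma and vhat from (7.3); vhat_p denotes the derivative of vhat
C = 1.0 / (rho - nu * gamma + 0.5 * gamma * (1.0 - gamma) * sigma**2)
vhat   = lambda x: C * x**gamma / gamma
vhat_p = lambda x: C * x**(gamma - 1.0)

def system_539(z):
    B, s, S = z
    return [B * s**m + vhat(s) - (B * S**m + vhat(S) - c0 * (S - s) - c1),  # (5.39)(i)
            B * m * s**(m - 1.0) + vhat_p(s) - c0,                          # (5.39)(ii)
            B * m * S**(m - 1.0) + vhat_p(S) - c0]                          # (5.39)(iii)

B, s, S = fsolve(system_539, x0=(50.0, 5.0, 50.0))  # ad hoc initial guess
print(B, s, S)  # approx 97.0479, 8.7492, 56.9930
```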

In the rest of this section we discuss the solution numerically, illustrating how changes in the parameters affect the value function and the trigger and target boundaries s, S, which describe the optimal control.Footnote 8

Fig. 2: Value function (above) and its derivative (below)

7.1.1 Impact of volatility

In Table 1 we report the relevant values of the solution for different values of the volatility \(\sigma \). The other parameters are set as follows: \(\rho = 0.08, \ \nu = -0.07,\ \gamma = 0.5,\ c_0 = 1,\ c_1 = 10\).

Table 1: Solution as a function of \(\sigma \)

Figure 3, drawn with the same parameter values, represents the trigger level s, the target level S, and their difference \(S-s\) as functions of the volatility \(\sigma \). The figure and the table show that, when uncertainty increases, both the action region \(\mathscr {A}\) and the investment size \(S-s\) shrink. The first effect is well known in the economic literature on irreversible investment without fixed costs as the value of waiting to invest: an increase of uncertainty leads to postponing the investment (see [54]). We can see that, in our fixed-cost context, the size of the optimal investment is also negatively affected by an increase of uncertainty.

Fig. 3: The trigger level s, the target level S and the difference \(S-s\) as functions of \(\sigma \)

7.1.2 Impact of fixed cost

In Table 2 we report the relevant values of the solution for different values of the fixed cost \(c_1\), when the other parameters are set as follows: \(\sigma =0.1,\ \rho = 0.08,\ \nu = - 0.07,\ \gamma = 0.5,\ c_0 = 1.\) In the row corresponding to \(c_1=0\), we report the outputs of the corresponding singular control problem, computed according to the values of s and B given by (7.4).Footnote 9 It can be observed that the convergence as \(c_1\rightarrow 0^+\) is rather slow; this is consistent with the theoretical result of [61], which, in our case, would give \(\frac{\partial v (\cdot ;c_1)}{\partial c_1}(0^+)=-\infty \).

Table 2: Solution as a function of \(c_1\)

Figure 4, drawn with the same parameter values, shows that, as \(c_1\) increases, the action region \(\mathscr {A}\) shrinks and the investment size \(S-s\) expands. Both effects are expected: the first is the counterpart of the value of waiting to invest, now with respect to the fixed cost of investment rather than to uncertainty; the second expresses the fact that an increase of the fixed cost leads the planner to invest less often, and hence to make a larger investment when the investment is undertaken.

Fig. 4: The trigger level s, the target level S and the difference \(S-s\) as functions of \(c_1\)