1 Introduction

In the classical Merton portfolio optimization problem, an investor allocates her/his investment between a risky asset (e.g., stock) and a riskless asset (e.g., bond). The goal is to choose the optimal investment (allocation) strategy and the consumption strategy to maximize the total discounted expected utility. Thousands of papers on this and related topics have been published since the seminal paper by Merton [1]. This kind of portfolio optimization problem can usually be formulated as a stochastic control problem, where the controls are the investment and the consumption. For example, at any time t, the investor chooses how much of the wealth to invest into the risky asset and how much to consume to maximize the total expected utility.

A very popular model to describe the price of the risky asset is the classical geometric Brownian motion model, in which the volatility is assumed to be constant. However, it is widely accepted that stock volatilities exhibit random behavior, which reveals the limitations of constant-volatility models. One commonly noted example is the discrepancy between observed option prices and the prices predicted by the Black–Scholes formula. Implied volatilities are known to vary with strike price (for a fixed maturity), creating a volatility smile or smirk. Stochastic volatility models can capture this smile/smirk effect. Fouque et al. [2] discuss further benefits of stochastic volatility models, including the generation of more realistic return distributions with fatter tails. They also discuss the additional difficulties presented by such models and ways to address them. Recently, Lorig and Sircar [3] consider a finite-time horizon portfolio optimization model in a general local-stochastic volatility setting, and they derive approximations for both the value function and the optimal investment strategy. In Fatone et al. [4, 5], multi-scale stochastic volatility models and the associated market calibration problems are considered. Other results with stochastic factor or stochastic volatility models can be found in Zariphopoulou [6], Fleming and Hernández-Hernández [7], Fouque and Han [8], Fouque et al. [9], and the references therein.

Moreover, unlike the constant interest rate assumed in the classical Merton model, the interest rate on the riskless asset may fluctuate from time to time. Portfolio optimization problems with stochastic interest rate models are considered in Fleming and Pang [10] and Pang [11, 12]. In those papers, the interest rate for the riskless asset is assumed to follow a stochastic process with a mean-reverting feature. Goel and Kumar [13] consider a class of risk-sensitive portfolio optimization problems, where a fixed income security with stochastic interest rate is included. They prove existence results under certain conditions. Moreover, in Fleming and Hernández-Hernández [14], Nagai [15], and Noh and Kim [16], stochastic interest rates and stochastic volatility are both incorporated. Hata and Sheu [17, 18] consider an optimal investment problem in which they allow both the drift and the volatility of the price to be stochastic. In [19], Kaise and Sheu prove the existence and uniqueness of solutions to ergodic-type Bellman equations.

The classical Merton model assumes that the risky asset does not pay any dividend (or equivalently, there is no productivity yield with the risky asset), and the investor only makes profits from the asset price changes. However, in the real world, many stocks do pay dividends, and there are some derivatives based on dividends (see Tunaru [20]). Moreover, the dividend yield tends not to be constant due to the possibility of bankruptcy (see Geske [21]), and the dividend yield rate and/or the dividend amount usually change from time to time. In [21], Geske considers a stochastic dividend in the classical Black–Scholes–Merton option pricing formula, and a new formula is derived in discrete time, under the assumption of a lognormal distribution for the dividend yield. Lioui [22] proposes a mean-reverting stochastic process to model the stochastic dividend yield in continuous time, in the complete market case. In Pang and Varga [23], a portfolio optimization problem with stochastic dividend is considered. Chevalier et al. [24] consider an optimal dividend and investment control problem with debt constraints. We want to point out that the (stochastic) dividend model can be used to model any risky asset with a productivity yield, such as foreign currency, gold (when lease yield is considered), or farmland. In Fleming and Pang [25], a portfolio optimization model with stochastic productivity is considered, and the model can be applied to an optimal investment problem on a stock with stochastic dividends.

In this paper, we consider a portfolio optimization problem in which the risky asset price is modeled by a stochastic differential equation with stochastic volatility and stochastic yields (dividends). In particular, we assume that the stochastic volatility process is driven by a mean-reverting Ornstein–Uhlenbeck factor process, and the stochastic dividend yield rate is modeled by a white noise-type stochastic process. Investment and consumption controls are chosen to maximize the expected discounted utility of consumption. Our model can be used to describe an economic unit with productive capital and liabilities in the form of debt.

This paper is an extension of Pang and Varga [23] by including stochastic volatility in the model to make it more realistic. The introduction of stochastic volatility brings more mathematical challenges, and a new method is needed to establish the results. Similar to [23], we assume that the stochastic dividend yield rate is modeled by a white noise-type stochastic process given by (4). There are two reasons for using this model for the dividend. First, the stock price can be treated as the present value of all future dividends, that is, an integral of discounted future dividends. If the stock price is described by the popular geometric Brownian motion, the dividend yield then behaves like the derivative of a geometric Brownian motion, i.e., a white noise-type stochastic process. Second, for technical reasons, the assumption of (4) for the dividend yield \(b_t\) makes the model mathematically more tractable.

The problem is formulated as a stochastic control problem, and we derive the associated Hamilton–Jacobi–Bellman (HJB) equation using the dynamic programming principle. The HJB equation is a second-order nonlinear partial differential equation to which existing PDE existence results do not apply directly. By virtue of the subsolution–supersolution method, which was proposed by Fleming and Pang [10] and later extended by Hata and Sheu [17, 18], we establish existence results for the HJB equation. Further, we derive the optimal investment and consumption control policies and establish the verification results.

The rest of the paper is organized as follows. In Sect. 2, the problem is introduced and formulated as a stochastic control problem. The HJB equation for the value function is derived by the dynamic programming principle, and some preliminary results are given. In Sect. 3, we establish the existence result of the solution for the HJB equation by virtue of the subsolution–supersolution method. The verification results are given in Sect. 4, and the optimal investment and consumption strategies are derived in this section as well. We conclude the paper in Sect. 5.

2 Problem Formulation

We consider a portfolio optimization problem of Merton’s type. In particular, we consider an investor who, at time t, owns \(N_t\) shares of stock at price \(P_t\) per share. The total worth of the investment is given by \(K_t = N_tP_t\). Then, we can write

$$\begin{aligned} dK_t = N_tdP_t + P_tdN_t= K_t\frac{dP_t}{P_t} + I_t\mathrm {d}t, \end{aligned}$$
(1)

where \(I_t\) is the investment rate at time t, defined by \(I_t\mathrm {d}t = P_t\mathrm {d}N_t\). Here, we assume that the number of shares \(N_t\) is of finite variation, so the cross-variation term \(\mathrm {d}P_t\cdot \mathrm {d}N_t\) vanishes.

The investor’s debt, \(L_t,\) increases with interest payments, investment, and consumption and decreases with income. Thus, the equation for the change in debt is given by

$$\begin{aligned} dL_t = [rL_t + I_t + C_t - D_t]\mathrm {d}t, \end{aligned}$$
(2)

where \(r\ge 0\) is a constant interest rate, \(C_t\) is the consumption rate, and \(D_t\) is the rate of income from the risky asset yield. For example, \(D_t\) can be treated as the total earned dividends per unit time at time t. The total dividend is equal to the total number of shares times the productivity of capital, or dividend rate, \(b_t\):

$$\begin{aligned} D_t\mathrm {d}t = b_tN_t\mathrm {d}t. \end{aligned}$$
(3)

The investor’s net worth is given by \(X_t = K_t - L_t\), and we require \(X_t > 0\).

It is worth noting that the model applies to any economic unit with productive capital and liabilities. For another example, consider a farm on which products are grown and then sold for profits. \(N_t\) may represent the number of acres of the farm, with \(P_t\) being the property value per acre. Debt increases with property taxes, the purchase of new land, and consumption and decreases with income from selling the produce. From here on, we continue our explanations with the investor example.

We assume that the dividend rate fluctuates around a constant average rate b, perturbed by white noise. In particular, we assume that the dividend rate \(b_t\) is governed by the following equation:

$$\begin{aligned} b_t\mathrm {d}t = b \mathrm {d}t + \sigma _1 \mathrm {d}B_{1,t}, \end{aligned}$$
(4)

where \(b, \sigma _1 >0\) are constants and \(B_{1,t}\) is a standard Brownian motion. We assume that the stock price follows a geometric Brownian motion with a stochastic volatility:

$$\begin{aligned} \displaystyle \frac{dP_t}{P_t} = \mu \mathrm {d}t + \sigma _2(Z_t) \mathrm {d}B_{2,t} \end{aligned}$$
(5)

where \(\mu >0\) is a constant, and the function \(\sigma _2(z) \in C^1(\mathbb {R})\) satisfies

$$\begin{aligned} 0 < {\tilde{\sigma }}_2 \le \sigma _2(z) \le \hat{\sigma }_2, \quad \left| \frac{d\sigma _2(z)}{dz}\right| \le \hat{\sigma }_2', \end{aligned}$$
(6)

for some constants \(\tilde{\sigma }_2, \hat{\sigma }_2\), and \(\hat{\sigma }_2'\), and \(B_{2,t}\) is a standard Brownian motion. We further assume that the volatility is driven by a mean-reverting Ornstein–Uhlenbeck process given by

$$\begin{aligned} dZ_t = a(\bar{z} - Z_t)\mathrm {d}t + \sigma _3 \mathrm {d}B_{3,t}, \end{aligned}$$
(7)

where \(a, \sigma _3, \bar{z} \) are positive constants and \(B_{3,t}\) is a standard Brownian motion. The mean-reverting feature captures the tendency of the stochastic volatility to revert to its invariant, or long-run, distribution.

Remark 2.1

In Eq. (5), instead of assuming \(\mu \) is a constant, we can assume that \(\mu \) is a positive, bounded, and smooth function of \(Z_t\). All the results still hold, and all the arguments are very similar. In this paper, we consider only the constant \(\mu \) case for notational convenience.

In the above equations, we introduce three one-dimensional standard Brownian motions, \(B_{1,t}, B_{2,t},\) and \(B_{3,t}\). We allow \(B_{1,t}\) and \(B_{2,t}\) to be correlated with a correlation constant \(\rho \in [\rho _0,1]\) for some constant \(-1<\rho _0 <0,\) and we suppose \(B_{3,t}\) is uncorrelated with \(B_{1,t}\) and \(B_{2,t}\). That is,

$$\begin{aligned} \mathbf {E}[\mathrm {d}B_{1,t} \cdot \mathrm {d}B_{2,t}] = \rho \mathrm {d}t, \quad \mathbf {E}[\mathrm {d}B_{1,t} \cdot \mathrm {d}B_{3,t}] = \mathbf {E}[\mathrm {d}B_{2,t} \cdot \mathrm {d}B_{3,t}] = 0. \end{aligned}$$

It is reasonable to assume that the dividend process and the stock price process are not perfectly negatively correlated. So here we restrict \(\rho _0 \ne -1\).
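For intuition, the correlated noises above can be sampled discretely: since \(B_{3,t}\) is independent, one draws \(\mathrm {d}B_{1,t}\) and \(\mathrm {d}B_{3,t}\) independently and sets \(\mathrm {d}B_{2,t} = \rho \,\mathrm {d}B_{1,t} + \sqrt{1-\rho ^2}\,\mathrm {d}B'_t\) for an independent Brownian motion \(B'\). The following Euler–Maruyama sketch of (5) and (7) is purely illustrative: the function `sigma2` and all parameter values are stand-in assumptions, not taken from the paper.

```python
import math
import random

def simulate_price_vol(p0=1.0, z0=0.5, T=1.0, n=1000,
                       mu=0.08, a=1.0, zbar=0.5, sigma3=0.3,
                       rho=-0.5, seed=7):
    """Euler-Maruyama sketch of the stock price (5) and the OU
    volatility factor (7).  dB2 is built from dB1 and an independent
    Gaussian draw so that corr(dB1, dB2) = rho; B3 is independent.
    The function sigma2 is an illustrative stand-in satisfying (6)."""
    rng = random.Random(seed)
    dt = T / n
    sq = math.sqrt(dt)
    sigma2 = lambda z: 0.1 + 0.2 / (1.0 + z * z)   # values in (0.1, 0.3]
    p, z = p0, z0
    for _ in range(n):
        dB1 = rng.gauss(0.0, sq)                   # drives the dividend noise (4)
        dB2 = rho * dB1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, sq)
        dB3 = rng.gauss(0.0, sq)
        p += p * (mu * dt + sigma2(z) * dB2)       # eq. (5)
        z += a * (zbar - z) * dt + sigma3 * dB3    # eq. (7)
    return p, z
```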

By virtue of (1) and (2), we can get the equation for the investor’s net worth \(X_t=K_t-L_t\) as

$$\begin{aligned} dX_t = K_t\frac{dP_t}{P_t} - [rL_t + C_t - D_t]\mathrm {d}t. \end{aligned}$$
(8)

Define \(k_t \equiv \displaystyle \frac{K_t}{X_t}\) and \( c_t \equiv \displaystyle \frac{C_t}{X_t}\) as the control variables. Noting that

$$\begin{aligned} L_t = K_t - X_t =(k_t-1) X_t, \end{aligned}$$

and using (3), (4), and (5), we can get

$$\begin{aligned} dX_t = X_t \left[ \left( \frac{b}{P_t} + \mu - r\right) k_t + (r-c_t)\right] \mathrm {d}t + X_t\left[ \frac{\sigma _1 k_t }{P_t}\mathrm {d}B_{1,t} + \sigma _2(Z_t) k_t \mathrm {d}B_{2,t}\right] . \end{aligned}$$

Let \(Y_t \equiv \log P_t\). Then, the above equation can be written as

$$\begin{aligned} dX_t= & {} X_t\Big [(be^{-Y_t} + \mu - r) k_t + (r-c_t)\Big ]\mathrm {d}t \nonumber \\&+ X_t\Big [\sigma _1 k_t e^{-Y_t}\mathrm {d}B_{1,t} + \sigma _2(Z_t) k_t\mathrm {d}B_{2,t}\Big ]. \end{aligned}$$
(9)

The \(Y_t\) process follows

$$\begin{aligned} dY_t = {\tilde{\mu }}(Z_t)\mathrm {d}t + \sigma _2(Z_t)\mathrm {d}B_{2,t}, \end{aligned}$$
(10)

where \({\tilde{\mu }}(Z_t) = \mu - \frac{1}{2}\sigma _2^2(Z_t)\). Note that \({\tilde{\mu }}(z)\) is bounded:

$$\begin{aligned} |{\tilde{\mu }}(Z_t)| = \left| \mu - \frac{1}{2}\sigma _2^2(Z_t)\right| \le |\mu | + \frac{1}{2}\hat{\sigma }_2^2 \equiv {\tilde{M}}. \end{aligned}$$
(11)

We define the admissible control space \(\varPi \) as follows:

Definition 2.1

(Admissible Control Space) The pair \((k_t, c_t)\) is said to be in the admissible control space \(\varPi \) if \((k_t,c_t)\) is an \(\mathbb {R}^2\)-valued process which is progressively measurable with respect to the \((B_{1,t}, B_{2,t}, B_{3,t})\)-adapted family of \(\sigma \)-algebras \(\{ \mathcal {F}_t, t \ge 0\}\). Moreover, we require that \(k_t, c_t \ge 0,\) and

$$\begin{aligned} Pr\left( \int _0^T k_t^2 \mathrm{d}t< \infty \right) = 1, \quad Pr\left( \int _0^T c_t \mathrm{d}t < \infty \right) = 1 \quad \text {for all} \,\,T>0. \end{aligned}$$

We consider the hyperbolic absolute risk aversion (HARA) utility function

$$\begin{aligned} U(C)= \frac{1}{\gamma } C^\gamma , \quad \gamma <1, \gamma \ne 0. \end{aligned}$$

The limiting case \(\gamma \rightarrow 0\) corresponds to the log utility \(U(C)=\log C\). In this paper, we only consider \(0<\gamma <1\). The method to solve the problem for the \(\gamma <0\) case is the same, so we omit it in this paper. The goal is to maximize the expected total discounted HARA utility of consumption subject to the constraints \((k_t,c_t) \in \varPi \) and \(X_t > 0\). The objective function is

$$\begin{aligned} J(x,y,z,k_\cdot ,c_\cdot ) = \mathbf {E} \left[ \int _0^\infty e^{-\beta t}\frac{1}{\gamma }(c_t X_t)^\gamma \mathrm{d}t\right] , \end{aligned}$$
(12)

and the corresponding value function is given by

$$\begin{aligned} V(x,y,z) = \sup _{(c_t, k_t) \in \varPi } \mathbf {E} \left[ \int _0^\infty e^{-\beta t}\frac{1}{\gamma }(c_t X_t)^\gamma \mathrm{d}t\right] , \end{aligned}$$
(13)

where the discount factor \(\beta >0\) is a constant, and x, y, and z are the initial values of the state variables \(X_t, Y_t\), and \(Z_t,\) respectively.

The state variables \(X_t, Y_t\), and \(Z_t\) are given by (9), (10), and (7), respectively. Using the dynamic programming principle (refer to Fleming and Soner [26] for details), we get the following HJB equation for \(V(x,y,z)\):

$$\begin{aligned} \beta V&= \frac{\sigma _2^2(z)}{2} V_{yy} + \frac{\sigma _3^2}{2} V_{zz} + rxV_x + {\tilde{\mu }}(z) V_y + a(\bar{z} - z) V_z \nonumber \\&\quad + \max _{c \ge 0} \left[ \frac{1}{\gamma } (cx)^\gamma - cxV_x \right] + \max _{k \ge 0} \bigg \{ (be^{-y} + \mu - r)kxV_x \nonumber \\&\quad + \frac{k^2x^2}{2} q(y,z) V_{xx} + kx \big (\sigma _2^2(z) + \rho \sigma _1 \sigma _2(z) e^{-y}\big ) V_{xy} \bigg \}, \end{aligned}$$
(14)

where

$$\begin{aligned} q(y,z) = \sigma _1^2 e^{-2y} + 2\rho \sigma _1\sigma _2(z) e^{-y} + \sigma _2^2(z). \end{aligned}$$
(15)

We look for a solution of the following form:

$$\begin{aligned} V(x, y, z) = \frac{1}{\gamma }x^\gamma W(y,z). \end{aligned}$$
(16)

Substituting this form into Eq. (14), we can get the equation for W by canceling \(x^\gamma \):

$$\begin{aligned} \beta W= & {} \frac{\sigma _2^2(z)}{2} W_{yy} + \frac{\sigma _3^2}{2} W_{zz} + \gamma r W + {\tilde{\mu }}(z) W_y + a(\bar{z} - z) W_z \nonumber \\&+\max _{c\ge 0} \big [c^\gamma - \gamma c W \big ] + \gamma \max _{k \ge 0} \bigg \{ (be^{-y} + \mu - r)kW \nonumber \\&+ (\sigma _2^2(z) + \rho \sigma _1\sigma _2(z)e^{-y})kW_y - \frac{k^2}{2}(1-\gamma )q(y,z)W \bigg \}. \end{aligned}$$
(17)

Define a function \(Q(y, z) = \log W (y,z)\). Then, we can get the equation for Q:

$$\begin{aligned} \beta= & {} \frac{\sigma _2^2(z)}{2}(Q_y^2 + Q_{yy}) + \frac{\sigma _3^2}{2}(Q_z^2 + Q_{zz})\nonumber \\&+ \gamma r + {\tilde{\mu }}(z) Q_y + a(\bar{z} - z)Q_z+ e^{-Q}\max _{c \ge 0} \big [c^\gamma - \gamma c e^Q \big ] +\gamma G(y, z, Q_y), \end{aligned}$$
(18)

where the function \(G(y,z,p)\) is defined by

$$\begin{aligned} G(y,z,p)\equiv & {} \max _{k \ge 0} \bigg \{ \Big [ be^{-y} + \mu - r + (\sigma _2^2(z) + \rho \sigma _1\sigma _2(z)e^{-y}) p \Big ] k \nonumber \\&- \frac{k^2}{2} (1-\gamma ) q(y,z) \bigg \}. \end{aligned}$$
(19)

The candidates for the optimal controls are

$$\begin{aligned} k^*(y,z)= & {} \left[ \frac{be^{-y} + \mu - r + (\sigma _2^2(z) + \rho \sigma _1\sigma _2(z)e^{-y})Q_y}{(1-\gamma )q(y,z)} \right] ^+, \end{aligned}$$
(20)
$$\begin{aligned} c^*(y,z)= & {} e^\frac{Q(y,z)}{\gamma - 1}. \end{aligned}$$
(21)
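Given a candidate solution Q and its derivative \(Q_y\) at a point, the feedback controls (20) and (21) are direct to evaluate. A minimal sketch follows; the volatility function and all numerical parameter values used below are illustrative assumptions, not values from the paper.

```python
import math

def q_fn(y, z, sigma1, sigma2, rho):
    """q(y,z) from (15); sigma2 is the volatility function sigma_2(z)."""
    s2 = sigma2(z)
    return sigma1**2 * math.exp(-2*y) + 2*rho*sigma1*s2*math.exp(-y) + s2**2

def k_star(y, z, Qy, b, mu, r, gamma, sigma1, sigma2, rho):
    """Candidate optimal investment fraction (20): the positive part of
    the adjusted excess return divided by (1 - gamma) q(y,z)."""
    s2 = sigma2(z)
    num = b*math.exp(-y) + mu - r + (s2**2 + rho*sigma1*s2*math.exp(-y))*Qy
    return max(num / ((1.0 - gamma) * q_fn(y, z, sigma1, sigma2, rho)), 0.0)

def c_star(Q, gamma):
    """Candidate optimal consumption fraction (21)."""
    return math.exp(Q / (gamma - 1.0))
```

For instance, with \(Q \equiv 0\) the consumption fraction is \(c^* = 1\), and \(k^*\) reduces to the Merton-type ratio \((be^{-y}+\mu -r)^+/((1-\gamma )q(y,z))\).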

Substituting the \((k^*, c^*)\) given by (20) and (21) into (18), we can rewrite the equation for Q as

$$\begin{aligned} \frac{\sigma _2^2(z)}{2}Q_{yy} + \frac{\sigma _3^2}{2}Q_{zz} = H(y,z,Q, Q_y, Q_z), \end{aligned}$$
(22)

where

$$\begin{aligned} H(y, z, u, p, s)\equiv & {} -\frac{\sigma _2^2(z)}{2} p^2 - \frac{\sigma _3^2}{2} s^2 - {\tilde{\mu }}(z) p - a(\bar{z} - z)s \nonumber \\&- \gamma G(y, z, p) + \beta - \gamma r - (1-\gamma )e^{\frac{u}{\gamma -1}}. \end{aligned}$$
(23)

Equation (22) is the reduced HJB equation for \(Q(y,z)\), and we want to show that \(V(x,y,z) = \frac{1}{\gamma }x^\gamma e^{Q(y,z)}\) is equal to the value function given by (13). We establish the existence of the solution \(Q(y,z)\) to (22) in Sect. 3. Then, in Sect. 4, we verify that \(V(x,y,z) = \frac{1}{\gamma }x^\gamma e^{Q(y,z)}\) is the value function, and the optimal investment and consumption strategies are given by (20) and (21), respectively.

We present some useful results before we move to the next section. First, we have the following lemma about the \(q(y,z)\) function defined by (15).

Lemma 2.1

For \(\rho \in [\rho _0, 1]\), where \(-1<\rho _0 < 0\), we have

$$\begin{aligned} q(y,z) \ge q_0 > 0, \end{aligned}$$
(24)

where \(q_0 \equiv \tilde{\sigma }_2^2(1-\rho ^2_0)\).

The proof can be found in Appendix A. \(\square \)
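The bound (24) can also be checked numerically: completing the square in \(t = \sigma _1 e^{-y} > 0\) gives \(q = (t + \rho \sigma _2(z))^2 + (1-\rho ^2)\sigma _2^2(z)\), which stays above \(q_0\) for all \(\rho \in [\rho _0, 1]\). A grid check (the volatility function and parameters below are illustrative stand-ins):

```python
import math

def check_q_lower_bound(rho0=-0.9, sigma1=0.2, sig2_min=0.15, n=41):
    """Grid check of Lemma 2.1: q(y,z) >= q0 = sig2_min^2 (1 - rho0^2)
    for rho in [rho0, 1].  sigma2 is a stand-in with values in
    (sig2_min, sig2_min + 0.1], so (6) holds with tilde-sigma_2 = sig2_min."""
    sigma2 = lambda z: sig2_min + 0.1 / (1.0 + z * z)
    q0 = sig2_min**2 * (1.0 - rho0**2)
    ys = [-4.0 + 8.0 * i / (n - 1) for i in range(n)]
    rhos = [rho0 + (1.0 - rho0) * j / (n - 1) for j in range(n)]
    def q(y, z, rho):
        s2 = sigma2(z)
        return sigma1**2*math.exp(-2*y) + 2*rho*sigma1*s2*math.exp(-y) + s2**2
    return all(q(y, z, r) >= q0 for y in ys for z in ys for r in rhos)
```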

Note that if \(k^* = 0\), then \(G = 0\). If \(k^* > 0,\) then

$$\begin{aligned} G (y, z, p)= \displaystyle \frac{[be^{-y} + \mu - r + (\sigma _2^2(z) +\rho \sigma _1\sigma _2(z)e^{-y}) p ]^2}{2(1-\gamma )q(y,z)} \ge 0. \end{aligned}$$
(25)

So we have \(G(y,z,p) \ge 0\). Define

$$\begin{aligned} \varPsi (y,z) \equiv {\left\{ \begin{array}{ll} \displaystyle \frac{(be^{-y} + \mu - r)^2}{2(1-\gamma )q(y,z)}, \quad &{}\text {if} \,\,\, be^{-y} + \mu - r > 0,\\ 0, &{}\text {otherwise}.\end{array}\right. } \end{aligned}$$
(26)

Then, it is easy to verify that \(G(y, z, 0) = \varPsi (y,z)\). We have the following lemma about the function \(\varPsi \).

Lemma 2.2

\(\varPsi (y,z)\) is bounded.

The proof can be found in Appendix B. \(\square \)
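The identity \(G(y,z,0) = \varPsi (y,z)\) noted above can be confirmed numerically: maximize the bracket in (19) at \(p=0\) by brute force over a grid of k and compare with the closed form (26). All parameter values and the volatility function below are illustrative assumptions.

```python
import math

B, MU, R, GAMMA, SIGMA1, RHO = 0.05, 0.08, 0.03, 0.5, 0.2, -0.4
sigma2 = lambda z: 0.15 + 0.1 / (1.0 + z * z)   # illustrative stand-in

def q_fn(y, z):
    s2 = sigma2(z)
    return SIGMA1**2*math.exp(-2*y) + 2*RHO*SIGMA1*s2*math.exp(-y) + s2**2

def G_at_p0(y, z, n=20000, kmax=50.0):
    """Brute-force maximum over k >= 0 of the bracket in (19) with p = 0."""
    lin = B*math.exp(-y) + MU - R
    quad = 0.5 * (1.0 - GAMMA) * q_fn(y, z)
    return max(lin*(kmax*i/n) - quad*(kmax*i/n)**2 for i in range(n + 1))

def Psi(y, z):
    """Closed form (26)."""
    lin = B*math.exp(-y) + MU - R
    return lin**2 / (2.0*(1.0 - GAMMA)*q_fn(y, z)) if lin > 0 else 0.0
```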

3 Existence Results

In this section, we prove the existence of a classical solution to (22). In particular, we use the subsolution–supersolution method to establish the existence results. The method was first introduced by Fleming and Pang [10] to solve HJB equations arising in stochastic control problems and was later extended by Hata and Sheu [17, 18]. Due to the particular structure of (22), however, the results of [17, 18] do not cover the problem considered here, and there are some technical difficulties that we have to overcome.

3.1 Subsolution and Supersolution

We first define subsolutions and supersolutions of (22).

Definition 3.1

\(Q(y,z)\) is a subsolution (supersolution) of (22), if

$$\begin{aligned} \frac{\sigma _2^2(z)}{2}Q_{yy} + \frac{\sigma _3^2}{2}Q_{zz} \ge (\le ) H(y, z, Q, Q_y, Q_z). \end{aligned}$$
(27)

In addition, if \(\tilde{Q}\) is a subsolution, \(\hat{Q}\) is a supersolution, and \(\tilde{Q} \le \hat{Q},\) then \(\langle \tilde{Q}, \hat{Q}\rangle \) is called an ordered pair of subsolution–supersolution.

Next, we show that there exists an ordered pair of subsolution and supersolution.

Lemma 3.1

Suppose \(0< \gamma < 1\) and

$$\begin{aligned} \beta > \gamma (r+{\bar{\varPsi }}), \end{aligned}$$
(28)

where \({\bar{\varPsi }}\) is defined by

$$\begin{aligned} {\bar{\varPsi }} \equiv \max \left\{ 1, \displaystyle \frac{1}{1+\rho _0}\right\} \cdot \max \left\{ \frac{b^2}{\sigma _1^2(1-\gamma )}, \frac{(\mu - r)^2}{\tilde{\sigma }_2^2(1-\gamma )} \right\} . \end{aligned}$$

In addition, define

$$\begin{aligned} K_1 \equiv (\gamma - 1)\log {\left[ \frac{\beta - \gamma r}{1-\gamma }\right] }, \quad \text{ and } \quad K_2 \equiv (\gamma - 1)\log \left[ \frac{\beta - \gamma (r + {\bar{\varPsi }})}{1-\gamma }\right] . \end{aligned}$$
(29)

Then, \(\langle K_1,K_2\rangle \) is an ordered pair of subsolution–supersolution to (22).
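The ordering \(K_1 \le K_2\) holds because \(\gamma - 1 < 0\) while \(\beta - \gamma r \ge \beta - \gamma (r + {\bar{\varPsi }})\): the logarithm is larger for \(K_1\), and multiplying by the negative factor \(\gamma -1\) reverses the inequality. A quick numerical illustration under assumption (28), with all parameter values chosen purely for illustration:

```python
import math

def psi_bar(b, sigma1, mu, r, sig2_min, gamma, rho0):
    """The constant bounding Psi(y,z), defined just before (29)."""
    return max(1.0, 1.0 / (1.0 + rho0)) * max(
        b**2 / (sigma1**2 * (1.0 - gamma)),
        (mu - r)**2 / (sig2_min**2 * (1.0 - gamma)))

def sub_super_constants(beta, r, gamma, pbar):
    """K1, K2 from (29); assumption (28) requires beta > gamma (r + pbar)."""
    assert beta > gamma * (r + pbar), "condition (28) violated"
    K1 = (gamma - 1.0) * math.log((beta - gamma * r) / (1.0 - gamma))
    K2 = (gamma - 1.0) * math.log((beta - gamma * (r + pbar)) / (1.0 - gamma))
    return K1, K2
```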

The proof is straightforward, and we omit it here. The main existence result is as follows:

Theorem 3.1

Suppose \(\sigma _2^2(z) \in C^{1,\alpha }({\bar{B}}_R)\). Define \(\tilde{Q} \equiv K_1\) and \(\hat{Q} \equiv K_2\), where \(K_1\) and \(K_2\) are given by (29). Then, there exists a solution \( Q \in C^{2,\beta }(\mathbb {R}^2)\) to (22) such that \(\tilde{Q} \le Q(y,z) \le \hat{Q}\) for all \((y,z) \in \mathbb {R}^2\).

The proof of Theorem 3.1 is given in Sect. 3.4. We first prove that, on the closed ball \({\bar{B}}_R \equiv \{(y, z) \in \mathbb {R}^2 : y^2 + z^2 \le R^2\}\), there exists a classical solution to the following boundary value problem

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2}Q_{yy} + \frac{\sigma _3^2}{2}Q_{zz} - H(y,z,Q, Q_y, Q_z) = 0, \quad &{}\text {on} \quad B_R,\\ Q = \psi , \quad &{}\text {on} \quad \partial B_R, \end{array} \end{aligned}$$
(30)

for a particular choice of \(\psi \). Once we have a solution to (30) on \({\bar{B}}_R\) for each R, we take the limit as \(R \rightarrow \infty \) to show the existence of a solution to (22). The details are provided in the proof of Theorem 3.1 in Sect. 3.4.

3.2 Existence Results of Boundary Value Problem (30)

Following the approach taken by Hata and Sheu [17], we start by introducing the parameter \(\tau \in [0,1]\) into our equation:

$$\begin{aligned} \frac{\sigma _2^2(z)}{2} Q^\tau _{yy} + \frac{\sigma _3^2}{2} Q^\tau _{zz} - H(y,z, Q^\tau , Q^\tau _y, Q^\tau _z, \tau ) = 0, \end{aligned}$$
(31)

where \(H(y,z, u, p, s, \tau )\) is equal to \(H(y,z,u,p,s)\) with \(\gamma \) replaced by \(\tau \gamma :\)

$$\begin{aligned} H(y, z, u, p, s, \tau )\equiv & {} -\frac{\sigma _2^2(z)}{2} p^2 - \frac{\sigma _3^2}{2} s^2 - {\tilde{\mu }}(z) p - a(\bar{z} - z)s - \tau \gamma G^\tau (y, z, p) \nonumber \\&+ \, \beta - \tau \gamma r - (1-\tau \gamma )e^{\displaystyle \frac{u}{\tau \gamma -1}}, \end{aligned}$$
(32)

and

$$\begin{aligned} G^\tau (y,z,p)= & {} \max _{k \ge 0} \bigg \{\Big [ be^{-y} + \mu - r + (\sigma _2^2(z) + \rho \sigma _1\sigma _2(z)e^{-y}) p \Big ] k\\&- \frac{k^2}{2} (1-\tau \gamma ) q(y,z) \bigg \}. \end{aligned}$$

For \(0<\tau \le 1\), (31) is the reduced HJB equation corresponding to the value function

$$\begin{aligned} V^\tau (x,y,z) = \sup _{(k_t, c_t) \in \varPi } \mathbf {E} \left[ \int _0^\infty e^{-\beta t}\frac{1}{\tau \gamma }(c_t X_t)^{\tau \gamma } \mathrm{d}t\right] , \end{aligned}$$
(33)

by taking \(V^\tau (x,y,z)=\frac{1}{\tau \gamma } x^{\tau \gamma } e^{Q^\tau (y,z)}\). We consider \(\tau =0\) to be a limiting case of \(0<\tau \le 1\). This corresponds to the consumption problem for log utility, for which (31) has a unique solution [please refer to (49)].

The boundary value problem for (31) is

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} Q^\tau _{yy} + \frac{\sigma _3^2}{2} Q^\tau _{zz} - H(y,z, Q^\tau , Q^\tau _y, Q^\tau _z, \tau ) = 0, &{}\text {on} \quad B_R,\\ Q^\tau = \tau \psi , &{}\text {on} \quad \partial B_R. \end{array} \end{aligned}$$
(34)

The above boundary value problem is used to obtain the existence result over the whole space.

The following theorem states sufficient conditions for the existence of a solution to (30).

Theorem 3.2

Let \(0<\alpha <1 \) and \(R>0\) be fixed. We assume the following conditions:

  (a) \(H(y,z,u,p,s,1) = H(y,z,u,p,s)\).

  (b) \(\sigma _2^2(z) \in C^{1,\alpha }({\bar{B}}_R); \, H(\cdot ,\cdot ,\cdot ,\cdot ,\cdot , \tau ) \in C^\alpha ({\bar{B}}_R \times \mathbb {R}^3)\) for \(\tau \in [0,1],\) and the function \(H(y,z,u,p,s,\tau )\) is continuous when considered as a mapping from [0, 1] into \(C^\alpha ({\bar{B}}_R \times \mathbb {R}\times \mathbb {R}^2)\).

  (c) \(\psi \in C^{2,\alpha }({\bar{B}}_R)\).

  (d) There exists a constant M, independent of \(Q^\tau \) and \(\tau \), such that every \(C^{2,\alpha }({\bar{B}}_R)\)-classical solution \(Q^\tau \) of (34) satisfies \(|Q^\tau (y,z)| < M\) for \((y,z) \in {\bar{B}}_R\).

  (e) There are \({\bar{k}}>0, \underline{c}, {\bar{c}},\) such that the following inequalities hold for \((y,z) \in {\bar{B}}_R, |u| \le M, \eta \in \mathbb {R}^2, \tau \in [0,1],\) and arbitrary \((p,s)\):

    $$\begin{aligned}&\underline{c} \sum _{i=1}^2 \eta _i^2 \le \sigma _2^2(z)\eta _1^2 + \sigma _3^2\eta _2^2 \le {\bar{c}} \sum _{i=1}^2 \eta _i^2,\nonumber \\&\quad |H(y,z,u,p,s,\tau )| + \left| \frac{d \sigma _2^2(z)}{d z} \right| \le {\bar{c}} (1 + p^2 + s^2)^{\frac{{\bar{k}}}{2}}. \end{aligned}$$
    (35)

  (f) There is an \(M_1 > 0\) such that \(|Q_\tau ^0(y,z)| < M_1\) for any \(\tau \in [0,1]\) and \((y,z) \in {\bar{B}}_R\), where \(Q^0_\tau (\cdot )\) is an arbitrary solution of

    $$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} Q_{yy} + \frac{\sigma _3^2}{2} Q_{zz} - \tau H(y,z, Q, Q_y,Q_z,0) = 0 &{}\text {on} \quad B_R, \\ Q = 0 &{}\text {on} \quad \partial B_R. \end{array} \end{aligned}$$
Then, boundary value problem (30) is solvable in \(C^{2,\alpha }({\bar{B}}_R)\).

See Theorem 3.4 and the proof in [17] for more details. \(\square \)

The next step is to prove that the conditions in Theorem 3.2 hold for the problem considered in this paper, which yields the existence of a solution to boundary value problem (30). Several results are needed before that can be done. We begin with the following definition.

Definition 3.2

Let \(\tilde{Q}, \hat{Q}\) be twice continuously differentiable functions defined on \({\bar{B}}_R\). \(\tilde{Q}\) (\(\hat{Q}\)) is called a subsolution (supersolution) of (30) if it satisfies

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2}{Q}_{yy} + \frac{\sigma _3^2}{2}{Q}_{zz} - H(y,z,{Q}, {Q}_y, {Q}_z) \ge (\le ) 0, \quad &{}on \quad B_R,\\ {Q} \le (\ge ) \psi , \quad &{}on \quad \partial B_R. \end{array} \end{aligned}$$

In addition, \(\langle \tilde{Q}, \hat{Q}\rangle \) is called an ordered pair of subsolution–supersolution if they also satisfy

$$\begin{aligned} \tilde{Q}(y,z) \le \hat{Q}(y,z), \quad \forall (y,z) \in {\bar{B}}_R. \end{aligned}$$

We can define an ordered pair of subsolution–supersolution to (34) in a similar manner. The following lemma is used later.

Lemma 3.2

For G given by (19), there exist constants \({\tilde{C}}_1, {\tilde{C}}_2 > 0\) such that

$$\begin{aligned} G(y,z,p) \le {\tilde{C}}_1 + {\tilde{C}}_2 p^2. \end{aligned}$$
(36)

The proof can be found in Appendix C. \(\square \)

Next, we establish a comparison result.

Lemma 3.3

Let \(0<\tau \le 1\). Assume \(\tilde{Q}, \hat{Q}\) are twice continuously differentiable functions on \({\bar{B}}_R\) and satisfy

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} \tilde{Q}_{yy} + \frac{\sigma _3^2}{2} \tilde{Q}_{zz} - H(y,z, \tilde{Q}, \tilde{Q}_y, \tilde{Q}_z, \tau ) \ge 0 \quad &{}on \quad B_R,\\ \frac{\sigma _2^2(z)}{2} \hat{Q}_{yy} + \frac{\sigma _3^2}{2} \hat{Q}_{zz} - H(y,z, \hat{Q}, \hat{Q}_y, \hat{Q}_z, \tau ) \le 0 \quad &{}on \quad B_R,\\ \tilde{Q} \le \hat{Q} \quad &{}on \quad \partial B_R. \end{array} \end{aligned}$$
(37)

Then, \(\tilde{Q} \le \hat{Q}\) holds in \({\bar{B}}_R\).

The proof can be found in Appendix D. From Lemma 3.3, we can get the following result:

Corollary 3.1

For \(0<\tau \le 1\), the solution \(Q^\tau \in C^2({\bar{B}}_R)\) of (34) is unique.

The proof can be found in Appendix E. \(\square \)

To establish the existence result for (30) by virtue of Theorem 3.2, we need to verify that the conditions \((a){-}(f)\) are satisfied.

The coefficients of (22), \(\sigma _2^2(z), {\tilde{\mu }}(z), g_0(y,z), g_1(y,z),\) and \(g_2(y,z),\) are Lipschitz continuous for all \((y,z) \in \mathbb {R}^2\). If \(Q(y,z) \in C^1({\bar{B}}_R)\), it follows that \(H(y,z,Q,Q_y, Q_z) \in C^\alpha ({\bar{B}}_R\times \mathbb {R} \times \mathbb {R}^2)\). This helps to verify condition (b).

The next two theorems provide us with bounds on \(Q^\tau \), which are useful when we verify the conditions (d) and (f).

Theorem 3.3

Suppose \(\sigma _2^2(z) \in C^{1,\alpha }({\bar{B}}_R), \psi \) is continuous, and let \(\hat{Q}\) be a supersolution of (34) for \(\tau = 1\). Suppose that \(Q^\tau \) is a solution of (34) with \(0<\tau \le 1\). Then,

$$\begin{aligned} e^{Q^\tau (y,z)} \le \tau e^{\hat{Q}(y,z)} + (1-\tau )f(y,z), \end{aligned}$$
(38)

where f(yz) satisfies

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2}f_{yy}(y, z) + \frac{\sigma _3^2}{2}f_{zz}(y,z) + {\tilde{\mu }}(z)f_y(y,z) &{}\\ \quad \quad + \; a(\bar{z} - z)f_z(y,z) - \beta f(y,z) + 1 = 0, \,\, &{}{\mathrm{on}} \,\, B_R,\\ f(y,z) = 1, &{}{\mathrm{on}} \,\, \partial B_R, \end{array} \end{aligned}$$
(39)

and is given by

$$\begin{aligned} f(y,z) = \frac{1}{\beta } + \left( 1 - \frac{1}{\beta }\right) \mathbf {E}_{y,z}[ e^{-\beta t_R}], \end{aligned}$$
(40)

where \( t_R = \inf \left\{ t \ge 0 ; \sqrt{Y_t^2 + Z_t^2} = R \right\} . \) Moreover, for \(0< \gamma < 1,\)

$$\begin{aligned} Q^\tau (y,z) \ge \, - \log \left( \max \left\{ \frac{\beta }{1-\gamma }, 1 \right\} \right) - \sup _{\partial B_R} \{|\psi (y,z)|\}. \end{aligned}$$
(41)

Theorem 3.4

Let \(0<\tau \le 1\). Suppose \(\sigma _2^2(z) \in C^{1,\alpha }(\bar{B}_R),\) and that \(Q_\tau ^0\) is a solution of

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} Q_{yy} + \frac{\sigma _3^2}{2} Q_{zz} - \tau H(y,z, Q, Q_y, Q_z, 0) = 0, \quad &{}on \quad B_R,\\ Q = 0, \quad &{}on \quad \partial B_R. \end{array} \end{aligned}$$
(42)

Then,

$$\begin{aligned} -\beta \mathbf {E}[{\bar{t}}_R] \le Q^0_\tau (y,z) \le \log (1 + \mathbf {E}[{\bar{t}}_R]), \end{aligned}$$
(43)

where \( {\bar{t}}_R = \inf \left\{ t\ge 0; \sqrt{\hat{Y}_t^2 + \hat{Z}_t^2} = R \right\} , \) and \(\hat{Y}_t, \hat{Z}_t\) are defined by

$$\begin{aligned} d\hat{Y}_t= & {} \tau {\tilde{\mu }}(\hat{Z}_t)\mathrm {d}t + \sigma _2(\hat{Z}_t)\mathrm {d}B_{2,t}, \quad \hat{Y}_0 = y, \\ d\hat{Z}_t= & {} \tau a(\bar{z} - \hat{Z}_t)\mathrm {d}t + \sigma _3 \mathrm {d}B_{3,t}, \quad \hat{Z}_0 = z. \end{aligned}$$

The proofs of Theorems 3.3 and 3.4 can be found in Appendices G and H, respectively.
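Formula (40) suggests a direct Monte Carlo approximation of f: simulate \((Y_t, Z_t)\) from (10) and (7) by an Euler scheme until the first exit from \(B_R\), then average \(e^{-\beta t_R}\). A rough sketch, with a stand-in \(\sigma _2\), illustrative parameter values, a coarse time step, and a horizon cap `t_max`, so the estimate is only approximate:

```python
import math
import random

def f_monte_carlo(y0, z0, R, beta=0.3, mu=0.08, a=1.0, zbar=0.0,
                  sigma3=0.3, dt=0.01, n_paths=100, t_max=30.0, seed=11):
    """Estimate f(y,z) = 1/beta + (1 - 1/beta) E[exp(-beta t_R)] from (40).
    (Y, Z) follow Euler discretizations of (10) and (7), driven by
    independent noises (consistent with B2, B3 being uncorrelated);
    paths stop on exit from B_R or, as a crude truncation, at t_max."""
    rng = random.Random(seed)
    sigma2 = lambda z: 0.15 + 0.1 / (1.0 + z * z)   # stand-in obeying (6)
    sq = math.sqrt(dt)
    acc = 0.0
    for _ in range(n_paths):
        y, z, t = y0, z0, 0.0
        while y*y + z*z < R*R and t < t_max:
            s2 = sigma2(z)
            y += (mu - 0.5*s2*s2)*dt + s2*rng.gauss(0.0, sq)   # eq. (10)
            z += a*(zbar - z)*dt + sigma3*rng.gauss(0.0, sq)   # eq. (7)
            t += dt
        acc += math.exp(-beta * t)
    return 1.0/beta + (1.0 - 1.0/beta) * acc / n_paths
```

Starting on \(\partial B_R\), the exit time is zero and the estimate equals the boundary value 1; starting inside, the estimate lies strictly between 1 and \(1/\beta \), in line with (40).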

The following theorem gives us the existence of a solution to boundary value problem (30) with \(\psi = \tilde{Q}\), on the closed ball given by

$$\begin{aligned} {\bar{B}}_R = \{(y, z) \in \mathbb {R}^2 : y^2 + z^2 \le R^2\}. \end{aligned}$$

Theorem 3.5

Assume \(\sigma _2^2(z) \in C^{1,\alpha }({\bar{B}}_R)\), that (22) has an ordered pair of subsolution–supersolution \(\langle \tilde{Q}, \hat{Q}\rangle \), and that \(\tilde{Q} \in C^{2,\beta }({\bar{B}}_R)\) for some \(0<\beta \le 1\). Further assume that \(\hat{Q}\) is a supersolution of (34) for \(\tau =1\), and \(Q^\tau \) is a solution of (34) with \(0<\tau \le 1\). Then, the boundary value problem

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} Q_{yy} + \frac{\sigma _3^2}{2} Q_{zz} - H(y,z, Q, Q_y, Q_z) = 0 \quad &{}\text {on} \quad B_R,\\ Q = \tilde{Q} \quad &{}\text {on} \quad \partial B_R, \end{array} \end{aligned}$$
(44)

has a unique solution in \(C^{2,\beta }({\bar{B}}_R)\).

Proof

Notice that (44) is equivalent to (30) with \(\psi = \tilde{Q}\). We use Theorem 3.2 to prove the existence of a solution. Note that conditions (a), (b), and (c) are automatically satisfied. By Theorem 3.3, we have inequality (38):

$$\begin{aligned} e^{Q^\tau (y,z)} \le \tau e^{\hat{Q}(y,z)} + (1-\tau )f(y,z). \end{aligned}$$

Since \(0<\tau \le 1\) and \(f > 0\), we can write

$$\begin{aligned} e^{Q^\tau (y,z)} \le e^{\hat{Q}(y,z)} + f(y,z), \quad \text{ or } \quad Q^\tau (y,z) \le \log \big [e^{\hat{Q}(y,z)} + f(y,z)\big ] \le {\bar{M}}, \end{aligned}$$

where \(\bar{M}\equiv \max _{\bar{B}_R} \log \big [e^{\hat{Q}(y,z)} + f(y,z)\big ]\). Also by Theorem 3.3, we have the following inequality:

$$\begin{aligned} Q^\tau (y,z) \ge \, -\sup _{(y,z)\in \partial B_R} \{|\tilde{Q}(y,z)|\} - \log \left( \max \left\{ \frac{\beta }{1-\gamma }, 1 \right\} \right) \equiv \underline{M}. \end{aligned}$$
(45)

Note that \(\underline{M}\) and \({\bar{M}}\) are independent of both \(\tau \) and \(Q^\tau \). Here, we take

$$\begin{aligned} M \equiv \max \{|\underline{M}|, {\bar{M}}\} + 1. \end{aligned}$$
(46)

Then, we have the bound

$$\begin{aligned} |Q^\tau (y,z)| < M, \end{aligned}$$
(47)

where M is independent of \(\tau \) and \(Q^\tau \). Hence condition (d) of Theorem 3.2 is satisfied.

For condition (e), take \(\underline{c} \equiv \min \{\tilde{\sigma }_2^2, \sigma _3^2\}\) and \({\bar{c}} \equiv \max \{\hat{\sigma }_2^2, \sigma _3^2\}\). Then, for any \((\eta _1,\eta _2) \in \mathbb {R}^2,\)

$$\begin{aligned} \underline{c} (\eta _1^2 + \eta _2^2) \le \sigma _2^2(z) \eta _1^2 + \sigma _3^2 \eta _2^2 \le {\bar{c}} (\eta _1^2 + \eta _2^2). \end{aligned}$$

For the second part of condition (e), we have that for \(|u| \le M,\)

$$\begin{aligned}&|H(y,z,u,p,s,\tau )| + 2 \sigma _2(z)\left| \displaystyle \frac{d \sigma _2(z)}{d z}\right| \nonumber \\&\quad = \bigg |-\frac{\sigma _2^2(z)}{2} p^2 - \frac{\sigma _3^2}{2} s^2 - {\tilde{\mu }}(z) p - a(\bar{z} - z)s - \gamma G(y, z, p)\nonumber \\&\qquad + \beta - \gamma r - (1-\gamma )e^{\frac{u}{\gamma -1}}\bigg | + 2 \sigma _2(z)\left| \displaystyle \frac{d \sigma _2(z)}{d z}\right| \nonumber \\&\quad \le C(1 + p^2 + s^2). \end{aligned}$$
(48)

Thus, condition (e) is satisfied, with \({\bar{k}}=2\).

By Theorem 3.4, we have estimate (43). It follows that \( |Q^0_\tau (y,z)| < M_1\), where \(M_1 = \max \big \{\beta \mathbf {E}[{\bar{t}}_R], \log (1 + \mathbf {E}[{\bar{t}}_R]) \big \} + C_1,\) for some constant \(C_1 > 0\). Thus, condition (f) of Theorem 3.2 is satisfied.

In (34), take \(\psi \equiv \tilde{Q}\). For the \(\tau = 0\) case, (34) has the unique solution

$$\begin{aligned} Q^0_\tau = \log f, \end{aligned}$$
(49)

where f is given by (40). Indeed, setting \(\tau = 0\) in (34) and substituting (49) into the equation satisfied on \(B_R\), we obtain

$$\begin{aligned}&\frac{\sigma _2^2(z)}{2}\left[ -\frac{1}{f^2}f_y^2 + \frac{1}{f}f_{yy}\right] + \frac{\sigma _3^2}{2}\left[ -\frac{1}{f^2}f_z^2 + \frac{1}{f}f_{zz}\right] + \frac{\sigma _2^2(z)}{2}\frac{1}{f^2}f_y^2 + \frac{\sigma _3^2}{2}\frac{1}{f^2}f_z^2 \\&\qquad +\, {\tilde{\mu }}(z)\frac{1}{f}f_y + \; a(\bar{z} - z)\frac{1}{f}f_z - \beta + \frac{1}{f} \\&\quad = \frac{1}{f}\left[ \frac{\sigma _2^2(z)}{2}f_{yy} + \frac{\sigma _3^2}{2}f_{zz} + {\tilde{\mu }}(z)f_y + a(\bar{z} - z)f_z - \beta f + 1\right] \\&\quad =0, \end{aligned}$$

by (39). On the boundary \(\partial B_R\), \(t_R = 0\). Then, by (40) we see that \(f=1\), and so \(Q^0_\tau = 0\) on \(\partial B_R\). Therefore, \(Q^0_\tau = \log f\) is a solution to (34) for \(\tau = 0\). Uniqueness can be proven using a method similar to that in Corollary 3.1.

The solution for \(\tau =0\) corresponds to the solution of the log utility problem, which is the limit of the HARA utility case as the parameter \(\gamma \) goes to 0. We do not discuss the log utility case in detail in this paper, as our main focus is on the non-log HARA utility with \(0<\gamma <1\).

By Theorem 3.2, (44) has a solution in \(C^{2,\alpha }({\bar{B}}_R)\). For \(\tau = 1\), (34) is equivalent to (30), which is equivalent to (44) when \(\psi = \tilde{Q}\). Therefore, by Corollary 3.1, the solution to (44) is unique. \(\square \)

3.3 A Uniform Bound for \(\sup _{B_R} |DQ|^2\)

Before proving the existence of a classical solution to (22), we must establish a uniform bound for \(\sup _{B_R} |DQ|^2\) for any R.

Theorem 3.6

Let \(Q_{{\tilde{R}}}\) be a smooth function satisfying HJB equation (22) in \(B_{{\tilde{R}}}\). For each \(R>0\) and \({{\tilde{R}}}>2R\), we have

$$\begin{aligned} \displaystyle \sup _{B_R} |D Q_{{\tilde{R}}}|^2 \le C_R + C(\beta - \gamma r), \end{aligned}$$
(50)

where C is a nonnegative constant independent of R and \({\tilde{R}}\), and \(C_R\) is a constant depending only on R.

To prove Theorem 3.6, we need the following lemma:

Lemma 3.4

Let \(Q(x) \in C^2(\mathbb {R}^N)\), and \(a^{i,j}(x) \in C^2(\mathbb {R}^N)\) for all \(i,j = 1,\ldots ,N\). Then,

$$\begin{aligned} \displaystyle \sum _{i,j,k=1}^N D_k a^{ij} D_k Q D_{ij} Q \le \frac{1}{2\epsilon } \left( \displaystyle \sum _{i,j=1}^N |Da^{ij}|^2 \right) |DQ|^2 + \frac{\epsilon }{2} |D^2 Q|^2, \end{aligned}$$
(51)

where \(\epsilon >0\) is a small constant.

The proof can be found in “Appendix F”. \(\square \)
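Inequality (51) amounts to the Cauchy–Schwarz inequality followed by Young's inequality \(xy \le \frac{1}{2\epsilon }x^2 + \frac{\epsilon }{2}y^2\) applied termwise, so it holds for arbitrary data; a minimal numerical sketch, with random test arrays standing in for the quantities \(Da^{ij}\), DQ, and \(D^2Q\):

```python
import numpy as np

rng = np.random.default_rng(1)
N, eps = 4, 0.3
Da = rng.normal(size=(N, N, N))          # Da[i, j, :] plays the role of D a^{ij}
Da = (Da + Da.transpose(1, 0, 2)) / 2    # symmetrize in (i, j), as for a^{ij}
DQ = rng.normal(size=N)                  # stand-in for the gradient DQ
D2Q = rng.normal(size=(N, N))
D2Q = (D2Q + D2Q.T) / 2                  # stand-in for the Hessian D^2 Q

lhs = np.einsum('ijk,k,ij->', Da, DQ, D2Q)  # sum_{i,j,k} D_k a^{ij} D_k Q D_{ij} Q
rhs = np.sum(Da**2) * (DQ @ DQ) / (2 * eps) + eps / 2 * np.sum(D2Q**2)
print(lhs <= rhs)  # True
```

The check passes for any choice of arrays and any \(\epsilon > 0\), since the bound is pointwise and algebraic.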

Now, we can give the proof of Theorem 3.6.

Proof of Theorem 3.6

We write G, as defined in (19), in quadratic form:

$$\begin{aligned} G(y,z,Q_y) = \gamma g_2 Q_y^2 + \gamma g_1 Q_y + \gamma g_0, \end{aligned}$$

where \(g_2, g_1,\) and \(g_0\) are defined by

$$\begin{aligned} g_2(y,z)= & {} \displaystyle \frac{(\sigma _2^2(z) + \rho \sigma _1\sigma _2(z)e^{-y})^2}{2(1-\gamma )q(y,z)}, \\ g_1(y,z)= & {} \displaystyle \frac{(be^{-y}+\mu - r)( \sigma _2^2(z) + \rho \sigma _1\sigma _2(z)e^{-y})}{(1-\gamma )q(y,z)},\\ g_0(y,z)= & {} \displaystyle \frac{(be^{-y} + \mu - r)^2}{2(1-\gamma )q(y,z)}. \end{aligned}$$
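These coefficients combine into a perfect square: with \(A = \sigma _2^2(z) + \rho \sigma _1\sigma _2(z)e^{-y}\) and \(B = be^{-y} + \mu - r\), we have \(g_2 p^2 + g_1 p + g_0 = (Ap+B)^2/\big (2(1-\gamma )q\big )\). A symbolic sketch of this identity, treating A, B, and q as abstract symbols:

```python
import sympy as sp

p, A, B, q, gamma = sp.symbols('p A B q gamma')
# A stands for sigma_2^2 + rho*sigma_1*sigma_2*e^{-y}; B for b*e^{-y} + mu - r.
denom = 2 * (1 - gamma) * q
g2 = A**2 / denom          # matches g_2 above
g1 = 2 * A * B / denom     # matches g_1 = AB / ((1-gamma) q)
g0 = B**2 / denom          # matches g_0
square_form = (A * p + B)**2 / denom
print(sp.simplify(g2 * p**2 + g1 * p + g0 - square_form))  # 0
```

Since \(0< \gamma < 1\) and \(q > 0\), the square form makes it transparent that \(G \ge 0\) in the case treated below.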

Then, \(Q_{{\tilde{R}}}\) satisfies

$$\begin{aligned}&\frac{\sigma _2^2(z)}{2}Q_{yy} + \frac{\sigma _3^2}{2}Q_{zz} + \frac{1}{2}\left( \sigma _2^2(z) + 2\gamma g_2(y,z)\right) Q_y^2 + \frac{\sigma _3^2}{2}Q_z^2 \nonumber \\&\quad +\, ({\tilde{\mu }}(z) + \gamma g_1(y,z)) Q_y + a(\bar{z} - z)Q_z + \gamma g_0(y,z) \nonumber \\&\quad +\, (1-\gamma )e^{\frac{Q}{\gamma -1}} = \beta - \gamma r \end{aligned}$$
(52)

on \(B_{{\tilde{R}}}\). Note that this is the case \(be^{-y} + \mu - r \ge 0\), which holds if we assume \(\mu >r\). The simpler case of \(G = 0\) (if \(be^{-y} + \mu - r < 0\)) can be proved by the same steps as in this proof, so we do not treat it explicitly.

For simplicity, we drop the subscript \({{\tilde{R}}}\) and set \(Q \equiv Q_{{\tilde{R}}}\). Differentiating (52) with respect to y and z, we obtain

$$\begin{aligned}&\frac{\sigma _2^2(z)}{2}Q_{yyy} + \frac{\sigma _3^2}{2}Q_{zz y} + (\sigma _2^2(z) + 2\gamma g_2)Q_y Q_{yy} + \gamma (g_2)_y Q_y^2 + \sigma _3^2 Q_z Q_{z y} \nonumber \\&\quad +\, ({\tilde{\mu }}(z) + \gamma g_1) Q_{yy} + \gamma (g_1)_y Q_y + a(\bar{z} - z)Q_{z y} + \gamma (g_0)_y - e^{\frac{Q}{\gamma -1}} Q_y = 0,\nonumber \\ \end{aligned}$$
(53)

and

$$\begin{aligned}&\frac{\sigma _2^2(z)}{2}Q_{yyz} + \sigma _2(z)\sigma _2'(z) Q_{yy} + \frac{\sigma _3^2}{2}Q_{zzz} + (\sigma _2^2(z) + 2\gamma g_2)Q_y Q_{yz} \nonumber \\&\quad +\, (\sigma _2(z)\sigma _2'(z) + \gamma (g_2)_z) Q_y^2 + \sigma _3^2 Q_z Q_{zz} + ({\tilde{\mu }}(z) + \gamma g_1) Q_{yz}\nonumber \\&\quad +\, ({\tilde{\mu }}'(z) + \gamma (g_1)_z) Q_y + a(\bar{z} - z)Q_{zz} - aQ_z + \gamma (g_0)_z - e^{\frac{Q}{\gamma -1}} Q_z = 0,\nonumber \\ \end{aligned}$$
(54)

respectively. Next, we take the sum of Eq. (53) multiplied by \(Q_y\) and Eq. (54) multiplied by \(Q_z\). Rearranging the result (to a form that is useful in a later step), we get

$$\begin{aligned}&-\frac{\sigma _2^2(z)}{2} Q_{yyy}Q_y - \frac{\sigma _3^2}{2}Q_{zz y}Q_y - \frac{\sigma _2^2(z)}{2}Q_{yyz} Q_z - \frac{\sigma _3^2}{2} Q_{zzz}Q_z \nonumber \\&\qquad -\, (\sigma _2^2(z) + 2\gamma g_2)(Q_y Q_{yy}Q_y + Q_y Q_{yz} Q_z) - \sigma _3^2 (Q_z Q_{z y}Q_y + Q_z Q_{zz} Q_z) \nonumber \\&\qquad -\, ({\tilde{\mu }}(z) + \gamma g_1) (Q_{yy}Q_y + Q_{yz}Q_z) - a(\bar{z} - z) (Q_{z y}Q_y + Q_{zz}Q_z) \nonumber \\&\quad = \sigma _2(z)\sigma _2'(z) Q_{yy} Q_z + \gamma (g_2)_y Q_y^3 + (\sigma _2(z)\sigma _2'(z) + \gamma (g_2)_z) Q_y^2 Q_z \nonumber \\&\qquad +\, \gamma (g_1)_y Q_y^2 + ({\tilde{\mu }}'(z) + \gamma (g_1)_z)Q_y Q_z - a Q_z^2 + \gamma (g_0)_y Q_y + \gamma (g_0)_z Q_z \nonumber \\&\qquad -\, e^{\frac{Q}{\gamma -1}}[Q_y^2+Q_z^2] \end{aligned}$$
(55)

Define

$$\begin{aligned} \varPhi \equiv \frac{1}{2}|DQ|^2 = \frac{1}{2}(Q_y^2 + Q_z^2). \end{aligned}$$
(56)

Then, we have

$$\begin{aligned} D_y \varPhi= & {} Q_y Q_{yy} + Q_z Q_{z y}, \quad D_z \varPhi = Q_y Q_{yz} + Q_z Q_{zz},\\ D_{yy} \varPhi= & {} Q_{yy}^2 + Q_y Q_{yyy} + Q_{yz}^2 + Q_z Q_{z yy}, \\ D_{zz}\varPhi= & {} Q_{yz}^2 + Q_y Q_{yzz} + Q_{zz}^2 + Q_z Q_{zzz}. \end{aligned}$$

By (55), we can get

$$\begin{aligned}&-\frac{\sigma _2^2(z)}{2}D_{yy}\varPhi - \frac{\sigma _3^2}{2}D_{zz}\varPhi - (\sigma _2^2(z) + 2\gamma g_2) Q_y D_y \varPhi - \sigma _3^2 Q_z D_z \varPhi \nonumber \\&\qquad -\, ({\tilde{\mu }}(z) + \gamma g_1)D_y \varPhi - a(\bar{z} - z)D_z \varPhi \nonumber \\&\quad = \sigma _2(z)\sigma _2'(z) Q_{yy} Q_z + \gamma (g_2)_y Q_y^3 + (\sigma _2(z)\sigma _2'(z) + \gamma (g_2)_z) Q_y^2 Q_z + \gamma (g_1)_y Q_y^2 \nonumber \\&\qquad +\, ({\tilde{\mu }}'(z) + \gamma (g_1)_z)Q_y Q_z - a Q_z^2\nonumber \\&\qquad +\, \gamma (g_0)_y Q_y + \gamma (g_0)_z Q_z \nonumber \\&\qquad -\, e^{\frac{Q}{\gamma -1}}(Q_y^2+Q_z^2)- \frac{\sigma _2^2(z)}{2}(Q_{yy}^2 + Q_{yz}^2) - \frac{\sigma _3^2}{2}(Q_{yz}^2 + Q_{zz}^2). \end{aligned}$$
(57)

By Lemma 3.4, we have that

$$\begin{aligned} \sigma _2(z)\sigma _2'(z)Q_z Q_{yy} \le \frac{1}{4\epsilon } (2\sigma _2(z) \sigma _2'(z))^2(Q_y^2 + Q_z^2) + \frac{\epsilon }{4}(Q_{yy}^2 + 2Q_{yz}^2 + Q_{zz}^2). \end{aligned}$$

Then, we obtain the following inequality for the right-hand side (RHS) of (57):

$$\begin{aligned} \text{ RHS } \text{ of } (57)\le & {} \frac{1}{4\epsilon } (2\sigma _2(z) \sigma _2'(z))^2(Q_y^2 + Q_z^2) + \frac{\epsilon }{4}(Q_{yy}^2 + 2Q_{yz}^2 + Q_{zz}^2)\\&+\, \gamma (g_2)_y Q_y^3 + (\sigma _2(z)\sigma _2'(z) + \gamma (g_2)_z) Q_y^2 Q_z + \gamma (g_1)_y Q_y^2\\&+\, ({\tilde{\mu }}'(z) + \gamma (g_1)_z)Q_y Q_z - a Q_z^2 + \gamma (g_0)_y Q_y + \gamma (g_0)_z Q_z\\&-\, e^{\frac{Q}{\gamma -1}}(Q_y^2+Q_z^2) - \frac{\sigma _2^2(z)}{4}(Q_{yy}^2 + Q_{yz}^2) - \frac{\sigma _3^2}{4}(Q_{yz}^2 + Q_{zz}^2) \\&-\, \frac{\sigma _2^2(z)}{4}(Q_{yy}^2 + Q_{yz}^2) - \frac{\sigma _3^2}{4}(Q_{yz}^2 + Q_{zz}^2)\\\le & {} \frac{1}{\epsilon } (\sigma _2(z) \sigma _2'(z))^2(Q_y^2 + Q_z^2) \\&+\, \gamma (g_2)_y Q_y^3 + (\sigma _2(z)\sigma _2'(z) + \gamma (g_2)_z) Q_y^2 Q_z + \gamma (g_1)_y Q_y^2 \\&+\, ({\tilde{\mu }}'(z) + \gamma (g_1)_z)Q_y Q_z - a Q_z^2 + \gamma (g_0)_y Q_y + \gamma (g_0)_z Q_z \\&-\, e^{\frac{Q}{\gamma -1}}(Q_y^2 + Q_z^2)- \frac{\sigma _2^2(z)}{4}(Q_{yy}^2 + Q_{yz}^2) - \frac{\sigma _3^2}{4}(Q_{yz}^2 + Q_{zz}^2), \end{aligned}$$

where the last step is based on the fact that

$$\begin{aligned}&\frac{\epsilon }{4}(Q_{yy}^2 + 2Q_{yz}^2 + Q_{zz}^2) - \frac{\sigma _2^2(z)}{4}(Q_{yy}^2 + Q_{yz}^2) - \frac{\sigma _3^2}{4}(Q_{yz}^2 + Q_{zz}^2)\\&\quad = \frac{1}{4}\Big [(\epsilon - \sigma _2^2(z))Q_{yy}^2 + (2\epsilon - \sigma _2^2(z) - \sigma _3^2)Q_{yz}^2 + (\epsilon - \sigma _3^2) Q_{zz}^2\Big ]\\&\quad \le 0, \end{aligned}$$

for any constant \(\epsilon \) such that \(0 < \epsilon \le \min \{\tilde{\sigma }_2^2,\sigma _3^2\},\) where \(\tilde{\sigma }_2\) is given by (6).

Consider the matrix inequality \((tr(AB))^2 \le N \nu _2 (tr(AB^2))\), where A and B are \(N \times N\) symmetric matrices, A is positive semidefinite, and \(\nu _2\) is the maximum eigenvalue of A. We use this inequality with \(A=\begin{bmatrix}\sigma _2^2(z)&0\\ 0&\sigma _3^2\end{bmatrix}\) and \(B = \begin{bmatrix} Q_{yy}&Q_{yz} \\ Q_{yz}&Q_{zz} \end{bmatrix}\) to get

$$\begin{aligned} -\frac{1}{4}\Big [\sigma _2^2(z) Q_{yy}^2 + (\sigma _2^2(z) + \sigma _3^2)Q_{yz}^2 + \sigma _3^2 Q_{zz}^2\Big ] \le -\frac{1}{8\nu _2}(\sigma _2^2(z)Q_{yy} + \sigma _3^2 Q_{zz})^2. \end{aligned}$$

Then, we have

$$\begin{aligned}&-\frac{\sigma _2^2(z)}{2}D_{yy}\varPhi - \frac{\sigma _3^2}{2}D_{zz}\varPhi - (\sigma _2^2(z) + 2\gamma g_2) Q_y D_y \varPhi - \sigma _3^2 Q_z D_z \varPhi \nonumber \\&\qquad -\, ({\tilde{\mu }}(z) + \gamma g_1)D_y \varPhi - a(\bar{z} - z)D_z \varPhi \nonumber \\&\quad \le C_R |DQ| + C_R|DQ|^2 + C_R|DQ|^3 - \frac{1}{8\nu _2}\left( \sigma _2^2(z) Q_{yy} + \sigma _3^2 Q_{zz}\right) ^2 \quad \text {in} \,\; B_{2R}.\nonumber \\ \end{aligned}$$
(58)

We use \(C_R\) to represent an arbitrary constant depending only on R, and we use C for a nonnegative constant independent of R and \({{\tilde{R}}}\).

Fix arbitrary \(\xi \in B_R\), and let \(B_R(\xi )\) be an open ball with radius R and center \(\xi \). Let \(\phi \in C^\infty _0(\mathbb {R}^2)\) be a cutoff function such that

$$\begin{aligned} 0\le \phi \le 1 \,\; \text {in} \,\; \mathbb {R}^2, \ \phi (\xi ) = 1, \,\, \phi \equiv 0 \,\; \text {in} \,\; (B_R(\xi ))^c,\,\; |D\phi | \le C\phi ^{1/2},\,\; |D^2 \phi | \le C. \end{aligned}$$
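One concrete (if only \(C^{1,1}\)) realization of such a cutoff is \(\phi (x) = \big (1 - |x - \xi |^2/R^2\big )_+^2\): it equals 1 at \(\xi \), vanishes outside \(B_R(\xi )\), and satisfies \(|D\phi | \le (4/R)\phi ^{1/2}\); a genuinely smooth cutoff with the same bounds can then be obtained by mollification. A numerical sketch of the gradient bound:

```python
import numpy as np

R = 1.0
xi = np.array([0.0, 0.0])

def phi(x):
    """Bump (1 - |x - xi|^2/R^2)_+^2: equals 1 at xi, vanishes outside B_R(xi)."""
    s = max(0.0, 1.0 - np.sum((x - xi) ** 2) / R**2)
    return s * s

def grad_phi(x):
    """Gradient of the bump: -4 s (x - xi) / R^2 where s > 0, zero outside B_R(xi)."""
    s = max(0.0, 1.0 - np.sum((x - xi) ** 2) / R**2)
    return -4.0 * s * (x - xi) / R**2

C = 4.0 / R
rng = np.random.default_rng(3)
pts = rng.uniform(-1.5, 1.5, size=(500, 2))
ok = all(np.linalg.norm(grad_phi(x)) <= C * np.sqrt(phi(x)) + 1e-12 for x in pts)
print(ok)  # True
```

The bound \(|D\phi | \le C\phi ^{1/2}\) holds because \(|D\phi | = 4s|x-\xi |/R^2\) and \(\phi ^{1/2} = s\), with \(|x-\xi | \le R\) wherever \(s > 0\).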

Suppose the maximum of \(\phi \varPhi \) in \({\bar{B}}_R(\xi )\) is attained at \((y_0, z_0)\). By the maximum principle, at \((y_0,z_0)\) we have

$$\begin{aligned} D_i(\phi \varPhi ) = \phi D_i \varPhi + \varPhi D_i \phi = 0, \quad \text {and} \quad \sum _{i,j}D_{i,j}(\phi \varPhi ) \eta _i \eta _j \le 0, \end{aligned}$$

where the second condition means that the Hessian of \(\phi \varPhi \) at \((y_0,z_0)\) is negative semidefinite. Then, at \((y_0,z_0)\), we have

$$\begin{aligned}&0 \le -\frac{\sigma _2^2(z)}{2}D_{yy}(\phi \varPhi ) - \frac{\sigma _3^2}{2} D_{zz}(\phi \varPhi ) - (\sigma _2^2(z) + 2\gamma g_2) Q_y D_y(\phi \varPhi ) \nonumber \\&\qquad - \sigma _3^2 Q_z D_z(\phi \varPhi ) - ({\tilde{\mu }}(z) + \gamma g_1)D_y(\phi \varPhi ) - a(\bar{z} - z)D_z(\phi \varPhi )\nonumber \\&\quad = -\frac{\sigma _2^2(z)}{2}\phi D_{yy}\varPhi - \sigma _2^2(z) D_y \phi D_y \varPhi - \frac{\sigma _2^2(z)}{2} \varPhi D_{yy}\phi \nonumber \\&\qquad - \frac{\sigma _3^2}{2} \phi D_{zz}\varPhi - \sigma _3^2D_z \phi D_z \varPhi \nonumber \\&\qquad - \frac{\sigma _3^2}{2}\varPhi D_{zz}\phi - (\sigma _2^2(z) + 2\gamma g_2) Q_y (\phi D_y \varPhi + \varPhi D_y \phi ) \nonumber \\&\qquad - \sigma _3^2 Q_z (\phi D_z \varPhi + \varPhi D_z \phi )\nonumber \\&\qquad - ({\tilde{\mu }}(z) + \gamma g_1)(\phi D_y \varPhi + \varPhi D_y \phi ) - a(\bar{z} - z)(\phi D_z \varPhi + \varPhi D_z \phi )\nonumber \\&\quad = \phi \bigg \{ -\frac{\sigma _2^2(z)}{2} D_{yy}\varPhi - \frac{\sigma _3^2}{2} D_{zz}\varPhi - (\sigma _2^2(z) + 2\gamma g_2) Q_y D_y \varPhi - \sigma _3^2 Q_z D_z \varPhi \nonumber \\&\qquad - ({\tilde{\mu }}(z) + \gamma g_1) D_y \varPhi - a(\bar{z} - z) D_z \varPhi \bigg \} - \frac{\sigma _2^2(z)}{2} \varPhi D_{yy}\phi - \frac{\sigma _3^2}{2}\varPhi D_{zz}\phi \nonumber \\&\qquad - \sigma _2^2(z) D_y \phi D_y \varPhi - \sigma _3^2D_z \phi D_z \varPhi - (\sigma _2^2(z) + 2\gamma g_2) \varPhi Q_y D_y \phi - \sigma _3^2 \varPhi Q_z D_z \phi \nonumber \\&\qquad - ({\tilde{\mu }}(z) + \gamma g_1)\varPhi D_y \phi - a(\bar{z} - z)\varPhi D_z \phi \nonumber \\&\quad \le \phi \bigg \{ -\frac{\sigma _2^2(z)}{2} D_{yy}\varPhi - \frac{\sigma _3^2}{2} D_{zz}\varPhi - (\sigma _2^2(z) + 2\gamma g_2) Q_y D_y \varPhi - \sigma _3^2 Q_z D_z \varPhi \nonumber \\&\qquad - ({\tilde{\mu }}(z) + \gamma g_1) D_y \varPhi - a(\bar{z} - z) D_z \varPhi \bigg \} + C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2} \nonumber \\&\quad \le \phi \bigg \{C_R |DQ| + C_R|DQ|^2 + 
C_R|DQ|^3 - \frac{1}{8\nu _2}\left( \sigma _2^2(z) Q_{yy} + \sigma _3^2 Q_{zz}\right) ^2 \bigg \} \nonumber \\&\qquad + C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2}\nonumber \\&\quad = \phi \bigg \{C_R |DQ| + C_R|DQ|^2 + C_R|DQ|^3 \nonumber \\&\qquad - \frac{1}{2\nu _2}\Big [-\left( \frac{\sigma _2^2(z)}{2} + \gamma g_2\right) Q_y^2 - \frac{\sigma _3^2}{2}Q_z^2\nonumber \\&\qquad - ({\tilde{\mu }}(z) + \gamma g_1) Q_y - a(\bar{z} - z)Q_z - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\Big ]^2 \bigg \} \nonumber \\&\qquad + C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2}, \end{aligned}$$
(59)

where the last two steps are based on (58) and (52).

Note that there exist \(0< \mu _1 < \mu _2\) such that

$$\begin{aligned} \mu _1|\eta |^2 \le (\sigma _2^2(z) + 2\gamma g_2)\eta _1^2 + \sigma _3^2\eta _2^2 \le \mu _2|\eta |^2 \quad \text {for all} \,\, (y,z), \eta \in \mathbb {R}^2. \end{aligned}$$

In fact, we can take \(\mu _1 = \min \{L^2,\sigma _3^2\}\) and \(\mu _2 = \max \{U^2 + \gamma {\tilde{C}}_2, \sigma _3^2\}\), where \({\tilde{C}}_2\) is the bound of the coefficient of \(p^2\) in (36). Then, we have

$$\begin{aligned}&-\frac{1}{2}\big (\sigma _2^2(z) + 2\gamma g_2\big )Q_y^2 - \frac{\sigma _3^2}{2}Q_z^2 - ({\tilde{\mu }}(z) + \gamma g_1) Q_y - a(\bar{z} - z)Q_z\nonumber \\&\qquad -\, \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}} \nonumber \\&\quad \le -\mu _1 |DQ|^2 + C_R|DQ| - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\nonumber \\&\quad \le -\kappa |DQ|^2 + C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}, \end{aligned}$$
(60)

where \(\kappa >0\) is a constant that depends only on \(\mu _1\), and the last step follows from the inequality \(|DQ| \le \frac{1}{2}(|DQ|^2 + 1)\).

Now we split the problem up into cases. First, consider the case

$$\begin{aligned} -\kappa |DQ|^2 + C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}} \ge 0, \quad \text{ at } (y_0,z_0)\text{. } \end{aligned}$$

Then, it is easy to see that

$$\begin{aligned} \kappa |DQ|^2 (y_0,z_0) \le C_R + \beta - \gamma r. \end{aligned}$$

Noting that \((y_0,z_0)\) is the maximizer of \(\phi \varPhi \) in \(\bar{B}_R(\xi )\), we can get that

$$\begin{aligned} \frac{1}{2}|DQ|^2(\xi ) = \varPhi (\xi )\phi (\xi ) \le \varPhi (y_0,z_0)\phi (y_0,z_0). \end{aligned}$$

Therefore,

$$\begin{aligned} \kappa |DQ|^2 (\xi )\le & {} \kappa |DQ|^2 (y_0,z_0)\phi (y_0,z_0) \le \phi (y_0,z_0)[C_R + \beta - \gamma r] \\\le & {} C_R + \beta - \gamma r. \end{aligned}$$

So in this case, we have

$$\begin{aligned} |DQ|^2 (\xi ) \le C_R + \frac{1}{\kappa }(\beta - \gamma r). \end{aligned}$$
(61)

Next, consider the case \( -\kappa |DQ|^2 + C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}} \le 0\) at \((y_0,z_0)\). By virtue of (60), we obtain

$$\begin{aligned} \text{ RHS } \text{ of } (59)\le & {} \phi \bigg \{C_R |DQ| + C_R|DQ|^2 + C_R|DQ|^3 - \frac{1}{2\nu _2}\Big [-\kappa |DQ|^2 + C_R \nonumber \\&-\, \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\Big ]^2 \bigg \} + C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2}\nonumber \\= & {} \phi \bigg \{C_R |DQ| + C_R|DQ|^2 + C_R|DQ|^3 - \frac{1}{2\nu _2}\Big [\kappa ^2 |DQ|^4\nonumber \\&-\, 2\kappa |DQ|^2\Big (C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\Big ) \nonumber \\&+\, \Big (C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\Big )^2\Big ] \bigg \} \nonumber \\&+\, C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2} \nonumber \\\le & {} \phi \bigg \{C_R \varPhi ^{1/2} + C_R\varPhi + C_R\varPhi ^{3/2} - \frac{2\kappa ^2}{\nu _2}\varPhi ^2 \nonumber \\&+\, \frac{2\kappa }{\nu _2}\varPhi \Big (C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\Big )\Big ] \bigg \} \nonumber \\&+\, C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2}\nonumber \\= & {} \phi \bigg \{C_R \varPhi ^{1/2} + C_R\varPhi + C_R\varPhi ^{3/2} \nonumber \\&- \,\frac{2\kappa }{\nu _2}\varPhi \Big [\kappa \varPhi - \Big (C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\Big )\Big ] \bigg \} \nonumber \\&+\, C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2}, \end{aligned}$$
(62)

at \((y_0,z_0)\). If \(\kappa \varPhi \le C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}} \le C_R + \beta - \gamma r,\) then,

$$\begin{aligned} \kappa \varPhi (\xi ) = \kappa \phi (\xi ) \varPhi (\xi ) \le \kappa \phi (y_0,z_0) \varPhi (y_0,z_0) \le \kappa \varPhi (y_0,z_0) \le C_R + \beta - \gamma r,\nonumber \\ \end{aligned}$$
(63)

so that

$$\begin{aligned} |DQ|^2(\xi ) \le C_R + \frac{2}{\kappa }(\beta - \gamma r). \end{aligned}$$
(64)

If \(C_R \ge \varPhi (y_0,z_0)\), we have a similar result:

$$\begin{aligned} |DQ|^2(\xi ) \le C_R \le C_R + C(\beta - \gamma r), \end{aligned}$$
(65)

for some nonnegative constant C.

Finally, suppose both \(\kappa \varPhi \ge C_R - \gamma g_0 + \beta - \gamma r - (1-\gamma )e^{\frac{Q}{\gamma -1}}\) and \(C_R \le \varPhi (y_0,z_0)\). Then,

$$\begin{aligned} \text{ RHS } \text{ of } (62)\le & {} \phi \bigg \{\varPhi ^{3/2} + C_R\varPhi + C_R\varPhi ^{3/2} - \frac{2\kappa ^2}{\nu _2}\varPhi ^2 \bigg \} + C_R \varPhi + C \phi ^{1/2} \varPhi ^{3/2}\nonumber \\\le & {} -C_1\phi \varPhi ^2 + C_2 \phi ^{1/2}\varPhi ^{3/2} + C_3 C_R \varPhi \quad \text {at} \, (y_0,z_0), \end{aligned}$$
(66)

where \(C_1, C_2\), and \(C_3\) are positive constants independent of \(R, {{\tilde{R}}},\) and \((\beta - \gamma r)\). In inequality (66), we used the inequality

$$\begin{aligned} C_R\varPhi ^{3/2} = C_R\varPhi ^{1/2}\varPhi \le \frac{1}{2}(C_R^2\varPhi + \varPhi ^2) , \end{aligned}$$

and the fact that \(\phi \le 1\).

Let \(Y \equiv ( \phi (y_0,z_0)\varPhi (y_0,z_0))^{\frac{1}{2}}\). Then, from (66) we have

$$\begin{aligned} 0 \le - C_1 Y^2 + C_2 Y + {\tilde{C}}_3, \end{aligned}$$
(67)

where \({\tilde{C}}_3 = C_3C_R\). The quadratic on the right-hand side of (67) opens downward, so any Y at which it is nonnegative must lie between its two zeros. This, along with the fact that \(Y \ge 0\), implies

$$\begin{aligned} \phi \varPhi (y_0,z_0)= & {} Y^2 \le \left( \frac{C_2 + \sqrt{C_2^2 + 4 C_1{\tilde{C}}_3}}{2C_1}\right) ^2 \\\le & {} \frac{C_2^2}{2C_1^2} + \frac{C_2^2 + 4C_1{\tilde{C}}_3}{2C_1^2} = \frac{C_2^2}{C_1^2} + \frac{2C_3C_R}{C_1}. \end{aligned}$$
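Concretely, Y is bounded by the larger zero \(\big (C_2 + \sqrt{C_2^2 + 4C_1{\tilde{C}}_3}\big )/(2C_1)\) of the concave quadratic in (67); a quick numerical sketch with arbitrary test constants:

```python
import numpy as np

def root_bound(C1, C2, C3):
    """Larger zero of -C1*Y^2 + C2*Y + C3 (C1, C3 > 0); any Y >= 0 with
    -C1*Y^2 + C2*Y + C3 >= 0 must satisfy Y <= this value."""
    return (C2 + np.sqrt(C2**2 + 4 * C1 * C3)) / (2 * C1)

C1, C2, C3 = 1.5, 2.0, 4.0  # arbitrary test constants standing in for C_1, C_2, C~_3
Ymax = root_bound(C1, C2, C3)
Ys = np.linspace(0.0, 2 * Ymax, 1001)
feasible = Ys[-C1 * Ys**2 + C2 * Ys + C3 >= 0]  # grid points where the quadratic >= 0
print(bool(feasible.max() <= Ymax + 1e-9))  # True
```

Every nonnegative Y satisfying (67) sits below this root, which is how the gradient bound at \((y_0,z_0)\) is extracted.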

By virtue of \(|DQ|^2(\xi ) \le \phi (y_0,z_0)|DQ|^2(y_0,z_0)\), we obtain the bound

$$\begin{aligned} |DQ|^2(\xi ) \le C_R + C. \end{aligned}$$

In each case, we have a bound for \(|DQ|^2\) in \(B_R\) that can be written in the form \(C_R + C(\beta - \gamma r)\), where \(C_R\) is a constant depending only on R, and C is a nonnegative constant. \(\square \)

3.4 Existence of Solution to HJB Equation (22)

Now we are ready to prove the existence result.

Proof of Theorem 3.1

By Theorem 3.5, there is a unique solution \(Q^l\) to

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} Q_{yy} + \frac{\sigma _3^2}{2} Q_{zz} - H(y,z, Q, Q_y, Q_z) = 0, \quad &{}\text {on} \quad B_l,\\ Q = \tilde{Q}, \quad &{}\text {on} \quad \partial B_l, \end{array} \end{aligned}$$
(68)

for \(l = 1, 2, 3, \ldots \) Since \(\tilde{Q}\) and \(Q^{l+1}\) satisfy the following,

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} \tilde{Q}_{yy} + \frac{\sigma _3^2}{2} \tilde{Q}_{zz} - H(y,z, \tilde{Q}, \tilde{Q}_y, \tilde{Q}_z) \ge 0, &{}\forall (y,z)\in \mathbb {R}^2,\\ \frac{\sigma _2^2(z)}{2} Q^{l+1}_{yy} + \frac{\sigma _3^2}{2} Q^{l+1}_{zz} - H(y,z, Q^{l+1}, Q^{l+1}_y, Q^{l+1}_z) = 0, &{}\text {on} \ B_{l+1},\\ \tilde{Q} = Q^{l+1}, &{}\text {on} \ \partial B_{l+1}, \end{array} \end{aligned}$$
(69)

we see that they satisfy (37) with \(\tau =1\). Thus, by Lemma 3.3,

$$\begin{aligned} \tilde{Q} \le Q^{l+1} \quad \text {in} \,\, {\bar{B}}_{l+1}. \end{aligned}$$

In particular,

$$\begin{aligned} \tilde{Q} \le Q^{l+1} \quad \text {on} \,\, \partial B_{l}. \end{aligned}$$

On \(\partial B_l\), we also have \(Q^l = \tilde{Q}\). Therefore,

$$\begin{aligned} Q^l \le Q^{l+1} \quad \text {on} \,\, \partial B_l. \end{aligned}$$

Now we see that \(Q^l\) and \(Q^{l+1}\) satisfy

$$\begin{aligned} \begin{array}{ll} \frac{\sigma _2^2(z)}{2} Q^l_{yy} + \frac{\sigma _3^2}{2} Q^l_{zz} - H(y,z, Q^l, Q^l_y, Q^l_z) = 0, \quad &{}\text {on} \quad B_l,\\ \frac{\sigma _2^2(z)}{2} Q^{l+1}_{yy} + \frac{\sigma _3^2}{2} Q^{l+1}_{zz} - H(y,z, Q^{l+1}, Q^{l+1}_y, Q^{l+1}_z) = 0, \quad &{}\text {on} \quad B_{l},\\ Q^l \le Q^{l+1}, \quad &{}\text {on} \quad \partial B_{l}. \end{array} \end{aligned}$$
(70)

By Lemma 3.3 again,

$$\begin{aligned} Q^l \le Q^{l+1} \quad \text {in} \,\, B_l. \end{aligned}$$

Thus, for any k such that \(|(y, z)| \le k\), \(Q^l(y,z)\) is nondecreasing in l for \(l > k\), and is bounded above by \(\hat{Q}\). Letting \(l \rightarrow \infty \), we conclude that \(\{Q^l(y,z)\}\) converges pointwise to some function \({\tilde{Q}}(y,z)\).

By Theorem 3.6, we have a uniform bound for \(|DQ^l|^2\) on \({\bar{B}}_R\) for any \(R > 0\). By the Arzelà–Ascoli theorem, \(\{Q^l\}\) contains a subsequence that converges to a function \( Q \in C^{2,\beta }({\bar{B}}_R)\) as \(l \rightarrow \infty \). Since \(\{Q^l\}\) also converges pointwise to \({\tilde{Q}}\), we must have \({\tilde{Q}} \equiv Q\). By (48) and (50), it follows that \({\tilde{Q}}\) is a solution to (22) on \({\bar{B}}_R\) for any R. Letting \(R \rightarrow \infty \), we conclude that \({\tilde{Q}}\) is a solution to (22) on \(\mathbb {R}^2\). \(\square \)

4 The Verification Theorem

In Sect. 3, we proved the existence of a classical solution \({\tilde{Q}}(y,z)\) to HJB equation (22). In this section, we prove that \({\tilde{V}} = \frac{1}{\gamma }x^\gamma e^{{\tilde{Q}}}\) is equal to the value function given by (13). In effect, we maximize the expected discounted utility of an investor whose net worth depends on stochastic dividends and stochastic volatility of the stock price.

We begin by stating a useful result.

Lemma 4.1

Let \({\tilde{Q}}(y,z)\) be a classical solution to (22) such that

$$\begin{aligned} K_1 \le {\tilde{Q}} \le K_2, \end{aligned}$$

where \(K_1\) and \(K_2\) are defined by (29). Define \(k^*(y,z)\) and \( c^*(y,z)\) by (20) and (21), respectively. Then, under the control policies \((k^*, c^*)\), we have

$$\begin{aligned} \mathbf {E}[X_T^m] < \infty \end{aligned}$$
(71)

for all fixed \(T\ge 0\) and \(m > 0\).

The proof can be found in “Appendix I”. \(\square \)

We now state and prove the verification theorem.

Theorem 4.1

(Verification Theorem) Suppose \(0< \gamma < 1\) and (28) holds. Let \(\tilde{Q}(y,z)\) denote a classical solution of (22) which satisfies \(K_1 \le {\tilde{Q}} \le K_2\), where \(K_1, K_2\) are defined by (29). Denote

$$\begin{aligned} {\tilde{V}}(x,y,z) \equiv \frac{1}{\gamma }x^\gamma e^{{\tilde{Q}}(y,z)}. \end{aligned}$$
(72)

Then, we have

$$\begin{aligned} {\tilde{V}}(x,y,z) \equiv V(x,y,z), \end{aligned}$$

where V(xyz) is the value function defined by (13). Moreover, the optimal control policy is

$$\begin{aligned} k^*(y,z)= & {} \left[ \displaystyle \frac{be^{-y} + \mu - r + (\sigma _2(z)^2 + \rho \sigma _1\sigma _2(z) e^{-y})\tilde{Q}_y(y,z)}{(1-\gamma )q(y,z)}\right] ^+, \end{aligned}$$
(73)
$$\begin{aligned} c^*(y,z)= & {} e^{\frac{{\tilde{Q}}(y,z)}{\gamma -1}}. \end{aligned}$$
(74)

Before we give the proof of the above theorem, we would like to make the following remark:

Remark 4.1

From (22), we see that its solution \(\tilde{Q}\) depends not only on the average dividend rate b, but also on the stochastic dividend volatility parameter \(\sigma _1\); the same is true for the value function V. In addition, the optimal investment control \(k^*\), given by (73), and the optimal consumption control \(c^*\), given by (74), both depend on b and \(\sigma _1\), explicitly or implicitly, through their dependence on \(\tilde{Q}\). Therefore, in the stochastic dividend case, both the average dividend rate (in terms of b) and the volatility of the dividend rate (in terms of \(\sigma _1\)) affect the value function as well as the optimal investment and consumption controls.

Proof of Theorem 4.1

Since \({\tilde{Q}}\) is a classical solution of (22), it is not hard to show that \({\tilde{V}}\), given by (72), is a classical solution of (14). For any admissible control \((k_t,c_t) \in \varPi \), using Itô's rule, we can get

$$\begin{aligned} d{\tilde{V}}(X_t,Y_t,Z_t)= & {} \Big [\frac{k_t^2X_t^2 q(Y_t,Z_t)}{2} {\tilde{V}}_{xx}+ \frac{\sigma _2(Z_t)^2}{2}{\tilde{V}}_{yy} + \frac{\sigma _3^2}{2}{\tilde{V}}_{zz} \\&+\, (\sigma _2(Z_t)^2 +\rho \sigma _1\sigma _2(Z_t) e^{-Y_t})k_tX_t {\tilde{V}}_{xy}+ {\tilde{\mu }}(Z_t) {\tilde{V}}_y \\&+\, (r-c_t)X_t{\tilde{V}}_x + (be^{-Y_t} + \mu - r) k_tX_t {\tilde{V}}_x+ a(\bar{z} - Z_t){\tilde{V}}_z \Big ]\mathrm {d}t \\&+ \sigma _1 e^{-Y_t} k_t X_t {\tilde{V}}_x \mathrm {d}B_{1,t} + \Big [\sigma _2(Z_t) k_tX_t {\tilde{V}}_x + {\tilde{\sigma }}_1(Z_t) {\tilde{V}}_y \Big ] \mathrm {d}B_{2,t}\\&+ \,\sigma _3 {\tilde{V}}_z \mathrm {d}B_{3,t}. \end{aligned}$$

Further, by Itô's rule, we have

$$\begin{aligned} d[e^{-\beta t}{\tilde{V}}(X_t,Y_t, Z_t)] = e^{-\beta t} d\tilde{V}(X_t,Y_t,Z_t)-\beta e^{-\beta t} {\tilde{V}}(X_t, Y_t, Z_t)\mathrm {d}t . \end{aligned}$$
(75)

Then, using the fact that \(\tilde{V}\) is a solution of (14), we can get

$$\begin{aligned}&e^{-\beta T} {\tilde{V}}(X_T, Y_T,Z_T) - {\tilde{V}}(x,y,z) \nonumber \\&\quad = \int _0^T e^{-\beta t} \mathrm{d}{\tilde{V}}(X_t, Y_t,Z_t) - \int _0^T \beta e^{-\beta t} {\tilde{V}} (X_t,Y_t,Z_t)\mathrm{d}t \nonumber \\&\quad = \int _0^T e^{-\beta t} \bigg [ \frac{k_t^2X_t^2 q(Y_t,Z_t)}{2} {\tilde{V}}_{xx} +\frac{\sigma _2(Z_t)^2}{2}{\tilde{V}}_{yy} + \frac{\sigma _3^2}{2}{\tilde{V}}_{zz} + {\tilde{\mu }}(Z_t) {\tilde{V}}_y \nonumber \\&\qquad +\, (r- c_t)X_t {\tilde{V}}_x+ (be^{-Y_t} + \mu - r) k_tX_t {\tilde{V}}_x + a(\bar{z} - Z_t){\tilde{V}}_z \nonumber \\&\qquad +\, k_tX_t (\sigma _2(Z_t)^2 + \rho \sigma _1\sigma _2(Z_t) e^{-Y_t}) {\tilde{V}}_{xy} - \beta {\tilde{V}} (X_t,Y_t,Z_t)\bigg ]\mathrm{d}t \nonumber \\&\qquad +\, m_{1, T} +m_{2, T}+m_{3, T}\nonumber \\&\quad \le - \int _0^T e^{-\beta t} \frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t + m_{1, T} + m_{2,T} + m_{3, T}, \end{aligned}$$
(76)

where

$$\begin{aligned}&m_{1, T} \equiv \int _0^T \sigma _1 e^{-Y_t} k_t X_t {\tilde{V}}_x \mathrm{d}B_{1,t}, \\&m_{2,T} \equiv \int _0^T \Big (\sigma _2(Z_t) k_tX_t {\tilde{V}}_x + {\tilde{\sigma }}_1(Z_t) {\tilde{V}}_y \Big ) \mathrm{d}B_{2,t}, \\&m_{3, T} \equiv \int _0^T \sigma _3 {\tilde{V}}_z \mathrm{d}B_{3,t} . \end{aligned}$$

Then, we can get

$$\begin{aligned} {\tilde{V}}(x,y,z)\ge & {} \int _0^T e^{-\beta t} \frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t + e^{-\beta T} {\tilde{V}}(X_T,Y_T, Z_T) \nonumber \\&-\, m_{1, T} - m_{2,T} - m_{3, T}. \end{aligned}$$
(77)

It is easy to show that \(m_{1, T}, m_{2, T} \), and \(m_{3, T} \) are local martingales. Define

$$\begin{aligned} \tau _R\equiv \inf \left\{ t\ge 0;\, X_t^2 + Y_t^2 + Z_t^2 = R^2 \right\} . \end{aligned}$$

Then, we have

$$\begin{aligned} \mathbf {E} [m_{1, {T\wedge \tau _R}}] =\mathbf {E}[ m_{2, T\wedge \tau _R}] = \mathbf {E}[ m_{3, T\wedge \tau _R}] = 0. \end{aligned}$$

Replacing T with \(T \wedge \tau _R\) in (77) and taking expectations, we arrive at the following:

$$\begin{aligned} {\tilde{V}}(x,y,z)\ge & {} \mathbf {E}\left[ \int _0^{T\wedge \tau _R} e^{-\beta t}\frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t +e^{-\beta (T\wedge \tau _R)} {\tilde{V}}(X_{T\wedge \tau _R}, Y_{T\wedge \tau _R}, Z_{T\wedge \tau _R})\right] \\\ge & {} \mathbf {E}\left[ \int _0^{T\wedge \tau _R} e^{-\beta t}\frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t\right] . \end{aligned}$$

The second inequality is true because \({\tilde{V}} > 0\). Now let R approach infinity, and by Fatou’s lemma, we can get that

$$\begin{aligned} {\tilde{V}}(x,y,z) \ge \liminf _{R \rightarrow \infty } \mathbf {E} \left[ \int _0^{T \wedge \tau _R} e^{-\beta t}\frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t\right] \ge \mathbf {E}\left[ \int _0^T e^{-\beta t}\frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t \right] . \end{aligned}$$

Now, letting \(T \rightarrow \infty \) and using the monotone convergence theorem, we have

$$\begin{aligned} {\tilde{V}}(x,y,z)\ge & {} \lim _{T \rightarrow \infty } \mathbf {E}\left[ \int _0^T e^{-\beta t}\frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t\right] \nonumber \\= & {} \mathbf {E}\left[ \int _0^\infty e^{-\beta t}\frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t \right] . \end{aligned}$$
(78)

Since (78) holds for arbitrary \((k_t,c_t) \in \varPi \), we have

$$\begin{aligned} {\tilde{V}}(x,y,z) \ge \sup _{(k_t, c_t) \in \varPi } \mathbf {E}\left[ \int _0^\infty e^{-\beta t}\frac{1}{\gamma }(c_tX_t)^\gamma \mathrm{d}t\right] = V(x,y,z). \end{aligned}$$
(79)

Next we show the reverse inequality, \({\tilde{V}}(x,y,z) \le V(x,y,z)\). For \((k^*, c^*)\) given by (73) and (74), it is easy to check that \((k_t^*,c_t^*) \in \varPi \). Using \(k_t^*\) and \(c_t^*\) instead of arbitrary \(k_t, c_t > 0\), we have equality in (77):

$$\begin{aligned} {\tilde{V}}(x,y,z)= & {} \int _0^T e^{-\beta t}\frac{1}{\gamma }(c^*_tX_t)^\gamma \mathrm{d}t + e^{-\beta T} {\tilde{V}}(X_T,Y_T,Z_T) \nonumber \\&-\, m^*_{1,T} - m^*_{2,T} - m^*_{3,T}, \end{aligned}$$
(80)

where \(m^*_{1,T}, m^*_{2,T},\) and \( m^*_{3,T}\) are equal to the expressions for \(m_{1, T}, m_{2,T},\) and \(m_{3, T}\), respectively, with arbitrary \(k_t\) and \(c_t\) replaced with \(k^*_t\) and \(c^*_t\). It is not hard to verify that \(m^*_{1,T}, m^*_{2,T},\) and \( m^*_{3,T}\) are martingales (see the proof of Lemma 4.1). Therefore, we can get

$$\begin{aligned} {\tilde{V}}(x,y,z) = \mathbf {E} \left[ \int _0^T e^{-\beta t}\frac{1}{\gamma }(c^*_tX_t)^\gamma \mathrm{d}t\right] + \mathbf {E}\left[ e^{-\beta T} {\tilde{V}}(X_T,Y_T, Z_T)\right] . \end{aligned}$$
(81)

From (78), we can see that

$$\begin{aligned} \mathbf {E}\left[ \int _0^\infty e^{-\beta t} \frac{1}{\gamma }(c_t^*X_t)^\gamma \mathrm{d}t\right] \le {\tilde{V}}(x,y,z) < \infty . \end{aligned}$$
(82)

This implies that

$$\begin{aligned} \liminf _{T\rightarrow \infty } \, \mathbf {E}\left[ e^{-\beta T}\frac{1}{\gamma }(c^*_TX_T)^\gamma \right] = 0, \end{aligned}$$
(83)

which can be seen by contradiction: if the left-hand side of (83) were some \(\delta > 0\), then \(\mathbf {E}[e^{-\beta t}\frac{1}{\gamma }(c^*_tX_t)^\gamma ]\) would be bounded away from zero for all sufficiently large t, and the integral in (82) would diverge. Note that \(c_t^*\) is bounded below by a positive constant:

$$\begin{aligned} c_t^* \ge e^{\frac{K_2}{\gamma - 1}} \equiv {{\underline{c}}} > 0. \end{aligned}$$

Then, we have

$$\begin{aligned} 0 = \liminf _{T\rightarrow \infty } \, \mathbf {E}\left[ e^{-\beta T}\frac{1}{\gamma }(c^*_TX_T)^\gamma \right] \,\ge \,\, {\underline{c}}^\gamma \, \liminf _{T\rightarrow \infty } \, \mathbf {E}\left[ e^{-\beta T}\frac{1}{\gamma }X_T^\gamma \right] \ge 0, \end{aligned}$$

which implies

$$\begin{aligned} \liminf _{T\rightarrow \infty } \, \mathbf {E}\left[ e^{-\beta T}\frac{1}{\gamma }X_T^\gamma \right] = 0. \end{aligned}$$
(84)

Combining this with the fact that \(e^{{\tilde{Q}}} \le e^{K_2}\), we obtain

$$\begin{aligned} \liminf _{T\rightarrow \infty } \, \mathbf {E}\left[ e^{-\beta T}\tilde{V}(X_T,Y_T,Z_T)\right] = 0. \end{aligned}$$
(85)
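To make this step explicit, recall that \({\tilde{V}}\) has the product form \({\tilde{V}}(x,y,z) = \frac{1}{\gamma } x^\gamma e^{{\tilde{Q}}(y,z)}\) used in its construction earlier in the paper, and that \(\gamma \in (0,1)\); hence

$$\begin{aligned} 0 \le \liminf _{T\rightarrow \infty } \, \mathbf {E}\left[ e^{-\beta T}{\tilde{V}}(X_T,Y_T,Z_T)\right] \le e^{K_2} \liminf _{T\rightarrow \infty } \, \mathbf {E}\left[ e^{-\beta T}\frac{1}{\gamma }X_T^\gamma \right] = 0, \end{aligned}$$

where the last equality is (84).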

Now taking the liminf of (81) as T approaches infinity and using (85), we get

$$\begin{aligned} {\tilde{V}}(x,y,z)= & {} \liminf _{T\rightarrow \infty }\, \mathbf {E}\left[ \int _0^T e^{-\beta t}\frac{1}{\gamma }(c^*_tX_t)^\gamma \mathrm{d}t \right] \nonumber \\= & {} \mathbf {E}\left[ \int _0^\infty e^{-\beta t}\frac{1}{\gamma }(c^*_tX_t)^\gamma \mathrm{d}t \right] . \end{aligned}$$
(86)

The second equality follows from the monotone convergence theorem. Finally, by (86) and the definition of V, we have

$$\begin{aligned} {\tilde{V}}(x,y,z) \le V(x,y,z). \end{aligned}$$
(87)

Combining (79) and (87), we have \({\tilde{V}}(x,y,z) = V(x,y,z)\). \(\square \)
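As an illustrative sanity check on the verification result (not part of the proof), consider the degenerate special case with constant coefficients, i.e., no stochastic dividend or volatility, where the problem reduces to the classical Merton problem. For a constant strategy \((k,c)\) the wealth process is a geometric Brownian motion and the total discounted utility \(J(k,c)\) has a closed form, which the sketch below maximizes numerically; all parameter values are hypothetical.

```python
import numpy as np

# Classical Merton special case with CRRA utility u(c) = c^gamma / gamma.
# Under a constant strategy (k, c), wealth is a GBM and
#   J(k, c) = (c x)^gamma / (gamma * rho),
#   rho = beta - gamma*(r + k*(mu - r) - c) + 0.5*gamma*(1 - gamma)*k^2*sigma^2,
# valid whenever rho > 0.  The Merton fractions should maximize J.
# All parameter values below are illustrative.

mu, r, sigma, gamma, beta, x = 0.08, 0.03, 0.20, 0.5, 0.10, 1.0

def J(k, c):
    rho = beta - gamma * (r + k * (mu - r) - c) \
        + 0.5 * gamma * (1 - gamma) * k**2 * sigma**2
    return (c * x)**gamma / (gamma * rho) if rho > 0 else -np.inf

# Closed-form Merton optimum (constant investment fraction and consumption rate)
k_star = (mu - r) / ((1 - gamma) * sigma**2)
c_star = (beta - gamma * r
          - gamma * (mu - r)**2 / (2 * (1 - gamma) * sigma**2)) / (1 - gamma)

# Brute-force check: the Merton strategy beats a grid of constant alternatives
grid = [(k, c) for k in np.linspace(0.5, 4.0, 40)
               for c in np.linspace(0.02, 0.3, 40)]
best = max(grid, key=lambda kc: J(*kc))

print("Merton k*, c*:", k_star, c_star)
print("J at Merton optimum beats grid best:", J(k_star, c_star) >= J(*best))
```

With these parameters the Merton fractions are \(k^* = 2.5\) and \(c^* = 0.1075\), and the grid search confirms that no constant alternative achieves a higher discounted utility, mirroring the supremum characterization in (79).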

5 Conclusions

We consider in this paper a portfolio optimization problem with a stochastic dividend and stochastic volatility. The problem is formulated as a stochastic control problem, and the associated HJB equation is derived. We then establish existence results for the HJB equation by means of the subsolution–supersolution method and some PDE techniques. We verify that the solution of the HJB equation coincides with the value function, and we derive the optimal investment and consumption strategies. It turns out that both the average dividend rate and the volatility of the dividend influence the value function as well as the optimal investment and consumption strategies.

Several extensions of the work presented in this paper are possible. For example, one could consider other utility functions, such as exponential utility, or study the optimization problem over a finite-time horizon. Moreover, other realistic features, such as delay effects and transaction costs, could be added to the model. These are topics for future research.