1 Introduction

Designing dividend payment policies has long been an important problem in the actuarial science and finance literature. For early work on dividend-related issues, see [1], in which it is shown that the optimal dividend strategy is a barrier strategy under a simple random walk model. Gerber [2, 3] investigated the optimal dividend problem for both compound Poisson and diffusion models. Recently, there have been increasing efforts to apply advanced methods of stochastic control to the study of optimal dividend policies; see [4, 5].

Azcue and Muler [6] analyzed the maximization of total discounted dividend payments for an insurance company. Empirical studies indicate, in particular, that traditional surplus models fail to capture extreme movements such as market switches. To better reflect reality, much effort has been devoted to producing richer models. One recent trend is the use of regime-switching models. Hamilton [7] introduced a regime-switching time series model. Recent work on risk models and related issues can be found in [8]. In [9], the optimal dividend strategy with a restricted payment rate was studied for a regime-switching jump diffusion model. A comprehensive study of switching diffusions with “state-dependent” switching is presented in [10].

In this work, we consider the optimal strategy for the collective risk model under a Markovian regime-switching setting with a general claim size distribution. We allow the surplus to be invested in a continuous-time financial market and the dividend payment policy to be managed. In our model, following the golden rule “never borrow money to do risky investment,” the insurer cannot put too much money in risky assets, for the sake of risk management. Therefore, two constraints are imposed on the investment strategies: (i) the weight of the risky asset should be no more than one, and (ii) short selling of the risky asset is prohibited. In addition, the dividend process is not necessarily absolutely continuous; in practice, dividends are usually not paid out continuously. For instance, insurance companies may distribute dividends at discrete time epochs, resulting in an unbounded payment rate. In such a scenario, the surplus level changes drastically on a dividend payday, and abrupt or discontinuous changes occur because of the “singular” dividend distribution policy. Together with the investment strategies and the incurred claims, this gives rise to a mixed regular–singular stochastic control problem for a jump diffusion. We model the surplus process by a regime-switching jump diffusion process, and the investment strategies and dividend payment policies are introduced as regular and singular stochastic controls, respectively. The goal is to maximize the expected total discounted dividend payment until ruin.

The formulation of our model is general and versatile. Nevertheless, owing to the inclusion of both regular and singular controls and the random switching environment, closed-form solutions are virtually impossible to obtain, so we focus on developing numerical solutions. Azcue and Muler [6] considered the optimal investment policy and dividend payment strategy in an insurance company, but with independent and identically distributed claim sizes and without regime switching; the model we consider appears to be more versatile and realistic. To find the optimal investment and dividend payment strategies, one usually solves the associated Hamilton–Jacobi–Bellman equation. In our work, however, because of the regime-switching jump diffusion and the mixed regular and singular control formulation, the Hamilton–Jacobi–Bellman equation is in fact a coupled system of nonlinear integro-differential quasi-variational inequalities, for which a closed-form solution is out of reach. A viable alternative is to employ numerical approximations. In this work, we adapt the Markov chain approximation methodology developed in [11]. To the best of our knowledge, numerical methods for singular controls of regime-switching jump diffusions have not been studied in the literature to date. Even for singularly controlled diffusions without regime switching, the related results are relatively scarce; [12] and [13] are the only papers that carry out a convergence analysis of numerical schemes for singular control problems in the setting of Itô diffusions, using weak convergence and the relaxed control formulation. We focus on developing numerical methods that are applicable to mixed regular and singular controls for regime-switching jump diffusion models. In a recent work, Jin et al. [14] developed numerical algorithms for approximating optimal reinsurance and dividend payment policies under regime-switching diffusion models; there, one needs to deal with a system of quasi-variational inequalities. This paper further treats models with jumps, so we have to deal with a system of integro-differential quasi-variational inequalities, which makes the problem more complex and difficult to handle. Although the primary motivation stems from insurance risk controls, the techniques and algorithms suggested appear to be applicable to other singular control problems as well.

The rest of the paper is organized as follows. A general formulation of the optimal investment and dividend problem, together with the standing assumptions, is presented in Sect. 2; some properties of the optimal value function and the verification theorem are also presented there. Section 3 presents the numerical algorithm based on the Markov chain approximation method: the Poisson jumps, the regular control, and the singular control are approximated by the approximating Markov chain, and the dynamic programming equations are given. Section 4 deals with the convergence of the approximation scheme; the technique of “rescaling time” is introduced, and the convergence theorems are proved. Three numerical examples are provided in Sect. 5 to illustrate the performance of the approximation method. Finally, some additional remarks are provided in Sect. 6.

2 Formulation and Preliminaries

Following the classical risk model introduced in [15], we assume that X(t), the surplus of an insurance company in the absence of dividend payments and investment, follows the classical Cramér–Lundberg model,

$$ X(t)= x + ct -R(t), \quad t \geq0, $$
(1)

where x is the initial surplus, the constant c is the premium rate, and \(R(t)=\sum_{n=1}^{N(t)} A_{n}\) is a compound Poisson process with claim sizes \(A_{n}\).

To delineate the random environment and other random factors, we use a continuous-time Markov chain α(t) taking values in the finite space \(\mathcal{M}=\{1,\ldots,m\}\). The market states are represented by the Markov chain α(t), and they undergo a Markov regime switching. Let the continuous-time Markov chain α(t) be generated by \(Q=(q_{ij})\in\mathcal{R}^{m\times m}\). That is,

$$ \Pr \bigl\{ \alpha(t + \delta)=j | \alpha (t)=i, \alpha(s), s \leq t \bigr\} = \left \{ \begin{array}{l@{\quad}l}q_{ij}\delta+ o(\delta) & \mbox{if}\ j \neq i, \\[3pt] 1 + q_{ii}\delta+ o(\delta) & \mbox{if}\ j = i, \end{array} \right . $$
(2)

where \(q_{ij}\geq0\) for \(i,j=1,2,\ldots,m\) with \(j\neq i\), and \(q_{ii}=-\sum_{j\neq i}q_{ij}<0\) for each \(i=1,2,\ldots,m\).
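For instance, when m=2 (the case used in the numerical examples of Sect. 5), the generator has the form

$$ Q= \begin{pmatrix} -q_{12} & q_{12}\\ q_{21} & -q_{21} \end{pmatrix},\qquad q_{12},q_{21}>0, $$

so that, by (2), the chain leaves state 1 during a short interval of length δ with probability \(q_{12}\delta+o(\delta)\).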

The surplus process X(t) under consideration is a jump diffusion process with regime switching under singular control. For each \(i\in\mathcal{M}\), the premium rate is c(i)>0. Let \(\zeta_{n}\) be the inter-arrival time of the nth claim, \(\nu_{n}=\sum^{n}_{j=1}\zeta_{j}\), and

$$ N(t)=\max\{n\in\mathcal{N}: \nu_n\leq t\} $$
(3)

counts the number of claims up to time t; it is a Poisson counting process. The function q(X,i,ρ) gives the magnitude of the claim sizes, where ρ has distribution Π(⋅). Note that our formulation is general: the claim sizes are allowed to depend on the switching regime, and at different regimes the values of q can be quite different, which takes the random environment into consideration. The Poisson measure N(⋅) then has intensity \(\lambda\,dt\times\varPi(d\rho)\), where \(\varPi(d\rho)=f(\rho)\,d\rho\). Assume that q(⋅,i,ρ) is continuous for each ρ and each \(i\in\mathcal{M}\). Then the surplus process in the absence of dividend payment and investment is a regime-switching jump process given by

$$ X(t)= x + \int_0^t c \bigl(\alpha(s) \bigr)\,ds - \int_0^t\!\!\int_{\mathcal{R}_+}q \bigl(X(s^-), \alpha(s),\rho \bigr)\, N(ds,d\rho). $$
(4)

We consider the financial market with a risk-free asset B(t) and a risky asset S(t) with prices satisfying

$$ \left \{ \begin{array}{l}\displaystyle \frac{dB(t)}{B(t)}=l \bigl(\alpha(t) \bigr)\,dt, \\[12pt] \displaystyle \frac{dS(t)}{S(t)}=b \bigl(\alpha(t) \bigr)\,dt + \sigma \bigl(\alpha(t) \bigr)\,dW(t), \end{array} \right . $$
(5)

where for each \(i\in\mathcal{M}\), l(i) and b(i) are the return rates of the risk-free and risky assets, respectively, σ(α(t)) is the corresponding volatility, and W(t) is a standard Brownian motion. The investment behavior of the insurer is modeled by a portfolio process u(t), where the proportion u(t)∈[0,1] of the surplus is invested in the risky asset S(t). We work on a filtered probability space \((\varOmega, \mathcal{F}, \{\mathcal{F}_{t}\}, P)\), where \(\mathcal{F}_{t}\) is the σ-algebra generated by the random variables {α(s),W(s),N(s):0≤s≤t}.

A dividend strategy Z(⋅) is an \(\mathcal{F}_{t}\)-adapted process {Z(t):t≥0} corresponding to the accumulated amount of dividends paid up to time t such that Z(t) is a nonnegative and nondecreasing stochastic process that is right continuous with left limits. Throughout the paper, we use the convention that Z(0)=0. The surplus process considering dividend payment and investment satisfies the stochastic differential equation

$$ \left \{ \begin{array}{l}\displaystyle dX(t)= \bigl[ \bigl[l \bigl(\alpha(t) \bigr) \bigl(1-u(t) \bigr)+u(t) b \bigl(\alpha(t) \bigr) \bigr]X(t)+ c (\alpha (t) ) \bigr]\,dt \\[2pt] \displaystyle\quad\quad\quad\quad{}+ u(t)\sigma \bigl(\alpha(t) \bigr)X(t)\,dW(t) -dR(t)-dZ(t), \\[3pt] \displaystyle R(t) =\int_0^t\!\!\int_{\mathcal{R}_+}q \bigl(X(t^-), \alpha(t),\rho \bigr) N(dt,d\rho),\\ [10pt] \displaystyle X(0)=x, \end{array} \right . $$
(6)

for all t<τ, and we impose X(t)=0 for all t>τ, where τ=inf{t>0:X(t)≤0} represents the time of ruin. The jump size of Z is denoted by \(\Delta Z(t):=Z(t)-Z(t^{-})\), and \(Z^{c}(t):=Z(t)-\sum_{0\le s\le t}\Delta Z(s)\) denotes the continuous part of Z. Also note that \(\Delta X(t):=X(t)-X(t^{-})=-\Delta Z(t)\) for any t≥0.
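To make the dynamics (6) concrete, the following Python sketch simulates one path of the surplus under a two-state regime-switching market; the parameter values, the exponential claim-size law, the constant investment proportion, and the barrier-type dividend rule are illustrative assumptions and are not part of the model specification.

```python
import numpy as np

# Illustrative Euler-type simulation of one path of the surplus dynamics (6).
# All numerical values, the exponential claim law, the constant investment
# proportion, and the barrier dividend rule are assumptions for illustration only.
rng = np.random.default_rng(0)

T, dt = 10.0, 1e-3                       # horizon and Euler step
l = {1: 0.03, 2: 0.04}                   # risk-free rates l(i)
b = {1: 0.06, 2: 0.08}                   # risky return rates b(i)
sig = {1: 0.20, 2: 0.40}                 # volatilities sigma(i)
c = {1: 1.0, 2: 3.0}                     # premium rates c(i)
lam = 4.0                                # claim intensity lambda
leave_rate = {1: 0.5, 2: 0.5}            # rate of leaving each regime (assumed generator)
u_const = 0.5                            # constant investment proportion u(t)
barrier = 15.0                           # pay out surplus above this level (assumed rule)

x, regime, dividends = 10.0, 1, 0.0
for _ in range(int(T / dt)):
    if rng.random() < leave_rate[regime] * dt:      # regime switch
        regime = 2 if regime == 1 else 1
    drift = (l[regime] * (1 - u_const) + u_const * b[regime]) * x + c[regime]
    x += drift * dt + u_const * sig[regime] * x * np.sqrt(dt) * rng.standard_normal()
    if rng.random() < lam * dt:                     # a claim arrives
        x -= rng.exponential(0.5)                   # claim size q(x, i, rho), assumed exponential
    if x > barrier:                                 # barrier-type singular dividend payment
        dividends += x - barrier
        x = barrier
    if x <= 0:                                      # ruin
        break

print(f"final surplus {x:.3f}, dividends paid so far {dividends:.3f}")
```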

Remark 2.1

From a numerical approximation point of view, making c, b, and σ X-dependent will not introduce any essential difficulty.

Denote by r>0 the discount factor. For an arbitrary admissible pair π=(u,Z), the expected discounted payoff is

$$ J(x,i,\pi):= E_{x,i} \biggl[\int _{0}^{\tau} e^{- rt} dZ(t) \biggr]. $$
(7)

We require r>b(i) for all \(i \in\mathcal{M}\); otherwise, the optimal value of the payoff would be infinite. The pair π=(u,Z) is said to be admissible if u and Z satisfy

  1. (i)

    u(t) and Z(t) are nonnegative for any t≥0,

  2. (ii)

    Z is right continuous, has left limits, and is nondecreasing,

  3. (iii)

    X(t)≥0 for any t≤τ,

  4. (iv)

    both u and Z are adapted to \(\mathcal{F}_{t}:=\sigma \{W(s),\alpha(s),N(s), 0\le s \le t \}\) augmented by the P-null sets, and

  5. (v)

    J(x,i,π)<∞ for any \((x, i)\in G\times \mathcal{M}\) and admissible pair π=(u,Z), where J is the functional defined in (7).

Denote by \(\mathcal{A} \) the collection of all admissible pairs, and U the collection of all investment strategies, which is assumed to be a compact set. Define the value function as

$$ V(x,i) := \sup_{\pi\in\mathcal{A}} J(x,i, \pi). $$
(8)

To proceed, we introduce the following notation:

$$\bar{b}:=\max_{i\in\mathcal{M}}b(i),\qquad \bar{c}:=\max_{i\in\mathcal{M}}c(i),\qquad \tilde{c}:=\min_{i\in\mathcal{M}}c(i),\qquad \tilde{l}:=\min_{i\in\mathcal{M}}l(i).$$

Lemma 2.1

Consider the process Y(⋅) satisfying

$$ dY(t)= \bigl[ \bigl[l \bigl(\alpha(t) \bigr) \bigl(1-u(t) \bigr)+u(t) b \bigl(\alpha(t) \bigr) \bigr]Y(t)+ c \bigl(\alpha(t) \bigr) \bigr]\,dt + u(t)\sigma \bigl(\alpha(t) \bigr)Y(t)\,dW(t) $$

with Y(0)=x. We have

$$E_{x,i} \bigl(e^{-rt}Y(t) \bigr)\leq e^{-(r-\bar{b})t} \biggl(x + \frac{\bar {c}(1-e^{-\tilde{l}t})}{\tilde{l}} \biggr). $$

Proof

This result is obtained by using estimates for the linear SDE; see also [16]. □
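To see where the bound comes from, here is a sketch of the estimate, assuming (as above) that bars and tildes denote maxima and minima over \(\mathcal{M}\) and that the drift coefficient \([l(i)(1-u)+ub(i)]\) lies between \(\tilde{l}\) and \(\bar{b}\). By the variation-of-constants formula for the linear SDE, using that the stochastic exponential part has expectation one,

$$ E_{x,i}Y(t)\;\le\; e^{\bar{b}t}x+\bar{c}\int_{0}^{t}e^{\bar{b}(t-s)}\,ds \;\le\; e^{\bar{b}t}\biggl(x+\bar{c}\int_{0}^{t}e^{-\tilde{l}s}\,ds\biggr) \;=\;e^{\bar{b}t}\biggl(x+\frac{\bar{c}(1-e^{-\tilde{l}t})}{\tilde{l}}\biggr), $$

and multiplying by \(e^{-rt}\) yields the stated bound.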

Proposition 2.1

For all x>0 and \(i\in\mathcal{M}\), the value function V(x,i) satisfies

$$x + \frac{\tilde{c}}{\lambda+ r} \leq V(x, i) \leq \frac{\bar{b}}{r-\bar{b}} \biggl(1+ \frac{\bar{c}}{\tilde{l}} \biggr) + \frac{\bar{c}}{r}-\frac{\bar{b}\bar{c}}{\tilde{l}(r-\bar {b}+\tilde{c})}. $$

Proof

Since the claim sizes are always nonnegative, we have X(t)≤Y(t). Then

$$E_{x,i} \bigl[e^{-rt} X(t) \bigr] \leq e^{(\bar{b}-r)t} \biggl(x + \frac{\bar {c}(1-e^{-\tilde{l}t})}{\tilde{l}} \biggr). $$

It follows that

Thus, the second inequality is obtained.

To prove the first inequality, consider the admissible strategy \(\hat{\pi}\) that pays out the initial surplus as dividends immediately and then pays the incoming premium as dividends until the first claim arrives at time \(\hat{\tau}\), which leads to ruin. The payoff under strategy \(\hat{\pi}\) then satisfies

$$J(x,i,\hat{\pi})= x + E_{x,i}\int_0^{\hat{\tau}} e^{-rt}c \bigl(\alpha (t) \bigr)\,dt \geq x + \frac{\tilde{c}}{\lambda+ r}. $$

Since \(V(x,i) \geq J(x,i,\hat{\pi})\), we get the first inequality. □
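The inequality used in the proof above can be verified directly: since \(\hat{\tau}\) is the first claim time and is exponentially distributed with rate λ, and \(c(i)\geq\tilde{c}\),

$$ E_{x,i}\int_{0}^{\hat{\tau}}e^{-rt}c \bigl(\alpha(t) \bigr)\,dt \;\geq\;\tilde{c}\,E\int_{0}^{\hat{\tau}}e^{-rt}\,dt \;=\;\frac{\tilde{c}}{r} \bigl(1-Ee^{-r\hat{\tau}} \bigr) \;=\;\frac{\tilde{c}}{r}\biggl(1-\frac{\lambda}{\lambda+r}\biggr) \;=\;\frac{\tilde{c}}{\lambda+r}. $$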

Proposition 2.2

For all xy>0, and \(i\in\mathcal{M}\), the value function V(x,i) satisfies

$$x-y \leq V(x,i)-V(y,i) \leq \bigl(e^{(r+\lambda) (x-y)/\tilde{c}}-1 \bigr)V(y,i). $$

Proof

Consider an admissible strategy \(\pi_{0}\) with \(J(y,i,\pi_{0})\geq V(y,i)-\varepsilon\) for any ε>0. For any x≥y, we define a new strategy \(\pi_{1}\) that pays x−y as dividends immediately and then follows \(\pi_{0}\). Then for any ε>0, it holds that

$$V(x,i) \geq x-y + J(y, i, \pi_0) \geq x-y + V(y, i)-\varepsilon. $$

Since ε is arbitrary, we have V(x,i)≥x−y+V(y,i).

For any x≥y, we take another admissible strategy \(\pi_{2}\) with \(J(x,i,\pi_{2})\geq V(x,i)-\varepsilon\) for any ε>0. We define a further strategy \(\pi_{3}\) that pays no dividends, keeps the money in the risk-free asset, and follows \(\pi_{2}\) once the surplus reaches x at time \(\hat{t}\). The probability of reaching x before the first claim is \(e^{-\lambda\hat{t}}\). Since \(\hat{t} < (x-y)/\tilde{c}\), we have that

$$V(y,i)\geq J(y, i, \pi_3) \geq J(x, i, \pi_2)e^{-(r+\lambda) (x-y)/\tilde{c}} \geq \bigl(V(x,i)-\varepsilon \bigr)e^{-(r+\lambda) (x-y)/\tilde{c}}. $$

Thus, the right inequality is obtained by the arbitrariness of ε. □

For an arbitrary \(\pi\in\mathcal{A}\), \(i=\alpha(t) \in\mathcal {M}\), and \(V(\cdot,i)\in C^{2}(\mathcal{R})\), define the operator \(\mathcal{L}^{\pi}\) by

$$ \mathcal{L}^{\pi}V(x,i)= \bigl[ \bigl[l(i) (1-u)+u b(i) \bigr]x+c(i) \bigr]V_{x}(x,i) +\frac{1}{2}\sigma^{2}(i)u^{2}x^{2}V_{xx}(x,i) +\lambda\int_{\mathcal{R}_{+}} \bigl[V \bigl(x-q(x,i,\rho),i \bigr)-V(x,i) \bigr]\varPi(d\rho) +QV(x,\cdot) (i), $$
(9)

where \(V_{x}\) and \(V_{xx}\) denote the first and second derivatives of V with respect to x, and

$$Q V(x,\cdot) (i)=\sum_{j\neq i}q_{ij} \bigl(V(x,j)-V(x,i) \bigr). $$

Formally, we obtain that V satisfies the following coupled system of integro-differential quasi-variational inequalities (QVIs):

$$ \left \{ \begin{array}{l@{\quad}l} \displaystyle \max \bigl\{ \mathcal{L}^\pi V(x,i)- rV(x,i) , 1- V_x(x,i) \bigr\} =0 &\hbox{for each }\ i\in\mathcal{M},\\[4pt] \displaystyle V(0, i)=0 & \hbox{for each}\ i\in\mathcal{M}.\\ \end{array} \right . $$
(10)

Remark 2.2

The value function V(x,α) is not smooth enough in our problem, in which case a classical solution of the QVIs cannot be obtained. An alternative notion of solution to the quasi-variational inequalities (10) is that of a viscosity solution (see [17]). In our work, we focus on numerical solutions; working with viscosity solutions does not lead to any difficulty in the numerical approximation.

3 Numerical Algorithm

Our goal is to design a numerical scheme to approximate the value function V in (8). As a standing assumption, we assume that V(⋅) is continuous with respect to x. In this section we will construct a locally consistent Markov chain approximation for the jump diffusion model with singular control and regime switching. The discrete-time and finite-state controlled Markov chain is so defined that it is locally consistent with (6). First let us recall some facts of Poisson random measure that are useful for constructing the approximating Markov chain and for the convergence theorem.

There is an equivalent way to define the process (6) by working with the claim times and values. To do this, set \(\nu_{0}=0\), let \(\nu_{n}\), n≥1, denote the time of the nth claim, and let \(q(\cdot,\cdot,\rho_{n})\) be the corresponding claim magnitude for a suitable function q(⋅). Let \(\{\nu_{n+1}-\nu_{n},\rho_{n}, n<\infty\}\) be mutually independent random variables with \(\nu_{n+1}-\nu_{n}\) exponentially distributed with mean 1/λ, and let \(\rho_{n}\) have distribution Π(⋅). Furthermore, we assume that \(\{\nu_{k+1}-\nu_{k},\rho_{k}, k\geq n\}\) is independent of \(\{x(s),\alpha(s), s<\nu_{n}; \nu_{k+1}-\nu_{k},\rho_{k}, k<n\}\). Then the nth claim term is \(q(X(\nu_{n}^{-}), \alpha(\nu_{n}), \rho_{n})\), and the claim amount R(t) can be written as

$$R(t)=\sum_{\nu_n\leq t}q \bigl(X \bigl(\nu_n^- \bigr), \alpha(\nu_n), \rho_n \bigr). $$

We note the local properties of claims for (6). Because \(\nu_{n+1}-\nu_{n}\) is exponentially distributed, we can write

$$ P \bigl\{\hbox{claim occurs on} \ [t,t+\Delta)\mid x(s), \alpha(s),W(s), N(s), s\leq t \bigr\}=\lambda\Delta+o(\Delta). $$
(11)

By the independence and the definition of \(\rho_{n}\), for any \(H\in\mathcal{B}(\mathcal{R}_{+})\), we have

$$ P \bigl\{\rho_{n}\in H \mid \nu_{k},\rho_{k}, k<n; \nu_{n}; x(s),\alpha(s), s<\nu_{n} \bigr\}=\varPi(H). $$
(12)

It is implied by the above discussion that x(⋅) satisfying (6) can be viewed as a process that involves regime-switching diffusion with claims according to the claim rate defined by (11). Given that the nth claim occurs at time ν n , we construct the values according to the conditional probability law (12) or, equivalently, write it as \(q(X(\nu_{n}^{-}), \alpha(\nu_{n}), \rho_{n})\). Then the process given in (6) is a switching diffusion process until the time of the next claim. To begin, we construct a discrete-time, finite-state, controlled Markov chain to approximate the controlled diffusion process with regime switching, with the dynamic system

$$ \left \{ \begin{array}{l}\displaystyle dX(t)= \bigl[ \bigl[l \bigl(\alpha(t) \bigr) \bigl(1-u(t) \bigr)+u(t) b \bigl(\alpha(t) \bigr) \bigr]X(t)+ c \bigl(\alpha (t) \bigr) \bigr]\,dt \\[3pt] \quad\quad\quad\quad{}+ u(t)\sigma \bigl(\alpha(t) \bigr)X(t)\,dW(t)-dZ, \\[5pt] \displaystyle X(0)=x. \end{array} \right . $$
(13)

We will construct a locally consistent Markov chain approximation for the mixed regular–singular control model with regime switching. The discrete-time controlled Markov chain is so defined that it is locally consistent with (6). Note that the state of the process has two components, x and α. Hence, in order to use the methodology in [11], our approximating Markov chain must have two components: one component delineates the diffusive behavior, whereas the other keeps track of the regimes. Let h>0 be a discretization parameter representing the step size. Define \(L_{h}=\{x: x=kh, k=0,\pm1,\pm2,\ldots\}\) and \(S_{h}=L_{h}\cap G_{h}\), where \(G_{h}=(0,B+h)\), and B is an upper bound introduced for numerical computation purposes. Moreover, assume without loss of generality that the boundary point B is an integer multiple of h. Let \(\{(\xi_{n}^{h}, \alpha_{n}^{h}), n<\infty \}\) be a controlled discrete-time Markov chain on \(S_{h} \times\mathcal{M}\), and denote by \(p_{D}^{h}((x,\ell),(y,\iota)|\pi^{h})\) the transition probability from a state (x,ℓ) to another state (y,ι) under the control π^h. We need to define \(p_{D}^{h}\) so that the chain’s evolution well approximates the local behavior of the controlled regime-switching diffusion (13). At any discrete time n, we can exercise either a regular control, a singular control, or a reflection step. That is, if we put \(\Delta\xi_{n}^{h} = \xi_{n+1}^{h} -\xi_{n}^{h}\),

(14)

The chain and the control will be chosen so that there is exactly one nonzero term in (14). Denote by \(\{I_{n}^{h}: n=0,1,\dots \}\) a sequence of control actions, where \(I_{n}^{h}=0, 1\), or 2, if we exercise a singular control, regular control, or reflection at time n, respectively.

When \(I_{n}^{h} =1\), we regard \(u_{n}^{h}\in U\) as the random variable that is the regular control action for the chain at time n. Let \(\tilde{\Delta}t^{h}(\cdot,\cdot,\cdot)>0\) be the interpolation interval on \(S_{h} \times\mathcal{M} \times U\). Assume that \(\inf_{x,\ell,u} \tilde{\Delta}t^{h}(x,\ell,u)>0\) for each h>0 and \(\lim_{h\to0} \sup_{x,\ell,u}\tilde{\Delta}t^{h}(x,\ell,u) =0\).

Let \(E_{x,\ell,n}^{u,h,1}\), \({\text{Var}}_{x,\ell,n}^{u,h,1}\), and \({P}_{x,\ell,n}^{u,h,1}\) denote the conditional expectation, variance, and probability given \(\{\xi_{k}^{h}, \alpha_{k}^{h}, u_{k}^{h}, I_{k}^{h}, k\le n, \xi_{n}^{h} = x, \alpha_{n}^{h} = \ell, I_{n}^{h}=1, u_{n}^{h} = u\}\), respectively. When

$$ \begin{aligned} &E_{x,\ell,n}^{u,h,1}\Delta\xi_{n}^{h}= \bigl[ \bigl[l(\ell) (1-u)+u b(\ell) \bigr]x+c(\ell) \bigr]\tilde{\Delta}t^{h}(x,\ell,u)+o \bigl(\tilde{\Delta}t^{h}(x,\ell,u) \bigr),\\ &{\text{Var}}_{x,\ell,n}^{u,h,1}\Delta\xi_{n}^{h}= \sigma^{2}(\ell)u^{2}x^{2}\,\tilde{\Delta}t^{h}(x,\ell,u)+o \bigl(\tilde{\Delta}t^{h}(x,\ell,u) \bigr),\\ &P_{x,\ell,n}^{u,h,1} \bigl\{\alpha_{n+1}^{h}=\iota \bigr\}= q_{\ell\iota}\tilde{\Delta}t^{h}(x,\ell,u)+o \bigl(\tilde{\Delta}t^{h}(x,\ell,u) \bigr)\quad\text{for } \iota\neq\ell,\\ &\sup_{n,\omega} \bigl|\Delta\xi_{n}^{h} \bigr|\to 0\quad\text{as } h\to0, \end{aligned} $$
(15)

the sequence \(\{(\xi_{n}^{h},\alpha_{n}^{h})\}\) is said to be locally consistent.

When \(I_{n}^{h} =0\), we regard \(\Delta z_{n}^{h}\) as the random variable that is the singular control action for the chain at time n if \(\xi_{n}^{h} \in[0, B]\). Note that \(\Delta\xi_{n}^{h} = -\Delta z_{n}^{h} = - h\). When \(I_{n}^{h} =2\), or \(\xi_{n}^{h}= B+h\), a reflection step is taken: the dividend is paid out to lower the surplus level, and the reflection takes the state from B+h back to B. That is, if we denote by \(\Delta g_{n}^{h}\) the random variable that is the reflection action for the chain at time n, then \(\Delta\xi_{n}^{h} = -\Delta g_{n}^{h} = -h\).

The singular control can be seen as a combination of “inside” part (\(I_{n}^{h} =0\)) and “boundary” part (\(I_{n}^{h} =2\)). Also, we require the singular control and reflection to be “impulsive” or “instantaneous.” In other words, the interpolation interval on \(S_{h} \times\mathcal {M}\times U\times \{0,1,2 \}\) is

$$ \Delta t^h(x,\ell,u,i)= \tilde{\Delta}t^h(x,\ell,u) I_{ \{i=1 \}} \quad\text{for any } (x, \ell,u,i) \in S_h \times\mathcal{M}\times U\times \{0,1,2 \}. $$
(16)

Denote by \(\pi^{h}:=\{\pi_{n}^{h}, n\ge0 \}\) the sequence of control actions, where

$$\pi_n^h:= \Delta z^h_n I_{\{I_n^h=0\}} + u_n^h I_{\{I_n^h=1\}}+ \Delta g^h_n I_{\{I_n^h=2\}}. $$

The control sequence \(\pi^{h}\) is admissible if \(\pi_{n}^{h}\) is \(\sigma\{(\xi_{0}^{h},\alpha_{0}^{h}),\dots,(\xi_{n}^{h},\alpha_{n}^{h}), \pi_{0}^{h},\dots, \pi^{h}_{n-1}\}\)-adapted and, for any \(E\in\mathcal{B}(S_{h}\times\mathcal{M})\), we have

$${P} \bigl\{ \bigl(\xi_{n+1}^h,\alpha_{n+1}^h \bigr) \in E |\sigma \bigl\{ \bigl(\xi_0^h, \alpha_0^h \bigr),\dots, \bigl(\xi_n^h, \alpha_n^h \bigr), \pi_0^h, \dots, \pi^h_{n} \bigr\} \bigr\} = p^h \bigl( \bigl( \xi_n^h,\alpha_n^h \bigr), E| \pi_n^h \bigr) $$

and

Put

$$t_0^h:=0,\quad t_n^h: = \sum _{k=0}^{n-1} \Delta t^h \bigl( \xi_k^h, \alpha_k^h, u_k^h, I_k^h \bigr),\quad \text{and}\quad n^h(t):= \max \bigl\{n: t_n^h \le t \bigr\}. $$

Then the piecewise constant interpolations, denoted by (ξ h(⋅),α h(⋅)), u h(⋅), g h(⋅), and z h(⋅), are naturally defined as

$$ \begin{array}{ll}&\displaystyle \xi^h(t) := \xi_n^h, \quad\quad \alpha^h(t) := \alpha_n^h, \quad\quad u^h(t) := u_n^h, \\ [8pt] & g^h(t) := \displaystyle\sum_{k\le n^h(t)} \Delta g_k^h I_{ \{I_k^h=2 \}}, \quad \quad z^h(t) := \sum_{k\le n^h(t)} \Delta z_k^h I_{ \{I_k^h=0 \}}, \end{array} $$
(17)

for \(t\in[t_{n}^{h}, t_{n+1}^{h})\). Let \(\eta_{h} := \inf \{n: \xi_{n}^{h} \in\partial G \}\). Then the first exit time of ξ h from G is \(\tau^{h}= t^{h}_{\eta_{h}}\). Let \((\xi_{0}^{h},\alpha_{0}^{h})=(x,\ell)\in S_{h}\times\mathcal{M}\), and let π h be an admissible control. The cost function for the controlled Markov chain is defined as

$$ J_B^h \bigl(x,\ell, \pi^h \bigr) := E\sum_{k=1}^{\eta_h -1 }e^{-r t_k^h} \Delta z_k^h , $$
(18)

which is analogous to (7). Regarding the definition of interpolation intervals in (16), the value function of the controlled Markov chain is

$$ V_{B}^{h}(x,\ell) := \sup_{\pi^{h}} J_{B}^{h} \bigl(x,\ell,\pi^{h} \bigr), $$
(19)

where the supremum is taken over all admissible control sequences \(\pi^{h}\).

We shall show that \(V^{h}_{B}(x,\ell)\) satisfies the dynamic programming equation

$$ V^h_B(x,\ell) = \left \{ \begin{array}{ll}\displaystyle\displaystyle\max_{u\in U} \biggl\{\sum_{(y,\iota)} e^{-r \Delta t^h (x,\ell,u, 1 )}p^h \bigl((x,\ell),(y,\iota)|\pi \bigr) V^h(y,\iota) , \\[14pt] \displaystyle \qquad \biggl[\displaystyle\sum_{(y,\iota)} p^h \bigl((x,\ell),(y,\iota)|\pi \bigr) V^h(y,\iota) + h \biggr] \biggr\}&\text{for } x \in S_h,\\ \displaystyle0 &\text{for } x=0. \end{array} \right . $$
(20)

Note that the discount factor does not appear in the second line above because the singular control is impulsive. In the actual computation, we use iteration in value space or iteration in policy space, together with Gauss–Seidel iteration, to solve for V^h. The computations are quite involved. In contrast to the usual state space S_h in [11], here we need to deal with the enlarged state space \(S_{h} \times\mathcal{M}\) due to the presence of regime switching.

Define the approximations to the first and second derivatives of V(⋅,ℓ) by the finite difference method in the first part of the QVIs (10), using step size h>0, as follows:

$$ \begin{array}{ll} &\displaystyle V(x,\ell) \to V^h(x,\ell) \\[6pt] &\displaystyle V_x(x,\ell) \to\frac{V^h(x+h, \ell)-V^h(x, \ell)}{h} \quad \hbox{for } \bigl[l(\ell)(1-u)+u b(\ell) \bigr]x+ c(\ell)>0 ,\\[10pt] &\displaystyle V_x(x,\ell) \to\frac{V^h(x,\ell)-V^h(x-h, \ell)}{h} \quad \hbox{for } \bigl[l(\ell)(1-u)+u b(\ell) \bigr]x+ c(\ell)<0, \\[10pt] &\displaystyle V_{xx}(x,\ell) \to\frac{V^h(x+h,\ell)-2V^h(x, \ell) + V^h(x-h, \ell)}{h^2}. \end{array} $$
(21)

For the second part of the QVIs, we choose

$$V_x(x,\ell) \to\frac{V^h(x,\ell)-V^h(x-h, \ell)}{h}. $$

It leads to

(22)

where \([[l(\ell)(1-u)+ub(\ell)]x+c(\ell)]^{+}\) and \([[l(\ell)(1-u)+ub(\ell)]x+c(\ell)]^{-}\) are the positive and negative parts of \([l(\ell)(1-u)+ub(\ell)]x+c(\ell)\), respectively. Simplifying (22) and comparing with (20), we obtain the transition probabilities for the first part of the right-hand side of (20) as follows:

$$ \begin{array}{ll}&\displaystyle p_D^h \bigl((x,\ell),(x+h,\ell)|\pi \bigr)= \frac{(\sigma^2(\ell)u^2 x^2/2)+h [ [l(\ell)(1-u)+u b(\ell)]x+ c(\ell) ]^+}{D -r h^2},\\[12pt] &\displaystyle p_D^h \bigl((x,\ell),(x-h,\ell)|\pi \bigr)=\frac{(\sigma^2(\ell)u^2 x^2/2)+h [ [l(\ell)(1-u)+u b(\ell)]x+ c(\ell) ]^- }{D-r h^2},\\[12pt] &\displaystyle p_D^h \bigl((x,\ell),(x,\iota)|\pi \bigr)=\frac{h^2}{D-r h^2}q_{\ell\iota}\quad \hbox{for}\ \ell\neq\iota,\\[12pt] &\displaystyle p_D^h(\cdot)=0\quad \hbox{otherwise},\\[4pt] &\displaystyle\Delta t^h(x,\ell,u, 1)=\frac{h^2}{D}, \end{array} $$
(23)

with

$$D=\sigma^2(\ell)u^2 x^2+h \big| \bigl[l(\ell) (1-u)+u b(\ell) \bigr]x+ c(\ell ) \big|+h^2(r-q_{\ell\ell}) $$

being well defined. We also find the transition probability for the second part of the right-hand side of (20). That is,

$$p_D^h \bigl((x,\ell),(x-h,\ell)|\pi \bigr)=1. $$
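As a minimal illustration of how (23) can be assembled in code, the following Python function (a sketch, not the implementation used for the experiments in Sect. 5) returns the diffusion-part transition probabilities and the interpolation interval for a given state (x, ℓ) and control u; the argument names and data layout are assumptions of this sketch.

```python
def diffusion_transitions(x, ell, u, h, r, l, b, sig, c, Q):
    """Transition probabilities p_D^h and interval Delta t^h from (23).

    l, b, sig, c are dicts mapping regime -> coefficient; Q is the generator
    as a dict of dicts with Q[ell][iota] = q_{ell iota}.  Illustrative sketch only.
    """
    mu = (l[ell] * (1 - u) + u * b[ell]) * x + c[ell]       # drift coefficient
    var = sig[ell] ** 2 * u ** 2 * x ** 2                   # squared diffusion coefficient
    D = var + h * abs(mu) + h ** 2 * (r - Q[ell][ell])      # normalizing constant in (23)

    probs = {}
    probs[(x + h, ell)] = (var / 2 + h * max(mu, 0.0)) / (D - r * h ** 2)
    probs[(x - h, ell)] = (var / 2 + h * max(-mu, 0.0)) / (D - r * h ** 2)
    for iota in Q[ell]:
        if iota != ell:                                     # regime switch with x unchanged
            probs[(x, iota)] = h ** 2 * Q[ell][iota] / (D - r * h ** 2)
    dt = h ** 2 / D                                         # interpolation interval
    return probs, dt
```

By construction, the returned probabilities sum to one, which can serve as a quick sanity check of the discretization.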

Suppose that the current state is \(\xi_{n}^{h}=x\), \(\alpha_{n}^{h}=\ell\), and the control is \(u_{n}^{h}=u\). The next interpolation interval \(\Delta t^{h}(x,\ell,u)\) is determined by (23). To incorporate the claim terms, we determine the next state \((\xi_{n+1}^{h}, \alpha_{n+1}^{h})\) by noting the following (a simulation-style sketch of this step is given after the list):

  1. 1.

    With probability \(1-\lambda\Delta t^{h}(x,\ell,u)+o(\Delta t^{h}(x,\ell,u))\), no claim occurs in \([t_{n}^{h}, t_{n+1}^{h})\); we determine \((\xi_{n+1}^{h}, \alpha_{n+1}^{h})\) by the transition probabilities \(p_{D}^{h}(\cdot)\) as in (23).

  2. 2.

    There is a claim in \([t_{n}^{h}, t_{n+1}^{h})\) with probability \(\lambda\Delta t^{h}(x,\ell,u)+o(\Delta t^{h}(x,\ell,u))\); we then determine \((\xi_{n+1}^{h}, \alpha_{n+1}^{h})\) by

    $$\xi_{n+1}^h=\xi_{n}^h-q_h(x, \ell,\rho),\quad\quad \alpha_{n+1}^h=\alpha_{n}^h, $$

    where \(\rho\sim\varPi(\cdot)\), and \(q_{h}(x,\ell,\rho)\in S_{h}\subseteq\mathcal{R}_{+}\) is the value in the grid nearest to \(q(x,\ell,\rho)\), so that \(\xi_{n+1}^{h}\in S_{h}\). It follows that \(|q_{h}(x,\ell,\rho)-q(x,\ell,\rho)|\to0\) as h→0, uniformly in x.
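A hedged sketch of a single transition of the approximating chain, combining the two cases above; it reuses the illustrative `diffusion_transitions` helper from the previous sketch, and the claim sampler `sample_rho`, the claim-size function `q_fun`, and the grid-rounding rule are hypothetical ingredients supplied by the caller.

```python
def chain_step(x, ell, u, h, r, l, b, sig, c, Q, lam, sample_rho, q_fun, rng):
    """One transition of (xi_n^h, alpha_n^h) combining cases 1 and 2 above (sketch)."""
    probs, dt = diffusion_transitions(x, ell, u, h, r, l, b, sig, c, Q)
    if rng.random() < lam * dt:                   # case 2: a claim occurs
        rho = sample_rho(rng)                     # rho drawn from Pi(.)
        q_h = h * round(q_fun(x, ell, rho) / h)   # nearest grid value q_h(x, ell, rho)
        return x - q_h, ell, dt
    states = list(probs.keys())                   # case 1: diffusion / regime transition
    idx = rng.choice(len(states), p=list(probs.values()))
    y, iota = states[idx]
    return y, iota, dt
```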

Let \(H_{n}^{h}\) denote the event that \((\xi_{n+1}^{h}, \alpha_{n+1}^{h})\) is determined by the first alternative above, and use \(T_{n}^{h}\) to denote the event of the second case. Let \(I_{H_{n}^{h}}\) and \(I_{T_{n}^{h}}\) be the corresponding indicator functions, so that \(I_{H_{n}^{h}}+I_{T_{n}^{h}}=1\). We now need a new definition of local consistency for the Markov chain approximation of a compound Poisson process with diffusion and regime switching.

Definition 3.1

A controlled Markov chain \(\{(\xi_{n}^{h},\alpha_{n}^{h}), n<\infty\}\) is said to be locally consistent with (6) if there is an interpolation interval \(\Delta t^{h}(x,\ell,u)\to0\) as h→0 uniformly in x, ℓ, and u such that:

  1. 1.

    there is a transition probability \(p_{D}^{h}(\cdot)\) that is locally consistent with (13) in the sense that (15) holds.

  2. 2.

    there is a \(\delta^{h}(x,\ell,u)=o(\Delta t^{h}(x,\ell,u))\) such that the one-step transition probability \(p^{h}((x,\ell),(y,\iota)|\pi)\) is given by

    $$ \begin{aligned} p^{h} \bigl((x,\ell),(y,\iota)\,|\,\pi \bigr) ={}& \bigl(1-\lambda\Delta t^{h}(x,\ell,u)+\delta^{h}(x,\ell,u) \bigr)p_{D}^{h} \bigl((x,\ell),(y,\iota)\,|\,\pi \bigr)\\ &{}+ \bigl(\lambda\Delta t^{h}(x,\ell,u)+\delta^{h}(x,\ell,u) \bigr)\varPi \bigl\{\rho: q_{h}(x,\ell,\rho)=x-y \bigr\}I_{\{\iota=\ell\}}. \end{aligned} $$
    (24)

Furthermore, the system of dynamic programming equations is a modification of (20). That is,

$$ \begin{array}{ll} V^h(x,\ell) &\displaystyle=\left \{ \begin{array}{l@{\quad}l}\displaystyle \max_{\pi\in\mathcal{A}} \biggl[ \bigl(1-\lambda\Delta t^h(x, \ell, u)+ \delta^h(x, \ell, u) \bigr)e^{-r\Delta t^h(x,\ell,u)} \\[14pt] \displaystyle\ \ \times \sum_{(y,\iota)} \bigl(p_D^{h} \bigl((x,\ell),(y,\iota) \bigr)|\pi \bigr)\\[14pt] \displaystyle \ \ \times V^h(y,\ell) + \bigl(\lambda\Delta t^h(x, \ell, u)+ \delta^h(x,\ell,u) \bigr)e^{-r\Delta t^h(x,\ell,\pi)} \\[12pt] \displaystyle\ \ \times\int_0^xV^h \bigl(x-q_h(x,\ell,\rho),\ell \bigr)\varPi(d\rho), V^h(x-h, \ell)+h \biggr]&\hbox{for}\ x\in S_h, \\ \displaystyle0 &\hbox{for}\ x=0. \end{array} \right . \end{array} $$
(25)
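The following Python sketch indicates how the dynamic programming equation (25) can be solved by value iteration with Gauss–Seidel-style in-place updates; it reuses the illustrative `diffusion_transitions` helper above. The treatment of the upper boundary (moves above B are simply clamped to B), the discrete claim distribution, the control grid, and the stopping rule are assumptions of this sketch rather than the paper's implementation.

```python
import numpy as np

def solve_value_iteration(B, h, r, lam, l, b, sig, c, Q, controls, claim_vals, claim_probs,
                          tol=1e-8, max_iter=100_000):
    """Illustrative value iteration for the dynamic programming equation (25).

    State grid {0, h, ..., B} per regime; moves above B are clamped to B (a crude
    stand-in for the reflection step); claims follow the supplied discrete law.
    """
    xs = np.arange(0.0, B + h / 2, h)
    regimes = list(Q.keys())
    V = {ell: np.zeros(len(xs)) for ell in regimes}

    for _ in range(max_iter):
        diff = 0.0
        for ell in regimes:
            for k in range(1, len(xs)):                      # V(0, ell) = 0 stays fixed
                x = xs[k]
                best = V[ell][k - 1] + h                     # singular control: pay out h
                for u in controls:
                    probs, dt = diffusion_transitions(x, ell, u, h, r, l, b, sig, c, Q)
                    cont = 0.0
                    for (y, iota), p in probs.items():
                        j = min(len(xs) - 1, int(round(y / h)))   # clamp at the upper bound B
                        cont += p * V[iota][j]
                    claim = sum(pr * V[ell][max(0, int(round((x - qv) / h)))]
                                for qv, pr in zip(claim_vals, claim_probs))
                    val = np.exp(-r * dt) * ((1 - lam * dt) * cont + lam * dt * claim)
                    best = max(best, val)
                diff = max(diff, abs(best - V[ell][k]))
                V[ell][k] = best                             # Gauss-Seidel style in-place update
        if diff < tol:
            break
    return xs, V
```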

Remark 3.1

The first part of the QVIs can be seen as corresponding to a “continuation” region, where the regular control is dominant; there the approximating Markov chain can switch between regimes and move to nearby states with the transition probabilities defined above. The second part of the QVIs corresponds to the “jump” region, where dividends are paid out and the singular control is dominant; there, as on the boundary, the singular control projects the approximating Markov chain back one step of size h w.p. 1, in view of the representation above.

4 Convergence of Numerical Approximation

This section focuses on the asymptotic properties of the approximating Markov chain proposed in the last section. The main techniques are methods of weak convergence. To begin, the technique of time rescaling and the interpolation of the approximation sequences are introduced in Sect. 4.1. The definition of relaxed controls is presented in Sect. 4.2. Section 4.3 deals with the weak convergence of \(\{\hat{\xi}^{h}(\cdot), \hat{\alpha}^{h}(\cdot), \hat{m}^{h}(\cdot), \hat{W}^{h}(\cdot), \hat{N}^{h}(\cdot), \hat{R}^{h}(\cdot), \hat{z}^{h}(\cdot), \hat{g}^{h}(\cdot), \hat{T}^{h}(\cdot)\}\), a sequence of rescaled processes; as a result, the sequence of controlled surplus processes converges to a limit surplus process. By using an inversion technique, Sect. 4.3 also takes up the weak convergence of the surplus process itself. Finally, Sect. 4.4 establishes the convergence of the value function.

4.1 Interpolation and Rescaling

Based on the approximating Markov chain constructed above, the piecewise constant interpolation is obtained, with appropriately chosen interpolation intervals. Recalling (17), the continuous-time interpolations (ξ^h(⋅),α^h(⋅)), u^h(⋅), g^h(⋅), and z^h(⋅) are defined. In addition, let \({\mathcal{U}}^{h}\) denote the collection of controls that are determined by a sequence of measurable functions \(F_{n}^{h}(\cdot)\) such that

$$ u_n^h=F_n^h \bigl( \xi_k^h, \alpha_k^h,k\leq n;u_k^h,k\le n \bigr). $$
(26)

Let the discrete times at which claims occur be denoted by \(\nu_{j}^{h}\), j=1,2,… . Then we have

$$\xi_{\nu_j^h-1}^h-\xi_{\nu_j^h}^h=q_h \bigl(\xi_{\nu_j^h-1}^h, \alpha_{\nu_j^h-1}^h,\rho \bigr). $$

The smallest σ-algebra generated by \(\{\xi_{k}^{h},\alpha_{k}^{h},u_{k}^{h}, H_{k}^{h}, g_{k}^{h}, z_{k}^{h}, k\leq n; \nu_{k}^{h}, \rho_{k}^{h}: \nu_{k}^{h}\leq t_{n}\}\) is denoted by \(\mathcal{D}_{n}^{h}\). In addition, \(\mathcal{U}^{h}\) defined by (26) is equivalent to the collection of all piecewise constant admissible controls with respect to \(\mathcal{D}_{n}^{h}\).

Using the representations of regular control, singular control, reflection step, and the interpolations defined above, (14) yields

(27)

The local consistency leads to

(28)

Denote

$$ \begin{array}{ll} &\displaystyle M_n^h=\sum_{k=0}^{n-1} \bigl(\Delta\xi_k^h-E_k^h\Delta\xi_k^h \bigr)I_{H_k^h},\\[14pt] &\displaystyle R_n^h=-\sum_{k=0}^{n-1}\Delta\xi_k^h(1-I_{H_k^h})= \sum_{k:\nu_k<n}q_h \bigl(\xi_{\nu_k}^h, \alpha_{\nu_k}^h,\rho_k \bigr), \end{array} $$
(29)

where \(M_{n}^{h}\) is a martingale with respect to \(\mathcal{D}_{n}^{h}\). Note that

$$E\sum_{k=0}^{n-1}I_{T_k^h}=E \bigl[ \hbox{number of}\ \ n: \nu_n^h\leq t \bigr]\to\lambda t \ \quad\hbox{as }h\to0. $$

This implies

$$\Bigl(\max_{k'\leq n}\Delta t_{k'}^h \Bigr)O \Biggl( \sum_{k=0}^{n-1}I_{T_k^h} \Biggr)\to0 \quad \hbox{in probability as} \ h\to0. $$

Hence, we can drop the term involving \(I_{T_{k}^{h}}\) without affecting the limit in (28). We attempt to represent M h(t) similar to the diffusion term in (6). Define W h(⋅) as

(30)

Combining (28)–(30), we rewrite (27) by

(31)

where \(R^{h}(t)=\sum_{\nu_{n}^{h}\leq t}q_{h}(\xi_{\nu_{n}^{-}}^{h}, \alpha_{\nu_{n}}^{h},\rho_{n})\), and ε h(t) is a negligible error satisfying

$$ \lim_{h\to0}\sup_{0\leq t\leq T}E \big| \varepsilon^h(t) \big| =0 \quad \hbox{for any } 0<T<\infty. $$
(32)

Next, we introduce the rescaling process. The basic idea of rescaling time is to “stretch out” the control and state processes so that they become “smoother,” which allows the tightness of g^h(⋅) and z^h(⋅) to be proved. Define \(\Delta\hat{t}_{n}^{h}\) by

$$ \Delta\hat{t}_n^h= \begin{cases} \Delta t_n^h&\hbox{for a diffusion on step $n$},\\ |\Delta z_n^h|=h &\hbox{for a singular control on step $n$}, \\ |\Delta g_n^h|=h &\hbox{for a reflection on step $n$}, \end{cases} $$
(33)

and define \(\hat{T}^{h}(\cdot)\) by \(\hat{T}^{h}(t)=\sum_{i=0}^{n-1}\Delta t_{i}^{h}=t_{n}^{h}\) for \(t\in[\hat{t}_{n}^{h}, \hat{t}_{n+1}^{h})\). Thus, \(\hat{T}^{h}(\cdot)\) increases with slope one if and only if a regular control is exerted. In addition, define the rescaled and interpolated process \(\hat{\xi}^{h}(t)=\xi^{h}(\hat{T}^{h}(t))\), and define \(\hat{\alpha}^{h}(t)\), \(\hat{u}^{h}(t)\), \(\hat{N}^{h}(t)\), \(\hat{R}^{h}(t)\), \(\hat{z}^{h}(t)\), and \(\hat{g}^{h}(t)\) similarly. The time scale is stretched out by h at the reflection and singular control steps. We can now write

(34)
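The following toy computation may help visualize the stretched-out time scale (33): every step advances the rescaled clock \(\hat{t}\), but \(\hat{T}\) advances only on diffusion steps, so it increases with slope one exactly when a regular control is exerted. The step sequence below is purely illustrative.

```python
# Toy illustration of the rescaled clock (33): each entry is ("d", dt) for a
# diffusion step, or ("s", h) / ("r", h) for a singular-control / reflection step.
h = 0.1
steps = [("d", 0.02), ("d", 0.02), ("s", h), ("d", 0.02), ("r", h), ("d", 0.02)]

t_hat, T_hat = 0.0, 0.0
for kind, size in steps:
    t_hat += size                 # stretched-out time advances on every step
    if kind == "d":
        T_hat += size             # interpolated (real) time advances only on diffusion steps
    print(f"kind={kind}  t_hat={t_hat:.2f}  T_hat={T_hat:.2f}")
```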

4.2 Relaxed Controls

Let \(\mathcal{B}({U} \times[0,\infty))\) be the σ-algebra of Borel subsets of U×[0,∞). An admissible relaxed control (or deterministic relaxed control) m(⋅) is a measure on \(\mathcal{B}({U} \times[0,\infty))\) such that m(U×[0,t])=t for each t≥0. Given a relaxed control m(⋅), there is a derivative \(m_{t}(\cdot)\) such that \(m(d\phi\,dt)=m_{t}(d\phi)\,dt\); we can define \(m_{t}(B) = \lim_{\delta\to0} {\frac{m(B\times[t-\delta,t])}{ \delta}}\) for \(B\in\mathcal{B}(U)\). With the given probability space, we say that m(⋅) is an admissible relaxed (stochastic) control for (W(⋅),α(⋅)), or that (m(⋅),W(⋅),α(⋅)) is admissible, if m(⋅,ω) is a deterministic relaxed control with probability one and if m(A×[0,t]) is \(\mathcal{F}_{t}\)-adapted for all \(A \in\mathcal{B}(U)\). In this case there is a derivative \(m_{t}(\cdot)\) such that \(m_{t}(A)\) is \(\mathcal{F}_{t}\)-adapted for all \(A \in\mathcal{B}(U)\).

Given a relaxed control m(⋅) of u h(⋅), we define the derivative m t (⋅) such that

$$ m^h(K)= \int_{U \times[0,\infty)} I_{\{(u^h, t) \in K\}}m_t(d\phi) \,dt $$
(35)

for all \(K \in\mathcal{B}(U\times[0,\infty))\) and such that for each t,m t (⋅) is a measure on \(\mathcal{B}(U)\) satisfying m t (U)=1. For example, we can define m t (⋅) in any convenient way for t=0 and as the left-hand derivative for t>0,

$$ m_t(A) = \lim_{\delta\to 0} \frac{m(A\times[t-\delta,t])}{\delta} \quad \forall A\in {\mathcal{B}}(U). $$
(36)

Note that \(m(d\phi\,dt)=m_{t}(d\phi)\,dt\). It is natural to define the relaxed control representation m^h(⋅) of u^h(⋅) by

$$ m_t^h(A)=I_{\{u^h(t)\in A\}} \quad\forall A \in{ \mathcal{B}}(U). $$
(37)

Let \(\mathcal{F}_{t}^{h}\) be the filtration consisting of the minimal σ-algebras that measure

$$ \bigl\{\xi^h(s),\alpha^h(s),m_s^h( \cdot),W^h(s),N^h(s), R^h(s), z^h(s),g^h(s), s\leq t \bigr\}. $$
(38)

Use \(\varGamma^{h}\) to denote the set of admissible relaxed controls m^h(⋅) with respect to (α^h(⋅),W^h(⋅)) such that \(m_{t}^{h}(\cdot)\) is a fixed probability measure on the interval \([t_{n}^{h}, t_{n+1}^{h})\) given \(\mathcal{F}_{t}^{h}\). Then \(\varGamma^{h}\) is a larger control space containing \(\mathcal{U}^{h}\). Referring to the stretched-out time scale, we denote the rescaled relaxed control by \(m_{\hat{T}^{h}(t)}(d\phi)\). Define \(M_{t}(A)\) and \(M^{h}_{t}(d\phi)\) by

$$\begin{array}{ll}&\displaystyle M_t(A)\,dt=dW(t)I_{u(t)\in A} \quad \forall A\in{\mathcal{B}}(U), \\[5pt] &\displaystyle M^h_t(d\phi)\,dt=dW^h(t)I_{u^h(t)\in\mathcal{U}}. \end{array} $$

Analogously, as an extension of time rescaling, we let

$$\hat{M}^h_{\hat{T}^h(t)}(d\phi)\,d\hat{T}^h(t)=d \hat{W}^h \bigl(\hat{T}^h(t) \bigr)I_{u^h(\hat{T}^h(t))\in\mathcal{U}}. $$

With the notation of relaxed control given above, we can write (31), (34), and the value function (8) as

(39)
(40)

and

$$ V^h(x,i)=\sup_{m^h\in\varGamma^h}J^h \bigl(x,i,m^h \bigr). $$
(41)

Now we give the definition of existence and uniqueness of a weak solution.

Definition 4.1

With the regular control replaced by the relaxed control, by a weak solution of (6) we mean that there exists a probability space \((\varOmega,\mathcal{F},P)\), a filtration \(\mathcal{F}_{t}\), and a process (X(⋅),α(⋅),m(⋅),W(⋅),N(⋅)) such that W(⋅) is a standard \(\mathcal{F}_{t}\)-Wiener process, α(⋅) is a Markov chain with generator Q and state space \(\mathcal{M}\), N(⋅) is an \(\mathcal{F}_{t}\)-Poisson process, m(⋅) is admissible with respect to X(⋅) and is \(\mathcal{F}_{t}\)-adapted, and (6) is satisfied. For an initial condition (x,), by the weak sense uniqueness we mean that the probability law of the admissible process (α(⋅),m(⋅),W(⋅),N(⋅)) determines the probability law of solution (X(⋅),α(⋅),m(⋅),W(⋅),N(⋅)) to (6), irrespective of probability space.

To proceed, we need some assumptions.

  1. (A1)

    Let u(⋅) be an admissible ordinary control with respect to W(⋅), α(⋅), and N(⋅), and suppose that u(⋅) is piecewise constant and takes only a finite number of values. For each initial condition, there exists a solution to (39), where m(⋅) is the relaxed control representation of u(⋅), and this solution is unique in the weak sense.

4.3 Convergence of a Sequence of Surplus Processes

Lemma 4.1

Using the transition probabilities {p h(⋅)} defined in (23), the interpolated process of the constructed Markov chain \(\{\hat{\alpha}^{h}(\cdot)\}\) converges weakly to \(\hat{\alpha}(\cdot)\), the rescaled Markov chain with generator Q=(q ℓι ).

Proof

The proof is similar to that of Theorem 3.1 in [18]. By using the same technique on the rescaled process, the convergence can be obtained. □

Theorem 4.1

Let the approximating chain \(\{\xi_{n}^{h},\alpha_{n}^{h},n<\infty\}\) constructed with transition probabilities defined in (23) be locally consistent with (6), the relaxed control representation of \(\{u_{n}^{h},n<\infty\}\) be m h(⋅), (ξ h(⋅),α h(⋅)) be the continuous-time interpolation defined in (17), and the corresponding rescaled processes be \(\{\hat{\xi}^{h}(\cdot), \hat{\alpha}^{h}(\cdot), \hat{m}^{h}(\cdot), \hat{W}^{h}(\cdot), \hat{N}^{h}(\cdot), \hat{R}^{h}(\cdot), \hat{z}^{h}(\cdot), \hat{g}^{h}(\cdot), \hat{T}^{h}(\cdot)\}\). Then

$$\bigl\{\hat{\xi}^h(\cdot), \hat{\alpha}^h(\cdot), \hat{m}^h(\cdot), \hat{W}^h(\cdot), \hat{N}^h( \cdot), \hat{R}^h(\cdot), \hat{z}^h(\cdot), \hat{g}^h(\cdot), \hat{T}^h(\cdot) \bigr\} $$

is tight.

The proof of the tightness of the sequence is similar to that of Theorem 4.5 in [14].

Since \(\{\hat{\xi}^{h}(\cdot), \hat{\alpha}^{h}(\cdot), \hat{m}^{h}(\cdot), \hat{W}^{h}(\cdot), \hat{N}^{h}(\cdot), \hat{R}^{h}(\cdot), \hat{z}^{h}(\cdot), \hat{g}^{h}(\cdot), \hat{T}^{h}(\cdot)\}\) is tight, we can extract a weakly convergent subsequence. Let \(\{\hat{X}(\cdot), \hat{\alpha}(\cdot), \hat{m}(\cdot), \hat{W}(\cdot), \hat{N}(\cdot), \hat{R}(\cdot), \hat{z}(\cdot), \hat{g}(\cdot), \hat{T}(\cdot)\} \) be the limit of the subsequence. Also, the paths of \(\{\hat{X}(\cdot), \hat{\alpha}(\cdot), \hat{m}(\cdot), \hat{W}(\cdot), \hat{N}(\cdot), \hat{R}(\cdot), \hat{z}(\cdot), \hat{g}(\cdot), \hat{T}(\cdot)\} \) are continuous w.p. 1.

Theorem 4.2

Let W(⋅) be a standard \(\mathcal{F}_{t}\)-Wiener process, and m(⋅) be admissible. We also have that \(\{\hat{\xi}^{h}(\cdot), \hat{\alpha}^{h}(\cdot), \hat{m}^{h}(\cdot), \hat{W}^{h}(\cdot), \hat{N}^{h}(\cdot), \hat{R}^{h}(\cdot), \hat{z}^{h}(\cdot), \hat{g}^{h}(\cdot), \hat{T}^{h}(\cdot)\}\) generates the σ-algebra \(\hat{\mathcal{F}}_{t}\). Then we obtain that \(\hat{W}(t)=W(\hat{T}(t))\) is an \(\hat{\mathcal{F}}_{t}\)-martingale with quadratic variation \(\hat{T}(t)\). The limit processes satisfy

(42)

Proof

For δ>0, define the process \(f^{h,\delta}(\cdot)\) by \(f^{h,\delta}(t)=f^{h}(n\delta)\) for \(t\in[n\delta,(n+1)\delta)\). Then, by the tightness of \(\{\hat{\xi}^{h}(\cdot), \hat{\alpha}^{h}(\cdot)\}\), (40) can be rewritten as

(43)

where

$$ \lim_{\delta\to0}\limsup_{h\to 0}E \big|\varepsilon^{h,\delta}(t) \big|=0. $$
(44)

If we can verify that \(\hat{W}(\cdot)\) is an \(\hat{\mathcal {F}}_{t}\)-martingale, then (42) can be obtained by taking limits in (43). To characterize \(\hat{W}(\cdot)\), let t>0, δ>0, p, κ, and \(\{t_{k}: k\leq p\}\) be given such that \(t_{k}\leq t\leq t+\delta\) for all k≤p, and let \(\varphi_{j}(\cdot)\), j≤κ, be real-valued continuous functions on U×[0,∞) having compact support. Define

$$ (\varphi_j, \hat{m})_t=\int _0^t\!\!\int_\mathcal{U} \varphi_j(\phi,s)\hat{m}_{\hat{T}(s)}^h(d\phi )\,d \hat{T}(s). $$
(45)

Let \(\{\varGamma_{j}^{\kappa}, j\leq \kappa\}\) be a sequence of nondecreasing partition of Γ such that \(\varPi(\partial\varGamma_{j}^{\kappa})=0\) for all j and all κ, where \(\partial\varGamma_{j}^{\kappa}\) is the boundary of the set \(\varGamma_{j}^{\kappa}\). As κ→∞, let the diameters of the sets \(\varGamma_{j}^{\kappa}\) go to zero. Let K(⋅) be a real-valued and continuous function of its arguments with compact support. In view of the definition of \(\hat{W}(t)\), for each \(i\in\mathcal{M}\), we have

(46)

By using the Skorokhod representation and the dominated convergence theorem, letting h→0, we obtain

(47)

Since \(\hat{W}(\cdot)\) has continuous sample paths, (47) implies that \(\hat{W}(\cdot)\) is a continuous \(\mathcal{F}_{t}\)-martingale. On the other hand, since

$$ E \bigl[ \bigl(\hat{W}^h(t+\delta) \bigr)^2- \bigl(\hat{W}^h(t) \bigr)^2 \bigr] =E \bigl[ \bigl(\hat{W}^h(t+\delta)-\hat{W}^h(t) \bigr)^2 \bigr] =\hat{T}(t+\delta)- \hat{T}(t), $$
(48)

by using the Skorokhod representation and the dominated convergence theorem together with (48), we have

(49)

Thus the quadratic variation of the martingale \(\hat{W}(t)\) is \(\hat{T}(t)\), and \(\hat{W}(\cdot)\) can be represented as \(\hat{W}(t)=W(\hat{T}(t))\) for a standard Wiener process W(⋅).

Letting h→0, by using the Skorokhod representation we obtain

(50)

uniformly in t. On the other hand, \(\{\hat{m}^{h}(\cdot)\}\) converges in the compact weak topology, that is, for any bounded and continuous function φ(⋅) with compact support, as h→0,

$$ \int_0^{\infty}\!\!\int _\mathcal {U}\varphi (\phi,s)\hat{m}_{\hat{T}^h(s)}^h(d \phi )\,d\hat{T}^h(s) \to\int_0^{\infty} \!\! \int_\mathcal{U}\varphi(\phi,s)\hat{m}_{\hat{T}(s)}(d \phi )\,d \hat{T}(s). $$
(51)

Again, the Skorokhod representation (with a slight abuse of notation) implies that, as h→0,

(52)

uniformly in t on any bounded interval.

In view of (43), since ξ h,δ(⋅) and α h,δ(⋅) are piecewise constant functions,

(53)

as h→0. Combining (45)–(53), we have

(54)

where lim δ→0 E|ε δ(t)|=0. Finally, taking limits in the above equation as δ→0, (42) is obtained. □

Theorem 4.3

For t<∞, define the inverse \(L(t)=\inf\{s:\hat{T}(s)>t \}\). Then L(t) is right continuous and L(t)→∞ as t→∞ w.p. 1. For any process \(\hat{\psi}(\cdot)\), define the rescaled process ψ(⋅) by \(\psi(t)=\hat{\psi}(L(t))\). Then, W(⋅) is a standard \(\mathcal {F}_{t}\)-Wiener process, N(⋅) is a Poisson measure, and (6) holds.

Proof

Since \(\hat{T}(t) \to\infty\) w.p. 1 as t→∞, L(t) exists for all t, and L(t)→∞ as t→∞ w.p. 1. Similar to (47) and (49), for each \(i\in\mathcal{M}\),

Thus, we can verify that W(⋅) is an \(\mathcal{F}_{t}\)-Wiener process. A rescaling of (42) yields

(55)

In other words, (6) holds. □

4.4 Convergence of Cost and Value Functions

Theorem 4.4

Let \(\{\hat{\xi}^{h}(\cdot),\hat{\alpha}^{h}(\cdot), \hat{m}^{h}(\cdot),\hat{W}^{h}(\cdot), \hat{N}^{h}(\cdot), \hat{R}^{h}(\cdot), \hat{z}^{h}(\cdot) ,\hat{g}^{h}(\cdot), \hat{T}^{h}(\cdot)\}\) be the weak convergent subsequence of the sequence with limit \(\{\hat{X}(\cdot), \hat{\alpha}(\cdot), \hat{m}(\cdot), \hat{W}(\cdot), \hat{N}(\cdot), \hat{R}(\cdot), \hat{z}(\cdot), \hat{g}(\cdot),\hat{T}(\cdot)\} \). Then,

$$ J^h \bigl(x,\ell,\pi^h \bigr)\to E_{x,\ell}^{\pi} \int_0^\tau\!\!\int _\mathcal {U}e^{-r\hat{T}(t)}d\hat{Z} = E_{x,\ell}^{\pi} \int_0^\tau\!\!\int_\mathcal {U}e^{-rt} dZ = J(x,\ell, \pi). $$
(56)

Proof

Noting that \(\Delta z_{n}^{h}=\Delta g_{n}^{h}=h\), the uniform integrability of dZ can be easily verified. Due to the tightness and uniform integrability properties, for any t, \(\int_{0}^{t} d\hat{Z}\) can be well approximated by a Riemann sum uniformly in h. By the weak convergence and the Skorokhod representation,

$$J_B^h \bigl(x,\ell,\pi^h \bigr) = E\sum _{k=1}^{\eta_h -1 }e^{-r t_k^h}\Delta z_k^h \to E_{x,\ell}^{\pi}\int _0^\tau\!\!\int_\mathcal{U}e^{-r\hat{T}(t)}d \hat{Z} . $$

By an inverse transformation,

$$E_{x,\ell}^{\pi}\int_0^\tau\!\!\int _\mathcal{U}e^{-r\hat{T}(t)}d\hat{Z} =E_{x,\ell}^{\pi} \int_0^\tau\!\!\int_\mathcal{U}e^{-rt} dZ. $$

Thus, as h→0,

$$J^h \bigl(x,\ell,\pi^h \bigr) \to J(x,\ell, \pi). $$

 □

Theorem 4.5

Let V^h(x,ℓ) and V(x,ℓ) be the value functions defined in (41) and (8), respectively. Then V^h(x,ℓ)→V(x,ℓ) as h→0.

The proof of the convergence of the value functions is similar to that in [14]. Thus, we omit it.

5 Numerical Examples

This section is devoted to several examples. For simplicity, we consider the case in which the discrete event has two states; that is, the continuous-time Markov chain has two states, with given claim size distributions. Using value iteration, we numerically solve the optimal control problems with an error tolerance of \(10^{-8}\).

Example 5.1

The continuous-time Markov chain α(t) representing the discrete event state has the generator and takes values in \(\mathcal{M}=\{1,2\}\). The premium rate depends on the discrete state, with c(1)=1 and c(2)=3. The portfolio proportion u(t), taking values in [0,1], is the control. Corresponding to the different discrete states, the yield rate of the riskless asset is l(1)=0.03 and l(2)=0.04, whereas the risky asset return rate is b(1)=0.06 and b(2)=0.08. The volatility of the financial market σ(α(t)) takes the values σ(1)=0.2 and σ(2)=2. R(t) is the compound Poisson process with uniform claim sizes ρ=0.01. Then \(\{\nu_{n+1}-\nu_{n}\}\) is a sequence of exponentially distributed random variables with mean 1/4; let λ=4. Furthermore, the initial surplus x ranges between the minimum 0 and the maximum 20. The discount rate is r=0.1. The computational results are depicted in Fig. 1.

Fig. 1 Compound Poisson process with uniform claim sizes
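Purely as an illustration of how the data of Example 5.1 map onto the value-iteration sketch of Sect. 3: the generator of α(t) is not reproduced here, so the entries of Q below are placeholders, and the control grid and step size h are further assumptions; the output of this snippet is not a reproduction of Fig. 1.

```python
import numpy as np

# Hypothetical driver for the sketched solver using the data of Example 5.1.
# The generator entries are placeholders; the control grid and h are assumed.
Q = {1: {1: -0.5, 2: 0.5}, 2: {1: 0.5, 2: -0.5}}     # placeholder generator
params = dict(B=20.0, h=0.1, r=0.1, lam=4.0,
              l={1: 0.03, 2: 0.04}, b={1: 0.06, 2: 0.08},
              sig={1: 0.2, 2: 2.0}, c={1: 1.0, 2: 3.0}, Q=Q,
              controls=np.linspace(0.0, 1.0, 11),
              claim_vals=[0.01], claim_probs=[1.0])
xs, V = solve_value_iteration(**params, tol=1e-8)
```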

Example 5.2

Compared with Example 5.1, we consider a more general claim size distribution. R(t) is a compound Poisson process interpreted as aggregate claims, \(R(t)=\sum_{\nu_{n}\leq t}\rho_{n}\), where \(\rho_{n}\in\{0.01,0.02\}\) with distribution Π(0.01)=0.6 and Π(0.02)=0.4. See Fig. 2 for this case.

Fig. 2 Compound Poisson process with general claim size distribution

Example 5.3

Compared with Example 5.2, we change the generator of the Markov chain α(t) to .

The generator is symmetric. We obtain Fig. 3 in this case.

Fig. 3 Faster switching Markov chains

All the figures contain at least two curves since we consider the two-regime case. Figures 1(a), 2(a), and 3(a) show that the value function is concave and increases with unit slope after some barrier level, which means that the extra surplus is all paid out as dividends once the surplus reaches a certain barrier. Figures 1(b), 2(b), and 3(b) also show that the total expected discounted value of all dividends is bounded in all cases, which is consistent with Proposition 2.1. This result holds under the assumption that the discount rate r is higher than the maximal yield rate \(\bar{b}\); if instead \(r < \tilde{b}\), the total expected discounted value of all dividends could be infinite.

Regarding the investment strategy, we observe from Figs. 1(c), 2(c), and 3(c) that the proportion invested in the risky asset becomes zero after a certain threshold. To maximize the total expected discounted value of all dividends, the rational insurer appears to be risk averse. In particular, in Fig. 3(c) the insurer is completely risk averse, with the portfolio weight of the risky asset being zero all the time. In addition, we observe from Figs. 1(c) and 2(c) that the insurer prefers to put a large proportion of the surplus in the risky asset when the surplus is not high. At the same time, the optimal discounted dividend value increases at a faster pace (its derivative is greater than 1). In other words, with a small amount of money, the rational investor makes the investment more efficient by choosing the investment strategy aggressively. Furthermore, the two curves in each graph show that the investment strategy varies across regimes due to the Markov switching.

In Figs. 1(d), 2(d), and 3(d), we use “1” to denote the region of the QVIs where the regular control is dominant and “2” to denote the region where the singular control is dominant. Although the optimal values of the discounted total dividend do not differ greatly across regimes, the dividend payment policies are very different in different regimes. Within a given regime, Figs. 1(c), 2(c), and 3(c) show that the singular control becomes dominant where the investment in the risky asset becomes zero. It appears that the insurer chooses to put money in the riskless asset, or to pay out the surplus as dividends when it is high enough, in order to avoid the possible risk.

6 Concluding Remarks

In this work, we have developed numerical approximation schemes for finding the optimal investment and dividend payment policy that maximizes the total discounted dividends paid out until ruin. A generalized regime-switching jump diffusion formulation of the surplus with singular control is presented. Although one could derive the associated system of integro-differential QVIs by using the usual dynamic programming approach together with the properties of regime switching, the mixed regular–singular control problem is very difficult to solve analytically. As an alternative, we presented a Markov chain approximation method using mainly probabilistic methods. For the singular control part, a technique of time rescaling is used. In the actual computation, the optimal value function can be obtained by using value or policy iteration methods. The method proposed in this paper can be extended to more complicated payoff functions.

In addition, the numerical approaches will not only provide guidance for decision makers in the financial and insurance industries but also help researchers gain further understanding of more complex problems. The numerical results, which may not be obtainable with classical models, provide insight into the study of dividend payment and investment strategies and have the potential to benefit society as a whole. Furthermore, although the primary motivation stems from investment and dividend payout strategies, the techniques and algorithms suggested are in fact applicable to a wide range of regime-switching impulse control problems.