
AMS classification: primary 91B30, 93E20; secondary 49L20, 49L25, 49M25

1 Prologue

This paper is based on a short course given at the University of Cartagena, Colombia, during the Second International Congress on Actuarial Science and Quantitative Finance. It is an introduction to stochastic control in insurance, with special emphasis on new problems, new approaches and new methods, as well as on numerical issues. We will consider control for minimizing the ruin probability (which reduces the solvency capital) as well as for maximizing dividend payments (which raises the company value). Combining these two objectives, we consider maximization of the dividend value under a ruin constraint. We will start with a simple discrete example in which the tools and methods for more complex models are introduced. This example is for illustration only; it is too simple for applications or for advanced mathematics. Such simple models have their merits in education (see, e.g., De Finetti 1957). In this discrete example we consider

  1. infinite time ruin probability,

  2. minimal ruin probability by control of reinsurance,

  3. minimal ruin probability by control of investment,

  4. company value, i.e. maximal dividend value by control of dividend payment,

  5. maximal company value by control of reinsurance,

  6. company value under ruin constraint, and

  7. maximal company value under ruin constraint with control of reinsurance.

A Discrete Example We consider a discrete time and space risk process S(t), t ≥ 0, which jumps from s to s + 2 with probability p_1 = 0.55, to s − 1 with probability p_2 = 0.3, and to s − 3 with probability p_3 = 0.15. This can be regarded as the risk process of an insurer who in each period receives a premium of size 2 and pays claims of size 3 and 5, respectively. The infinite horizon ruin probability

$$\displaystyle{\psi (s) = \mathbb{P}\{S(t) <0\ \mbox{ for some}\ t \geq 0\vert S(0) = s\}}$$

satisfies the dynamic equation

$$\displaystyle{ \psi (s) = p_{1}\psi (s + 2) + p_{2}\psi (s - 1) + p_{3}\psi (s - 3),s \geq 0, }$$
(1)

with ψ(s) = 1 for s < 0. Using the operator

$$\displaystyle{\mathcal{G}f(s) = p_{1}f(s + 2) + p_{2}f(s - 1) + p_{3}f(s - 3)}$$

the above dynamic equation reads

$$\displaystyle{\psi (s) = \mathcal{G}\psi (s),s \geq 0.}$$

The common computation of ψ(s) is done via generating functions, the solution of the characteristic equation and the adjustment to the boundary values ψ(s) = 1, s < 0, and ψ(∞) = 0. The characteristic equation

$$\displaystyle{z^{3} = p_{ 1}z^{5} + p_{ 2}z^{2} + p_{ 3}}$$

has the five complex solutions z_i which with coefficients C_i form the solution ψ(s) = C_1 z_1^s + … + C_5 z_5^s having the appropriate boundary values. In our example, in particular, ψ(12) = 0.08828824. 

Table 1

For numerical computation and for the next problems it is useful to consider instead a nonstationary approach. For t ≥ 0 define ψ(s, t) as the probability of ruin after time t, given S(t) = s. The functions s ↦ ψ(s, t) satisfy

$$\displaystyle{ \psi (s,t - 1) = \mathcal{G}\psi (s,t),s \geq 0, }$$
(2)

with ψ(s, t) = 1 for s < 0. Starting with a large T > 0 and the initial function ϕ(s, T) = 0 for s ≥ 0, ϕ(s, T) = 1 for s < 0, we calculate with (2) the functions ϕ(s, t) down to t = 0; ϕ(s, 0) is then a good approximation for ψ(s), since ϕ(s, 0) is the probability of ruin before or at time T, which is close to ψ(s) when T is large. For T = 5000 we obtain for ϕ(12, 0) all the digits of ψ(12) shown above.
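All computations in this example need only a few lines of code. The following sketch implements the backward recursion (2); the model data are from the text, while the grid bound N and the horizon T are implementation choices (states above N are clamped, assuming ψ is negligible there):

```python
# Backward recursion (2): phi(s, t-1) = p1*phi(s+2, t) + p2*phi(s-1, t) + p3*phi(s-3, t),
# with boundary phi(s, t) = 1 for s < 0.
JUMPS = [(2, 0.55), (-1, 0.30), (-3, 0.15)]          # (jump size, probability)

def ruin_probability(s0, T=2000, N=300):
    lo = -3                                           # deepest reachable deficit in one step
    f = [1.0 if s < 0 else 0.0 for s in range(lo, N + 1)]
    for _ in range(T):
        g = f[:]                                      # keeps the boundary values for s < 0
        for s in range(0, N + 1):
            g[s - lo] = sum(p * f[min(s + j, N) - lo] for j, p in JUMPS)
        f = g
    return f[s0 - lo]

print(ruin_probability(12))                           # close to 0.08828824 from the text
```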

Assume that for each period we can buy reinsurance: for the price of 1 the reinsurer pays 3 when a claim of size 5 occurs, and 1 when the claim has size 3. So for each claim the first insurer has to pay 2; this type of risk sharing is called excess of loss reinsurance. What is the optimal reinsurance strategy to minimize the ruin probability, and what is the corresponding ruin probability \(\overline{\psi }(s)\)? The nonstationary approach—with a slightly changed dynamic equation—produces the solution: replace (2) by

$$\displaystyle\begin{array}{rcl} \overline{\psi }(s,t - 1)& =& \min [\mathcal{G}\overline{\psi }(s,t),\mathcal{G}_{1}\overline{\psi }(s,t)]{}\end{array}$$
(3)
$$\displaystyle\begin{array}{rcl} \mathcal{G}_{1}f(s)& =& p_{1}f(s + 1) + p_{2}f(s - 1) + p_{3}f(s - 1).{}\end{array}$$
(4)

The operator \(\mathcal{G}\) shows the dynamics in the case without reinsurance, while the operator in (4) corresponds to the dynamics with reinsurance. The numerical procedure and the initial functions are the same as above. With dynamic reinsurance, the ruin probability is reduced to \(\overline{\psi }(12) = 0.063095.\) The optimal reinsurance strategy is: buy reinsurance whenever s ≥ 2. With static reinsurance, i.e. reinsurance for all s ≥ 0, we obtain \(\overline{\psi }(12) = 0.073629.\)
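The choice in (3) between the two operators is then a pointwise minimum; a sketch, with \(\mathcal{G}_1\) restating the with-reinsurance dynamics (net premium 2 − 1 = 1, retained claim payment 2), grid bound and horizon being implementation choices:

```python
# Dynamic equation (3): at each state pick the better of no reinsurance and
# excess of loss reinsurance; model data from the text.
G_PLAIN = [(2, 0.55), (-1, 0.30), (-3, 0.15)]   # operator G, no reinsurance
G_REINS = [(1, 0.55), (-1, 0.30), (-1, 0.15)]   # operator G1, with reinsurance

def min_ruin_probability(s0, T=2000, N=300, lo=-3):
    f = [1.0 if s < 0 else 0.0 for s in range(lo, N + 1)]
    for _ in range(T):
        g = f[:]
        for s in range(0, N + 1):
            g[s - lo] = min(sum(p * f[min(s + j, N) - lo] for j, p in op)
                            for op in (G_PLAIN, G_REINS))
        f = g
    return f[s0 - lo]
```

For s = 12 this reproduces the reduced ruin probability 0.063095 reported in the text.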

Assume that for each period we can invest an amount of 1 which in this period either doubles with probability w > 1∕2, or is lost with probability 1 − w, where investment return is independent of insurance business. What is the optimal investment strategy to minimize the ruin probability? The nonstationary approach—again with a slightly changed dynamic equation—produces the solution: replace (2) by

$$\displaystyle\begin{array}{rcl} \overline{\psi }(s,t - 1)& =& \min [\mathcal{G}\overline{\psi }(s,t),\mathcal{G}_{2}\overline{\psi }(s,t)]{}\end{array}$$
(5)
$$\displaystyle\begin{array}{rcl} \mathcal{G}_{2}f(s)& =& w\mathcal{G}f(s + 1) + (1 - w)\mathcal{G}f(s - 1){}\end{array}$$
(6)

The operator \(\mathcal{G}\) shows the dynamics in the case without investment, while the operator in (6) corresponds to the dynamics with investment. The numerical procedure and the initial functions are the same as above. With dynamic investment, the ruin probability for w = 0.55 is reduced to \(\tilde{\psi }(12) = 0.07611.\) The optimal investment strategy is: invest whenever s ∉ {0, 2}. With static investment, i.e. investment for all s ≥ 0, we obtain \(\tilde{\psi }(12) = 0.0923987,\) which is larger than without investment. Of course, one might extend the control to invest more than 1, which can be solved with a larger set of operators.

For discount rate 0 < r < 1 and for a given dividend strategy d(t), t ≥ 0, we consider the expected discounted dividends

$$\displaystyle{V ^{d}(s) = E\left [\sum _{ n=0}^{\infty }r^{n}d(n)\vert S(0) = s\right ],}$$

where d(t) is paid at time t and depends on the history up to time t − 1. The dividend risk process is S^d(t) = S(t) − d(0) − … − d(t − 1); its ruin time is denoted by τ^d. Dividend payments are forbidden at and after ruin. The company value is given by

$$\displaystyle{V (s) =\sup _{d}V ^{d}(s),s \geq 0,}$$

its dynamic equation equals

$$\displaystyle{ V (s) =\max \{ r\mathcal{G}V (s),V (s - 1) + 1\}, }$$
(7)

with V (s) = 0 for s < 0. As above, the generating function method can be applied here, but this equation can also be solved with a nonstationary approach. For t ≥ 0 consider the time t dividend functions

$$\displaystyle{V ^{d}(s,t) = E\left [\sum _{ n=t}^{\infty }r^{n}d(n)\vert S(t) = s\right ],}$$

and define V (s, t) as the supremum of these dividend functions. The functions V (s, t) satisfy

$$\displaystyle{ V (s,t - 1) =\max \{ \mathcal{G}V (s,t),V (s - 1,t - 1) + r^{t-1}\}, }$$
(8)

with V (s, t) = 0, s < 0. Starting with a large T > 0 and the initial function V (s, T) = 0, we calculate with (8) the functions V (s, t) down to t = 0, and V (s, 0) is a good approximation for V (s). For r = 0.98 we obtain V (12) = 17.933928.

For simultaneous control of dividend payments and reinsurance we simply replace the expression \(\mathcal{G}V (s,t)\) in (8) by \(\max (\mathcal{G}V (s,t),\mathcal{G}_{1}V (s,t))\). The dividend value changes to V (12) = 18.104876. The optimal reinsurance strategy is: buy reinsurance when s ≥ 10. 

Optimal dividend payment with a ruin constraint has the value function

$$\displaystyle{V (s,\alpha ) =\sup _{d}[V ^{d}(s): \mathbb{P}\{\tau ^{d} <\infty \}\leq \alpha ].}$$

To solve it we use the Lagrange multiplier method and maximize for a large constant L the expression

$$\displaystyle{W(s,L) =\sup _{d}[V ^{d}(s) - L\mathbb{P}\{\tau ^{d} <\infty \}].}$$

For the nonstationary approach we consider again the dividend payments after time t, together with the ruin time τ_t^d after time t: 

$$\displaystyle{W(s,t) =\sup _{d}[V ^{d}(s,t) - L\mathbb{P}\{\tau _{ t}^{d} <\infty \vert S(t) = s\}].}$$

The dynamic equation for these functions reads

$$\displaystyle{ W(s,t - 1) =\max \{ \mathcal{G}W(s,t),W(s - 1,t - 1) + r^{t-1}\}, }$$
(9)

where W(s, t) = −L for s < 0. The initial function here is W(s, T) = −Lψ(s). For the computation of the corresponding ruin probability, we simultaneously compute functions ψ(s, t) from

$$\displaystyle{ \psi (s,t - 1) = \mathcal{G}\psi (s,t), }$$
(10)

when the maximum in (9) is at \(\mathcal{G}W(s,t)\) (no dividend payment), and ψ(s, t − 1) = ψ(s − 1, t − 1) otherwise. The initial function is ψ(s, T) = ψ(s). For s = 12 we have a ruin probability without dividend payment ψ(12) = 0.088288 and a dividend value without constraint V (12) = 17.933928. We take L = 40 and obtain W(12) = 5.646781. The ruin probability with dividend payments equals ψ(12) = 0.160923, so the dividend value is W(12) + Lψ(12) = 12.083708. 
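A sketch of the Lagrangian recursion (9) together with the simultaneous recursion (10) for the ruin probability of the chosen strategy; the stationary ψ(s) for the initial functions is computed first, and L = 40 is the multiplier used in the text:

```python
# Recursions (9) and (10): track W(s, t) and the ruin probability of the
# strategy selected by the maximum in (9).
JUMPS = [(2, 0.55), (-1, 0.30), (-3, 0.15)]
LO, N = -3, 250

def step_G(f):
    # one application of the operator G on s >= 0; values for s < 0 are kept
    g = f[:]
    for s in range(0, N + 1):
        g[s - LO] = sum(p * f[min(s + j, N) - LO] for j, p in JUMPS)
    return g

def constrained_value(s0, L=40.0, r=0.98, T=1500):
    psi = [1.0 if s < 0 else 0.0 for s in range(LO, N + 1)]
    for _ in range(T):                        # stationary ruin probability psi(s)
        psi = step_G(psi)
    W = [-L * p for p in psi]                 # initial function W(s, T) = -L psi(s)
    for t in range(T, 0, -1):
        disc = r ** (t - 1)
        Wn, psin = step_G(W), step_G(psi)
        for s in range(0, N + 1):             # payment branch, upward in s
            pay = (Wn[s - 1 - LO] if s >= 1 else -L) + disc
            if pay > Wn[s - LO]:
                Wn[s - LO] = pay
                psin[s - LO] = psin[s - 1 - LO] if s >= 1 else 1.0
        W, psi = Wn, psin
    return W[s0 - LO], psi[s0 - LO]
```

For s = 12 and L = 40 this reproduces W(12) ≈ 5.6468 and the with-dividend ruin probability ψ(12) ≈ 0.1609 reported above.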

For simultaneous control of dividend payments under a ruin constraint and reinsurance we simply replace the expression \(\mathcal{G}W(s,t)\) in (9) by

$$\displaystyle{\max (\mathcal{G}W(s,t),\mathcal{G}_{1}W(s,t)).}$$

Also here, we obtain the corresponding ruin probability in a simultaneous computation: we use the dynamic equations (10) when no dividends are paid and no reinsurance is bought, or the relation

$$\displaystyle{\psi (s,t - 1) = \mathcal{G}_{1}\psi (s,t)}$$

when no dividends are paid and reinsurance is bought, or finally

$$\displaystyle{\psi (s,t - 1) =\psi (s - 1,t - 1)}$$

when dividends are paid. For s = 12 and L = 13.754 we obtain a ruin probability ψ(12) = 0.160828 and a dividend value W(12, L) + Lψ(12) = 17.635244. Comparing this company value with the one without ruin constraint, one can see that a ruin constraint can be cheap when an appropriate reinsurance cover is available (which, in our case, has a rather small loading: the premium is 1 while the expected payments are 0.75).

The nonstationary approach is well suited for the computation of value functions and optimal strategies, since it can easily be adapted to various different problems. It is also much faster than the stationary method given in Hipp (2003), which is based on a modified Hamilton-Jacobi-Bellman equation: in the stationary approach one has to compute value functions for all 0 ≤ α ≤ 1, while in the nonstationary approach one works with a single fixed α (specified by L).

The following figure shows the optimal strategies for control of dividends with ruin constraint, with and without reinsurance. They depend on the current surplus s and time t. In both cases the optimal dividend strategies are barrier strategies defined by a barrier M(t); the optimal reinsurance strategy is also a barrier strategy: buy reinsurance when s ≥ N(t). The values of M(t) and N(t) are piecewise constant; they are shown for t = 0, …, 200. The highest curve shows M(t) for the case without reinsurance, while the two curves below show M(t) and N(t) for the case with reinsurance.

Continuous Time Models and Their Generators For applications, continuous time models are of major importance. The classical risk model for insurance is the Lundberg model in which the claim arrivals are modeled as a homogeneous Poisson process N(t), t ≥ 0, with constant intensity λ > 0, and the claim sizes X, X_1, X_2, … are independent and identically distributed and independent of the process N(t). The risk process is then given by

$$\displaystyle{S(t) = s + ct - X_{1} -\ldots -X_{N(t)},}$$

where s is the initial surplus and c the premium rate. We always assume a positive loading, i.e. c > λE[X]. S(t) is a time homogeneous process with independent increments. The operator \(\mathcal{G}\) above becomes, for continuous time homogeneous Markov processes, the infinitesimal generator

$$\displaystyle{ \mathcal{G}f(s) =\lim _{h\rightarrow 0}E[f(S(h)) - f(s)\vert S(0) = s]/h, }$$
(11)

which for the Lundberg model equals

$$\displaystyle{\mathcal{G}f(s) =\lambda E[f(s - X) - f(s)] + cf'(s)}$$

which is defined on the set of all bounded differentiable functions f(s). 

A large scale approximation of stationary risk processes with independent increments is the simple diffusion with dynamics

$$\displaystyle{ dS(t) =\mu dt +\sigma dW(t),\ t \geq 0, }$$
(12)

where W(t) is the standard Brownian motion. The generator equals

$$\displaystyle{\mathcal{G}f(s) =\mu f'(s) +\sigma ^{2}f''(s)/2,}$$

it is defined on the set of locally bounded functions with second derivative.

One possible way to include parameter uncertainty is the choice of mixture models for S(t), such as the Cox process in which the intensity of the claim arrival process is random and modeled as a time homogeneous finite Markov process. Here we have a finite number of possible non-negative and distinct intensities λ_1, …, λ_m, and λ(t) jumps between these intensities in a homogeneous Markovian way. This is usually described via parameters b_{i,j}, i, j = 1, …, m, satisfying b_{i,j} ≥ 0 for i ≠ j, and

$$\displaystyle{b_{i,i} = -\sum _{j\neq i}b_{i,j}.}$$

If the intensity is in state λ_i, then it stays there for an exponential waiting time with parameter −b_{i,i}, and then it jumps to λ_j with probability −b_{i,j}∕b_{i,i}. 
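This waiting-time description translates directly into a simulation. In the sketch below the rate matrix B and the intensities are made-up illustration values, not taken from the text:

```python
import random

# Hypothetical generator matrix (b_ij) and intensities for m = 2 states
B = [[-0.5, 0.5],
     [0.2, -0.2]]
LAMBDAS = [1.0, 3.0]

def simulate_intensity_path(i, horizon, rng):
    """Piecewise constant path of lambda(t) as a list of (jump time, intensity)."""
    t, path = 0.0, [(0.0, LAMBDAS[i])]
    while True:
        t += rng.expovariate(-B[i][i])        # exponential sojourn, parameter -b_ii
        if t >= horizon:
            return path
        others = [j for j in range(len(B)) if j != i]
        i = rng.choices(others, weights=[B[i][j] for j in others])[0]
        path.append((t, LAMBDAS[i]))          # jump to j with probability -b_ij / b_ii
```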

Mixture models are more complex than the models above: they sometimes lack the independence of increments and often also the Markov property. When λ(t) is observable, the pair (S(t), λ(t)) has the Markov property, and its generator at s ≥ 0, i = 1, …, m equals

$$\displaystyle{ \mathcal{G}f(s,i) =\lambda _{i}E[f(s - X,i) - f(s,i)] + cf_{s}(s,i) +\sum _{ j=1}^{m}b_{ i,j}(f(s,j) - f(s,i)). }$$
(13)

When λ(t) is not observable, then again one can enlarge the state vector to obtain the Markov property: if \(\mathcal{F}_{t}\) is the filtration generated by S(t), t ≥ 0, then (S(t), p_1(t), …, p_m(t)) has the Markov property, where p_i(t) is the conditional probability of λ(t) = λ_i, given \(\mathcal{F}_{t}.\) The processes p_k(t) are piecewise deterministic; they depend on t and the history S_u, u ≤ t. Between claims they can be computed using the following system of interacting differential equations:

$$\displaystyle{ p_{k}^{{\prime}}(t) =\sum _{j=1}^{m}p_{j}(t)b_{j,k} -\lambda _{k}p_{k}(t) + p_{k}(t)\sum _{j=1}^{m}p_{j}(t)\lambda _{j},\quad k = 1,\ldots,m. }$$
(14)

This follows from the fact that from t to t + dt, given λ(t) = λ_k, there is no transition and no claim with probability 1 − λ_k dt + b_{k,k} dt + o(dt), and for j ≠ k, given λ(t) = λ_j, there is a transition from λ_j to λ_k and no claim with probability b_{j,k} dt + o(dt). So, given N(t + dt) = N(t), 

$$\displaystyle\begin{array}{rcl} p_{k}(t + dt)& =& \frac{p_{k}(t)\left (1 -\lambda _{k}dt + b_{k,k}dt\right ) +\sum _{j\neq k}b_{j,k}p_{j}(t)dt} {P\{N(t + dt) = N(t)\vert \mathcal{F}_{t}\}} + o(dt) {}\\ & =& \frac{p_{k}(t)\left (1 -\lambda _{k}dt\right ) +\sum _{j}b_{j,k}p_{j}(t)dt} {1 -\sum _{j}p_{j}(t)\lambda _{j}dt} + o(dt) {}\\ & =& p_{k}(t)\left (1 -\lambda _{k}dt +\sum _{j}p_{j}(t)\lambda _{j}dt\right ) +\sum _{j}b_{j,k}p_{j}(t)dt + o(dt). {}\\ \end{array}$$

At a claim, the process p_k(t) has a jump: given N(t + dt) > N(t) we have for k = 1, …, m

$$\displaystyle{ p_{k}^{+}:= p_{ k}(t+) = \frac{\lambda _{k}p_{k}(t)} {\sum _{j}p_{j}(t)\lambda _{j}}. }$$
(15)

This follows from

$$\displaystyle\begin{array}{rcl} P\{N(t + h)&>& N(t),\lambda (t + h) =\lambda _{k}\vert \mathcal{F}_{t}\} = p_{k}(t)\lambda _{k}h + o(h), {}\\ P\{N(t + h)&>& N(t)\vert \mathcal{F}_{t}\} =\sum _{ j=1}^{m}p_{ j}(t)\lambda _{j}h + o(h). {}\\ \end{array}$$

From these dynamics we obtain the following generator:

$$\displaystyle{ \mathcal{G}f(s,p) =\sum _{k}p_{k}\lambda _{k}E[f(s - X,p^{+}) - f(s,p)] + cf_{ s}(s,p) +\sum _{k}f_{p_{k}}(s,p)p_{k}^{{\prime}}. }$$
(16)

Here, f_s and \(f_{p_{k}}\) are the partial derivatives with respect to s and p_k, respectively.

Mixtures with a constant but unknown parameter in a finite set {λ_1, …, λ_m} can be modeled as follows: let λ be a random variable with \(p_{i} = \mathbb{P}\{\lambda =\lambda _{i}\}\) known. Assume that, given λ = λ_i, S(t) is a classical Lundberg process with intensity λ_i. With p_i(t) the conditional probability of λ = λ_i, given S(u), u ≤ t, the vector (S(t), p_1(t), …, p_m(t)) has the Markov property. The dynamics of the p_i(t) are the same as in the above example, with b_{j,k} = 0 for j, k = 1, …, m. The generator is the same as in (16).
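The filter dynamics can be sketched as an Euler step of (14) between claims plus the Bayes jump (15) at a claim arrival; the step size h and the concrete numbers are illustration values only:

```python
def no_claim_step(p, lam, B, h):
    # Euler step of (14); the drift has zero sum, so sum(p) = 1 is preserved
    m = sum(pj * lj for pj, lj in zip(p, lam))        # current mean intensity
    return [pk + h * (sum(p[j] * B[j][k] for j in range(len(p)))
                      - lam[k] * pk + pk * m)
            for k, pk in enumerate(p)]

def claim_update(p, lam):
    # Bayes jump (15): p_k+ = lam_k p_k / sum_j p_j lam_j
    w = [pk * lk for pk, lk in zip(p, lam)]
    total = sum(w)
    return [x / total for x in w]
```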

For mixture models as well as for dividend problems with a ruin constraint, it is convenient to consider also non-stationary generators. As an illustration we mention the example which is also considered in Sect. 5: a delayed compound Poisson process in which up to a random time T we have S(t) = s + ct, and for t > T, given T, the risk process s + S(t) − S(T) is a compound Poisson process. The time T has an exponential distribution. We want to minimize the ruin probability by control of reinsurance. For this, write V (s, t) for the controlled ruin probability after time t, given that no claim happened until t. Then V (s, t) satisfies a dynamic equation of the form

$$\displaystyle{0 =\inf _{a}p(t)\lambda E[V _{1}(s - g_{a}(X)) - V (s,t)] + (c - h(a))V _{s}(s,t) + V _{t}(s,t),}$$

where p(t) is the conditional probability of t < T, given no claim up to time t, and V_1(s) is the minimal ruin probability for the case with constant positive intensity. The quantity h(a) is the reinsurance price for the risk sharing g_a(X). 

2 Ruin and Company Value

We shall restrict ourselves to three types of control problems: first the minimization of the infinite time ruin probability, next the maximization of the company value, and finally the maximization of the company value under a ruin constraint. We shall always take an infinite horizon view, since insurance uses diversification in time, and some insurance products are long term.

For Lundberg models, the infinite time ruin probability is a classical bounded solution of the dynamic equation \(\mathcal{G}\psi (s) = 0,\ s \geq 0,\) with a continuous first derivative. It is the unique classical solution satisfying ψ(s) = 1, s < 0, ψ(∞) = 0, and ψ′(0) = −λ(1 − ψ(0))∕c. Analytic expressions for ψ(s) can be given for exponential or more general phase-type distributions (see Chap. IX of Albrecher and Asmussen 2010).
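For exponential claim sizes with rate β, the classical explicit form ψ(s) = (λ∕(cβ))e^{−(β−λ∕c)s} satisfies both the equation and the derivative condition, which can be checked numerically; the parameter values below are arbitrary, chosen with positive loading:

```python
import math

lam, c, beta = 1.0, 1.5, 1.0                 # claim frequency, premium rate, claim rate
R = beta - lam / c                           # decay rate of psi

def psi(s):
    return (lam / (c * beta)) * math.exp(-R * s) if s >= 0 else 1.0

def generator_residual(s, n=20000):
    # G psi(s) = lam * (E[psi(s - X)] - psi(s)) + c * psi'(s) for X ~ Exp(beta),
    # with E[psi(s - X)] split into a midpoint-rule integral and the tail X > s.
    h = s / n
    integral = sum(psi(s - (k + 0.5) * h) * beta * math.exp(-beta * (k + 0.5) * h) * h
                   for k in range(n))
    expectation = integral + math.exp(-beta * s)        # psi = 1 on the tail
    return lam * (expectation - psi(s)) + c * (-R) * psi(s)
```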

The company value is itself the result of a control problem: what is the maximal expected discounted sum of paid dividends? In mathematical terms:

$$\displaystyle{V _{0}(s) =\sup _{D}\left \{E\left [\int _{0}^{\infty }e^{-\delta t}dD(t)\vert S(0) = s\right ]\right \},}$$

where D = D(t), t ≥ 0, is the sum of dividends paid up to time t under some admissible dividend payment strategy. Already in the Lundberg model this question is hard, too hard for applications in insurance. The answer is simpler if we restrict dividend payments to barrier strategies. Optimal barrier strategies can be derived from a classical solution v(s) of the dynamic equation

$$\displaystyle{ 0 = -\delta v(s) + \mathcal{G}v(s), }$$
(17)

with v(0) = v′(0) = 1, where \(\mathcal{G}\) is the generator of the risk process and δ is the discount rate. Then

$$\displaystyle{V _{0}(s) = v(s)/v'(M),\ s \leq M,\ V _{0}(s) = V _{0}(M) + s - M,\ s \geq M,}$$

where the optimal barrier is given by

$$\displaystyle{M =\arg \min v'(s)}$$

(see Schmidli 2007, Sect. 2.4.2). This simplified answer is a sub-solution of the above control problem. It is an optimal dividend strategy only for special claim size distributions (see Loeffen 2008). Generally, optimal dividend strategies are band strategies, i.e. there might exist M < M_1 < M_2 for which no dividends are paid as long as M_1 < S(t) < M_2, while for M < S(t) ≤ M_1 a lump sum S(t) − M is paid out immediately. However, optimal barrier strategies are useful for applications since for s ≤ M the dividend values of the barrier strategy are the same as the dividend value of the optimal band strategy.
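For the simple diffusion (12) this recipe is fully explicit and easy to check: a solution of 0 = −δv(s) + μv′(s) + σ²v″(s)∕2 with v(0) = 0 is v(s) = e^{θ_1 s} − e^{θ_2 s}, where θ_1 > 0 > θ_2 are the roots of σ²θ²∕2 + μθ − δ = 0. A sketch with arbitrary parameter values:

```python
import math

mu, sigma, delta = 1.0, 2.0, 0.05            # assumed drift, volatility, discount rate
d = math.sqrt(mu * mu + 2 * delta * sigma**2)
th1 = (-mu + d) / sigma**2                   # positive root
th2 = (-mu - d) / sigma**2                   # negative root

def v(s):  return math.exp(th1 * s) - math.exp(th2 * s)
def dv(s): return th1 * math.exp(th1 * s) - th2 * math.exp(th2 * s)

# barrier M = argmin v'(s), i.e. the zero of v''
M = math.log(th2**2 / th1**2) / (th1 - th2)

def V0(s):
    # dividend value of the barrier strategy: v(s)/v'(M) below M, slope 1 above
    return v(s) / dv(M) if s <= M else v(M) / dv(M) + s - M
```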

For the company value, a discount rate is needed, which can be a market interest rate (this should be modeled with some stochastic process which is allowed to become negative) or a value which shareholders and accountants agree upon. We will be concerned only with positive constant discount rates.

A company value with ruin constraint is an even more complex quantity, since it involves a control problem with two objective functions. Its computation is still work in progress. We consider it here since it is appealing to both the policy holders and the stock holders. The value is given by

$$\displaystyle{V (s,\alpha ) =\sup _{D}\left \{E\left [\int _{0}^{\infty }e^{-\delta t}dD(t)\vert S(0) = s\right ]:\psi ^{D}(s) \leq \alpha \right \},}$$

where 0 < α ≤ 1 is the allowed ruin probability and ψ^D(s) is the ruin probability with dividend payments. Clearly, V (s, 1) = V_0(s), and if ψ(s) is the ruin probability without dividend payments, then V (s, ψ(s)) = 0. 

The meaning of a company value with ruin constraint might become clearer when we meditate a little about special dividend strategies which have constrained ruin probabilities. Let us do this in a diffusion model which does not have downward jumps. Let s(α) be the solution of ψ(s) = α. The simplest strategy is: pay out s − s(α) immediately, and stop dividends forever. This has ruin probability α and dividend value s − s(α). A better strategy is constructed using the optimal unconstrained dividend strategy based on the barrier M and leading to the dividend value V_0(s). Choose s > 0 and α > ψ(s); then s(α) < s, so you can put aside s(α) (e.g., into your pocket), and then use the unconstrained dividend strategy with initial surplus s − s(α). At ruin, i.e. when you reach zero, you stop paying dividends forever. With your money from the pocket, you indeed stopped with s(α), and so your ruin probability equals α. And your corresponding dividend value equals V_0(s − s(α)) > s − s(α). 

Money in the pocket is never optimal, and so there should exist improvements of the dividend strategy with the same ruin probability. Our next strategy is based on the improvement procedure introduced in Hipp (2016). We assume s < M and α > ψ(s). Do not pay dividends until you reach M. You will be ruined before reaching M with probability A = (ψ(s) −ψ(M))∕(1 −ψ(M)) ≤ ψ(s) < α. Define 0 < γ < α via equation

$$\displaystyle{A +\gamma (1 - A) =\alpha.}$$

When you reach M, you put aside the amount s(γ) and pay out dividends with the unconstrained strategy until you reach s(γ). Then you again stop paying dividends forever. The resulting ruin probability is α, and the dividend value will be V_0(M − s(γ)), discounted over the time τ until you reach M. With our function V_0(s) above we have E[e^{−δτ}] = V_0(s)∕V_0(M), and so the dividend value of our strategy is

$$\displaystyle{ \frac{V _{0}(s)} {V _{0}(M)}V _{0}(M - s(\gamma ))}$$

which is larger than V_0(s − s(α)). The reason for this is: in the first case we stop dividend payments forever at s(α), even when we have not reached M yet, and this reduces the dividend payments. In the second case we wait until we reach M, and only then does money go into our pocket.

3 Hamilton-Jacobi-Bellman Equations

The use of these equations might seem a bit old-fashioned, but with the concept of viscosity solutions it is still a standard approach. For a stationary Markov process which is controlled by actions a ∈ A we consider the process under a constant (in time) action a and the corresponding generator \(\mathcal{G}^{a}\) of the resulting Markov process. If we minimize the ruin probability, the Hamilton-Jacobi-Bellman equation reads

$$\displaystyle{ 0 =\inf _{a}\mathcal{G}^{a}V (x), }$$
(18)

where V (x) stands for the value function of the problem and x = (s, p) is the (enlarged) vector of states, s ≥ 0 being the surplus. If we maximize dividend payments, it is given by the formula

$$\displaystyle{ 0 = -\delta V (s) +\sup _{a}\mathcal{G}^{a}V (x), }$$
(19)

where δ > 0 is the discount rate. But here the range of x is restricted to {(x, p): x ≤ M(p)}, where for fixed p, M(p) is the smallest point x at which V_s(x, p) = 1. For larger x the function is linear with slope 1:

$$\displaystyle{V (x,p) = V (M(p),p) + x - M(p).}$$

Notice that we neglect possible second, third, etc. bands.

In Lundberg models, Eq. (18) involves a first derivative and an expectation. Such an equation needs two boundary values to identify a unique solution. For ruin probabilities V (s) = 1 for s < 0, and so we can use the two conditions V (∞) = 0 and V ′(0) = −λ(1 − V (0))∕c. For dividend values we first use a solution with v(s) = 0 for s < 0 and v(0) = 1, v′(0) = (λ + δ)∕c, and then we minimize v′(s) (see Chap. 6).

In simple diffusion models, Eq. (18) contains a first and a second derivative. Again we need two conditions: V (0) = 1, V (∞) = 0 for the ruin probability, and V (0) = 0, V ′(M) = 1 for dividend values, where M is again the minimizer of v′(s). 

We shall frequently use a nonstationary approach, even for stationary problems. In our introductory example, we computed the infinite horizon ruin probability with such an approach: we considered the ruin probability V (s, t) after time t when starting in s at time t. We chose a large number T and used the initial guess V (s, T) = 1 if s < 0, and V (s, T) = 0 elsewhere. Using the dynamic equation for the nonstationary case, we calculated backward in t to the end t = 0, and V (s, 0) was an almost exact value for the ruin probability in the stationary model.

For this we need the dynamic equation for a nonstationary setup in the case of a stationary Markov model. This is quite simple: if \(\mathcal{G}\) is the generator of the model, then the equation is

$$\displaystyle{ 0 = \mathcal{G}V (s,t) + V _{t}(s,t). }$$
(20)

In the dividend case, there is no extra term with δ as in (19) since the discounting is modeled in the time dependence.

For cases like the volcano problem in Chap. 5, we obtain a nonstationary dynamic equation in which time dependent quantities enter.

4 Investment Control

What is the optimal investment strategy for an insurer to minimize her ruin probability? This is one of the oldest questions in the field of stochastic control in insurance; it was solved for the simple diffusion case by Browne (1995). A simple framework for this problem is a Lundberg process for the risk and a logarithmic Brownian motion for the capital market (a stock or an index) in which the insurer can invest.

Our first example is of little use in the insurance industry, but it might serve as an introduction since it shows many features which are present in other cases. We assume that the insurer earns no interest and pays no taxes, and that he can invest an unrestricted amount, i.e. unlimited leverage and short-selling are allowed. We assume in the following that the Lundberg process has parameters c (premium rate), λ (claim frequency), and X (generic claim size) with bounded density, and that c > λE[X] (positive loading).

The price process of the asset has dynamics

$$\displaystyle{dZ(t) =\mu Z(t)dt +\sigma Z(t)dW(t),\ t \geq 0,}$$

where W(t) is standard Brownian motion and μ, σ are positive.

Theorem 1

The minimal ruin probability V (s) is a classical solution to the dynamic equation

$$\displaystyle{ 0 =\inf _{A}\{\lambda E[V (s - X) - V (s)] + (c +\mu A)V '(s) + A^{2}\sigma ^{2}V ''(s)/2\},\ s> 0. }$$
(21)

The function V (s) has a continuous second derivative V ″(s) < 0 in s > 0, with lim_{s → 0} V ″(s) = −∞. The optimal amount A(s) invested at surplus s is given by A(0) = 0 and A(s) = −μV ′(s)∕(σ²V ″(s)), s > 0. 

Two different proofs are given in Hipp and Plum (2000, 2003).

Only for a few parameter combinations, such as exponential claim sizes, can A(s) or V (s) be given in explicit form.

Example 2

Let μ = σ = λ = 1 and c = 3∕2. The claim size has an exponential distribution with mean a. Then

$$\displaystyle{A(s) = \sqrt{2c/a}\sqrt{1 - e^{-2as}}.}$$

Here, A(s) is increasing and bounded, a typical behavior of the optimal investment strategy. Since unlimited leverage is forbidden for insurers, leverage has to be bounded or completely forbidden by constraints on the strategies. Such constraints can be defined state dependent, allowing a range \(\mathcal{A}(s)\) for the choice of the amount invested at surplus s. With these we can deal with the case of no restriction \(\mathcal{A}(s) = (-\infty,\infty )\), no leverage \(\mathcal{A}(s) = (-\infty,s]\), no short-selling \(\mathcal{A}(s) = [0,\infty ),\) neither leverage nor short-selling \(\mathcal{A}(s) = [0,s],\) and bounded leverage and bounded short-selling \(\mathcal{A}(s) = [-as,bs].\) The constraints change the control problem substantially: e.g. under the no leverage constraint alone there is no optimal investment strategy, since it would be optimal to invest the amount −∞ in the market (volatility hunger). Furthermore, constraints can lead to non-smoothness of the value function.

Such cases are investigated in the papers Azcue and Muler (2010), Belkina et al. (2014), Edalati (2013), Edalati and Hipp (2013), and Hipp (2015). While the proofs and arguments in these papers are all different, it is good to have a universal numerical method (see Sect. 8) which works in all these situations.

The corresponding dynamic equation for the value function reads

$$\displaystyle{0 =\inf _{A\in \mathcal{A}(s)}\{\lambda E[V (s - X) - V (s)] + (c +\mu A)V '(s) + A^{2}\sigma ^{2}V ''(s)/2\},\ s> 0.}$$

If \(\mathcal{A}(s) = [a(s),b(s)]\) is an interval, then the minimization with respect to A is easy: there are only three possible minimizers: a(s), b(s), or the unconstrained minimizer A(s). 

The resulting optimal investment strategies vary according to the claim size distribution. For the unconstrained case, we see that A(s) is

  1. bounded and converging to 1∕R, where R is the adjustment coefficient of the problem, in the small claims case (Hipp and Schmidli 2004),

  2. unbounded and increasing in the large claims case such as Weibull, Lognormal, Pareto: the larger the risk, the higher the amount invested (Schmidli 2005),

  3. asymptotically linear for Pareto claims, and

  4. very special when claims are constant (see Sect. 8).

Extensions to other models cause only minor technical problems. Interest earned on surplus or paid for loans can be implemented (see Hipp and Plum 2003).

In the case of two (correlated) stocks, a very simple model would be the dynamics

$$\displaystyle{dZ_{i}(t) = a_{i}Z_{i}(t)dt + b_{i}Z_{i}(t)dW_{i}(t),\ t \geq 0,i = 1,2,}$$

where ρ is the correlation between W_1(t) and W_2(t). If we first choose the proportions p and 1 − p in which we invest the amount A in stock 1 and stock 2, then we obtain the usual dynamic equation with μ and σ² depending on p. Taking the minimum over A, the dynamic equation retains the term

$$\displaystyle{-\frac{1} {2} \frac{\mu ^{2}V '(s)^{2}} {\sigma ^{2}V ''(s)},}$$

and since V ′(s), V ″(s) are fixed, we have to maximize μ²∕σ², which produces the well-known optimum

$$\displaystyle{p = \frac{a_{1}b_{2}^{2} - a_{2}\rho b_{1}b_{2}} {a_{1}b_{2}^{2} + a_{2}b_{1}^{2} - (a_{1} + a_{2})\rho b_{1}b_{2}}}$$

which is constant. So we indeed have investment into just one index with price process pZ_1(t) + (1 − p)Z_2(t). 
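Writing ρb_1b_2 for the covariance term, the optimum can be checked against a grid search; a sketch with arbitrary parameter values:

```python
def optimal_mix(a1, a2, b1, b2, rho):
    # stationary point of mu(p)^2 / sigma2(p) for the two-stock portfolio
    c12 = rho * b1 * b2                       # covariance of the two driving motions
    return (a1 * b2**2 - a2 * c12) / (a1 * b2**2 + a2 * b1**2 - (a1 + a2) * c12)

def ratio(p, a1, a2, b1, b2, rho):
    mu = p * a1 + (1 - p) * a2
    sigma2 = p**2 * b1**2 + (1 - p)**2 * b2**2 + 2 * p * (1 - p) * rho * b1 * b2
    return mu * mu / sigma2
```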

In other market models for the stock price Z(t), the return on investment will depend on the current price Z(t), which must be included as a state variable: for the dynamics

$$\displaystyle{dZ(t) = (\mu -Z(t))^{2}dt + Z(t)dW(t),\ t \geq 0}$$

we will do no or only little investment when Z(t) is close to μ. 

Optimal investment can also be used to maximize the company value. This leads to a similar dynamic equation in which changes for dividend payment are necessary (see Azcue and Muler 2010). For simultaneous control of investment and reinsurance, also with constraints, see Edalati (2013).

5 Reinsurance Control

Reinsurance is a most important tool for risk management in insurance. We restrict ourselves to reinsurance of single claims, disregarding stop loss reinsurance, which would call for a time discrete model. In single claims reinsurance we have a risk sharing between first insurer and reinsurer described by some function g(x) satisfying 0 ≤ g(x) ≤ x which denotes the amount paid by the first insurer; the amount paid by the reinsurer for a claim of size x is x − g(x). Let G be the set of all risk sharings on the market, and assume that there is g 0 ∈ G with g 0(x) = x (no reinsurance).

Optimal reinsurance will here be considered for minimizing the first insurer’s ruin probability. For maximizing the company value, see Azcue and Muler (2005).

Optimal control for reinsurance is done on a market in which for a risk sharing g(x) a price is specified, and this price determines the optimal strategy. If reinsurance is unaffordable on the market, then it will be optimal not to buy reinsurance. On the other hand, if reinsurance is cheap, then it might be optimal to transfer the total risk to the reinsurer and reach a position with zero ruin probability.

For this exposition of reinsurance control we take a Lundberg model for the risk process.

Assume now that a price system h(g), g ∈ G, is given which for each risk sharing defines its reinsurance price, with h(g 0) = 0. If at time t the reinsurance contract g t (x) is active, then the risk process of the first insurer is given by

$$\displaystyle{S(t) = s + ct -\int _{0}^{t}h(g_{ u})du -\sum _{1}^{N(t)}g_{ T_{i}}(X_{i}),}$$

where T 1, T 2, … are the time points at which claims occur. The generator for a fixed risk sharing g ∈ G is

$$\displaystyle{ \mathcal{G}^{g}f(s) =\lambda E[f(s - g(X)) - f(s)] + (c - h(g))f'(s). }$$
(22)

We are minimizing the infinite horizon ruin probability through dynamic reinsurance, which leads us—as in the discrete case—to a dynamic equation for the control problem, the well-known Hamilton-Jacobi-Bellman equation:

$$\displaystyle{ 0 =\inf _{g\in G}\{\lambda E[V (s - g(X)) - V (s)] + (c - h(g))V '(s)\},\ s \geq 0, }$$
(23)

with the boundary values V (∞) = 0 and V (s) = 1,  s < 0. Rearranging terms, we obtain

$$\displaystyle{ V '(s) =\sup _{g\in G:h(g)<c}\frac{\lambda E[V (s) - V (s - g(X))]} {c - h(g)} }$$
(24)

From this equation we come to the recursion

$$\displaystyle\begin{array}{rcl} V _{n+1}^{{\prime}}(s)& =& \sup _{ g\in G:h(g)<c}\frac{\lambda E[V _{n}(s) - V _{n}(s - g(X))]} {c - h(g)},{}\end{array}$$
(25)
$$\displaystyle\begin{array}{rcl} V _{n+1}(s)& =& -\int _{s}^{\infty }V _{ n+1}^{{\prime}}(x)dx,{}\end{array}$$
(26)

which produces a decreasing sequence of continuous functions converging to a solution of (23) when we start with V 1(s) = ψ(s), the infinite time ruin probability without reinsurance. This recursion is, however, not adequate for numerical computations.
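Even though this recursion is not adequate for serious numerical work, a small grid-based version of (25) and (26) illustrates its mechanics. The sketch below assumes proportional reinsurance g(x) = bx, Exp(1) claims and illustrative parameters λ = 1, c = 2, ρ = 2.5 (so that λρE[X] > c); the integral to infinity is truncated at S:

```python
import numpy as np

lam, c, rho = 1.0, 2.0, 2.5          # intensity, premium rate, loading (lam*rho*E[X] > c)
S, ds = 30.0, 0.05                   # truncation point and step for the surplus grid
s = np.arange(0.0, S + ds, ds)

def h(b):                            # price of g(x) = b*x: lam*rho*E[X - g(X)]
    return lam * rho * (1.0 - b)

bs = np.array([b for b in np.linspace(0.0, 1.0, 21) if h(b) < c])  # admissible b

xk = np.arange(400) * 0.05           # claim-size grid for X ~ Exp(1)
wk = np.exp(-xk) * (1.0 - np.exp(-0.05))   # P(x_k <= X < x_k + 0.05)

def EV(V, b):
    """E[V(s - b*X)], with V = 1 on the negative axis (ruin)."""
    out = np.empty_like(V)
    for i, si in enumerate(s):
        u = si - b * xk
        Vu = np.where(u < 0, 1.0, np.interp(np.clip(u, 0.0, S), s, V))
        out[i] = np.sum(wk * Vu) + np.exp(-20.0)   # tail x > 20: treat V = 1
    return out

V = 0.5 * np.exp(-s / 2)             # start: ruin probability without reinsurance
for n in range(5):
    Vp = np.full_like(V, -np.inf)
    for b in bs:
        Vp = np.maximum(Vp, lam * (V - EV(V, b)) / (c - h(b)))  # sup in (25)
    V = -np.cumsum(Vp[::-1])[::-1] * ds          # (26): V(s) = -int_s^inf V'(x) dx

print(V[0])                          # approximate minimal ruin probability at s = 0
```

The derivative candidates are all nonpositive, so each iterate is again a decreasing function with values in [0, 1].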

In order to obtain a nontrivial solution for our control problem, total reinsurance g(x) ≡ 0 should be expensive in the sense that h(g) > c for this g. Otherwise total reinsurance would be affordable and would yield a ruin probability of zero for the first insurer.

In this paper, we will restrict ourselves to reinsurance prices computed as a loaded expectation:

$$\displaystyle{h(g) =\lambda \rho E[X - g(X)],}$$

where λρE[X] > c. 

Common reinsurance forms are

  1. 1.

    proportional reinsurance with g(x) = bx,  0 ≤ b ≤ 1, 

  2. 2.

    unlimited XL reinsurance with g(x) = x − (x − M)+ = min(x, M), 0 ≤ M, and

  3. 3.

    limited XL reinsurance with g(x) = x − min((x − M)+, L), 0 ≤ M, L. 

XL is the usual shorthand for excess of loss. The numbers M and L are called priority and limit, respectively.
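For Exp(1) claims the prices h(g) = λρE[X − g(X)] of these three forms have simple closed expressions. A minimal sketch; the parameter values λ = 1, ρ = 2.5 and the arguments are illustrative assumptions:

```python
import numpy as np

# Prices h(g) = lam * rho * E[X - g(X)] for X ~ Exp(1); lam, rho illustrative.
lam, rho = 1.0, 2.5

def price_proportional(b):
    # g(x) = b*x, so E[X - g(X)] = (1 - b) * E[X] = 1 - b
    return lam * rho * (1.0 - b)

def price_xl(M):
    # first insurer keeps min(x, M); E[(X - M)+] = exp(-M)
    return lam * rho * np.exp(-M)

def price_limited_xl(M, L):
    # reinsurer pays min((x - M)+, L); expectation = exp(-M) - exp(-(M + L))
    return lam * rho * (np.exp(-M) - np.exp(-(M + L)))

print(price_proportional(0.8), price_xl(1.0), price_limited_xl(1.0, 2.0))
```

As expected, the limited XL price is always below the unlimited XL price with the same priority and approaches it as L grows.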

Under the above pricing formula, static proportional reinsurance (which is constant over time) does not decrease the first insurer’s ruin probability. However, in dynamic control, even expensive proportional reinsurance can reduce the ruin probability.

The unlimited XL reinsurance is optimal in the static situation in the following sense: if g(x) is an arbitrary risk sharing function and g M (x) an unlimited XL risk sharing with E[g(X)] = E[g M (X)], then the first insurer’s ruin probability with g M is smaller than the ruin probability with g. Unlimited XL reinsurance is illiquid and/or expensive on reinsurance markets; more common are limited XL forms. These also have some optimality in the static situation: if g is an arbitrary risk sharing with x − g(x) ≤ L and, for some M, g M, L a limited XL reinsurance with E[g(X)] = E[g M, L (X)], then the first insurer’s ruin probability with g M, L is smaller than the ruin probability with g. 

Optimal dynamic reinsurance strategies often take the position of no reinsurance when the surplus is small. For proportional reinsurance this was shown by Schmidli (see Schmidli 2007, Lemma 2.14) under the assumption that the price function h(b) satisfies liminf b → 0(c − h(b))∕b > 0. 

A similar result holds for unlimited XL reinsurance: If h(M) is continuous at M = 0, then there exists M 0 > 0 for which h(M) > c for all 0 ≤ M ≤ M 0. Choose s ≤ M 0. The supremum in (24) is then taken over M > M 0 ≥ s. For s < M

$$\displaystyle{E[V (s -\min (X,M))] = \mathbb{P}\{X \geq s\} + E[V (s - X)1_{(X\leq s)}]}$$

does not depend on M, so the supremum in (24) is attained at M = ∞ (no reinsurance). For more details, see Hipp and Vogt (2003).

For limited XL reinsurance, for small surplus s we will see an optimal reinsurance strategy with M and L as well as a price h(M, L) close to but not at zero.

Example 3

We consider a delayed compound Poisson process which has an exponential first waiting time T with mean β = 1 in which no claims occur; after time T the claims arrival is a Poisson process with constant intensity λ = 1. The claim sizes also have an exponential distribution with mean 1, and the premium rate is c = 2. What is the optimal dynamic unlimited XL reinsurance which minimizes the ruin probability?

Volcanos show long waiting times between periods with frequent seismic waves. One could model claims caused by these waves as above.

The standard approach for a solution would be to solve the corresponding Hamilton-Jacobi-Bellman equation (16) for the value function V (s, p), where p(t) is the conditional probability of λ(t) = λ, given S(u), u ≤ t. But we cannot solve the equation with V (s, 1) as boundary condition since the factor of V p (s, p) is zero when p = 1. Since we know λ(t) = λ after the first claim, we only need the optimal reinsurance strategy up to the first claim. Given no claim up to time t, the function p(t) has the derivative given in (14), which yields p(t) = t∕(1 + t). We use a nonstationary approach.

This seems to work well for ruin without reinsurance: let ψ(s) be the ruin probability for the uncontrolled Lundberg process with intensity λ, ψ(s) = exp(−s∕2)∕2. From E[ψ(s − X)] = 2ψ(s) and ψ′(s) = −ψ(s)∕2 we can see that the separation of variables works: for V (s, t) = f(t)ψ(s) the dynamic equation

$$\displaystyle{0 = p(t)E[\psi (s - X) - V (s,t)] + cV _{s}(s,t) + V _{t}(s,t)}$$

yields

$$\displaystyle{p(t)(2 - f(t)) - f(t) + f'(t) = 0,\ t \geq 0,\ f(\infty ) = 1,}$$

with the solution

$$\displaystyle{f(t) = \frac{1} {2} \frac{1 + 2t} {1 + t} }$$

and the value f(0) = 1∕2. So, V (s, t) = (1 + p(t))e −s∕2∕4. These exact values can be reproduced numerically with T = 300, ds = 0.01 and dt = 0.001. 
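The closed form for f can also be checked directly: the sketch below verifies numerically that f(t) = (1 + 2t)∕(2(1 + t)) satisfies the ordinary differential equation above with p(t) = t∕(1 + t), together with f(0) = 1∕2 and f(∞) = 1:

```python
import numpy as np

# Check that f(t) = (1 + 2t) / (2(1 + t)) solves
# p(t)(2 - f(t)) - f(t) + f'(t) = 0 with p(t) = t/(1+t).
f = lambda t: (1 + 2 * t) / (2 * (1 + t))
p = lambda t: t / (1 + t)

t = np.linspace(0.0, 50.0, 5001)
h = 1e-5
fp = (f(t + h) - f(t - h)) / (2 * h)           # central difference for f'
residual = p(t) * (2 - f(t)) - f(t) + fp

print(np.max(np.abs(residual)))  # should be close to zero (analytically it vanishes)
print(f(0.0))                    # 0.5, the value f(0) = 1/2
```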

With reinsurance we consider the value function V (s, t) and its dynamic equation

$$\displaystyle{0 =\sup _{M}\{p(t)E[V _{1}(s - g_{M}(X)) - V (s,t)] + (2 - h(M))V _{s}(s,t) + V _{t}(s,t)\},}$$

where g M (X) = min(X, M) and h(M) = 2ρE[X − g M (X)] is the reinsurance price for priority M, and V 1(s) is the value function for the problem with constant intensity 1. The optimal priority M(s, t) is derived from maximizing

$$\displaystyle{p(t)E[V _{1}(s - g_{M}(X))] + (2 - h(M))V _{s}(s,t).}$$

For large T we start with V (s, T) = V 1(s), and calculate backwards to t = 0 using the recursion

$$\displaystyle{ V (s,t-dt) = V (s,t)+dt\{p(t)E[V (s,t)-V _{1}(s-g_{M}(X))]-(c-h(M))V _{s}(s,t)\} }$$
(27)

in which M = M(s, t) is the optimal priority. The parameter for reinsurance is ρ = 1.1. Of course, no reinsurance is optimal for all s ≥ 0 when t = 0. We see six priority curves M(s, t), 0.2 ≤ s ≤ 2, for t = 0.005, 0.025, 0.045, 0.095, 0.17, 300 (from the right) (Fig. 1). The curves do not intersect; for smaller t we transfer less risk to the reinsurer. In particular, the interval without reinsurance decreases with t from [0, 1.47] to [0, 0.23]. 

Fig. 1
figure 1

M(s, t) for t = 0.005, 0.025, 0.045, 0.095, 0.17, 300

For more general Markov switching models one could perhaps adopt the above approach. Starting with a given initial probability vector at time 0, we can compute the filter p(t) for the time without claims. Assume the vectors p(t) converge to p ∞. Since the control problem with initial distribution p ∞ can be solved easily, we can use the corresponding value function V 0(s) as V (s, ∞); so again we would start at some large T instead of ∞, and would compute backward to t = 0 with the appropriate dynamic equation.

6 Dividend Control

Management decisions in insurance, such as reinsurance or investment, have an impact on the company value, and control of investment and reinsurance can be done with the objective to maximize this value. Since the company value is itself the result of a control problem, maximizing it by investment or reinsurance is a control problem with two (or more) control variables: dividends together with investment and/or reinsurance. For simplification we restrict ourselves to dividend strategies which are barrier strategies.

Azcue and Muler (2005, 2010) solve the problems for reinsurance and investment. They mainly characterize the value function as a solution to the dynamic equation, without showing numerical results. For applications in insurance it might be interesting to see whether reinsurance can increase the company value. For reinsurance one has to pay reinsurance premia, and this will reduce the value. But the reduction can be compensated by the reduction of the ruin probability or by an increase of the time to ruin for the company. The answer to this question will depend on the relation between the premium rate c and the reinsurance premia, as well as on the discount rate δ (a large δ reduces the effect of a longer time to ruin). We will present some numerical examples in Sect. 8.

For company values V (s) in a simple diffusion model the initial value is V (0) = 0. For Lundberg models V (0) is positive and known only in the trivial case when all surplus and premia are paid out immediately, i.e. V (0) = c∕(λ + δ) (see Schmidli 2007, Sect. 2.4.2). The starting value in the general case (with control) can be found exactly as in the case without control: first compute a solution v(s) of the dynamic equation with v(0) = 1, then define the barrier M as M = argmin v′(s), and finally

$$\displaystyle{V (0) = v(0)/v'(M) = 1/v'(M).}$$
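For the Lundberg model with Exp(1) claims this procedure can be sketched numerically: integrate the dividend dynamic equation δv(s) = λE[v(s − X) − v(s)] + cv′(s) forward from v(0) = 1, locate the barrier M = argmin v′(s), and set V (0) = 1∕v′(M). The parameters λ = 1, c = 2, δ = 0.1 below are illustrative assumptions:

```python
import numpy as np

lam, c, delta = 1.0, 2.0, 0.1        # illustrative parameters, Exp(1) claims
S, ds = 20.0, 0.005
s = np.arange(0.0, S + ds, ds)
n = len(s)

v = np.empty(n); vp = np.empty(n)
v[0] = 1.0
vp[0] = (delta + lam) * v[0] / c     # from the equation at s = 0 (v = 0 below 0)
J = 0.0                              # running integral int_0^s v(u) e^u du
for i in range(1, n):
    v[i] = v[i-1] + ds * vp[i-1]     # Euler step
    J += 0.5 * ds * (v[i-1] * np.exp(s[i-1]) + v[i] * np.exp(s[i]))
    Ev = np.exp(-s[i]) * J           # E[v(s - X)] for X ~ Exp(1)
    vp[i] = (delta * v[i] + lam * (v[i] - Ev)) / c

iM = int(np.argmin(vp))              # barrier M = argmin v'(s)
M, V0 = s[iM], v[0] / vp[iM]
print(M, V0)                         # barrier and company value at s = 0
```

With these parameters v is a combination of two exponentials, v′ has an interior minimum, and V (0) exceeds the trivial lower bound c∕(λ + δ), as it should.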

For the computation of company values with ruin constraint we suggest the Lagrange multiplier method and a nonstationary approach. For the nonstationary approach we consider dividend payments and ruin probabilities after time t:

$$\displaystyle\begin{array}{rcl} V ^{D}(s,t)& =& E\left [\int _{ t}^{\infty }e^{-\delta u}dD(u)\vert S(t) = s\right ] {}\\ & & -L\mathbb{P}\{S^{D}(u) <0\ \mbox{ for some}\ u \geq t\vert S(t) = s\}, {}\\ V (s,t)& =& \sup _{D}V ^{D}(s,t), {}\\ V (s,\infty )& =& -L\psi (s). {}\\ \end{array}$$

Here, ψ(s) is the ruin probability without dividends, and S D(u) the risk process with dividends which, from time t until time u, add up to D(u). The last relation inspires the following method for computation: start at a large number T, take as initial value the function V (s, T) = −Lψ(s), and then compute backward until t = 0 using the non-stationary dynamic equations, modified for dividend payment.

The equations for the backward computation are

$$\displaystyle\begin{array}{rcl} M(t)& =& \min \{s: V _{s}(s,t) =\exp (-\delta t)\}, {}\\ V (s,t)& =& V (s,t + dt) - dt\mathcal{G}V (s,t + dt),\ s \leq M(t), {}\\ V (s,t)& =& V (M(t),t) + (s - M(t))\exp (-\delta t),\ s> M(t). {}\\ \end{array}$$

For a generator involving V ″(s, t), which is the case for the simple diffusion model, we add the boundary condition V (0, t) = −L. For other models we get V (0, t) from V (0, t + dt). 

The nonstationary approach deals with partial differential equations for which we most often have to use different discretisations for time and state. The right choice of discretisations is a major problem in the context of these dividend problems (see Sect. 8).

In Sect. 2 an improvement approach was mentioned for the optimal dividend problem with ruin constraint. This was presented in Hipp (2016); however, the method is not sufficiently convincing to be a standard for the numerical computation of the value function in this problem. It might help to find reasonable sub-solutions; it is a method for patient owners of fast computers.

Improvement Approach Assume we have a function V n (s, α) which is the dividend value for initial surplus s of a strategy which has a ruin probability not exceeding α. We fix B > s and wait without paying dividends until we reach B. We will reach B before ruin with probability A = (1 −ψ(s))∕(1 −ψ(B)), where ψ(x) is the ruin probability without dividends with initial surplus x. At B (we have no upward jumps) we start paying dividends with a strategy corresponding to a ruin probability a(B) having dividend value V n (B, a(B)). The ruin probability of this strategy is 1 − A + Aa(B), and the dividend value is the number V n (B, a(B)), discounted to zero. Let τ be the waiting time to reach B, and v(s) be the solution of the equation \(\delta v(s) = \mathcal{G}v(s)\) (normalized by v(0) = 1 in the Lundberg model, and with v(0) = 0, v′(0) = 1 in the diffusion model), where δ is the discount rate and \(\mathcal{G}\) the generator of the underlying stationary Markov process:

$$\displaystyle\begin{array}{rcl} \delta v(s)& =& \lambda E[v(s - X) - v(s)] + cv'(s)\ \mbox{ for the Lundberg process} {}\\ \delta v(s)& =& \mu v'(s) +\sigma ^{2}v''(s)/2\ \mbox{ for the simple diffusion model.} {}\\ \end{array}$$

Then

$$\displaystyle{ E[\exp (-\delta \tau )] = v(s)/v(B). }$$
(28)

If we define a(B) from the equation

$$\displaystyle{1 - A + Aa(B) =\alpha,}$$

then our dividend strategy has ruin probability α and dividend value

$$\displaystyle{V _{n}(B,a(B))v(s)/v(B).}$$

For B ↓ s we obtain the limit V n (s, α), so a new dividend value function which is an improvement over V n (s, α) can be defined:

$$\displaystyle{ V _{n+1}(s,\alpha ) =\sup _{B>s}V _{n}(B,a(B))v(s)/v(B). }$$
(29)

In each iteration step, we have to compute the V -function for all s ≥ 0 and ψ(s) ≤ α ≤ 1. And it has to be done on a fine grid. This causes long computation times.

One can start with the function V 1(s, α) = s − s(α), where s(α) is defined through

$$\displaystyle{\psi (s(\alpha )) =\alpha.}$$

The strategy for this value is: pay out the lump sum s − s(α) at time 0 and stop paying dividends forever. One should also try other initial functions which are closer to the true function, such as V 1(s, α) = V 0(s − s(α)) in the simple diffusion model. For the Lundberg model, one can similarly use the function V 0(s), the company value without ruin constraint, but s(α) has to be replaced by a number s 1(α) defined via the equation

$$\displaystyle{E[\psi (s - Y )] =\alpha,}$$

where Y is the deficit at ruin in the process without dividends. For exponential claims, we can replace Y by X (see Hipp 2016). Notice that s can be smaller than s 1(α), in which case the initial value could be V 1(s, α) = 0 or V 1(s, α) = s − s(α). 
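For the Lundberg example with ψ(s) = exp(−s∕2)∕2 the starting function can be written down explicitly. A minimal sketch; capping the value at zero for s < s(α) is an assumption in line with the remark above:

```python
import math

def s_alpha(alpha):
    """Solve psi(s(alpha)) = alpha for psi(s) = exp(-s/2)/2, 0 < alpha <= 1/2."""
    return -2.0 * math.log(2.0 * alpha)

def V1(s, alpha):
    """Pay the lump sum s - s(alpha) at time 0, then never pay dividends again."""
    return max(s - s_alpha(alpha), 0.0)

print(s_alpha(0.5), V1(3.0, 0.25))   # s(1/2) = 0; V1 = 3 - 2*ln(2)
```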

7 Viscosity Solutions

In many control problems, the value function can be characterized as the unique viscosity solution to the classical Hamilton-Jacobi-Bellman equation. More importantly, this characterization helps in proving convergence of numerical methods (discretizations).

The concept of viscosity solutions—introduced in the early 1980s—is well known today, but still not well enough understood. It is not a subject in most lectures on stochastic processes and control. There are various attempts to make the concept more popular: the famous User’s guide of Crandall et al. (1992) as well as the books by Fleming and Soner (2006) and Pham (2009). We aim at a better understanding of the concept and properties of viscosity solutions, and of their use in the proof of convergence of Euler type discretization schemes for a Hamilton-Jacobi-Bellman equation. This use is based on the fact that upper and lower limits of discretization schemes are viscosity solutions.

In particular we try to provide

  1. 1.

    a better understanding of the Crandall-Ishii maximum principle

  2. 2.

    a proof for the comparison argument which uses V (0) and V ′(0)

  3. 3.

    an understanding that the concept, being rather technical, is of major importance for applications (numerics and understanding control problems).

For this, we think that a complete and detailed proof for the Crandall-Ishii comparison argument should be included in this section, although for smooth reading one would transfer the proof to an appendix.

Value functions are not always smooth; the viscosity concept is useful for dealing with such value functions. Here are two figures from optimization problems with singular value functions; they come from the optimal investment problem with constraint sets \(\mathcal{A}(s)\): the amount A(s) invested in stock must lie in \(\mathcal{A}(s)\) when we are in state s. In both figures the blue line shows the proportion A(s)∕s invested, while the black line is the first derivative of the value function V (s) (Figs. 2 and 3).

Fig. 2
figure 2

\(\mathcal{A}(s) =\{ 0\},\) \(s <1,\mathcal{A}(s) = [0,\infty ),s \geq 1\)

Fig. 3
figure 3

\(\mathcal{A}(s) = [0,\infty ),\) \(s <1,\mathcal{A}(s) =\{ 0\},s \geq 1\)

The dynamic equation for our control problem, valid for s > 0, is

$$\displaystyle{ 0 =\sup _{A\in \mathcal{A}(s)}\left \{\lambda E[V (s - U) - V (s)] + (c + A)V '(s) + A^{2}V ''(s)/2\right \}. }$$

Because of the above examples there is no hope for the statement: the value function is the unique smooth solution of the above dynamic equation. Instead one can try to prove that the value function is the unique viscosity solution of the above HJB. For this section we will always consider the optimal investment problem with constraints; in particular, sub-solutions, super-solutions, and viscosity solutions are always defined with respect to the above HJB.

Definition 4

A function V (s), s ≥ 0, is a viscosity super-solution at s > 0 if for every ϕ(x) ∈ C 2 with V (x) ≥ ϕ(x) for which V (x) −ϕ(x) has a local minimum at s

$$\displaystyle{\sup _{A\in \mathcal{A}(s)}\left \{\lambda E[V (s - U) - V (s)] + (c + A)\phi '(s) + A^{2}\phi ''(s)/2\right \} \leq 0.}$$

V (s) is a viscosity sub-solution at s > 0 if for every ϕ(x) ∈ C 2 with V (x) ≤ ϕ(x) for which V (x) −ϕ(x) has a local maximum at s

$$\displaystyle{\sup _{A\in \mathcal{A}(s)}\left \{\lambda E[V (s - U) - V (s)] + (c + A)\phi '(s) + A^{2}\phi ''(s)/2\right \} \geq 0.}$$

V (s) is a viscosity solution if it is a super- and sub-solution at all s > 0. 

An equivalent definition using sub- and superjets is

Definition 5

V (s) is a viscosity super-solution at s > 0 if for V (s + h) ≤ V (s) + ah + bh 2∕2 + o(h 2) we have

$$\displaystyle{\sup _{A\in \mathcal{A}(s)}\left \{\lambda E[V (s - U) - V (s)] + (c + A)a + A^{2}b/2\right \} \leq 0.}$$

V (s) is a viscosity sub-solution at s > 0 if for V (s + h) ≥ V (s) + ah + bh 2∕2 + o(h 2)

$$\displaystyle{\sup _{A\in \mathcal{A}(s)}\left \{\lambda E[V (s - U) - V (s)] + (c + A)a + A^{2}b/2\right \} \geq 0.}$$

(a, b) are called 2nd order sub- and super-jet of V (x) at s. 

The concept of viscosity solutions is important for numerical methods which are based on Euler type discretisations of the dynamic equation. The discretized version of the value function \(V _{\Delta }(s)\) is the numerical solution for step size \(\Delta> 0\) which, at \(s = k\Delta,\) is defined from

$$\displaystyle{0 =\sup _{A\in \mathcal{A}(s)}\left \{\lambda (g_{\Delta }(s) - V _{\Delta }(s)) + (c + A)V _{\Delta }^{{\prime}}(s) + A^{2}V _{ \Delta }^{{\prime\prime}}(s)/2\right \},}$$

with

$$\displaystyle{g_{\Delta }(s) =\sum _{ i=1}^{k}V _{ \Delta }((k - i)\Delta )\mathbb{P}\{(i - 1)\Delta \leq X <i\Delta \}.}$$
$$\displaystyle{V _{\Delta }^{{\prime}}(s) = (V _{ \Delta }(s + \Delta ) - V _{\Delta }(s))/\Delta,}$$
$$\displaystyle{V _{\Delta }^{{\prime\prime}}(s) = (V _{ \Delta }^{{\prime}}(s) - V _{ \Delta }^{{\prime}}(s - \Delta ))/\Delta.}$$

Its computation is possible via the recursion:

$$\displaystyle{V _{\Delta }^{{\prime}}(s) =\inf _{ A\in \mathcal{A}(s)}\frac{\lambda \Delta (V _{\Delta }(s) - g_{\Delta }(s)) + A^{2}V _{\Delta }^{{\prime}}(s - \Delta )/2} {\Delta (c + A) + A^{2}/2} }$$

Then

$$\displaystyle{V ^{{\ast}}(x) = \lim \sup \nolimits _{ s=k\Delta \rightarrow x,\Delta \rightarrow 0}V _{\Delta }(s)}$$

is a viscosity sub-solution, while

$$\displaystyle{V _{{\ast}}(x) = \lim \inf \nolimits _{s=k\Delta \rightarrow x,\Delta \rightarrow 0}V _{\Delta }(s)}$$

is a viscosity super-solution of the dynamic equation. The convergence is a strong convergence concept: it implies uniform convergence on compact sets.

A convergence proof (see Chap. IX of Fleming and Soner 2006) can now be very short: since a sub-solution can never be larger than a super-solution, we have \(V^{\ast}(x) \leq V_{\ast}(x)\). Since \(V_{\ast}(x) \leq V^{\ast}(x)\) by definition, we have equality. For the above inequality between the sub- and the super-solution one uses the famous Crandall-Ishii maximum principle which we discuss later. First we give a proof for the sub-solution property of \(V^{\ast}(s) =\limsup V_{\Delta }(s)\): 

Proof

Let ϕ(x) ∈ C 2 for which \(V^{\ast}(x) -\phi (x)\) has a strict local maximum at s 0. With \(\phi _{\Delta }(s)\) being the restriction of ϕ(x) to the \(\Delta\)-grid we define

$$\displaystyle{s_{\Delta } = \arg \max \nolimits _{s=k\Delta \geq 0}V _{\Delta }(s) -\phi _{\Delta }(s).}$$

Then

$$\displaystyle{V _{\Delta }(s_{\Delta }) -\phi _{\Delta }(s_{\Delta }) \geq V _{\Delta }(s_{\Delta } \pm \Delta ) -\phi _{\Delta }(s_{\Delta } \pm \Delta ),}$$

and so \(V _{\Delta }^{{\prime}}(s_{\Delta }) \leq \phi _{\Delta }^{{\prime}}(s_{\Delta })\) and \(V _{\Delta }^{{\prime\prime}}(s_{\Delta }) \leq \phi _{\Delta }^{{\prime\prime}}(s_{\Delta })\). We can find a sequence \(\Delta _{n}\) for which \(s_{\Delta _{n}} \rightarrow s\) and \(V _{\Delta _{n}}(s_{\Delta _{n}}) \rightarrow V ^{{\ast}}(s).\) Recall that \(V _{\Delta }(s)\) is a solution to the discretised dynamic equation

$$\displaystyle{0 =\sup _{A\in \mathcal{A}(s)}\left \{\lambda (g_{\Delta }(s) - V _{\Delta }(s)) + (c + A)V _{\Delta }^{{\prime}}(s) + A^{2}V _{ \Delta }^{{\prime\prime}}(s)/2\right \}.}$$

Now let \(s = s_{\Delta }\) and \(\Delta = \Delta _{n}\) with \(n \rightarrow \infty \). Then the first term in the brackets has only limits \(\leq \lambda E[V^{\ast}(s - U)]\) (by Fatou’s lemma), the second term in the brackets has limit \(-\lambda V^{\ast}(s)\), the third term in the brackets has limits ≤ (c + A)ϕ′(s), and the last term in the brackets has limits ≤ A 2 ϕ″(s)∕2. So

$$\displaystyle{0 \leq \sup _{A\in \mathcal{A}(s)}\left \{\lambda E[V ^{{\ast}}(s - U) - V ^{{\ast}}(s)] + (c + A)\phi ^{{\prime}}(s) + A^{2}\phi ^{{\prime\prime}}(s)/2\right \},}$$

which is the desired result for a sub-solution. □

The inequality sub-solution ≤ super-solution is based on the famous maximum principle.

Theorem 2

Assume that \(\mathbb{P}\{U> x\}> 0\) for all x > 0, and that the constraints \(\mathcal{A}(x)\) are intervals [a(x), b(x)] with Lipschitz functions a(x), b(x) satisfying b(x) > 0, x > 0. Let v(x), w(x) with v(0) ≤ w(0) be locally Lipschitz, v(x) a sub-solution and w(x) a super-solution of our dynamic equation. Assume that v(x) − w(x) has a strict local maximum in (0, ∞). Then v(x) ≤ w(x) for all x ≥ 0. 

This statement is concerned with the values v(x), w(x) for x > 0. We define v(x) = w(x) = 0 for x < 0 and note that \(\mathbb{P}\{U \leq 0\} = 0.\) We shall first give a simple proof for the case that the functions v(x) and w(x) have continuous second derivatives.

Proof

Simple version: Assume that v(x), w(x) are twice differentiable on (0, ∞) and that v(x) − w(x) has a global maximum x ∗ in (0, K). For ξ > 0 let

$$\displaystyle{(x_{\xi },y_{\xi }) =\arg \max _{0<x,y<K}v(x) - w(y) -\xi (x - y)^{2}.}$$

Then v′(x ξ ) = w′(y ξ ) = 2ξ(x ξ − y ξ ), and v″(x ξ ) ≤ w″(y ξ ). For ξ →∞ we have x ξ → x ∗ and y ξ → x ∗ and furthermore ξ(x ξ − y ξ )2 → 0. Define

$$\displaystyle{H_{1}(A) =\lambda E[v(x_{\xi } - U) - v(x_{\xi })] + (c + A)v'(x_{\xi }) + A^{2}v''(x_{\xi })/2,}$$
$$\displaystyle{H_{2}(A) =\lambda E[w(y_{\xi } - U) - w(y_{\xi })] + (c + A)w'(y_{\xi }) + A^{2}w''(y_{\xi })/2.}$$

Then

$$\displaystyle{\sup _{A\in \mathcal{A}(x_{\xi })}H_{1}(A) \geq 0\ \mbox{ and }\sup _{A\in \mathcal{A}(y_{\xi })}H_{2}(A) \leq 0.}$$

So there is \(A_{\xi } \in \mathcal{A}(x_{\xi })\) and \(B_{\xi } \in \mathcal{A}(y_{\xi })\) with

$$\displaystyle{\vert A_{\xi } - B_{\xi }\vert \leq L\vert x_{\xi } - y_{\xi }\vert }$$

where L is the Lipschitz constant, giving

$$\displaystyle\begin{array}{rcl} H_{1}(A_{\xi }) - H_{2}(B_{\xi })& =& I(1) + I(2) + I(3) \geq 0, {}\\ I(1)& =& \lambda E[v(x_{\xi } - U) - w(y_{\xi } - U)] -\lambda (v(x_{\xi }) - w(y_{\xi })), {}\\ I(2)& =& (c + A_{\xi })v'(x_{\xi }) - (c + B_{\xi })w'(y_{\xi }), {}\\ I(3)& =& A_{\xi }^{2}v''(x_{\xi })/2 - B_{\xi }^{2}w''(y_{\xi })/2. {}\\ \end{array}$$

Now

$$\displaystyle\begin{array}{rcl} I(2)& =& (A_{\xi } - B_{\xi })2\xi (x_{\xi } - y_{\xi }) \leq 2L(x_{\xi } - y_{\xi })^{2} \rightarrow 0,\ \xi \rightarrow \infty. {}\\ I(3)& \leq & 2\xi (A_{\xi } - B_{\xi })^{2} \leq 2L^{2}\xi (x_{\xi } - y_{\xi })^{2} \rightarrow 0. {}\\ \end{array}$$

With f(h) = v(x ξ + Ah) − w(y ξ + Bh) −ξ(x ξ + Ah − y ξ − Bh)2 we have f(h) ≤ f(0) and so f″(0) ≤ 0, i.e.

$$\displaystyle{A^{2}v''(x_{\xi }) - B^{2}w''(y_{\xi }) \leq 2\xi (A - B)^{2}.}$$

This yields

$$\displaystyle{I(1) \rightarrow \lambda E[v(x^{{\ast}}- U) - w(x^{{\ast}}- U)] -\lambda (v(x^{{\ast}}) - w(x^{{\ast}}))}$$
$$\displaystyle{\leq \lambda M(\mathbb{P}\{U \leq x^{{\ast}}\}- 1) <0,}$$

with M = v(x ∗ ) − w(x ∗ ), a contradiction. □

Here is a proof without derivatives. It is clearly inspired by the proof given in the User’s guide, with some modifications.

Proof

First we restrict the argument x to a finite interval (0, K) containing a global maximum x ∗ with v(x ∗ ) − w(x ∗ ) = M > 0. For n > 0 and 0 < x < K define

$$\displaystyle{ v_{n}(x) =\sup _{\hat{x}\in [0,K]}v(\hat{x}) - n^{2}(x -\hat{ x})^{2}. }$$
(30)

These functions are semiconvex (i.e., v n (x) + Sx 2 is convex for some S > 0). Similarly, for n > 0 we define

$$\displaystyle{w_{n}(y) =\inf _{\hat{y}\in [0,K]}w(\hat{y}) + n^{2}(y -\hat{ y})^{2},}$$

which is semiconcave (w n (y) − Sy 2 concave for some S). We have

$$\displaystyle{0 \leq v(x) - v_{n}(x) \leq L^{2}/n^{2}\ \mbox{ and}\ \ 0 \leq w_{ n}(y) - w(y) \leq L^{2}/n^{2}.}$$

The functions v n (x), w n (y) are twice differentiable almost everywhere (according to Alexandrov’s theorem, see Crandall et al. 1992, Theorem A.2, p. 56, with a 1.5 pp proof).

Now let \(\overline{x},\overline{y}\) be given at which we have second derivatives for v n (x), w n (y). Let \(\hat{x}\) be the maximizer in (30), i.e. satisfying \(v_{n}(\overline{x}) = v(\hat{x}) - n^{2}(\overline{x} -\hat{ x})^{2},\) and denote the similar point for w n (x) and \(\overline{y}\) by \(\hat{y}.\) For notational convenience we omitted the dependence on n. 

Then for small enough h we have \(v_{n}(\overline{x} + h) \geq v(\hat{x} + h) - n^{2}(\overline{x} -\hat{ x})^{2}\) and then

$$\displaystyle{v(\hat{x} + h) \leq v(\hat{x}) + hv_{n}^{{\prime}}(\overline{x}) + h^{2}v_{ n}^{{\prime\prime}}(\overline{x})/2 + o(h^{2}).}$$

Similarly,

$$\displaystyle{w(\hat{y} + h) \geq w(\hat{y}) + hw_{n}^{{\prime}}(\overline{y}) + h^{2}w_{ n}^{{\prime\prime}}(\overline{y})/2 + o(h^{2}),}$$

which implies the two inequalities

$$\displaystyle{\sup _{A\in \mathcal{A}(\hat{x})}\left \{\lambda E[v(\hat{x} - U) - v(\hat{x})] + (c + A)v_{n}^{{\prime}}(\overline{x}) + A^{2}v_{ n}^{{\prime\prime}}(\overline{x})/2\right \} \geq 0,}$$
$$\displaystyle{\sup _{A\in \mathcal{A}(\hat{y})}\left \{\lambda E[w(\hat{y} - U) - w(\hat{y})] + (c + A)w_{n}^{{\prime}}(\overline{y}) + A^{2}w_{ n}^{{\prime\prime}}(\overline{y})/2\right \} \leq 0,}$$

Finally we apply Jensen’s Lemma for semiconvex functions (Lemma A.3 in Crandall et al. 1992), which in our special situation reads

Lemma 7

Let r > 0 and δ > 0 be arbitrary. Then the set of (x ∗ , y ∗ ) with | | (x ∗ , y ∗ ) − (x ξ , y ξ ) | | < δ for which

$$\displaystyle{v_{n}(x) - w_{n}(y) -\xi (x - y)^{2} - p_{ 1}x - p_{2}y}$$

is maximized at (x ∗ , y ∗ ) for some p 1, p 2 with p 1 2 + p 2 2 < r has positive measure.

For ξ > 0 let

$$\displaystyle{(x_{\xi },y_{\xi }) = \arg \max \nolimits _{x,y\in [0,K]}\{v_{n}(x) - w_{n}(y) -\xi (x - y)^{2} + p_{1}x - p_{2}y\}}$$

with p 1 2 + p 2 2 small for which the second derivatives of v n and w n exist at x ξ and y ξ , respectively. (x ξ , y ξ ) depends on ξ, p, n. 

For some \(A \in \mathcal{A}(\hat{x}_{\xi })\)

$$\displaystyle{I(A) =\lambda E[v(\hat{x}_{\xi } - U) - v(\hat{x}_{\xi })] + (c + A)v_{n}^{{\prime}}(x_{\xi }) + A^{2}v_{ n}^{{\prime\prime}}(x_{\xi })/2 \geq 0.}$$

For all \(B \in \mathcal{A}(\hat{y}_{\xi })\)

$$\displaystyle{I(B) =\lambda E[w(\hat{y}_{\xi } - U) - w(\hat{y}_{\xi })] + (c + B)w_{n}^{{\prime}}(y_{\xi }) + B^{2}w_{ n}^{{\prime\prime}}(y_{\xi })/2 \leq 0.}$$

The difference I(A) − I(B) is non-negative for some \(A_{\xi } \in \mathcal{A}(\hat{x}_{\xi })\) and \(B_{\xi } \in \mathcal{A}(\hat{y}_{\xi })\) satisfying

$$\displaystyle{\vert A_{\xi } - B_{\xi }\vert \leq L\vert \hat{x}_{\xi } -\hat{ y}_{\xi }\vert.}$$

We now let p → 0, n →∞, ξ →∞ in this order! The difference consists of three terms:

$$\displaystyle\begin{array}{rcl} I(1)& =& \lambda E[v(\hat{x}_{\xi } - U) - v(\hat{x}_{\xi })] -\lambda E[w(\hat{y}_{\xi } - U) - w(\hat{y}_{\xi })], {}\\ I(2)& =& (c + A_{\xi })v_{n}^{{\prime}}(x_{\xi }) - (c + B_{\xi })w_{n}^{{\prime}}(y_{\xi }), {}\\ I(3)& =& A_{\xi }^{2}v_{n}^{{\prime\prime}}(x_{\xi })/2 - B_{\xi }^{2}w_{n}^{{\prime\prime}}(y_{\xi })/2, {}\\ v_{n}^{{\prime}}(x_{\xi })& =& 2\xi (x_{\xi } - y_{\xi }) - p_{1}, {}\\ w_{n}^{{\prime}}(y_{\xi })& =& 2\xi (x_{\xi } - y_{\xi }) - p_{2}. {}\\ \end{array}$$

We have

$$\displaystyle{\vert I(2)\vert \leq c\vert \vert p\vert \vert + 2\xi \vert \hat{x}_{\xi } -\hat{ y}_{\xi }\vert \vert x_{\xi } - y_{\xi }\vert }$$

converges to zero for p → 0, n →∞, ξ →∞.

The argument in the proof with second derivatives leads to

$$\displaystyle{\vert I(3)\vert \leq 2\xi (A_{\xi } - B_{\xi })^{2} \leq 2L^{2}\xi (\hat{x}_{\xi } -\hat{ y}_{\xi })^{2}}$$

which converges to 0 for ξ →∞. 

Finally, with x ξ → x ∗ and y ξ → x ∗ 

$$\displaystyle{I(1) \rightarrow \lambda E[v(x^{{\ast}}- U) - w(x^{{\ast}}- U)] -\lambda (v(x^{{\ast}}) - w(x^{{\ast}})) <0}$$

because of v(x) − w(x) ≤ v(x ∗ ) − w(x ∗ ) = M and

$$\displaystyle{E[v(x^{{\ast}}- U) - w(x^{{\ast}}- U)] \leq M\mathbb{P}\{U \leq x^{{\ast}}\} <M.}$$

This contradicts that the difference must be non-negative, so M > 0 cannot be true, and thus our assertion v(x) ≤ w(x), x ≥ 0, holds. □

Usually, the maximum principle is applied with v(0) = w(0) and v(∞) = w(∞), so the initial conditions are on the values of the functions. This is appropriate for diffusion models where we often have v(0) = w(0) = 0, v(∞) = w(∞) = 1. 

In Lundberg models we have instead v(∞) = w(∞) = 1 and a given value for the derivative at zero:

$$\displaystyle{v'(0) = -\lambda (1 - v(0))/c,w'(0) = -\lambda (1 - w(0))/c.}$$

Fortunately, with the above maximum principle one can also handle this situation.

Lemma 8

Assume that \(\mathbb{P}\{U> x\}> 0\) for all x > 0, and that the constraints \(\mathcal{A}(x)\) are intervals [a(x), b(x)] with Lipschitz functions a(x), b(x) satisfying b(x) > 0, x > 0. 

Let v(x), w(x) be viscosity solutions of our dynamic equation having continuous first derivatives with v(0) = w(0) and v′(0) = w′(0). Then v(x) = w(x) for all x ≥ 0. 

Proof

Assume that there exists x 0 ≥ 0 such that v(x) = w(x) for 0 ≤ x ≤ x 0, and that v(x) < w(x) for x 0 < x ≤ x 0 + ɛ. The case v(∞) ≥ w(∞) is easy.

So assume that v(∞)(1 + γ)² < w(∞) for some γ > 0. 

Choose x 2 > x 0 close to x 0 such that v′(x)(1 + γ) ≥ w′(x) for 0 ≤ x ≤ x 2. Define

$$\displaystyle{V (x) = w(x),x \leq x_{2},\ \mbox{ and}\ V '(x) = v'(x)(1+\gamma ),x \geq x_{2}.}$$

Similarly,

$$\displaystyle{W(x) = v(x),x \leq x_{2},\ \mbox{ and}\ W'(x) = w'(x)/(1+\gamma ),x \geq x_{2}.}$$

with the properties

  1. V (0) = W(0), 

  2. V (x), W(x) are Lipschitz,

  3. V (x) is a sub-solution and W(x) a super-solution,

  4. V (∞) ≤ W(∞). 

Hence

$$\displaystyle{V (x) \leq W(x),\ x \geq 0,\ \mbox{ contradicting}\ V (x_{2}) = w(x_{2})> v(x_{2}) = W(x_{2}).\qquad }$$

For the above discretization schemes, one can prove equi-continuity of the approximations \(V _{\Delta }'(s),s = k\Delta \geq 0\) (see Hipp 2015), which implies that limsup and liminf have continuous derivatives.

In all, we can prove that the discretization schemes converge to some function W(x) having a continuous first derivative. For many optimization problems one can also show that the value function V (x) is a viscosity solution of the corresponding HJB equation. However, we need a continuous first derivative for W(x) to obtain V (x) = W(x) from the above comparison argument. It is still open for which optimization problem the value function V (x) has a continuous derivative. So, regrettably, we do not know whether the limit of our discretizations is the value function of the given control problem.

Cases in which the value function is known to have a continuous first and second derivative are

  • unrestricted case: \(\mathcal{A}(x) = (-\infty,\infty )\) (see Hipp and Plum 2003),

  • no short-selling and limited leverage: \(\mathcal{A}(x) = [0,bx]\) (see Azcue and Muler 2010),

  • bounded short-selling and bounded leverage: \(\mathcal{A}(x) = [-ax,bx]\) (see Belkina et al. 2014)

8 Numerical Issues

Numerical computations for solutions of control problems are demanding; they cannot be done on a simple spreadsheet. The results shown in this article are all done with MatLab. This matrix oriented programming language is well suited for the handling of large arrays; in particular, the commands find and cumsum (or cumtrapz) are used frequently, and arrays with more than a million entries were stored and handled easily.

Continuous time and state functions have to be discretized, and the same is done with integrals and derivatives. The step size for the surplus will be denoted by ds and for time by dt. If other state variables show up in the model (e.g., in mixture models), we try to replace them by t in a nonstationary model. We will use Euler type discretisations of the following kind: with s = k ds

$$\displaystyle\begin{array}{rcl} V _{s}(s,t)& =& (V (s + ds,t) - V (s,t))/ds, {}\\ V _{ss}(s,t)& =& (V _{s}(s,t) - V _{s}(s - ds,t))/ds, {}\\ V _{t}(s,t)& =& (V (s,t + dt) - V (s,t))/dt, {}\\ E[V (s - X,t)]& =& \sum _{i=1}^{k}V (s - i\ ds,t)\mathbb{P}\{(i - 1)ds \leq X <i\ ds\}. {}\\ \end{array}$$

For the expectation, one could use higher order integration methods; however, we here essentially need summation with weights which add up to 1.
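As an illustration, these discretisations can be written compactly; the following is a Python sketch rather than the MatLab used for the article's results, and the cdf F is a placeholder for the claim size distribution:

```python
import numpy as np

ds = 0.1  # surplus step size (illustrative value)

def V_s(V, k):
    """Forward difference for V'(s) at s = k*ds."""
    return (V[k + 1] - V[k]) / ds

def V_ss(V, k):
    """Second difference for V''(s) at s = k*ds."""
    return (V_s(V, k) - V_s(V, k - 1)) / ds

def expect(V, k, F):
    """E[V(s - X)] at s = k*ds as a weighted sum, F the claim size cdf."""
    i = np.arange(1, k + 1)
    weights = F(i * ds) - F((i - 1) * ds)  # P{(i-1)ds <= X < i ds}
    return np.dot(V[k - i], weights)
```

Note that for finite s these weights add up to the probability that X ≤ s rather than to 1; the remaining mass corresponds to ruin by the next claim, where V vanishes.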

In most control problems, the difference between maximizing survival probability and maximizing company value is very small: Rearranging the dynamic equation to solve for V ′(s), we obtain in the reinsurance control problem

$$\displaystyle\begin{array}{rcl} V '(s)& =& \min _{a}\frac{\lambda V (s) -\lambda E[V (s - g_{a}(X))]} {c - h(a)} \ \mbox{ for survival prob.} {}\\ V '(s)& =& \min _{a}\frac{(\lambda +\delta )V (s) -\lambda E[V (s - g_{a}(X))]} {c - h(a)} \ \mbox{ for dividends}. {}\\ \end{array}$$

Since the equations are homogeneous, one can use an arbitrary value for V (0) to see the optimal strategy.
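To illustrate this homogeneity, here is a Python sketch (not the article's MatLab code) for the uncontrolled Lundberg model with λ = 1, c = 2 and Exp(1) claims: starting from the arbitrary value V(0) = 1, normalizing by V at a large surplus recovers the survival probability, whose exact value at zero is δ(0) = 1 − λ∕(cθ) = 0.5.

```python
import numpy as np

lam, c, theta = 1.0, 2.0, 1.0      # Lundberg model with Exp(1) claims
ds, K = 0.01, 4000                 # surplus grid 0, ds, ..., 40

# claim weights f[i-1] = P{(i-1)ds <= X < i ds}
i = np.arange(1, K + 1)
f = np.exp(-theta * (i - 1) * ds) - np.exp(-theta * i * ds)

V = np.zeros(K + 1)
V[0] = 1.0                         # arbitrary: the equation is homogeneous
for k in range(K):
    # E[V(s - X)] at s = k*ds; values below 0 (ruin) contribute nothing
    EV = np.dot(f[:k], V[k - 1::-1]) if k > 0 else 0.0
    V[k + 1] = V[k] + ds * lam * (V[k] - EV) / c

survival = V / V[-1]               # normalize: V is proportional to delta(s)
```

Any other starting value V(0) > 0 yields the same normalized curve, which is the point of the homogeneity argument.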

Reinsurance Example In our first example we consider optimal unlimited XL reinsurance in a Lundberg model, first for maximizing the company value and second for minimizing the ruin probability. The parameters are λ = 1,  c = 2,  δ = 0.07, and the claims have an exponential distribution with mean 1. First we show the derivative of the function v(s) solving the dynamic equation, next the optimal priority M(s) for dividends (middle), and on the right the optimal M(s) which minimizes the ruin probability. We see that v′(s) has a single minimum, at M = 4.84, so the possible values of s are [0, M]. In both cases there is a region of small s in which no reinsurance is optimal. Then follows a region with M(s) = s, which means reinsurance for the next claim only. After that, M(s) increases almost linearly in the dividend case, while in the ruin case M(s) is almost constant. In both cases reinsurance is paid for, and in the dividend case this starts at a larger surplus. Furthermore, M(s) is higher in the dividend case, which means less reinsurance (Figs. 4, 5, and 6).

Fig. 4
figure 4

Derivative of HJB-solution v′(s)

Fig. 5
figure 5

Optimal priority dividends

Fig. 6
figure 6

Optimal priority ruin

In most optimization problems, the optimizers are found by complete search. In problems with more than one control parameter one should check whether the optimal parameters are continuous in s. Then one can speed up the search: restrict the search for state s to a neighborhood of the optimizer found for s − ds. 
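A sketch of this speed-up (the helper name is hypothetical): rather than a complete search, minimize only over a window around the optimizer found at s − ds.

```python
import numpy as np

def argmin_window(costs, prev_opt, radius):
    """Minimize over indices within `radius` of the previous optimizer.
    Valid when the optimal control is continuous in s."""
    lo = max(prev_opt - radius, 0)
    hi = min(prev_opt + radius + 1, len(costs))
    return lo + int(np.argmin(costs[lo:hi]))
```

If the optimizer moves by at most `radius` grid points per surplus step, this returns the global minimizer at a fraction of the cost of a full search.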

The numerically demanding term is the expectation in the dynamic equation: E[V n (s − g M (X))]. It has to be computed for many values of s and M, and for each iteration n. In some cases this nonlocal term can be transformed into a local one (e.g., for exponential or phase-type distributions), but with MatLab one can produce the values, following the MatLab rule "no loops", in one line. One first defines the matrix P of probabilities with step size ds, range 0, ds, 2ds, …, KS ds for s, and \(f(i) = \mathbb{P}\{(i - 1)ds \leq X <i\ ds\},i = 1,..,KS\) as

$$\displaystyle\begin{array}{rcl} P(i,j)& =& f(j),j = 1,\ldots,i - 1, {}\\ P(i,i)& =& \sum _{j=i}^{KS}f(j), {}\\ P(i,j)& =& 0,j = i + 1,\ldots,KS. {}\\ \end{array}$$

If A = {1 ≤ i ≤ KS: h(i ds) < c}, then the vector VI with entries

$$\displaystyle{E[V (s - M)]: M = i\ ds,\ i \in A,}$$

is generated by

$$\displaystyle{V I = (P(A,1: (i - 1)) {\ast} V (i - 1: -1: 1)');}$$

and the dynamic equation for the value function V in the Lundberg model leads to the formula

$$\displaystyle{[V '(s),sb] =\min (\lambda {\ast}(V (i) - V I')/(c - h(A)));}$$

In special cases the set A can be replaced by a smaller set which speeds up computation.
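For readers without MatLab, the same vectorization can be sketched in Python/numpy. This assumes Exp(1) claims and the interpretation that row m of P, multiplied with the reversed value vector, yields E[V(s − g_M(X))] with retained claim g_M(X) = min(X, M) and priority M = (m+1) ds; indices are 0-based here:

```python
import numpy as np

ds, KS = 0.1, 200
j = np.arange(1, KS + 1)
f = np.exp(-(j - 1) * ds) - np.exp(-j * ds)   # P{(j-1)ds <= X < j ds}

# Lower triangular matrix P: P[m, j] = f[j] for j < m,
# and the whole tail mass P{X >= m*ds} lumped on the diagonal.
P = np.tril(np.tile(f, (KS, 1)), k=-1)
P[np.arange(KS), np.arange(KS)] = f[::-1].cumsum()[::-1]

def EV_all_priorities(V, k):
    """E[V(s - min(X, M))] at s = k*ds for all priorities M = ds, ..., k*ds."""
    past = V[k - 1::-1]            # V(s - ds), V(s - 2 ds), ..., V(0)
    return P[:k, :k] @ past
```

One matrix-vector product replaces the double loop over priorities and claim sizes, which is the same idea as the MatLab one-liner above.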

Investment Example Optimal investment for minimal ruin probability in the Lundberg model leads to the following equation (where we set μ = σ² = 1): 

$$\displaystyle{0 =\sup _{A}\lambda E[V (s - X) - V (s)] + (c + A)V '(s) + A^{2}V ''(s)/2,}$$

which has maximizer A(s) = −V ′(s)∕V ″(s). With U(s) = A²(s) we obtain the equation

$$\displaystyle{V '(s) = \frac{\lambda E[V (s) - V (s - X)]} {c + \sqrt{U(s)}/2}.}$$

For U(s) we get in the case of exponential claims with parameter θ

$$\displaystyle{U'(s) = \sqrt{U(s)}(\lambda +1/2 -\theta c -\theta \sqrt{U(s)}/2) + c}$$

(see Hipp and Plum 2003, Remark 8). To obtain the optimal strategy, we can restrict ourselves to U(s) and start with U(0) = 0. For the dividend objective we just have to replace λ by λ + δ. In the special case θ = 1, c = λ + 1∕2 we can see that for the dividend objective investment is higher than for the ruin probability objective: for the ruin probability we obtain

$$\displaystyle{U'(s) = c - U(s)/2,}$$

while for dividends it reads

$$\displaystyle{U'(s) = c - U(s)/2 +\delta \sqrt{U(s)}.}$$

The above system of two coupled differential equations enables a simple, robust, and efficient computation. The resulting strategies never use short selling, the amount invested A(s) is not always increasing, and generally: the more risky the insurance business is, the larger A(s) will be.
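An Euler sketch of the two U-equations in the special case θ = 1, λ = 1, c = λ + 1∕2, δ = 0.07 (the step sizes are illustrative choices):

```python
import numpy as np

lam, delta = 1.0, 0.07
c = lam + 0.5                      # special case theta = 1, c = lam + 1/2
ds, K = 0.002, 10000               # surplus grid up to s = 20

U_ruin = np.zeros(K + 1)           # U(s) = A^2(s) for the ruin objective
U_div = np.zeros(K + 1)            # U(s) = A^2(s) for the dividend objective
for k in range(K):
    U_ruin[k + 1] = U_ruin[k] + ds * (c - U_ruin[k] / 2)
    U_div[k + 1] = U_div[k] + ds * (c - U_div[k] / 2
                                    + delta * np.sqrt(U_div[k]))

A_ruin, A_div = np.sqrt(U_ruin), np.sqrt(U_div)   # invested amounts
```

In this special case U_ruin solves a linear equation with limit U(∞) = 2c, and U_div ≥ U_ruin everywhere, confirming that the dividend objective asks for more investment.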

Optimal Investment with Constraints In the constrained case, optimal investment can be completely different from the unrestricted case. The following figures are based on a Lundberg model with exponential claims, for which the unconstrained optimal strategy is increasing, concave, and almost constant for large surplus. In the case without leverage and short-selling, shown in the next figure, we see the proportion A(s)∕s and the second derivative of the value function. For small s we have A(s) = s, and the value function is not concave. An example with volatility hunger is seen in the next figure: here we have the same model and the constraints \(\mathcal{A}(s) = [-4s,s]\) (see Belkina et al. 2014). For very small s we have A(s) = s, then in a larger interval A(s) = −4s, and then the strategy switches back to A(s) = s and continues without further jumps. The jump from a maximal long to a maximal short position can be explained by the fact that a high volatility position can also produce large upward movements. The black curve is again the second derivative of the value function.

Constraints can generate singularities in the value function; even the first derivative can have a jump. Such singularities are also present in uncontrolled ruin probabilities when the claim size distribution has atoms. An example with X = 1 is given in the third figure below; it shows A(s) in the unconstrained case (blue line) and for \(\mathcal{A}(s) = [0,s]\) (Figs. 7, 8 and 9).

Fig. 7
figure 7

Constrained optimal investment

Fig. 8
figure 8

Example with extreme jumps

Fig. 9
figure 9

Optimal investment for X = 1

Optimal Dividends with Ruin Constraint The method for the computation of company values with ruin constraint has been described before; here we discuss the numerical problems and results for the computation using Lagrange multipliers and the nonstationary approach. Our backward calculation starts with V (s, T) = −Lψ(s), which produces good approximations if T is large enough that dividend payments after time T do not matter, since they are discounted by e^{−δT} at least. But the discretization dt must be quite small to get convergence: in the simple diffusion model, for ds = 0.02 we need a step size dt of at most 0.0004; for ds = 0.02 and dt = 0.00041 we obtain results which are completely wrong: a barrier close to zero and value functions close to V (s) = s − L. The Lundberg model is less sensitive: it works with ds = 0.02 and dt = 0.004. 
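The critical step size can be understood from a von Neumann stability argument: the explicit backward step applied to the diffusion term of the dynamic equation is the classical explicit heat scheme, which is stable only for σ²dt∕ds² ≤ 1, i.e. dt ≤ 0.0004 for ds = 0.02 and σ² = 1, matching the threshold observed above. A minimal Python sketch (stripped of drift, discounting, and dividend control) shows the blow-up:

```python
import numpy as np

sigma2, ds = 1.0, 0.02

def max_after_backward_steps(dt, n_steps, n_points=200):
    """Explicit backward steps V(., t - dt) = V(., t) + dt*(sigma2/2)*V_ss
    on rough data; the result stays bounded iff sigma2*dt/ds**2 <= 1."""
    rng = np.random.default_rng(0)
    V = rng.standard_normal(n_points)  # rough data excites the worst mode
    r = 0.5 * sigma2 * dt / ds ** 2
    for _ in range(n_steps):
        V[1:-1] += r * (V[2:] - 2 * V[1:-1] + V[:-2])
    return np.max(np.abs(V))

stable = max_after_backward_steps(0.0004, 2000)     # r = 0.5: bounded
unstable = max_after_backward_steps(0.00041, 2000)  # r = 0.5125: blows up
```

At r = 0.5 each interior step is an average of the two neighbors, so the maximum norm cannot grow; at r = 0.5125 the highest-frequency mode is amplified by about 1.05 per step.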

The next two figures show the results for simple diffusion models. First, we show the computed curves V (s, t) for 21 values of t, where the largest values belong to t = 0. The second is the curve of barriers M(t), which has the expected form: increasing, asymptotically linear, with a decrease close to T. The same form was obtained in the discrete case of Hipp (2003); the decrease is caused by the choice of V (s, T). The parameters for the plots are μ = σ² = 1 and the discount rate δ = 0.03. 

The third figure shows an efficiency curve for company values and ruin probabilities, which is the same as a plot of V (s, α), the maximal dividend value under a ruin constraint of α. For this we computed V (s, L) with the corresponding ruin probabilities and plotted the results for a number of values of L from 0 to 100. The plot is given for a simple diffusion model with σ = μ = 1 and δ = 0.07. The initial surplus is 5. We could not produce reliable results for larger L, since they produce α = 1 or α < ψ(5). Surprisingly, the dividend value stays near the unconstrained value V 0(5) = 16.126 over a long range of α (Figs. 10, 11, and 12).

Fig. 10
figure 10

V (s, t) for 0 ≤ s ≤ 20 and various values of t

Fig. 11
figure 11

Optimal barrier M(t)

Fig. 12
figure 12

V (s, α) for s = 5,  ψ(s) ≤ α ≤ 1

Results for the Lundberg model are given in the contribution (Hipp 2016) in these proceedings.

9 Open Problems

Here is a collection of questions which, to my knowledge, are open, and which in my opinion are interesting enough to attract (young) mathematicians. They are of course biased by my preferences, but they might still be of some use. They are given in a numbered list in random order (no ordering according to difficulty or importance).

  1. 1.

    For the proof that the discretisations converge to the value function in the optimal investment problem with constraints, one needs that the value function has a continuous derivative. What is the class of problems for which the value function has this property?

  2. 2.

    Optimal company values with ruin constraint are computed with the Lagrange multiplier approach. Do we have a Lagrange gap here? Some positive results are in Hernandez and Junca (2015, 2016).

  3. 3.

    Optimal investment is considered here in a market with constant parameters. How do the solutions change if the market values change as in a finite Markov chain with fixed or random transition rates? What changes if also negative interest is possible?

  4. 4.

    What is the right model for simultaneous control of stop loss and excess of loss reinsurance?

  5. 5.

    Can the nonstationary approach solve control problems also in more complex Markov switching models?

  6. 6.

    Is the company value V (s, α) under a ruin constraint a smooth function of s and α?

  7. 7.

    Existing results in models with capital injections raise the question whether classical reinsurance is still efficient. What is the right model for this question, and what is the answer?

  8. 8.

    Does the approach described at the end of Chap. 5 work for a model in which the state λ is not absorbing?

  9. 9.

    Can the improvement approach in Chap. 6 be applied in the Lundberg model with claim size not exponentially distributed?