1 Introduction

It is quite common for insurance companies to manage their assets by investing in a financial market and reduce their risk exposures through purchasing reinsurance protection. This seems to partly motivate the research studying optimal reinsurance and investment problems of an insurer in the actuarial science literature. Commonly used optimality criteria include the expected utility maximization, the ruin probability minimization and the mean-variance criterion. Using techniques in stochastic optimal control, Zeng et al. (2013) studied an investment and reinsurance optimization problem for mean-variance insurers in a dynamic setting. Schmidli (2002) obtained optimal strategies that minimized a ruin probability under a compound Poisson risk model. Meng and Zhang (2013) reported that an excess-of-loss reinsurance contract was better than any other reinsurance forms under their model settings by minimizing the insurer’s ruin probability. Zhao et al. (2013) and Brachetta and Ceci (2019) determined optimal reinsurance–investment strategies by maximizing the expected exponential utility of an insurer’s terminal wealth. Further investigations to the optimal investment and reinsurance problems can be found in Promislow and Young (2005), Liang and Yuen (2016), Bi and Cai (2019) and the relevant references therein.

In most prior studies, the effects of model ambiguity on optimal reinsurance–investment strategies have not been well explored. Some parameters in financial models, such as the appreciation rates of risky assets, can be difficult to estimate precisely in the long run (e.g., Merton, 1980). As far as model uncertainty in an insurance market is concerned, an insurer estimates the parameters of the claim sizes and the claim arrival rate in light of information about policyholders’ characteristics and other risk factors. It may be difficult, if not impossible, to obtain accurate information about these characteristics and risk factors, or to monitor the information collection process. Consequently, the insurer and a reinsurer may cast doubt on the estimated claim process. This partly motivates the incorporation of model ambiguity or uncertainty into reinsurance–investment optimization problems. One popular approach to describing model ambiguity was proposed by Anderson et al. (2003), who studied asset pricing problems in a continuous-time stochastic model by incorporating the investor’s concerns about model misspecification. Basically, it is a “penalty” approach to robust control: they considered equivalent priors, given by probability measures equivalent to a given reference probability measure, as alternative measures and formulated the robust stochastic optimal control problems in a max-min framework. In the past two decades, much of the relevant research has applied the method developed in Anderson et al. (2003) to study robust optimal investment and reinsurance problems. For instance, Maenhout (2004) studied an optimal asset allocation problem when model misspecification was taken into account. 
Zhang and Siu (2009) examined a reinsurance–investment optimization problem with model uncertainty under the expected utility criterion and the survival probability criterion. Li et al. (2018) studied a robust excess-of-loss reinsurance and investment optimization problem for an ambiguity-averse insurance company. Pun (2018) constructed a modelling framework for the time-inconsistent stochastic control problems involving model uncertainty and used a portfolio selection problem to illustrate an application of the framework. Wang et al. (2019a) discussed a robust non-zero-sum stochastic differential game between two competitive insurers who were ambiguity-averse in deriving optimal reinsurance–investment strategies under mean-variance criteria. Other relevant papers include Yi et al. (2013, 2015), Sun et al. (2017), Gu et al. (2018), Wang et al. (2019b) and Feng et al. (2021), to mention a few.

It appears that optimal reinsurance problems considered in much of the existing literature were discussed from the insurer’s point of view and the interests of the reinsurer were neglected. However, a reinsurance contract may be thought of as a mutual agreement between the reinsurer and the insurer. Consequently, analyzing optimal reinsurance design problems from the perspectives of both an insurer and a reinsurer is appropriate. This approach has been considered in some prior works. For example, in a discrete-time single-period setting, Cai et al. (2013) derived optimal reciprocal reinsurance contracts by maximizing the joint survival and profitable probabilities of an insurer and a reinsurer. Jiang et al. (2017) developed the Pareto-optimal reinsurance agreements in the Value-at-Risk (VaR) settings. Zhang et al. (2018) obtained the optimal quota-share reinsurance treaties through using optimality criteria and utility improvement constraints reflecting mutual beneficiary. Another strand of literature applies game theory to model the strategic interaction between an insurer and a reinsurer in optimal reinsurance design. Borch (1960) was evidently the first to discuss the optimal reinsurance contract problems within the context of bargaining games. Indeed, there have been many interesting developments in applications of game theory to reinsurance design problems. For instance, Chen and Shen (2018, 2019) considered the collective interests of both the insurer and the reinsurer in a reinsurance contract design problem by modeling the two negotiating parties using a stochastic leader-follower differential game framework. Jiang et al. (2019) derived the Pareto-optimal reinsurance contracts under the two-person cooperative game framework. Chen et al. 
(2019) studied an optimal risk-sharing problem in a stochastic differential game theoretic framework, where the insurer’s objective was to minimize the ruin probability, and the main goal of the reinsurer was to maximize their profits up to the time when the insurer’s bankruptcy occurred. Li and Young (2021) obtained the Bowley solution of a mean-variance Stackelberg game in a one-period model setup.

By contrast, the two-party nature of reinsurance agreement suggests that it is natural to adopt the methods in economic contract theory and apply a principal–agent framework to describe the relationship between the insurer and the reinsurer. In such a paradigm, the reinsurer is considered the principal and the insurer is thought of as the agent. Using a principal–agent framework, Hu et al. (2018a) studied optimal excess-of-loss and proportional reinsurance contracts when the reinsurer was ambiguity-averse and the insurance claims were described by a classical Cramér–Lundberg model. Their objectives were to maximize the expected exponential utility of the terminal wealth in the worst-case scenario over a family of alternative measures. Hu et al. (2018b) investigated optimal proportional reinsurance contracts when the reinsurer had robust preferences and the insurer’s claim process was approximated by a diffusion model. Hu and Wang (2019) obtained the robust optimal proportional and excess-of-loss reinsurance treaties when both the principal and the agent were ambiguity-averse under the classical Cramér–Lundberg model for insurance claims. Gu et al. (2020) discussed an optimal excess-of-loss reinsurance contracting problem when the insurer and the reinsurer were ambiguity-averse. They also supposed that both the insurer and the reinsurer could invest in a financial market consisting of one risk-free asset and one risky asset. In a recent paper by Wang and Siu (2020), robust optimal reinsurance contracting was studied in a principal–agent modeling framework in the presence of a risk constraint formulated by VaR.

The principal–agent problems in the aforementioned papers were studied under the assumption that the principal and the agent shared the same information. In reality, however, the principal can only obtain partial information from the agent, and this information asymmetry crucially determines what kind of contract is optimal. Two distinct types of such problems are moral hazard and adverse selection. Moral hazard problems arise when the action of the agent is hidden from the principal. Seminal works, such as Shavell (1979) and Holmstrom (1979), provided a foundation for optimal insurance contracting problems under moral hazard. Doherty and Smetters (2005) developed a two-period principal–agent model and provided empirical evidence of moral hazard in the reinsurance market. For a more recent review of this topic, readers may refer to Winter (2013) and the references therein. Adverse selection problems arise when key characteristics of the agent are hidden. Following the celebrated works by Rothschild and Stiglitz (1976) and Stiglitz (1977), various models have been proposed to study adverse selection in insurance contracting. Examples include Crocker and Snow (1985, 2008), Cohen and Siegelman (2010), Spinnewijn (2017) and Cheung et al. (2019).

Markowitz (1952) studied the portfolio selection problem under the mean-variance criterion in a single-period model. This pioneering work has stimulated numerous extensions in the literature. Li and Ng (2000) and Zhou and Li (2000) extended Markowitz’s work to a multi-period model and a continuous-time model, respectively. In traditional mean-variance optimization problems, a considerable part of the literature obtains pre-commitment strategies, which can be time-inconsistent and are optimal only at the initial time (e.g., Bian et al., 2018; Cong & Oosterlee, 2016; Sun et al., 2016). However, time consistency of the optimal strategies is a basic requirement for a rational decision-maker. Björk and Murgoci (2010), Björk et al. (2014) and Kronborg and Steffensen (2015) articulated an approach to derive a time-consistent investment strategy. A key feature of this method is that the problem is tackled within a non-cooperative game theoretic framework, where the players are the future incarnations of the decision-maker at different time points. Since then, (robust) optimal reinsurance and investment problems involving mean-variance criteria have been studied using time-consistent controls. For instance, Zeng and Li (2011) pioneered the study of optimal time-consistent investment and reinsurance problems for insurers with mean-variance preferences. This approach was later applied by Li et al. (2015a) to derive the time-consistent reinsurance and investment strategies when the insurer could purchase a proportional reinsurance contract and invest the insurance surplus in a financial market comprising a risk-free asset, a risky asset, a zero-coupon bond and inflation-protected securities. 
Lin and Qian (2016) obtained the time-consistent reinsurance–investment strategy for an insurer whose surplus process was governed by a compound Poisson model, with a constant elasticity of variance (CEV) model describing the risky asset’s time-varying volatility. Zeng et al. (2016) studied the robust reinsurance–investment optimization problem for a mean-variance insurer who was concerned about model uncertainty and obtained robust equilibrium strategies when the price process of the risky asset was described by a jump-diffusion model. Chen et al. (2021) considered a dynamic Pareto optimal risk-sharing problem between n insurers under the time-consistent mean-variance criterion. Further applications of this approach can be found in Zeng et al. (2013), Li et al. (2015b), Guan et al. (2018), Chen and Shen (2019), Wang et al. (2019a), Wang et al. (2019) and Zhao and Siu (2020).

Notably, the robust reinsurance problems with mean-variance criteria from the perspective of a principal–agent problem have not been well-explored. The purpose of this study was to investigate the interaction between an insurer and a reinsurer who were both ambiguity-averse. Suppose that the decision-makers aim to maximize the expected return of the surplus and minimize the corresponding risk. In this case, we apply the mean-variance criteria to formulate the objective functions of the insurer and the reinsurer, where the expected returns and the risks are measured by the expected values and the variances of their terminal surpluses, respectively. Following Hu et al. (2018a, 2018b) and Hu and Wang (2019), the insurer is allowed to purchase a proportional reinsurance treaty, and the safety loading factor of the reinsurer in the expected value premium principle has been extended to be time-varying, which could be regarded as a choice variable of the reinsurer. Here, it has also been supposed that both the insurer and the reinsurer invest their surpluses in the financial market comprising a risk-free asset and a risky asset. Additionally, it has been assumed that both the insurer and the reinsurer are concerned about model uncertainty. Specifically, the ambiguity-averse decision-makers may regard the claim process and the financial market’s dynamics as reference models and aim to obtain robust strategies under the worst-case scenario over a set of alternative models. This paper includes three main contributions. First, different from the techniques used in Hu et al. (2018a, 2018b), Hu and Wang (2019) and Gu et al. (2020), where the expected utility maximization criteria were applied, we shall embed non-cooperative games into the principal–agent framework and establish two systems of extended HJB equations to derive the time-consistent optimal reinsurance contract and investment strategies of the insurer and the reinsurer. 
Another difference between the current paper and the works by Hu et al. (2018a, 2018b) and Hu and Wang (2019) is that the contracting parties are permitted to invest their surpluses into a risky asset with a view to enhancing their profits. Finally, though Chen and Shen (2019) considered the Stackelberg differential game between the insurer and the reinsurer under mean-variance criteria, they assumed that the decision-makers were ambiguity-neutral.

The remaining parts of this paper are organized in the following manner. Section 2 presents the model formulation. In Sect. 3, we derive the explicit expressions for the robust equilibrium optimal strategies and value functions of the principal and the agent. We thereafter analyze the decision-makers’ utility losses associated with strategies ignoring ambiguity in Sect. 4. Numerical analyses have been provided to illustrate the effects of some key parameters on the equilibrium reinsurance–investment policies and the utility losses of the principal and the agent in Sect. 5. Finally, Sect. 6 concludes the paper.

2 Problem formulation

The model setup considered here resembles that used in Wang and Siu (2020). To describe uncertainties, as it is usual, we consider a complete filtered probability space \((\Omega , {\mathscr {F}}, {\mathbb {P}})\), where \({\mathbb {P}}\) is a reference probability measure under which a reference model is specified. The time horizon of the model for investment and reinsurance is given by a finite horizon [0, T], where \(T < \infty \). The resolution of uncertainties over the horizon [0, T] is described by a \({\mathbb {P}}\)-augmented filtration \({\mathscr {F}}=\{ {{\mathscr {F}}}_t \}_{t \in [0, T]}\). The classical Cramér–Lundberg model is adopted to describe the insurer’s risk process:

$$\begin{aligned} S(t)=x_0+ pt -\sum _{i=1}^{N(t)} Z_i, \end{aligned}$$

where \(x_0\ge 0\) is the initial surplus, p denotes the constant insurance premium rate, \(\{N(t)\}_{t\in [0,T]}\) is a homogeneous Poisson process with intensity \(\lambda >0,\) and the claim sizes \(Z_i, i=1,2,\ldots ,\) are i.i.d. random variables supposed to be independent of N(t) under the reference probability measure \({\mathbb {P}}\). Denote the mean and second moment of the claim size as \(\mu \) and \(\sigma ^2,\) respectively. For simplicity, the constant insurance premium rate p can be calculated by the expected value premium principle, i.e., \(p=(1+\theta )\lambda \mu ,\) where \(\theta >0\) is the insurer’s positive safety loading.
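
As an illustration only, the reference risk process can be simulated along the following lines. This is a minimal sketch: the exponential claim-size distribution, the function names and the parameter values are assumptions made here for illustration and are not part of the model above, which only requires i.i.d. claims. Under this assumption the mean is \(\mu \) and the second moment is \(\sigma ^2=2\mu ^2\).

```python
import random


def premium_rate(theta, lam, mu):
    """Expected value premium principle: p = (1 + theta) * lam * mu."""
    return (1.0 + theta) * lam * mu


def simulate_surplus(x0, theta, lam, mu, horizon, rng):
    """Draw one sample of S(T) = x0 + p*T - (sum of claims up to T).

    Claim sizes are exponential with mean mu (an illustrative assumption).
    """
    p = premium_rate(theta, lam, mu)
    total_claims = 0.0
    t = rng.expovariate(lam)  # arrival time of the first claim
    while t <= horizon:
        total_claims += rng.expovariate(1.0 / mu)  # claim size with mean mu
        t += rng.expovariate(lam)  # exponential inter-arrival time
    return x0 + p * horizon - total_claims


def mean_terminal_surplus(x0, theta, lam, mu, horizon, n_paths=20000, seed=0):
    """Monte Carlo estimate of E[S(T)] under the reference measure."""
    rng = random.Random(seed)
    total = sum(
        simulate_surplus(x0, theta, lam, mu, horizon, rng) for _ in range(n_paths)
    )
    return total / n_paths
```

Since \({\mathbb {E}}^{{\mathbb {P}}}[S(T)]=x_0+\theta \lambda \mu T\), the estimate returned by `mean_terminal_surplus` should be close to this value, which reflects the fact that the safety loading \(\theta \) is the insurer's expected profit margin.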

It has been further assumed that an insurance company can purchase proportional reinsurance contracts or acquire new businesses to transfer and manage insurance risks. Though reinsurance policies may take more complicated forms than proportional reinsurance in practice, the consideration of proportional reinsurance here may render the problem more tractable and throw light on certain theoretical aspects on the optimal reinsurance and investment problem under the principal–agent modelling framework. We use \(q(t): [0,T]\rightarrow [0, 1]\) to represent the risk retention level of the insurer at time t. In this case, the insurer must allocate parts of the premium incomes at a rate of \(p^q(t)\) at time t to the reinsurer. To simplify the analysis, the reinsurance premium is also evaluated using the expected value premium principle. In contrast to some prior studies where the relative safety loading factor of the reinsurer is a given positive constant, we assume that the reinsurer’s safety loading factor could be adjusted according to the reinsurance demand, i.e.,

$$\begin{aligned} p^q(t)=(1+\eta (t)){\mathbb {E}}^{{\mathbb {P}}}(\,\cdot \,). \end{aligned}$$
(2.1)

See also Hu et al. (2018a, 2018b), Hu and Wang (2019) and Wang and Siu (2020), which imposed the same assumption. Unlike charging the same premium per unit of risk exposure per unit time as in the traditional expected value premium principle, formulation (2.1) may allow the flexibility in modelling the strategic interaction between the insurer and the reinsurer. Following Hu and Wang (2019), we refer to \(\eta =\{\eta (t)\ge \theta :0\le t \le T\}\) as the reinsurance price, and, in this paper, only non-cheap reinsurance has been considered. Thus, the reinsurance premium payable to the reinsurer is given by:

$$\begin{aligned} p^q(t)=\lambda \mu (1+\eta (t))(1-q(t)). \end{aligned}$$

After considering reinsurance protection, the insurer’s surplus process becomes:

$$\begin{aligned} U(t)=x_0+\int _0^t\left[ (1+\theta )\lambda \mu -\lambda \mu (1+\eta (s))(1-q(s))\right] \mathrm {d}s-\sum _{i=1}^{N(t)} q(T_i)Z_i, \end{aligned}$$
(2.2)

where \(T_i\) denotes the arrival time of the i-th claim. Using the diffusion approximation associated with an insurance surplus process in Grandell (1991), the dynamics of U(t) in (2.2) can be approximated by the following diffusion process:

$$\begin{aligned} \mathrm {d}U(t)=\lambda \mu (\theta -\eta (t)+q(t)\eta (t))\mathrm {d}t+\sigma \sqrt{\lambda }q(t)\mathrm {d}B(t), \end{aligned}$$

where \(\{B(t)\}_{t\in [0,T]}\) is a standard one-dimensional Brownian motion on \((\Omega , {\mathscr {F}}, {\mathbb {P}})\). With a minor adjustment of notation, \(\{ U (t) \}_{t \in [0, T]}\) has been used to denote the original surplus process and its diffusion approximation. Similarly, the dynamics of the reinsurer’s surplus process can be approximated by the following diffusion process:

$$\begin{aligned} \mathrm {d}V(t)=\lambda \mu \eta (t)(1-q(t))\mathrm {d}t+\sigma \sqrt{\lambda }(1-q(t))\mathrm {d}B(t). \end{aligned}$$
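
As a quick check of the drift terms, assume the usual moment-matching argument behind the diffusion approximation, under which the retained claims over a short interval \([t,t+\mathrm {d}t]\) have mean \(\lambda \mu q(t)\mathrm {d}t\) and variance \(\lambda \sigma ^2q^2(t)\mathrm {d}t\). Subtracting the expected claims from the net premium income of each party then gives

$$\begin{aligned} (1+\theta )\lambda \mu -\lambda \mu (1+\eta (t))(1-q(t))-\lambda \mu q(t)&=\lambda \mu \left[ \theta -\eta (t)+q(t)\eta (t)\right] ,\\ \lambda \mu (1+\eta (t))(1-q(t))-\lambda \mu (1-q(t))&=\lambda \mu \eta (t)(1-q(t)). \end{aligned}$$

The two drifts add up to \(\lambda \mu \theta \) and the two volatility coefficients add up to \(\sigma \sqrt{\lambda }\), so the reinsurance contract only redistributes the expected profit and the risk of the original portfolio between the two parties.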

As pointed out by Promislow and Young (2005) and Li et al. (2015a), \(\sqrt{\lambda }\mu /\sigma \) needs to be large enough \((\text {e.g.}, \sqrt{\lambda }\mu /\sigma >3)\) to guarantee that, at any time, the probability of a negative claim under the diffusion approximation is relatively small.
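
To give a sense of the magnitude involved, the sketch below evaluates this probability under the assumption that the approximated aggregate claim over a unit time interval is normal with mean \(\lambda \mu \) and variance \(\lambda \sigma ^2\). The function names and the unit-interval convention are illustrative choices made here, not taken from the cited papers.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function, written via math.erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def prob_negative_aggregate_claim(lam, mu, sigma):
    """P(N(lam*mu, lam*sigma**2) < 0) = Phi(-sqrt(lam) * mu / sigma).

    sigma**2 denotes the second moment of the claim size, as in the text.
    """
    return normal_cdf(-math.sqrt(lam) * mu / sigma)
```

For a ratio \(\sqrt{\lambda }\mu /\sigma =3\), the function returns roughly 0.00135, which is the sense in which the probability is "relatively small".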

In addition, it has been supposed that both the insurer and the reinsurer invest their surpluses in the financial market consisting of one risk-free asset and one risky asset. The following ordinary differential equation (ODE) is used to describe the price process \(\{ S_0 (t) \}_{t \in [0, T]}\) of the risk-free asset:

$$\begin{aligned} \mathrm {d}S_0(t)=rS_0(t)\mathrm {d}t, \end{aligned}$$

where \(r>0\) is the risk-free, instantaneous interest rate, and \(S_0(0)=s_0>0\). We assume that the price process \(\{ S_1(t) \}_{t \in [0, T]}\) of the risky asset evolves according to a geometric Brownian motion:

$$\begin{aligned} \mathrm {d}S_1(t)=S_1(t)\left[ {\tilde{\mu }}\mathrm {d}t+{\tilde{\sigma }}\mathrm {d}{\widetilde{B}}(t)\right] , \end{aligned}$$

where \(\{{\widetilde{B}}(t)\}_{t\in [0,T]}\) is another standard Brownian motion on \((\Omega , {\mathscr {F}}, {\mathbb {P}}),\) which is supposed to be independent of \(\{B(t)\}_{t\in [0,T]},\) \({{\tilde{\mu }}}>r\) and \({{\tilde{\sigma }}}>0\) represent the appreciation rate and the volatility, respectively, and \(S_1(0)=s_1>0\).

For all \(t\in [0,T],\) \(\pi (t)\) denotes the dollar amount invested by the insurer in the risky asset at time t. The remaining surplus, \(X^{u,v}(t)-\pi (t),\) is invested in the risk-free asset, where \(X^{u,v}(t)\) is the insurer’s surplus process controlled by the reinsurance–investment strategy \(u(t):=(q(t), \pi (t))\) and v(t) is the control policy of the reinsurer. Admissible strategies u(t) and v(t) will be defined in Definitions 2.1 and 2.2. Hence, the surplus process \(\{X^{u,v}(t)\}_{t\in [0,T]}\) of the insurer under \({\mathbb {P}}\) with investments in the financial market is governed by:

$$\begin{aligned} \begin{aligned} \mathrm {d}X^{u,v}(t)&=[rX^{u,v}(t)+({\tilde{\mu }}-r)\pi (t)+\lambda \mu (\theta -\eta (t)\\&\quad +q(t)\eta (t))]\mathrm {d}t+\sigma \sqrt{\lambda }q(t)\mathrm {d}B(t)\\&\quad +{\tilde{\sigma }}\pi (t)\mathrm {d}{\widetilde{B}}(t), \end{aligned} \end{aligned}$$
(2.3)

where \(X^{u,v}(0)=x_0\) is the initial surplus of the insurer.
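
For readers who wish to experiment with (2.3), a sample path can be generated with an Euler–Maruyama scheme. The sketch below assumes constant controls \(q\), \(\pi \) and a constant reinsurance price \(\eta \); the function name, the dictionary keys and the discretization are hypothetical choices made for illustration and are not the method used in this paper.

```python
import math
import random


def simulate_insurer_wealth(x0, q, pi, eta, params, horizon, n_steps, rng):
    """Euler-Maruyama sample of X(T) in (2.3) for constant q, pi and eta."""
    r = params["r"]
    mu_tilde, sigma_tilde = params["mu_tilde"], params["sigma_tilde"]
    lam, mu, sigma, theta = (
        params["lam"], params["mu"], params["sigma"], params["theta"]
    )
    dt = horizon / n_steps
    x = x0
    for _ in range(n_steps):
        # Independent Brownian increments for the claim and the asset noise.
        dB = rng.gauss(0.0, math.sqrt(dt))
        dB_tilde = rng.gauss(0.0, math.sqrt(dt))
        drift = r * x + (mu_tilde - r) * pi + lam * mu * (theta - eta + q * eta)
        x += (
            drift * dt
            + sigma * math.sqrt(lam) * q * dB
            + sigma_tilde * pi * dB_tilde
        )
    return x
```

With \(q=0\) and \(\pi =0\) both noise terms vanish, and the scheme reduces to a discretization of the deterministic equation \(\mathrm {d}X=[rX+\lambda \mu (\theta -\eta )]\mathrm {d}t\), which provides a simple sanity check.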

Similarly, \(\forall \ t\in [0,T],\) it has been supposed that the reinsurer invests \({\tilde{\pi }}(t)\) in the risky asset, and the rest of their surplus would be invested in the risk-free asset. Taking account of the reinsurance–investment strategy \(v(t):=(\eta (t), {\tilde{\pi }}(t)),\) the surplus process of the reinsurer can be expressed as follows:

$$\begin{aligned} \begin{aligned} \mathrm {d}Y^{u,v}(t)&=[rY^{u,v}(t)+({\tilde{\mu }}-r){\tilde{\pi }}(t)+\lambda \mu \eta (t)(1-q(t))]\mathrm {d}t\\&\quad +\sigma \sqrt{\lambda }(1-q(t))\mathrm {d}B(t)\\&\quad +{\tilde{\sigma }}{\tilde{\pi }}(t)\mathrm {d}{\widetilde{B}}(t), \end{aligned}\end{aligned}$$
(2.4)

where \(Y^{u,v}(0)=y_0\) is the reinsurer’s initial surplus.

In practice, model uncertainty or ambiguity prevails in financial and insurance modelling. Consequently, it is of interest to investigate how an insurer and a reinsurer with ambiguity-averse attitudes make their investment and reinsurance decisions consistently. In this paper, we take model uncertainty or ambiguity into account by considering an ambiguity-averse insurer (AAI) and an ambiguity-averse reinsurer (AAR). From the perspectives of the AAI and the AAR, the probability measure \({\mathbb {P}}\) is taken as a reference measure, and they consider a family of alternative probability measures surrounding the reference measure in a sense to be described in the sequel. We define a class of probability measures that are equivalent to \({\mathbb {P}}\):

$$\begin{aligned} {\mathcal {Q}}:=\{{\mathbb {Q}}|{\mathbb {Q}}\sim {\mathbb {P}}\}, \end{aligned}$$

where \({\mathbb {Q}}\) is to be defined in what follows.

For ease of reference, we define a variable \(k\in \{1,2\},\) where \(k=1\) refers to the insurer and \(k=2\) corresponds to the reinsurer. Define, for each \(k \in \{1, 2 \}\), an exponential process \(\{ \Lambda ^{\phi _k}(t) \}_{t \in [0, T]}\) by putting:

$$\begin{aligned} \Lambda ^{\phi _k}(t)= & {} \exp \Bigg \{\int _0^t\phi _{k1}(s)\mathrm {d}B(s)-\frac{1}{2}\int _0^t\phi _{k1}^2(s)\mathrm {d}s\nonumber \\&+\int _0^t\phi _{k2}(s)\mathrm {d}{\widetilde{B}}(s)-\frac{1}{2}\int _0^t\phi _{k2}^2(s)\mathrm {d}s\Bigg \}, \end{aligned}$$
(2.5)

where \(\{ \phi _k (t) \}_{t \in [0, T]}\) is a measurable process and it is defined by \(\phi _k (t)\!:=\!(\phi _{k1} (t), \phi _{k2} (t))^{\prime }\) for each \(t \in [0, T]\).

Assumption 2.1

Suppose that, for each \(k\in \{1, 2\}\), the density generator process \(\{ \phi _k (t) \}_{t \in [0, T]}\) satisfies the following two conditions:

  1. \(\{ \phi _k (t) \}_{t \in [0, T]}\) is \(\{ {\mathscr {F}}_t \}_{t \in [0, T]}\)-adapted;

  2. \({\mathbb {E}}^{{\mathbb {P}}}\left[ \exp \left( \frac{1}{2}\int _0^T\Vert \phi _k(t)\Vert ^2\mathrm {d}t\right) \right] <\infty \) with \(\Vert \phi _k(t)\Vert ^2=\phi _{k1}^2(t)+\phi _{k2}^2(t).\) This condition is called Novikov’s condition.

We denote \(\Sigma _k\) as the space of all such processes \(\{\phi _k(t) \}_{t \in [0, T]}\).

Under Assumption 2.1, the exponential process \(\{ \Lambda ^{\phi _k} (t) \}_{t \in [0, T]}\) defined in (2.5) is a \((\{{\mathscr {F}}_t \}_{t \in [0, T]}, {\mathbb {P}})\)-martingale, for each \(k\in \{1, 2\},\) which implies that \({\mathbb {E}}^{{\mathbb {P}}} [\Lambda ^{\phi _k} (T)] = 1\). Consequently, for each \(k\in \{1, 2\}\), we define a new probability measure \({\mathbb {Q}}_k \sim {\mathbb {P}}\) on \({\mathscr {F}}_T\) by putting:

$$\begin{aligned} \frac{\mathrm {d}{\mathbb {Q}}_k}{\mathrm {d}{\mathbb {P}}} \bigg |_{{\mathscr {F}}_T} := \Lambda ^{\phi _k} (T). \end{aligned}$$

According to Girsanov’s theorem for Brownian motion, under an alternative probability measure \({\mathbb {Q}}_k,\) the processes \(\{B^{{\mathbb {Q}}_k}(t)\}_{t\in [0,T]},\) \(\{{\widetilde{B}}^{{\mathbb {Q}}_k}(t)\}_{t\in [0,T]}\) are real-valued standard Brownian motions, and for each \(\{ \phi _k(t) \}_{t \in [0, T]} \in \Sigma _k\) they have the following dynamics:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \mathrm {d}B^{{\mathbb {Q}}_k}(t)=\mathrm {d}B(t)-\phi _{k1}(t)\mathrm {d}t,\\ \mathrm {d}{\widetilde{B}}^{{\mathbb {Q}}_k}(t)=\mathrm {d}{\widetilde{B}}(t)-\phi _{k2}(t)\mathrm {d}t. \end{array}\right. } \end{aligned}$$

It should be noted that the Brownian motions \(B^{{\mathbb {Q}}_k}(t)\) and \({\widetilde{B}}^{{\mathbb {Q}}_k}(t)\) are independent under the probability measure \({\mathbb {Q}}_k.\)
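
The measure change can be checked numerically in the simplest case. The following sketch assumes constant density generators \(\phi _{k1}\) and \(\phi _{k2}\) (an assumption made only for this illustration) and estimates \({\mathbb {E}}^{{\mathbb {P}}}[\Lambda ^{\phi _k}(T)]\) and \({\mathbb {E}}^{{\mathbb {P}}}[\Lambda ^{\phi _k}(T)B(T)]\) by Monte Carlo. The first should be close to 1 by the martingale property, and the second should be close to \(\phi _{k1}T\), which is the mean of \(B(T)\) under \({\mathbb {Q}}_k\).

```python
import math
import random


def girsanov_check(phi1, phi2, horizon, n_paths=100000, seed=1):
    """Estimate E^P[Lambda(T)] and E^P[Lambda(T) * B(T)] for constant phi."""
    rng = random.Random(seed)
    sqrt_T = math.sqrt(horizon)
    sum_density = 0.0
    sum_density_B = 0.0
    for _ in range(n_paths):
        # Terminal values of the two independent Brownian motions under P.
        B_T = rng.gauss(0.0, sqrt_T)
        B_tilde_T = rng.gauss(0.0, sqrt_T)
        # Radon-Nikodym density (2.5) evaluated at T for constant phi.
        density = math.exp(
            phi1 * B_T - 0.5 * phi1 ** 2 * horizon
            + phi2 * B_tilde_T - 0.5 * phi2 ** 2 * horizon
        )
        sum_density += density
        sum_density_B += density * B_T
    return sum_density / n_paths, sum_density_B / n_paths
```

Only terminal values are simulated here because, for constant generators, the stochastic integrals in (2.5) reduce to \(\phi _{k1}B(T)\) and \(\phi _{k2}{\widetilde{B}}(T)\).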

Accordingly, the insurer’s surplus process under the alternative measure \({\mathbb {Q}}_1\) satisfies:

$$\begin{aligned} \begin{aligned} \mathrm {d}X^{u,v}(t)&=\Big [rX^{u,v}(t)+({\tilde{\mu }}-r)\pi (t)+\lambda \mu (\theta -\eta (t)+q(t)\eta (t)) +\sigma \sqrt{\lambda }q(t)\phi _{11}(t)\\&\quad +{\tilde{\sigma }}\pi (t)\phi _{12}(t)\Big ]\mathrm {d}t +\sigma \sqrt{\lambda }q(t)\mathrm {d}B^{{\mathbb {Q}}_1}(t)+{\tilde{\sigma }}\pi (t)\mathrm {d}{\widetilde{B}}^{{\mathbb {Q}}_1}(t), \end{aligned} \end{aligned}$$
(2.6)

and the reinsurer’s surplus process under the alternative measure \({\mathbb {Q}}_2\) is governed by the following stochastic differential equation (SDE):

$$\begin{aligned} \begin{aligned} \mathrm {d}Y^{u,v}(t)&=\Big [rY^{u,v}(t)+({\tilde{\mu }}-r){\tilde{\pi }}(t)+\lambda \mu \eta (t)(1-q(t)) +\sigma \sqrt{\lambda }(1-q(t))\phi _{21}(t)\\&\quad +{\tilde{\sigma }}{\tilde{\pi }}(t)\phi _{22}(t)\Big ]\mathrm {d}t +\sigma \sqrt{\lambda }(1-q(t))\mathrm {d}B^{{\mathbb {Q}}_2}(t)+{\tilde{\sigma }}{\tilde{\pi }}(t)\mathrm {d}{\widetilde{B}}^{{\mathbb {Q}}_2}(t). \end{aligned} \end{aligned}$$
(2.7)
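
The additional drift terms in (2.6) and (2.7) follow directly from the Girsanov relations stated earlier. For instance, substituting \(\mathrm {d}B(t)=\mathrm {d}B^{{\mathbb {Q}}_1}(t)+\phi _{11}(t)\mathrm {d}t\) and \(\mathrm {d}{\widetilde{B}}(t)=\mathrm {d}{\widetilde{B}}^{{\mathbb {Q}}_1}(t)+\phi _{12}(t)\mathrm {d}t\) into the diffusion part of (2.3) gives

$$\begin{aligned} \sigma \sqrt{\lambda }q(t)\mathrm {d}B(t)+{\tilde{\sigma }}\pi (t)\mathrm {d}{\widetilde{B}}(t)&=\left[ \sigma \sqrt{\lambda }q(t)\phi _{11}(t)+{\tilde{\sigma }}\pi (t)\phi _{12}(t)\right] \mathrm {d}t\\&\quad +\sigma \sqrt{\lambda }q(t)\mathrm {d}B^{{\mathbb {Q}}_1}(t)+{\tilde{\sigma }}\pi (t)\mathrm {d}{\widetilde{B}}^{{\mathbb {Q}}_1}(t), \end{aligned}$$

and the same substitution with \(k=2\) in (2.4) yields (2.7).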

Next, we define the admissible sets of reinsurance and investment strategies for the insurer and the reinsurer. In practice, regulations may prevent insurers and reinsurers from short-selling risky assets, which partly motivates the no-short-selling constraint imposed on the admissible investment strategies in the following two definitions.

Definition 2.1

A reinsurance–investment strategy \(u(t):=(q(t), \pi (t))\) is said to be admissible for the insurer, if

  1. \(q(t), \pi (t)\in [0,\infty ),\) \(\forall \ t\in [0,T],\) that is, the insurer can acquire reinsurance or new business and short-selling for the share is not allowed;

  2. \(\{ u(t) \}_{t \in [0, T]}\) is a progressively measurable process with respect to the filtration \(\{{\mathscr {F}}_t\}_{t\in [0,T]}\) and it satisfies that \({\mathbb {E}}^{{\mathbb {Q}}_1^*}_{t,x}\left[ \int _0^T\Vert u(t)\Vert ^2\mathrm {d}t\right] <\infty ,\) where \(\Vert u(t)\Vert ^2=q^2(t)+\pi ^2(t),\) \({\mathbb {E}}^{{\mathbb {Q}}_1^*}_{t,x}[\,\cdot \,]={\mathbb {E}}^{{\mathbb {Q}}_1^*}\left[ \,\cdot \,\big |X^{u,v}(t)=x\right] .\) \({\mathbb {Q}}_1^*\) is an optimal probability measure corresponding to the worst-case scenario to be determined by the insurer;

  3. For all \((t,x)\in [0,T]\times {\mathbb {R}},\) the SDE in (2.3) has a unique strong solution \(\{ X^{u,v}(t) \}_{t \in [0, T]}\), \({\mathbb {P}}\)-almost surely.

Let \({\mathcal {U}}\) denote the set of all admissible strategies for the insurer.

Definition 2.2

A pricing (or reinsurance premium)-investment strategy \(v(t):=(\eta (t), {\tilde{\pi }}(t))\) is said to be admissible for the reinsurer, if

  1. \(\eta (t), {\tilde{\pi }}(t) \in [0,\infty ),\) \(\forall \ t\in [0,T],\) which indicates that short-selling in the risky share is also not allowed for the reinsurer;

  2. \(\{ v(t) \}_{t \in [0, T]}\) is a progressively measurable process with respect to the filtration \(\{{\mathscr {F}}_t\}_{t\in [0,T]}\) and it satisfies that \({\mathbb {E}}^{{\mathbb {Q}}_2^*}_{t,y}\left[ \int _0^T\Vert v(t)\Vert ^2\mathrm {d}t\right] <\infty ,\) where \(\Vert v(t)\Vert ^2=\eta ^2(t)+{\tilde{\pi }}^2(t),\) \({\mathbb {E}}^{{\mathbb {Q}}_2^*}_{t,y}[\,\cdot \,]={\mathbb {E}}^{{\mathbb {Q}}_2^*}\left[ \,\cdot \,\big |Y^{u,v}(t)=y\right] .\) \({\mathbb {Q}}_2^*\) is an optimal probability measure corresponding to the worst-case scenario to be selected by the reinsurer;

  3. \(\forall \ (t,y)\in [0,T]\times {\mathbb {R}},\) the SDE given by (2.4) has a unique strong solution \(\{ Y^{u,v}(t) \}_{t \in [0, T]}\), \({\mathbb {P}}\)-almost surely.

Let \({\mathcal {V}}\) denote the set of all admissible strategies for the reinsurer.

For our purposes, both the insurer and the reinsurer are assumed to have a mean-variance preference. Note that the mean-variance preference may be related to a quadratic utility function. When the insurer and reinsurer are ambiguity-neutral, some of the existing papers derive their optimal control policies by considering the optimality of the solution at the initial time, where the corresponding value functions are defined by:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} {\check{J}}_1^{v}(0, x_0):=\underset{u\in {\mathcal {U}}}{\sup }\, \left\{ {\mathbb {E}}^{{\mathbb {P}}}_{0,x_0}\left[ X^{u,v}(T)\right] -\frac{m_1}{2}\mathrm {Var}^{{\mathbb {P}}}_{0,x_0} \left[ X^{u,v}(T)\right] \right\} ,\\ {\check{J}}_2^{u}(0, y_0):=\underset{v\in {\mathcal {V}}}{\sup }\, \left\{ {\mathbb {E}}^{{\mathbb {P}}}_{0,y_0}\left[ Y^{u,v}(T)\right] -\frac{m_2}{2}\mathrm {Var}^{{\mathbb {P}}}_{0,y_0} \left[ Y^{u,v}(T)\right] \right\} , \end{array}\right. } \end{aligned}$$
(2.8)

where

$$\begin{aligned} {\left\{ \begin{array}{ll}{} {\mathbb {E}}^{{\mathbb {P}}}_{t,x}[\,\cdot \,]={\mathbb {E}}^{{\mathbb {P}}}\left[ \,\cdot \,\big |X^{u,v}(t)=x\right] ,\\ {\mathbb {E}}^{{\mathbb {P}}}_{t,y}[\,\cdot \,]={\mathbb {E}}^{{\mathbb {P}}}\left[ \,\cdot \,\big |Y^{u,v}(t)=y\right] ,\\ \mathrm {Var}^{{\mathbb {P}}}_{t,x}[\,\cdot \,]=\mathrm {Var}^{{\mathbb {P}}}\left[ \,\cdot \,\big |X^{u,v}(t)=x\right] , \\ \mathrm {Var}^{{\mathbb {P}}}_{t,y}[\,\cdot \,]=\mathrm {Var}^{{\mathbb {P}}}\left[ \,\cdot \,\big |Y^{u,v}(t)=y\right] , \end{array}\right. } \end{aligned}$$

and \(m_1>0\) and \(m_2>0\) are the risk-aversion coefficients of the insurer and the reinsurer, respectively. Solving the optimization problem in (2.8) only yields strategies that are optimal at time zero. As in Björk et al. (2014) and Kronborg and Steffensen (2015), we aim to establish time-consistent reinsurance–investment strategies by defining time-varying (indirect) value functions for the insurer and the reinsurer as follows: \(\forall \ (t, x)\in [0,T]\times {\mathbb {R}}\) and \(\forall \ (t, y)\in [0,T]\times {\mathbb {R}},\)

$$\begin{aligned} {\left\{ \begin{array}{ll}{} {\widehat{J}}_1^{v}(t, x):=\underset{u\in {\mathcal {U}}}{\sup }\, \left\{ {\mathbb {E}}^{{\mathbb {P}}}_{t,x}\left[ X^{u,v}(T)\right] -\frac{m_1}{2}\mathrm {Var}^{{\mathbb {P}}}_{t,x} \left[ X^{u,v}(T)\right] \right\} ,\\ {\widehat{J}}_2^{u}(t, y):=\underset{v\in {\mathcal {V}}}{\sup }\, \left\{ {\mathbb {E}}^{{\mathbb {P}}}_{t,y}\left[ Y^{u,v}(T)\right] -\frac{m_2}{2}\mathrm {Var}^{{\mathbb {P}}}_{t,y} \left[ Y^{u,v}(T)\right] \right\} . \end{array}\right. } \end{aligned}$$
(2.9)
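The criterion inside the suprema in (2.9) can be estimated from simulated terminal wealth. The following is a minimal sketch only: the function name `mean_variance_objective` and the toy samples are hypothetical and do not come from the model dynamics in (2.6) and (2.7).

```python
import statistics


def mean_variance_objective(terminal_wealth, m):
    """Sample estimate of E[X(T)] - (m / 2) * Var[X(T)].

    terminal_wealth: iterable of simulated terminal wealth values (hypothetical input).
    m: risk-aversion coefficient, assumed strictly positive.
    """
    samples = list(terminal_wealth)
    mean = statistics.fmean(samples)
    # Population variance of the samples around their own mean.
    variance = statistics.pvariance(samples, mu=mean)
    return mean - 0.5 * m * variance


# Toy illustration: the same samples are valued lower by a more risk-averse agent.
toy_samples = [1.0, 3.0]
low_aversion = mean_variance_objective(toy_samples, m=1.0)
high_aversion = mean_variance_objective(toy_samples, m=2.0)
```

For a fixed set of samples with positive variance, the estimate decreases as \(m_k\) increases, which reflects the role of \(m_k\) as a risk-aversion coefficient.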

Next, we incorporate ambiguity aversion into (2.9) to account for the decision-makers’ concerns about model misspecification. The rationale for incorporating ambiguity aversion is that both the insurer and the reinsurer may distrust the accuracy of the reference measure \({\mathbb {P}}\) and tend to select an alternative measure \({\mathbb {Q}}_k\) from \({\mathcal {Q}}\). Using a robust approach to ambiguity, the insurer and the reinsurer aim to solve the mean-variance optimization problems under the worst-case scenario of the alternative probability measure. The objective functions of the insurer and the reinsurer in the robust optimization problems are respectively given by:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} J_1^{v}(t, x) :=\underset{u\in {\mathcal {U}}}{\sup }\,\underset{{\mathbb {Q}}_1\in {\mathcal {Q}}}{\inf }\, \left\{ {\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ X^{u,v}(T)\right] \!-\!\frac{m_1}{2}\mathrm {Var}^{{\mathbb {Q}}_1}_{t,x} \left[ X^{u,v}(T)\right] \!+\!{\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ P_1({\mathbb {P}}\Vert {\mathbb {Q}}_1)\right] \right\} ,\\ J_2^{u}(t, y) :=\underset{v\in {\mathcal {V}}}{\sup }\,\underset{{\mathbb {Q}}_2\in {\mathcal {Q}}}{\inf }\, \left\{ {\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ Y^{u,v}(T)\right] \!-\!\frac{m_2}{2}\mathrm {Var}^{{\mathbb {Q}}_2}_{t,y} \left[ Y^{u,v}(T)\right] \!+\!{\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ P_2({\mathbb {P}}\Vert {\mathbb {Q}}_2)\right] \right\} , \end{array}\right. } \end{aligned}$$
(2.10)

where \(P_k({\mathbb {P}}\Vert {\mathbb {Q}}_k)\ge 0,\) for \(k\in \{1,2\},\) is a penalty function measuring the divergence of \({\mathbb {Q}}_k\) from \({\mathbb {P}}.\) We allow the insurer and the reinsurer to apply different penalty functions, \(P_1\) and \(P_2\), respectively. The penalty functions are interpreted as follows. If \( P_k({\mathbb {P}}\Vert {\mathbb {Q}}_k)\rightarrow \infty \), the decision-maker is completely confident in the reference model, since any alternative model that strays from it incurs an arbitrarily large penalty. In this case, the robust optimization problem in (2.10) reduces to the traditional optimization problem in (2.9) and the decision-maker has no preference for robustness. Conversely, if \( P_k({\mathbb {P}}\Vert {\mathbb {Q}}_k)\rightarrow 0\), i.e., the penalty term vanishes, the decision-maker does not penalize deviations to any alternative probability measure in \(\mathcal{Q}\), which indicates that the decision-maker is extremely ambiguity-averse. In this respect, the penalty function captures the decision-maker’s degree of confidence in the reference model.

Throughout this paper, under a principal–agent modelling framework, we refer to the insurer (resp., the reinsurer) and the agent (resp., the principal) interchangeably. The robust optimization problems for the insurer and the reinsurer under the principal–agent framework with ambiguity as well as the dynamic mean-variance criterion are presented in the following definitions.

Definition 2.3

The robust mean-variance optimization problem of the insurer is the following stochastic optimization problem:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \underset{u\in {\mathcal {U}}}{\sup } \,\underset{{\mathbb {Q}}_1\in {\mathcal {Q}}}{\inf }\, {\widetilde{J}}_1^{{\mathbb {Q}}_1, u, v}(t, x) \\ \qquad :=\underset{u\in {\mathcal {U}}}{\sup }\,\underset{{\mathbb {Q}}_1\in {\mathcal {Q}}}{\inf }\,\left\{ {\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ X^{u,v}(T)\right] \!-\!\frac{m_1}{2}\mathrm {Var}^{{\mathbb {Q}}_1}_{t,x} \left[ X^{u,v}(T)\right] \!+\!{\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ P_1({\mathbb {P}}\Vert {\mathbb {Q}}_1)\right] \right\} ,\\ \text {subject to that}\ X^{u,v}(t)\ \text {satisfies} \ (2.6),\ \text {for any}\ v\in {\mathcal {V}}. \end{array}\right. } \end{aligned}$$
(2.11)

Here, we define

$$\begin{aligned} J_1^{u, v}(t, x):=\underset{{\mathbb {Q}}_1\in {\mathcal {Q}}}{\inf }\,{\widetilde{J}}_1^{{\mathbb {Q}}_1, u, v}(t, x). \end{aligned}$$

Definition 2.4

The robust mean-variance optimization problem of the reinsurer is the following stochastic optimization problem:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \underset{v\in {\mathcal {V}}}{\sup }\,\underset{{\mathbb {Q}}_2\in {\mathcal {Q}}}{\inf }\, {\widetilde{J}}_2^{{\mathbb {Q}}_2, u^*, v}(t, y) \\ \qquad :=\underset{v\in {\mathcal {V}}}{\sup }\,\underset{{\mathbb {Q}}_2\in {\mathcal {Q}}}{\inf }\, \left\{ {\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ Y^{u^*,v}(T)\right] \!-\!\frac{m_2}{2}\mathrm {Var}^{{\mathbb {Q}}_2}_{t,y} \left[ Y^{u^*,v}(T)\right] \!+\!{\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ P_2({\mathbb {P}}\Vert {\mathbb {Q}}_2)\right] \right\} ,\\ \text {subject to that}\ Y^{u^*,v}(t)\ \text {satisfies} \ (2.7)\ \text {and}{\ u^*\ \text {is}\ \text {an}\ \text {optimal}\ \text {solution}}\ \text {to Problem}\ (2.11). \end{array}\right. } \end{aligned}$$
(2.12)

Here, we define

$$\begin{aligned} J_2^{u, v}(t, y):=\underset{{\mathbb {Q}}_2\in {\mathcal {Q}}}{\inf }\,{\widetilde{J}}_2^{{\mathbb {Q}}_2, u, v}(t, y). \end{aligned}$$

Employing the approach in Maenhout (2004), it is easy to show that the increase in the relative entropy in the infinitesimal period from t to \(t +\mathrm {d}t\) equals:

$$\begin{aligned} \frac{1}{2}\left[ \phi _{k1}^2(t)+\phi _{k2}^2(t) \right] \mathrm {d}t. \end{aligned}$$

To solve the problem in (2.11), we follow the work by Maenhout (2004) and specify the penalty function as follows:

$$\begin{aligned} P_1({\mathbb {P}}\Vert {\mathbb {Q}}_1)=\int _t^T\Psi _1\left( s,\phi _1(s),X^{u,v}(s)\right) \mathrm {d}s, \end{aligned}$$

and define the value function of the insurer as follows:

$$\begin{aligned} \begin{aligned} V_1(t,x,v)&:=\underset{u\in {\mathcal {U}}}{\sup }\,\underset{{\mathbb {Q}}_1\in {\mathcal {Q}}}{\inf }\, \Bigg \{{\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ X^{u, v}(T)\right] -\frac{m_1}{2}\mathrm {Var}^{{\mathbb {Q}}_1}_{t,x} \left[ X^{u, v}(T)\right] \\&\quad +{\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ \int _t^T\Psi _1\left( s,\phi _1(s),X^{u,v}(s)\right) \mathrm {d}s\right] \Bigg \}\\ \,&=\underset{u\in {\mathcal {U}}}{\sup }\,J_1^{u, v}(t, x), \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \Psi _1\left( s,\phi _1(s),X^{u,v}(s)\right) =\frac{\phi _{11}^2(s)}{2\psi _{11}\left( s,X^{u,v}(s)\right) }+\frac{\phi _{12}^2(s)}{2\psi _{12}\left( s,X^{u,v}(s)\right) }. \end{aligned}$$

For each \(j\in \{1,2\},\) \(\psi _{1j}\left( s,X^{u,v}(s)\right) \) is a strictly positive deterministic function of \((s,x)\). The larger \(\psi _{1j}\left( s,X^{u,v}(s)\right) \) is, the less a deviation from the reference model is penalized. Consequently, a larger value indicates that the AAI is less confident about the reference model and tends to consider other feasible models. In other words, a larger \(\psi _{1j}\left( s,X^{u,v}(s)\right) \) indicates that the insurer is more ambiguity-averse. For analytical tractability, as in Zeng et al. (2016), we assume that \(\psi _{1j}\) for each \(j\in \{1,2\}\) is a given state-independent function by putting:

$$\begin{aligned} \psi _{1j}(t, x)=\beta _{1j}, \end{aligned}$$

where \(\beta _{1j}\ge 0\) is the insurer’s ambiguity aversion coefficient, and \(\beta _{11}\) corresponds to the claim process and \(\beta _{12}\) corresponds to the stock price. As \(\beta _{1j}\) approaches zero, the insurer tends to be ambiguity-neutral about that kind of diffusion risk. Similarly, for the reinsurer’s robust optimization problem presented in (2.12), the following penalty function is adopted:

$$\begin{aligned} P_2({\mathbb {P}}\Vert {\mathbb {Q}}_2)=\int _t^T\Psi _2\left( s,\phi _2(s),Y^{u,v}(s)\right) \mathrm {d}s, \end{aligned}$$

and the value function of the reinsurer is now defined as:

$$\begin{aligned} \begin{aligned} V_2(t,y)&:=\underset{v\in {\mathcal {V}}}{\sup }\,\underset{{\mathbb {Q}}_2\in {\mathcal {Q}}}{\inf }\, \Bigg \{{\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ Y^{u, v}(T)\right] -\frac{m_2}{2}\mathrm {Var}^{{\mathbb {Q}}_2}_{t,y} \left[ Y^{u, v}(T)\right] \\&\quad +{\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ \int _t^T\Psi _2\left( s,\phi _2(s),Y^{u,v}(s)\right) \mathrm {d}s\right] \Bigg \}\\ \,&=\underset{v\in {\mathcal {V}}}{\sup }\,J_2^{u, v}(t, y), \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \Psi _2\left( s,\phi _2(s),Y^{u,v}(s)\right) =\frac{\phi _{21}^2(s)}{2\psi _{21}\left( s,Y^{u,v}(s)\right) }+\frac{\phi _{22}^2(s)}{2\psi _{22}\left( s,Y^{u,v}(s)\right) }. \end{aligned}$$

For each \(j\in \{1,2\},\) it is also supposed that \(\psi _{2j}\) is a fixed and state-independent function by setting:

$$\begin{aligned} \psi _{2j}(t, y)=\beta _{2j}, \end{aligned}$$

where \(\beta _{2j}\) is the ambiguity aversion parameter of the AAR with respect to the diffusion risk and \(\beta _{2j}\ge 0\). The reinsurer tends to be ambiguity-neutral for the diffusion risk when \(\beta _{2j} \rightarrow 0.\)
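With the state-independent choice \(\psi _{kj}=\beta _{kj}\), the integrands \(\Psi _1\) and \(\Psi _2\) reduce to a weighted sum of squares of the density generator. The sketch below evaluates this penalty rate; the function name `penalty_rate` is hypothetical, and the code assumes strictly positive \(\beta _{kj}\), because the ambiguity-neutral case \(\beta _{kj}\rightarrow 0\) is only a limit in which any nonzero deviation becomes infinitely costly.

```python
def penalty_rate(phi, beta):
    """Penalty integrand sum_j phi_j**2 / (2 * beta_j).

    phi: pair (phi_k1, phi_k2) of density-generator components.
    beta: pair (beta_k1, beta_k2) of ambiguity-aversion coefficients,
          assumed strictly positive in this sketch.
    """
    if len(phi) != len(beta):
        raise ValueError("phi and beta must have the same length")
    if any(b <= 0 for b in beta):
        raise ValueError("each beta must be strictly positive")
    return sum(p * p / (2.0 * b) for p, b in zip(phi, beta))


# Doubling each beta halves the penalty charged for the same deviation phi.
rate_small_beta = penalty_rate((1.0, 2.0), (0.5, 2.0))
rate_large_beta = penalty_rate((1.0, 2.0), (1.0, 4.0))
```

A larger \(\beta _{kj}\) makes a given deviation cheaper, so the worst-case measure can move further from \({\mathbb {P}}\), which is the sense in which a larger coefficient means stronger ambiguity aversion.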

To articulate the time-inconsistency issue in the principal–agent problem given in (2.11) and (2.12), we follow the approach in Björk and Murgoci (2010), Björk et al. (2014) and Kronborg and Steffensen (2015). Basically, they formulated the decision-maker’s optimization problem with time-inconsistency as a non-cooperative game and sought a subgame perfect Nash equilibrium. The equilibrium strategies and the equilibrium value functions for the optimization problems in (2.11) and (2.12) are defined below. These two definitions appear to be standard (e.g., Björk et al., 2014; Kronborg & Steffensen, 2015).

Definition 2.5

For any given reinsurance price \(\eta (t)\) and any initial states \((t,x) \in [0,T]\times {\mathbb {R}},\) let \(u^*(t)=\left( q^*(t), \pi ^*(t)\right) =(q^*(t,\eta (t)), \pi ^*(t))\) be an admissible strategy of the insurer, and we define the following (perturbed) reinsurance–investment strategy:

$$\begin{aligned}u^{\varepsilon }(s):= {\left\{ \begin{array}{ll} {\hat{u}}, \ \quad \qquad t\le s<t+\varepsilon ,\\ u^*(s), \ \quad t+\varepsilon \le s<T, \end{array}\right. } \end{aligned}$$

where \({\hat{u}}=({\hat{q}}, {\hat{\pi }})\) and \(\varepsilon \in {\mathbb {R}}_+.\) If \(\forall \ {\hat{u}}\in {\mathbb {R}}_+\times {\mathbb {R}}_+,\) we have

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0}\frac{J_1^{u^*, v}(t, x)-J_1^{u^{\varepsilon }, v}(t, x)}{\varepsilon }\ge 0, \end{aligned}$$

then \(u^*(t)\) is called an equilibrium reinsurance–investment strategy of the insurer and the equilibrium value function of the insurer is given by:

$$\begin{aligned} V_1(t,x,v)=J_1^{u^*, v}(t, x), \end{aligned}$$

where \(J_1^{u^*, v}(t, x)\) was defined in Definition 2.3.

Definition 2.6

For any initial states \((t,y)\in [0,T]\times {\mathbb {R}},\) let \(v^*(t)=(\eta ^*(t),{\tilde{\pi }}^*(t))\) be an admissible strategy of the reinsurer, and we define a perturbed strategy as follows:

$$\begin{aligned}v^{\varepsilon }(s):= {\left\{ \begin{array}{ll} {\bar{v}}, \ \quad \qquad t\le s<t+\varepsilon ,\\ v^*(s), \ \quad t+\varepsilon \le s<T, \end{array}\right. } \end{aligned}$$

where \({\bar{v}}=({\bar{\eta }}, {\bar{\pi }})\) and \(\varepsilon \in {\mathbb {R}}_+.\) If \(\forall \ {\bar{v}}\in {\mathbb {R}}_+\times {\mathbb {R}}_+,\) we have

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0}\frac{J_2^{u^*, v^*}(t, y)-J_2^{u^*, v^{\varepsilon }}(t, y)}{\varepsilon }\ge 0, \end{aligned}$$

then \(v^*(t)\) is called an equilibrium reinsurance–investment strategy of the reinsurer and the equilibrium value function of the reinsurer is given by:

$$\begin{aligned} V_2(t,y)=J_2^{u^*, v^*}(t, y), \end{aligned}$$

where \(J_2^{u^*, v^*}(t, y)\) was defined in Definition 2.4. Furthermore, when there is no risk of confusion, we write

$$\begin{aligned} V_1(t,x):=J_1^{u^*, v^*}(t, x). \end{aligned}$$

Notably, two game-theoretic problems arise in our model setting. The first is the game between the insurer and the reinsurer arising from the principal–agent perspective. The second is a non-cooperative game between each decision-maker at time t and future incarnations of themselves, which is introduced to handle the time-inconsistency of optimization problems involving mean-variance criteria. In particular, the equilibrium strategies in Definitions 2.5 and 2.6 are time-consistent. Hereafter, the equilibrium strategy used to solve (2.11) and satisfying Definition 2.5 is referred to as the robust optimal time-consistent strategy of the insurer; the equilibrium strategy used to solve (2.12) and satisfying Definition 2.6 is referred to as the robust optimal time-consistent strategy of the reinsurer; the corresponding equilibrium value functions satisfying Definitions 2.5 and 2.6 are referred to as the optimal value functions of the insurer and the reinsurer, respectively.

3 Solution to the robust reinsurance contract

In this section, the verification theorems are presented and the robust equilibrium reinsurance–investment strategies of the insurer and the reinsurer are derived. Let \(C^{1,2}([0,T]\times {\mathbb {R}})\) denote the space of functions \(f(t,x)\) that are continuously differentiable in \(t \in [0, T]\) and twice continuously differentiable in \(x \in {\mathbb {R}}.\) Write \(D_p^{1,2}([0,T]\times {\mathbb {R}})\) for the space of functions \(f (t, x) \in C^{1,2}([0,T]\times {\mathbb {R}})\) whose first-order partial derivatives all satisfy a polynomial growth condition.

3.1 The insurer’s problem

For notational brevity, we suppress the arguments of the functions in what follows. For all \((t, x) \in [0,T]\times {\mathbb {R}},\) we define the infinitesimal generator \({\mathcal {L}}_1\) acting on \(W_1(t,x)\in C^{1,2}([0,T]\times {\mathbb {R}})\) as follows:

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_1^{u, v, \phi _1, \phi _2} W_1(t, x)&:=\frac{\partial W_1(t, x)}{\partial t}+\Big [rx+({\tilde{\mu }}-r)\pi +\lambda \mu (\theta -\eta )+\lambda \mu \eta q+\sigma \sqrt{\lambda }\phi _{11}q\\&\quad +{\tilde{\sigma }}\phi _{12}\pi \Big ]\frac{\partial W_1(t, x)}{\partial x} +\frac{1}{2}\left( \lambda \sigma ^2q^2+{\tilde{\sigma }}^2\pi ^2\right) \frac{\partial ^2W_1(t, x)}{\partial x^2}. \end{aligned} \end{aligned}$$

Theorem 3.1

(Verification Theorem for the insurer’s optimization problem) For Problem (2.11), if there exist real-valued functions \(W_1(t,x)\) and \(g_1(t,x)\in D_p^{1,2}([0,T] \times {\mathbb {R}})\) that satisfy the following extended HJB system of equations: \(\forall \ (t, x) \in [0,T]\times {\mathbb {R}},\)

$$\begin{aligned}&\underset{u\in {\mathcal {U}}}{\sup }\,\underset{\phi _1\in \Sigma _1}{\inf }\,\Bigg \{ {\mathcal {L}}_1^{u, v, \phi _1, \phi _2} W_1(t, x)-{\mathcal {L}}_1^{u, v, \phi _1, \phi _2}\frac{m_1}{2}g_1^2(t,x)\nonumber \\&\quad + m_1g_1(t,x){\mathcal {L}}_1^{u, v, \phi _1, \phi _2}g_1(t,x) +\sum _{j=1}^2\frac{\phi _{1j}^2}{2\beta _{1j}}\Bigg \}=0, \end{aligned}$$
(3.1)
$$\begin{aligned}&W_1(T, x)=x, \end{aligned}$$
(3.2)
$$\begin{aligned}&g_1(T, x)=x, \end{aligned}$$
(3.3)
$$\begin{aligned}&{\mathcal {L}}_1^{u^*, v, \phi _1^*, \phi _2}g_1(t, x)=0, \end{aligned}$$
(3.4)

where

$$\begin{aligned} \begin{aligned} (u^*, \phi _1^*):&=\arg \underset{u\in {\mathcal {U}}}{\sup }\,\underset{\phi _1\in \Sigma _1}{\inf }\,\Bigg \{ {\mathcal {L}}_1^{u, v, \phi _1, \phi _2} W_1(t, x)-{\mathcal {L}}_1^{u, v, \phi _1, \phi _2}\frac{m_1}{2}g_1^2(t,x)\\&\quad +m_1g_1(t,x){\mathcal {L}}_1^{u, v, \phi _1, \phi _2}g_1(t,x) \!+\!\sum _{j=1}^2\frac{\phi _{1j}^2}{2\beta _{1j}}\Bigg \}, \end{aligned} \end{aligned}$$

then we have \(W_1(t, x)=V_1(t,x),\) \({\mathbb {E}}^{{\mathbb {Q}}_1^*}_{t,x}\left[ X^{u^*, v}(T)\right] =g_1(t, x),\) \(u^*\) is the robust equilibrium reinsurance–investment strategy of the insurer, and \(\phi _1^*\) is the worst-case scenario density generator of the insurer.

Proof

The proof of this theorem is similar to that of Theorem 4.1 in Björk and Murgoci (2010), and thus, it does not need to be repeated here. \(\square \)

Simplifying Eq. (3.1) in Theorem 3.1, the following is obtained:

$$\begin{aligned}&\underset{u\in {\mathcal {U}}}{\sup }\,\underset{\phi _1\in \Sigma _1}{\inf }\,\Bigg \{\frac{\partial W_1(t, x)}{\partial t}+\Big [rx+({\tilde{\mu }}-r)\pi +\lambda \mu (\theta -\eta )+\lambda \mu \eta q+\sigma \sqrt{\lambda }\phi _{11}q +{\tilde{\sigma }}\phi _{12}\pi \Big ]\nonumber \\&\qquad \frac{\partial W_1(t, x)}{\partial x}+\frac{1}{2}\left( \lambda \sigma ^2q^2+{\tilde{\sigma }}^2\pi ^2\right) \left( \frac{\partial ^2W_1(t, x)}{\partial x^2}-m_1\left( \frac{\partial g_1(t, x)}{\partial x}\right) ^2\right) +\frac{\phi _{11}^2}{2\beta _{11}}+\frac{\phi _{12}^2}{2\beta _{12}}\Bigg \}\nonumber \\&\quad =0. \end{aligned}$$
(3.5)
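The step from (3.1) to (3.5) can be checked directly. In the display below, the shorthand b for the drift coefficient in \({\mathcal {L}}_1\) and a for the squared diffusion coefficient is introduced only for this check and is not part of the original notation; the superscripts of \({\mathcal {L}}_1\) are suppressed.

$$\begin{aligned} {\mathcal {L}}_1 f&=\frac{\partial f}{\partial t}+b\frac{\partial f}{\partial x}+\frac{a}{2}\frac{\partial ^2 f}{\partial x^2},\qquad a:=\lambda \sigma ^2q^2+{\tilde{\sigma }}^2\pi ^2,\\ {\mathcal {L}}_1 g_1^2&=2g_1\frac{\partial g_1}{\partial t}+2bg_1\frac{\partial g_1}{\partial x}+a\left[ \left( \frac{\partial g_1}{\partial x}\right) ^2+g_1\frac{\partial ^2 g_1}{\partial x^2}\right] ,\\ -\frac{m_1}{2}{\mathcal {L}}_1 g_1^2+m_1g_1{\mathcal {L}}_1 g_1&=-\frac{m_1a}{2}\left( \frac{\partial g_1}{\partial x}\right) ^2. \end{aligned}$$

The terms in \(\partial g_1/\partial t\), \(b\) and \(\partial ^2 g_1/\partial x^2\) cancel, so only the squared first derivative of \(g_1\) survives, which is the term multiplying \(m_1\) in (3.5).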

To solve (3.4) and (3.5), it is conjectured that the solutions have the following separated affine forms:

$$\begin{aligned} \begin{aligned} W_1(t,x)&=A_1(t)x+B_1(t),\ A_1(T)=1,\ B_1(T)=0,\\ g_1(t,x)&={\widetilde{A}}_1(t)x+{\widetilde{B}}_1(t),\ {\widetilde{A}}_1(T)=1,\ {\widetilde{B}}_1(T)=0, \end{aligned} \end{aligned}$$
(3.6)

where the terminal conditions for \(A_1\), \(B_1\), \({\widetilde{A}}_1\) and \({\widetilde{B}}_1\) are determined from the terminal conditions for \(W_1\) and \(g_1\) in (3.2) and (3.3). These functions are supposed to be sufficiently smooth. Differentiating \(W_1\) and \(g_1\) with respect to t and x gives:

$$\begin{aligned}&\frac{\partial W_1(t, x)}{\partial t}=A_1'(t)x+B_1'(t),\qquad \frac{\partial W_1(t, x)}{\partial x}=A_1(t),\qquad \frac{\partial ^2 W_1(t, x)}{\partial x^2}=0, \end{aligned}$$
(3.7)
$$\begin{aligned}&\frac{\partial g_1(t, x)}{\partial t}= {\widetilde{A}}_1'(t)x+ {\widetilde{B}}_1'(t), \qquad \frac{\partial g_1(t, x)}{\partial x}={\widetilde{A}}_1(t),\qquad \frac{\partial ^2 g_1(t, x)}{\partial x^2}=0. \end{aligned}$$
(3.8)

Substituting (3.7) and (3.8) into (3.5) yields:

$$\begin{aligned} \begin{aligned}&\underset{u\in {\mathcal {U}}}{\sup }\,\underset{\phi _1\in \Sigma _1}{\inf }\,\Bigg \{ A_1'x+ B_1'+\Big [rx+({\tilde{\mu }}-r)\pi +\lambda \mu (\theta -\eta )+\lambda \mu \eta q+\sigma \sqrt{\lambda }\phi _{11}q+{\tilde{\sigma }}\phi _{12}\pi \Big ]A_1 \\&\quad -\frac{m_1{\widetilde{A}}_1^2}{2}\left( \lambda \sigma ^2q^2+{\tilde{\sigma }}^2\pi ^2\right) +\frac{\phi _{11}^2}{2\beta _{11}}+\frac{\phi _{12}^2}{2\beta _{12}}\Bigg \}=0. \end{aligned} \end{aligned}$$
(3.9)

For each fixed u, the first-order optimality condition on the value function with respect to \(\phi _1\) yields the infimum point \(\phi _1^*(t):=(\phi _{11}^*(t),\phi _{12}^*(t))\) as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} \phi _{11}^*(t)&{}=-\beta _{11}\sigma \sqrt{\lambda }A_1(t)q(t),\\ \phi _{12}^*(t)&{}=-\beta _{12}{\tilde{\sigma }}A_1(t)\pi (t). \end{aligned} \end{array}\right. } \end{aligned}$$
(3.10)

Next, \(\phi _1^*\) given in (3.10) is justified as the infimum point by evaluating the second-order derivatives, which is to check the convexity conditions. To this end, we gather the terms of \(\phi _{1j},\) for \(j\in \{1,2\},\) in (3.9) and define the following functions:

$$\begin{aligned} {\left\{ \begin{array}{ll} f_{1}(\phi _{11}):=\sigma \sqrt{\lambda }q\phi _{11}A_1+\frac{\phi _{11}^2}{2\beta _{11}},\\ f_{2}(\phi _{12}):={\tilde{\sigma }}\pi \phi _{12}A_1+\frac{\phi _{12}^2}{2\beta _{12}}. \end{array}\right. } \end{aligned}$$

Accordingly, we have that:

$$\begin{aligned} f_j''(\phi _{1j})=\frac{1}{\beta _{1j}}>0,\quad \ j\in \{1,2\}, \end{aligned}$$

which implies that the first-order optimality condition gives rise to the infimum point of the left-hand side of (3.9).
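The minimization over \(\phi _{1j}\) is a one-dimensional convex quadratic problem, so the minimizer in (3.10) can be checked numerically. The sketch below treats the coefficient of the linear term (for example \(\sigma \sqrt{\lambda }qA_1\) in \(f_1\)) as a single number `coeff`; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def f(phi, beta, coeff):
    """Quadratic f(phi) = coeff * phi + phi**2 / (2 * beta), with beta > 0."""
    return coeff * phi + phi * phi / (2.0 * beta)


def worst_case_phi(beta, coeff):
    """Root of f'(phi) = coeff + phi / beta, i.e. phi* = -beta * coeff."""
    return -beta * coeff


# Illustrative numbers (assumptions): sigma = 0.5, lambda = 4, q = 0.8, A_1 = 1.1.
coeff = 0.5 * math.sqrt(4.0) * 0.8 * 1.1
beta11 = 2.0
phi_star = worst_case_phi(beta11, coeff)

# Because f'' = 1 / beta > 0, any perturbation of phi* increases f.
is_minimum = all(
    f(phi_star, beta11, coeff) < f(phi_star + d, beta11, coeff)
    for d in (-0.5, -0.01, 0.01, 0.5)
)
```

The minimizer has the opposite sign to `coeff` and scales linearly with \(\beta _{1j}\), matching the form of (3.10).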

Putting (3.10) back into (3.9), we obtain:

$$\begin{aligned}&\underset{u\in {\mathcal {U}}}{\sup }\,\Bigg \{ A_1'x\!+\! B_1'\!+\!\Big [rx+({\tilde{\mu }}-r)\pi +\lambda \mu (\theta -\eta )+\lambda \mu \eta q-\beta _{11}\sigma ^2\lambda A_1q^2-\beta _{12}{\tilde{\sigma }}^2 A_1\pi ^2\Big ]A_1\nonumber \\&\quad -\frac{m_1{\widetilde{A}}_1^2}{2}\left( \lambda \sigma ^2q^2+{\tilde{\sigma }}^2\pi ^2\right) +\frac{\beta _{11}\lambda \sigma ^2A_1^2q^2}{2}+\frac{\beta _{12}{\tilde{\sigma }}^2A_1^2\pi ^2}{2}\Bigg \}=0. \end{aligned}$$
(3.11)

The first-order optimality condition on the value function with respect to u yields the optimal reinsurance–investment strategy \(u^*(t):=(q^*(t),\pi ^*(t))\) of the insurer as follows:

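$$\begin{aligned}&q^*(t)=\frac{\mu \eta A_1(t)}{\sigma ^2\left( m_1{\widetilde{A}}_1^2(t)+\beta _{11}A_1^2(t)\right) }, \end{aligned}$$
(3.12a)
$$\begin{aligned}&\pi ^*(t)=\frac{({\tilde{\mu }}-r)A_1(t)}{{\tilde{\sigma }}^2\left( m_1{\widetilde{A}}_1^2(t)+\beta _{12}A_1^2(t)\right) }. \end{aligned}$$
(3.12b)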

To check that \(u^*\) is the maximum point, we define

$$\begin{aligned} h_1(\pi ):=\left[ ({{\tilde{\mu }}}-r)\pi -\beta _{12}{\tilde{\sigma }}^2A_1\pi ^2\right] A_1-\frac{{\tilde{\sigma }}^2\pi ^2(m_1{\widetilde{A}}_1^2-\beta _{12}A_1^2)}{2}, \end{aligned}$$

and we then have the following second-order condition:

$$\begin{aligned} h_1''(\pi )=-m_1{\widetilde{A}}_1^2{\tilde{\sigma }}^2-\beta _{12}A_1^2{\tilde{\sigma }}^2<0. \end{aligned}$$

Finally, the function involving the insurer’s reinsurance strategy is defined as:

$$\begin{aligned} h_2(q):=\lambda \mu q\eta A_1-\frac{m_1{\widetilde{A}}_1^2}{2}\lambda \sigma ^2q^2-\frac{\beta _{11}\sigma ^2\lambda q^2A_1^2}{2}, \end{aligned}$$

which leads to the following second-order condition:

$$\begin{aligned} h_2''(q)=-m_1{\widetilde{A}}_1^2\lambda \sigma ^2-\beta _{11}\sigma ^2\lambda A_1^2<0. \end{aligned}$$

Therefore, the reinsurance–investment strategy given in (3.12) is the maximizer of the left-hand side of (3.11).
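Setting \(h_2'(q)=0\) and \(h_1'(\pi )=0\) gives \(q^*=\mu \eta A_1/[\sigma ^2(m_1{\widetilde{A}}_1^2+\beta _{11}A_1^2)]\) and \(\pi ^*=({\tilde{\mu }}-r)A_1/[{\tilde{\sigma }}^2(m_1{\widetilde{A}}_1^2+\beta _{12}A_1^2)]\). The following sketch evaluates these expressions and confirms the maximum of \(h_2\) numerically. The function names and all parameter values are hypothetical, and \(A_1\), \({\widetilde{A}}_1\) are passed in as plain numbers.

```python
def insurer_equilibrium(mu, sigma, mu_s, sigma_s, r, eta,
                        m1, beta11, beta12, A1, A1_tilde):
    """Stationary points of h_2 (in q) and h_1 (in pi).

    mu, sigma: claim-size parameters; mu_s, sigma_s: stock drift and volatility;
    eta: given reinsurance price; A1, A1_tilde: values of A_1(t) and tilde A_1(t).
    """
    q = mu * eta * A1 / (sigma ** 2 * (m1 * A1_tilde ** 2 + beta11 * A1 ** 2))
    pi = (mu_s - r) * A1 / (sigma_s ** 2 * (m1 * A1_tilde ** 2 + beta12 * A1 ** 2))
    return q, pi


def h2(q, lam, mu, sigma, eta, m1, beta11, A1, A1_tilde):
    """The q-dependent part of the left-hand side of (3.11)."""
    return (lam * mu * q * eta * A1
            - 0.5 * m1 * A1_tilde ** 2 * lam * sigma ** 2 * q ** 2
            - 0.5 * beta11 * sigma ** 2 * lam * q ** 2 * A1 ** 2)


# Illustrative parameter values (assumptions, not taken from the source).
params = dict(mu=1.0, sigma=1.0, mu_s=0.08, sigma_s=0.5, r=0.03, eta=0.5,
              m1=1.0, beta11=1.0, beta12=1.0, A1=1.0, A1_tilde=1.0)
q_star, pi_star = insurer_equilibrium(**params)

h2_args = dict(lam=2.0, mu=1.0, sigma=1.0, eta=0.5, m1=1.0, beta11=1.0,
               A1=1.0, A1_tilde=1.0)
q_is_max = all(h2(q_star, **h2_args) > h2(q_star + d, **h2_args)
               for d in (-0.1, -0.01, 0.01, 0.1))
```

In this sketch the retention level is proportional to the reinsurance price, so a higher price leads the insurer to retain more of each claim.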

Substituting \(q^*\) and \(\pi ^*\) in (3.12) into (3.4) and (3.11), we obtain:

$$\begin{aligned} \begin{aligned}&\left( {\widetilde{A}}_1'+r{\widetilde{A}}_1\right) x+{\widetilde{B}}_1' +\Big [({\tilde{\mu }}-r)\pi ^*+\lambda \mu (\theta -\eta )+\lambda \mu \eta q^*\\&\quad -\beta _{11}\sigma ^2\lambda A_1(q^*)^2-\beta _{12}{\tilde{\sigma }}^2 A_1(\pi ^*)^2\Big ]{\widetilde{A}}_1=0, \end{aligned} \end{aligned}$$
(3.13)

and

$$\begin{aligned} \begin{aligned}&\left( A_1'+rA_1\right) x+ B_1'+\Big [({\tilde{\mu }}-r)\pi ^*+\lambda \mu (\theta -\eta )+\lambda \mu \eta q^*\Big ]A_1\\&\quad -\lambda \sigma ^2(q^*)^2\left( \frac{m_1{\widetilde{A}}_1^2}{2}+\frac{\beta _{11}A_1^2}{2}\right) -{\tilde{\sigma }}^2(\pi ^*)^2\left( \frac{m_1{\widetilde{A}}_1^2}{2}+\frac{\beta _{12}A_1^2}{2}\right) =0. \end{aligned} \end{aligned}$$
(3.14)

By separating the variables with and without x respectively, we can obtain the following system of equations:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} &{}{\widetilde{A}}_1'+r{\widetilde{A}}_1=0,\qquad A_1'+rA_1=0,\\ &{}{\widetilde{B}}_1' +\Big [({\tilde{\mu }}-r)\pi ^*+\lambda \mu (\theta -\eta )+\lambda \mu \eta q^* -\beta _{11}\sigma ^2\lambda A_1(q^*)^2-\beta _{12}{\tilde{\sigma }}^2 A_1(\pi ^*)^2\Big ]{\widetilde{A}}_1=0,\\ &{}B_1'+\Big [({\tilde{\mu }}-r)\pi ^*+\lambda \mu (\theta -\eta )+\lambda \mu \eta q^*\Big ]A_1\\ &{}\quad -\lambda \sigma ^2(q^*)^2\left( \frac{m_1{\widetilde{A}}_1^2}{2}+\frac{\beta _{11}A_1^2}{2}\right) -{\tilde{\sigma }}^2(\pi ^*)^2\left( \frac{m_1{\widetilde{A}}_1^2}{2}+\frac{\beta _{12}A_1^2}{2}\right) =0. \end{aligned} \end{array}\right. } \end{aligned}$$

Solving the above equations with the respective boundary conditions in (3.6) gives:

$$\begin{aligned} {\widetilde{A}}_1(t)= & {} e^{r(T-t)}, \qquad A_1(t)=e^{r(T-t)}, \\ {\widetilde{B}}_1(t)= & {} \int _t^T\lambda \mu (\theta -\eta )e^{r(T-s)}\mathrm {d}s +\int _t^T{\tilde{b}}_{11}(s)\mathrm {d}s+\int _t^T{\tilde{b}}_{12}(s)\mathrm {d}s, \\ B_1(t)= & {} \int _t^T\lambda \mu (\theta -\eta )e^{r(T-s)}\mathrm {d}s+\int _t^Tb_{11}(s)\mathrm {d}s+\int _t^Tb_{12}(s)\mathrm {d}s, \end{aligned}$$

where

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} {\tilde{b}}_{11}(s)&{}=\left[ \lambda \mu \eta q^*(s)-\beta _{11}\sigma ^2\lambda (q^*(s))^2e^{r(T-s)}\right] e^{r(T-s)},\\ {\tilde{b}}_{12}(s)&{}=\left[ ({\tilde{\mu }}-r)\pi ^*(s)-\beta _{12}{\tilde{\sigma }}^2(\pi ^*(s))^2e^{r(T-s)}\right] e^{r(T-s)}, \end{aligned} \end{array}\right. } \end{aligned}$$
(3.15)

and

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} b_{11}(s)&{}=\left[ \lambda \mu \eta q^*(s)-\frac{\lambda \sigma ^2(q^*(s))^2}{2}(m_1+\beta _{11})e^{r(T-s)}\right] e^{r(T-s)},\\ b_{12}(s)&{}=\left[ ({\tilde{\mu }}-r)\pi ^*(s)-\frac{{\tilde{\sigma }}^2(\pi ^*(s))^2}{2}(m_1+\beta _{12})e^{r(T-s)}\right] e^{r(T-s)}. \end{aligned} \end{array}\right. } \end{aligned}$$
(3.16)
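The integrand \(b_{11}\) in (3.16) simplifies considerably once \(q^*\) is inserted. With \(A_1={\widetilde{A}}_1=e^{r(T-s)}\), the first-order condition of (3.11) gives \(q^*(s)=\mu \eta e^{-r(T-s)}/[\sigma ^2(m_1+\beta _{11})]\), and \(b_{11}(s)\) then collapses to \(\lambda \mu ^2\eta ^2/[2\sigma ^2(m_1+\beta _{11})]\). The sketch below checks this numerically under the additional assumption, made here only for illustration, that \(\eta \) is constant in time; function names and parameter values are hypothetical.

```python
import math


def q_star(s, T, r, mu, sigma, eta, m1, beta11):
    """Retention level from the first-order condition of (3.11), with A_1 = exp(r (T - s))."""
    return mu * eta * math.exp(-r * (T - s)) / (sigma ** 2 * (m1 + beta11))


def b11(s, T, r, lam, mu, sigma, eta, m1, beta11):
    """Integrand b_11(s) as written in (3.16)."""
    growth = math.exp(r * (T - s))
    q = q_star(s, T, r, mu, sigma, eta, m1, beta11)
    inner = lam * mu * eta * q - 0.5 * lam * sigma ** 2 * q ** 2 * (m1 + beta11) * growth
    return inner * growth


def b11_closed_form(lam, mu, sigma, eta, m1, beta11):
    """Simplified value lam * mu**2 * eta**2 / (2 * sigma**2 * (m1 + beta11))."""
    return lam * mu ** 2 * eta ** 2 / (2.0 * sigma ** 2 * (m1 + beta11))


# Illustrative parameter values (assumptions, not taken from the source).
common = dict(lam=2.0, mu=1.0, sigma=1.0, eta=0.5, m1=1.0, beta11=1.0)
values = [b11(s, T=1.0, r=0.05, **common) for s in (0.0, 0.3, 0.9)]
target = b11_closed_form(**common)
```

Under the constant-price assumption, the integral of \(b_{11}\) over \([t,T]\) is therefore \((T-t)\) times this constant, and it shrinks as either \(m_1\) or \(\beta _{11}\) grows.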

It is worth noting that the solution to the insurer’s robust optimization problem in (2.11) is derived for a given reinsurance premium \(\eta \); the equilibrium reinsurance price \(\eta ^*\) will be determined in the next subsection. A proportional reinsurance contract \((q,\eta )\) is called incentive compatible if the agent’s retained fraction q of each claim and the reinsurance premium \(\eta \) set by the reinsurer satisfy (3.12a). As shown in (3.12a), the agent’s optimal retention level of the insurance risk increases linearly with the given reinsurance price. This is consistent with the economic interpretation that the insurer decides how much insurance risk to transfer once the reinsurance price is given. Equivalently, the optimal demand for reinsurance protection, \(1-q^*\), decreases as the reinsurance price increases, which is consistent with the law of demand, one of the fundamental principles in economics. A similar conclusion was drawn in Wang and Siu (2020), where a robust optimal reinsurance agreement with a VaR risk constraint was derived.

3.2 The reinsurer’s problem

The optimization problem of the reinsurer is discussed in this subsection. First, for all \((t, y)\in [0,T]\times {\mathbb {R}},\) we define an infinitesimal generator \({\mathcal {L}}_2\) acting on \(W_2(t,y)\in C^{1,2}([0,T]\times {\mathbb {R}})\) as follows:

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_2^{u, v, \phi _1, \phi _2} W_2(t, y) :=&\frac{\partial W_2(t, y)}{\partial t}+\Big [ry+({\tilde{\mu }}-r){\tilde{\pi }}+\lambda \mu \eta (1-q) +\sigma \sqrt{\lambda }\phi _{21}(1-q)\\&+{\tilde{\sigma }}\phi _{22}{\tilde{\pi }}\Big ]\frac{\partial W_2(t, y)}{\partial y} +\frac{1}{2}\left( \lambda \sigma ^2(1-q)^2+{\tilde{\sigma }}^2{\tilde{\pi }}^2\right) \frac{\partial ^2W_2(t, y)}{\partial y^2}. \end{aligned} \end{aligned}$$

The following verification theorem for the reinsurer is stated without giving the proof, which follows similarly from that of Theorem 4.1 in Björk and Murgoci (2010).

Theorem 3.2

(Verification Theorem for the reinsurer’s optimization problem) For Problem (2.12), if there exist real-valued functions \(W_2(t,y)\) and \(g_2(t,y)\in D_p^{1,2}([0,T] \times {\mathbb {R}})\) satisfying the following extended HJB system of equations: \(\forall \ (t, y)\in [0,T]\times {\mathbb {R}},\)

$$\begin{aligned}&\underset{v\in {\mathcal {V}}}{\sup }\,\underset{\phi _2\in \Sigma _2}{\inf }\,\Bigg \{ {\mathcal {L}}_2^{u^*, v, \phi _1^*, \phi _2} W_2(t, y)-{\mathcal {L}}_2^{u^*, v, \phi _1^*, \phi _2}\frac{m_2}{2}g_2^2(t,y)\nonumber \\&\quad +m_2g_2(t,y){\mathcal {L}}_2^{u^*, v, \phi _1^*, \phi _2}g_2(t,y) +\sum _{j=1}^2\frac{\phi _{2j}^2}{2\beta _{2j}}\Bigg \}=0, \end{aligned}$$
(3.17)
$$\begin{aligned}&W_2(T, y)=y, \end{aligned}$$
(3.18)
$$\begin{aligned}&g_2(T, y)=y, \end{aligned}$$
(3.19)
$$\begin{aligned}&{\mathcal {L}}_2^{u^*, v^*, \phi _1^*, \phi _2^*}g_2(t, y)=0, \end{aligned}$$
(3.20)

where

$$\begin{aligned} \begin{aligned} (v^*, \phi _2^*)&:=\arg \underset{v\in {\mathcal {V}}}{\sup }\,\underset{\phi _2\in \Sigma _2}{\inf }\,\Bigg \{ {\mathcal {L}}_2^{u^*, v, \phi _1^*, \phi _2} W_2(t, y)-{\mathcal {L}}_2^{u^*, v, \phi _1^*, \phi _2}\frac{m_2}{2}g_2^2(t,y)\\&\quad +\!m_2g_2(t,y){\mathcal {L}}_2^{u^*, v, \phi _1^*, \phi _2}g_2(t,y) \!+\!\sum _{j=1}^2\frac{\phi _{2j}^2}{2\beta _{2j}}\Bigg \}, \end{aligned} \end{aligned}$$

\(u^*\) is the robust equilibrium reinsurance–investment strategy of the insurer, and \(\phi ^*_1\) is the worst-case scenario density generator of the insurer in Theorem 3.1, then we have \(W_2(t, y)=V_2(t, y),\) \({\mathbb {E}}^{{\mathbb {Q}}_2^*}_{t,y}\left[ Y^{u^*, v^*}(T)\right] =g_2(t, y),\) \(v^*\) is the robust equilibrium reinsurance–investment strategy of the reinsurer, and \(\phi _2^*\) is the worst-case scenario density generator of the reinsurer.

Equation (3.17) in Theorem 3.2 is equivalent to:

$$\begin{aligned} \begin{aligned}&\underset{v\in {\mathcal {V}}}{\sup }\,\underset{\phi _2\in \Sigma _2}{\inf }\,\Bigg \{ \frac{\partial W_2(t, y)}{\partial t}+\left[ ry+({\tilde{\mu }}-r){\tilde{\pi }}+\lambda \mu \eta (1-q^*) +\sigma \sqrt{\lambda }\phi _{21}(1-q^*)+{\tilde{\sigma }}\phi _{22}{\tilde{\pi }}\right] \\&\quad \times \frac{\partial W_2(t, y)}{\partial y} \!+\!\frac{1}{2}\left( \lambda \sigma ^2(1\!-\!q^*)^2\!+\!{\tilde{\sigma }}^2{\tilde{\pi }}^2\right) \left( \frac{\partial ^2W_2(t, y)}{\partial y^2}\!-\!m_2\left( \frac{\partial g_2(t, y)}{\partial y}\right) ^2\right) \\&\quad +\frac{\phi _{21}^2}{2\beta _{21}}+\frac{\phi _{22}^2}{2\beta _{22}} \Bigg \}=0. \end{aligned} \end{aligned}$$
(3.21)

To solve (3.20) and (3.21), the following trial solutions that are of affine forms are considered:

$$\begin{aligned} {\left\{ \begin{array}{ll} W_2(t,y)=A_2(t)y+B_2(t),\\ A_2(T)=1,\ \ B_2(T)=0, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\left\{ \begin{array}{ll} g_2(t,y)={\widetilde{A}}_2(t)y+{\widetilde{B}}_2(t),\\ {\widetilde{A}}_2(T)=1,\ \ {\widetilde{B}}_2(T)=0. \end{array}\right. } \end{aligned}$$

Again, the terminal conditions for \(A_2\), \(B_2\), \({\widetilde{A}}_2\) and \({\widetilde{B}}_2\) are determined from the terminal conditions for \(W_2\) and \(g_2\). These functions are supposed to be sufficiently smooth.

Putting the corresponding partial derivatives of \(W_2\) and \(g_2\) into (3.21), we obtain:

$$\begin{aligned}&\underset{v\in {\mathcal {V}}}{\sup }\,\underset{\phi _2\in \Sigma _2}{\inf }\,\Bigg \{ A_2'y+B_2'+\left[ ry+({\tilde{\mu }}-r){\tilde{\pi }}+\lambda \mu \eta (1-q^*) +\sigma \sqrt{\lambda }\phi _{21}(1-q^*) +{\tilde{\sigma }}\phi _{22}{\tilde{\pi }}\right] A_2\nonumber \\&\quad -\frac{m_2{\widetilde{A}}_2^2}{2}\left( \lambda \sigma ^2(1-q^*)^2+{\tilde{\sigma }}^2{\tilde{\pi }}^2\right) +\frac{\phi _{21}^2}{2\beta _{21}}+\frac{\phi _{22}^2}{2\beta _{22}} \Bigg \}=0. \end{aligned}$$
(3.22)

For each fixed v, the first-order optimality condition for \(\phi _2\) yields the minimum point \(\phi _2^*(t):=(\phi _{21}^*(t),\phi _{22}^*(t))\) as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} \phi _{21}^*(t)&{}=-\beta _{21}\sigma \sqrt{\lambda }A_2(t)(1-q^*(t)),\\ \phi _{22}^*(t)&{}=-\beta _{22}{\tilde{\sigma }}A_2(t){\tilde{\pi }}(t). \end{aligned} \end{array}\right. } \end{aligned}$$
(3.23)

Procedures similar to those in Sect. 3.1 can be employed to verify that \(\phi ^*_2\) given in (3.23) gives rise to the minimum point of the left-hand side of (3.22), so we do not repeat them here.

Substituting (3.23) into (3.22), we obtain:

$$\begin{aligned}&\underset{v\in {\mathcal {V}}}{\sup }\,\Bigg \{ A_2'y+ B_2'+\Big [ry+({\tilde{\mu }}-r){\tilde{\pi }}+\lambda \mu \eta (1-q^*)-\beta _{21}\sigma ^2\lambda A_2(1-q^*)^2-\beta _{22}{\tilde{\sigma }}^2 A_2{\tilde{\pi }}^2\Big ]A_2\nonumber \\&\quad -\frac{m_2{\widetilde{A}}_2^2}{2}\left[ \lambda \sigma ^2(1-q^*)^2+{\tilde{\sigma }}^2{\tilde{\pi }}^2\right] +\frac{\beta _{21}\lambda \sigma ^2A_2^2(1-q^*)^2}{2}+\frac{\beta _{22}{\tilde{\sigma }}^2A_2^2{\tilde{\pi }}^2}{2}\Bigg \}=0. \end{aligned}$$
(3.24)

Substituting \(q^*\) in (3.12a) into (3.24), we obtain:

$$\begin{aligned} \begin{aligned}&\underset{v\in {\mathcal {V}}}{\sup }\,\Biggl \{ A_2'y+ B_2'+\Bigg [ry+({\tilde{\mu }}-r){\tilde{\pi }}+\lambda \mu \eta -\frac{\lambda \mu ^2\eta ^2A_1}{\beta _{11}\sigma ^2A_1^2+m_1\sigma ^2{\widetilde{A}}_1^2}\Bigg ]A_2\\&\quad -\lambda \sigma ^2\left( \frac{m_2{\widetilde{A}}_2^2}{2}+\frac{\beta _{21}A_2^2}{2}\right) \Bigg (1-\frac{2\mu \eta A_1}{\beta _{11}\sigma ^2A_1^2+m_1\sigma ^2{\widetilde{A}}_1^2}+\frac{\mu ^2\eta ^2A_1^2}{\left( \beta _{11}\sigma ^2A_1^2+m_1\sigma ^2{\widetilde{A}}_1^2\right) ^2}\Bigg ) \Biggl \}=0. \end{aligned} \end{aligned}$$

Similarly, the first-order optimality condition on the value function with respect to v yields the maximum point \(v^*(t) := (\eta ^*(t), {\tilde{\pi }}^*(t))\) as follows:

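[Equations (3.25a) for \(\eta ^*(t)\) and (3.25b) for \({\tilde{\pi }}^*(t)\); displayed as an image in the source.]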
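The display holding (3.25) is not legible in this copy, but the first-order conditions can be re-derived from the equations shown above. Writing \(D=\sigma ^2(\beta _{11}A_1^2+m_1{\widetilde{A}}_1^2)\) and \(M_2=m_2{\widetilde{A}}_2^2+\beta _{21}A_2^2\) (shorthand introduced here), differentiating the \(\eta \)-dependent expression obtained after substituting \(q^*\) gives \(\eta =D(A_2D+\sigma ^2M_2A_1)/[\mu A_1(2A_2D+\sigma ^2M_2A_1)]\), and differentiating the bracket of (3.24) in \({\tilde{\pi }}\) gives \({\tilde{\pi }}=({\tilde{\mu }}-r)A_2/[{\tilde{\sigma }}^2(m_2{\widetilde{A}}_2^2+\beta _{22}A_2^2)]\). These expressions are a derivation made here, not a transcription of (3.25a) and (3.25b). The sketch below encodes them and lets one check that they collapse to (3.28) and (3.30) once every A-function equals \(e^{r(T-t)}\).

```python
import math

def eta_foc(A1, A1t, A2, A2t, *, mu, sigma, m1, m2, beta11, beta21):
    """Stationary point in eta, derived from the displayed equations (an assumption,
    not copied from (3.25a)). A1t and A2t stand for the tilde versions of A1 and A2."""
    D = sigma ** 2 * (beta11 * A1 ** 2 + m1 * A1t ** 2)
    M2 = m2 * A2t ** 2 + beta21 * A2 ** 2
    num = D * (A2 * D + sigma ** 2 * M2 * A1)
    den = mu * A1 * (2.0 * A2 * D + sigma ** 2 * M2 * A1)
    return num / den

def pi_tilde_foc(A2, A2t, *, mu_t, r, sigma_t, m2, beta22):
    """Stationary point in pi-tilde of the bracket in (3.24)."""
    return (mu_t - r) * A2 / (sigma_t ** 2 * (m2 * A2t ** 2 + beta22 * A2 ** 2))

def eta_328(tau, *, r, mu, sigma, m1, m2, beta11, beta21):
    """Closed form (3.28), with tau = T - t."""
    b1, b2 = beta11 + m1, beta21 + m2
    return sigma ** 2 * b1 * (b1 + b2) * math.exp(r * tau) / (mu * (2.0 * b1 + b2))

def pi_tilde_330(tau, *, r, mu_t, sigma_t, m2, beta22):
    """Closed form (3.30), with tau = T - t."""
    return (mu_t - r) / ((m2 + beta22) * sigma_t ** 2 * math.exp(r * tau))
```

With all four A-arguments set to math.exp(r * tau), eta_foc agrees with eta_328 and pi_tilde_foc agrees with pi_tilde_330, which is the substitution step used in the proof of Theorem 3.3.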

Putting \(\eta ^*\) and \({\tilde{\pi }}^*\) in (3.25) back into (3.20) and (3.24) gives:

$$\begin{aligned} \begin{aligned}&\left( {\widetilde{A}}_2'+r{\widetilde{A}}_2\right) y+ {\widetilde{B}}_2'+\Big [({\tilde{\mu }}-r){\tilde{\pi }}^*+\lambda \mu \eta ^* (1-q^*)-\beta _{21}\sigma ^2\lambda A_2(1-q^*)^2\\&\quad -\beta _{22}{\tilde{\sigma }}^2 A_2({\tilde{\pi }}^*)^2\Big ]{\widetilde{A}}_2=0, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\left( A_2'+rA_2\right) y+ B_2'+\Big [({\tilde{\mu }}-r){\tilde{\pi }}^*+\lambda \mu \eta ^* (1-q^*)\Big ]A_2\\&\quad -\frac{\lambda \sigma ^2(1-q^*)^2}{2}(m_2{\widetilde{A}}_2^2+\beta _{21}A_2^2)-\frac{{\tilde{\sigma }}^2({\tilde{\pi }}^*)^2}{2} (m_2{\widetilde{A}}_2^2+\beta _{22}A_2^2)=0. \end{aligned} \end{aligned}$$

Therefore, by the method of separation of variables, we obtain the following ODEs:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} &{}{\widetilde{A}}_2'+r{\widetilde{A}}_2=0,\qquad A_2'+rA_2=0,\\ &{}{\widetilde{B}}_2'+\Big [({\tilde{\mu }}-r){\tilde{\pi }}^*+\lambda \mu \eta ^* (1-q^*)-\beta _{21}\sigma ^2\lambda A_2(1-q^*)^2-\beta _{22}{\tilde{\sigma }}^2 A_2({\tilde{\pi }}^*)^2\Big ]{\widetilde{A}}_2=0,\\ &{}B_2'+\Big [({\tilde{\mu }}-r){\tilde{\pi }}^*+\lambda \mu \eta ^* (1-q^*)\Big ]A_2\\ &{}\quad -\frac{\lambda \sigma ^2(1-q^*)^2}{2}(m_2{\widetilde{A}}_2^2+\beta _{21}A_2^2)-\frac{{\tilde{\sigma }}^2({\tilde{\pi }}^*)^2}{2} (m_2{\widetilde{A}}_2^2+\beta _{22}A_2^2)=0, \end{aligned} \end{array}\right. } \end{aligned}$$

Using the boundary conditions, we obtain:

$$\begin{aligned} {\widetilde{A}}_2(t)= & {} e^{r(T-t)}, \qquad A_2(t)=e^{r(T-t)}, \\ {\widetilde{B}}_2(t)= & {} \int _t^T{\tilde{b}}_{21}(s)\mathrm {d}s+\int _t^T{\tilde{b}}_{22}(s)\mathrm {d}s, \\ B_2(t)= & {} \int _t^Tb_{21}(s)\mathrm {d}s+\int _t^Tb_{22}(s)\mathrm {d}s, \end{aligned}$$

where

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} {\tilde{b}}_{21}(s)&{}=\left[ \lambda \mu \eta ^*(s)(1-q^*(s))-\beta _{21}\sigma ^2\lambda (1-q^*(s))^2e^{r(T-s)}\right] e^{r(T-s)},\\ {\tilde{b}}_{22}(s)&{}=\left[ ({\tilde{\mu }}-r){\tilde{\pi }}^*(s)-\beta _{22}{\tilde{\sigma }}^2({\tilde{\pi }}^*(s))^2e^{r(T-s)}\right] e^{r(T-s)},\\ b_{21}(s)&{}=\left[ \lambda \mu \eta ^*(s)(1-q^*(s))-\frac{\lambda \sigma ^2(1-q^*(s))^2}{2}(m_2+\beta _{21})e^{r(T-s)}\right] e^{r(T-s)},\\ b_{22}(s)&{}=\left[ ({\tilde{\mu }}-r){\tilde{\pi }}^*(s)-\frac{{\tilde{\sigma }}^2({\tilde{\pi }}^*(s))^2}{2}(m_2+\beta _{22})e^{r(T-s)}\right] e^{r(T-s)}. \end{aligned} \end{array}\right. } \end{aligned}$$
(3.26)

Based on the above derivations, the main results of this paper are summarized in the following theorems. In Theorem 3.3, we provide the explicit expressions for the insurer’s robust equilibrium retention level and the reinsurer’s robust equilibrium reinsurance price, and we also present the analytical expressions for the equilibrium investment strategies and the value functions of the insurer and the reinsurer. In Theorem 3.6, we give the expected values of the insurer’s and the reinsurer’s terminal surpluses, as well as the worst-case density generators of the insurer and the reinsurer.

We first impose the following assumption.

Assumption 3.1

Suppose the following conditions are satisfied:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \sqrt{\lambda }\mu >3\sigma ,\\ \dfrac{\sigma ^2(\beta _{11}+m_1)^2e^{r(T-t)}+\sigma ^2(\beta _{11}+m_1)(\beta _{21}+m_2)e^{r(T-t)}}{2\mu (\beta _{11}+m_1)+\mu (\beta _{21}+m_2)}\ge \theta . \end{array}\right. } \end{aligned}$$

Theorem 3.3

Under Assumption 3.1, the insurer’s robust optimal retained proportion of the claims and the reinsurer’s robust optimal reinsurance price are respectively given by:

$$\begin{aligned} q^*(t)=\frac{\beta _{11}+m_1+\beta _{21}+m_2}{2(\beta _{11}+m_1)+\beta _{21}+m_2}, \end{aligned}$$
(3.27)

and

$$\begin{aligned} \eta ^*(t)=\frac{\sigma ^2(\beta _{11}+m_1)^2e^{r(T-t)}+\sigma ^2(\beta _{11}+m_1)(\beta _{21}+m_2)e^{r(T-t)}}{2\mu (\beta _{11}+m_1)+\mu (\beta _{21}+m_2)}. \end{aligned}$$
(3.28)

Furthermore, the robust equilibrium investment strategies of the insurer and the reinsurer are respectively given by:

$$\begin{aligned} \pi ^*(t)=\frac{{\tilde{\mu }}-r}{(m_1+\beta _{12}){\tilde{\sigma }}^2e^{r(T-t)}}, \end{aligned}$$
(3.29)

and

$$\begin{aligned} {\tilde{\pi }}^*(t)=\frac{{\tilde{\mu }}-r}{(m_2+\beta _{22}){\tilde{\sigma }}^2e^{r(T-t)}}. \end{aligned}$$
(3.30)

Finally, the equilibrium value functions of the insurer and the reinsurer are respectively given by the following integral representations:

$$\begin{aligned} V_1(t,x)= & {} xe^{r(T-t)}+\int _t^T\lambda \mu (\theta -\eta ^*(s))\mathrm {d}s+\int _t^Tb_{11}(s)\mathrm {d}s+\int _t^Tb_{12}(s)\mathrm {d}s, \\ V_2(t,y)= & {} ye^{r(T-t)}+\int _t^Tb_{21}(s)\mathrm {d}s+\int _t^Tb_{22}(s)\mathrm {d}s, \end{aligned}$$

where \(b_{1i},\) for \(i\in \{1,2\},\) are given by (3.16) with \(\eta \) replaced by \(\eta ^*,\) and \(b_{2i},\) for \(i\in \{1,2\},\) are given by (3.26).

Proof

It has been shown above that

$$\begin{aligned} A_1(t)=A_2(t)={\widetilde{A}}_1(t)={\widetilde{A}}_2(t)=e^{r(T-t)}. \end{aligned}$$
(3.31)

The explicit solution to the optimal reinsurance price in (3.28) is obtained by substituting (3.31) into \(\eta ^*\) in (3.25a). Inserting (3.28) into \(q^*\) in (3.12a), we can obtain the optimal reinsurance retention level of the insurer given by (3.27). Similarly, if we put (3.31) back into (3.12b) and (3.25b), we can obtain the robust equilibrium investment strategies of the insurer and the reinsurer presented in (3.29) and (3.30), respectively. This completes the proof. \(\square \)
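To make Theorem 3.3 concrete, the following sketch evaluates (3.27)–(3.30) and the reinsurer's value function \(V_2\), integrating \(b_{21}\) and \(b_{22}\) from (3.26) with a composite Simpson rule. The parameter values are hypothetical (Table 1 is not reproduced here), and the function names are illustrative.

```python
import math

# Hypothetical parameter values for illustration only (not the paper's Table 1).
P = dict(r=0.05, lam=3.0, mu=1.0, sigma=0.5, mu_t=0.1, sigma_t=0.25,
         m1=0.5, m2=0.4, beta11=1.0, beta12=1.0, beta21=0.8, beta22=0.8)

def q_star(p):
    """Retention level (3.27); constant in t."""
    b1, b2 = p["beta11"] + p["m1"], p["beta21"] + p["m2"]
    return (b1 + b2) / (2.0 * b1 + b2)

def eta_star(tau, p):
    """Reinsurance price (3.28) at time to maturity tau = T - t."""
    b1, b2 = p["beta11"] + p["m1"], p["beta21"] + p["m2"]
    return p["sigma"] ** 2 * b1 * (b1 + b2) * math.exp(p["r"] * tau) / (p["mu"] * (2.0 * b1 + b2))

def pi_star(tau, p):
    """Insurer's investment strategy (3.29)."""
    return (p["mu_t"] - p["r"]) / ((p["m1"] + p["beta12"]) * p["sigma_t"] ** 2 * math.exp(p["r"] * tau))

def pi_tilde_star(tau, p):
    """Reinsurer's investment strategy (3.30)."""
    return (p["mu_t"] - p["r"]) / ((p["m2"] + p["beta22"]) * p["sigma_t"] ** 2 * math.exp(p["r"] * tau))

def b21(tau, p):
    """Integrand b_21 from (3.26)."""
    e, q = math.exp(p["r"] * tau), q_star(p)
    risk = 0.5 * p["lam"] * p["sigma"] ** 2 * (1.0 - q) ** 2 * (p["m2"] + p["beta21"]) * e
    return (p["lam"] * p["mu"] * eta_star(tau, p) * (1.0 - q) - risk) * e

def b22(tau, p):
    """Integrand b_22 from (3.26)."""
    e, pt = math.exp(p["r"] * tau), pi_tilde_star(tau, p)
    risk = 0.5 * p["sigma_t"] ** 2 * pt ** 2 * (p["m2"] + p["beta22"]) * e
    return ((p["mu_t"] - p["r"]) * pt - risk) * e

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def V2(t, y, T, p):
    """Reinsurer's equilibrium value function in Theorem 3.3."""
    integral = simpson(lambda s: b21(T - s, p) + b22(T - s, p), t, T)
    return y * math.exp(p["r"] * (T - t)) + integral
```

For example, V2(0.0, 1.0, 4.0, P) returns the reinsurer's value at time 0 for unit initial surplus and a four-year horizon. Under (3.30) the integrand b22 turns out to be constant in time, equal to \(({\tilde{\mu }}-r)^2/[2(m_2+\beta _{22}){\tilde{\sigma }}^2]\), which the sketch can be used to confirm.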

Remark 3.4

The agent’s robust optimal retention level of insurance claims in (3.27) lies in the interval (0, 1). Consequently, we do not have to consider the boundary cases \(q^*=0\) and \(q^*=1,\) which correspond respectively to the insurer purchasing full reinsurance coverage and to the insurer having no reinsurance demand at all. Also, note that under the second condition in Assumption 3.1, \(\eta ^* (t)\) in (3.28) is larger than or equal to \(\theta \). That is, only non-cheap reinsurance is considered here.
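The two conditions of Assumption 3.1 are easy to check numerically, since the fraction in the second condition is exactly \(\eta ^*(t)\) in (3.28). The sketch below tests both conditions on a time grid; the grid size, the function names and the parameter values in the usage note are assumptions made for illustration.

```python
import math

def eta_star(tau, *, r, mu, sigma, m1, m2, beta11, beta21):
    """Reinsurance price (3.28) at time to maturity tau = T - t."""
    b1, b2 = beta11 + m1, beta21 + m2
    return sigma ** 2 * b1 * (b1 + b2) * math.exp(r * tau) / (mu * (2.0 * b1 + b2))

def assumption_3_1_holds(*, T, theta, lam, r, mu, sigma, m1, m2, beta11, beta21, grid=101):
    """Check sqrt(lam)*mu > 3*sigma and eta*(t) >= theta for t on a grid of [0, T]."""
    cond1 = math.sqrt(lam) * mu > 3.0 * sigma
    kw = dict(r=r, mu=mu, sigma=sigma, m1=m1, m2=m2, beta11=beta11, beta21=beta21)
    taus = [T * i / (grid - 1) for i in range(grid)]
    cond2 = all(eta_star(tau, **kw) >= theta for tau in taus)
    return cond1 and cond2
```

When \(r\ge 0\), \(\eta ^*(t)\) is non-increasing in t, so the second condition binds at \(t=T\) and it would suffice to test \(\eta ^*(T)\ge \theta \); the grid is kept so that the check does not rely on the sign of r. With the hypothetical values T=4, lam=3, r=0.05, mu=1, sigma=0.5, m1=0.5, m2=0.4, beta11=1 and beta21=0.8, the assumption holds for theta=0.2 and fails for theta=0.3.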

Remark 3.5

The results in Theorem 3.3 indicate that the robust optimal reinsurance contract \((q^*(t), \eta ^*(t))\) is independent of the ambiguity levels on the stock return. This may stem from the assumption that the random shocks in the stock price and the claim process are independent. It should be noted that some authors have recently studied the optimal reinsurance and investment strategies when the insurance market and the financial market are correlated; see, for example, Bi and Cai (2019), Brachetta and Schmidli (2020) and Ceci et al. (2021).

In the following theorem, we provide the expectation of the terminal surpluses associated with the robust equilibrium strategies of the insurer and the reinsurer and determine the worst-case scenario density generators. Plugging the expressions of \({\widetilde{A}}_i(t)\) and \({\widetilde{B}}_i(t),\) for \(i\in \{1,2\},\) in the preceding paragraphs into the trial solutions of \(g_i(t, x),\) the results in this theorem can be directly obtained.

Theorem 3.6

The expected values of the insurer’s and the reinsurer’s terminal surpluses are respectively given by:

$$\begin{aligned} {\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ X^{u^*, v^*}(T)\right]= & {} g_1(t, x)\\= & {} xe^{r(T-t)}+\int _t^T\lambda \mu (\theta -\eta ^*(s))\mathrm {d}s +\int _t^T{\tilde{b}}_{11}(s)\mathrm {d}s+\int _t^T{\tilde{b}}_{12}(s)\mathrm {d}s,\\ {\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ Y^{u^*, v^*}(T)\right]= & {} g_2(t, y)=ye^{r(T-t)}+\int _t^T{\tilde{b}}_{21}(s)\mathrm {d}s+\int _t^T{\tilde{b}}_{22}(s)\mathrm {d}s, \end{aligned}$$

where \({{\tilde{b}}}_{1i},\) for \(i\in \{1,2\},\) are given by (3.15) with \(\eta \) replaced by \(\eta ^*,\) and \({{\tilde{b}}}_{2i},\) for \(i\in \{1,2\},\) are given by (3.26). The worst-case density generator \(\phi _1^*(t):=(\phi _{11}^*(t),\phi _{12}^*(t))\) of the insurer is given by:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \phi _{11}^*(t)=-\beta _{11}\sigma \sqrt{\lambda }q^*(t)e^{r(T-t)},\\ \phi _{12}^*(t)=-\beta _{12}{\tilde{\sigma }}\pi ^*(t)e^{r(T-t)}. \end{array}\right. } \end{aligned}$$

The reinsurer’s worst-case density generator \(\phi _2^*(t):=(\phi _{21}^*(t),\phi _{22}^*(t))\) is given by:

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \phi _{21}^*(t)=-\beta _{21}\sigma \sqrt{\lambda }(1-q^*(t))e^{r(T-t)},\\ \phi _{22}^*(t)=-\beta _{22}{\tilde{\sigma }}{\tilde{\pi }}^*(t)e^{r(T-t)}. \end{array}\right. } \end{aligned}$$

In the above expressions, \(\eta ^*(t),\) \(q^*(t),\) \({\tilde{\pi }}^*(t)\) and \(\pi ^*(t)\) are given in Theorem 3.3.
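The worst-case density generators of Theorem 3.6 can be evaluated directly once the strategies of Theorem 3.3 are inserted. The sketch below does so, reading the investment strategies in the generators as \(\pi ^*\) and \({\tilde{\pi }}^*\) from (3.29) and (3.30); the function name and the parameter values used to exercise it are assumptions.

```python
import math

def worst_case_generators(tau, *, r, lam, sigma, mu_t, sigma_t,
                          m1, m2, beta11, beta12, beta21, beta22):
    """Return ((phi11*, phi12*), (phi21*, phi22*)) at time to maturity tau = T - t."""
    e = math.exp(r * tau)
    b1, b2 = beta11 + m1, beta21 + m2
    q = (b1 + b2) / (2.0 * b1 + b2)                       # (3.27)
    pi = (mu_t - r) / ((m1 + beta12) * sigma_t ** 2 * e)  # (3.29)
    pi_t = (mu_t - r) / ((m2 + beta22) * sigma_t ** 2 * e)  # (3.30)
    insurer = (-beta11 * sigma * math.sqrt(lam) * q * e,
               -beta12 * sigma_t * pi * e)
    reinsurer = (-beta21 * sigma * math.sqrt(lam) * (1.0 - q) * e,
                 -beta22 * sigma_t * pi_t * e)
    return insurer, reinsurer
```

Two features follow from the formulas. The financial components simplify to the time-independent constants \(-\beta _{12}({\tilde{\mu }}-r)/[(m_1+\beta _{12}){\tilde{\sigma }}]\) and \(-\beta _{22}({\tilde{\mu }}-r)/[(m_2+\beta _{22}){\tilde{\sigma }}]\), because the factor \(e^{r(T-t)}\) cancels against the one in the investment strategy. The insurance components grow in magnitude with the time to maturity, since \(q^*\) is constant.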

When the insurer (or the reinsurer) completely trusts the reference model under the reference probability measure \({\mathbb {P}},\) the respective ambiguity aversion coefficients are equal to zero. In this case, the robust optimization problem in (2.10) reduces to the traditional optimization problem in (2.9). Consequently, setting the ambiguity aversion parameters of the insurer in Theorem 3.3 to zero yields the robust reinsurance contract and the robust equilibrium investment strategies of an ANI and an AAR; similarly, setting the ambiguity aversion parameters of the reinsurer in Theorem 3.3 to zero yields the robust reinsurance contract and the robust equilibrium investment strategies of an AAI and an ANR. These two results are presented in the following corollaries.

Corollary 3.7

The equilibrium value functions of the ANI and the AAR are respectively given by the following integral representations:

$$\begin{aligned} {\widehat{V}}_1(t,x)= & {} xe^{r(T-t)}+\int _t^T\lambda \mu (\theta -{\hat{\eta }}^*(s))\mathrm {d}s+\int _t^T{\hat{b}}_{11}(s)\mathrm {d}s+\int _t^T{\hat{b}}_{12}(s)\mathrm {d}s, \\ {\widehat{V}}_2(t,y)= & {} ye^{r(T-t)}+\int _t^T{\hat{b}}_{21}(s)\mathrm {d}s+\int _t^Tb_{22}(s)\mathrm {d}s, \end{aligned}$$

where

$$\begin{aligned} {\left\{ \begin{array}{ll}{} {\hat{b}}_{11}(s)=\left[ \lambda \mu {\hat{\eta }}^*(s) {\hat{q}}^*(s)-\frac{\lambda \sigma ^2({\hat{q}}^*(s))^2}{2}m_1e^{r(T-s)}\right] e^{r(T-s)},\\ {\hat{b}}_{12}(s)=\left[ ({\tilde{\mu }}-r){\hat{\pi }}^*(s)-\frac{{\tilde{\sigma }}^2({\hat{\pi }}^*(s))^2}{2}m_1e^{r(T-s)}\right] e^{r(T-s)}, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\left\{ \begin{array}{ll}{} {\hat{b}}_{21}(s)=\left[ \lambda \mu {\hat{\eta }}^*(s)(1-{\hat{q}}^*(s))-\frac{\lambda \sigma ^2(1-{\hat{q}}^*(s))^2}{2}(m_2+\beta _{21})e^{r(T-s)}\right] e^{r(T-s)},\\ b_{22}(s)=\left[ ({\tilde{\mu }}-r){\tilde{\pi }}^*(s)-\frac{{\tilde{\sigma }}^2({\tilde{\pi }}^*(s))^2}{2}(m_2+\beta _{22})e^{r(T-s)}\right] e^{r(T-s)}, \end{array}\right. } \end{aligned}$$

The ANI’s robust optimal retained proportion and the AAR’s robust optimal reinsurance price (or premium) are respectively given by:

$$\begin{aligned} {\hat{q}}^*(t)=\frac{m_1+\beta _{21}+m_2}{2m_1+\beta _{21}+m_2}, \end{aligned}$$

and

$$\begin{aligned} {\hat{\eta }}^*(t)=\frac{\sigma ^2m_1e^{r(T-t)}(m_1+\beta _{21}+m_2)}{\mu (2m_1+\beta _{21}+m_2)}. \end{aligned}$$

Furthermore, the robust equilibrium investment strategy of the ANI is given by:

$$\begin{aligned} {\hat{\pi }}^*(t)=\frac{{\tilde{\mu }}-r}{m_1{\tilde{\sigma }}^2e^{r(T-t)}}. \end{aligned}$$

The robust equilibrium investment strategy of the AAR remains the same as that in (3.30).

Corollary 3.8

The equilibrium value functions of the AAI and the ANR are respectively given by the following integral representations:

$$\begin{aligned} {\check{V}}_1(t,x)= & {} xe^{r(T-t)}+\int _t^T\lambda \mu (\theta -{\check{\eta }}^*(s))\mathrm {d}s+\int _t^T{\check{b}}_{11}(s)\mathrm {d}s+\int _t^Tb_{12}(s)\mathrm {d}s, \\ {\check{V}}_2(t,y)= & {} ye^{r(T-t)}+\int _t^T{\check{b}}_{21}(s)\mathrm {d}s+\int _t^T{\check{b}}_{22}(s)\mathrm {d}s, \end{aligned}$$

where

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} {\check{b}}_{11}(s)&{}=\left[ \lambda \mu {\check{\eta }}^*(s) {\check{q}}^*(s)-\frac{\lambda \sigma ^2({\check{q}}^*(s))^2}{2}(m_1+\beta _{11})e^{r(T-s)}\right] e^{r(T-s)},\\ b_{12}(s)&{}=\left[ ({\tilde{\mu }}-r)\pi ^*(s)-\frac{{\tilde{\sigma }}^2(\pi ^*(s))^2}{2}(m_1+\beta _{12})e^{r(T-s)}\right] e^{r(T-s)}, \end{aligned} \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\left\{ \begin{array}{ll}{} \begin{aligned} {\check{b}}_{21}(s)&{}=\left[ \lambda \mu {\check{\eta }}^*(s)(1-{\check{q}}^*(s))-\frac{\lambda \sigma ^2(1-{\check{q}}^*(s))^2}{2}m_2e^{r(T-s)}\right] e^{r(T-s)},\\ {\check{b}}_{22}(s)&{}=\left[ ({\tilde{\mu }}-r){\check{\pi }}^*(s)-\frac{{\tilde{\sigma }}^2({\check{\pi }}^*(s))^2}{2}m_2e^{r(T-s)}\right] e^{r(T-s)}, \end{aligned} \end{array}\right. } \end{aligned}$$

The AAI’s robust optimal retained proportion and the ANR’s robust optimal reinsurance price (or premium) are respectively given by:

$$\begin{aligned} {\check{q}}^*(t)=\frac{\beta _{11}+m_1+m_2}{2\beta _{11}+2m_1+m_2}, \end{aligned}$$

and

$$\begin{aligned} {\check{\eta }}^*(t)=\frac{\sigma ^2(\beta _{11}+m_1)e^{r(T-t)}(\beta _{11}+m_1+m_2)}{\mu (2\beta _{11}+2m_1+m_2)}. \end{aligned}$$

Furthermore, the robust equilibrium investment strategy of the ANR is given by:

$$\begin{aligned} {\check{\pi }}^*(t)=\frac{{\tilde{\mu }}-r}{m_2{\tilde{\sigma }}^2e^{r(T-t)}}. \end{aligned}$$

The robust equilibrium investment strategy of the AAI remains the same as that in (3.29).

Remark 3.9

The robust optimal reinsurance contracts derived in Corollaries 3.7 and 3.8 imply that the optimal retention level of an ANI and the optimal reinsurance premium of an ANR are influenced by the ambiguity aversion coefficients of their counterparties. This may be attributed to the strategic interaction between the reinsurer and the insurer implied by the principal–agent framework.
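The reduction from Theorem 3.3 to Corollaries 3.7 and 3.8, and the counterparty dependence noted in Remark 3.9, can be illustrated with a few lines of code. In the sketch below, the price functions return \(\eta (t)/e^{r(T-t)}\), a time-free scaling introduced here for convenience; all function names and numerical values are illustrative assumptions.

```python
def q_star(m1, m2, beta11, beta21):
    """Retention level (3.27)."""
    b1, b2 = beta11 + m1, beta21 + m2
    return (b1 + b2) / (2.0 * b1 + b2)

def eta_scaled(mu, sigma, m1, m2, beta11, beta21):
    """Price (3.28) divided by e^{r(T-t)}."""
    b1, b2 = beta11 + m1, beta21 + m2
    return sigma ** 2 * b1 * (b1 + b2) / (mu * (2.0 * b1 + b2))

def q_hat(m1, m2, beta21):
    """ANI retention level in Corollary 3.7."""
    return (m1 + beta21 + m2) / (2.0 * m1 + beta21 + m2)

def eta_hat_scaled(mu, sigma, m1, m2, beta21):
    """AAR price in Corollary 3.7 divided by e^{r(T-t)}."""
    return sigma ** 2 * m1 * (m1 + beta21 + m2) / (mu * (2.0 * m1 + beta21 + m2))

def q_check(m1, m2, beta11):
    """AAI retention level in Corollary 3.8."""
    return (beta11 + m1 + m2) / (2.0 * beta11 + 2.0 * m1 + m2)

def eta_check_scaled(mu, sigma, m1, m2, beta11):
    """ANR price in Corollary 3.8 divided by e^{r(T-t)}."""
    return sigma ** 2 * (beta11 + m1) * (beta11 + m1 + m2) / (mu * (2.0 * beta11 + 2.0 * m1 + m2))
```

Setting beta11 to zero in q_star and eta_scaled reproduces q_hat and eta_hat_scaled, and setting beta21 to zero reproduces q_check and eta_check_scaled. The ANI's retention q_hat still varies with the reinsurer's \(\beta _{21}\), and the ANR's price eta_check_scaled still varies with the insurer's \(\beta _{11}\), which is the dependence described in Remark 3.9.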

4 Utility losses of the suboptimal investment and reinsurance strategies

In this section, we examine the utility losses of an AAI and an AAR. To this end, it is assumed that the insurer and the reinsurer are ambiguous about the insurance and financial risks. It is supposed, however, that they do not adopt the robust optimal reinsurance–investment strategies \(u^*=(q^*,\pi ^*)\) and \(v^*=(\eta ^*,{\tilde{\pi }}^*)\) given in Theorem 3.3. Instead, they make their decisions as if they were ambiguity-neutral; that is, they follow the strategies given in Corollaries 3.7 and 3.8, respectively. In such circumstances, the agent’s suboptimal value function is defined by:

$$\begin{aligned} \begin{aligned} {\widetilde{V}}_1(t,x)&:=\underset{{\mathbb {Q}}_1\in {\mathcal {Q}}}{\inf }\, \Bigg \{{\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ X^{{\hat{u}}^*, {\hat{v}}^*}(T)\right] -\frac{m_1}{2}\mathrm {Var}^{{\mathbb {Q}}_1}_{t,x} \left[ X^{{\hat{u}}^*, {\hat{v}}^*}(T)\right] \\&\quad +{\mathbb {E}}^{{\mathbb {Q}}_1}_{t,x}\left[ \int _t^T\left( \frac{\phi _{11}^2(s)}{2\beta _{11}} +\frac{\phi _{12}^2(s)}{2\beta _{12}}\right) \mathrm {d}s\right] \Bigg \},\\ \end{aligned} \end{aligned}$$

and the reinsurer’s suboptimal value function is defined by:

$$\begin{aligned} \begin{aligned} {\widetilde{V}}_2(t,y)&:=\underset{{\mathbb {Q}}_2\in {\mathcal {Q}}}{\inf }\, \Bigg \{{\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ Y^{{\hat{u}}^*, {\hat{v}}^*}(T)\right] -\frac{m_2}{2}\mathrm {Var}^{{\mathbb {Q}}_2}_{t,y} \left[ Y^{{\hat{u}}^*, {\hat{v}}^*}(T)\right] \\&\quad +{\mathbb {E}}^{{\mathbb {Q}}_2}_{t,y}\left[ \int _t^T\left( \frac{\phi _{21}^2(s)}{2\beta _{21}} +\frac{\phi _{22}^2(s)}{2\beta _{22}}\right) \mathrm {d}s\right] \Bigg \}. \end{aligned} \end{aligned}$$

It should be noted that the equilibrium reinsurance–investment strategies of the insurer and the reinsurer are now pre-specified, whereby the worst-case alternative measures \({\mathbb {Q}}_k,\) for \(k\in \{1,2\},\) would be endogenously determined. As described in Zhao et al. (2019), Hu et al. (2018a, 2018b), Li et al. (2018) and Wang and Siu (2020), we define the following (relative) utility losses of the insurer and the reinsurer associated with the suboptimal reinsurance–investment strategies:

$$\begin{aligned} UL_1(t):=1-\frac{{\widetilde{V}}_1(t,x)}{V_1(t,x)}, \end{aligned}$$

and

$$\begin{aligned} UL_2(t):=1-\frac{{\widetilde{V}}_2(t,y)}{V_2(t,y)}, \end{aligned}$$

where \(V_1(t,x)\) and \(V_2(t,y)\) are the robust optimal value functions of the insurer and the reinsurer given in Theorem 3.3, respectively.

The suboptimal value function \({\widetilde{V}}_1(t,x)\) of the insurer with respect to the suboptimal reinsurance treaty \(\left( {\hat{q}}^*(t),{\hat{\eta }}^*(t)\right) \) and the suboptimal investment strategy \({\hat{\pi }}^*(t)\) solves the following minimization problem:

$$\begin{aligned} \begin{aligned}&\underset{\phi _1\in \Sigma _1}{\inf }\,\Bigg \{\frac{\partial {\widetilde{W}}_1(t, x)}{\partial t}+\!\Big [rx+\!({\tilde{\mu }}-r){\hat{\pi }}^*+\!\lambda \mu (\theta -{\hat{\eta }}^*)+\lambda \mu {\hat{\eta }}^* {\hat{q}}^*+\sigma \sqrt{\lambda }{\tilde{\phi }}_{11}{\hat{q}}^* +{\tilde{\sigma }}{\tilde{\phi }}_{12}{\hat{\pi }}^*\Big ]\\&\quad \times \frac{\partial {\widetilde{W}}_1(t, x)}{\partial x} +\frac{1}{2}\left[ \lambda \sigma ^2({\hat{q}}^*)^2+{\tilde{\sigma }}^2({\hat{\pi }}^*)^2\right] \left[ \frac{\partial ^2{\widetilde{W}}_1(t, x)}{\partial x^2}-m_1\left( \frac{\partial {\tilde{g}}_1(t, x)}{\partial x}\right) ^2\right] \\&\quad +\frac{{\tilde{\phi }}_{11}^2}{2\beta _{11}}+\frac{{\tilde{\phi }}_{12}^2}{2\beta _{12}}\Bigg \}=0. \end{aligned} \end{aligned}$$
(4.1)

The suboptimal value function \({\widetilde{V}}_2(t,y)\) of the reinsurer corresponding to the suboptimal reinsurance agreement \(\left( {\check{q}}^*(t),{\check{\eta }}^*(t)\right) \) and the suboptimal investment strategy \({\check{\pi }}^*(t)\) solves the following minimization problem:

$$\begin{aligned} \begin{aligned}&\underset{\phi _2\in \Sigma _2}{\inf }\,\Bigg \{ \frac{\partial {\widetilde{W}}_2(t, y)}{\partial t}+\Big [ry+({\tilde{\mu }}-r){\check{\pi }}^*+\lambda \mu {\check{\eta }}^*(1-{\check{q}}^*) +\sigma \sqrt{\lambda }{\tilde{\phi }}_{21}(1-{\check{q}}^*) +{\tilde{\sigma }}{\tilde{\phi }}_{22}{\check{\pi }}^*\Big ]\\&\quad \times \frac{\partial {\widetilde{W}}_2(t, y)}{\partial y} +\frac{1}{2}\left[ \lambda \sigma ^2(1-{\check{q}}^*)^2+{\tilde{\sigma }}^2({\check{\pi }}^*)^2\right] \left[ \frac{\partial ^2{\widetilde{W}}_2(t, y)}{\partial y^2}-m_2\left( \frac{\partial {\tilde{g}}_2(t, y)}{\partial y}\right) ^2\right] \\&\quad +\frac{{\tilde{\phi }}_{21}^2}{2\beta _{21}}+\frac{{\tilde{\phi }}_{22}^2}{2\beta _{22}} \Bigg \}=0. \end{aligned} \end{aligned}$$
(4.2)

Following procedures similar to those used to derive the robust optimal value functions of the insurer and the reinsurer, the optimization problems in (4.1) and (4.2) can be solved. The suboptimal value function of the insurer is given as follows:

$$\begin{aligned} {\widetilde{V}}_1(t,x)=xe^{r(T-t)}+\int _t^T\lambda \mu (\theta -{\hat{\eta }}^*(s))\mathrm {d}s+\int _t^Tc_{11}(s)\mathrm {d}s+\int _t^Tc_{12}(s)\mathrm {d}s, \end{aligned}$$

and the suboptimal value function of the reinsurer is given by:

$$\begin{aligned} {\widetilde{V}}_2(t,y)=ye^{r(T-t)}+\int _t^Tc_{21}(s)\mathrm {d}s+\int _t^Tc_{22}(s)\mathrm {d}s, \end{aligned}$$

with

$$\begin{aligned} {\left\{ \begin{array}{ll}{} c_{11}(s)=\left[ \lambda \mu {\hat{\eta }}^*(s) {\hat{q}}^*(s)-\frac{\lambda \sigma ^2({\hat{q}}^*(s))^2}{2}(m_1+\beta _{11})e^{r(T-s)}\right] e^{r(T-s)},\\ c_{12}(s)=\left[ ({\tilde{\mu }}-r){\hat{\pi }}^*(s)-\frac{{\tilde{\sigma }}^2({\hat{\pi }}^*(s))^2}{2}(m_1+\beta _{12})e^{r(T-s)}\right] e^{r(T-s)}, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\left\{ \begin{array}{ll}{} c_{21}(s)=\left[ \lambda \mu {\check{\eta }}^*(s)(1-{\check{q}}^*(s))-\frac{\lambda \sigma ^2(1-{\check{q}}^*(s))^2}{2}(m_2+\beta _{21})e^{r(T-s)}\right] e^{r(T-s)},\\ c_{22}(s)=\left[ ({\tilde{\mu }}-r){\check{\pi }}^*(s)-\frac{{\tilde{\sigma }}^2({\check{\pi }}^*(s))^2}{2}(m_2+\beta _{22})e^{r(T-s)}\right] e^{r(T-s)}, \end{array}\right. } \end{aligned}$$

where \({\hat{q}}^*(t), {\hat{\eta }}^*(t), {\hat{\pi }}^*(t)\) are given in Corollary 3.7 and \({\check{q}}^*(t), {\check{\eta }}^*(t), {\check{\pi }}^*(t)\) are given in Corollary 3.8.
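As an illustration of how the reinsurer's utility loss \(UL_2\) can be evaluated, the sketch below uses the fact that \(\eta (s)\) is proportional to \(e^{r(T-s)}\) and \({\tilde{\pi }}(s)\) to \(e^{-r(T-s)}\) under both the optimal and the suboptimal strategies, so that the integrals of \(b_{2i}\) in (3.26) and of \(c_{2i}\) admit closed forms. That simplification is derived here and assumes \(r>0\); the parameter values and function names are hypothetical.

```python
import math

# Hypothetical parameter values for illustration only (not the paper's Table 1).
PARAMS = dict(r=0.05, lam=3.0, mu=1.0, sigma=0.5, mu_t=0.1, sigma_t=0.25,
              m1=0.5, m2=0.4, beta11=1.0, beta21=0.8, beta22=0.8)

def strategy(p, beta21, beta22):
    """Return (q, H, pi0) where eta(s) = H*e^{r(T-s)} and pi_tilde(s) = pi0*e^{-r(T-s)}.
    Passing the true beta21, beta22 gives Theorem 3.3; passing zeros gives Corollary 3.8."""
    b1, b2 = p["beta11"] + p["m1"], beta21 + p["m2"]
    q = (b1 + b2) / (2.0 * b1 + b2)
    H = p["sigma"] ** 2 * b1 * (b1 + b2) / (p["mu"] * (2.0 * b1 + b2))
    pi0 = (p["mu_t"] - p["r"]) / ((p["m2"] + beta22) * p["sigma_t"] ** 2)
    return q, H, pi0

def reinsurer_value(tau, y, strat, p):
    """y*e^{r*tau} plus the integrated insurance and investment terms (requires r > 0).
    The penalty terms always use the reinsurer's true beta21 and beta22."""
    q, H, pi0 = strat
    K = (p["lam"] * p["mu"] * H * (1.0 - q)
         - 0.5 * p["lam"] * p["sigma"] ** 2 * (1.0 - q) ** 2 * (p["m2"] + p["beta21"]))
    c = ((p["mu_t"] - p["r"]) * pi0
         - 0.5 * p["sigma_t"] ** 2 * pi0 ** 2 * (p["m2"] + p["beta22"]))
    insurance = K * (math.exp(2.0 * p["r"] * tau) - 1.0) / (2.0 * p["r"])
    return y * math.exp(p["r"] * tau) + insurance + c * tau

def utility_loss_2(tau, y, p):
    """UL_2 = 1 - V2_suboptimal / V2_optimal at time to maturity tau = T - t."""
    v_opt = reinsurer_value(tau, y, strategy(p, p["beta21"], p["beta22"]), p)
    v_sub = reinsurer_value(tau, y, strategy(p, 0.0, 0.0), p)
    return 1.0 - v_sub / v_opt
```

Calling utility_loss_2(4.0, 1.0, PARAMS) gives the relative loss at time 0 for a four-year horizon and unit initial surplus. The loss vanishes when \(\beta _{21}=\beta _{22}=0\), since the ambiguity-neutral strategies are then optimal.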

5 Numerical examples

In this section, we provide numerical examples for sensitivity analyses of the robust equilibrium reinsurance and investment strategies derived in Sect. 3 and the utility losses presented in Sect. 4. The model parameters used as our benchmark are shown in Table 1 unless otherwise stated. In each of the following figures, we study the sensitivity of robust equilibrium reinsurance–investment strategies and utility losses with respect to the value of one parameter by varying the value of that parameter. The conditions in Assumption 3.1 are guaranteed to be satisfied when the parameters vary in the sensitivity analyses here.

Table 1 Values of parameters in numerical experiments

Fig. 1 Effects of the ambiguity aversion parameters \(\beta _{k2},\) for \(k\in \{1,2\},\) on the robust equilibrium investment strategies of the insurer and the reinsurer for the risky asset

Fig. 2 Impacts of the risk aversion parameters \(m_{k},\) for \(k\in \{1,2\},\) on the robust equilibrium reinsurance strategies of the principal and the agent

Fig. 3 Effects of the ambiguity aversion parameters \(\beta _{k1},\) for \(k\in \{1,2\},\) on the robust reinsurance contracts

Fig. 4 Impacts of T on the utility losses of the insurer and the reinsurer

Fig. 5 Effects of the ambiguity aversion parameters \(\beta _{ki},\) for \(k, i\in \{1,2\},\) on the utility losses of the insurer and the reinsurer

Figure 1 demonstrates the impact of the ambiguity aversion parameter \(\beta _{k2},\) for \(k\in \{1,2\},\) and the risk aversion parameter \(m_k\) on the robust equilibrium investment strategies of the insurer and the reinsurer in the risky asset. As shown in Fig. 1, an AAI (or an AAR) with a higher level of ambiguity aversion reduces the amount invested in the risky asset. Intuitively, this conclusion appears reasonable: to mitigate financial risks, decision-makers invest less wealth in an asset when they have less information about the mechanism that generates its price movements. This conclusion also indicates that an AAI (or an AAR) is more conservative than an ANI (or an ANR) with respect to financial risks, which is reflected in the reduced investment demand for the risky asset. Additionally, for a given ambiguity aversion parameter, the robust equilibrium investment in the stock decreases as the parameter \(m_k\) increases. In other words, the more risk-averse the agent (or the principal) is, the less wealth the agent (or the principal) tends to invest in the risky share.
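The pattern described for Fig. 1 follows directly from (3.29) and (3.30), in which the risk aversion and ambiguity aversion parameters enter only through their sum in the denominator. The sketch below tabulates the time-0 investment on a small grid; the default parameter values and the grid are hypothetical and do not reproduce Table 1 or the figure.

```python
import math

def pi_star_0(m, beta, *, mu_t=0.1, r=0.05, sigma_t=0.25, T=4.0):
    """Time-0 risky-asset amount from (3.29)/(3.30) for risk aversion m
    and financial ambiguity aversion beta (defaults are assumed values)."""
    return (mu_t - r) / ((m + beta) * sigma_t ** 2 * math.exp(r * T))

BETAS = [0.0, 0.5, 1.0, 1.5, 2.0]
CURVES = {m: [pi_star_0(m, b) for b in BETAS] for m in (0.5, 1.0)}

for m, values in CURVES.items():
    print(f"m = {m}: " + ", ".join(f"{v:.3f}" for v in values))
```

Each row decreases from left to right (more ambiguity aversion), and the row for the larger m lies below the row for the smaller m (more risk aversion), matching the qualitative description above. The first entry of each row, with beta equal to zero, corresponds to the ambiguity-neutral case.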

Figure 2 shows the impact of the risk aversion coefficient \(m_k,\) for \(k\in \{1,2\},\) on the robust equilibrium reinsurance strategies of the insurer and the reinsurer under different scenarios. The agent’s equilibrium retention level \(q^*(0)\) declines as \(m_1\) increases. This is because a more risk-averse insurer is less willing to bear insurance risk and so tends to cede more of it to the reinsurer. For the same level of risk aversion, an AAI retains less insurance risk than an ANI, which indicates that ambiguity aversion renders the insurer more conservative towards insurance risks. Regarding the reinsurer, we observe that the equilibrium reinsurance premium \(\eta ^*(0)\) increases as her risk aversion parameter \(m_2\) increases. This may be attributed to the idea that a more risk-averse reinsurer would like to bear less insurance risk. Consequently, the reinsurer tends to raise the reinsurance price to compensate for the additional insurance risk to be undertaken. Finally, for a fixed risk aversion parameter, an AAR charges a higher reinsurance price than an ANR, which indicates that a preference for robustness induces the reinsurer to select more conservative and cautious strategies. The results in this figure also suggest that the impact of ambiguity aversion on financial risks and that on insurance risks are consistent with each other.

Figure 3 illustrates the effects of the ambiguity aversion parameter \(\beta _{k1},\) for \(k\in \{1,2\},\) on the robust equilibrium reinsurance strategies of the insurer and the reinsurer derived in Theorem 3.3 and in Corollaries 3.7 and 3.8. We first observe that the insurer decreases his optimal retention level \(q^*(0)\) as the ambiguity aversion parameter corresponding to the diffusion risk of the claims becomes larger. Moreover, the reinsurer tends to increase the reinsurance price when her ambiguity aversion parameter increases, in order to reduce the adverse impact of model misspecification. These observations appear to be in line with intuition. The left panel of Fig. 3 shows that, for a fixed ambiguity aversion parameter of the insurer, the optimal retention level of the insurer in the optimal reinsurance contract between an AAI and an AAR is higher than that in the optimal reinsurance contract between an AAI and an ANR. As shown in Fig. 2, this is mainly because an AAR tends to offer a higher reinsurance price than an ANR. As discussed in Remark 3.9, if both the principal and the agent have concerns for robustness, the impacts of their attitudes towards model uncertainty are strengthened. Consequently, the reinsurer tends to adopt more conservative strategies, i.e., the reinsurer increases the reinsurance premium. This may explain why the black curve is above the red curve in the right panel of Fig. 3.
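The monotonicity statements for Figs. 2 and 3 can be read off the signs of the partial derivatives of (3.27) and (3.28). Writing \(b_1=\beta _{11}+m_1\) and \(b_2=\beta _{21}+m_2\) (shorthand introduced here), one obtains \(\partial q^*/\partial b_1=-b_2/(2b_1+b_2)^2<0\) and \(\partial q^*/\partial b_2=b_1/(2b_1+b_2)^2>0\), while the price scaled by \(\sigma ^2e^{r(T-t)}/\mu \) has derivatives \((2b_1^2+2b_1b_2+b_2^2)/(2b_1+b_2)^2>0\) and \(b_1^2/(2b_1+b_2)^2>0\). These derivatives are computed here rather than quoted from the paper; the sketch below cross-checks them against central finite differences at an arbitrary test point.

```python
def q_star(b1, b2):
    """Retention level (3.27) in terms of b1 = beta11 + m1 and b2 = beta21 + m2."""
    return (b1 + b2) / (2.0 * b1 + b2)

def eta_scaled(b1, b2):
    """Price (3.28) multiplied by mu / (sigma^2 * e^{r(T-t)})."""
    return b1 * (b1 + b2) / (2.0 * b1 + b2)

def analytic_gradients(b1, b2):
    """Partial derivatives of q_star and eta_scaled with respect to b1 and b2."""
    d = (2.0 * b1 + b2) ** 2
    return {
        "dq_db1": -b2 / d,
        "dq_db2": b1 / d,
        "deta_db1": (2.0 * b1 ** 2 + 2.0 * b1 * b2 + b2 ** 2) / d,
        "deta_db2": b1 ** 2 / d,
    }

def central_diff(f, b1, b2, wrt, h=1e-5):
    """Central finite difference of f in its first (wrt=1) or second (wrt=2) argument."""
    if wrt == 1:
        return (f(b1 + h, b2) - f(b1 - h, b2)) / (2.0 * h)
    return (f(b1, b2 + h) - f(b1, b2 - h)) / (2.0 * h)
```

The signs say that the insurer's own risk or ambiguity aversion lowers the retention level, the reinsurer's raises it, and the price rises with the aversion parameters of either party, which is consistent with the ordering of the curves described above.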

Figures 4 and 5 depict the results of the sensitivity analyses for the utility losses of the insurer and the reinsurer. It can be seen in Fig. 4 that the utility loss \(UL_1(0)\) of the insurer shows a rising trend as T increases. This could be explained by the fact that the insurer is expected to face greater model uncertainty when the reinsurance and investment planning horizon T becomes longer. The implication for the insurer is that model ambiguity needs to be taken into account when he intends to maintain a long-term cooperative relationship with the reinsurer and to participate in long-term investment activities. The utility loss \(UL_2(0)\) of the reinsurer also increases as T increases, though to a lesser extent than that of the insurer. Figure 5 shows the impacts of the ambiguity aversion coefficients \(\beta _{ki},\) for \(k, i\in \{1,2\},\) on the utility losses of the insurer and the reinsurer. We find that the utility losses of the insurer and the reinsurer are increasing functions of their respective ambiguity aversion parameters. These results imply that a decision-maker would suffer a greater utility loss from disregarding model ambiguity if he/she has less information about the reference model. Also, we can observe that the utility losses of the insurer and the reinsurer are less sensitive to the ambiguity aversion parameter \(\beta _{k2}\) than to \(\beta _{k1}.\) This suggests that the decision-makers’ ambiguity aversion attitudes towards the claim process play a more important role in their utility losses than those towards the financial market.

6 Concluding remarks

The aim of this paper was to examine a robust optimal reinsurance contracting problem under a continuous-time principal–agent framework. More specifically, we have assumed that both the insurer and the reinsurer are ambiguity-averse and intend to develop a robust proportional reinsurance contract and robust investment strategies by considering a family of alternative models. Both the insurer and the reinsurer can invest in a stock and a bank account. Under the time-consistent mean-variance criterion, two systems of extended HJB equations have been considered to obtain the explicit expressions for the equilibrium reinsurance–investment strategies and the corresponding equilibrium value functions of the insurer and the reinsurer. We have also presented particular cases of our model and discussed the utility losses of the insurer and the reinsurer if they ignore model uncertainty.

The main implications of the results are summarized as follows: (1) The insurer and the reinsurer tend to select more conservative investment strategies if they are more ambiguity-averse or more risk-averse. This is reflected in the reduced amount invested in the risky asset; (2) The insurer tends to undertake less insurance risk and purchase more reinsurance if he is more ambiguity-averse or more risk-averse. Besides, a reinsurer with a larger ambiguity aversion parameter or a larger risk aversion parameter would charge a higher reinsurance premium; and (3) The utility losses of the principal and the agent increase as their ambiguity aversion parameters and the horizon for reinsurance and investment increase, which is consistent with the conclusions obtained by Hu et al. (2018a, 2018b) and Hu and Wang (2019). This conclusion also indicates that it is important for long-term decision-makers to account for model uncertainty. In future research, we expect to extend the current study by incorporating moral hazard and adverse selection of the insurer under the principal–agent framework.