1 Introduction

The example of the current COVID-19 pandemic clearly shows the significant influence of mobility on the spread of a disease. Mathematical-epidemiological models can address this using various techniques. The movement of people between separate patches such as airports, islands or cities can be represented using Lagrangian movement for short-term stays or Eulerian movement for long-term migrations [5]. In that setting, the modelling is done via ordinary differential equations (ODE). However, since this point-by-point distribution of pathogens does not reflect reality on its own, spatial spread needs to be taken into account as well. This can be achieved with reaction-diffusion systems, which contain partial differential equations (PDE) [2, 10]. Consequently, we consider a system of the form

$$\displaystyle \begin{aligned} \partial_t u &= \kappa \Delta u + f(u) \, ,\\ u &= u_0, \quad t=0 \, ,\\ \partial_\nu u &= 0, \quad x \in \partial \Omega \, . \end{aligned}$$

The goal is to fit this model to data sets. Unfortunately, several parameters are unknown in the epidemiological context, such as the transmission rate or even the parameters describing mobility. In addition, noisy data are to be expected, so the initial condition has to be adjusted as well.

In this contribution, a parameter estimation via adjoint functions is tested. This corresponds to techniques from static and dynamic optimization. To investigate the accuracy of the method, we consider an artificially generated data set. Numerical simulations are performed to fit the model to this data set.

2 Model

In the following we consider the set Ω = (0, a) × (0, b) as spatial coordinates and a time axis (0, T) with resulting domain V =  Ω × (0, T). To model a spatial spread of an infectious disease, we use an epidemiological SIS model. The resulting reaction-diffusion system reads as

$$\displaystyle \begin{aligned} \partial_t S &=\kappa_S \Delta S - \frac{\beta}{N} S I + \gamma I \, , \end{aligned} $$
(1a)
$$\displaystyle \begin{aligned} \partial_t I &= \kappa_I \Delta I + \frac{\beta}{N} S I - \gamma I \, , \end{aligned} $$
(1b)
$$\displaystyle \begin{aligned} S &= S_0, \, I = I_0, \quad t=0 \, , \end{aligned} $$
(1c)
$$\displaystyle \begin{aligned} \partial_\nu S &= \partial_\nu I = 0, \quad x \in \partial \Omega \, . \end{aligned} $$
(1d)

The functions \(S, I, N \in C^{2,1}(V)\) represent the densities of the compartments of susceptible (S) and infected (I) individuals and the total population density N = S + I at position x and time t.

Here, e.g. \(\partial _t S=\frac {\partial S}{\partial t}\) stands for the time derivative of S and \(\Delta S = \text{div}(\text{grad } S) = \frac {\partial ^2 S}{\partial x_1^2} + \frac {\partial ^2 S}{\partial x_2^2}\) stands for the Laplace operator. For the two compartments, initial conditions are given by \(S_0, I_0 \in C^2(\Omega)\). At the boundary \(\partial \Omega\), Neumann boundary conditions are imposed, whereby \(\partial_\nu S\) denotes the derivative of S in the direction of the unit outward normal ν. In context, this means that no individual leaves the region Ω. We also assume that \(\int_\Omega I(x,0) \, dx > 0\) holds with \(S_0, I_0 \geq 0\). We define

$$\displaystyle \begin{aligned} \overline{N} := \int_\Omega N(x,0) \, dx \, , \end{aligned}$$

which stands for the total number of individuals at time t = 0. Since the reaction terms in (1a) and (1b) cancel when the two equations are added, Gauss's theorem together with the Neumann boundary conditions yields

$$\displaystyle \begin{aligned} \frac{\partial}{\partial t} \int_\Omega N(x,t) \, dx = \int_\Omega \kappa_S \Delta S + \kappa_I \Delta I \: dx = \int_{\partial \Omega} \kappa_S \partial_\nu S + \kappa_I \partial_\nu I \: d \omega = 0 \, . \end{aligned}$$

Thus, the total population is constant with value \(\overline {N}\).

The parameters β, γ > 0 represent the transmission and recovery rates of the corresponding disease and \(\kappa_S, \kappa_I > 0\) the diffusivities of the corresponding compartments. For simplicity, we assume that \(\kappa_S = \kappa_I\) holds and that β, γ are constants independent of x. For the derivation of such a model in the one-dimensional case and the operation of epidemiological models, we refer to [5].

SIS-based reaction-diffusion systems as in (1) have already been studied in [1, 3, 6, 7, 8, 9]. There, the existence of a global and unique solution is shown, also for cases in which \(\kappa_S \neq \kappa_I\) holds and β, γ are Hölder continuous functions over Ω. In [1] a basic reproduction number is established on the Sobolev space \(H^1(\Omega)\) by

$$\displaystyle \begin{aligned} \mathcal{R}_0 = \sup_{\substack{\varphi \in H^1(\Omega) \\ \varphi \neq 0}} \left( \frac{\int_\Omega \beta \varphi^2}{\int_\Omega \kappa_I \vert \nabla \varphi \vert^2 + \gamma \varphi^2} \right) \, . \end{aligned} $$
(2)

It is shown there that, if \(\mathcal {R}_0<1\) holds, the unique disease-free equilibrium \(\text{DFE}=\left ( \frac {\overline {N}}{\vert \Omega \vert },0 \right )\) is globally asymptotically stable, whereas it is unstable for \(\mathcal {R}_0 > 1\). The expression |Ω| here stands for the measure of Ω. Moreover, for \(\mathcal {R}_0>1\) the existence of a unique endemic equilibrium EE is shown.
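As a brief worked case of (2), consider the simplification adopted in this section, i.e. constant β, γ and \(\kappa_I\). For every \(\varphi \in H^1(\Omega)\), \(\varphi \neq 0\), the gradient term in the denominator is non-negative, so that

$$\displaystyle \begin{aligned} \frac{\int_\Omega \beta \varphi^2}{\int_\Omega \kappa_I \vert \nabla \varphi \vert^2 + \gamma \varphi^2} \leq \frac{\beta \int_\Omega \varphi^2}{\gamma \int_\Omega \varphi^2} = \frac{\beta}{\gamma} \, , \end{aligned}$$

with equality for constant φ. Hence, in this special case, \(\mathcal{R}_0 = \beta / \gamma\). For the values β = 0.3 and γ = 0.1 used in Sect. 4 this gives \(\mathcal{R}_0 = 3 > 1\).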

Furthermore, we set \(\kappa := \kappa_S = \kappa_I\) and substitute S = N − I. If we additionally define \(u:=\frac {I}{N}\), we obtain a reduced system with f(u) := β(1 − u)u − γu

$$\displaystyle \begin{aligned} \partial_t u &= \kappa \Delta u + f(u) \, , \end{aligned} $$
(3a)
$$\displaystyle \begin{aligned} u &= u_0, \, \quad t=0 \, , \end{aligned} $$
(3b)
$$\displaystyle \begin{aligned} \partial_\nu u &= 0, \quad x \in \partial \Omega \, . \end{aligned} $$
(3c)
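To illustrate the reaction term, the spatially homogeneous steady states of (3) can be read off directly. This is only a short worked calculation under the constant-coefficient assumption made above:

$$\displaystyle \begin{aligned} f(u) = u \left( \beta - \gamma - \beta u \right) = 0 \quad \Longleftrightarrow \quad u = 0 \quad \text{or} \quad u = u^* := 1 - \frac{\gamma}{\beta} \, . \end{aligned}$$

Both constants satisfy Δu = 0 and the Neumann condition (3c). The nontrivial state \(u^*\) lies in (0, 1) exactly when β > γ. For the values β = 0.3 and γ = 0.1 used in Sect. 4 one gets \(u^* = 2/3\).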

The simplifying assumptions and the normalization are used to test the presented parameter fitting via adjoint functions. It is clear that in realistic situations much more complex models should be used.

3 Adjoint System

We now want to fit model (3) to data sets using adjoint functions known from optimal control theory. In the epidemiological context, this means parameter estimation of the transmission rate β > 0 and diffusivity κ > 0. The recovery rate γ > 0 can be assumed to be the reciprocal of the average infection duration and thus does not need to be fitted. Furthermore, we assume that the data is noisy and therefore the initial condition \(u_0 \in C^2(\Omega)\) has to be adjusted. In the following, the function \(u^{\mathit {DATA}}\) contains the available data points and \(u_0^{\mathit {DATA}}\) the supposedly noisy initial value of the data set at t = 0.

We introduce an objective function \(J : \mathbb {R}^2 \times C^2(\Omega ) \rightarrow \mathbb {R}\)

$$\displaystyle \begin{aligned} J(\beta,\kappa,u_0) := w_0 \Vert u - u^{\mathit{DATA}} \Vert_{L_V^2}^2 + w_1 (\beta^2 + \kappa^2) + w_2 \Vert u_0 - u_0^{\mathit{DATA}} \Vert_{L_\Omega^2}^2 \, . \end{aligned} $$
(4)

The function u stands for the solution of the reaction-diffusion PDE system (3). The objective function includes the \(L^2\)-norm \(\Vert g \Vert _{L_Y^2}:=\left ( \int _Y g(y)^2 \, d y \right )^{1/2}\) and the corresponding normalizing weights \(w_0:=1/ \Vert u^{\mathit {DATA}} \Vert _{L_V^2}^2\) and \(w_2:=1/\Vert u_0^{\mathit {DATA}} \Vert _{L_\Omega ^2}^2\). The convex and radially unbounded regularization term \(w_1(\beta^2 + \kappa^2)\) depends on a very small weight \(w_1\), whose influence is investigated in the subsequent simulations. If initial guesses \(\hat {\beta },\hat {\kappa }\) for the parameters are already available, a term of the form \(w_1\left ((\beta -\hat {\beta })^2 + (\kappa -\hat {\kappa })^2 \right )\) can be used alternatively.
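A minimal sketch of how (4) could be evaluated on gridded data is given below. It assumes a uniform space-time grid, so that the quadrature weights cancel in the two normalized terms and plain sums suffice. The function name `objective` and its argument names are illustrative assumptions, not part of the original work.

```python
import numpy as np

def objective(u, u_data, u0, u0_data, beta, kappa, w1):
    """Discrete version of J in (4) on a uniform grid (hypothetical helper).

    u, u_data   : arrays of shape (time, x1, x2)
    u0, u0_data : arrays of shape (x1, x2)
    """
    # Normalizing weights w0 and w2; grid cell volumes cancel against the norms.
    w0 = 1.0 / np.sum(u_data**2)
    w2 = 1.0 / np.sum(u0_data**2)
    misfit = w0 * np.sum((u - u_data) ** 2)        # data misfit term
    regular = w1 * (beta**2 + kappa**2)            # regularization term
    initial = w2 * np.sum((u0 - u0_data) ** 2)     # initial-condition term
    return misfit + regular + initial
```

With this normalization, the trivial guess u = 0 and \(u_0 = 0\) yields the value 2 when \(w_1 = 0\), which gives a natural scale for judging the fit.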

This leads to a minimization problem with dynamic constraints

$$\displaystyle \begin{aligned} \min_{\beta,\kappa,u_0} \, J(\beta,\kappa,u_0) \quad \text{ subject to PDE system (3)} \, . \end{aligned} $$
(5)

A Lagrange function is introduced containing an adjoint function \(z \in C^{2,1}(V)\)

$$\displaystyle \begin{aligned} L(\beta,\kappa,u_0,u,z):=\int_V g \, dxdt + \psi + \int_V z \left( f(u) + \kappa \Delta u - \partial_t u \right) \, dxdt \, , \end{aligned} $$
(6)

whereby \(g:=w_0 \left ( u - u^{\mathit {DATA}} \right )^2\) and \(\psi :=w_1 (\beta ^2 + \kappa ^2)+w_2 \int _\Omega \left ( u_0 - u_0^{\mathit {DATA}} \right )^2 \, dx\).

The necessary condition for a minimum \((\beta ^*,\kappa ^*,u_0^*,u^*,z^*)\) is fulfilled, if

$$\displaystyle \begin{aligned} 0 = \nabla L := \left(\partial_\beta L,\partial_\kappa L,\partial_{u_0} L,\partial_u L,\partial_z L \right) \end{aligned}$$

holds true. It should be noted that Gâteaux derivatives are needed for the derivatives of L in the directions \(u_0\), u and z. This leads to the following system in \((\beta ^*,\kappa ^*,u_0^*,u^*,z^*)\):

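  (i) \(0 = \partial_\beta \psi + \int_V z \, \partial_\beta f \, dxdt\),   (Optimality Condition)

    \(0 = \partial_\kappa \psi + \int_V z \, \Delta u \, dxdt\),

  (ii) \(u_0=u_0^{\mathit {DATA}}-\frac {z(x,0)}{2w_2}\),   (Optimal Initial Condition)

  (iii) \(\partial_t z = - \partial_u g - z \, \partial_u f - \kappa \Delta z\),   (Adjoint Equation)

    \(z = 0, \quad t = T\),   (Transversality Condition)

    \(\partial_\nu z = 0, \quad x \in \partial \Omega\).   (Adjoint Neumann Boundary Condition)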

Differentiating L in the z direction recovers the original PDE system (3).
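The step from (6) to conditions (ii) and (iii) rests on moving the derivatives from u onto z. As a sketch of this intermediate calculation, integration by parts in time and Green's second identity in space give

$$\displaystyle \begin{aligned} \int_V z \left( \kappa \Delta u - \partial_t u \right) \, dxdt &= \int_V u \left( \kappa \Delta z + \partial_t z \right) \, dxdt + \kappa \int_0^T \int_{\partial \Omega} \left( z \, \partial_\nu u - u \, \partial_\nu z \right) \, d\omega \, dt \\ &\quad - \int_\Omega z(x,T) \, u(x,T) \, dx + \int_\Omega z(x,0) \, u_0(x) \, dx \, . \end{aligned}$$

Since \(\partial_\nu u = 0\) on \(\partial \Omega\), the boundary integral vanishes for arbitrary u only if \(\partial_\nu z = 0\), and the term at t = T vanishes only if z(x, T) = 0. The term at t = 0, combined with \(\partial_{u_0} \psi = 2 w_2 (u_0 - u_0^{\mathit{DATA}})\), yields condition (ii), while the variation with respect to u in the interior of V yields \(\partial_u g + z \, \partial_u f + \kappa \Delta z + \partial_t z = 0\), i.e. the adjoint equation.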

4 Numerical Simulations

From the analysis in Sect. 3, the gradient of L with respect to β and κ reads

$$\displaystyle \begin{aligned} \partial_\beta L &= 2 w_1 \beta + \int_V z(1-u)u \, dxdt \end{aligned} $$
(7a)
$$\displaystyle \begin{aligned} \partial_\kappa L &= 2 w_1 \kappa + \int_V z \Delta u \, dxdt \end{aligned} $$
(7b)

and we obtain the adjoint equation

$$\displaystyle \begin{aligned} \partial_t z = - 2 w_0 \left( u - u^{\mathit{DATA}} \right) - z( \beta (1 - 2 u) - \gamma ) -\kappa \Delta z \, . \end{aligned} $$
(8)

The latter must be solved backward in time t due to the transversality condition. This is done using the forward-backward sweep method, see [4]. The algorithm used can be found in Appendix 1. The PDEs are solved using finite differences

$$\displaystyle \begin{aligned} \Delta u_{i,j}^n \approx \frac{1}{h^2} \left( u_{i-1,j}^n + u_{i,j-1}^n - 4 u_{i,j}^n + u_{i+1,j}^n + u_{i,j+1}^n \right) \end{aligned} $$
(9)

and an explicit Euler scheme

$$\displaystyle \begin{aligned} u_{i,j}^{n+1} = u_{i,j}^n + \tau ( \kappa \Delta u_{i,j}^n + f(u_{i,j}^n) ) \end{aligned} $$
(10)

on the domain V = Ω × (0, T) with Ω = (0, a) × (0, b). The Neumann boundary conditions are implemented by \(u_{k+1,j}^n=u_{k,j}^n\) etc., if index (k, j) stands for a point on the rectangular boundary \(\partial \Omega\). In the following simulations we use the setting

  • h := 0.1, τ := 0.001, a := 3, b := 2, T := 1

  • \(x_1^i=ih\): i = 0, …, 30,  \(x_2^j=jh\): j = 0, …, 20,  \(t^n=n\tau\): n = 0, …, 1000.
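The schemes (9) and (10), the backward solution of the adjoint equation (8) and the gradient (7) can be sketched as follows. This is not the algorithm of Appendix 1 but an illustrative single pass (one state solve, one adjoint solve, one gradient evaluation). All function names are assumptions, the adjoint is discretized with the same explicit Euler step run backward in time, and the integrals in (7) are approximated by plain Riemann sums.

```python
import numpy as np

def laplacian(u, h):
    """Five-point Laplacian (9); ghost points copy the boundary values (Neumann)."""
    p = np.pad(u, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u) / h**2

def solve_state(u0, beta, kappa, gamma, h, tau, steps):
    """Explicit Euler scheme (10) for the state equation (3), forward in time."""
    u = np.empty((steps + 1,) + u0.shape)
    u[0] = u0
    for n in range(steps):
        f = beta * (1.0 - u[n]) * u[n] - gamma * u[n]
        u[n + 1] = u[n] + tau * (kappa * laplacian(u[n], h) + f)
    return u

def solve_adjoint(u, u_data, beta, kappa, gamma, w0, h, tau):
    """Explicit Euler for the adjoint equation (8), backward from z(T) = 0."""
    z = np.zeros_like(u)
    for n in range(u.shape[0] - 1, 0, -1):
        dz_dt = (-2.0 * w0 * (u[n] - u_data[n])
                 - z[n] * (beta * (1.0 - 2.0 * u[n]) - gamma)
                 - kappa * laplacian(z[n], h))
        z[n - 1] = z[n] - tau * dz_dt   # one step backward in time
    return z

def gradient(u, z, beta, kappa, w1, h, tau):
    """Riemann-sum approximation of the gradient components (7a) and (7b)."""
    dV = h * h * tau
    lap_u = np.stack([laplacian(un, h) for un in u])
    g_beta = 2.0 * w1 * beta + np.sum(z * (1.0 - u) * u) * dV
    g_kappa = 2.0 * w1 * kappa + np.sum(z * lap_u) * dV
    return g_beta, g_kappa

if __name__ == "__main__":
    h, tau, steps, gamma = 0.1, 0.001, 1000, 0.1
    u0 = np.zeros((31, 21))
    u0[4, 6], u0[20, 10] = 0.02, 0.1          # grid points (0.4, 0.6) and (2, 1)
    u_data = solve_state(u0, 0.3, 0.2, gamma, h, tau, steps)   # noise-free reference
    w0 = 1.0 / (np.sum(u_data**2) * h * h * tau)
    u = solve_state(u0, 1.0, 1.0, gamma, h, tau, steps)        # initial guess
    z = solve_adjoint(u, u_data, 1.0, 1.0, gamma, w0, h, tau)
    print(gradient(u, z, 1.0, 1.0, 1e-12, h, tau))
```

In a full forward-backward sweep, such a gradient would be used to update β and κ (and z(x, 0) to update \(u_0\) via the optimal initial condition) until a stopping tolerance is met. The update rule itself is deliberately left out here, since it is specified in Appendix 1.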

To test the procedure an artificial data set is generated with initial condition

$$\displaystyle \begin{aligned} u_0^{\mathit{DATA}}(x_1^i,x_2^j):=0.02 \delta_{(0.4,0.6)}(x_1^i,x_2^j)+0.1 \delta_{(2,1)}(x_1^i,x_2^j) \end{aligned} $$
(11)

whereby \(\delta _{(\tilde {x}_1,\tilde {x}_2)}(x_1^i,x_2^j)=1\) if \((x_1^i,x_2^j)=(\tilde {x}_1,\tilde {x}_2)\), and \(\delta _{(\tilde {x}_1,\tilde {x}_2)} = 0\) otherwise. Subsequently, the state variable PDE (3) is solved with β := 0.3, κ := 0.2 and γ := 0.1. The resulting solution is called \(\overline {u}\) in the following. To simulate noisy data, normally distributed noise \(q_{i,j}^n \sim \mathcal {N}(0,\sigma ^2)\) is generated, so that the desired data set is calculated by

$$\displaystyle \begin{aligned} u^{\mathit{DATA}}(x_1^i,x_2^j,t^n):= \max \Bigl( 0 , (1+q_{i,j}^n) \cdot \overline{u}(x_1^i,x_2^j,t^n) \Bigr) \, . \end{aligned} $$
(12)
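The multiplicative noise model (12) can be sketched in a few lines. The function name `make_noisy_data` and the fixed random seed are assumptions made here for reproducibility; the original work does not state how the random numbers were drawn.

```python
import numpy as np

def make_noisy_data(u_bar, sigma, seed=0):
    """Apply (12): multiplicative Gaussian noise, clipped at zero from below."""
    rng = np.random.default_rng(seed)              # seed is an illustrative choice
    q = rng.normal(0.0, sigma, size=u_bar.shape)   # q ~ N(0, sigma^2) per grid point
    return np.maximum(0.0, (1.0 + q) * u_bar)
```

Because the noise is multiplicative, grid points where \(\overline{u}\) vanishes stay exactly zero in the data set, and the clipping only becomes active when \(1 + q_{i,j}^n < 0\).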

5 Results and Conclusions

The application of the presented method is tested in three simulations with different initial values \(\beta_0, \kappa_0\). The initial guess for the initial condition \(u_0\) is taken from the desired data set \(u^{\mathit{DATA}}\). The resulting Table 1 and Fig. 1 in Appendix 2 show adequate parameter estimates. A test run without artificial noise on the data set recovered the original values β = 0.3 and κ = 0.2. The simulations also show the effect of the weight \(w_1\) of the regularization term on the minimization of the objective function J. Although the regularization term perturbs the objective, better results are obtained with it than without it. The prerequisite for this is a correspondingly small choice of \(w_1\), which influences the convexity of the objective function in the respective parameters.

Fig. 1 Graphical results for the simulation with \(\beta_0 := 1\), \(\kappa_0 := 1\), \(w_1 := 10^{-12}\)

Table 1 The recovery rate is fixed at γ := 0.1. The algorithm stops with tolerance \(\mathit{TOL} := 10^{-6}\). The original parameters of the artificial data set are β := 0.3 and κ := 0.2. The artificial noise is generated with standard deviation σ := 0.1

The present simulations show that the applied method works very well in this toy problem with a self-generated data set. In principle, the procedure is suitable for performing such parameter estimations. In the next step, the method should be tested with real data sets. Depending on the disease, much more sophisticated epidemiological models may also be required. Mobility between patches, such as daily commuting or travelling, should also be added to the model. With respect to the PDE solution, other solution methods should also be tested, since the simple explicit Euler method may be numerically unstable. In addition, a simple rectangular area was assumed in our example. In real cases, appropriate adjustments are necessary here.
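As a rough guide to the stability remark, the classical von Neumann bound for the explicit scheme (10) can be stated. It is derived here for pure diffusion on a grid with equal spacing h in both directions; neglecting the reaction term f is an assumption of this sketch, so the bound is only indicative:

$$\displaystyle \begin{aligned} \frac{\kappa \tau}{h^2} \leq \frac{1}{4} \, . \end{aligned}$$

With h = 0.1 and τ = 0.001 this corresponds to κ ≤ 2.5. The original value κ = 0.2 and the initial guess \(\kappa_0 = 1\) lie inside this range, but iterates of κ that leave it during the optimization could cause the explicit scheme to diverge.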