Introduction

Since the pioneering works of Malthus [1] and Verhulst [2], many studies on population dynamics have been proposed, several of them modeling the movement of the population through dispersion and accounting for logistic growth rates, in the search for factors that can alter birth and mortality rates. In [3] the random walk is adopted as a starting point for the study of dispersal in living organisms, while in [4, 5] the main models involving this mechanism are reviewed. Considering population control mechanisms [6], extinction conditions are obtained through impulsive culling. Most of these works present different models, and their simplified analytical or numerical solutions are carried out for specific cases. However, most of them end up ignoring the practical applications that could be derived from such models. In this sense, inverse analysis provides a bridge between theory and practice, since parameter estimation can be used to fit such models to a given population of interest, allowing their practical use in forecasting and in the design of optimal population control mechanisms [7,8,9,10]. The inverse problem analysis involves, in general, intensive iterative procedures. Therefore, it is of major importance to have an accurate and computationally efficient solution methodology for the mathematical model, generally given by partial differential equations [11].

Aiming at a robust implementation for population dynamics problems modeled by partial differential equations, Knupp et al. [9] combined a hybrid numerical-analytical solution methodology, known as the Generalized Integral Transform Technique (GITT), with the Differential Evolution optimization method for the inverse problem related to population growth models with time delay and impulsive culling, using the maximum likelihood procedure. The need for prior information regarding one or more model parameters was observed, which should be obtained by means of some other independent procedure. In such cases, it is critically important to consider how the uncertainties in the supposedly known values of these parameters affect the estimation of the others. In this scenario, the most attractive framework is the Bayesian approach, which combines the likelihood function with prior information in order to yield a representation of the posterior distribution of the parameters [12]. The Bayesian framework has been successfully employed in different areas [13,14,15,16,17,18,19], just to cite a few examples.

Under certain conditions, where the experimental error follows a normal distribution and the prior information can also be modeled as a normal distribution, it is possible to derive the maximum a posteriori objective function, which may be used to obtain single-point estimates for the unknowns, together with approximate confidence intervals. Nevertheless, this represents only part of the information on the unknowns, and the estimated confidence intervals must be used cautiously, since this commonly adopted approach is exact only for linear problems, being just an approximation for nonlinear inverse problems such as the one considered in this work. The posterior probability distribution may be further explored using random sampling methods, such as Markov Chain Monte Carlo techniques. These approaches are more computationally intensive, but on the other hand they allow one to approach the true posterior distribution upon appropriate modeling of the prior information and of the experimental errors, regardless of their statistical distribution form.

In the present work, the inverse analysis of population dynamics with time delay and impulsive culling is tackled, with two main contributions: (i) in order to adequately take into account the prior information considered available for one or more model parameters, as well as the uncertainties involved, the inverse analysis is carried out within the Bayesian framework and implemented via Markov Chain Monte Carlo methods with the Metropolis-Hastings (MH) algorithm; (ii) in order to overcome the difficulty posed by the high number of simulations required by the Markov Chain Monte Carlo methods, a fast and accurate direct problem solution is needed. In this sense, a computationally low-cost solution is proposed, constructed upon the integral transformation of the original problem, and employing semi-analytical integrations and reduced-order expansions of the nonlinear terms. The use of this approximate solution, however, introduces errors which should be taken into account in the inverse problem solution. Thus, this work also investigates the Approximation Error Model (AEM) approach in the Bayesian formulation [20].

Table 1 Parameters for the direct problem solution
Table 2 Direct problem solution convergence behavior, \(u(x,t=50)\) (a) With \(M=30\) subregions for the semi-analytical integration. (b) With \(M=150\) subregions for the semi-analytical integration
Table 3 Direct problem solution convergence behavior, \(u(x,t=125)\) (a) With \(M=30\) subregions for the semi-analytical integration. (b) With \(M=150\) subregions for the semi-analytical integration
Table 4 Test cases
Table 5 Case 1—Inverse problem solution (a) Statistics obtained after discarding the burn-in states. (b) Estimates with 95% confidence intervals obtained with MCMC method. CI width (%): ratio between the 95% confidence interval range and the estimated mean value
Fig. 1 Case 1—Markov chains for each parameter (black), mean (green), 95% confidence interval (red) and exact value (blue)

Fig. 2 Case 1—Posterior distributions

Direct Problem Formulation and Solution Methodology

Consider the time-dependent density u of an adult population defined in a one-dimensional space with diffusive behavior according to Fick’s law, with constant dispersion coefficient D, reproduction time delay \(\tau \), premature death rate \(\mu \), and birth and death rates given by b(u) and d(u), respectively. The governing equation is given by [6],

$$\begin{aligned} \frac{\partial u(x,t)}{\partial t}=D\frac{\partial ^2 u(x,t) }{\partial x^2}+e^{-\mu \tau }b(u)-d(u)-\sum _{j=1}^{N_{cs}}B_ju(x_j,t)\delta (x-x_j) \end{aligned}$$
(1)

defined in \(0<x<L\) for \(t>0\), where the positions \(x_j\) represent the locations of the culling sites, with culling intensities \(B_j\). The total number of culling sites is \(N_{cs}\), and \(\delta \) denotes the Dirac delta function.

The initial and boundary conditions are given respectively by:

$$\begin{aligned}&u(x,t)=u_0(x,t),\quad -\tau \le t\le 0 \end{aligned}$$
(2)
$$\begin{aligned}&\left. \frac{\partial u}{\partial x} \right| _{x=0} =0, \quad \left. \frac{\partial u}{\partial x} \right| _{x=L} =0 \end{aligned}$$
(3)

Assume that the rates b(u) and d(u) are defined according to a logistic model, being respectively given by [7]:

$$\begin{aligned}&b(u)=\frac{u(x,t-\tau )}{K+C u(x,t-\tau )} \end{aligned}$$
(4)
$$\begin{aligned}&d(u)=\frac{P u^2(x,t)}{K+C u(x,t-\tau )} \end{aligned}$$
(5)

In this population dynamics model, the parameters are biologically interpreted as: D is the population dispersion coefficient, P is the intrinsic growth rate, K is the carrying capacity, C is the rate of replacement of the population in carrying capacity and B is the intensity of impulsive culling sites.

For the solution of the partial differential equation given by Eqs. (1)–(5), the hybrid numerical–analytical approach named Generalized Integral Transform Technique (GITT) [21] has been employed. Following the formalism for the solution of the nonlinear model given by Eq. (1), the following integral transformation pair is defined:

$$\begin{aligned}&{\text {Transform:}}~~{\overline{u}}_{i}(t)=\int _0^L u(x,t){\widetilde{\psi }}_{i}(x)dx \end{aligned}$$
(6)
$$\begin{aligned}&{\text {Inverse:}}~~u(x,t)=\sum _{i=0}^\infty {\overline{u}}_{i}(t){\widetilde{\psi }}_{i}(x) \end{aligned}$$
(7)

where \({\widetilde{\psi }}_{i}(x)\) are the normalized eigenfunctions:

$$\begin{aligned} {\widetilde{\psi }}_{i}(x)=\frac{\psi _{i}(x)}{\sqrt{N_{i}}} \end{aligned}$$
(8)

with the normalization integral, \(N_i\), given by:

$$\begin{aligned} {N_{i}}=\int _0^L \psi _{i}^2(x)dx \end{aligned}$$
(9)
Table 6 Case 2—Inverse problem solution (a) Statistics obtained after discarding burn-in states. (b) Estimates with 95% confidence intervals obtained with MCMC method. CI width (%): ratio between the 95% confidence interval range and the estimated mean value
Fig. 3 Case 2—Markov chains for each parameter (black), mean (green), 95% confidence interval (red) and exact value (blue)

Fig. 4 Case 2—Posterior distributions

Fig. 5 Case 3—Markov chains for each parameter (black), mean (green), 95% confidence interval (red) and exact value (blue)

Fig. 6 Case 3—Posterior distributions

The eigenfunctions are obtained considering an auxiliary eigenvalue problem, derived from the direct application of separation of variables to the linear homogeneous purely diffusive version of Eq. (1). Hence, the following Sturm–Liouville eigenvalue problem is employed:

$$\begin{aligned} D\frac{d^2 \psi _i(x)}{dx^2}+ \lambda _i^2 \psi _i(x)=0~,~0<x<L \end{aligned}$$
(10)

with boundary conditions:

$$\begin{aligned} \left. \frac{d\psi _i(x)}{dx}\right| _{x=0}=0,~~\left. \frac{d\psi _i(x)}{dx}\right| _{x=L}=0 \end{aligned}$$
(11)

Equation (10), with the boundary conditions given in Eq. (11), leads to an infinite set of solutions for the eigenfunctions \(\psi _i(x)\), for discrete values of \(\lambda _i\):

$$\begin{aligned} \psi _i(x)=\cos \left( \frac{\lambda _i x}{\sqrt{D}}\right) ,\quad i=0,1,2,\ldots \end{aligned}$$
(12)

where the eigenvalues \(\lambda _i\) are given by:

$$\begin{aligned} \lambda _i = \frac{i\pi \sqrt{D}}{L},\quad i=0,1,2,\ldots \end{aligned}$$
(13)
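With the eigenvalues of Eq. (13), the normalized eigenfunctions reduce to \({\widetilde{\psi }}_i(x)=\cos (i\pi x/L)/\sqrt{N_i}\), with \(N_0=L\) and \(N_i=L/2\) for \(i\ge 1\). A minimal numerical check of the orthonormality underlying the transform–inverse pair, Eqs. (6)–(9), follows (a sketch with an assumed \(L=1\); not the authors' code):

```python
import numpy as np

L = 1.0                          # domain length (assumed value, for illustration)
x = np.linspace(0.0, L, 20001)   # fine grid for numerical quadrature

def trapz(y, x):
    """Simple trapezoidal rule (kept explicit for NumPy-version safety)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) * 0.5)

def psi_norm(i, x, L=L):
    """Normalized eigenfunction, Eqs. (8)-(9): cos(i*pi*x/L)/sqrt(N_i),
    with N_0 = L and N_i = L/2 for i >= 1."""
    Ni = L if i == 0 else L / 2.0
    return np.cos(i * np.pi * x / L) / np.sqrt(Ni)

def inner(i, j):
    """Inner product of two normalized eigenfunctions over [0, L]."""
    return trapz(psi_norm(i, x) * psi_norm(j, x), x)

print(inner(3, 3), inner(2, 5))  # close to 1 and 0, respectively
```

The orthonormality of the \({\widetilde{\psi }}_i\) is what makes the transform of Eq. (6) invert the expansion of Eq. (7) term by term.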

The integral transformation of the original problem is carried out by operating on Eq. (1) with \(\int _0^L {\widetilde{\psi }}_i(x)\left( .\right) dx\), and employing the inverse formula, Eq. (7), into the source-terms, yielding the following set of ordinary differential equations for the transformed potentials \(\overline{u_i}\),

$$\begin{aligned} \frac{d{\overline{u}}_i(t)}{dt}+\lambda _i^2{\overline{u}}_i(t)={\overline{g}}_i(t,\mathbf {u} ),\quad i=0,1,2,\ldots \end{aligned}$$
(14)

with

$$\begin{aligned} \mathbf{u }=({\overline{u}}_0, {\overline{u}}_1,{\overline{u}}_2\ldots ) \end{aligned}$$
(15)

and

$$\begin{aligned} {\overline{g}}_i(t,\mathbf{u })={\overline{g}}_{i,1}(t,\mathbf{u })+{\overline{g}}_{i,2}(t,\mathbf{u })+{\overline{g}}_{i,3}(t,\mathbf{u }) \end{aligned}$$
(16)

where

$$\begin{aligned}&{\overline{g}}_{i,1}=\int _0^L\frac{{\widetilde{\psi }}_i(x)e^{-\mu \tau }\sum _{j=0}^\infty {\overline{u}}_j(t-\tau ){\widetilde{\psi }}_j(x)}{K+C\sum _{j=0}^\infty {\overline{u}}_j(t-\tau ){\widetilde{\psi }}_j(x)}dx \end{aligned}$$
(17)
$$\begin{aligned}&{\overline{g}}_{i,2}=\int _0^L\frac{{\widetilde{\psi }}_i(x)P\left( \sum _{j=0}^\infty {\overline{u}}_j(t){\widetilde{\psi }}_j(x)\right) ^2}{K+C\sum _{j=0}^\infty {\overline{u}}_j(t-\tau ){\widetilde{\psi }}_j(x)}dx \end{aligned}$$
(18)
$$\begin{aligned} {\overline{g}}_{i,3}&=\int _0^L{\widetilde{\psi }}_i(x)\sum _{k=1}^{N_{cs}}B_k\left( \sum _{j=0}^\infty {\overline{u}}_j(t){\widetilde{\psi }}_j(x_k)\right) \delta (x-x_k)dx\\ &=\sum _{k=1}^{N_{cs}}B_k{\widetilde{\psi }}_i(x_k)\sum _{j=0}^\infty {\overline{u}}_j(t){\widetilde{\psi }}_j(x_k) \end{aligned}$$
(19)

The transformed initial conditions are obtained by the integral transformation of Eq. (2), yielding:

$$\begin{aligned} {\overline{u}}_i(t)=\int _0^L {\widetilde{\psi }}_i(x)u_0(x,t)dx,\quad -\tau \le t\le 0,~~i=0,1,2,\ldots \end{aligned}$$
(20)

Equations (14)–(20) form an infinite set of coupled ordinary differential equations, which is unlikely to allow for explicit analytical solutions. Nonetheless, if the system and the infinite expansions appearing in Eqs. (17)–(19) are truncated to a finite order N, the system can be numerically solved by reliable built-in routines provided by well-established computational platforms, such as the NDSolve routine within the Wolfram Mathematica environment.
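Platforms such as NDSolve handle the time delay internally. To make the mechanism concrete, the following is a minimal method-of-steps Euler sketch for a generic scalar constant-delay equation \(u'(t)=g(t,u(t),u(t-\tau ))\) (a toy example under stated assumptions; not the authors' implementation, and a far cruder integrator than NDSolve):

```python
import numpy as np

def solve_dde_euler(g, history, tau, t_end, dt):
    """Method-of-steps Euler for u'(t) = g(t, u(t), u(t - tau)), t > 0,
    with u(t) = history(t) prescribed for -tau <= t <= 0."""
    n = int(round(t_end / dt))
    t = np.linspace(0.0, n * dt, n + 1)
    u = np.empty(n + 1)
    u[0] = history(0.0)
    for k in range(n):
        td = t[k] - tau
        # delayed value: from the prescribed history, or from the past solution
        ud = history(td) if td <= 0 else np.interp(td, t[:k + 1], u[:k + 1])
        u[k + 1] = u[k] + dt * g(t[k], u[k], ud)
    return t, u

# Toy check: u'(t) = u(t - 1) with u = 1 for t <= 0 gives u(1) = 2, u(2) = 3.5
t, u = solve_dde_euler(lambda t, u, ud: ud, lambda t: 1.0, tau=1.0,
                       t_end=2.0, dt=1e-3)
print(u[1000], u[2000])  # close to 2.0 and 3.5
```

The same history-plus-interpolation idea applies componentwise to the truncated vector of transformed potentials \(\overline{u}_i(t)\).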

The most computationally intensive task within this solution procedure is the calculation of Eqs. (17) and (18), since the integrals involved most probably cannot be solved analytically, and numerical techniques can become costly due to the oscillatory behavior of the eigenfunctions. Thus, an alternative semi-analytical integration procedure, as proposed in Ref. [22], is employed. Assuming that the function to be integrated is of the form f(x, t, u), one can write the following approximation:

$$\begin{aligned} \int _{x_0}^{x_1}f(x,t,u)\psi _i (x)dx\approx \sum _{j=1}^M \int _{x_{j-1}}^{x_j}{\widehat{f}}_j(x,t,u)\psi _i (x)dx \end{aligned}$$
(21)

where \({\hat{f}}_j(x,t,u)\) are simpler representations of the original function f(x, t, u), defined in M subregions. In this work, \({\hat{f}}_j\) is a first-order representation of f within each subregion, i.e., a linear approximation of f within \([x_{j-1},x_j]\), for \(j=1,2,\ldots ,M\). Hence, the integrals on the RHS of Eq. (21) allow for analytical integration.
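As an illustration of the idea behind Eq. (21), the sketch below (with an assumed toy integrand; not the authors' code) replaces f by its linear interpolant on M subregions and integrates each piece against a cosine eigenfunction in closed form:

```python
import numpy as np

def trapz(y, x):
    """Simple trapezoidal rule (kept explicit for NumPy-version safety)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) * 0.5)

def semi_analytic_integral(f, lam, a, b, M):
    """Approximate int_a^b f(x) cos(lam x) dx by replacing f with its linear
    interpolant on M subregions and integrating each piece exactly (lam != 0)."""
    xs = np.linspace(a, b, M + 1)
    fs = f(xs)
    total = 0.0
    for j in range(M):
        x0, x1 = xs[j], xs[j + 1]
        c1 = (fs[j + 1] - fs[j]) / (x1 - x0)   # slope of the linear piece
        c0 = fs[j] - c1 * x0                   # intercept of the linear piece
        # antiderivative of (c0 + c1 x) cos(lam x):
        #   (c0 + c1 x) sin(lam x)/lam + c1 cos(lam x)/lam**2
        F = lambda z: (c0 + c1 * z) * np.sin(lam * z) / lam \
            + c1 * np.cos(lam * z) / lam**2
        total += F(x1) - F(x0)
    return total

# Compare against a fine-grid quadrature for f(x) = x^2 on [0, 1]
lam = 5 * np.pi
xf = np.linspace(0.0, 1.0, 200001)
ref = trapz(xf**2 * np.cos(lam * xf), xf)
approx = semi_analytic_integral(lambda z: z**2, lam, 0.0, 1.0, M=30)
print(abs(approx - ref))  # small even with only M = 30 subregions
```

The oscillatory eigenfunction is handled exactly within each subregion, so M is governed by the smoothness of f rather than by the oscillation frequency.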

Table 7 Case 4—Inverse problem solution (a) Statistics obtained after discarding the burn-in states. (b) Estimates with 95% confidence intervals obtained with MCMC method. CI width (%): ratio between the 95% confidence interval range and the estimated mean value
Fig. 7 Case 4—Markov chains for each parameter (black), mean (green), 95% confidence interval (red) and exact value (blue)

Fig. 8 Case 4—Posterior distributions

Fig. 9 Convergence of the mean of the approximation error

Fig. 10 Convergence of the trace of the covariance matrix

It should be remembered that this work is aimed at the corresponding inverse problem analysis within the Bayesian framework, exploring the posterior densities of the sought parameters with the Markov Chain Monte Carlo method, which may require thousands of numerical simulations of the direct problem. So, even considering the proposed semi-analytical integration, the computational cost can still be very high. Based on the better convergence characteristics of integrals of eigenfunction expansions [23], one of the contributions of the present work is the proposition of a low-cost solution of the direct problem, based on a reduced truncation order, \(N_R\le N\), of the expansions appearing in the integrals of the nonlinear terms, Eqs. (17)–(18), as follows:

$$\begin{aligned}&{\overline{g}}_{i,1}=\int _0^L\frac{{\widetilde{\psi }}_i(x)e^{-\mu \tau }\sum _{j=0}^{N_R}{\overline{u}}_j(t-\tau ){\widetilde{\psi }}_j(x)}{K+C\sum _{j=0}^{N_R}{\overline{u}}_j(t-\tau ){\widetilde{\psi }}_j(x)}dx \end{aligned}$$
(22)
$$\begin{aligned}&{\overline{g}}_{i,2}=\int _0^L\frac{{\widetilde{\psi }}_i(x)P\left( \sum _{j=0}^{N_R}{\overline{u}}_j(t){\widetilde{\psi }}_j(x)\right) ^2}{K+C\sum _{j=0}^{N_R}{\overline{u}}_j(t-\tau ){\widetilde{\psi }}_j(x)}dx \end{aligned}$$
(23)

This approach can yield a significant reduction of the computational effort associated with the numerical integrations, without considerable precision loss, as we shall demonstrate later.
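The rationale can be illustrated with a toy set of transformed potentials with fast decay (illustrative coefficients and parameter values, not those of Table 1): truncating the inner expansions of an Eq. (23)-type term at a low order \(N_R\) barely changes the resulting integral.

```python
import numpy as np

L, K, C, P = 1.0, 1.0, 0.5, 1.0   # illustrative values, not those of Table 1
x = np.linspace(0.0, L, 20001)

def trapz(y, x):
    """Simple trapezoidal rule (kept explicit for NumPy-version safety)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) * 0.5)

def psi_norm(j):
    """Normalized Neumann eigenfunctions on [0, L]."""
    Nj = L if j == 0 else L / 2.0
    return np.cos(j * np.pi * x / L) / np.sqrt(Nj)

# toy transformed potentials with fast decay, mimicking a converged expansion
ubar = np.array([1.0 / (j + 1) ** 3 for j in range(21)])

def g_bar2(i, NR):
    """Nonlinear source term of the Eq. (23) type, with the inner expansions
    truncated at order NR (fine-grid quadrature used here for simplicity)."""
    u = sum(ubar[j] * psi_norm(j) for j in range(NR + 1))
    return trapz(psi_norm(i) * P * u**2 / (K + C * u), x)

full, reduced = g_bar2(0, 20), g_bar2(0, 2)
print(abs(reduced - full) / abs(full))  # small relative deviation
```

Because the high-order modes carry small coefficients and oscillate against the smooth nonlinear map, their contribution to the integral is marginal, which is what justifies \(N_R\ll N\).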

Inverse Problem Formulation and Solution Methodology

Assume that the premature death rate, \(\mu \), and the reproduction time delay, \(\tau \), are known for the population under investigation, as discussed in Ref. [9]. Hence, in this work the inverse problem is formulated as a problem of estimating the following parameters:

$$\begin{aligned} {\mathbf{P }=\left\{ D,P,K,C,B\right\} }^T \end{aligned}$$
(24)

where we have assumed that all the possible culling sites have the same culling intensity B.

Consider also a set of \(N_D\) experimental data, obtained from transient measurements at one or more distinct locations:

$$\begin{aligned} {\mathbf{Y }=\left\{ Y_1,Y_2,\ldots ,Y_{N_D}\right\} }^T \end{aligned}$$
(25)

In the Bayesian approach, the inverse problem is formulated as a problem of statistical inference based on the following principles [12]:

  1. (i)

    The sought parameters are modeled as random variables;

  2. (ii)

    The randomness describes our degree of information;

  3. (iii)

    The degree of information is coded into probability distributions;

  4. (iv)

    The solution of the inverse problem is the posterior probability distribution.

Thus, in the Bayesian approach all possible information is incorporated into the model, as prior probability density functions, in order to reduce the amount of uncertainty present in the problem. Denoting the prior probability density function as \(\pi \left( \mathbf{P }\right) \), the Bayes’ theorem formulation for inverse problems can be expressed as:

$$\begin{aligned} \pi \left( \mathbf{P }|\mathbf{Y }\right) =\frac{\pi \left( \mathbf{Y }|\mathbf{P }\right) \pi \left( \mathbf{P }\right) }{\pi \left( \mathbf{Y }\right) } \end{aligned}$$
(26)

where \(\pi (\mathbf{P }|\mathbf{Y })\) is the posterior probability distribution, \(\pi \left( \mathbf{Y }\right) \) is the marginal density and \(\pi (\mathbf{Y }|\mathbf{P })\) is the likelihood function.

Note that this method can be implemented in various ways. In this work, the Markov Chain Monte Carlo (MCMC) method is explored. While other methods result in a point estimate for the unknowns, the MCMC method directly explores the posterior probability distribution through its sampling strategy [12, 24].

The implementation of the MCMC method with the Metropolis-Hastings algorithm can be performed using the following steps:

  1. 1.

    Initialize the chain iterations counter \(i=0\), and choose a starting value \(\mathbf{P }^0\).

  2. 2.

    Generate a candidate value \(\mathbf{P }'\) from an arbitrary auxiliary distribution \(q(\mathbf{P }'|\mathbf{P }^i)\). In this work, a random walk implemented with a uniform distribution is considered. Hence:

    $$\begin{aligned} q(\mathbf{P }'|\mathbf{P }^i) = U(\mathbf{P }^i - \delta , \mathbf{P }^i + \delta ) \end{aligned}$$
    (27)

    where U represents a uniform distribution with support \([\mathbf{P }^i - \delta , \mathbf{P }^i + \delta ]\).

  3. 3.

    Calculate the following acceptance probability \(\alpha \), for the chosen candidate:

    $$\begin{aligned} \alpha =\min \left[ 1,\frac{\pi (\mathbf{P }'|\mathbf{Y })q\left( \mathbf{P }^i|\mathbf{P }'\right) }{\pi \left( \mathbf{P }^i|\mathbf{Y }\right) q\left( \mathbf{P }'|\mathbf{P }^i\right) }\right] \end{aligned}$$
    (28)
  4. 4.

    Following the Metropolis-Hastings algorithm, draw a random number \(\Lambda \) from \(\Lambda \sim U(0,1)\), where U is the uniform distribution with support [0,1].

  5. 5.

    If \(\Lambda \le \alpha \) then accept the new value and make \(\mathbf{P }^{i+1}=\mathbf{P }'\).

  6. 6.

    Else reject the candidate, and make \(\mathbf{P }^{i+1}=\mathbf{P }^i\).

  7. 7.

    Increase the counter i to \(i+1\), and go back to step 2.

This procedure is carried out to yield a Markov chain with \(N_{MCMC}\) states, in the form \(\left\{ \mathbf{P }^1,\mathbf{P }^2,\ldots ,\mathbf{P }^{N_{MCMC}}\right\} \). This sequence of generated samples is a representation of the posterior distribution; hence, inference on the sought posterior distribution is obtained from inference on the generated samples. For inference purposes, it is worth noting that the initial states of the Markov chain, generated before the chain converges to equilibrium, must be discarded. These discarded states are commonly referred to as the burn-in period.
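Steps 1–7 above can be sketched for a toy one-parameter problem (an assumed Gaussian target density standing in for \(\pi (\mathbf{P }|\mathbf{Y })\); not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_posterior(p):
    """Toy un-normalized log-posterior: Gaussian with mean 3.0 and std 0.5
    (an assumed stand-in for pi(P|Y); any target density works here)."""
    return -0.5 * ((p - 3.0) / 0.5) ** 2

def metropolis_hastings(log_post, p0, delta, n_states):
    """Random-walk Metropolis-Hastings with the uniform proposal of Eq. (27);
    the symmetric proposal makes q cancel in the acceptance ratio, Eq. (28)."""
    chain = np.empty(n_states)
    chain[0] = p0
    for i in range(n_states - 1):
        cand = rng.uniform(chain[i] - delta, chain[i] + delta)
        # acceptance probability, computed in log space for stability
        alpha = np.exp(min(0.0, log_post(cand) - log_post(chain[i])))
        chain[i + 1] = cand if rng.uniform() <= alpha else chain[i]
    return chain

chain = metropolis_hastings(log_posterior, p0=0.0, delta=1.0, n_states=20000)
burn_in = 2000  # discard states generated before equilibrium
print(chain[burn_in:].mean(), chain[burn_in:].std())  # near 3.0 and 0.5
```

Only ratios of the posterior appear in the acceptance step, so the marginal density \(\pi \left( \mathbf{Y }\right) \) in Eq. (26) never needs to be evaluated.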

Approximation Error Model

In the conventional error model, considering only Gaussian additive errors, the measurements are represented by:

$$\begin{aligned} \mathbf{Y }=U_C(\mathbf{P })+\mathbf{e } \end{aligned}$$
(29)

where \(U_C\) is the solution of the direct problem, taken as the solution that best represents the nature of the problem, and \(\mathbf{e }\) is a vector containing the measurement errors, which in most cases are well represented by a normal distribution with zero mean and standard deviation \(\sigma _e\).

In the Approximation Error Model (AEM) approach, on the other hand, the modeling error is treated as an additional noise added to the conventional error model. The approximation error model can be written as [25]:

$$\begin{aligned} \mathbf{Y }=U_R(\mathbf{P })+[U_C(\mathbf{P })-U_R(\mathbf{P })]+\mathbf{e } \end{aligned}$$
(30)

where \(U_R(P)\) is the solution of the problem considering the reduced model, and one can define:

$$\begin{aligned} \epsilon (\mathbf{P })=U_C(\mathbf{P })-U_R(\mathbf{P }) \end{aligned}$$
(31)

With this definition, the modeling error approach can be written as:

$$\begin{aligned} \mathbf{Y }=U_R(\mathbf{P })+\eta (\mathbf{P }) \end{aligned}$$
(32)

with \(\eta (\mathbf{P })=\epsilon (\mathbf{P })+\mathbf{e }\).

The posterior distribution analysis procedure is the same as in the conventional error model, with only a small change in the likelihood function:

$$\begin{aligned} \pi (\mathbf{Y }|\mathbf{P })=(2\pi )^{-\frac{N_D}{2}}|\mathbf{W }_{*}|^{-\frac{1}{2}}\exp {\left[ -\frac{1}{2}\mathbf{R }^T_{*}{\mathbf{W }_{*}^{-1}\mathbf{R }_{*}}\right] } \end{aligned}$$
(33)

where \(\mathbf{W }_{*}\) is the covariance matrix and \(\mathbf{R }_{*}\) is the residual vector, given by:

$$\begin{aligned} \mathbf{R }_{*}=\mathbf{Y }-U_R(\mathbf{P })-\epsilon _{*}(\mathbf{P }) \end{aligned}$$
(34)

where \(\epsilon _{*}(\mathbf{P })\) is the mean value of \(\epsilon (\mathbf{P })\). To obtain \(\mathbf{W }_{*}\) and \(\epsilon _{*}(\mathbf{P })\), the following procedure was used:

  1. 1.

    \(N_S\) samples are generated for the parameter vector \(\mathbf{P }\).

  2. 2.

    \(U_R\) and \(U_C\) solutions are calculated for each set \(\mathbf{P }\).

  3. 3.

    \(N_S\) samples are calculated for the modeling error \(\epsilon (\mathbf{P })\).

  4. 4.

The mean and the covariance matrix are calculated for \(\epsilon (\mathbf{P })\).
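The four steps above can be sketched with a toy pair of complete/reduced models (the exponential decay and its truncated Taylor expansion are illustrative stand-ins, not the population model):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 5)   # toy measurement times

def U_C(p):
    """'Complete' model: exact exponential decay (toy stand-in)."""
    return np.exp(-p * t)

def U_R(p):
    """'Reduced' model: second-order Taylor expansion of the same decay."""
    return 1.0 - p * t + 0.5 * (p * t) ** 2

# Steps 1-4: sample P, evaluate both models, collect eps = U_C - U_R,
# then estimate its mean (epsilon_*) and covariance matrix
N_S = 15000
P_samples = 0.8 * (1.0 + 0.05 * rng.standard_normal(N_S))  # 5% relative noise
eps = np.array([U_C(p) - U_R(p) for p in P_samples])
eps_mean = eps.mean(axis=0)
W_eps = np.cov(eps, rowvar=False)
print(eps_mean.shape, W_eps.shape)  # (5,) (5, 5)
```

The total covariance \(\mathbf{W }_{*}\) then combines this approximation-error covariance with the measurement-error covariance, e.g. \(\sigma _e^2\mathbf{I }\) for independent Gaussian noise.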

Results and Discussion

Some numerical results are now reported, first for the direct problem solution, considering the solution methodology described in “Direct Problem Formulation and Solution Methodology” section, and next for the inverse problem solution, considering the methodology described in “Inverse Problem Formulation and Solution Methodology” section. For the numerical examples, we consider the parameter values presented in Table 1, which are the same employed in Refs. [7, 9], considering four culling sites, i.e. \(N_{CS} = 4\) in Eq. (1), with equal culling intensities, \(B_1,\ldots ,B_4 = B\), located at \(x_1 = 0.2\), \(x_2 = 0.4\), \(x_3=0.6\) and \(x_4 = 0.8\). For the solution of the direct and inverse problems, the computational implementations were run on a computer equipped with an Intel Core i5-6200U CPU, 2.3 GHz, with 7.85 GB of RAM.

The convergence of the direct problem solution was analyzed with respect to the order of the semi-analytical integration (M), see Eq. (21), and the truncation order of the eigenfunction expansions (N), considering both the complete solution, in which \(N_R = N\) in Eqs. (22) and (23), and the approximated solution, with \(N_R < N\), aiming at computational cost reduction. Besides investigating the convergence behavior of the solutions, the required computational time was also analyzed. Comparing Table 2a, b, one may observe a very good performance of the semi-analytical integration procedure, achieving convergence of two to three significant digits for \(M=30\) in comparison to the more refined case, with \(M=150\), while yielding considerably lower CPU times. Furthermore, the results presented in Table 2 demonstrate good convergence behavior also with respect to the truncation order of the eigenfunction expansions (N), for both the complete and the approximated solutions, with a remarkable CPU time reduction observed for the approximated solution, without significant accuracy loss, keeping the results converged to two to three significant digits. It is worth noting that the least refined solution presented in Table 2, with \(N=20\), \(N_R=1\) and \(M=30\), was achieved in only 0.36 s and keeps the relative error in the order of 3% in comparison with the most refined case presented, with \(N=100\), \(M=150\) and \(N_R=N\), which took 673.35 s to compute. Similar results are presented in Table 3 for the time instant \(t = 125\), with even better results observed for the approximated solution, demonstrating good agreement with the more refined solutions also for increasing time instants.

Table 8 Case 5—Inverse problem solution (a) Statistics obtained after discarding the burn-in states. (b) Estimates with 95% confidence intervals obtained with MCMC method. CI width (%): ratio between the 95% confidence interval range and the estimated mean value
Fig. 11 Case 5—Markov chains for each parameter (black), mean (green), 95% confidence interval (red) and exact value (blue)

Fig. 12 Case 5—Posterior distributions

Table 9 Case 6—Inverse problem solution (a) Statistics obtained after discarding the burn-in states. (b) Estimates with 95% confidence intervals obtained with MCMC method. CI width (%): ratio between the 95% confidence interval range and the estimated mean value
Fig. 13 Case 6—Markov chains for each parameter (black), mean (green), 95% confidence interval (red) and exact value (blue)

Fig. 14 Case 6—Posterior distributions

Fig. 15 Autocorrelation functions of the Markov chains for the parameters C and D in Cases 1 and 3

Fig. 16 Autocorrelation functions of the Markov chains for the parameters C and D in Cases 5 and 6

For the inverse problem solution, the synthetic experimental data were simulated in the following form:

$$\begin{aligned} Y_i=U_i({\mathbf {P}}_{\text {exact}})+e_i,\quad e_i\sim N(0,0.004) \end{aligned}$$
(35)

where \(U_i\) represents the direct problem solution at a given location x and time instant t, using the exact values for the parameters \({\mathbf {P}}\). In order to investigate the influence of the approximated solution on the estimations, two sets of experimental data were considered. The first set was generated considering the approximated solution with \(N=20\) and \(N_R=1\), and the second set was generated using the most refined solution considered, with \(N=N_R=150\). Although the most appropriate test scenario is the use of the complete solution for simulating the experimental data, in order to avoid the so-called inverse crime [12], we also tested the case involving the inverse crime, i.e., the model employed in simulating the experimental data is the same one employed in the inverse problem solution, aiming at evaluating how the quality of the estimates varies when considering the given approximation. In Ref. [9] a thorough sensitivity analysis was performed for this problem, which showed that all parameters but C could be simultaneously estimated without the use of any prior information. Hence, in this work, we considered the use of prior information for this parameter, whereas for the others no prior information is adopted. The prior information for C was modeled by a normal distribution with mean at the exact value of the parameter C, and standard deviation of 5% of the mean value. Nonetheless, the inverse problem solution without prior information for this parameter was also tried here, in order to revisit the conclusions achieved in Ref. [9], now within the Bayesian framework. In summary, the test cases can be separated considering: (i) the nature of the simulated experimental data; and (ii) the use of prior information for the parameter C. In this sense, the test cases considered are shown in Table 4.
For all cases, the direct problem solution within the inverse problem procedure was obtained using the approximated solution with \(N=20\), \(M=30\) subregions in the semi-analytical integration, Eq. (21), and reduced truncation order \(N_R=1\) in Eqs. (22) and (23).

In Case 1 the Markov chain was constructed with a total of 120,000 states, with the first 80,000 states discarded as burn-in. Table 5 summarizes the calculated statistics. One may observe that the absence of prior information, combined with the lack of accuracy of the solution in reproducing the experimental data (the experimental data were simulated with the complete solution in this case), leads to poor estimates. Figure 1 shows that the Markov chains converge to incorrect values, and it should be noticed that the parameter C was estimated at a meaningless value, with a confidence interval that encompasses zero, suggesting an overparametrized model. This result is in agreement with the sensitivity analysis performed in Ref. [9]. Figure 2 consolidates this analysis by presenting the histograms of the posterior distributions of the sought parameters.

In Case 2, 70,000 states were considered for the Markov chains, and the first 20,000 states were discarded as burn-in. The results presented in Table 6 show that the estimates are close to the expected values with low levels of uncertainty, as concluded from the relatively narrow widths of the confidence intervals. It can be seen that, due to the inverse crime (in this case the experimental data were simulated with the approximated solution), the convergence and the quality of the estimates improved, as shown in Figs. 3 and 4, even without prior information for the parameter C. Nonetheless, the estimated confidence interval for C does not encompass the exact value, once again confirming the difficulties associated with the estimation of this parameter, as also demonstrated by the sensitivity analysis performed in Ref. [9].

In Case 3, the Markov chains were constructed with 50,000 states. These results illustrate that, when prior information is considered available for the parameter C, much better estimates are achieved. Nonetheless, the results obtained still lead to some estimates that are relatively far from the expected values for some parameters, even though good convergence of the Markov chains is observed for all parameters, as shown in Figs. 5 and 6. For example, one may observe that the estimated confidence intervals for the parameters D, P and B do not encompass the exact values of these parameters. This is probably explained by the errors associated with the approximated solution. To demonstrate this, Case 4 finally considers the inverse crime together with prior information available for the parameter C. In Case 4 the Markov chain was constructed with 120,000 states, with the first 20,000 states discarded as burn-in. Table 7 summarizes the results obtained. One may observe that, for all parameters, the exact value lies inside the estimated 95% confidence intervals, demonstrating that the errors associated with the approximated solution, even if small, can have a significant influence on the estimated parameters. Figure 7 shows that all the chains present fast convergence, and the posterior density distributions, presented in Fig. 8, approach normal distributions.

Before proceeding to the inverse problem solution with the AEM approach, some considerations must be made about the characterization of the modeling error \(\epsilon \). To generate the samples, the mean values of the parameters obtained in Case 4 were perturbed by random noise with zero mean and standard deviation of 5% of the mean, and each of these perturbed parameter sets was used to generate a sample of the approximation error. For the calculation of the mean and the covariance matrix, 15,000 samples were generated, which proved to be sufficient considering the convergence of the mean and of the trace of the covariance matrix, as can be seen in Figs. 9 and 10.

In Case 5, no prior information was used for the parameter C, and the Markov chains were constructed with 150,000 states. Convergence was achieved and all parameters were estimated with reasonable values, near the exact ones, with the exception of parameter C, as expected due to the sensitivity issues already discussed. These results are summarized in Table 8 and Figs. 11 and 12.

To solve the inverse problem in Case 6, prior information was used for parameter C, with chains of 100,000 states, discarding 30,000 burn-in states for all parameters except for parameter P, for which 40,000 states were discarded, as indicated by the Heidelberger and Welch convergence test [26]. After removing the burn-in states, the statistical quantities were calculated and are presented in Table 9. Figure 13 shows that all the chains converge within few states, and the posterior density distributions, presented in Fig. 14, follow normal distributions. One may observe that the use of the low-cost solution with the AEM approach yielded very good estimates for all parameters considered, within relatively small and reliable confidence intervals. This result confirms the importance of taking the error model into account when employing approximated solutions or surrogate models, especially in nonlinear models, as is the case of most population dynamics formulations.

In order to better observe the influence of the prior information and of the AEM approach on the Markov chains, Figs. 15 and 16 present the autocorrelation functions of the Markov chains obtained for the parameters with lower sensitivity (parameters C and D) [9], for Cases 1 and 3 (Fig. 15) and Cases 5 and 6 (Fig. 16). The autocorrelation functions show that considering prior information for C leads to a significant improvement in the convergence behavior, with a remarkable decrease in the number of states needed to obtain an independent sample, for both parameters analyzed, C and D, even though prior information was adopted only for C.

Concluding Remarks

This work addressed the direct and inverse problem formulation and solution for the diffusive population problem with impulsive culling sites. The direct problem was formulated using the reaction-diffusion equation with logistic birth and mortality rates and impulsive culling. The solution of this problem followed the formalism of the Generalized Integral Transform Technique (GITT), employing a computationally low-cost solution, since the inverse methodology requires numerous evaluations of the direct problem. The inverse problem was formulated within the Bayesian framework, and the Markov Chain Monte Carlo method, implemented with the Metropolis-Hastings algorithm, was employed to sample the posterior probability density function. The approximate solution of the direct problem was used to reduce the computational cost, but the error due to this approximation generated estimates and/or confidence regions incompatible with the values used in the simulation. To handle this issue, the Approximation Error Model (AEM) approach was employed, leading to reliable results, as presented in Case 6. This work also confirmed the sensitivity analysis performed in Ref. [9], now within the Bayesian framework, demonstrating that one of the parameters appearing in the logistic-like growth rate model employed in this formulation yields low sensitivity and is unlikely to be estimated without the use of prior information. It should be highlighted that the methodology here developed, combining a low-cost numerical-analytical solution with the Approximation Error Model approach, can be efficiently employed for the solution of inverse problems within the Bayesian framework, regardless of the sampling methodology considered, including newly developed MCMC methods.