1 Introduction

The growth of the human population and economic development lead to large-scale destruction of natural habitats of many animal and plant species resulting in a significant loss of biodiversity. Protected areas are an important tool for biodiversity conservation because they directly aim at the protection of biodiversity components which present a high degree of risk of extinction. The available resources for implementation and management of protected area systems being obviously limited, it is important to use them optimally. So, many optimization problems and models associated with the design of nature reserves have been proposed in the conservation biology and operations research literature. The reader can refer, for example, to the references [6, 11, 15, 18, 22, 23, 25, 27, 2933, 35, 36]. Many models associated with the determination of optimal nature reserves are based on the assumption that the species that will survive in the long term in a protected site—integrated to the reserve—is exactly known. In reality, this information is not always available. Thus, probabilistic models have been proposed to take account, to some extent, of uncertainty regarding the survival of species in protected sites (see, e.g., [12, 24, 34, 38]). In these models, the survival probabilities of the species in each protected site are assumed to be known. Some authors then propose methods to determine, under different constraints, a reserve that maximizes the expected number of species that will survive in this reserve [2, 4, 5, 28] or to determine a reserve which ensures that each species has a survival probability greater than or equal to a certain threshold [1, 17, 31]. A common feature of all these approaches is that the data are assumed to be perfectly known. It is therefore assumed that there is no uncertainty about their value when in fact there are almost always errors in the determination of such data. Ignoring uncertainty can lead to retain reserves that are far from being interesting if some data errors were proven. We focus in this article to determine a robust reserve, i.e., a reserve for which certain characteristics remain relatively independent of data errors. The reader can refer to the comprehensive article of Moilanen et al. (2006) [24] for an overview of the robustness regarding nature reserves. The goal of their article is to provide terminology and a methodological basis for robust reserve planning. They show how uncertainty analysis for reserve planning can be implemented within a framework of information-gap decision theory. The precise problem we are interested here is the problem mentioned above of determining a reserve which ensures that all species have a survival probability greater than or equal to a predefined threshold. The determination of the survival probabilities in each site is obviously a difficult task, and it is likely that the values retained are not perfectly accurate. We consider that there may be a certain percentage of errors on these probabilities and show how to account for this form of uncertainty on the survival probabilities to define a robust reserve of minimum cost. Specifically, we seek to determine a reserve of minimum cost such that the survival probability of each species in this reserve is greater than or equal to a certain threshold regardless of the value taken by the survival probabilities (in a set of possible values). If we call “scenario” a set of fixed values for the survival probabilities of each species in each site—among the possible values—then the problem is to seek protection against uncertainty in the worst case scenario and therefore in all scenarios. Of course, this protection has a cost that depends on the amount of uncertainty, and we will see later in the article that this cost can be high. We show that the optimal reserve can be determined by a mixed-integer linear program. The main notations used throughout the article are presented in Table 1.

Table 1 Symbols and notation

2 The Basic Probabilistic Problem and its Modeling by a 0-1 Linear Program

We consider a set of potential sites that can be selected to form a nature reserve. To simplify the presentation, we assume that these sites form a grid of square sites (see Fig. 1). Each site is designated by a pair of indices (i, j), and we denote by M = {(i, j) : i = 1, …, m ; j = 1, …, n } the set of sites. Note that the proposed approach would apply without difficulty to any other set of sites. Note also that the sites of the grid are not necessarily all eligible. The problem is to select among potential sites those that will form the reserve and those that will be assigned to other activities such as agricultural, industrial, or urban development. We are interested in a set of N s rare or endangered species present on these sites. Each species will be designated by an index s ∈ {1, …, N s  }, and the set of species under consideration will be denoted by S = {1, …, N s  }. We know for each site the list of species present on this site and for each species, the probability of its long-term survival in this site if it is selected. For each species s, we denote by M s the set of sites where the species s is present, and we denote by p ijs (0 ≤ p ijs  < 1) the probability that species s survives in the site (i, j) if it is selected. Of course, p ijs  = 0 if the species s is not present on the site (i, j). We assume that these probabilities are independent, i.e., that the survival probability of a species in a selected site does not depend on the decisions taken regarding the other sites. With each site, (i, j) is assigned a cost c ij , and we classically define the cost of a reserve as the sum of the costs of the sites that constitute it. We seek to identify a reserve of minimum cost which ensures that all considered species have a certain survival probability. More specifically, we try to select a set of sites to form a reserve such as the survival probability of each species s in this reserve is greater than or equal to a certain threshold value denoted by b s (0 < b s  < 1). Associate with each site of the grid a Boolean variable x ij that is equal to 1 iff the site (i, j) is part of the reserve. Given a reserve—defined by the values of the variables x ij —the survival probability of the species s in this reserve is equal to \( 1-{\displaystyle {\prod}_{\left(i,j\right)\in {M}_s}\left(1-{p}_{ijs}{x}_{ij}\right)} \). The constraint which imposes that this probability is greater than or equal to the predetermined threshold b s can therefore be written as \( 1-{\displaystyle {\prod}_{\left(i,j\right)\in {M}_s}\left(1-{p}_{ijs}{x}_{ij}\right)}\ge {b}_s \). This type of constraint is considered in [24] to illustrate the robustness concept. Using the logarithmic function and putting α ijs  = log(1 − p ijs ) and β s  = log(1 − b s ), this inequality becomes \( {\displaystyle {\sum}_{\left(i,j\right)\in {M}_s}{\alpha}_{ijs}{x}_{ij}}\le {\beta}_s \) [17]. The equivalence between the last two inequalities is true only because the variables x ij are Boolean. Indeed, in that case, log(1 − p ijs x ij ) = x ij  log(1 − p ijs ). Note that α ijs and β s are negative or null. The problem of looking for a reserve of minimal cost, ensuring that each species has a minimum survival probability can therefore be formulated by the 0-1 linear program (PR1).

Fig. 1
figure 1

A landscape represented by a 8 × 8 grid with 10 species, the corresponding survival probabilities and the associated costs in each site

$$ \left(\mathrm{P}\mathrm{R}1\right)\kern0.5em \left\{\begin{array}{l} \min \kern0.62em {\displaystyle \sum_{\left(i,j\right)\in M}{c}_{ij}{x}_{ij}}\\ {}\mathrm{s}.\mathrm{t}.\kern0.37em \left|\kern0.5em \begin{array}{lll}{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\alpha}_{ijs}{x}_{ij}}\le {\beta}_s\hfill & s\in S\hfill & (1.1)\hfill \\ {}{x}_{ij}\in \left\{0,1\right\}\hfill & \left(i,j\right)\in M\hfill & (1.2)\hfill \end{array}\right.\end{array}\right. $$

3 Taking into Account Uncertainty on the Survival Probabilities

Consider now that there is uncertainty regarding the estimation of some survival probabilities p ijs . Recall that for each species s, we denote by M s the set of sites where the species s is present. So, we assume that for all sites of M s , it is possible that there is an error on the retained value of the survival probability of species s that we call nominal value and that we denote by \( {\tilde{p}}_{ijs} \). Thus, the only thing we know with certainty is that the value of the survival probability p ijs belongs to the interval [\( {\tilde{p}}_{ijs}-{\delta}_{ijs},{\tilde{p}}_{ijs}+{\gamma}_{ijs} \)] with δ ijs  ≥ 0, \( {\delta}_{ijs}\le {\tilde{p}}_{ijs} \), γ ijs  ≥ 0, and \( {\tilde{p}}_{ijs}+{\gamma}_{ijs}\le 1 \). Such a definition of uncertainty on the survival probabilities is suggested in [24]. The considered problem then consists in determining an optimal robust reserve, i.e., a reserve of minimum cost in which the survival probability of each species s is greater than or equal to the threshold b s , regardless of the values taken by the uncertain probabilities p ijs in the interval [\( {\tilde{p}}_{ijs}-{\delta}_{ijs},{\tilde{p}}_{ijs}+{\gamma}_{ijs} \)]. As outlined in [24], a reserve that is optimal with respect to the nominal values \( {\tilde{p}}_{ijs} \) is often not robust to uncertainty. Note that in the search for a robust reserve as we have defined it, we can consider that the values of the survival probabilities only belong to the interval [\( {\tilde{p}}_{ijs}-{\delta}_{ijs},{\tilde{p}}_{ijs} \)]. For example, there might be a maximum error of e s  % on the probability \( {\tilde{p}}_{ijs} \), i.e., \( {\delta}_{ijs}={e}_s\;{\tilde{p}}_{ijs}/100 \). If we do not make other assumptions, the optimal robust reserve is obtained by the program (PR1) in which the probabilities p ijs are fixed to their minimum value, \( {\tilde{p}}_{ijs}-{\delta}_{ijs} \), for all species s and for all sites (i, j) of M s . This gives a reserve with maximum robustness [24]. Indeed, whatever the values taken by the survival probabilities in the interval [\( {\tilde{p}}_{ijs}-{\delta}_{ijs},{\tilde{p}}_{ijs} \)] the survival probability of each species s will be greater than or equal to b s . However, this very pessimistic hypothesis can result in retaining an unduly costly reserve. Thus, we will consider, following an idea proposed by Bertsimas and Sim [3] and included in numerous publications on robust optimization, that it is unlikely that there are errors on the values of all uncertain probabilities. In our problem, we assume that, for a given species s, the probabilities p ijs may differ from their nominal value \( {\tilde{p}}_{ijs} \) in at most Γ s sites of M s . For example, Γ s may correspond to a proportion of the number of sites in M s : Γ s  = ⌊r s  |M s |/100⌋ where r s is a constant between 0 and 100. This means that setting r s to 0 is equivalent to considering that there is no uncertainty about the survival probabilities of species s and, on the contrary, setting r s to 100 amounts to considering that all the survival probabilities of species s different from 0 can take the value \( {\tilde{p}}_{ijs}-{\delta}_{ijs} \) instead of their nominal value \( {\tilde{p}}_{ijs} \). The resolution of these two extreme cases comes down to solving a problem without uncertainty about the value of the survival probabilities: when Γ s  = 0, everything happens as if the survival probability of the species s in the site (i, j) was definitely set at \( {\tilde{p}}_{ijs} \) and when Γ s  = M s , everything happens as if this probability was set at \( {\tilde{p}}_{ijs}-{\delta}_{ijs} \). In the intermediate cases (0 < r s  < 100), for each species s, the ν s sites of the retained reserve R corresponding to the highest values of \( {\delta}_{ijs}/\left(1-{\tilde{p}}_{ijs}\right) \) will have in fact a survival probability equal to \( {\tilde{p}}_{ijs}-{\delta}_{ijs} \) with ν s  = min {Γ s , |M s  ∩ R| }. Indeed, using the Boolean variable z ijs that is equal to 1 if the survival probability of the species s in the site (i, j) is equal to \( {\tilde{p}}_{ijs}-{\delta}_{ijs} \) instead of \( {\tilde{p}}_{ijs} \), the problem of minimizing the survival probability of species s in the reserve according to the uncertainty can be formulated as the problem of maximizing its extinction probability, i.e.,

$$ \underset{z_{ijs}\in \left\{0,1\right\}\kern0.24em \left(\left(i,j\right)\in {M}_s\cap R\right)}{ \max}\left\{{\displaystyle \prod_{\left(i,j\right)\in {M}_s\cap R}\kern1em \left(\;1-{\tilde{p}}_{ijs}+{z}_{ijs}\kern0.24em {\delta}_{ijs}\right)}:{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R}{z}_{ijs}}\le {\nu}_s\right\} $$
(1)

We are going to show that the solution of the maximization problem (1) is achieved by setting to 1 the ν s variables corresponding to the ν s largest values of \( {\delta}_{ijs}/\left(1-{\tilde{p}}_{ijs}\right) \). By using the logarithmic function and taking into account that the variable z ijs can only take the values 0 or 1, (1) is equivalent to the following maximization problem:

$$ \underset{z_{ijs}\in \left\{0,1\right\}\kern0.24em \left(\left(i,j\right)\in {M}_s\cap R\right)}{ \max}\left\{{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R} \log \kern0.5em \left(\;1-{\tilde{p}}_{ijs}+{z}_{ijs}\kern0.24em {\delta}_{ijs}\right)}:{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R}{z}_{ijs}}\le {\nu}_s\right\} $$

or

$$ \underset{z_{ijs}\in \left\{0,1\right\}\kern0.24em \left(\left(i,j\right)\in {M}_s\cap R\right)}{ \max}\left\{\begin{array}{l}{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R} \log \left(1-{\tilde{p}}_{ijs}\right)}\\ {}+{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R}\left[\; \log \left(1-{\tilde{p}}_{ijs}+{\delta}_{ijs}\right)- \log \left(1-{\tilde{p}}_{ijs}\right)\right]\;{z}_{ijs}}:{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R}{z}_{ijs}}\le {\nu}_s\end{array}\right\} $$

which is itself equivalent to

$$ \underset{z_{ijs}\in \left\{0,1\right\}\kern0.24em \left(\left(i,j\right)\in {M}_s\cap R\right)}{ \max}\left\{{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R}\Big( \log \left(1+\frac{\delta_{ijs}}{1-{\tilde{p}}_{ijs}}\right){z}_{ijs}}:{\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R}{z}_{ijs}}\le {\nu}_s\right\} $$
(2)

since the constant \( {\displaystyle \sum_{\left(i,j\right)\in {M}_s\cap R} \log \left(1-{\tilde{p}}_{ijs}\right)} \) has no play into the maximization problem. The optimal solution to (2) and thus to (1) is obtained by setting to 1 the ν s variables z ijs corresponding to the ν s largest values of \( {\delta}_{ijs}/\left(1-{\tilde{p}}_{ijs}\right) \).

4 Modeling the Search for an Optimal Robust Reserve by a Mixed-Integer Linear Program

Let us consider a reserve which is defined by the Boolean matrix \( {\overline{x}}_{ij},\kern0.5em \left(i,j\right)\in M \). The site (i, j) belongs to the reserve if and only if \( {\overline{x}}_{ij}=1 \). As mentioned above, denote by z ijs the Boolean variable which equals 1 if the survival probability of the species s in the site (i, j) is equal to \( {\tilde{p}}_{ijs}-{\delta}_{ijs} \) instead of \( {\tilde{p}}_{ijs} \), the extinction probability then being equal to \( 1-\left({\tilde{p}}_{ijs}-{\delta}_{ijs}\right) \). When z ijs  = 0, the survival probability is equal to \( {\tilde{p}}_{ijs} \) and the extinction probability is equal to \( 1-{\tilde{p}}_{ijs} \). To ensure that the extinction probability of each species s in this reserve is less than or equal to the threshold 1 − b s regardless of the values taken by the survival probabilities—in the set of possible values—the following condition must be verified for all s:

$$ \underset{z_{ijs}\in \left\{0,1\right\}\kern0.24em \left(\left(i,j\right)\in {M}_s\right)}{ \max}\left\{{\displaystyle \prod_{\left(i,j\right)\in {M}_s}\kern.2em \left[1-\left({\tilde{p}}_{ijs}-{z}_{ijs}\;{\delta}_{ijs}\right)\;{\overline{x}}_{ij}\right]}:{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{z}_{ijs}}\le {\varGamma}_s\right\}\le 1-{b}_s $$
(3)

Using the logarithmic function, this condition can also be written

$$ \underset{z_{ijs}\in \left\{0,1\right\}\kern0.24em \left(\left(i,j\right)\in {M}_s\right)}{ \max}\left\{{\displaystyle \sum_{\left(i,j\right)\in {M}_s} \log\;\left[1-\left({\tilde{p}}_{ijs}-{z}_{ijs}\;{\delta}_{ijs}\right)\;{\overline{x}}_{ij}\right]}:{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{z}_{ijs}}\le {\varGamma}_s\right\}\le \log \left(1-{b}_s\right) $$
(4)

By rewriting the objective function to be maximized in the constraint (4), we get

$$ \begin{array}{c}\hfill {\displaystyle \sum_{\left(i,j\right)\in {M}_s} \log\;\left[1-\left({\tilde{p}}_{ijs}-{z}_{ijs}\;{\delta}_{ijs}\right){\overline{x}}_{ij}\right]}={\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij} \log \left(1-{\tilde{p}}_{ijs}+{z}_{ijs}\;{\delta}_{ijs}\right)}\left(\mathrm{because}\kern0.5em {\overline{x}}_{ij}\in \left\{0,1\right\}\right)\hfill \\ {}\hfill ={\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij} \log\;\left(1-{\tilde{p}}_{ijs}\right)}\kern1em +{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij}{z}_{ijs}\left[\; \log \left(1-{\tilde{p}}_{ijs}+{\delta}_{ijs}\right)- \log\;\left(1-{\tilde{p}}_{ijs}\right)\right]}\left(\mathrm{because}\;{z}_{ijs}\in \left\{0,1\right\}\right)\hfill \\ {}\hfill ={\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij} \log\;\left(1-{\tilde{p}}_{ijs}\right)}+{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij}{z}_{ijs}{\varDelta}_{ijs}}\;\mathrm{with}\kern0.5em {\varDelta}_{ijs}= \log \left(1-{\tilde{p}}_{ijs}+{\delta}_{ijs}\right)- \log \left(1-{\tilde{p}}_{ijs}\right)\hfill \end{array} $$
(5)

Finally, we can express the fact that the extinction probability of each species s—in the reserve defined by \( \overline{x} \)—is less than or equal to the threshold, whatever the values taken by the survival probabilities, by the following inequality:

$$ {\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij} \log\;\left(1-{\tilde{p}}_{ijs}\right)}+\underset{z_{ijs}\in \left\{0,1\right\}\kern0.24em \left(\left(i,j\right)\in {M}_s\right)}{ \max}\left\{{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij}{z}_{ijs}{\varDelta}_{ijs}}:{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{z}_{ijs}}\le {\varGamma}_s\right\}\le \log \left(1-{b}_s\right) $$
(6)

In the maximization problem appearing in the constraint (6), the integrality constraints z ijs  ∈ {0, 1} can be relaxed, i.e., replaced by 0 ≤ z ijs  ≤ 1. In fact, a solution of this maximization problem (with z ijs  ∈ {0, 1} or with 0 ≤ z ijs  ≤ 1) is to fix to 1 the ν s variables z ijs corresponding to the ν s largest values of \( {\overline{x}}_{ij}{\varDelta}_{ijs} \) with \( {\nu}_s= \min\;\left\{{\varGamma}_s,\;\left|{M}_s\cap \left\{\left(i,j\right):{\overline{x}}_{ij}=1\right\}\;\right|\;\right\} \). The maximization problem in (6) then becomes a continuous linear program. As we have just seen, it admits a finite optimal solution. Its dual therefore also admits a finite optimal solution and is written as

$$ \underset{\lambda_s\ge 0,\;{\mu}_{ijs}\ge 0\kern0.24em \left(\left(i,j\right)\in {M}_s\right)}{ \min}\kern0.24em \left\{{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\mu}_{ijs}}+{\varGamma}_s{\lambda}_s:{\lambda}_s+{\mu}_{ijs}\ge {\varDelta}_{ijs}{\overline{x}}_{ij}\;\left(\left(i,j\right)\in {M}_s\right)\right\} $$
(7)

where λ s is the nonnegative dual variable associated with the constraint \( {\displaystyle \sum_{\left(i,j\right)\in {M}_s}{z}_{ijs}}\le {\varGamma}_s \) and μ ijs is the nonnegative dual variable associated with the constraint z ijs  ≤ 1. Since by duality, the optimum value of the maximization problem appearing in (6) is equal to the optimum value of the minimization problem (7), the constraint (6) can be rewritten as

$$ \begin{array}{l}{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\overline{x}}_{ij} \log\;\left(1-{\tilde{p}}_{ijs}\right)}+\\ {}\underset{\lambda_s\ge 0,{\mu}_{ijs}\ge 0\kern0.24em \left(i,j\right)\in {M}_s}{ \min}\kern0.24em \left\{{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\mu}_{ijs}}+{\varGamma}_s{\lambda}_s:{\lambda}_s+{\mu}_{ijs}\ge {\varDelta}_{ijs}{\overline{x}}_{ij}\kern0.62em \left(\left(i,j\right)\in {M}_s\right)\right\}\le \log \left(1-{b}_s\right)\kern0.5em \end{array} $$

We can now write the mixed-integer linear program (PR2) corresponding to the search for a robust optimal reserve:

$$ \left(\mathrm{P}\mathrm{R}2\right)\left\{\begin{array}{l} \min \kern0.5em {\displaystyle \sum_{\left(i,j\right)\in M}{c}_{ij}{x}_{ij}}\\ {}\mathrm{s}.\mathrm{t}.\ \left|\kern0.5em \begin{array}{lll}{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{x}_{ij} \log\;\left(1-{\tilde{p}}_{ijs}\right)}+{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\mu}_{ijs}}+{\varGamma}_s{\lambda}_s\le \log \left(1-{b}_s\right)\hfill & s\in S\hfill & (2.1)\hfill \\ {}{\lambda}_s+{\mu}_{ijs}\ge {\varDelta}_{ijs}{x}_{ij}\hfill & \left(i,j\right)\in {M}_s,s\in S\hfill & (2.2)\hfill \\ {}{\mu}_{ijs}\ge 0\hfill & \left(i,j\right)\in {M}_s,s\in S\hfill & (2.3)\hfill \\ {}{\lambda}_s\ge 0\hfill & s\in S\hfill & (2.4)\hfill \\ {}{x}_{ij}\in \left\{0,1\right\}\hfill & \left(i,j\right)\in M\hfill & (2.5)\hfill \end{array}\right.\end{array}\right. $$

Let \( \overline{x} \) be an optimal solution of (PR2). The optimal reserve includes the sites (i, j) such that \( {\overline{x}}_{ij}=1 \), its cost is equal to \( {\displaystyle \sum_{\left(i,j\right)\in M}{c}_{ij}{\overline{x}}_{ij}} \), and the worst scenario is defined by the worst survival probabilities in the ν s sites of the reserve corresponding to the ν s maximum values of \( {\delta}_{ijs}/\left(1-{\tilde{p}}_{ijs}\right) \) with \( {\nu}_s= \min\;\left\{{\varGamma}_s,\;\left|{M}_s\cap \left\{\left(i,j\right):{\overline{x}}_{ij}=1\right\}\;\right|\right\} \). In the particular case where \( {\varGamma}_s=\left\lfloor \frac{r_s\;\left|{M}_s\right|}{100}\right\rfloor \) for all s and \( {\delta}_{ijs}=\frac{e_s}{100}{\tilde{p}}_{ijs} \) for all (i, j) and for all s, the program (PR2) becomes

$$ \left(\mathrm{P}\mathrm{R}2\hbox{'}\right)\kern0.5em \left\{\begin{array}{l} \min \kern0.5em {\displaystyle \sum_{\left(i,j\right)\in M}{c}_{ij}{x}_{ij}}\\ {}\mathrm{s}.\mathrm{t}.\ \left|\kern0.5em \begin{array}{lll}{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{x}_{ij} \log\;\left(1-{\tilde{p}}_{ijs}\right)}+{\displaystyle \sum_{\left(i,j\right)\in {M}_s}{\mu}_{ijs}}+\left\lfloor \frac{r_s\;\left|{M}_s\right|}{100}\right\rfloor {\lambda}_s\le \log \left(1-{b}_s\right)\hfill & s\in S\hfill & \left(2.1\hbox{'}\right)\hfill \\ {}{\mu}_{ijs}+{\lambda}_s\ge \left\{ \log \left[1-{\tilde{p}}_{ijs}\left(1-\frac{e_s}{100}\right)\right]- \log \left(1-{\tilde{p}}_{ijs}\right)\right\}\;{x}_{ij}\hfill & \left(i,j\right)\in {M}_s,s\in S\hfill & \left(2.2\hbox{'}\right)\hfill \\ {}{\mu}_{ijs}\ge 0\hfill & \left(i,j\right)\in {M}_s,s\in S\hfill & (2.3)\hfill \\ {}{\lambda}_s\ge 0\hfill & s\in S\hfill & (2.4)\hfill \\ {}{x}_{ij}\in \left\{0,1\right\}\hfill & \left(i,j\right)\in M\hfill & (2.5)\hfill \end{array}\right.\end{array}\right. $$

5 Example

We illustrate in this section the application of the method to an hypothetical landscape represented by a 8 × 8 grid with 10 species. The data are presented in Fig. 1. For each site (i, j), the list of species present on the site and the associated nominal values of the survival probabilities \( {\tilde{p}}_{ijs} \) are indicated. The cost of each site is shown in the corresponding cell in the lower right. To facilitate the review of this example, we give in Table 2 the sets M s . We seek to determine a reserve which provides a survival probability for each species equal to 0.8 and then to 0.9, that is to say, b s  = b = 0.8 and then b s  = b = 0.9 for all s. The level of uncertainty Γ s is defined by a percentage r s of the number of sites where the species s is present, i.e., Γ s  = ⌊r s |M s |/100⌋ where ⌊y⌋ is the integer part of y. For example, the species 3 being present in eight sites on 64, if r 3 = 30, then the maximum number of sites where the survival probability of this species may differ from its nominal value is equal to Γ 3 = ⌊30 × 8/100⌋ = 2. In these experiments, we supposed that r s  = r for all s and we considered four different values of r: r ∈ {0, 20, 30, 100}. We also supposed that \( {\delta}_{ijs}={e}_s\;{\tilde{p}}_{ijs}/100 \) with e s  = e for all s, and we considered two values of e: e ∈ {10, 20}.

Table 2 List of the sets M s corresponding to the landscape of Fig. 1

The mathematical program (PR2) has been implemented by using the modeling language AMPL [16] and solved by the mixed-integer linear solver CPLEX 12.5.0 [9]. The experiments have been carried out on a PC with an Intel Core i-5 2.60 GHz processor and 8 GB of RAM. The obtained results are shown in Table 3. Some reserves corresponding to instances of Table 3 are shown in Fig. 2.

Table 3 Results for the landscape of Fig. 1 when the survival probability imposed for each species, b, is equal to 0.8 or 0.9 and for different values of e and r
Fig. 2
figure 2

Optimal robust reserves for some instances of Table 3

Details of the results corresponding to Fig. 2b are shown in Table 4. Taking into account the value of r, there can be no error as regards the species 1, 6, and 10, and there may be errors in only one site as regards other species. Let R* be the set of sites selected to form the optimal robust reserve, \( {\overset{\sim }{P}}_s\left({R}^{\ast}\right) \) the survival probability of the species s in this reserve with the nominal values of the survival probabilities in each site, and P s (R *) the survival probability of the species s, in the reserve, in the worst case scenario. Now, consider the instance where the threshold value for the survival probability of each species s is defined again by b = 0.8 but without taking into account the uncertainty on the survival probabilities in each site. This produces the optimal reserve presented in Fig. 2a and costing 40. The survival probabilities of each species in this reserve, computed from the nominal values of the survival probabilities, are given in Table 5.

Table 4 Details of the results corresponding to Fig. 2b
Table 5 Survival probabilities of each species in the reserve of Fig. 2a when there is no uncertainty on p ijs

Always consider the reserve of Fig. 2a but now take into account the uncertainty on the survival probabilities in each site when e = 20 and r = 20. This means that in up to ⌊20|M s | /100⌋ sites, the true values of the survival probabilities of the species s are equal to only 80% of their nominal value (see values in bold in Table 6). In this context, the worst case scenario corresponds to the survival probabilities of each species in the reserve that are given in Table 6. As expected all these probabilities are less than or equal to those of Table 5, but the survival probabilities of the species 2, 4, 5, 7, 8 and 9 fall below the threshold of 0.8 (see shaded lines in Table 6). The considered reserve is therefore not at all robust. Recall that, in this context of uncertainty, the optimal robust solution is given by the reserve of Fig. 2c whose cost is equal to 50. So, we can say that for this hypothetical landscape, the protection against the uncertainty, defined by e = 20 and r = 20, increases the cost of the optimal reserve of 25 %.

Table 6 Survival probabilities of each species in the reserve of Fig. 2a in the worst case scenario when e = 20 and r = 20

Now, consider the following problem which is suggested in [24] where it is called “the robustness question”: how wrong can \( {\tilde{p}}_{ijs} \) be, without jeopardizing the required performance of a given reserve? Here, we consider a slightly different problem: how wrong can \( {\tilde{p}}_{ijs} \) be without jeopardizing the required performance of any reserve? We will see that the cost of the optimal robust reserve increases with uncertainty over \( {\tilde{p}}_{ijs} \) until there is no longer a robust reserve satisfying the required performance. Consider again the landscape described in Fig. 1 with b = 0.9 and r = 30 and find out the greatest value of e for which there exists a robust reserve that is to say a reserve that ensures a survival probability in the reserve of all species greater than or equal to b in the worst case scenario. One can readily formulate this problem by replacing in (PR2′) the objective function to minimize by the expression e to maximize (in this case e becomes a variable). However, the resulting optimization problem is very difficult because the variable e appears in the nonlinear term, \( \log \left[1-{\tilde{p}}_{ijs}\left(1-\frac{e}{100}\right)\right]\;{x}_{ij} \), of the constraint (2.2). Another way to determine this limit value of e is to solve (PR2′), gradually increasing the value of e until there are no more feasible reserves. In this case, the survival probability in all reserves of at least one species falls below b in the worst case scenario. Table 7 shows the results obtained by this approach. Clearly, in this case, there are feasible robust reserves when e ≤ 12 but there are no more feasible robust reserves when e ≥ 13.

Table 7 Cost of the optimal robust reserve for the landscape of Fig. 1 when b = 0.9, r = 30 and for different values of e

Similarly, it may be interesting, when b and e are fixed to determine the highest value of r for which there exists a robust reserve. Here too, the mathematical problem is difficult since it consists to solve (PR2′) with the objective “min r” instead of “\( \max {\displaystyle \sum_{\left(i,j\right)\in M}{c}_{ij}{x}_{ij}} \).” When e and b are fixed, all the constraints of (PR2′) are linear except the constraint (2.1′) which contains the quadratic term s . As previously, we determine this limit value of r by solving (PR2′) and gradually increasing the value of r until there are no more feasible reserves. In this case, the survival probability in all reserves of at least one species falls below b in the worst case scenario. Table 8 shows the results obtained by this approach when b = 0.9 and e = 20. Clearly, in this case, there are feasible robust reserves when r ≤ 24, but there are no more feasible robust reserves when r ≥ 25.

Table 8 Cost of the optimal robust reserve for the landscape of Fig. 1 when b = 0.9, e = 20 and for different values of r

6 Experiments on Larger Landscapes

We have studied the behavior of the resolution of (PR2) on many instances, and we present in this section the corresponding experimental results. In this study, the landscapes are defined by four parameters: the number of sites, the cost of protecting each site, the number of species, and the survival probabilities of each species in each site. In these experiments, we have assumed that 50 species were present in the landscapes, and we have considered two possibilities for each of the other three parameters thus obtaining eight types of landscapes. We have considered landscapes represented by 15 × 15 − and 20 × 20 − matrices and protection costs randomly generated between 1 and 10 and also between 1 and 20. Finally, in a first case, each species appears randomly in 5 % of the sites and in these sites, its survival probability is drawn at random from the set { 0.5, 0.6, 0.7, 0.8 }; in a second case, each species appears randomly in 4 % of the sites and in these sites, its survival probability is drawn at random from the set { 0.7, 0.8 }. The characteristics of these eight types of landscapes are summarized in Table 9.

Table 9 Description of the eight types of landscapes considered in the experiments

For each of the eight landscape types, we have considered five different landscapes by modifying the seed of the random number generator thus obtaining 40 different landscapes. As in the experiments of “Section 5”, Γ s  = ⌊r | M s |/100⌋, b s  = b, and \( {\delta}_{ijs}=e\;{\tilde{p}}_{ijs}/100 \). For each of the 40 landscapes, we have studied the resulting solutions for the following values of the different parameters: b ∈ {0.85, 090, 0.95}, e ∈ {10, 20}, and r ∈ {20, 30, 100}. For each of the 40 landscapes, we have also studied the case without uncertainty, i.e., we have solved (PR2) with b ∈ {0.85, 090, 0.95} and r = 0. We have thus solved a total of 840 instances of (PR2). Recall that when r = 100, everything happens as if there were no uncertainty, provided that \( {\tilde{p}}_{ijs} \) is replaced by \( {\tilde{p}}_{ijs}-{\delta}_{ijs} \). All instances were resolved quickly within a few seconds of computation, except some instances corresponding to landscapes of size 20 × 20 and a minimum survival probability of 0.95. In this case, the resolution of several instances requested a few hundred seconds, the more difficult instances corresponding to e = 20 and r = 20 or 30. All instances corresponding to landscapes of size 20 × 20 admit a feasible solution. Regarding the landscapes of size 15 × 15, there is no feasible solution for one seed of the random number generator for b = 0.95 when each species appears randomly in 4 % of the sites with a survival probability drawn at random from the set { 0.7, 0.8 } (landscapes of types 2 and 4). Other instances for which there is no feasible solution are presented in Table 10.

Table 10 Instances with no solution

The experiments have shown that the cost of protection against uncertainty is often relatively high. Figures 3, 4, and 5 summarize the average increase (in percentage) in the cost of an optimal robust reserve based on the uncertainty with respect to the cost of an optimal reserve without uncertainty, for all instances of size 20 × 20 (landscapes of types 5, 6, 7, and 8). The average increase is calculated on five instances. For example, consider the landscapes of type 8 when the minimum survival probability, b, is equal to 0.90 (Fig. 4). The results obtained for r = 30 and e = 10 are given in Table 11. In this case, the average increase in the cost for protection against uncertainty is about 55 %.

Fig. 3
figure 3

Average percentage increase in costs for the four types of landscapes of size 20 × 20 when b = 0.85

Fig. 4
figure 4

Average percentage increase in costs for the four types of landscapes of size 20 × 20 when b = 0.90

Fig. 5
figure 5

Average percentage increase in costs for the four types of landscapes of size 20 × 20 when b = 0.95

Table 11 Cost of the optimal robust reserve for landscapes of type 8 when r = 30, e = 10, and b = 0.90

Let us now look at Fig. 3, which corresponds to a minimum survival probability of 0.85. We see in this figure that there is no increase in the cost of an optimal robust reserve for landscapes of types 6 and 8 when e = 10 and r = 30 or r = 100. However, for landscapes of types 5 and 7 and for the same parameter values, the average cost increases of about 20 %. The largest increase (about 57 %) occurs for landscapes of type 8 when e = 20 and r = 100. Note that for the four types of landscapes, the average increase of the cost is very similar when e = 20 and r = 30 or e = 20 and r = 100. We can also notice in Figs. 4 and 5 that the curves representing the increase in the average cost basis of the uncertainty are very similar for the four types of landscapes. Finally, the largest average increase in the cost occurs when the minimum survival probability, b, is equal to 0.90 (Fig. 4). Indeed, the average increase in cost for the four types of landscapes, when uncertainty is defined by e = 20 and r = 100, is between approximately 61 and 77 %.

7 Taking into Account Additional Constraints

The fact of being able to formulate the search for an optimal robust reserve by a mixed-integer linear program, that is to say by the program (PR2), provides an easy way to take into account additional constraints on the selection of sites such as, for example, spatial constraints. This can be done by introducing into the program (PR2) some linear constraints—or some constraints which can be linearized without too much difficulty—on the variables x ij . Of course, the mathematical program obtained will be more or less easy to solve depending on the additional constraints that one wishes to consider. For example, to avoid getting too fragmented reserves (see, e.g., [7, 8, 10, 13, 14, 19, 21, 37, 39]), we can search for compact reserves. Many measures have been proposed to evaluate the compactness of a reserve. Here, we have chosen to measure it by the perimeter-to-area ratio (see, e.g., [20, 26]). So, we only consider reserves with perimeter-to-area ratio less than or equal to a certain value ρ. The area of the reserve, A(x), is simply expressed by the linear function a(i,j) ∈ M x ij and its perimeter by the quadratic function \( \varPi (x)=4l{\displaystyle \sum_{\left(i,j\right)\in M}{x}_{ij}}-2l{\displaystyle \sum_{\left(i,j\right)\in M,\;j\ne n}{x}_{ij}{x}_{ij+1}}-2l{\displaystyle \sum_{\left(i,j\right)\in M,\kern0.24em i\ne m}{x}_{ij}{x}_{i+1,j}} \) where a is the area of a site and l is the side length [39]. Here, we assume that the side length, l, of a site is equal to one unit length and therefore that its area, a, is equal to one unit area. The compactness constraint is therefore expressed by the inequality Π(x) ≤ ρ A(x) where ρ is a fixed parameter. The problem of looking for a robust and compact reserve of minimal cost, ensuring that each species has a minimum survival probability, can therefore be formulated by the mixed-integer linear program (PR3) obtained from (PR2) by adding the compactness constraint (3.1).

$$ \left(\mathrm{P}\mathrm{R}3\right)\left\{\begin{array}{l} \min \kern0.5em {\displaystyle \sum_{\left(i,j\right)\in M}{c}_{ij}{x}_{ij}}\\ {}\mathrm{s}.\mathrm{t}.\ \left|\kern0.5em \begin{array}{ll}(2.1),(2.2),\left(2,3\right),\left(2,4\right),\left(2,5\right)\hfill & \hfill \\ {}4{\displaystyle \sum_{\left(i,j\right)\in M}{x}_{ij}}-2{\displaystyle \sum_{\begin{array}{l}\left(i,j\right)\in M\\ {}\kern0.74em j\ne n\end{array}}{x}_{ij}{x}_{ij+1}}-2{\displaystyle \sum_{\begin{array}{l}\left(i,j\right)\in M\\ {}\kern0.74em i\ne m\end{array}}{x}_{ij}{x}_{i+1,j}}\le \rho {\displaystyle \sum_{\left(i,j\right)\in M}{x}_{ij}}\hfill & (3.1)\hfill \end{array}\right.\end{array}\right. $$

Note that a classical technique provides easily a linearization of the constraint (3.1). In effect, it is sufficient to substitute the variable w 1 ij to the product x ij x ij + 1, the variable w 2 ij to the product x ij x i + 1,j and to add the supplementary constraints w 1 ij  ≤ x ij , w 1 ij  ≤ x ij + 1, w 2 ij  ≤ x ij , and w 2 ij  ≤ x i + 1,j . We will call (PR4) the program (PR3) thus linearized. We have taken up the landscape described in Fig. 1 (Section 5) when the different parameter values are set as follows: b = 0.9, e = 10, r = 20. In this case, the optimal robust reserve costs 77 (Table 3) and is described by Fig. 2e. We have solved this instance again, but this time considering the compactness constraint (3.1), for three values of the compactness parameter, ρ = 1, ρ = 2, and ρ = 3. The optimal reserves obtained cost 159, 90, and 77, respectively, and are described in Fig. 6. The cost of the optimal robust reserve is heavily dependent on the compactness constraint. Figure 6 shows that this constraint also has great influence on the structure of the reserve.

Fig. 6
figure 6

Optimal robust reserves for one instance of Table 3 when ρ = 1, 2, and 3

We also studied the resolution of (PR4) for the four types of landscapes 1, 2, 3, and 4, defined at the beginning of “Section 6”, and when b = 0.9. The results relating to the cost of uncertainty when the compactness constraint is defined by ρ = 2 are shown in Fig. 7, and the results when there is no compactness constraint are shown in Fig. 8.

Fig. 7
figure 7

Average percentage increase in costs for the four types of landscapes of size 15 × 15 when b = 0.90 and ρ = 2

Fig. 8
figure 8

Average percentage increase in costs for the four types of landscapes of size 15 × 15 when b = 0.90 and without compactness constraint

Figures 7 and 8 correspond to the average results of 100 instances (four types of landscapes, five values that define the uncertainty, and five values for the seed of the random number generator). Without the compactness constraint, the average cost of the optimal robust reserve is equal to about 218 while with the compactness constraint, that cost becomes equal to about 274. The considered compactness constraint therefore increases the cost of the optimal robust reserve by about 26 %. We can see that the shape of the four curves when there is a compactness constraint (Fig. 7) does not differ greatly from the shape of the four curves when there is no compactness constraint (Fig. 8). As might be expected, the compactness constraint makes the problem more difficult. Indeed, without this constraint, the average time to resolve an instance (resolution of PR2) is about 2.5 s while with the constraint (resolution of PR4) that time becomes equal to about 104 s.

8 Conclusions

We considered the problem of defining a nature reserve to protect a set of predetermined species. For each species and for each site, we know the survival probability of this species in this site and we assume that these probabilities are independent. The aim is to determine a reserve providing to each species a survival probability greater than or equal to a certain threshold value. This problem has already been studied in the literature of conservation biology, but in this study, we assume, realistically, there may be errors more or less important in estimating the different survival probabilities. However, we limit the number of possible errors for each species by a parameter eventually depending on the species. We show how to determine a robust reserve of minimum cost, i.e., a reserve which ensures that the survival probability of each species exceeds a certain threshold regardless the errors made in the estimation of the survival probabilities in each site. We show that this optimal reserve can be determined by solving a mixed-integer linear program. The approach is therefore relatively simple to implement since it only requires the use of a standard mathematical programming solver. It also has the advantage that the model can be easily modified which is generally not the case when constructing a specific algorithm. For example, we show how to design a compact and robust optimal reserve by simply introducing an additional constraint in the previous mixed-integer linear program. We illustrate in detail the approach on a small hypothetical landscape. The resulting reserves are substantially different from that which would be produced if the possible errors on the survival probabilities were ignored. Finally, we present and analyze the results achieved by applying the proposed approach to many large hypothetical landscapes.