1 Introduction

Water resources play a critical role in human society and healthy ecosystems. Rapid population growth with the requirements of higher living standards and excessive water pollution due to agricultural and industrial expansions have led to an increasing demand for water and caused intense social and environmental conflicts world wide (Kerachian and Karamouz 2007). In order to restore sustainable development, possible alternatives to deal with the increasing water demand have been developed, such as developing new water resources projects (supply management) and increasing overall management efficiency (Khare et al. 2007). Optimal allocation of water resources in river basins is an important objective of water resource development projects all over the world (Abolpour et al. 2007) and may also be one of the most effective water management alternatives to deal with increasing water demand and inadequate surface supplies (Khare et al. 2007; Maqsood et al. 2005).

A basic problem in optimal allocation of water resources is that different interest groups have different objectives which may be conflicting and incommensurable (Chen et al. 2007). The optimization problem becomes more complicated as the number of interaction factors, such as upstream–downstream impacts and environmental changes, becomes large. The interactions factors are also often nonlinear and thus difficult to predict (Vink and Schot 2002). Traditionally, multi-objective optimization problems are often solved using the weighting method or the ε-constraint method (Carlos and Peter 1995; Cohon and Marks 1975; Nemhauser et al. 1989). The weighting method produces only a single Pareto optimum for each run of the optimization process and is highly sensitive to the weight vector used in the scaling process. In order to provide a group of Pareto optimal solutions, the Pareto optimal set is obtained by varying the weight associated with each objective and solving the problem sequentially. In the constrained method, one objective is left out in the formation of the constraint set. The objectives included in the constraint set are varied from the lower bound to the upper bound in order to trace out the Pareto frontier (Louie et al. 1984). In recent years, many heuristic and metaheuristic algorithms such as evolutionary algorithms (EA), evolutionary strategies (ES) (Coello Coello et al. 2007), ant colony optimization (ACO) and particle swarm optimization (PSO) have been proposed and successfully applied to solve multi-objective problems. PISA project (http://www.tik.ee.ethz.ch/pisa/) provides some ready-to-use modules of multi-objective optimization for application engineer and optimizers. Although such algorithms may not always guarantee the global optimum solution, they provide quite good results in an acceptable computation time (Kumar and Reddy 2006). While it has been shown that these biologically inspired heuristics offer better performance over classical optimization approaches (the weighting method and the constrained method) in multi-objective optimization problems, they still suffer from some limitations such as premature convergence and poor exploitation abilities (Tan et al. 2008). So there is a constant need to improve existing algorithms.

The human immune system (HIS) is a highly developed, parallel and distributed adaptive system. A relatively new optimization algorithm, namely the immune algorithm (IA), has been proposed by imitating the HIS defense against invaders and is becoming increasingly popular in tackling various optimization challenges (Du et al. 2005; Garrett 2004; De Castro and Von Zuben 2002; Köster et al. 2003). Zhang (2007) developed a new dynamic immune optimization algorithm for constrained nonlinear multi-objective optimization problems, and Vrugt and Robinson (2007) conducted a simulation of multiple optimization algorithms using the concepts of global information sharing and genetically adaptive offspring creation. Chen and Mahfouf (2006) proposed a novel population adaptive-based immune algorithm (PAIA) using clonal selection and immune network theories for multi-objective optimization problems.

The macro-evolution algorithm (MA) is inspired by the dynamics of species extinction and diversification at large time scales. It has been found to be successful for a wide variety of optimization tasks (Marin and Solé 1999). MA has the advantages of reaching higher fitness values and higher probability of success in reaching good fitness values in comparison with genetic algorithms (GA) with tournament selection. In this article, we propose the so-called macro-evolutionary multi-objective immune algorithm (MEMOIA) by incorporating the features of MA, clonal selection and immune memory to perform a global optimal search, where the clonal selection based on the diversity in the evolving population is used for solution exploitation, an entropy-based density assessment scheme (EDAS) is used to distribute non-dominated individuals uniformly along the discovered Pareto-frontier, and MA is employed to preserve the diversity of individuals and form part of the pool solution.

The article is organized as follows. Section 2 describes the proposed algorithm. Its performance in solving an experimental multi-objective function is assessed in Sect. 3, followed by an application to a water resources allocation problem with multi-objective water management requirements in the Dongjiang River basin of southern China, with consideration of social, economic and environmental issues. The results and discussions of the application are presented in Sect. 5. Conclusions are given in Sect. 6.

2 Proposed MEMOIA

2.1 Principles and theories

With a view to attaining global multi-objective optimal solutions with fast convergence, diversity, and spread of solution sets, we proposed a MEMOIA based on macro-evolutionary theories and features of immune pattern recognition and learning through the mechanism of clonal selection and immune memory for multi-objective problems. For an immune system, the effectiveness of the immune response to secondary encounter of antigens is enhanced by the presence of memory cells associated with the first infection, which is capable of producing high affinity antibodies (Abs) after subsequent encounters. Therefore, instead of starting from scratch every time, an intrinsic scheme involving a reinforcement learning strategy is adopted to ensure a fast and accurate immune response after subsequent infection. Thus, antibodies with high affinities are cloned and stored in the local memory (Wong et al. 2009). The archive diversity of MEMOIA is maintained by the proposed entropy-based density assessment technique while evolving population diversity is maintained by a new clonal selection scheme. Archive diversity ensures that a representative set of non-dominated solutions is stored, while evolving population diversity is crucial to the discovery of a diverse, well-distributed and near-optimal solution set (Tan et al. 2008). The representation of a MEMOIA for the biological immune system is given in Table 1. The antigenic stimulus and the antibodies define the multi-objective problem to be solved and the potential solutions to the problem, respectively. Intuitively, the affinity of the antibodies is associated with the extent it solves the problem defined in terms of Pareto dominance. Immune memory is implemented in the form of a fixed-size archive and memory cells are represented by non-dominated antibodies (Tan et al. 2008). In order to promote antibody (Ab) diversity and to facilitate the exchange of good information among antibodies, antibodies are subjected to macro-evolutionary operators, which use a connectivity matrix W to compare the fitness values and similarities of all the antibodies in one generation dynamically (Marin and Solé 1999; Zhang and Xu 2005).

Table 1 Mappings of the biological immune system and MEMOIA

The proposed hybrid algorithm tends to overcome shortcomings of individual algorithms. The hybridization has been used in the literature and many researchers have proven that the hybrid approaches in solving optimal problems are highly efficient (Heinonen and Pettersson 2007; Zhang et al. 2008). We employ the MEMOIA, which combines the macro-evolutionary algorithm and immune learning of artificial immune systems (AIS), here to resolve the multi-objective optimization problems.

2.2 Algorithmic flow of MEMOIA

The computational procedure of our MEMOIA is as follows.

  • Step 1 Define the antigen

  • According to Table 1, the objective function and constraints are represented by antigens.

  • Step 2 Initialize population

  • The initial population of antibodies is generated via uniform random coding. Binary and real number coding are the two most popular coding techniques. The real number coding is used for the problems presented in this article. For decision variables \( x_{i},\quad i=1,\ldots , n, \) defined on intervals x i ∈ [a i , b i ], the real number coding of x i , can be expressed as:

    $$ x_{i} = a_{i} + \beta \cdot \left( {b_{i} - a_{i} } \right) $$
    (1)

    where β is within [0, 1]. An antibody \(V_{i}=\{x_{1}, \ldots , x_{j}\}\) is represented as a vector of real numbers, where j is the number of elements in V i .

  • Step 3 Evaluate antibodies

  • Deb (2001) proposed an algorithm to compare two solutions and determine whether one has dominated the other. The scheme will be described in Sect. 2.3.

  • Step 4 Generate memory set/Update archive

  • After all antibodies are evaluated, antibodies are selected or updated as memory cells. The archive is updated at each cycle, and if the candidate solution is not dominated by any member in the archive, it will be added to the archive. Likewise, any memory cells dominated by this solution will be removed from the archive. When the predetermined archive size is reached, a recurrent truncation process based on EDAS is used to remove the most ineffective archive members. The EDAS will be described in Sect. 2.4. The rationale of eliminating memory cells based on EDAS is to maintain a set of uniformly distributed memory cells in the archive.

  • Step 5 Clonal selection

  • After the archiving process, appropriate antibodies are selected and cloned into the mating pool which has the same size as the evolving population. There are actually two stages in the selection process. The first stage involves the cloning of memory cells from the archive by the proposed clonal selection scheme which will be described in Sect. 2.5. These cloned memory cells will form only part of the mating pool solution. As both the clonal selection and macro-evolutionary algorithm are employed, in order to maintain the diverse of solutions, the effects of the both algorithms should be considered. Here only 50% of them, that is 0.5·narchive, are used. To fulfill the rest, we will use the macro-evolutionary algorithm which will be described in Sect. 2.6.

  • Step 6 Crossover/Mutation

  • Genetic operations, such as crossover and mutation, can enhance the immune algorithm in producing solutions and perturbing selected solutions to avoid local optima (Chen and You 2005). The crossover operation generates new antibodies by mixing genetic material in the chromosomes of the original antibodies in the mating pool. For the variables with real number coding (as in this study), arithmetical crossover is applied to interpolate at selected crossover points between the values of two elements instead of exchanging them. This approach can maintain elements of newly generated vectors within the original domain, and can be expressed as (Chu et al. 2008; Michalewicz 1992):

    $$ \begin{gathered} x_{u}^{t + 1} = c \cdot x_{v}^{t} + \left( {1 - c} \right) \cdot x_{u}^{t} , \hfill \\ x_{v}^{t + 1} = c \cdot x_{u}^{t} + \left( {1 - c} \right) \cdot x_{v}^{t} , \hfill \\ \end{gathered} $$
    (2)

    where \( x_{u}^{t} \) and \( x_{v}^{t} \) are the two decision variables to be crossed, \( x_{u}^{t + 1} \) and \( x_{v}^{t + 1} \) are the newly generated variables, and c is a constant between 0 and 1.

  • Mutation randomly changes antibody elements, introducing diversities such that the algorithm does not stick at local optima. For an antibody \(V_{i}=\{x_{1}, \ldots , x_{m}, \ldots , x_{n}\},\) each decision variable, x m (1 ≤ m ≤ n) has the same mutation probability. Let \( V^{\prime}_{i} = \left( {x_{1} , \ldots ,x^{\prime}_{m} , \ldots ,x_{n} } \right) \) be the antibody V i mutated; the mutated element \( x^{\prime}_{m} \) can be defined as (Chu et al. 2008; Östermark 1999):

    $$ x^{\prime}_{m} = \left\{ \begin{gathered} x_{m} + \Updelta (t,b_{m} - x_{m} )\quad {\text{if}}\,rr = 0 ,\hfill \\ x_{m} - \Updelta (t,x_{m} - a_{m} )\quad {\text{if}}\,rr = 1, \hfill \\ \end{gathered} \right. $$
    (3)

    and

    $$ \Updelta \left( {t,y} \right) = y \cdot \left( {1 - \gamma^{{\left( {1 - \frac{t}{T}} \right)^{b} }} } \right) $$
    (4)

    where γ is a random number uniformly distributed within [0, 1], rr is a random zero–one digit, t is the evaluation number, T is the maximal evaluation number, and b is a system parameter determining the degree of dependency on evaluation number (we have used b = 2 in this article). The flowchart of MEMOIA is shown in Fig. 1.

    Fig. 1
    figure 1

    The flowchart of MEMOIA

2.3 Constrained dominance scheme

In order to evaluate antibodies that satisfy the various constraints, a constrained non-dominance scheme is proposed by Deb (2001) In particular, an antibody F a dominates an antibody F b if any one of the following conditions is true:

  1. (1)

    F a is feasible and F b is infeasible.

  2. (2)

    Both solutions are infeasible and F a has few constraint violations than F b .

  3. (3)

    Both solutions are infeasible, both solutions have the same number of constraint violations and F a dominates F b in terms of the objective values.

  4. (4)

    Both solutions are feasible and F a dominates F b according to the objective values.

2.4 EDAS

The EDAS, which is motivated by the immune system’s ability to maintain a regulated repertoire of lymphocytes that is representative of the actual antigenic environment, is proposed to maintain a uniformly distributed set of non-dominated solutions (Tan et al. 2008). In order to maximize the information conveyed by the memory cells about the problem at hand, the concept of entropy or information gain is applied to quantify the information contributed by the each memory cell to the archive.

According to the entropy optimization principles of Shannon (Kapur and Kesavan 1992), considering the archive \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{X} \) \(( {{\text{where}}\,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{X} = \{ {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{x}_{i} \left| { \, i = 1,2, \ldots ,narchive} \right.} \},\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {x}_{i} \in [ {lowbd_{j} ,uppbd_{j} } ]} ) \) as a statistical population containing narchive non-dominated antibodies or memory cells in an m-dimensional feature space, and the probability value of \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{X} \) is denoted by P \( \left( {{\text{where}}\,P = \left\{ {p(x_{i} )\left| {i = 1,2, \ldots ,narchive} \right.} \right\},0 \le p\left( {x_{i} } \right) \le 1,\,{\text{and}}\,\sum\nolimits_{i = 1}^{narchive} {p\left( {x_{i} } \right) = 1} } \right), \) the entropy is mathematically defined

$$ H\left( {x_{i} } \right) = - p\left( {x_{i} } \right)\log \left( {p\left( {x_{i} } \right)} \right). $$
(5)

The entropy is defined in terms of probability density in Eq. 5. Nonparametric approaches do not need the assumption that the density belongs to a mixture from a parametric set (Mukherjee and Vapnik 2000). The most popular approach is Parzen’s method, in which the probability can be estimated by Eq. 6 based on the observed antibodies.

$$ P\left( x \right) = \frac{1}{narchive}\sum\limits_{i = 1}^{narchive} {K\left( {x - x_{i} } \right)} $$
(6)

where K(x − x i ) is a smooth function. The multivariate Gaussian with covariance matrix ∑ and one free parameter σ (the width) is used and it can be described by

$$ K\left( {x - x_{i} } \right) = \frac{1}{{\left( {2\pi } \right)^{m/2} \left| \sum \right|^{1/2} }}\exp \left\{ { - \frac{1}{2}\left( {x - x_{i} } \right)^{T} \left( \sum \right)^{ - 1} \left( {x - x_{i} } \right)} \right\} $$
(7)

Here, T is the transpose operator, m is the number of objectives and the kernel width is defined by σ (σ = ∑1/2). The Parzen window is sensitive to the kernel width σ setting, therefore, σ is adapted along the optimization process,

$$ \sigma_{j} = \frac{{uppbd_{j} - lowbd_{j} }}{narchive}, $$
(8)

where uppbd j and lowbd j denote the maximum and minimum values along the jth dimension of the feature space found in the archive, respectively.

Note that an antibody with a higher H will have a better contribution to the overall information content of the archive. When the memory cells are uniformly distributed along the Pareto frontier, H is maximized. In particular, a new memory cell with a higher H will replace the most ineffective memory cell with the least H when the predetermined archive size is reached. This will allow the archive to retain a set of uniformly distributed memory cells in the archive (Tan et al. 2008).

2.5 Clonal selection

The clonal selection principle (Khilwani et al. 2008) is one of the inspiring methodologies employed in AIS for optimization problems. Based on the clonal selection principle, the selection of non-dominated solutions or memory cells from the archive can result in fast convergence speeds. In order to provide a better indication of the less explored regions in the search space, a clonal selection scheme whose selection and clonal rate of memory cells are based on the degree of their representation (dr) is developed as following steps (Tan et al. 2008).

  • Step 1 Determine representative memory cells

  • Since we are only interested in guiding search towards better and less explored regions in the search space, only a subset of the evolving population denoted as \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{S^{ * } }} \) is considered, with \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{y_{k} }} \in \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{S^{ * } }} \) being only dominated by memory cells. An antibody, \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{y_{k} }} \in \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{S^{ * } }} \) is representative of a particular memory cell \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{x_{i} }} \) along the jth dimension if

    $$ y_{k,j} \in R_{i,j} \quad {\text{iff}}\,\left\| {x_{i,j} ,y_{k,j} } \right\| \le \lambda_{i,j} $$
    (9)

    where λ i,j is the range of similarity and R i,j denotes the set of antibodies that are representative of \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{x_{i} }} \) along the jth dimension. For simplicity, λ i,j is adapted along the optimization process as 0.15 (uppbd j  − lowbd j ) where uppbd j and lowbd j denotes the maximum and minimum values along the jth dimension of the objective space found in the archive, respectively (see Tan et al. 2008).

  • Step 2 Calculate the degree of representation (dr) of each memory cell \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{x_{i} }} \) in \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{S^{ * } }} \) by

    $$ dr\left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{x}_{i} } \right) = \sum\limits_{j = 1}^{m} {\mathop {\min }\limits_{{y_{k} \in S^{ * } }} \left\| {x_{i,j} ,y_{k,j} } \right\|} \cdot E\left( {x_{i,j} ,y_{k,j} } \right) $$
    (10)
    $$ E\left( {x_{i,j} ,y_{k,j} } \right) = \left\{ \begin{aligned} & 1\quad {\text{if}}\,y_{k,j} \in R_{i,j} \hfill \\ & 0\quad {\text{otherwise}} \hfill \\ \end{aligned} \right. $$
    (11)
  • Step 3 Perform cloning

  • A higher value of dr implies a lower degree of similarity between the memory cell and its representatives and, hence denotes a lower level of representation in the evolving population. To explore and exploit the less populated regions, memory cells with higher dr will be assigned higher cloning rates. After ranking the memory cells in terms of dr, the actual distribution of the clonal rates across the different ranks shown in the following pseudocode (Fig. 2) is empirically determined according to exhaustive experimentation (Tan et al. 2008).

    Fig. 2
    figure 2

    Pseudocode of clonal selection

2.6 Macro-evolutionary algorithm

As mentioned in Sect. 2.2, the macro-evolutionary algorithm is used to fill up the mating pool. Let 0.5·narchive be the number of evolutionary memory cells. The relationship between these evolutionary memory cells is represented by a connectivity matrix W, in which each item \({\user2{W}}_{{i,j}} \left( t \right)\left( {i,j \in \left\{ {{\text{1}},{\text{2}}, \ldots ,0.{\text{5}} \cdot narchive} \right\}} \right)\) measures the influence of memory cells j on memory cells i at generation t with a continuous value. The algorithm is described below.

  1. (1)

    Connection matrix:

    Each individual gathers information about the rest of the population through the strength and sign of its couplings W i,j as

    $$ W_{i,j} = \frac{{f\left( {x_{i} } \right) - f\left( {x_{j} } \right)}}{{dis\left( {x_{i} ,x_{j} } \right)}},$$
    (12)

    where x i are the input parameters of the ith individual, f (x i ) are the objective values of x i , and dis(x i , x j ) is the Euclidean distance between x i and x j .

    The relation of each memory cells to the rest of the population determines its survival coefficient h defined as

    $$ h_{i} \left( t \right) = \sum\limits_{j = 1}^{0.5 \cdot narchive} {W_{i,j} \left( t \right)} $$
    (13)

    Individuals with higher h i inputs will be favored in the sense of being able to out-compete other less-fit solutions.

  2. (2)

    Selection operator:

    Selection operator is used to determine surviving individuals based on their relations as a sum of penalties and benefits. The state of a memory cell in the next generation is updated synchronously as

    $$ S_{i} \left( {t + 1} \right) = \left\{ \begin{aligned} & 1\quad {\text{if}}\,h_{i} \left( t \right) \ge 0,\quad {\text{alive}} \hfill \\ & 0\quad {\text{otherwise,}}\quad {\text{extinct}} \hfill \\ \end{aligned} \right. $$
    (14)
  3. (3)

    Colonization operator:

    The colonization operator allows the filling of vacant sites that are freed by extinct individuals (that is, those with S i  = 0). This operator is applied to each extinct individual in two ways. With a probability τ, a totally new solution x new will be generated. Otherwise, exploitation of surviving solutions takes place through colonization. For a given extinct solution x i and a surviving solutions x b , the extinct solution will be “attracted” toward x b . Mathematically, a possible (but not unique) choice for this colonization of extinct solutions can be expressed as

    $$ x_{i} \left( {t + 1} \right) = \left\{ \begin{aligned} & x_{b} \left( t \right) + \rho \cdot \lambda \left[ {x_{b} \left( t \right) - x_{i} \left( t \right)} \right]\quad {\text{if}}\,\xi > \tau \hfill \\ & x_{\text{new}} \quad {\text{if}}\,\xi \le \tau \hfill \\ \end{aligned} \right. $$
    (15)

    where ξ ∈ [0,1] is a random number, λ ∈ [−1,1] (both with uniform distribution), and ρ and τ are given constants of the algorithm. It can be seen that ρ describes a maximum radius around surviving solutions and τ acts as a “temperature”. Parameter τ can be set in the same as in simulated annealing and be described by the linear relation

    $$ \tau \left( {t,G} \right) = 1 - \frac{t}{T}. $$
    (16)

    MA randomly chooses only one objective to optimize at each generation and therefore, cannot guarantee the convergence of Pareto-optimal solutions during the optimization procedure (Vink and Schot 2002). However, many of the memory cells are diversified and non-inferior, and only a few solutions need to be deleted at a time. We can use a simple method, which is based on Edgeworth-Pareto optimality (Edgeworth 1881; Pareto 1896) and takes only O(0.25·narchive 2) computing time, to discard the inferior solutions. Therefore, MA is a very promising method for solving engineering multi-objective optimization problems.

3 Evaluation of our MEMOIA using a test problem

3.1 Performance measures

In order to quantitatively assess the performance of a multi-objective optimization algorithm, the following three metrics were adopted from references (Bosman and Thierens 2003; Scott 1995; Zitzler et al. 2000).

  1. (1)

    Reverse Generational Distance (RGD)

    For the purpose of estimating the departure of the elements in the Pareto front (PF) produced by the proposed algorithm from the true Pareto front of the problem with consideration of diversity, Bosman and Thierens (Bosman and Thierens 2003) proposed the metric RGD to compute the distance \( \widetilde{{d_{j} }} \)to the closest solution in the PF known set for each j solution in the PF true , based on the Generational Distance (GD) metric introduced by Van Veldhuizen and Lamont (1999). The RGD is defined as

    $$ RGD = \frac{1}{{n_{true} }}\sqrt {\sum\limits_{j = 1}^{{n_{true} }} {\widetilde{{d_{j} }}^{2} } } $$
    (17)

    where n true is the cardinality (number of elements) of the PF true set. A smaller value of RGD implies better convergence.

  2. (2)

    Spacing (S)

    The metric of spacing (S) is first introduced by Scott (1995) and measures how well the solutions throughout the PF known are distributed. Its mathematical definition is

    $$ S = \sqrt {\frac{1}{n - 1}\sum\limits_{i = 1}^{n} {\left( {\overline{d} - d_{i} } \right)^{2} } } , $$
    (18)

    where \( d_{i} = \mathop {\min }\limits_{j} \sum\nolimits_{k = 1}^{m} {\left| {f_{k}^{i} \left( x \right) - f_{k}^{j} \left( x \right)} \right|} \) and \( \overline{d} = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {d_{i} } \). A value of zero for metric S indicates that all the solutions on the PF known are equally spaced from one another.

  3. (3)

    Error Ratio (ER)

    Veldhuizen and Lamont (1999) presented the metric of the Error Ratio (ER) measuring the number of non-dominated vectors of the PF known that are not members of the PF true . In order to normalize the metric, ER is represented as

    $$ ER = \frac{{\sum\nolimits_{i = 1}^{n} {e_{i} } }}{n}, $$
    (19)

    where n is the number of members in the discovered Pareto frontier PF known , e i  = 1 if solution i is not on the PF true , e i  = 0 otherwise.

3.2 A test problem

Veldhuizen (1999) cited a number of test problems that have been used in the past to test multi-objective algorithms. We present only a Viennet benchmark problem (Viennet et al. 1996) to evaluate the performance of MEMOIA, which is characterized by a objective space, a discontinuous Pareto optimal set, and several local minima in the objective functions, as follows:

$$ \begin{gathered} F = {\text{Minimize}}\left\{ {f_{1} \left( {x,y} \right),f_{2} \left( {x,y} \right),f_{3} \left( {x,y} \right)} \right\} \hfill \\ \left\{ \begin{gathered} f_{1} (x,y) = \frac{1}{2}(x^{2} + y^{2} ) + \sin (x^{2} + y^{2} ) \hfill \\ f_{2} (x,y) = \frac{{(3x - 2y + 4)^{2} }}{8} + \frac{{(x - y + 1)^{2} }}{27} + 15,\quad {\text{With}}\,x,y \in \left[ { - 3, \, 3} \right]. \hfill \\ f_{3} (x,y) = \frac{1}{{x^{2} + y^{2} + 1}} - 1.1\exp ( - x^{2} - y^{2} ) \hfill \\ \end{gathered} \right. \hfill \\ \end{gathered} $$
(20)

The proposed MEMOIA algorithm is compared to The Non-dominated Sorting Genetic Algorithm (NSGAII) and Improving the Strength Pareto Evolutionary Algorithm (SPEA2) (Zitzler et al. 2002). Thirty independent runs of the three algorithms are performed for the test problem in order to obtain the statistical information. The various parameter settings for each algorithm are listed in Table 2. The true Pareto frontier (PF true ) of the Viennet benchmark problem and the Pareto frontier produced by MEMOIA are shown in Figs. 3 and 4 plots their Pareto optimal solutions. It can be seen that MEMOIA has a very good representation of the PF true . The values of the three metrics for each algorithm, presented in Table 3, also proved the comparative effectiveness of MEMOIA for the Viennet benchmark problem. And the values generated by using t test technique (Coello Coello et al. 2007) are smaller than the largest acceptable one with significance level α = 0.01 (the last row in Table 3). It means that the differences between the MEMOIA and SPEA2, the MEMOIA and NSGAII are very small with not statistically significant. Based on the proven efficiency of MA (Vink and Schot 2002; Marin and Solé 1999; Zhang and Xu 2005), we use MA to improve the EMOIA (Evolutionary Multi-Objective Immune Algorithm) whose effectiveness has also been validated by Tan et al. (2008) in MEMOIA. Specifically, the EDAS in EMOIA and MA have the ability of maintaining the diverse solution set and avoiding premature convergence, which are the most fundamental requirements of multi-objective techniques. So MEMOIA is a very promising approach to engineering optimization problems which are often complex and time-consuming such as water allocation in river basin.

Table 2 Parameter setting for MEMOIA, SPEA2 and NSGAII
Fig. 3
figure 3

The true Pareto front (left) and how it is depicted by MEMOIA (right)

Fig. 4
figure 4

The true Pareto optimal solutions (left) and theirs produced by MEMOIA (right)

Table 3 Results of the metrics for the Viennet benchmark problem

4 Description of the case study for Dongjiang River

Water allocation problems are very complex, involving social, economic, environmental, and political factors (Taghi Sattari et al. 2009). Ideally, the water allocation should be economically efficient, technically feasible as well as socially fair (Yang and Yang 2010; Yin et al. 2010). Consequently, a water allocation model often has to consider multiple objectives. In this section, the MEMOIA is applied to the water allocation model in order to maximize the economic interests including hydropower generation while at the same time to minimize water shortages and the amount of polluted water. The results reveal the trade-offs amongst the economic interests, water shortages and the amount of polluted water.

4.1 System description

The Dongjiang River basin (Fig. 5) is located in southern China, between 113°52′ and 115°52′ longitude east and 22°38′ and 25°14′ latitude north, respectively. The river basin has a total drainage area of 35,340 km2 and covers seven administrative cities, namely, Meizhou, Shaoguan, Heyuan, Huizhou, Guangzhou, Dongguan and Shenzhen in Guangdong province, southern China. The Dongjiang River flows from northeast to southwest and discharges into the Pearl River estuary. The annual rainfall is about 1,500–2,400 mm, 80% of which is concentrated in the rainy seasons (Apr–Oct). The population in the entire river basin was 22 million in 2005 (from Guangdong statistical yearbook 2006). Along the river there are three large reservoirs with a total storage capacity of 17.06 billion m3 (13.90 of which are in the Xinfengjiang reservoir alone). These are the Xinfengjiang, Fengshuba and Baipenzhu reservoirs, each of which serves as a hydroelectric power station, as well as being the main structures for management of water flow and salinity and distribution points to serve water needs, through controlling main and side inflows and return flow, and water intake for communal needs. Figure 6 shows a schematic of the water system network.

Fig. 5
figure 5

Location of Dongjiang River basin in China

Fig. 6
figure 6

System description and river network of the Dongjiang River basin

4.2 Water allocation model

To bring the water allocation problem down to a manageable size, Dongjiang River basin is divided into seven operational areas according to administrative boundaries. In each operational area, water supply agents take water from the river and/or reservoirs. Therefore, it is necessary to integrate river and reservoirs for water allocation (Fig. 7). If there is enough water in the river for every consumer sector, the reservoir operation will be standard (i.e., keeping water level at the upper limit to store as much water as possible). If the water in the river is not enough for the total demand or, in an extreme scenario, for the in-stream ecological water demand, the reservoir operations should first take water supply into consideration at the basin level. In summary, the operations of reservoirs can be described as follows:

Fig. 7
figure 7

Conceptual framework of optimal allocation of water resources

  1. (1)

    When the water level in a reservoir is above the upper limit, hydropower generation should be increased to keep the water level at the upper limit.

  2. (2)

    When the water level is between the upper and critical limit, the reservoir operation should take the economic interests including hydropower generation, water shortages and the amount of polluted water into consideration. This may be possible through optimal water allocation.

According to the functional nature of water utilization, water use sectors in the seven operational areas in the basin are divided into four groups: industrial water, agricultural water, domestic water, and ecological water (Tang 1995). Previously, and in the conceptual framework of optimal water allocation in Fig. 7, the water allocation model distributes water among these four sectors by considering the following three objectives: to maximize economic interest (OF 1), to minimize water shortages (OF 2) and to minimize the amount of organic pollutants in water (OF 3). The allocation model is formulated for monthly operation; details of the objective functions are provided below.

4.2.1 Objective functions

The objective functions are OF 1, OF 2 and OF 3 as follows:

  1. (i)

    To maximize economic interest (OF 1): this objective function is defined as the total economic value, including hydropower generation of the whole basin. It can be estimated as

    $$ OF_{1} \left( X \right) = \sum\limits_{t = 1}^{12} {\left\{ {\left[ {\sum\limits_{i = 1}^{7} {\sum\limits_{j = 1}^{4} {\left( {x_{ij}^{t} * NER_{ij} } \right)} } } \right] + \left[ {\sum\limits_{k = 1}^{3} {\left( {KA_{k} * RP_{{^{k} }}^{t} * H_{k}^{t} } \right) * P} r} \right]} \right\}} $$
    (21)

    Here, OF 1 is the objective function 1; NER ij is the net economic return per unit volume of water from sector j in the ith operational zone (US$/m3); x ij is the decision variable which means the water allocated to sector j in the ith operational zone; 4 is the number of sectors; 7 is the number of operational zones; KA k is the power coefficient for reservoir k; RP k is the amount of water released to turbines of the kth reservoir; H k is the average head available of the kth reservoir during period t and is expressed as a non-linear function of the average storage during that period (shown in Eq. 22); Pr is the price of power (US$/kWh); 3 is the number of reservoirs in the Dongjiang River basin.

    $$ H_{k}^{t} = \left( {H_{k}^{t} } \right)^{ * } - \alpha_{k} * \left( {RP_{k}^{t} } \right)^{2} $$
    (22)

    \( \left( {H_{k}^{t} } \right)^{ * } \) is the water height for reservoir k at time period t (measured from turbine level) and will be obtained as a function of the average storage volume (at the beginning and end of the time interval, shown in Eq. 23) in this article; α k is the average head loss coefficient for reservoir k;

    $$ \left( {H_{t} } \right)^{ * } = H\left( {\frac{{S_{t} + S_{t + 1} }}{2}} \right) $$
    (23)
  2. (ii)

    To minimize water shortages (OF 2): To ensure equal sharing of water resources in every operational zone, the shortage of water supply can be defined as

    $$ OF_{2} \left( X \right) = \sum\limits_{t = 1}^{12} {\left\{ {\frac{1}{7}\sum\limits_{i = 1}^{7} {\left[ {\frac{{\sum\nolimits_{j = 1}^{4} {\left( {DW_{ij}^{t} - x_{ij}^{t} } \right)} }}{{NPO_{i} }}} \right]} } \right\}} , $$
    (24)

    where OF 2 is the objective function 2; DW ij is the water demand of sector j in the ith operational zone (determined externally); NPO i is the number of population in the ith operational zone.

  3. (iii)

    To minimize the amount of organic pollutants in water (OF 3): the objective function 3 is defined as the total amount of organic pollutants (represented as chemical oxygen demand, COD) in water of the whole basin and is given as

    $$ OF_{3} \left( X \right) = \sum\limits_{t = 1}^{12} {\sum\limits_{i = 1}^{7} {\sum\limits_{j = 1}^{4} {\left( {x_{ij}^{t} * PP_{ij} } \right),} } } $$
    (25)

    where OF 3 is objective function 3; PP ij is the parameter which can be determined using the Eq. 26.

    $$ PP_{ij} = \frac{{PW_{ij} }}{{SW_{ij} }}, $$
    (26)

    where PW ij is the amount of releasing organic pollutants and SW ij is the volume of supplied water from sector j in the ith operational zone in a typical year.

4.2.2 System constraints

The water balance of each node, and in each reservoir in the whole basin, is considered in the system constraints.

  1. (i)

    Water balance equation in the ith operational zone

    $$ Q_{i,t} = Q_{i - 1,t} + \sum\limits_{k = 1}^{3} {RR_{k,i} * \left( {O_{k,t} + RP_{k}^{t} } \right)} + R_{i,t} + \sum\limits_{j = 1}^{4} {\left( {x_{i,j} * ret_{i,j} } \right)} - U_{i,t} - \sum\limits_{j = 1}^{4} {\left( {x_{i,j} } \right)} - TW_{i,t} , $$
    (27)

    where Q i,t is the discharge of the ith operational zone in the tth month; Q i-1,t is the discharge from the (i − 1)th zone; O k,t is the release during time period t from the kth reservoir; RR k,i is the water connection between the ith operational zone and the kth reservoir; R i,t is the sum of the water yield produced in the watersheds located up to the ith zone; ret i,j is the return flow coefficient (dimensionless), 0 ≤ ret i,j  ≤ 1; U i,t is the water loss including evaporation, seepage loss and conveyances loss; TW i,t is the amount of water transferred out of the basin.

  2. (ii)

    Environmental and ecological water requirement of river system

    Some water flows should be retained in the rivers to maintain ecological and environmental conditions. That is,

    $$ Q_{i,t} \ge DEE_{i,t} $$
    (28)

    where DEE i,t is the environmental and ecological water requirement of river system which can be calculated with Tennant method (Tennant 1976), IFIM (instream flow incremental methodology) (Reiser et al. 1989), flow duration exceedance percentiles (Smakhtin 2001), the wetted perimeter method (Gippel and Stewardson 1998), among others.

  3. (iii)

    Reservoir constraints

    1. (a)

      The continuity equation for the reservoirs can be written as the following:

      $$ S_{t + 1} = S_{t} + I_{t} - O_{t} - RP^{t} - E_{t} $$
      (29)
    2. (b)

      Storage constraints based on physical limits

      $$ S_{\min ,t} \le S_{t} \le S_{\max ,t} $$
      (30)
    3. (c)

      Constraints on the power generation

      $$ CPOW_{\min } \le \left( {KA * RP * H} \right) \le CPOW_{\max } $$
      (31)
    4. (d)

      Constraints on the releases for energy production

      $$ RP_{\min } \le RP^{t} \le RP_{\max } , $$
      (32)

      where S t is the beginning storage; S t+1 is the ending storage; I t is the inflow into reservoir during time period t; O t is the spilled water from the reservoir during time period t; E t is the evaporation loss during time period t; S min,t is the minimum storage; S max,t is the maximum storage; CPOW min and CPOW max are the minimum power generation and the installed power generation capacity for the reservoir; RP min and RP max are the minimum and maximum releases for the reservoir (Table 4); obviously RP max is related to the installed power generation capacity CPOW while RP min is related to the minimum required release downstream determined by Eq. 27.

  4. (iv)

    Water demand and non-negativity constraints

    $$ 0 \le x_{i,j}^{t} \le DW_{ij} $$
    (33)
    Table 4 The basic characteristics of the reservoir and power plant in water system

4.3 Input data

Data required to run the water allocation model include the available water resource, the reservoir and power plant properties, the operational rules of reservoirs, water loss of reservoirs, and water demand. These data types are mixed, including experimental, field observations, statistical, and empirically estimated.

The population and the volume of water demand in every operation zone, according to the Dongjiang River water allocation scheme plan for 2010, are shown in Table 5. The historic inflow data from 1956 to 2005 in all operational zones and reservoirs were available. The environmental and ecological water requirements of river system at Boluo, Heyuan and Qilinzui gauging stations were determined by the average value from the results of Tennant, IFIM, flow duration exceedance percentiles and the wetted perimeter methods (shown in Table 6).

Table 5 Population and water demand by sectors in Dongjiang River basin
Table 6 The environmental and ecological water requirement of river system DEE it (m3/s)

The net economic return (NER) per water unit from agriculture or industry was calculated as water use benefit divided by the volume of water demand. As domestic water use is the most important sector, its NER was taken as the maximum of NERs of agriculture and industry. The NER from ecology sector was difficult to quantify. In our proposed model, the environmental and ecological water requirement of river system needs to be satisfied first before the other water allocations. Following some researches and water resources administrators works (see, for example, Babel et al. 2005; Liu et al. 2010), we use the average of all other sectors as the NER from ecology sector in this article for the illustration (shown in Table 7).

Table 7 The net economic return per unit volume of water NER ij (US$/m3)

In the implementation of MEMOIA for our water allocation model, narchive was set to be 150 and ρ 0.5.

5 Results and discussions

Figure 8 presents the trade-off curves among the economic interest (OF 1), water shortage (OF 2) and the amount of COD (OF 3) objectives obtained with MEMOIA. The Pareto frontier was populated with Pareto optimal points which were obtained after 5,000 generations. Table 8 listed the minimum or maximum values on the Pareto frontier for objectives and the hydropower generation from every reservoir. The maximum economic interest was approximately US $401.54 billion when water shortages were 80.21 m3/capita year and the amount of COD were 13.09 × 104 tons per year (Scenario A). The minimum water shortages value was approximately 70.9 m3/capita year when the economic interest was US $400.37 billion and the amount of COD were 13.02 × 104 tons/year (Scenario B). But when the minimum amount of COD were approximately 9.0 × 104 tons/year, the maximum economic interest was not more than US $376 billion and water shortages was more than 110 m3/capita year (Scenario C). The results of water allocation of each scenario from MEMOIA were shown in Table 9. From the water allocation results, we can see that Huizhou would lose the most quantity water in every Scenario. However, if the quantity of water demand in every zone was considered, the biggest proportional water shortage would happen in Heyuan no matter which Scenario was adopted. From the aspects of sectoral water demand, agriculture sector in Huizhou lose the maximum water quantity in Scenario A, B and C. The reason might be that the agriculture sectors demand more water in these two zones, and the NER in agriculture sectors was lower.

Fig. 8
figure 8

The Pareto frontier for the three-objective water allocation model from MEMOIA

Table 8 Maximum and minimum values (with the significance of bold) on the Pareto frontier for each objective in water allocation model from MEMOIA and CMOIA
Table 9 Allocated water in every sector for Scenario A, B and C from MEMOIA (million m3)

In the problem of water optimal allocation, weighting technique or simultaneous compromise constraint technique is often used to convert the multi-objective decision-making problem into a single objective function (Babel et al. 2005), but it produces only a single Pareto optimum. In order to obtain a group of Pareto optimal solutions and compare with the results from MEMOIA, the constrained multi-objective optimization immune algorithm (CMOIA) proposed by Zhang (2007) is employed to solve the water allocation model. And the results, which are signed with the superscript and parenthesis, list in Table 8. MEMOIA takes more time at the same 5,000 generations, but the performance of MEMOIA is better than CMOIA’s within the reasonable amount of time.

In general, the less the water shortage is, the more the economic interest will be. However, the hydropower generation controlled by reservoirs affects the water shortage and economic interest. If more water is used to produce hydropower in flood periods, there could be a water shortage in dry seasons. Moreover, the net economic return per unit volume of water from each sector is different. Furthermore, the amounts of COD (OF 3) are affected by water supply. If the water shortage is low, a large amount of COD will be discharged into environment. Therefore, OF 1, OF 2 and OF 3 are interrelated and conflict with each other. Figure 9 has shown the relation of every pair of objectives on the Pareto frontier. Figure 9a shows that the relation between the economic interest (OF 1) and the water shortages (OF 2) is negative, indicating that economic interests and social concerns can be satisfied together if the water allocation and reservoir operations are optimal. Figure 9b shows the relation between the economic interest (OF 1) and the amount of COD (OF 3) is positive. Figure 9c shows that the relation between the water shortages (OF 2) and the amount of COD (OF 3) is negative. It can be seen that although increasing water supply could increase the economic interest and reduce water shortages, it could result in more pollutant input to the river, with the parameter PP ij being non-negligible. So the results of water allocation can provide the quantificable benefits or costs among different objectives, which are highly useful for water managers and professional in making decisions for allocating water among use sectors and different areas.

Fig. 9
figure 9

Relation between objectives of the Pareto frontier for the water allocation model: a the economic interest (OF 1) and the water shortages (OF 2), b the economic interest (OF 1) and the amount of COD (OF 3), c the water shortages (OF 2) and the amount of COD (OF 3)

In practice, water managements should make their own final decision based on the scientific advices. For example, if water managers have preference to economy, the water shortages or the amounts of organic pollutants will be suffered as a trade-off, and vice versa, as shown by Scenario A. If water managers want to know how the objectives affect each other when they make a decision, the results of water allocation can provide the quantitative relationship amongst them. So it can provide some flexible options for water managers.

6 Conclusions

In order to overcome the limitations of the conventional multi-objective optimization algorithm, a MEMOIA is developed by integrating features of the MA and immune systems. The proposed MEMOIA can reach a better diverse solution set and avoid premature convergence, which are the most fundamental requirements of multi-objective techniques. By using the benchmarking function (the Viennet benchmark problem), its performance is assessed against other similar algorithms, including NSGAII and SPEA2.

The proposed algorithm is applied to the problem of optimal allocation of water resources in Dongjiang Rover Basin which is highly complex due to the large number of interacting factors, nonlinearity and often unpredictable feedback. The results, which take into account interdependence among objectives, can be used to help water administrators make more appropriate decisions regarding water supply, hydropower generation and reservoir operation. The application of MEMOIA in the problem of optimal allocation of water resources also demonstrates the effectiveness of the proposed algorithm in solving optimal water allocation problems, comparing with CMOIA. The uncertain nature of water allocation model variables, water demand, the value attached to environmental flows and other parameters will also be taken into consideration in further research.