Introduction

Many rivers around the world are significantly polluted. A primary reason is that all three major sources of pollution (industrial, agricultural and domestic) are concentrated along rivers, which have traditionally been a convenient place for water withdrawal and wastewater discharge. Many researchers have proposed different methodologies for river water quality management (e.g., Burn 1989; Burn and Yulianti 2001; Chau 2006; Kannel et al. 2007; Nikoo et al. 2011, 2014; Saha et al. 2014; Babamaaji and Lee 2014; Liu et al. 2014; Alaya et al. 2014; Xu et al. 2014; Hernandez and Uddameri 2014; Ortolani 2014; Noh et al. 2014; May and Mazlan 2014). Environmental penalty functions can help environmental protection agencies find an effective remedy for river water quality problems. In other words, penalty functions can compel dischargers to select treatment strategies which satisfy river water quality standards.

In the early days of environmental legislation, only insignificant penalties were imposed on violators. These penalties did little to induce corporations and individuals to comply with environmental regulations (Situ and Emmons 1999). The application of environmental penalties has therefore changed dramatically over the past decades. More civil and criminal monetary penalties have been imposed on violators and the amount of the penalties has increased (Tomkins 2005).

Researchers have studied different aspects of imposing environmental penalties on violators. For example, Rousseau and Proost (2005) compared various combinations of regulatory instruments (emission standards, emission taxes, and technology standards) and enforcement instruments (civil fines, criminal fines, and transaction offers) and showed that including information, enforcement and monitoring costs would indeed alter the relative cost efficiency of the different instruments. Wu (2009) studied why some firms violate environmental regulatory standards while others comply with them. He showed that different factors drive decisions to violate or comply with an environmental regulation. Rousseau and Telle (2010) concentrated on emissions due to production and assumed that the damage function was perfectly known. They showed that an optimal fine typically does not exist because of interactions between the economy, emissions and the environment. They argued that these interactions could not be accurately reflected in the fine imposed by the regulator.

Earnhart and Segerson (2012) analyzed the influence of financial status on the success of environmental enforcement. They considered several dimensions of both enforcement and financial status. Their most notable result, shown both empirically and theoretically, was that increased enforcement could actually lead to worse environmental performance.

Resolving the conflicts of interest that arise among different dischargers located along a river is one of the major goals of river water quality and quantity management. Game theory is a valuable tool for resolving these conflicts, because river games are decision making problems with multiple players whose choices affect one another. Cooperative, non-cooperative, bargaining, and evolutionary game theories are among the main branches of game theory. In the past years, several researchers have used game theory to resolve the conflicts among the stakeholders in river systems (Young et al. 1982; Lund and Palmer 1997; Dockner and Nishimura 1999; Loaiciga 2004; Salazar et al. 2007; Ganji et al. 2007; Niksokhan et al. 2009; Mahjouri and Bizhani-Manzar 2013). Wei et al. (2010) proposed some game theoretic simulation models to analyze water conflicts in a water transfer project in China. The main results of their study showed that the players would receive better payoffs through cooperation if financial incentives to cooperate were provided for them. Abed-Elmdoust and Kerachian (2012) developed a methodology called n-person iterated signaling game for river water quality management. They modeled the bargaining process among dischargers and developed perfect Bayesian equilibrium strategies for pollution control. Abed-Elmdoust and Kerachian (2014) proposed a methodology called n-person Bayesian static game for river quality management. They modeled the natural process of decision making among dischargers who would be penalized for any violation of water quality standards. Their methodology could provide the Bayesian Nash equilibrium of waste load allocation strategies.

Non-cooperation among stakeholders is the consequence of their self-optimizing viewpoint, even when all parties would benefit greatly from cooperation (Madani 2010). Thus, non-cooperative game theory is an efficient tool to achieve equilibrium solutions in river systems consisting of several stakeholders (dischargers) with conflicting objectives. However, its deficiency is that it cannot guarantee that the achieved equilibrium solutions will be stable over time. With the introduction of evolutionary game theory, based on Darwin’s theory of natural selection, game theory entered a new phase. From the perspective of evolutionary game theory, a superior strategy is one that remains stable over time. Although evolutionary game theory has provided numerous insights into particular evolutionary questions, a growing number of scientists have become interested in it in the hope that it will provide tools for addressing a number of deficiencies in traditional game theory.

A brief review of the literature on evolutionary game theory is presented here. Lewontin (1961) made the first effort to apply game theory to evolutionary biology based on Darwin’s theory of natural selection. Smith and Price (1973) played the most important role in expanding evolutionary game theory by formalizing the concept of the evolutionary stable strategy (ESS) mathematically. Axelrod (1987) utilized a genetic algorithm (GA) as an evolution explorer to find evolutionary strategies in the repeated dyadic Prisoner’s Dilemma game. He also investigated the characteristics of evolutionary strategies in constant and changing environments. The evolution of social behavior in some dyadic matrix games was also surveyed by Browning and Colman (2004) using a genetic algorithm. They showed that the type of evolved social behavior depends decisively on the strategic structure of the underlying interactions.

Fishman (2008) developed an evolutionary game theoretic model for analyzing economic conflicts of interest and provided an analytical framework for analyzing asymmetric games. Huang et al. (2009) presented a dyadic game model to resolve conflicts among water users. The payoff matrix of the game was determined according to a reward-punishment mechanism that was assumed to be applied by a water resources management agency.

Grilo and Correia (2011) examined the effects of the update dynamics on symmetric dyadic evolutionary games to study the emergence of cooperation in populations of interacting agents. Wang et al. (2011) studied an evolutionary game between the government that controlled environmental pollution and the firms that generated contamination during production. They integrated evolutionary game theory and system dynamics to propose a new approach for decision making. They developed a novel penalty function to control the pollution rate to zero. This restraining penalty also stabilized the Nash equilibrium. Zhang et al. (2012a, b) proposed a dynamic evolutionary game model between secondary lead enterprises and environmental protection agencies to deal with environmental pollution as a major issue in China. Their study included some important factors such as costs and benefits of secondary lead enterprises, tax incentives and penalties of environmental protection agencies. Zhang et al. (2012a, b) studied path dependence in green construction technology change. They employed an evolutionary game theoretic model to find the dominant technique as the final result of the dynamic evolution process. Chatterjee et al. (2012) studied the evolutionary game dynamics in populations with different learners. In their research, they examined evolutionary game theory in an environment where individuals could learn from each other. They tried to make a connection between computational learning theory and evolutionary game dynamics. Han et al. (2012) analyzed equilibrium properties of evolutionary multi-player games with random payoff matrices. They showed that the probability of realizing a certain number of equilibria would change as the number of players and strategies increased.

In this study, evolutionary game theory is used to help an environmental protection agency find the most appropriate penalty functions which can encourage dischargers to choose treatment strategies that will satisfy river water quality standards.

In this paper, a new methodology is developed for determining the penalty functions which guarantee the river water quality by penalizing any discharger who violates the water quality standards. In this methodology, the interactions among pollution dischargers are modeled as an asymmetric evolutionary game which can provide the evolutionary stable treatment strategies of dischargers. The evolutionary stable treatment strategies are found for different penalty functions and the results are analyzed. Based on this analysis, the most appropriate penalty function which can guarantee the river water quality standards is identified. Moreover, because of uncertainty about the upstream condition of the river system, the most appropriate penalty functions are determined for several scenarios defined by different conditions of headwater quality and quantity. The efficiency and applicability of the proposed methodology are demonstrated by applying it to the Zarjub River in the northern part of Iran. As the proposed methodology is based on evolutionary game theory, this theory is briefly explained in the following section.

Evolutionary game theory

Traditional game theory deals with the economic behavior of players, which is assumed to be rational and selfish in a non-cooperative environment. In other words, in traditional non-cooperative game theory, the emphasis is on finding the best strategies for players to optimize their individual benefits. In evolutionary game theory, by contrast, the emphasis is on searching for strategies which will remain stable over time while considering the economic behavior of players. The emphasis is shifted from individuals to populations of strategies, and financial gain is replaced with Darwinian fitness. Moreover, in the evolutionary game theoretic approach, the term “strategy” comes to refer to a heritable phenotype, and rational decision making in traditional game theory is replaced with evolutionary stability (Fishman 2008). In an evolutionary game, each player has an initial population of strategies which is modified through the evolution process.

Evolutionary stable strategy

An ESS is a strategy which, if adopted by a population, cannot be invaded by any alternative strategy (Smith and Price 1973). Just as in traditional game theory no player can reach a better payoff by unilaterally changing strategy at the equilibrium point (e.g., the Nash equilibrium), in evolutionary game theory no population (player) can improve its fitness by following strategies other than ESSs. In addition, in evolutionary game theory, the size of the population (the number of individuals in the population which pursue a specific strategy) is of great importance in finding the ESSs.

ESS in asymmetric games

The game among pollution load dischargers in a river is asymmetric because of the direction of river flow, the different locations of the dischargers and their unequal shares of the assimilative capacity of the river.

In asymmetric games, players fill different roles with different payoff matrices. For example, in a two-player game, there are two payoff matrices, often written in one table as shown in Fig. 1. The numbers in the lower left corners of the cells are the payoffs to the individual in role \(A\), whereas the numbers in the upper right corners are the payoffs to the individual in role \(B\).

Fig. 1 Payoff matrix of an asymmetric bimatrix game (Kisdi 2011)

For example, \(E_{A} (S_{2} ,S_{1} )\) denotes the payoff to someone in role \(A\) if he plays strategy \(S_{2}\) against an opponent who is in role \(B\) and plays strategy \(S_{1}\) (Kisdi 2011).

Consider a bimatrix game in which there are two strategies \(I_{1}\) and \(I_{2}\) for the player in role \(A\) and two strategies \(J_{1}\) and \(J_{2}\) for the player in role \(B\). Considering Eqs. 1 and 2, the fitness of individuals adopting strategies \(I_{1}\) and \(J_{1}\) can be estimated as follows:

$$W_{A} (I_{1} ) = P_{{J_{1} }} \,E_{A} (I_{1} ,J_{1} ) + P_{{J_{2} }} \,E_{A} (I_{1} ,J_{2} ) = \frac{{rE_{A} (I_{1} ,J_{1} ) + sE_{A} (I_{1} ,J_{2} )}}{r + s}$$
(1)
$$W_{B} (J_{1} ) = P_{{I_{1} }} \,E_{B} (J_{1} ,I_{1} ) + P_{{I_{2} }} \,E_{B} (J_{1} ,I_{2} ) = \frac{{r^{\prime}E_{B} (J_{1} ,I_{1} ) + s^{\prime}E_{B} (J_{1} ,I_{2} )}}{{r^{\prime} + s^{\prime}}}$$
(2)

where \(P_{{I_{1} }}\) and \(P_{{I_{2} }}\) are, respectively, frequencies of strategies \(I_{1}\) and \(I_{2}\) in the population related to player \(A\). Similarly, \(P_{{J_{1} }}\) and \(P_{{J_{2} }}\) are, respectively, frequencies of strategies \(J_{1}\) and \(J_{2}\) in the population related to player \(B\). \(r\) \((r^{\prime})\) is also the number of times strategy \(I_{1}\) \((J_{1} )\) faces strategy \(J_{1}\) \((I_{1} )\) and \(s\) \((s^{\prime})\) is the number of times strategy \(I_{1}\) \((J_{1} )\) faces strategy \(J_{2}\) \((I_{2} )\).
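As a minimal sketch of Eq. 1, the fitness of a strategy can be computed as the encounter-weighted average of its payoffs. The function name `fitness_from_counts`, the dictionary layout and all numbers below are illustrative assumptions, not values from this study.

```python
def fitness_from_counts(payoffs, counts):
    """Frequency-weighted average payoff, as in Eq. 1.

    payoffs[k]: payoff of the focal strategy against opponent strategy k
    counts[k]:  number of times the focal strategy met opponent strategy k
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no encounters recorded")
    return sum(counts[k] * payoffs[k] for k in counts) / total


# Hypothetical example: I1 meets J1 three times (r = 3) and J2 once (s = 1).
payoffs_I1 = {"J1": 2.0, "J2": 6.0}   # E_A(I1, J1), E_A(I1, J2), made-up values
counts_I1 = {"J1": 3, "J2": 1}
w_a_i1 = fitness_from_counts(payoffs_I1, counts_I1)
```

The same function gives \(W_{B} (J_{1} )\) in Eq. 2 when it is called with the payoffs and encounter counts of role \(B\).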

In complex real-world asymmetric games, evolutionary algorithms (such as GA) can be used to simulate the concepts of evolutionary games and find ESSs (Axelrod 1987). Applying this approach to find the ESSs in river water quality management is the main concern of the present paper.

Model framework

This paper presents a new methodology for developing environmental penalty functions in polluted rivers utilizing an n-person evolutionary asymmetric game. A flowchart of the methodology proposed for developing evolutionary stable treatment strategies in a river system and selecting the best penalty function is presented in Fig. 2. As shown in this figure, the proposed methodology consists of some main modules such as water quality simulation, calculation of wastewater treatment costs and penalty functions, and evolutionary game model.

Fig. 2 Flowchart of the proposed methodology for developing evolutionary stable treatment strategies for waste load allocation in rivers (n = total number of dischargers)

The inputs of this methodology are the quantity and quality of headwater, dischargers’ pollution loads, river water quality standards, and treatment cost data. The outputs of the model are the ESSs and the stable environmental penalty functions. In the proposed methodology, the river assimilative capacity and the concentrations of water quality indicators at several points along the river system are first calculated using a calibrated river water quality simulation model. The penalty functions of dischargers are then calculated using the compliance costs approach (CCA; for more information, see “Compliance costs approach”) and the water quality simulation model. Treatment cost functions are also calculated using the existing treatment cost data. Finally, payoff matrices are calculated for all pollution load dischargers considering their different treatment strategies. These matrices are used as an input of the evolutionary game module, which includes several genetic algorithm-based evolutionary models. The evolutionary models are used for updating the population (archive) of strategies of each pollution load discharger. Each evolutionary model starts from an archive of solutions that contains initial treatment strategies for the respective discharger (role). These initial strategies can be selected randomly. The evolutionary model is run for a specific role and finds the fitter solutions for that role while incorporating the strategy archives of the other dischargers. At the end of the evolutionary process of a specific role, an updated archive related to that role is obtained. This updated archive is then incorporated in the evolutionary models of the other roles. This process continues until the best strategy in each population remains constant. The constant strategy of each population is the evolutionary stable treatment strategy of the respective role.

The quality condition at the downstream checkpoint is determined by the evolutionary stable treatment strategies that dischargers choose in practice. The environmental agency can therefore anticipate the quality condition of the river system under different penalty functions and select and impose the penalty function that guarantees the river water quality standards. In the next sections, the main components of the methodology are described in more detail.
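The role-by-role archive updating described above can be sketched as follows. This is a simplified illustration and not the authors' implementation: the GA run for each role is replaced by an exhaustive ranking of the four treatment levels, and the cost function `toy_total_cost`, the parameter `keep` and all numbers are hypothetical.

```python
import itertools

LEVELS = (0, 30, 60, 90)  # candidate treatment levels (%)


def toy_total_cost(role, profile):
    """Hypothetical cost: own treatment cost plus a shared penalty."""
    treatment = profile[role]
    penalty = 3 * max(0, 120 - sum(profile))  # made-up penalty rule
    return treatment + penalty


def mean_cost(role, level, archives, total_cost):
    """Average cost of `level` for `role` over the other roles' archives."""
    pools = [[level] if k == role else archives[k] for k in range(len(archives))]
    profiles = list(itertools.product(*pools))
    return sum(total_cost(role, p) for p in profiles) / len(profiles)


def coevolve(archives, total_cost, keep=1, max_rounds=50, stable_rounds=3):
    """Update each role's archive in turn until the best strategies stop changing."""
    archives = [list(a) for a in archives]
    best = [None] * len(archives)
    unchanged = 0
    for _ in range(max_rounds):
        previous = list(best)
        for role in range(len(archives)):
            # Stand-in for one GA run: rank all levels against the other archives.
            ranked = sorted(LEVELS, key=lambda lv: mean_cost(role, lv, archives, total_cost))
            best[role] = ranked[0]
            archives[role] = ranked[:keep]  # updated archive seen by the other roles
        unchanged = unchanged + 1 if best == previous else 0
        if unchanged >= stable_rounds:
            break
    return best


stable = coevolve([list(LEVELS), list(LEVELS)], toy_total_cost)
```

With `keep=1` the loop reduces to sequential best responses. Larger archives make each role respond to a mix of opponent strategies, which is closer in spirit to the population-based models used in the methodology.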

Compliance costs approach

Given the objective of “guaranteeing river water quality” using a standard, it is well known that this can be obtained using an environmental penalty that is at least as high as the treatment costs necessary to comply with the standard. Thus, the penalty to obtain this goal depends on the compliance costs. In this paper, the CCA is used for estimating the penalty functions.

Micro genetic algorithm (MGA)

The inspiration for simulating thriving strategies in a population came from an artificial intelligence procedure developed by the computer scientist John Holland (Goldberg 1989). Holland’s technique is called the genetic algorithm. Using a GA, one represents strategies as chromosomes. In this paper, MGA is used to solve the evolutionary models and update the archives of strategies. MGA is a GA with a small population. When the best solution remains constant after a number of generations, the evolutionary algorithm is reinitialized by preserving the best individual and substituting the rest of the population with randomly generated solutions (Koumousis and Katsaras 2006). Krishnakumar (1989) showed that MGA can avoid becoming trapped in local optima and performs better than a simple GA in solving multimodal optimization problems. Several successful applications of MGA have appeared in the literature (e.g., Coello and Pulido 2001; Xu and Liu 2002). The MGA used in this study is written in Visual FORTRAN.
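The restart mechanism that distinguishes an MGA from a simple GA can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the Visual FORTRAN code used in the study: the selection and crossover operators, the parameter defaults and the cost function `toy_cost` are all hypothetical choices.

```python
import random


def micro_ga(cost, n_genes, levels, pop_size=5, generations=200, patience=5, seed=0):
    """Minimize `cost` over tuples of `levels` with a small, restarting population."""
    rng = random.Random(seed)

    def random_individual():
        return tuple(rng.choice(levels) for _ in range(n_genes))

    population = [random_individual() for _ in range(pop_size)]
    best = min(population, key=cost)
    stalled = 0
    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            # Binary tournament selection followed by uniform crossover.
            parent_a = min(rng.sample(population, 2), key=cost)
            parent_b = min(rng.sample(population, 2), key=cost)
            children.append(tuple(rng.choice(pair) for pair in zip(parent_a, parent_b)))
        population = children
        candidate = min(population, key=cost)
        if cost(candidate) < cost(best):
            best, stalled = candidate, 0
        else:
            stalled += 1
        if stalled >= patience:
            # Restart: keep the best individual, redraw the rest at random.
            population = [best] + [random_individual() for _ in range(pop_size - 1)]
            stalled = 0
    return best


def toy_cost(individual):
    """Hypothetical multi-gene cost with its minimum at 60 % for every gene."""
    return sum((gene - 60) ** 2 for gene in individual)


solution = micro_ga(toy_cost, n_genes=3, levels=(0, 30, 60, 90))
```

Because no mutation operator is used here, the random restarts are the only source of new genetic material, which is the usual motivation for the small-population design.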

GA-based evolutionary game module

In the evolutionary game module, a GA model is run for each role (discharger). To incorporate the effect of other dischargers’ behavior, populations (population 1 for discharger 1, population 2 for discharger 2, …, population n for discharger n) of their strategies (i.e., their treatment levels) are also considered in each GA model. In other words, in each GA model which is run for a discharger, considering \(n\) pollution load dischargers, \(n - 1\) archives of strategies of other \(n - 1\) dischargers should also be taken into account. After running the GA model for a specific role, the population relevant to that role is updated. When the GA model relevant to each pollution load discharger is run, the resulting strategies in the last generation are considered as the updated population of strategies of that discharger. This updated population of strategies will be used in other dischargers’ GA models. The GA models of the dischargers are sequentially run to reach evolutionary stable treatment strategies of all dischargers. This iterative process is stopped when the best strategy in each population remains constant during several consecutive runs. A schematic view of the repeated dyadic games among individuals in two given populations is depicted in Fig. 3.

Fig. 3 A schematic view of the repeated games among individuals in two given populations

The objective function of each GA model is minimizing the fitness function of strategies in the population relevant to the corresponding role.

In this paper, minimizing the total cost (treatment cost plus penalty cost) of strategies in the population of a specific role (while considering other dischargers’ decisions) is considered as the objective function of each GA model. For each strategy \(S_{i}\) in population \(j\), this function can be expressed as:

$${\text{Min}}({\text{Fitness}}_{ij} ) = {\text{Min}}({\text{Trcost}}_{ij} + {\text{Pencost}}_{ij} )$$
(3)
$${\text{Trcost}}_{ij} = f(S_{i} )$$
(4)

where \({\text{Fitness}}_{ij}\) is the fitness function of an individual adopting strategy \(S_{i}\) in population \(j\). Equation 4 states that the treatment cost of strategy \(S_{i}\) in population \(j\) (\({\text{Trcost}}_{ij}\)) is a function of treatment level \(S_{i}\). The penalty cost of a strategy depends on the strategy itself and on the existing strategies in the other populations. For example, the penalty cost of strategy \(S_{i}\) in population \(j\) can be estimated using Eq. 5 as follows:

$$\begin{aligned} {\text{Pencost}}_{ij} & = \sum\limits_{i1 = 1}^{m} \sum\limits_{i2 = 1}^{m} \cdots \sum\limits_{i(j - 1) = 1}^{m} \sum\limits_{i(j + 1) = 1}^{m} \cdots \sum\limits_{in = 1}^{m} \left[ P_{{S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in} }} \right. \\ & \left. \quad \times E_{ij} (S_{ij} ,S_{i1} ,S_{i2} , \ldots ,S_{i(j - 1)} ,S_{i(j + 1)} , \ldots ,S_{in} ) \right] \quad i = 1,2, \ldots ,m;\;j = 1,2, \ldots ,n \end{aligned}$$
(5)

where \(E_{ij} (S_{ij} ,S_{i1} ,S_{i2} , \ldots ,S_{i(j - 1)} ,S_{i(j + 1)} , \ldots ,S_{in} )\) denotes the amount of penalty allocated to strategy \(S_{i}\) against its opponents: strategy \(S_{i1}\) (strategy \(S_{i}\) in population 1), strategy \(S_{i2}\) (strategy \(S_{i}\) in population 2), …, and \(S_{in}\) (strategy \(S_{i}\) in population \(n\)). \(P_{{S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in} }}\) is the frequency of strategy combination \(S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in}\) after all repeated games in each round, which is calculated as follows:

$$P_{{S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in} }} = \frac{{r_{{S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in} }} }}{{\sum\limits_{i1 = 1}^{m} {\sum\limits_{i2 = 1}^{m} { \cdots \sum\limits_{i(j - 1) = 1}^{m} {\sum\limits_{i(j + 1) = 1}^{m} { \cdots \sum\limits_{in = 1}^{m} {r_{{S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in} }} } } } } } }}$$
(6)

where \(r_{{S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in} }}\) is the number of times strategy \(S_{i}\) faces strategy combination \(S_{i1} S_{i2} \ldots S_{i(j - 1)} S_{i(j + 1)} \ldots S_{in}\). It is assumed that there are \(m\) strategies \(S_{1} ,S_{2} , \ldots ,S_{m}\) in each population. Other frequencies can be calculated in a similar way.
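Equations 5 and 6 amount to an expected penalty over the opponents' strategy combinations, weighted by how often each combination was met. A minimal sketch follows; the function names, the list-of-tuples layout for encounters and the penalty rule `toy_penalty` are assumptions made for illustration.

```python
from collections import Counter


def expected_penalty(own_level, encounters, penalty):
    """Penalty cost of one strategy, following Eqs. 5 and 6.

    encounters: one tuple per game, holding the opponents' strategy combination
    penalty(own_level, combo): penalty paid against that combination
    """
    counts = Counter(encounters)          # r for each combination (Eq. 6)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no encounters recorded")
    # Frequency P = r / total, multiplied by the penalty E and summed (Eq. 5).
    return sum((r / total) * penalty(own_level, combo) for combo, r in counts.items())


def toy_penalty(own_level, combo):
    """Hypothetical rule: penalty grows with the shortfall in total treatment."""
    return max(0, 150 - own_level - sum(combo))


# Hypothetical record of four games against two other dischargers.
games = [(30, 60), (30, 60), (90, 90), (0, 0)]
pencost = expected_penalty(60, games, toy_penalty)
```

Adding the treatment cost of Eq. 4 to this value gives the fitness of Eq. 3 for the strategy in question.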

A traditional waste load allocation model

Minimizing the total treatment and penalty cost of the system can be considered as the objective function of the traditional waste load allocation model:

$${\text{Min}}\sum\limits_{i = 1}^{n} {{\text{Totcost}}_{i} }$$
(7)

where \({\text{Totcost}}_{i}\) is the summation of treatment and penalty costs of discharger \(i\) and \(n\) is the number of pollution dischargers along the river system.

In this paper, the results of the proposed methodology are compared with the ones provided by the traditional waste load allocation model.

Case study

To evaluate the applicability of the proposed methodology, it is applied to the Zarjub River in the Rasht region of Iran. Wastewater and storm water in this region are gathered by a combined wastewater and storm drainage collection network, which discharges into the Zarjub River. The methodology is applied to a reach of the river which passes through the city of Rasht and its suburbs (Fig. 4). This reach is divided into seven zones so that there is only one pollution load discharger in each zone. Point sources are the most significant pollution sources in the study area. The main characteristics of the pollution loads and the headwater quantity and quality are presented in Table 1.

Fig. 4 The Zarjub River system and its dischargers in the study area (Mesbah et al. 2009)

Table 1 The average quantity and quality of the upstream flow and pollution loads in the Zarjub River (Mesbah et al. 2009)

Existing water quality data show that the dissolved oxygen (DO) concentration in the Zarjub River significantly violates the standard level (Mesbah et al. 2010). In this study, the simulation of biological oxygen demand (BOD) and DO concentrations in the Zarjub River is carried out using a water quality simulation model calibrated and verified by Mesbah et al. (2009).

To calculate treatment cost functions, it is assumed that aerated lagoons are used for treating domestic wastewater. Based on the existing data in Iran, the construction cost of an aerated lagoon is estimated to be 140 dollars per capita, and its annual operational cost is about 4 % of its construction cost. This operational cost is required for reducing the BOD concentration of wastewater from 250 to 10 mg/L. The rate of wastewater disposal in the study area is about 144 L/day per capita.
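As a worked check of the figures above, the annual operational cost per capita and the annual wastewater volume per capita follow directly from the stated values. The cost per cubic metre in the last line is derived here purely for illustration and is not reported in the source data:

$$0.04 \times 140 = 5.6\ \text{dollars per capita per year}$$

$$144\ \text{L/day} \times 365\ \text{days} = 52{,}560\ \text{L} \approx 52.56\ \text{m}^{3}\ \text{per capita per year}$$

$$\frac{5.6}{52.56} \approx 0.107\ \text{dollars per m}^{3}\ \text{of wastewater}$$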

We assume that there is only one river water quality monitoring station, located downstream on the river (an appropriate assumption given the real situation in Iran). As a consequence, the situation can be treated as a game between the waste dischargers that is characterized as an n-person prisoner’s dilemma. It is worth noting that in Iran, river water quality is protected by the Iran Department of Environment (IDOE), which monitors river water quality at its monitoring stations. The department does not have any waste load allocation program and only assigns a minimum treatment level to dischargers based on the Iran Wastewater Discharge Standard. IDOE sets a penalty function for each river to protect river water quality and penalizes dischargers when a water quality violation is detected. Therefore, there is no predefined treatment strategy for a discharger in Iran. Moreover, IDOE has only a limited number of stations for monitoring the quality of a river, which allows dischargers to behave strategically. Game theory can therefore be used to find a penalty function that protects river water quality while considering the non-cooperative behavior of dischargers. In this paper, the results of the proposed game theoretic model are also compared with those of a least-cost optimization model. To estimate the operational treatment cost function of each role (discharger), the following linear function is used:

$$f_{i} = c_{i} t_{i}$$
(8)

where \(f_{i}\) is the annual operational treatment cost of discharger \(i\) (1,000 $), \(t_{i}\) is the treatment level of discharger \(i\) \((0 \le t_{i} \le 1)\), and \(c_{i}\) is a constant coefficient. The estimated values of \(c_{i}\) are presented in Table 2. Because treatment plants in Iran are constructed by the government, we assume that the government supplies treatment plants to dischargers, who only need to pay the operational costs of the plants in our case study.

Table 2 Coefficients of the operational treatment cost functions of dischargers

The penalty function is defined for the case where the downstream dissolved oxygen concentration falls below the minimum acceptable level. The CCA is utilized to calculate penalty functions. In this method, the amount of penalty allocated to dischargers is equal to the cost of the extra treatment needed to reach the standard level of DO concentration in the river. To calculate the penalty function based on the DO concentration at the monitoring station of the environmental protection agency, the extra treatment costs are calculated for different combinations of treatment levels of dischargers. Note that there are different situations in which the downstream DO will be equal to zero, and these situations differ in their BOD concentration. In these situations, the DO concentration can no longer be used as a basis for estimating the penalty value, and a penalty function should be estimated considering the BOD concentration. So when the DO concentration is between 0 and 3.5 mg/L, the penalties are calculated based on the DO concentration, and when the DO concentration is equal to zero, an extra penalty is calculated regarding the BOD concentration. Therefore, the penalty functions have the following general form:

$$P_{i} = \begin{cases} a_{i} x, & 0 \le x < m \\ b_{i} x + c_{i} , & m \le x < 3.5 \\ d_{i} y + e_{i} , & x = 3.5 \end{cases}$$
(9)

where \(P_{i}\) is the penalty function for scenario \(i\) of the quantity and quality of headwater (1,000 $). \(x\) and \(y\) are, respectively, the DO violation and the BOD concentration at the water quality checkpoint (mg/L). We also assume that \(a_{i}\), \(b_{i}\), \(c_{i}\), \(d_{i}\), \(e_{i}\) and \(m\) are constant coefficients. The estimated values of these coefficients for various scenarios of quantity and quality of headwater are presented in Table 3. In the penalty function estimation, it is assumed that the DO concentration of the treated wastewater is 3.5 mg/L. It is also assumed that all dischargers pay equal penalties for a violation of the water quality standards.
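The piecewise form of Eq. 9 can be written as a short function. This is a sketch only: the coefficient values used in the example are made up for illustration and are not the values of Table 3, and the function name `penalty` and its argument order are assumptions.

```python
def penalty(x, y, a, b, c, d, e, m, x_max=3.5):
    """Piecewise penalty of Eq. 9.

    x: DO violation at the checkpoint (mg/L), between 0 and x_max
    y: BOD concentration at the checkpoint (mg/L), used only when x == x_max
    a, b, c, d, e, m: scenario-specific coefficients (hypothetical here)
    """
    if not 0 <= x <= x_max:
        raise ValueError("DO violation must lie between 0 and x_max")
    if x < m:
        return a * x              # first branch: 0 <= x < m
    if x < x_max:
        return b * x + c          # second branch: m <= x < 3.5
    return d * y + e              # third branch: x = 3.5, penalty driven by BOD


# Hypothetical coefficients, chosen only to exercise the three branches.
coeffs = dict(a=2, b=3, c=1, d=0.5, e=10, m=2.0)
small_violation = penalty(1.0, 0, **coeffs)
large_violation = penalty(2.0, 0, **coeffs)
full_violation = penalty(3.5, 20, **coeffs)
```

A safety factor, where one is applied, can be represented by multiplying the returned value by a constant.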

Table 3 Coefficients of the penalty functions for various quantitative and qualitative conditions of the headwater

Results and discussion

In this paper, nine different scenarios of quantity and quality of headwater are considered (Table 4). The payoff matrices of pollution load dischargers are calculated based on the players’ treatment strategies and the quantity and quality of upstream flow. The payoff matrix is an n-dimensional matrix, where \(n\) is the number of pollution load dischargers. Each element of this matrix shows the total cost of each discharger (i.e., the treatment cost plus the penalty cost), which is calculated based on the treatment strategies that dischargers have chosen and the resulting penalty received from the environmental protection agency.

Table 4 Quantity and quality condition of headwater in different scenarios

In the proposed evolutionary game theoretic model, it is assumed that there are four treatment strategies that dischargers can choose: 0, 30, 60 and 90 % reduction in the BOD concentration of their effluents. The game is assumed to be asymmetric because of the direction of river flow: upstream and downstream dischargers play asymmetric games since they take unequal shares of the common resource (the river).

The proposed model is used to determine the evolutionary stable treatment strategies of the dischargers in the Zarjub River system for the nine defined scenarios. Other important assumptions made in the model include:

  • Monitoring ambient water quality and imposing a penalty are costless and are done on a continuous basis.

  • The strategy set for each firm and also its treatment/abatement costs are perfectly known by the regulator and by the other firms (game of complete information).

The evolutionary stable wastewater treatment levels of dischargers are presented in Table 5.

Table 5 Evolutionary stable wastewater treatment levels of the dischargers (%)

Treatment strategies in Table 5 are ESSs because they resist invasion by other strategies and achieve better fitness (i.e., lower total cost); any strategy other than an ESS yields worse fitness. For example, assuming that the stable strategies in populations one to six are fixed for the first scenario, if an individual in population seven adopts a treatment level of 60 % in place of 90 %, his total cost will increase from 11.21000 to 11.33000 Dollars. As shown in Table 5, for all scenarios, some dischargers learn over time that it is more advantageous to choose higher treatment levels to lower their total penalty. For example, since the sixth discharger has a considerable pollution load, he chooses high treatment levels in different scenarios to decrease his total cost. Consequently, high treatment levels become prevalent in population 6. Similarly, high treatment levels become prevalent in population 7 because the discharger in role 7 is located downstream, where the quality of the water is usually low. The variations of the DO concentration along the Zarjub River based on the evolutionary stable treatment strategies presented in Table 5 are depicted in Fig. 5.
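The deviation argument used above (holding the other populations fixed and changing one discharger's treatment level) can be sketched as a simple check. Note that this tests only resistance to unilateral deviations, which is a necessary condition rather than a full ESS test, and that the cost function `toy_total_cost` and its numbers are hypothetical and unrelated to Table 5.

```python
LEVELS = (0, 30, 60, 90)  # candidate treatment levels (%)


def toy_total_cost(role, profile):
    """Hypothetical cost: own treatment cost plus a shared penalty."""
    return profile[role] + 3 * max(0, 120 - sum(profile))


def resists_unilateral_deviation(profile, levels, total_cost):
    """True if no single role can lower its own total cost by switching level."""
    for role, current in enumerate(profile):
        base = total_cost(role, tuple(profile))
        for alternative in levels:
            if alternative == current:
                continue
            deviated = list(profile)
            deviated[role] = alternative
            if total_cost(role, tuple(deviated)) < base:
                return False
    return True


check_a = resists_unilateral_deviation((90, 30), LEVELS, toy_total_cost)
check_b = resists_unilateral_deviation((90, 60), LEVELS, toy_total_cost)
```

In this toy setting the profile (90, 30) passes the check, while (90, 60) fails because the first discharger could lower its treatment level without triggering any penalty.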

Fig. 5 Variations of the DO concentration along the Zarjub River based on the evolutionary stable treatment strategies of dischargers and original penalty functions
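The invasion argument used for Table 5 (hold the other roles fixed, change one discharger's treatment level, compare his total cost) can be illustrated with a minimal sketch. Everything below other than the four treatment levels is an assumption made for illustration: the function names, the two-discharger toy cost model and its coefficients are hypothetical and are not taken from the Zarjub River study, which evaluates costs with a river water quality simulation model. The check uses the strict condition (every unilateral deviation must raise the deviator's cost), which is the usual stability criterion in asymmetric games.

```python
from typing import Callable, List, Tuple

LEVELS = (0, 30, 60, 90)  # BOD removal options (%), as listed in the text

Profile = Tuple[int, ...]                  # one treatment level per role
CostFn = Callable[[int, Profile], float]   # total cost of a role under a profile


def non_worsening_deviations(profile: Profile, total_cost: CostFn,
                             levels=LEVELS) -> List[Tuple[int, int, float]]:
    """List (role, alternative level, cost change) for every unilateral
    deviation that does NOT increase the deviating role's total cost."""
    found = []
    for role, current in enumerate(profile):
        base = total_cost(role, profile)
        for alt in levels:
            if alt == current:
                continue
            # Only this role changes its level; all other roles stay fixed.
            trial = profile[:role] + (alt,) + profile[role + 1:]
            delta = total_cost(role, trial) - base
            if delta <= 0:
                found.append((role, alt, delta))
    return found


def is_strictly_stable(profile: Profile, total_cost: CostFn,
                       levels=LEVELS) -> bool:
    """True if every unilateral deviation strictly raises the deviator's cost."""
    return not non_worsening_deviations(profile, total_cost, levels)


# Hypothetical toy cost model (NOT the paper's): treatment cost is linear in
# the removal level, and each role pays a penalty proportional to a shared
# "quality deficit" that vanishes once total removal reaches 150.
UNIT_COST = (2, 3)      # assumed cost per percentage point of removal
PENALTY_RATE = 4        # assumed penalty per unit of deficit


def toy_total_cost(role: int, profile: Profile) -> int:
    deficit = max(0, 150 - sum(profile))
    return UNIT_COST[role] * profile[role] + PENALTY_RATE * deficit
```

In this toy setting, `is_strictly_stable((90, 60), toy_total_cost)` holds, while `(0, 0)` is not stable because either discharger can lower his own cost by treating more.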

The amounts of the penalty and the total cost of different dischargers (roles) under various scenarios of quantity and quality of headwater are presented in Table 6.

Table 6 Penalty and total cost of different roles, respectively, for various scenarios of quantity and quality of upstream flow (1,000 $)

The results presented in Fig. 5 and Table 6 show that the considered penalty functions cannot force the dischargers to treat their wastewater to a level at which the river water quality standards are met. Therefore, a safety factor is applied to the penalty functions: the penalties are increased, so that dischargers pay more if the DO concentration at the monitoring station of the environmental protection agency violates the standard level. With an appropriate safety factor, higher treatment levels prevail in the different roles because dischargers seek to avoid high penalties. The evolutionary stable treatment strategies in the different roles corresponding to the minimum safety factors are presented in Table 7 (the minimum safety factor is the lowest one for which the water quality standard is satisfied along the river).

Table 7 Evolutionary stable wastewater treatment levels of different roles considering safety factors for penalty functions (%)
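The definition of the minimum safety factor given above can be read as a simple search: scale the penalty, recompute the stable treatment strategies, and stop at the first factor for which the standard is met. The sketch below illustrates that reading only. The text does not state how candidate factors were generated or searched, so the candidate list, the function names and the single-discharger toy model are all assumptions; with one discharger, the stable strategy reduces to his cost-minimizing level.

```python
from typing import Callable, Iterable, Optional, Tuple

LEVELS = (0, 30, 60, 90)  # BOD removal options (%), as listed in the text

Profile = Tuple[int, ...]


def minimum_safety_factor(
    candidates: Iterable[float],
    stable_profile_for: Callable[[float], Profile],
    meets_standard: Callable[[Profile], bool],
) -> Optional[Tuple[float, Profile]]:
    """Return the smallest candidate factor whose stable profile meets the
    water quality standard, together with that profile (None if none does)."""
    for factor in sorted(candidates):
        profile = stable_profile_for(factor)
        if meets_standard(profile):
            return factor, profile
    return None


# Hypothetical single-discharger stand-in (NOT the paper's model).
def toy_stable_profile(factor: float) -> Profile:
    def total_cost(level: int) -> float:
        treatment = 5 * level                         # assumed treatment cost
        penalty = factor * 2 * max(0, 90 - level)     # assumed scaled penalty
        return treatment + penalty
    # With one discharger, the stable level is simply the cheapest one.
    return (min(LEVELS, key=total_cost),)


def toy_meets_standard(profile: Profile) -> bool:
    # Assumed stand-in for "DO stays above the standard along the river".
    return profile[0] >= 90


result = minimum_safety_factor([1.0, 2.0, 3.0, 4.0],
                               toy_stable_profile, toy_meets_standard)
```

In the toy model, factors 1 and 2 leave the untreated option cheapest, and the factor 3 is the first at which 90 % removal becomes the least costly choice.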

The variations of the DO concentration along the Zarjub River considering the modified treatment strategies for various scenarios of quantity and quality of upstream flow are depicted in Fig. 6. The amounts of the total cost of different roles based on the selected safety factors and the revised treatment strategies are presented in Table 8.

Fig. 6 Variations of the DO concentration along the Zarjub River based on the revised penalty functions

Table 8 Total cost of dischargers based on the results of proposed methodology in the case of applying safety factors

In this paper, the results of the proposed algorithm are compared with the results of a traditional waste load allocation model. In the traditional waste load allocation model, the total treatment cost is minimized subject to the water quality standards being met. The optimal treatment strategies of dischargers, obtained using the traditional waste load allocation model, are presented in Table 9. The total cost of each discharger, based on the results of the traditional waste load allocation model, is given in Table 10. In many cases, the optimal treatment strategies obtained by the traditional waste load allocation method (Table 9) are not equilibrium solutions of the game. For example, holding the others’ strategies constant, a 30 % decrease in the treatment level of discharger 2 in scenario 3 reduces the total cost for this discharger from 0.15 to 0.11 thousand dollars.

Table 9 Optimal wastewater treatment levels of dischargers obtained using the traditional waste load allocation model (%)
Table 10 Total cost of dischargers based on the results of the traditional waste load allocation model
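The traditional waste load allocation problem described above (minimize total treatment cost subject to the water quality standards) can be sketched for discrete treatment levels by exhaustive enumeration. This is an illustrative sketch only: the text does not say which solver the traditional model uses, and the function names, cost coefficients and the feasibility rule below are hypothetical stand-ins for the simulation-based standard check.

```python
from itertools import product
from typing import Callable, Optional, Tuple

LEVELS = (0, 30, 60, 90)  # BOD removal options (%), as listed in the text

Profile = Tuple[int, ...]


def least_cost_allocation(
    n_dischargers: int,
    treatment_cost: Callable[[int, int], float],
    meets_standard: Callable[[Profile], bool],
    levels=LEVELS,
) -> Optional[Tuple[Profile, float]]:
    """Enumerate every combination of treatment levels and return the
    feasible one with the lowest TOTAL treatment cost (None if infeasible)."""
    best = None
    for profile in product(levels, repeat=n_dischargers):
        if not meets_standard(profile):
            continue  # violates the (assumed) water quality standard
        cost = sum(treatment_cost(role, level)
                   for role, level in enumerate(profile))
        if best is None or cost < best[1]:
            best = (profile, cost)
    return best


# Hypothetical two-discharger example (NOT data from the study).
UNIT_COST = (2, 3)  # assumed cost per percentage point of removal


def toy_treatment_cost(role: int, level: int) -> int:
    return UNIT_COST[role] * level


def toy_meets_standard(profile: Profile) -> bool:
    # Assumed stand-in for the standard: total removal of at least 150.
    return sum(profile) >= 150


best = least_cost_allocation(2, toy_treatment_cost, toy_meets_standard)
```

Note that this objective considers only the sum of treatment costs, not each discharger's individual incentive, which is why a least-cost allocation need not be an equilibrium of the game. Enumeration visits 4^n profiles for n dischargers, so it is practical only for small systems.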

Comparing the results presented in Tables 8 and 10 shows that the least-cost optimization model performs well in terms of cost: the total cost of dischargers based on the results of this model is less than that obtained using the game theory-based model. However, the least-cost treatment strategy is not implemented by dischargers in practice because they act non-cooperatively. In other words, the least-cost treatment strategy is acceptable when dischargers participate in a grand coalition. The proposed methodology gives an evolutionary stable treatment strategy that the dischargers are willing to implement. Therefore, it is suggested that the IDOE set the penalty functions using the game theoretic waste load allocation model.

Summary and conclusion

In this paper, using the concept of evolutionary game theory, a new methodology was proposed for estimating the necessary penalty functions for river water quality management. We developed an asymmetric evolutionary game model to determine evolutionary stable strategies (the dischargers’ treatment strategies); the model incorporates a river water quality simulation model and MGA and accounts for the river's assimilative capacity. The penalty functions were determined so as to minimize the possibility of violating the water quality standards. The applicability of the proposed methodology was examined using the available water quantity and quality data from the Zarjub River in Iran. In the evolutionary environment of the river game, each discharger tries to find the most stable treatment strategy considering the behavior of the other dischargers. It was also illustrated that even when the environmental protection agency determines the penalties using compliance cost analysis, there is still a high possibility of violating the water quality standards. To solve this problem, a safety factor was applied to the environmental penalties so that the corresponding treatment strategies would satisfy the water quality standards. The resulting penalty functions can be used by the environmental protection agency to carry out river water quality management more efficiently. Moreover, it was shown that the results of traditional pollution load allocation models may not provide stable treatment strategies in a non-cooperative environment. The methodology presented in this paper can be easily applied to water quality management in rivers with several pollution load dischargers. In future work, the methodology can be extended to incorporate the uncertainties in pollution loads and decay rates. Moreover, the total amount of penalty to be allocated to dischargers can be divided among them in proportion to their pollution loads.