Introduction

Water quality is one of the important issues in water resources management. In broad terms, water quality can be classified into three categories, namely physical, chemical and biological, and each category has a number of parameters (Swamee and Tyagi 2007). The assessment of these three categories by field monitoring of rivers provides basic data for detecting trends, for providing water quality information to water authorities, and for making recommendations for future actions. This assessment is usually conducted by referring to natural water quality, human health and intended uses (Pesce and Wunderlin 2000; Gazzaz et al. 2012). In practice, monitoring all parameters, with different sources of pollution entering a river basin, is laborious and expensive. Moreover, many scientists and researchers have difficulty in defining water quality and presenting it in a simple and consolidated way. This difficulty arises from the complexity of the factors or parameters affecting water quality, and from the large variability of the parameters used to describe the water quality status of water bodies (Chapman 1992). This has led to many attempts to present the state of water quality in simple ways without losing its scientific basis.

A water quality index (WQI) is a single dimensionless number expressing the water quality in a simple form by aggregating the measurements of selected parameters. A WQI was proposed as early as 1965 to define the state of water quality in a river (Horton 1965). Given their ease of use and scientific basis, WQIs have become important and popular tools in assessing the water quality of water bodies worldwide, particularly of rivers. Since the birth of the concept of WQI, various indices have been formulated and developed by many researchers. WQIs have also been considered a pivotal component of wider environmental or natural resource indices such as the Environmental Performance Index (EPI 2010) and the Stream Index (Ladson et al. 1999).

The general structure of a WQI is presented in Fig. 1. As can be seen in the figure, a WQI consists of a number of water quality parameters, which are transformed to a common scale. Such transformations are carried out since the monitored water quality data have different units. These values of the parameters transformed to a common scale are known as sub-indices. After all the sub-indices are obtained, they are aggregated to form the final index value. As indicated in Fig. 1, the aggregation process may occur in two sequential stages, from the sub-indices to the aggregated sub-indices (if aggregated sub-indices exist) and then from the aggregated sub-indices to the final index. The final index will be interpreted to evaluate or assess the status of the water quality.

Fig. 1 General structure of an index

In general, the information gained from the WQIs can be used for the following purposes:

  a) To provide an overall status of water quality to the water authorities and the wider community (Ocampo-Duque et al. 2006)

  b) To study impacts of regulatory policies and environmental programs on environmental quality (Swamee and Tyagi 2007)

  c) To compare the water quality of different sources and sites, without making highly technical assessment of the water quality data (Sarkar and Abbasi 2006)

  d) To assist policy makers and the public to avoid subjective assessments and subsequent biased opinions (Stambuk-Giljanovic 1999; 2003)

The use of river water quality indices as a tool to evaluate water quality status has been adopted by many organizations and agencies, but there is no universally accepted methodology for developing a WQI. On the basis of the literature reviewed, all indices have their own strengths and weaknesses. A few studies have reviewed the existing WQIs. Lumb et al. (2011a) reviewed the conceptual frameworks of various WQI models developed from the 1960s until 2010 and presented the importance of various WQIs, the steps used in their formulation and their current uses. They also presented future directions and noted a need to develop a universally applicable WQI model that is flexible enough to cut across the available data for assessing the water quality for different uses. Tyagi et al. (2013) reviewed four popular WQIs and presented their merits and demerits. However, there is no systematic and thorough review of existing WQIs in the literature that explores and assesses the steps used in their development and brings out the advantages and disadvantages of the different methods used in each step.

This paper reviews 30 WQIs developed and used in different countries across the world. The reviewed WQIs, the country or region where they were applied and the reports or papers that presented their application are listed in Table 2 in the Appendix. The indices are reviewed on the basis of the following four steps that have been used in the past to develop a WQI (Abbasi and Abbasi 2012):

  1. Selection of parameters

  2. Obtaining sub-index values (transformation to a common scale)

  3. Establishing weights

  4. Aggregation of sub-indices to produce the final index

This paper presents the different methods employed in the reviewed indices for each of the above steps needed to develop a WQI. The advantages and disadvantages of the different methods used in each step are also discussed. Although 30 WQIs were reviewed, seven WQIs were identified as most important based on the popularity of their use. For these seven WQIs, the different steps used in their development and application are presented in detail.

The structure of this paper is as follows. Firstly, an overview of the 30 WQIs reviewed in this study is presented. Based on the 30 WQIs, details of the different methods used in each of the above-mentioned four steps needed to develop an index are then presented. This is followed by a detailed discussion on how the seven selected popular WQIs were developed and applied. Finally, recommendations for future research and conclusions are presented.

Overview of the reviewed WQIs

The 30 WQIs reviewed in this study (which are listed in the Appendix) are based on their applications in 66 journal articles, 30 reports from various government agencies and 4 conference papers. The applications for each of the 30 WQIs are also presented in the Appendix. The journals that contributed the maximum to this review are Environmental Monitoring and Assessment (19 papers), Water Research (8 papers), Journal of Environmental Engineering (4 papers), Environmental Management (4 papers), Ecological Indicators (4 papers) and Water Science and Technology (2 papers). The other journals had contributions of less than two papers each. The reviewed applications were published during the period 1987–2014, but it should be noted that the WQI may have been originally developed prior to 1987. In Table 2, the report or paper that presented the originally developed index is in italics.

Although all WQIs have a common overall structure, there were two main purposes in developing an index. These purposes can be either for general assessment of the water quality or for some specific uses. For all the reviewed WQIs, the purpose for which it was developed or has been applied is also provided in Table 2 (column 5). A general assessment aims to provide a glimpse of the water quality status, whereas specific uses are intended to fulfil “suitability” for certain uses (Smith 1990). As can be seen in Table 2, most of the WQIs aim to provide a general assessment of the river water quality status, whereas a few WQIs also consider specific uses such as suitability for drinking water supply, irrigation, bathing, aquaculture, forestry related activities and recreational uses.

It is worth mentioning here that the reviewed WQIs are based predominantly on physical and chemical parameters, and only a few WQIs have faecal coliform as an indicator for assessing the suitability of the river for recreational use. A state-of-the-art review of WQIs based on bioassessment has been presented in Abbasi and Abbasi (2011). It should also be noted that some of the WQI applications have adopted an originally developed WQI as it is or have modified a previous WQI so as to make it more suitable for a particular region or for a particular purpose. These modifications were typically made by using different water quality parameters or by applying different types of aggregation methods.

Steps in developing a water quality index

As mentioned earlier, there are in general four steps undertaken for the development of a WQI. Table 1 presents specific details about each of these four steps for all 30 WQIs reviewed in this study. Some studies had considered all steps to establish their indices, while a few others considered only certain steps in the development of the WQI. Of the four steps, 1, 2 and 4 are essential for all WQIs, whereas step 3 (which is the establishing of weights) was not used in some indices (i.e. they used equal weights). Details of these steps, including the different methods used under each step are discussed in this section.

Table 1 Specific details regarding the four steps needed to develop a WQI for all the reviewed WQIs

Selection of parameters

Parameter selection is an essential step in the development of an index as the selected parameters are the main constituents of a WQI. The indices have different numbers of selected parameters, ranging from four (in Ross 1977; The River Ganga Index of Ved Prakash et al. as cited in Abbasi and Abbasi 2012) to 26 (in Dojlido et al. 1994). With regard to the type of system used for the selection of parameters, the indices can be divided into three categories, viz. fixed, open and mixed systems. These three systems are discussed below:

  1. Fixed system: The majority of WQIs reviewed have used a fixed set of parameters (e.g. Brown et al. 1970; Prati et al. 1971; Scottish Research Development Department (SRDD) 1976; Ross 1977; Dunnette 1979; House 1986; Cude 2001; Department of the Environment (DoE) Malaysia 2002; Hallock 2002; Liou et al. 2004; Said et al. 2004; Almeida et al. 2012). Consequently, the user can only utilize the selected parameters for the final index calculation. Although using the same set of parameters allows a better comparison of water quality status among sites or among rivers, it creates a common problem in index application called “rigidity”. Rigidity is manifested when additional important variables need to be included in an index to address specific water quality concerns, but the user cannot add the new parameters needed for the future index application (Swamee and Tyagi 2007).

  2. Open system: Some WQIs recommend the use of a minimum number of so-called basic parameters based on their characteristics [Ministry of the Environment of Indonesia (MoEI) 2003; CCME 2001] and also based on their impacts on the environment (Oudin et al. 1999). The basic parameters are a fixed set of parameters that should always be in the final index calculation as they are the most significant parameters for water quality evaluation in that site or region (Dojlido et al. 1994). On the other hand, some WQIs (e.g. Harkins 1974) do not provide any guidelines at all for the selection of parameters. Application of such WQIs might vary from one place to another because neither the parameters nor the maximum number of selected parameters in the final index calculation is specified. Thus, in the application of such WQIs, the users are able to incorporate as many parameters as they wish from the list of potential parameters. Such flexibility has the advantage that it avoids rigidity (Swamee and Tyagi 2007). However, not having a fixed set of parameters poses critical issues such as difficulty in making comparisons among monitored sites and among river basins (Terrado et al. 2010).

  3. Mixed system: The mixed system consists of the basic as well as additional parameters. Additional parameters are used in the final index calculation only if one of the additional parameters has a greater sub-index value than the aggregated index value based on the basic parameters. In this case, the final aggregated index value should be recalculated by including those additional parameters having greater sub-index values (Dojlido et al. 1994). These additional parameters, particularly toxic parameters, are usually monitored less frequently (Hanh et al. 2011).

The selection of parameters, in particular for the fixed and mixed systems, aims to identify the parameters which have the greatest influence on the water quality of the river. However, Abbasi and Abbasi (2012) emphasize that there is no method by which 100% objectivity or accuracy can be achieved in the selection of parameters. In general, in the design of a WQI, an initial set of the water quality parameters is decided through the following:

  a) A literature review (Said et al. 2004; Pesce and Wunderlin 2000; Kannel et al. 2007)

  b) Data availability (Cude 2001)

  c) Redundancy of parameters (parameters that have similar properties need not be considered) (Dunnette 1979)

  d) Parameters should represent the overall water quality status (Dunnette 1979; Hanh et al. 2011)

  e) The intended use of the water body (Prati et al. 1971; Stoner 1978; Smith 1990; Hurley et al. 2012)

To minimize subjectivity and uncertainty in this step, the initial set (decided based on the above criteria) is usually refined through two methods, namely expert judgement and statistical methods, which are discussed below:

Expert judgement

One of the challenges in many WQIs is the selection of significant parameters to be included in the final aggregation of the index. The initial set of selected water quality parameters involves a great deal of subjective assessment of the index developers. To deal with this, the involvement of expert judgement has been applied to reduce the uncertainty and inaccuracy in selecting the significant parameters.

In general, expert judgement can be incorporated in the selection of parameters through three approaches, namely individual interviews, interactive groups and the Delphi method (Meyer and Booker 1990). Of the three approaches, the Delphi method is the one that has been widely used for the selection of parameters (Juwana et al. 2010). This method aims to elicit views or opinions from experts without requiring them to congregate at an agreed time and place (Delbecq et al. 1975). Linstone and Turoff (2002) define the Delphi method as follows:

…a method for structuring a group communication process so that the process is effective in allowing a group of individuals, as a whole, to deal with a complex problem (Linstone and Turoff 2002, p. 3).

There is an important pre-condition for the Delphi method that should be met before its implementation. The index developers should isolate the water quality experts from one another when they give their judgements and should also make their judgements anonymous. This aims to avoid some of the biasing effects, particularly due to interactions between experts. Such interactions could lead to dominant experts causing the other experts to agree to a judgement that they do not hold (Meyer and Booker 1990).

Application of this method often needs several rounds of questionnaires until convergence of experts’ opinion is achieved (Brown et al. 1970; SRDD 1976; Dunnette 1979; Dinius 1987; House 1989; Almeida et al. 2012). In the first questionnaire, the respondents are asked to rate a set of parameters for possible inclusion in the WQI. At this stage, they are also allowed to add new parameters that were not included in the questionnaire. In the second round, they are asked to review the results of the first questionnaire, including the newly added parameters. The intention here is to introduce new parameters and to reduce the divergence of the water quality experts’ opinions with respect to the various parameters rated. These iterations can be continued until consensus on the types and number of parameters is achieved.

Statistical methods

The other approach that is commonly used in the selection of significant parameters is the use of statistical methods, which include Pearson’s coefficient of correlation and principal component/factor analysis (PCA/PFA). Although this might be the most objective method for parameter selection, it is still subjective in the sense that these methods are ultimately dependent upon the data provided for the analysis (House 1986; Abbasi and Abbasi 2012).

Pearson’s coefficient of correlation is, in general, used to reduce the number of water quality parameters by eliminating parameters which are highly correlated with others. For example, Debels et al. (2005) eliminated ammonia and orthophosphate due to their high correlation with chemical oxygen demand (COD). The other statistical method, PCA/PFA, is often employed for grouping the parameters that have similar characteristics (Liou et al. 2004; Hanh et al. 2011) and for reducing the number of parameters by selecting those that explain most of the variance observed. Debels et al. (2005) and Koçer and Sevgili (2014) used PCA to cluster several parameters into “certain groups” and then removed some of them to develop a WQI with a minimum number of parameters. Gazzaz et al. (2012) employed PFA to reduce the number of water quality parameters by considering only parameters that exhibit large factor loadings for subsequent analysis.
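The correlation-based screening described above can be illustrated with a minimal Python sketch. It is not the procedure of any reviewed study: the function names, the 0.9 threshold, the "first listed parameter is kept" rule and the sample values are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's coefficient of correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def screen_parameters(data, threshold=0.9):
    """Drop any parameter that is highly correlated with one already kept.

    `data` maps parameter name -> list of observations. The threshold and the
    order-dependent rule are illustrative assumptions.
    """
    kept = []
    for name, values in data.items():
        if all(abs(pearson(values, data[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Synthetic observations, invented for illustration only.
samples = {
    "COD": [10.0, 20.0, 30.0, 40.0, 50.0],
    "ammonia": [1.1, 2.0, 3.2, 3.9, 5.0],  # tracks COD closely
    "pH": [7.2, 6.8, 7.5, 6.9, 7.1],       # largely unrelated to COD
}
selected = screen_parameters(samples)
print(selected)
```

In this synthetic data set, ammonia is dropped because it is almost perfectly correlated with COD, while pH is retained. Which parameter of a correlated pair is kept remains a judgement call.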

Generation of sub-indices

This step aims to transform the water quality parameters into a common scale since the actual values of the parameters have their own different units; for example, ammonia nitrogen has the unit of milligram per litre, while turbidity is presented in nephelometric turbidity units (NTU). Further, the ranges over which different parameters can occur vary greatly from parameter to parameter; for example, dissolved oxygen (DO) would rarely be beyond the range 0–12 mg/L, whereas sodium can be in the range 0–1000 mg/L or beyond (Abbasi and Abbasi 2012). In most of the WQIs, the parameters can only be aggregated when they are on a common scale; therefore, rescaling or standardizing to form sub-indices is necessary. A few WQIs do not consider this step. Instead of sub-indices, the actual values of the parameters are used in the final index aggregation. For example, CCME (2001) developed a multivariate statistical procedure to aggregate the actual values of the parameters without transforming them into a common scale, whereas Said et al. (2004) proposed a specific mathematical equation for directly aggregating the index, in which there is no need to standardize the parameters.

In some WQIs, particular parameter(s) are directly taken as individual sub-indices to be aggregated to a final index value. On the other hand, the individual sub-indices can also be further aggregated to form a bigger group of sub-indices, which are then aggregated to a final index value (often called composite or aggregated sub-indices). For example, Bhargava’s Index (Bhargava 1985) has four different aggregated sub-indices, viz. coliform, heavy metals, physical parameters and organic and inorganic sub-indices. The Status and Sustainability index (Oudin et al. 1999) has 12 different aggregated sub-indices, ranging from phosphorous matter to phytoplankton sub-indices.

In general, to obtain the sub-index values, the index developers establish sub-index functions or rating curves. Sub-index functions are mathematical relationships between actual values of parameters monitored and the sub-index values. The actual values of the parameters can be converted to sub-index values using the sub-index functions. A rating curve is a corresponding graph of the value of parameters (on x-axis) against the sub-index values (on y-axis). In most WQIs, different sub-index functions are used for computing the sub-index values of different parameters. These sub-index functions or rating curves can also be used interactively and thus help the index developers to define all parameters with dimensionless values within an identical range (i.e. 0–100 or 0–1). To establish the sub-index functions or rating curves of different parameters, there are three different methods that are commonly employed: (1) expert judgement, (2) use of the water quality standards and (3) statistical methods.

Expert judgement

Experts’ judgement can be used to develop sub-index functions or rating curves. In this approach, “key points” of rating curves are obtained using questionnaires. As in the selection of parameters for the WQI, the Delphi method is employed here to reach convergence of the water experts’ opinion on sub-index values. Deininger (1980) explained that the experts are asked to draw (often manually) the rating curves based on their judgement to identify the level of water quality variation by the various possible measurements of the respective parameters. A set of rating curves is then developed based on agreed key points from the experts’ opinions. In many WQIs, such rating curves are then converted into linear or non-linear sub-index functions. Then, the index users generate the sub-index values through direct calculations by using the sub-index functions. Such an approach has been widely used in the development of various WQIs, such as the National Sanitation Foundation (NSF) Index (Brown et al. 1970), the Scottish Research Development Department (SRDD) index (SRDD 1976), Ross’ Index (Ross 1977), Oregon Index (Dunnette 1979), House’s Index (House 1986), and Almeida’s Index (Almeida et al. 2012).

Use of the water quality standards

Another approach to establishing rating curves or sub-index functions is based on the permissible limits from legislated standards, such as technical regulations, national water requirements and WHO standards or international directives. House (1986) explained that the use of water quality standards facilitates sub-division of sub-index values and provides more information for the users. In this approach, the key points defining rating curves or sub-index functions are obtained using the permissible limits for various levels of intended uses. On the basis of these, actual parameter values can be transformed into sub-index values through three methods, namely linear interpolation rescaling, categorical scaling and comparison with the permissible limits.

Linear interpolation rescaling is a method used to produce an identical range for sub-index values, usually 0–100 or 0–1 (Prati et al. 1971; House 1989; Bascarón 1979; Dojlido et al. 1994; Stambuk-Giljanovic 2003; Liou et al. 2004). The index developers established the rating curves based on drinkable water use (class 1), domestic water supply (class 2), irrigation (class 3), navigation (class 4) and wastewater (class 5), wherein the permissible limit for each class has a different sub-index value. For example, the permissible limit for BOD5 is 4, 6, 15, 20 and 50 mg/L for class 1, 2, 3, 4 and 5, respectively. These limits are then assigned specific sub-index values, e.g. 100, 75, 50, 25 and 1, respectively. These pairs of data (i.e. 4:100, 6:75, 15:50, 20:25 and 50:1), based on the relationship between the permissible limits and the sub-index values, are referred to as the key points of rating curves (Hanh et al. 2011). If an actual parameter value lies between two classes, a simple linear interpolation is used to obtain its sub-index value, with the permissible limits of the upper and lower classes serving as the maximum and minimum values. In this method, the sub-index functions used to calculate the sub-index values take the following general forms:

$$ {S}_i={S}_1-\left[\left({S}_1-{S}_2\right)\left(\frac{x_i-{x}_1}{x_2-{x}_1}\right)\right] $$
(1)
$$ {S}_i={S}_1-\left[\left({S}_1-{S}_2\right)\left(\frac{x_1-{x}_i}{x_1-{x}_2}\right)\right] $$
(2)

where $S_i$ is the ith sub-index value, $x_i$ is the ith actual parameter value, $S_1$ and $S_2$ are the sub-index values for the upper and lower class, respectively, and $x_1$ and $x_2$ are the values of the permissible limits for the upper and lower class. Equation (1) is used to generate sub-indices when a parameter has a decreasing level of water quality with an increase in actual parameter values (e.g. BOD5). On the other hand, Eq. (2) is used if a parameter has an increasing level of water quality with an increase in actual parameter values (e.g. DO).
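As a worked illustration of Eq. (1), the following Python sketch interpolates between the BOD5 key points quoted above. The function name and the clamping of out-of-range values are assumptions, since the text does not say how values outside the tabulated limits are treated. Note also that, as printed, Eqs. (1) and (2) reduce to the same algebraic expression, because the numerator and the denominator of the fraction both change sign. A single function therefore covers both cases, provided the key points are ordered by parameter value.

```python
def linear_sub_index(x, key_points):
    """Piecewise-linear sub-index from (permissible limit, sub-index) key points.

    `key_points` must be sorted by increasing parameter value. Values outside
    the tabulated range are clamped to the nearest end point (an assumption).
    """
    if x <= key_points[0][0]:
        return float(key_points[0][1])
    if x >= key_points[-1][0]:
        return float(key_points[-1][1])
    for (x1, s1), (x2, s2) in zip(key_points, key_points[1:]):
        if x1 <= x <= x2:
            # Eq. (1): S_i = S_1 - (S_1 - S_2) * (x_i - x_1) / (x_2 - x_1)
            return s1 - (s1 - s2) * (x - x1) / (x2 - x1)

# BOD5 key points quoted in the text (mg/L, sub-index).
BOD5_KEY_POINTS = [(4, 100), (6, 75), (15, 50), (20, 25), (50, 1)]

print(linear_sub_index(5.0, BOD5_KEY_POINTS))   # midway between class 1 and class 2
print(linear_sub_index(10.0, BOD5_KEY_POINTS))  # between class 2 and class 3
```

A BOD5 of 5 mg/L lies midway between the class 1 and class 2 limits, so its sub-index is midway between 100 and 75, i.e. 87.5.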

The second method that transforms actual parameter values to sub-indices is the categorical scaling method. It is a method typically used for parameters assigned as constants wherein the values must be 0 or 1. If the concentration of a parameter is well above or exceeding the permissible limit, then the sub-index value will fall to 0. In contrast, the sub-index value will be 1 if the concentration is below the permissible limits (MoEI 2003; Liou et al. 2004). The general equation to generate sub-index values using this method is as follows:

$$ S_i = 0, \quad \text{if } x_i \text{ is well above the permissible limits} $$
(3)
$$ S_i = 1, \quad \text{if } x_i \text{ is well below the permissible limits} $$
(4)

where $S_i$ is the ith sub-index value and $x_i$ is the ith actual parameter value.

The last method to generate sub-indices is based on comparison of the actual value of the parameters with their permissible limits. The sub-index values range from 0 to 1, in accordance with the degree of water quality from worst to highest. Liou et al. (2004) defined the sub-index value in this approach as follows:

$$ {S}_i=\frac{x_i}{x_{\max }} $$
(5)

where $S_i$ is the ith sub-index value, $x_i$ is the actual parameter value (mg/L) and $x_{\max}$ is the maximum value of the permissible limit (mg/L).
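The categorical scaling of Eqs. (3) and (4) and the ratio of Eq. (5) can be sketched in Python as follows. The function names and the 10 mg/L limit are hypothetical. Treating a value exactly at the limit as compliant is also an assumption, since the equations only describe values well above or well below the limit.

```python
def categorical_sub_index(x, limit):
    """Eqs. (3)-(4): 0 when the value exceeds the permissible limit, otherwise 1.

    A value exactly equal to the limit is treated as compliant (an assumption).
    """
    return 0 if x > limit else 1

def ratio_sub_index(x, x_max):
    """Eq. (5): actual value divided by the maximum permissible limit."""
    return x / x_max

# Hypothetical permissible limit of 10 mg/L, chosen only for illustration.
print(categorical_sub_index(12.0, 10.0))  # exceeds the limit
print(categorical_sub_index(8.0, 10.0))   # within the limit
print(ratio_sub_index(2.5, 10.0))
```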

Statistical analysis

This approach utilizes statistical characteristics (like the mean or various quantiles) of the historical data to obtain the key points for generating the rating curves. For example, Dunnette (1979) used the arithmetic mean of actual parameter values of six monitoring stations during the years 1973–1975 in the Willamette River in Oregon to correspond to sub-index values of 80 for BOD5, total solids, oxygen and nitrogen and 70 for faecal coliform (FC). Hallock (2002) developed rating curves of total phosphorus, total nutrients, turbidity and total suspended solids by fitting sub-index values of 100, 80, 40 and 20 to the actual values of those parameters at the 10th, 80th, 95th and 99th percentiles, respectively.
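A minimal Python sketch of the percentile approach is given below. It pairs the percentile levels and sub-index values quoted for Hallock (2002) with a synthetic record. The percentile convention used here (linear interpolation between closest ranks) is an assumption, as several conventions exist and the text does not specify one. The data and function names are likewise invented.

```python
def percentile(sorted_values, p):
    """p-th percentile (0-100) by linear interpolation between closest ranks."""
    rank = (len(sorted_values) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac

def percentile_key_points(history, percentiles=(10, 80, 95, 99),
                          scores=(100, 80, 40, 20)):
    """Pair each historical percentile value with its assigned sub-index value."""
    values = sorted(history)
    return [(percentile(values, p), s) for p, s in zip(percentiles, scores)]

# Synthetic turbidity record (NTU): 1, 2, ..., 101. Not real monitoring data.
history = list(range(1, 102))
key_points = percentile_key_points(history)
print(key_points)
```

The resulting (value, sub-index) pairs can then serve as key points of a rating curve, in the same way as key points derived from permissible limits.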

Establishing weights

The weights are assigned to the parameters with regard to their relative importance and their influence on the final index value. In general, the weights of all parameters can be either equal or unequal. Equal weights are assigned if the parameters of an index are equally important, whereas if some parameters have greater or lesser importance than others, then unequal weights are assigned.

A few of the index developers used equal weights in the development of WQIs (e.g. Nemerow and Sumitomo 1970; Harkins 1974; Dojlido et al. 1994; Oudin et al. 1999; Cude 2001; CCME 2001; Hallock 2002; Hanh et al. 2011). These studies preferred equal weights to unequal weights since there were doubts related to subjectivity over experts’ opinion in reaching a convergence (as expert panels often give different weights to the same parameters) (Harkins 1974). Moreover, different weights could make the final index overly sensitive to the most heavily weighted parameter. For instance, in an index heavily weighted towards DO, high concentrations of faecal coliform may not be reflected in the final index value if the DO concentration is near ideal. This characteristic (of high faecal coliforms not being reflected in the final index) may be desirable in water quality indices specific to the protection of aquatic life. However, for WQIs that are designed to communicate the general status of water quality rather than the quality of water for any specific use, sensitivity to changes in each variable is more desirable than sensitivity to the most heavily weighted variable (Cude 2001).

When unequal weights are used, parameter weights are assigned through participatory-based approaches to avoid the subjectivity of the index developers. These approaches may involve key stakeholders such as water quality experts, policy makers or practitioners from environmental protection agencies of a certain region. Even though a few participatory-based approaches are available to generate weights, only two methods have been widely used. These two methods are the Delphi method and the analytical hierarchy process (AHP). The other available participatory-based approaches, such as the budget allocation procedure (BAP) and the revised Simos’ procedure, have been used to determine weights of indicators for indices other than WQIs (Kodikara et al. 2010).

The Delphi method has been commonly used for summing up individual expert opinions to establish parameter weights for various WQIs. Horton (1965) proposed weights for parameters as follows: one for four parameters (specific conductance, chlorides, alkalinity and carbon chloroform extract), two for one parameter (coliform) and four for three parameters (DO, sewage treatment and pH). To minimize subjectivity and enhance credibility, this procedure for parameter weighting was then improved by Brown et al. (1970) through incorporating a large panel of water quality experts from the USA. They were asked to compare relative water quality using a scale of 1 (highest) to 5 (lowest). The arithmetic mean of all experts’ ratings was calculated for each parameter. Then, a temporary weight of 1.0 was assigned to the parameter which received the highest significance rating. All other temporary weights were obtained by dividing the highest rating by the individual mean rating. Each temporary weight was then divided by the sum of all the temporary weights to arrive at the final weight. Since then, the Delphi method has been used in many WQIs to generate the relative weights of the selected parameters. It should be noted that the total weight, which is the summation of weights of all the selected parameters, is 1 for most WQIs.
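The weight calculation attributed to Brown et al. (1970) can be sketched as below. Because the rating scale runs from 1 (highest) to 5 (lowest), this sketch interprets "the highest significance rating" as the numerically smallest mean rating. That interpretation, the function name and the ratings are assumptions made for illustration.

```python
def delphi_weights(ratings):
    """Final weights from expert ratings on a 1 (highest) to 5 (lowest) scale.

    `ratings` maps parameter name -> list of individual expert ratings.
    """
    means = {p: sum(r) / len(r) for p, r in ratings.items()}
    best = min(means.values())  # numerically smallest mean = highest significance
    # Temporary weight: 1.0 for the top-rated parameter, smaller for the others.
    temporary = {p: best / m for p, m in means.items()}
    total = sum(temporary.values())
    # Final weight: temporary weight divided by the sum of temporary weights.
    return {p: t / total for p, t in temporary.items()}

# Hypothetical ratings from four experts, invented for illustration.
ratings = {
    "DO": [1, 1, 2, 1],
    "pH": [2, 3, 2, 3],
    "turbidity": [4, 3, 4, 5],
}
weights = delphi_weights(ratings)
print(weights)
```

By construction the final weights sum to 1, which matches the convention noted above for most WQIs.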

The AHP is the other method employed to elicit experts’ judgement for assigning weights to the parameters. It is a mature and straightforward concept, which has been widely used in many other fields. It allows the decision-makers to incorporate both quantitative and qualitative aspects in the decision-making process. In this method, a weight assessment is performed through pair-wise comparison matrices, in which the respondents (experts or public) are required to give their preference by comparing several choices. The AHP method is very useful for determining the weights of either individual or aggregated parameters. Ocampo-Duque et al. (2006) employed the AHP for generating weights of five groups of similar parameters. Gazzaz et al. (2012) used the AHP for establishing weights to be used in an artificial neural network (ANN) model for computing the WQI.
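The text does not describe how weights are extracted from a pair-wise comparison matrix. As a hedged illustration, the sketch below uses the row geometric mean method, a common approximation to the principal eigenvector used in AHP. The matrix values and names are hypothetical and are not taken from the cited studies.

```python
from math import prod

def ahp_weights(matrix):
    """Approximate AHP priority weights by the row geometric mean method.

    `matrix[i][j]` expresses how much more important item i is judged to be
    than item j; the matrix is expected to be reciprocal.
    """
    n = len(matrix)
    geo_means = [prod(row) ** (1 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical comparison of three parameter groups (perfectly consistent).
comparisons = [
    [1,   2,   4],
    [1/2, 1,   2],
    [1/4, 1/2, 1],
]
weights = ahp_weights(comparisons)
print(weights)
```

For a perfectly consistent matrix such as this one, the geometric mean weights coincide with the eigenvector weights (here in the ratio 4:2:1). For inconsistent judgements the two methods can differ slightly, and a consistency check would normally accompany the calculation.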

Index aggregation

Index aggregation is performed after the assignment of weights to obtain the final index value. Such an aggregation may occur in sequential stages if an index has aggregated sub-indices. In such cases, the aggregated sub-indices are again aggregated to obtain the final index value. The two most common aggregation methods for the sub-indices are the additive (arithmetic) and multiplicative (geometric) methods. It should also be noted that there are other modified versions of these two basic methods. The basic equations for additive aggregation with equal and unequal weights are presented in Eqs. (6) and (7), respectively.

$$ \mathrm{WQI}=\sum_{i=1}^{n}S_i $$
(6)
$$ \mathrm{WQI}=\sum_{i=1}^{n}S_i w_i $$
(7)

where WQI is the aggregated index, n is the number of sub-indices, w_i is the ith weight and S_i is the ith sub-index. The weights (w_i) indicate the relative importance of S_i. As can be seen in Table 1 (column 5), the additive method has been widely used to aggregate the sub-indices of various existing WQIs (e.g. Prati et al. 1971; Brown et al. 1970; SRDD 1976; Ross 1977; Bascarón 1979; Dunnette 1979; House 1989; Sargaonkar and Deshpande 2003). It offers simplicity wherein the final index value is calculated by the addition of the weighted sub-indices.

A few WQIs have also used modified additive methods that calculate the squared function of an aggregated index and then divide it by 100 (SRDD 1976; Bordalo et al. 2006; Carvalho et al. 2011), as shown by the following equations:

$$ \mathrm{WQI}=\frac{1}{100}{\left(\sum_{i=1}^{n}S_i\right)}^2 $$
(8)
$$ \mathrm{WQI}=\frac{1}{100}{\left(\sum_{i=1}^{n}S_i w_i\right)}^2 $$
(9)

where the symbols in Eqs. (8) and (9) are the same as those in Eqs. (6) and (7).

Bascarón (1979) proposed another modified version of the additive method for index aggregation, as shown in Eq. (10). In this version, the total values of final aggregation should be divided by the total weights of the selected parameters. Such an aggregation method has been adopted and modified further in some WQIs (e.g. Pesce and Wunderlin 2000; Debels et al. 2005; Abrahão et al. 2007; Sánchez et al. 2007; Koçer and Sevgili 2014).

$$ \mathrm{WQI}=\frac{\sum_{i=1}^{n}C_i P_i}{\sum_{i=1}^{n}P_i} $$
(10)

where WQI is the aggregated index, n is the number of parameters, C_i is the sub-index value (called the normalization factor in the Bascarón index) and P_i is the relative weight of each parameter. Details of the method used for calculating C_i are presented later, when the Bascarón index is discussed in detail.

Although the additive method provides a simple way of aggregating an index, it suffers from the problem known as “eclipsing”, wherein the final index value does not represent the actual state of overall water quality because the low values of one or more sub-indices are masked by the high values of other sub-indices, or vice versa (Swamee and Tyagi 2000; Liou et al. 2004; Juwana et al. 2012). Smith (1990) also highlighted that this method would not produce a zero value of the final index even if one of the sub-indices is 0.
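The eclipsing effect can be illustrated with a small numerical sketch of the weighted additive form in Eq. (7). The sub-index values and equal weights below are hypothetical and serve only to show how one very poor sub-index is masked by three good ones.

```python
def additive_wqi(sub_indices, weights):
    """Eq. (7): weighted sum of sub-indices (weights assumed to sum to 1)."""
    return sum(s * w for s, w in zip(sub_indices, weights))


# Hypothetical sub-indices on a 0-100 scale: three good parameters, one very poor.
weights = [0.25, 0.25, 0.25, 0.25]
eclipsed = additive_wqi([95, 90, 92, 10], weights)
with_zero = additive_wqi([95, 90, 92, 0], weights)

print(eclipsed)   # 71.75, although one parameter scores only 10
print(with_zero)  # 69.25, non-zero even though one sub-index is 0
```

The second call reproduces the point attributed to Smith (1990): a sub-index of 0 does not drive the additive index to 0.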

The other commonly used index aggregation method, namely the multiplicative method which is shown in Eqs. (11) and (12), was suggested by Brown (1973). Since then, this method has been adopted for final aggregation in many WQIs (e.g. Walski and Parker 1974; SRDD 1976; Bhargava 1985; Dinius 1987; Almeida et al. 2012).

$$ \mathrm{WQI}=\prod_{i=1}^{n}S_i^{w_i} $$
(11)
$$ \mathrm{WQI}=\prod_{i=1}^{n}S_i^{1/n} $$
(12)

where the symbols are the same as earlier and the sum of weights is equal to 1. When the weights in Eq. (11) are equal, then the equation takes the form presented in Eq. (12).

Although the perfect substitutability and compensability among sub-indices that arise in the additive method do not occur in the multiplicative method, the multiplicative method still suffers from the eclipsing problem (Smith 1990; Swamee and Tyagi 2000; Juwana et al. 2012). Smith (1990) and Liou et al. (2004) showed that if one parameter has low water quality, the multiplicative method leads to a low final aggregated index. In the extreme case, the final aggregated index value is 0 if one of the parameters has a sub-index value of 0 (irrespective of the other sub-index values). Furthermore, another ambiguity arises if a variable’s weight is very close to zero: the weighted sub-index value will then be close to 1 (even if the unweighted sub-index value is high). Such a situation in aggregation is referred to as the dichotomous sub-index problem (Ott 1978; Liou et al. 2004), in which the value of the sub-index is effectively transformed into either 0 or 1. To deal with these limitations, Smith (1990) proposed a minimum operator to aggregate sub-indices, which is defined by Eq. (13):

$$ \mathrm{WQI}=\min\left(I_1,I_2,\dots,I_n\right) $$
(13)

where I_i is the sub-index value for the ith parameter and n is the number of sub-indices.

The minimum operator aggregation addresses eclipsing and ambiguity in the aggregation process; however, this method fails to provide a composite picture of overall water quality (Swamee and Tyagi 2000). This aggregation method has been adopted by a few indices (Oudin et al. 1999; Hébert 2005).
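The behaviour of the multiplicative form (Eq. (11)) and of the minimum operator (Eq. (13)) can be compared with a short sketch. The sub-index values and weights are hypothetical; `math.prod` requires Python 3.8 or later.

```python
import math


def multiplicative_wqi(sub_indices, weights):
    """Eq. (11): weighted geometric aggregation (weights assumed to sum to 1)."""
    return math.prod(s ** w for s, w in zip(sub_indices, weights))


def minimum_operator_wqi(sub_indices):
    """Eq. (13): the index equals the worst sub-index."""
    return min(sub_indices)


equal = [0.25, 0.25, 0.25, 0.25]
zero_case = multiplicative_wqi([95, 90, 92, 0], equal)
worst = minimum_operator_wqi([95, 90, 92, 10])

print(zero_case)               # 0.0: a single zero sub-index forces the index to 0
print(worst)                   # 10: only the worst parameter is reported
print(round(5 ** 0.001, 3))    # 1.002: a near-zero weight pushes the term towards 1
```

The last line shows why a near-zero weight is problematic in the multiplicative form: a poor sub-index of 5 contributes a factor of about 1, so it hardly influences the product.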

Dojlido et al. (1994) proposed to use the harmonic mean of squares method to aggregate sub-indices of a WQI in order to deal with the eclipsing problem. Cude (2001) explained that this method allows the parameters that have low quality to impart the greatest influence on the water quality index and acknowledges that different water quality parameters will pose different significance to overall water quality at different times and locations. Nevertheless, Swamee and Tyagi (2000) highlighted that such an aggregation method suffers from the problem called “ambiguity”. Ambiguity exists where all the sub-indices are acceptable and yet the overall index is not. This may result in considering the overall water quality as unacceptable, although it actually is of acceptable quality. The equation for the square root of the harmonic mean of squares (of the sub-indices) aggregation is as follows:

$$ \mathrm{WQI}=\sqrt{\frac{n}{\sum_{i=1}^{n}\frac{1}{S_i^2}}} $$
(14)

where the symbols are the same as those used earlier. This aggregation method assumes that all S_i values are non-zero; if any S_i value is zero, the WQI is taken as zero.
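A brief sketch of the square root of the harmonic mean of squares in Eq. (14) follows, including the stated convention that the index is zero when any sub-index is zero. The function name and the sample values are hypothetical.

```python
import math


def harmonic_square_wqi(sub_indices):
    """Eq. (14): square root of the harmonic mean of the squared sub-indices."""
    if any(s == 0 for s in sub_indices):
        return 0.0  # convention stated in the text: any zero sub-index gives WQI = 0
    n = len(sub_indices)
    return math.sqrt(n / sum(1.0 / s ** 2 for s in sub_indices))


# Hypothetical sub-indices: three good parameters and one very poor one.
result = harmonic_square_wqi([95, 90, 92, 10])
print(round(result, 1))  # roughly 19.7, dominated by the poor sub-index
```

For the same hypothetical values the equal-weight additive form gives 71.75, which shows how strongly this method lets the lowest sub-index influence the result.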

To avoid the problems of eclipsing and ambiguity, another aggregation approach was proposed by Liou et al. (2004) through the use of a mixed-aggregation method (combination of additive and geometric methods). According to Liou et al. (2004), parameters that have a very strong correlation are first clustered into three groups, namely organics, particulates and faecal coliform. In order to generate the aggregated sub-index values for each group, parameters in the same group are aggregated through the equal additive method. Then, the three sub-indices are aggregated to have the final index value by using geometric mean. The overall water quality index is generated by multiplying the aggregated index by three scaling coefficients, as shown in Eq. (15):

$$ \mathrm{WQI}=C_{\mathrm{temp}}C_{\mathrm{pH}}C_{\mathrm{Tox}}{\left[\left(\sum_{i=1}^{n}I_i w_i\right)\left(\sum_{j=1}^{n}I_j w_j\right)\left(\sum_{k=1}^{n}I_k w_k\right)\right]}^{1/3} $$
(15)

where I_i denotes the sub-index value for the organics parameters, I_j represents the sub-index value for the particulate parameters and I_k is the sub-index for faecal coliform. In addition, three scaling coefficients are prefixed, which address the sub-indices of temperature (C_temp), pH (C_pH) and toxic substances (C_Tox), respectively. Hanh et al. (2011) also employed a similar hybrid aggregation method (of additive and multiplicative forms) to aggregate the sub-indices to produce a final index value.
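The two-stage structure of Eq. (15) can be sketched as follows. The sketch assumes equal weights within each group (so each group sum reduces to an arithmetic mean) and scaling coefficients between 0 and 1; both are assumptions made for illustration, as are the function names and all numerical values.

```python
def group_mean(sub_indices):
    """Equal-weight additive aggregation within one group (w = 1/n assumed)."""
    return sum(sub_indices) / len(sub_indices)


def mixed_wqi(organics, particulates, faecal, c_temp, c_ph, c_tox):
    """Eq. (15): geometric mean of three group sub-indices times three coefficients."""
    groups = [group_mean(organics), group_mean(particulates), group_mean(faecal)]
    geometric = (groups[0] * groups[1] * groups[2]) ** (1.0 / 3.0)
    return c_temp * c_ph * c_tox * geometric


# Hypothetical sub-indices (0-100) and scaling coefficients (assumed 0-1).
value = mixed_wqi(
    organics=[70, 80],
    particulates=[60],
    faecal=[90],
    c_temp=1.0,
    c_ph=1.0,
    c_tox=0.9,
)
print(round(value, 1))
```

Because the coefficients multiply the whole expression, a coefficient below 1 lowers the final index regardless of the three group values.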

In addition to the methods explained above, a significant contribution to final aggregation was introduced in the development of the CCME WQI (CCME 2001). In this method, all parameters are standardized, and the three factors on which the index is founded are calculated. These three factors are scope, frequency and amplitude, which are denoted by F_1, F_2 and F_3, respectively. Scope (F_1) is the percentage of parameters that do not meet the water quality objectives at least once (Eq. (16)), whereas frequency (F_2) is the percentage of individual tests that do not meet the objectives (Eq. (17)). Amplitude (F_3) corresponds to the amount by which the objectives are not met. The calculation of F_1 and F_2 is relatively straightforward, but F_3 is calculated in three steps. In the first step, the number of times by which an individual concentration is greater than the objective of a parameter (or less than, when the objective is a minimum) is termed an “excursion” and is calculated using Eq. (18) (when the test value must not exceed the objective). Then, the collective amount by which individual tests are out of compliance is calculated by summing the excursions of individual tests from their objectives and dividing by the total number of tests (both those meeting objectives and those not meeting objectives). This variable, referred to as the “normalized sum of excursions”, or nse, is calculated using Eq. (19). The amplitude F_3 is then calculated using Eq. (20) and the final index is calculated using Eq. (21) (CCME 2001).

$$ F_1=\left(\frac{\mathrm{Number\ of\ failed\ variables}}{\mathrm{Total\ number\ of\ variables}}\right)\times 100 $$
(16)
$$ F_2=\left(\frac{\mathrm{Number\ of\ failed\ tests}}{\mathrm{Total\ number\ of\ tests}}\right)\times 100 $$
(17)
$$ \mathrm{excursion}_i=\left(\frac{\mathrm{Failed\ test\ value}_i}{\mathrm{Objective}_i}\right)-1 $$
(18)
$$ nse=\frac{\sum_{i=1}^{n}\mathrm{excursion}_i}{\mathrm{Number\ of\ tests}} $$
(19)
$$ F_3=\frac{nse}{0.01\,nse+0.01} $$
(20)
$$ \mathrm{CCME\ WQI}=100-\left(\frac{\sqrt{F_1^2+F_2^2+F_3^2}}{1.732}\right) $$
(21)

where 1.732 is a constant (approximately √3, since each of the three factors can reach 100) that normalizes the resultant values to a range between 0 and 100, where 0 represents the “worst” and 100 represents the “best” water quality. Tyagi et al. (2013) pointed out some demerits of this aggregation method, especially indicating that F_1 does not work appropriately when too few variables are considered or when too much covariance exists among them.
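The CCME calculation can be sketched as below. The sketch follows the CCME (2001) definitions: F_1 is the percentage of variables with at least one failed test, F_2 is the percentage of failed tests, F_3 = nse / (0.01 nse + 0.01), and the three factors are combined as the square root of the sum of their squares divided by 1.732. It only handles objectives that must not be exceeded. The data layout, function name and sample measurements are hypothetical.

```python
import math


def ccme_wqi(tests):
    """tests maps each variable to a list of (test value, objective) pairs.

    Objectives are treated as maxima, i.e. a test fails when value > objective.
    """
    all_pairs = [pair for pairs in tests.values() for pair in pairs]
    failed = [(v, obj) for v, obj in all_pairs if v > obj]
    failed_variables = sum(
        any(v > obj for v, obj in pairs) for pairs in tests.values()
    )

    f1 = 100.0 * failed_variables / len(tests)        # scope
    f2 = 100.0 * len(failed) / len(all_pairs)         # frequency
    # Normalized sum of excursions over all tests (passed and failed).
    nse = sum(v / obj - 1.0 for v, obj in failed) / len(all_pairs)
    f3 = nse / (0.01 * nse + 0.01)                    # amplitude
    return 100.0 - math.sqrt(f1 ** 2 + f2 ** 2 + f3 ** 2) / 1.732


# Hypothetical monitoring data: two variables, two tests each.
sample = {
    "nitrate": [(12.0, 10.0), (8.0, 10.0)],  # first test exceeds the objective
    "copper": [(1.0, 2.0), (1.5, 2.0)],      # both tests meet the objective
}
score = ccme_wqi(sample)
print(round(score, 1))  # about 67.6
```

In this hypothetical case F_1 = 50, F_2 = 25 and F_3 is about 4.8, so scope and frequency dominate the result.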

Another final aggregation method was proposed by Said et al. (2004). They used a simplified mathematical expression for final aggregation, which is presented in Eq. (22). The advantage of this method is that it is able to determine the final aggregated index through direct calculations using the selected parameters and without generating the sub-indices. However, this equation was developed for a specific region and it might not be suitable for other regions.

$$ \mathrm{WQI}=\log\left[\frac{\mathrm{DO}^{1.5}}{(3.8)^{\mathrm{TP}}{\left(\mathrm{Turb}\right)}^{0.15}{(15)}^{\mathrm{FCol}/1000}+0.14{\left(\mathrm{SC}\right)}^{0.5}}\right] $$
(22)

where DO is the dissolved oxygen (% oxygen saturation), Turb is the turbidity (in nephelometric turbidity units [NTU]), TP is the total phosphorus (mg/L), FCol is the faecal coliform bacteria (counts/100 mL) and SC is the specific conductivity (in μS/cm at 25 °C).

Important water quality indices

Although 30 WQIs were reviewed in this study, only seven of those WQIs (first seven indices listed in the Appendix) were selected and explained in detail in this section based on their popularity. The popularity of a WQI was decided based on two criteria, namely the number of their applications in refereed journals and by government agencies. The indices presented in Table 2 are listed in the order of their popularity, with the first WQI in the list (CCME WQI) being the most popular, with its applications presented in 14 journal papers and in more than ten government agency reports. These applications are listed in columns 3 and 4 of Table 2.

The following sub-sections discuss the seven selected WQIs, especially with emphasis on the four steps in developing a WQI. It should be noted that once a WQI is developed, the final index value will have to be interpreted to assess the water quality for its suitability for specific purposes. Hence, for each of the seven WQIs, a discussion on how the final index value was interpreted is also presented.

Canadian Council of Ministers of the Environment Index

The CCME WQI was developed by the Canadian Council of Ministers of the Environment as a tool to assess and report water quality information to both management institutions and the public (CCME 2001). Several studies in the literature have applied this index for various purposes. In Canada, it was used to evaluate the water quality status of several river basins (Khan et al. 2003; Lumb et al. 2006; Davies 2006), to evaluate drinking water quality (Khan et al. 2004; Hurley et al. 2012) and to assess water quality in metal mines (de Rosemond et al. 2009). In addition to the above-mentioned applications of CCME WQI in Canada, this index also has been adopted in several other countries. For example, it was employed in Turkey (Boyacioglu 2010), India (Sharma and Kansal 2011), Spain (Terrado et al. 2010), Chile (Espejo et al. 2012), Albania (Damo and Icka 2013) and Iran (Mostafaei 2014).

  1. a)

    Selection of parameters

    The CCME WQI allows flexibility in the selection of parameters so that the index can easily be modified and adopted according to local conditions and issues. For instance, the province of Alberta in Canada used four groups of parameters, namely metals (up to 22 parameters), nutrients (6 parameters), bacteria (2 parameters) and pesticides (17 parameters), while the province of New Brunswick used only 14 parameters in applying the CCME WQI.

  2. b)

    Generation of sub-indices

    The CCME WQI does not use this step to obtain sub-indices.

  3. c)

    Establishing weights

    Since sub-indices are not generated in this WQI, there are no weights associated with them.

  4. d)

    Index aggregation

    As explained earlier, the CCME WQI provides a straightforward mathematical framework for aggregating the final index value (with Eq. (21) used to calculate the final index).

  5. e)

    Final index value interpretation

    A grade of 0 to 100 is considered to interpret the final index value. The CCME WQI values are classified into five different categories, namely excellent quality (from 95 to 100), good quality (from 80 to 94), fair quality (from 65 to 79), marginal quality (from 45 to 64) and poor quality (from 0 to 44).

National Sanitation Foundation Index

The National Sanitation Foundation (NSF) WQI is one of the earliest WQIs and was developed during the early 1970s (Brown et al. 1970). The index gained credibility among the available WQIs because more than a hundred water quality experts from throughout the USA were consulted in its development. Although originally developed in the USA, this WQI or its modified version has been applied in various countries including Brazil (Simões et al. 2008), India (MPCB 2014) and Iran (Mojahedi and Attari 2009).

  1. a)

    Selection of parameters

    The NSF WQI used the Delphi technique to finalize a fixed set of parameters. Based on the consensus of water quality experts from across the USA, nine parameters were selected as presented in Table 1 (column 2). Later, two more parameters (pesticides and toxic elements) were added to the set of nine parameters.

  2. b)

    Generation of sub-indices

    The sub-indices for NSF WQI were also established through the Delphi technique. This information was later used to produce “an average curve” which represented the general pattern of all sub-indices, except for pesticides and toxic elements. These two sub-indices were established through categorical scaling of 0 and 1. If both parameters exceed the permissible limits, the status of water quality is automatically registered as 0 (the worst level).

  3. c)

    Establishing weights

    Using the Delphi technique, another questionnaire was constructed to identify individual weights for the selected parameters. Based on this procedure, the final weights (in brackets) were as follows: DO (0.17), FC (0.16), pH (0.11), BOD5 (0.11), temperature (0.10), TP (0.10), NO3 (0.10), turbidity (0.08) and TS (0.07). The sum of all individual weights is equal to 1.

  4. d)

    Index aggregation

    In the index originally proposed by Brown et al. (1970), the aggregation of the sub-indices was undertaken using the additive method. In the course of using the index, it was found that the arithmetic formulation, although easy to understand and calculate, as highlighted in Lumb et al. (2011a), lacked sensitivity in terms of the effect a single bad parameter value would have on the WQI. This led Brown et al. (1973) to propose a variation of NSF WQI in which the multiplicative aggregation is used.

  5. e)

    Final index value interpretation

    The final index values ranged from 0 (very bad water quality) to 100 (very good water quality). Brown and McClelland (1974) suggested the following classification of the index scores for grading the quality of water in the NSF WQI: excellent (90–100), good (70–89), medium (50–69), bad (25–49) and very bad (0–24).
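The difference between the additive and multiplicative versions of the NSF WQI can be illustrated with the weights listed in step (c) and the classes listed in step (e). The sub-index values are hypothetical, and the handling of non-integer index values (class lower bounds used as thresholds) is an assumption of this sketch.

```python
import math

# Final NSF WQI weights as listed in the text (they sum to 1).
NSF_WEIGHTS = {
    "DO": 0.17, "FC": 0.16, "pH": 0.11, "BOD5": 0.11, "temperature": 0.10,
    "TP": 0.10, "NO3": 0.10, "turbidity": 0.08, "TS": 0.07,
}


def nsf_additive(sub):
    """Weighted arithmetic aggregation of the nine sub-indices."""
    return sum(NSF_WEIGHTS[name] * sub[name] for name in NSF_WEIGHTS)


def nsf_multiplicative(sub):
    """Weighted geometric aggregation of the nine sub-indices."""
    return math.prod(sub[name] ** NSF_WEIGHTS[name] for name in NSF_WEIGHTS)


def nsf_class(wqi):
    """Classes of Brown and McClelland (1974); lower bounds used as thresholds."""
    for lower, label in [(90, "excellent"), (70, "good"), (50, "medium"), (25, "bad")]:
        if wqi >= lower:
            return label
    return "very bad"


# Hypothetical case: every sub-index is 90 except faecal coliform (FC) at 5.
sub = {name: 90.0 for name in NSF_WEIGHTS}
sub["FC"] = 5.0
add_value = nsf_additive(sub)
mul_value = nsf_multiplicative(sub)
print(round(add_value, 1), nsf_class(add_value))  # 76.4 good
print(round(mul_value, 1), nsf_class(mul_value))  # about 56.7, medium
```

The single poor parameter lowers the multiplicative index by roughly 20 points relative to the additive one, which is the sensitivity argument made for the multiplicative variation.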

Oregon Index

The Oregon Water Quality Index (OWQI) was developed in the 1970s (Dunnette 1979) for the purpose of summarizing and evaluating water quality status and trends in Oregon. The original OWQI was discontinued in 1983 due to the enormous resources required for calculating and reporting the results. With the advancements in computer technology, enhanced tools of data display and visualization and a better understanding of water quality, the OWQI was updated by Cude (2001) by refining the original sub-indices and improving the aggregation method. The purpose of the updated OWQI was to express ambient water quality for general recreational use. However, it has been widely used by the Oregon Department of Environmental Quality (ODEQ) to evaluate the overall water quality of Oregon’s rivers (ODEQ 2014). The OWQI was also used by the Idaho Department of Environmental Quality (IDEQ 2002) as part of an integrated approach to the ecological assessment of Idaho’s rivers. The OWQI is also part of a suite of popular WQIs that were incorporated in an automated software package called Qualidex (Sarkar and Abbasi 2006).

  1. a)

    Selection of parameters

    The selection of parameters was conducted based on water quality data of the Willamette River basin in Oregon (Dunnette 1979). The author undertook an exhaustive process for parameter selection, which involved several stages in consecutive order, namely literature review of previous WQIs, a parameter selection procedure based on rejection rationales, a modified Delphi technique and consideration of major impairment categories.

    In the first stage, 90 possible parameters were listed based on a literature review of available WQIs. Then, three rejection rationales were applied, namely lack of available data, questionable significance of the parameter and the parameter not being present in harmful amounts. These rejections reduced the number of parameters from 90 to 30. Then, the Delphi technique was applied to the 30 parameters. Unlike in the NSF WQI, only staff members of the ODEQ were considered as respondents. Through their consensus, 14 parameters were selected and subjected to further rejection rationales of redundancy and impairment categories. The redundancy rejection is usually carried out by examining Pearson’s correlation coefficient, while in the impairment rejection, the water quality was classified according to the impairment categories of oxygen depletion, eutrophication or potential for excess biological growth, dissolved substances and health hazards. Finally, six parameters were selected, as presented in Table 1 (column 2).

    In addition to the originally selected six parameters, Cude (2001) argued that two additional parameters (TP and temperature) should be added to the set of parameters. These parameters were added based on a better understanding of their significance to water quality in Oregon’s streams.

  2. b)

    Generation of sub-indices

    To generate sub-indices in the updated OWQI, Cude (2001) developed non-linear regression rating curves for the original six parameters based on the original logarithmic graphs proposed when OWQI was originally developed. The rating curve for TP was developed based on the risk of eutrophication in Oregon’s streams and that for temperature was developed with the protection of cold water fisheries (Cude 2001). For each sub-index, parameter measurements were converted to a relative quality rating between 10 (worst case) and 100 (ideal).

  3. c)

    Establishing weights

    The original OWQI (Dunnette 1979) used the Delphi technique to generate weights. The weights of the six selected parameters were as follows: DO (0.4), FC (0.2), pH (0.1), nitrate + ammonia-N (0.1), TS (0.1) and BOD (0.1). In contrast, Cude (2001) argued that unequal weights are only suitable for WQIs developed for a specific use, in which some parameters might play a more important role than others, and not for general use. Therefore, equal weights were used for the updated index.

  4. d)

    Index aggregation

    The original OWQI (Dunnette 1979) used the additive method for index aggregation: once all six sub-index values were obtained, they were aggregated to produce the final index value (using Eq. 7). Because of the eclipsing problem, Cude (2001) adopted an unweighted harmonic square mean formula (presented in Eq. 14) to aggregate the sub-indices in the updated index.

  5. e)

    Final index value interpretation

    The water quality is evaluated by the OWQI according to five classes, which are as follows: excellent (final index value from 90 to 100), good (85 to 89), fair (80 to 84), poor (60 to 79) and very poor (10 to 59).

Bascarón index

The Bascarón index was developed by Bascarón (1979) specifically for Spain. This index has been used in several studies, particularly from South American countries. For example, it was applied in Argentina (Pesce and Wunderlin 2000), in Chile (Debels et al. 2005), in Brazil (Abrahão et al. 2007), in Spain (Sánchez et al. 2007), in India (Kannel et al. 2007) and in Turkey (Koçer and Sevgili 2014).

  1. a)

    Selection of parameters

    The Bascarón index enables flexibility in the inclusion and exclusion for parameter selection (Bascarón 1979 in Abrahão et al. 2007; Lumb et al. 2011a); however, it was recommended that 26 parameters be considered in the final index aggregation (which was earlier presented in Table 1 (column 2)).

  2. b)

    Generation of sub-indices

    The sub-indices (term C i in Eq. (10)) were obtained by normalizing the actual parameter values to a common scale ranging from 0 to 100. Using the normalization factors, the sub-indices can take one of the values from 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100. The value will depend on the permissible limits of the respective parameter, which is derived from water quality directives.

  3. c)

    Establishing weights

    In the Bascarón index, different weights are assigned to different parameters. The weights vary from 1 to 4, with the sum of all weights being 54. The parameters had the following weights (in brackets): pH (1), BOD5 (3), DO (4), temperature (1), TC (3), colour (2), turbidity (4), permanganate reduction (3), detergents (4), hardness (1), DO (2), pesticides (2), oil and grease (2), SO4 (2), NO3 (2), cyanides (2), sodium (1), free CO2 (3), ammonia-N (3), Cl (1), conductivity (4), Mg (1), P (1), NO2 (2), Ca (1) and apparent aspect (no weight given).

  4. d)

    Index aggregation

    Index aggregation is undertaken using a modified version of the additive method, which was presented in Eq. 10 (Bascarón 1979 in Abrahão et al. 2007).

  5. e)

    Final index value interpretation

    The interpretation of the final index is done based on five categories: good (final index value from 91 to 100), acceptable (61 to 90), regular (31 to 60), bad (16 to 30) and very bad (0 to 15).
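The aggregation in Eq. (10) and the five interpretation classes can be sketched for a small subset of parameters, which the flexibility of the Bascarón index permits. The weights are those listed in step (c); the normalized values C_i are hypothetical, and rounding the index to the nearest integer before classification is an assumption of this sketch.

```python
# Weights (P_i) for four of the parameters listed in the text.
WEIGHTS = {"pH": 1, "BOD5": 3, "temperature": 1, "turbidity": 4}


def bascaron_wqi(c_values, p_weights):
    """Eq. (10): weighted sum of C_i divided by the total weight of the parameters used."""
    total_weight = sum(p_weights[name] for name in c_values)
    weighted_sum = sum(c_values[name] * p_weights[name] for name in c_values)
    return weighted_sum / total_weight


def bascaron_class(wqi):
    """Five classes from the text, applied to the index rounded to an integer."""
    value = round(wqi)
    if value >= 91:
        return "good"
    if value >= 61:
        return "acceptable"
    if value >= 31:
        return "regular"
    if value >= 16:
        return "bad"
    return "very bad"


# Hypothetical normalized sub-index values (multiples of 10 between 0 and 100).
c_values = {"pH": 100, "BOD5": 60, "temperature": 90, "turbidity": 50}
index = bascaron_wqi(c_values, WEIGHTS)
print(round(index, 2), bascaron_class(index))  # 63.33 acceptable
```

Dividing by the total weight of the parameters actually used keeps the index on the 0 to 100 scale even when only some of the 26 recommended parameters are available.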

House’s Index

This index was developed by House (1986). The author developed four indices, each of which could be used separately or in combination when users needed a more detailed picture of river water quality status. The first of these four indices was a general WQI developed to be used as an indicator of river health for routine monitoring programs. The other three indices were the potable water supply index (PWSI), the aquatic toxicity index (ATI) and the potable sapidity index (PSI). The PWSI, ATI and PSI were specially used in evaluating suitability of potable water supply, toxicity in aquatic and wildlife population, respectively. Although no formal reports from environmental agencies were found regarding the application of these indices, there were some publications in the literature presenting the application of this WQI in the UK, where many reaches were evaluated using the general WQI (House 1989, 1990; Tyson and House 1989; House and Ellis 1987). In addition, Carvalho et al. (2011) adopted the rating classes of House’s Index when assessing the water quality status of a small river in Portugal.

  1. a)

    Selection of parameters

    The author conducted rigorous interviews with water authorities and river purification boards to ascertain which parameters should be included for the indexing system. Parameters were selected based on routinely monitored parameters of water authorities and river purification boards, based on interviews with officers and also based on the permissible limits for different uses.

    These four indices have different selected parameters. The general WQI used nine parameters, as presented previously in column 2 of Table 1. The PWSI consisted of thirteen parameters, which included the nine parameters from the general WQI and four additional parameters, namely sulphates, F, colour and dissolved iron. The ATI considered heavy metals, pesticides and hydrocarbon parameters for a more detailed monitoring of water quality and had twelve parameters, as presented in column 2 of Table 1. The last index, the PSI, had the same parameters as the ATI. The only difference was in the form of the substances (in the ATI most of the selected parameters were in dissolved forms, while in the PSI, they were in total substance forms).

  2. b)

    Generation of sub-indices

    These indices used a scale of 10–100, with a score of 10 reflecting poor water quality akin to sewage and that of 100 indicating waters of high purity. Rating curves were developed using the permissible limits of available water quality standards for different uses. If a particular parameter had two or more standards, the median of these permissible limits was selected and converted into specific sub-index values.

  3. c)

    Establishing weights

    Different weights for individual parameters were established using the Delphi technique. The panel consisted of personnel from pollution prevention organizations and water experts. Final weights were then established based upon the median rankings. The general WQI had weights of DO (0.2), BOD5 (0.18), ammoniacal nitrogen (0.16), suspended solids (0.11), total coliforms (0.11), nitrates (0.09), pH (0.09), Cl (0.04) and temperature (0.02). The PWSI had weights for its 13 parameters, involving total coliforms (0.14), ammoniacal N (0.10), NO3 (0.10), SS (0.10), colour (0.10), pH (0.09), iron (0.09), BOD5 (0.09), DO (0.05), fluorides (0.05), chlorides (0.04), SO4 (0.02) and temperature (0.02). Weights were not developed for the ATI and the PSI as all parameters had equal importance and were considered very harmful for human and aquatic life.

  4. d)

    Index aggregation

    There was no grouping of parameters to form aggregated sub-indices within these WQIs. Therefore, after transforming the actual values of the parameters into sub-indices, they were aggregated to the final index using a variant of the additive method developed by the SRDD. The aggregation formula adopted is presented in Eq. 9 (House 1989).

  5. e)

    Final index value interpretation

    The interpretation of the index is divided into four classifications, involving highly polluted water (10–30) which is used for non-contact recreational uses, sewage transport and navigation; moderately polluted water (31–50) that can be used for potable water supply after advanced treatment, indirect contact sports and breeding fish population; water of reasonable quality (51 to 70) suitable for potable water supply with conventional treatment, fisheries, indirect contact sports and some industrial uses at moderate costs; and finally water of high quality (71–100) suitable for potable water supply, game fisheries, contact recreation and high quality industrial uses.

Scottish Research Development Department index

The Scottish Research Development Department (SRDD) index was developed by the Engineering Division of the SRDD (SRDD 1976) based on steps similar to those in the NSF WQI. It is also called the Scottish WQI. Although the SRDD index was originally developed for Scotland, it was later modified and used to evaluate the status of water quality in several river basins in different countries (for example, Thailand (Bordalo et al. 2001), Spain (Bordalo et al. 2006), Portugal (Carvalho et al. 2011) and Iran (Dadolahi-Sohrab et al. 2012)). The steps used for the application of the SRDD index are as follows:

  1. a)

    Selection of parameters

    The Delphi technique was used for the selection of parameters in the SRDD index. Several rounds of questionnaires were distributed to the local water experts from around Scotland (SRDD 1976). Following the same path as the NSF WQI, the SRDD index selected a fixed set of ten parameters as presented in column 2 of Table 1.

  2. b)

    Generation of sub-indices

    Sub-indices of the SRDD index were developed based on the convergence of panellists’ judgement. The respondents were asked to decide the possible lowest and highest values of each sub-index. The SRDD index considered that values of all sub-indices started from 0 (the lowest sub-index value) to 100 (the highest sub-index value).

  3. c)

    Establishing weights

    The Delphi technique was again used in establishing the weights for each of the selected parameters as indicated in brackets: DO (0.18), BOD5 (0.15), free and saline ammonia (0.12), pH (0.09), total oxidized nitrogen (0.08), phosphate (0.08), SS (0.07), temperature (0.05), conductivity (0.06) and Escherichia coli (0.12).

  4. d)

    Index aggregation

    The SRDD index used the modified additive method for index aggregation (using Eq. 9). Since this index does not have any grouping of parameters, there is only one level of index aggregation. The final index value is obtained by directly aggregating sub-index values of each parameter.

  5. e)

    Final index value interpretation

    Similar to the NSF WQI, higher values of the SRDD index indicate better overall water quality. There are seven levels of water quality status in the SRDD index, namely clean (final index from 90 to 100), good (80 to 89), good water quality with some treatment (70 to 79), tolerable (40 to 69), polluted (30 to 39), severely polluted (20 to 29) and finally water akin to piggery waste (0 to 19).
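The modified additive aggregation of Eq. (9), combined with the SRDD weights from step (c) and the seven classes from step (e), can be sketched as follows. The sub-index values are hypothetical, and rounding the index to the nearest integer before classification is an assumption of this sketch.

```python
# SRDD weights as listed in the text (they sum to 1).
SRDD_WEIGHTS = {
    "DO": 0.18, "BOD5": 0.15, "free and saline ammonia": 0.12, "pH": 0.09,
    "total oxidized nitrogen": 0.08, "phosphate": 0.08, "SS": 0.07,
    "temperature": 0.05, "conductivity": 0.06, "E. coli": 0.12,
}


def srdd_wqi(sub):
    """Eq. (9): squared weighted sum of the sub-indices divided by 100."""
    weighted_sum = sum(SRDD_WEIGHTS[name] * sub[name] for name in SRDD_WEIGHTS)
    return weighted_sum ** 2 / 100.0


def srdd_class(wqi):
    """Seven classes from the text, applied to the index rounded to an integer."""
    value = round(wqi)
    classes = [
        (90, "clean"), (80, "good"), (70, "good water quality with some treatment"),
        (40, "tolerable"), (30, "polluted"), (20, "severely polluted"),
    ]
    for lower, label in classes:
        if value >= lower:
            return label
    return "water akin to piggery waste"


# Hypothetical case: every sub-index equals 80.
sub = {name: 80.0 for name in SRDD_WEIGHTS}
index = srdd_wqi(sub)
print(round(index, 1), srdd_class(index))  # 64.0 tolerable
```

Because the weighted sum is squared, uniform sub-indices of 80 yield an index of 64 rather than 80, so the modified additive form is more conservative than the plain weighted sum except at the extremes of the scale.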

Fuzzy-based indices

In the recent past, several index developers have started applying fuzzy-based indices, which are built on the fuzzy logic technique (Zadeh 1965). Fuzzy logic is used to define classes of objects that have an ambiguous status. Such ambiguity exists in many environmental problems, including water quality. Hence, it is not easy to quantify water quality using crisp data or limited indicators (Ocampo-Duque et al. 2013). Instead, Mahapatra et al. (2011) suggested treating water quality as a fuzzy term that is appropriately estimated with linguistic computations.

In a fuzzy-based index, only two steps, namely parameter selection and weighting, are undertaken as in conventional indices. The two other steps (including classification for interpretation) are carried out entirely through rules (based on experts’ judgement) and sets of linguistic computations, e.g. fuzzification, evaluation of inference rules and defuzzification. Fuzzy-based indices have been developed and applied in Spain (Ocampo-Duque et al. 2006), Iran (Nikoo et al. 2011) and Brazil (Lermontov et al. 2009).

  1. a)

    Selection of parameters

    Fuzzy-based indices use an open system. Thus, any parameter can be selected based on water quality monitoring programs, or a fixed set of parameters can be adopted from existing WQIs.

  2. b)

    Generation of sub-indices

    In a fuzzy-based index, parameters are normalized and grouped through a fuzzy inference system (FIS), wherein the numerical values (inputs) are fuzzified into a qualitative state (outputs) and processed in that qualitative state by an inference engine using membership functions, rules, sets and operators (Lermontov et al. 2009).

  3. c)

    Establishing weights

    Successful application of an FIS depends on an accurate weight assignment to the parameters involved in the fuzzy rules (Ocampo-Duque et al. 2006; Lermontov et al. 2009). The pair-wise comparison matrix in the AHP can be used for obtaining different weights for individual parameters (Ocampo-Duque et al. 2006) or for a different set of parameters (Nikoo et al. 2011).

  4. d)

    Index aggregation

    Index aggregation is undertaken through a set of rules written by the index developers. To obtain the final index, defuzzification is conducted. Defuzzification is the process of transforming the fuzzy outputs into non-fuzzy (numerical) outputs (Ocampo-Duque et al. 2006).

  5. e)

    Final index value interpretation

    In Lermontov et al. (2009), the interpretation of the final aggregated index was then performed based on the following classification scheme: water quality is interpreted as poor (final index from 0 to 19), bad (20 to 36), fair (37 to 51), good (52 to 79) and excellent (80 to 100).
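
The fuzzification, rule evaluation and defuzzification steps above can be illustrated with a very small fuzzy inference sketch in Python. It is not the system of any of the cited studies. The two inputs (dissolved oxygen and BOD5 in mg/L), the membership breakpoints, the two rules, the min/max operators and the centroid defuzzification are all assumptions chosen for illustration, and parameter weights are omitted for brevity. Only the final classification bands are those reported above for Lermontov et al. (2009), with non-integer values assigned by lower bound as a further assumption.

```python
def ramp_up(x, start, end):
    """Membership rising linearly from 0 at `start` to 1 at `end`."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


def ramp_down(x, start, end):
    """Membership falling linearly from 1 at `start` to 0 at `end`."""
    return 1.0 - ramp_up(x, start, end)


def fuzzy_wqi(do_mg_l, bod_mg_l):
    """Return a 0 to 100 score from two inputs (hypothetical breakpoints)."""
    # 1. Fuzzification: numerical inputs become degrees of membership.
    do_high = ramp_up(do_mg_l, 2.0, 8.0)
    do_low = 1.0 - do_high
    bod_high = ramp_up(bod_mg_l, 2.0, 10.0)
    bod_low = 1.0 - bod_high

    # 2. Rule evaluation (min for AND, max for OR):
    #    IF DO is high AND BOD is low  THEN quality is good
    #    IF DO is low  OR  BOD is high THEN quality is poor
    fire_good = min(do_high, bod_low)
    fire_poor = max(do_low, bod_high)

    # 3. Defuzzification: centroid of the clipped, combined output sets,
    #    evaluated on an integer grid from 0 to 100.
    numerator = 0.0
    denominator = 0.0
    for score in range(0, 101):
        mu_good = min(fire_good, ramp_up(score, 20.0, 80.0))
        mu_poor = min(fire_poor, ramp_down(score, 20.0, 80.0))
        mu = max(mu_good, mu_poor)
        numerator += score * mu
        denominator += mu
    return numerator / denominator


def classify(index_value):
    """Classification bands reported for Lermontov et al. (2009)."""
    bands = [(80, "excellent"), (52, "good"), (37, "fair"), (20, "bad"), (0, "poor")]
    for lower_bound, label in bands:
        if index_value >= lower_bound:
            return label
    raise ValueError("index value must not be negative")
```

Because the two rules in this sketch are complementary, their firing strengths always sum to one, so the denominator in the centroid calculation cannot be zero. A real fuzzy-based index would use many more parameters, membership functions and rules, all defined by expert judgement.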

Summary, conclusions and recommendations

A water quality index (WQI) is a tool to assess the status of water quality at certain times and locations. It aggregates water quality parameters into useful information that is simple and easily understandable and thus can be used by the water authorities as well as the general public. The review presented in this paper on the development of river WQIs aimed to provide significant inputs to river water authorities worldwide for using or customizing existing indices for their application and contribute to future river WQI development studies. With this aim, this study reviewed 30 available WQIs and discussed them in light of the four steps that should be considered in the development of WQIs. These steps are the selection of parameters, generation of sub-indices, generation of parameter weights and final index aggregation process.

In this study, seven WQIs were identified as the most important based on their wider use, and they were discussed in detail. A main factor that influences the wider use of any WQI is the support and encouragement provided by the government and authorities to implement the index as the main indicator or tool for evaluating the status of the rivers in that region (or country). The Canadian Council of Ministers of the Environment (CCME) WQI and the Oregon Water Quality Index (OWQI) are good examples of such government support and encouragement, since they have been widely used in all provinces of Canada and in two states in the USA (Oregon and Idaho), respectively.

In general, it can be concluded that there is no globally accepted method for constructing a WQI. Index developers might consider all four steps in developing a WQI, or only some of them. Moreover, there is no method by which 100 % objectivity or accuracy can be achieved in the development of a WQI, specifically in the selection of parameters, generation of sub-index values, generation of parameter weights and the choice of index aggregation method. Thus, problems like rigidity, eclipsing and ambiguity will always be a challenge in the development of a WQI.

Since there is subjectivity and uncertainty involved in the steps of developing a WQI, it can also be concluded that statistical methods, which include correlation analysis, principal component analysis (PCA), cluster analysis (CA) and discriminant analysis (DA), might be useful in minimizing uncertainty in steps like the parameter selection process. For example, Wang et al. (2013), Juahir et al. (2011), Shrestha and Kazama (2007), Singh et al. (2005), Singh et al. (2004) and Wunderlin et al. (2001) applied CA and DA to seek the optimal selection of water quality parameters for cost-effective monitoring purposes. In addition, Khalil et al. (2010, 2014) applied correlation analysis and CA to select the best set of parameters that can be used for water quality index development. However, statistical methods are still subjective as they rely on the data provided for analysis. Thus, it is recommended that the opinion of local water quality experts be taken into account (through techniques like the Delphi method) in each of the steps in developing a WQI. For example, in the National Sanitation Foundation (NSF) WQI in the USA, the involvement of water quality experts is very high, and this has become a standard approach for developing the methodology for other indices such as the Ross’ Index (Ross 1977), SRDD index (SRDD 1976), Oregon Index (Dunnette 1979), Dinius’ index (Dinius 1987), House’s Index (House 1986), Smith’s index (Smith 1990) and Almeida’s Index (Almeida et al. 2012).

In this review, it was also observed that uncertainty and sensitivity analyses were rarely undertaken to minimize the uncertainty associated with the development of a WQI. Uncertainty analysis aims to identify the sources of uncertainty involved in the development of a WQI, to quantify that uncertainty and to investigate its influence on the final index values. The sources of uncertainty can be the inclusion or exclusion of parameters, the selection of normalization schemes, the weights and the choice of aggregation methods. Sensitivity analysis, on the other hand, aims to study the response of an output variable (i.e. the final index value) to variations in the uncertain inputs (Nardo et al. 2005; CCME 2006).

It is worth mentioning that only the CCME WQI had undertaken a sensitivity analysis for all the steps in the development of the index (CCME 2006), which involved investigation of the final index values with respect to the number of selected parameters, number of data samples, index aggregation methods and the water quality objectives. Other WQIs applied such an analysis only for some of the steps. For example, it was undertaken through inclusion or exclusion of several parameters (Rickwood and Carr 2009), selection of different aggregation equations (Brown et al. 1970; Landwehr and Deininger 1976; Dunnette 1979; House 1989; Smith 1990; Liou et al. 2004; Said et al. 2004), selection of different numbers of parameters (Bhargava 1985) and use of different weighting methods (Smith 1990). Hence, this study also recommends that the sources of uncertainty are identified in every step of the development process and that those uncertainties are quantified. Quantifying the uncertainty in every step of the index development process increases the credibility of an index and helps index developers and users to better understand its strengths and weaknesses.
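
One simple way to examine how an index responds to the inclusion or exclusion of parameters is a leave-one-out comparison, sketched below in Python. The sketch assumes a plain weighted additive aggregation with weights renormalized after each exclusion, which is a simplification and not the procedure of CCME (2006) or of any other cited study. The function names are hypothetical, and the parameter weights and sub-index values used in the example are invented purely for illustration.

```python
def weighted_index(sub_indices, weights):
    """Weighted mean of sub-indices, renormalizing the weights in use."""
    total_weight = sum(weights[p] for p in sub_indices)
    return sum(weights[p] * s for p, s in sub_indices.items()) / total_weight


def leave_one_out_sensitivity(sub_indices, weights):
    """Return the baseline index and its shift when each parameter is dropped."""
    if len(sub_indices) < 2:
        raise ValueError("at least two parameters are needed")
    baseline = weighted_index(sub_indices, weights)
    shifts = {}
    for excluded in sub_indices:
        reduced = {p: s for p, s in sub_indices.items() if p != excluded}
        shifts[excluded] = weighted_index(reduced, weights) - baseline
    return baseline, shifts


# Hypothetical sub-index values (0 to 100) and weights, for illustration only.
example_sub_indices = {"DO": 90.0, "BOD5": 60.0, "pH": 80.0, "E_coli": 30.0}
example_weights = {"DO": 0.4, "BOD5": 0.3, "pH": 0.1, "E_coli": 0.2}

baseline, shifts = leave_one_out_sensitivity(example_sub_indices, example_weights)
for name, shift in sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"excluding {name}: index changes by {shift:+.2f} from {baseline:.2f}")
```

Parameters whose exclusion shifts the index the most are those to which the final value is most sensitive. The same loop structure could be reused to vary weights or aggregation equations instead of the parameter set.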

It is also recommended that a common WQI is used within a region or province. With regard to the selection of parameters, it is preferable that each river basin should have a unique set of parameters. However, this has a practical disadvantage that comparison of WQIs between different river catchments in a region will not be possible because of the constituent parameters being different. Hence, to facilitate comparison of WQIs between river basins, it is also recommended to have a common WQI (with a fixed set of parameters) for river basins within a province or region.