1 Introduction

With increasing population growth and demand for surface water, it is crucial to be able to assess water quality on a real-time basis. Today, a relatively large amount of water quality data exists that can be harnessed along with much improved computational capacity to provide indications of water quality status. This can be done both in real-time and in a predictive sense to determine when and where issues are likely to arise. Water quality indices (WQIs) provide a simple and reliable method by which water quality parameters can be expressed in common units and aggregated into a composite value (Brown et al. 1972; CCME 2001; USEPA 2009; Poonam et al. 2013; Gitau et al. 2016) without losing the scientific foundation of the assessment of water quality (Sutadian et al. 2016). Existing data can therefore be analyzed more efficiently. Several WQIs exist, each comprised of a manageable number of determinants (5–9) that are reflective of key determinants of water quality status and intended use (Brown et al. 1972; McClelland 1974; Dunnette 1979; Landwehr and Deininger 1976; Hallock 2002).

Water quality indices were first proposed in Germany in 1848 (Abbasi and Abbasi 2012); however, the first modern WQI was developed by Horton (1965). Since then, scientists and environmental organizations have been developing WQIs, with many proposing WQIs for specific purposes and eco-regions (Abbasi and Abbasi 2012; Gitau et al. 2016; Sutadian et al. 2016). The United States Environmental Protection Agency (USEPA) was amongst the first designers of the index framework that is still in use today (USEPA 2009). Commonly used WQIs include the additive model (AWQI) (Brown et al. 1972; Lumb et al. 2011a, 2011b), the multiplicative model (MWQI) (McClelland 1974; Lumb et al. 2011a, 2011b), the unweighted multiplicative model (UMWQI) (Landwehr and Deininger 1976; Gupta et al. 2003), and the minimum operator model (MOWQI) (Smith 1990; Swamee and Tyagi 2000), as detailed in Gitau et al. (2016). These authors compared the indices to one another to determine their accuracy and suitability and found that the multiplicative models (UMWQI, MWQI) rated about the same and ranked best in terms of consistency, with the UMWQI being more flexible.

The concept of using an aggregated value as a composite indicator of status is common in many disciplines, including economics and ecology (Abbasi and Abbasi 2012). The indices are not meant to replace biological, chemical, or ecological data; rather, they provide an accessible synthesis of compiled information. One example of a commonly used index for environmental management is the Index for Biological Integrity (IBI). This index is used to assess the integrity of the water through the monitoring of fish and other aquatic biological communities and their tolerance and abundance (Karr 1981). Globally, the development and use of biological indicators to communicate and assess the status and trends of aquatic ecosystems has played a major role in environmental conservation and management practices (Poonam et al. 2013).

The purpose of this study was to develop new subindex equations to improve the accuracy and suitability of the Unweighted Multiplicative Water Quality Index (UMWQI). Specifically, the objectives were to: (1) develop a methodology for calculating an improved WQI by redesigning and incorporating flexible, criteria-based subindex formulations; (2) evaluate the impacts of determinant selection and data availability on composite water quality rating; and (3) demonstrate WQI functionalities, including water quality assessment, communication, and prediction. The subindex formulations were designed to be representative of the region’s ecological characteristics, considering the intended use of the streams and the determinants that have the largest impacts on water quality. The new subindex equations accommodate ecoregional water quality thresholds specified based on representative thresholds for the area of interest. The Unweighted Multiplicative Model (Brown et al. 1972; McClelland 1974; Dunnette 1979) was adopted because of its flexibility and ability to accurately represent water quality status of the area of interest. Thus, in this study, new subindex formulations that incorporate water quality thresholds based on existing standards and criteria were developed for the UMWQI. Index and subindex ratings were also re-defined to more accurately represent water quality status and account for the water’s intended use.

The pilot site for this project, the Western Lake Erie Basin (WLEB, Fig. 1), was selected for several reasons: its value as a water resource to both society and the environment; the severity of its water quality issues including re-eutrophication (Bridgeman et al. 2012; Kane et al. 2014) and poor water clarity (Daloglu et al. 2012); and, the relatively large amounts of water quality data available in this region. Methods and approaches are generalizable and could easily be adapted to other areas with similar water quality concerns.

Fig. 1

Pilot site (Western Lake Erie Basin) with selected stream gauge sites and weather stations. Also shown is information on: length of datasets; average number of data points available per water quality parameter; and the average availability (%) over the data length

2 Methodology

2.1 Pilot Study Site Description

The pilot site for this study is the WLEB (Fig. 1), with particular focus on the Maumee River and its tributaries, as these contribute the largest nutrient inputs to Lake Erie due to agricultural practices in the basin (Bridgeman et al. 2012; Daloglu et al. 2012; Kane et al. 2014; Stow et al. 2015; Keitzer et al. 2016). The WLEB spans an area of 2.83 million ha (7 million ac) in the states of Indiana, Michigan, and Ohio, and has ten sub-basins (NRCS 2015). Nonpoint source pollution due to intense agricultural practices (70% of land usage) in this region is directly linked to the harmful algal blooms in Lake Erie (Stow et al. 2015). The annual precipitation within the WLEB varies from 838 mm to 940 mm (Mehan et al. 2017). Generally, the eastern side of the WLEB receives greater amounts of annual precipitation than the northern and western portions (Gitau et al. 2018).

Determinants that were representative of water quality concerns in the pilot site were Total Suspended Solids (TSS), Total Phosphorus (TP), Soluble Reactive Phosphorus (SRP), and Water Temperature. These are common determinants of concern in other regions within the United States. Seasonal water temperatures have been found important for the tributaries in the WLEB in relation to water quality and harmful algal blooms (HABs) (OEPA 2009a; b; Richards et al. 2010; OEPA 2014; Stow et al. 2015). Thus, water temperature was included as a determinant in this study, though it has been excluded in other WQI formulations (e.g. USEPA 2009) due to concerns about double accounting for the determinant. Four water quality sites from the Heidelberg Tributary Loading Program Dataset (Heidelberg University 2017) were selected for the analysis, as shown in Fig. 1.

2.2 Development of Water Quality Thresholds

Table 1 shows the water quality threshold for each of the determinants considered, their impact on bodies of water, and the water quality standards and established thresholds that were implemented in the water quality index model developed through this study. For this study, 0.1 mg/L of total phosphorus was used as the threshold for standardization purposes. For soluble reactive phosphorus, a maximum level of 0.005 mg/L was adopted, consistent with the Wawasee Area Conservancy Foundation recommendation for lake systems (IDEM 2014). Though most of the state standards were based on turbidity, a threshold of 60 mg/L for total suspended solids was chosen based on an average value from standards from North and South Dakota, New Jersey, Hawaii, and Alaska (USEPA 2015). For nitrate-nitrite, the Clean Water Act establishes a criterion of 10 mg/L (33 U.S.C. § 1251 et seq.), which was used as the water quality threshold for the proposed WQI model.

Table 1 Summary of potential key determinants, their ecological effects, and WQI concentration thresholds

2.3 Subindex Equations Development

Initial subindex formulations were developed based on modifications of the formulations proposed by USEPA (2009), as well as the methodology developed by Dunnette (1979) and Cude (2001). In this study, the development of the subindex transformation curves was based on the statistical distribution of the key determinants to account for variability in natural characteristics of streams.

In our subindex approach, data for TSS, TP, SRP, and NO2–3 were rated as follows. First, observed historical data were filtered for each day i and determinant p (Qi,p) to a subset of only the observations below the threshold (Pthreshold) for that determinant. Determinant values equal to the 10th percentile (P10) value in the filtered dataset were given a rating of 80, and the 90th percentile (P90) values were rated 50. Concentrations below the 10th percentile value were scored between 80 and 89 on a linear scale, decreasing in rating with increased concentrations; ratings of 90–100 were reserved for pristine waters that require no treatment. The lowest concentration observation of data (Pmin) received a score of 89. Similarly, concentrations above the 90th percentile of passing observations were rated from 40 to 49, also decreasing linearly. Determinant values between the 10th and 90th percentile values in the observed historical data were fitted to ratings from 50 to 79, using an exponential function based on USEPA (2009).

Observations from the unfiltered data set with concentrations above the maximum threshold were rated 39 or below, with a rating of 0 assigned to the maximum failing observation in the data (Pmax). This was done because a water body could potentially find alternative uses, up to a point, even with one or more determinants failing to meet their respective water quality thresholds. The index of 0 captures the point beyond which water was no longer suitable for any uses. This method was used in the same way across determinants regardless of the thresholds or measurement units, thus, allowing severity to be comparable across determinants.
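The rating scheme described above can be sketched in code. The following is a minimal illustration, not the study's implementation: the function names, the dictionary of anchors, and the use of inclusive percentiles are hypothetical choices. The exponential segment is assumed to take the form Q = 80·exp(−b·(c − P10)) with b set so that Q(P90) = 50, since the exact USEPA (2009) form is not given here, and the failing segment is assumed to be linear from 39 at the threshold to 0 at Pmax. For simplicity the segments meet at the anchor ratings (80, 50, 40) rather than reproducing the integer bands exactly.

```python
import math
from statistics import quantiles


def curve_anchors(observations, threshold):
    """Derive anchor concentrations from one determinant's historical record."""
    # Only observations below the threshold count as "passing".
    passing = sorted(c for c in observations if c < threshold)
    deciles = quantiles(passing, n=10, method="inclusive")  # 9 cut points
    return {
        "p_min": passing[0],
        "p10": deciles[0],
        "p90": deciles[-1],
        "p_threshold": threshold,
        "p_max": max(observations),  # taken from the unfiltered data
    }


def _linear(c, c_lo, c_hi, q_lo, q_hi):
    """Rating on the straight line through (c_lo, q_lo) and (c_hi, q_hi)."""
    if c_hi == c_lo:
        return q_lo
    return q_lo + (q_hi - q_lo) * (c - c_lo) / (c_hi - c_lo)


def subindex(c, a):
    """Rate concentration c on the 0-89 scale using the anchors in a."""
    if c <= a["p_min"]:
        return 89.0
    if c < a["p10"]:
        return _linear(c, a["p_min"], a["p10"], 89.0, 80.0)
    if c <= a["p90"]:
        if a["p90"] == a["p10"]:
            return 80.0
        # Assumed exponential form: 80 at P10, decaying to 50 at P90.
        b = math.log(80.0 / 50.0) / (a["p90"] - a["p10"])
        return 80.0 * math.exp(-b * (c - a["p10"]))
    if c <= a["p_threshold"]:
        return _linear(c, a["p90"], a["p_threshold"], 50.0, 40.0)
    if c >= a["p_max"]:
        return 0.0
    # Assumed linear decline for failing observations: 39 down to 0 at Pmax.
    return _linear(c, a["p_threshold"], a["p_max"], 39.0, 0.0)
```

Because every determinant is mapped through the same anchors and rating bands, a rating of, say, 45 carries the same meaning for TP as for TSS even though their units and thresholds differ.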

2.4 Water Temperature Subindex Equation

Water temperature, being the only potential key determinant that is not a contaminant, required a different set of subindex equations. Unlike TSS, TP, SRP, and NO2–3, this potential key determinant is directly tied to the bioproductivity of a water body. As available water temperature data at individual stations were insufficient for subindex development, the data from eight United States Geological Survey (USGS) gauges and twenty weather stations from the National Oceanic and Atmospheric Administration (NOAA) Climate Data Online were aggregated to assess the data availability of observed water temperature. Since air temperature data are generally more accessible and available than water temperature data, a linear relationship between the aggregated air and the aggregated water temperatures, based on Cluis (1972), Stefan and Preud'homme (1993), and Webb and Nobilis (1997), was developed:

$$ {T}_w(t)=17.532+0.8264\cdot {T}_a(t) $$
(1)

where Tw is the water temperature (°C) and Ta is the air temperature (°C) for day t. According to the distribution of the temperature data, water temperature varied from 0 °C to 20 °C. The lower limit was potentially due to the difficulty of assessing water temperature near the triple point of water (the temperature at which water can coexist as liquid, solid, and gas), since measurement becomes difficult when some of the water is frozen. Performance analysis gave values of R² = 0.84, NSE = 0.83, and p-bias = −2.04. Through trial and error, it was found that the optimal R² and NSE values could be obtained by removing air temperature values less than −24.2 °C. This removal resulted in relatively small changes in performance statistics, though the p-bias value dropped below zero (taking on a value that was worse than before). Based on criteria in Moriasi et al. (2015), this method of estimating water temperatures was suitable for use in the pilot site since both R² and NSE were above 0.75. Once a means of estimating water temperatures had been developed, the temperature subindex equation (Eq. 2) from Cude (2001) was adopted for use. The equation was evaluated using back-calculations and existing literature to determine its suitability for use in the WLEB.
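As a rough illustration of how Eq. 1 and the performance statistics fit together, the sketch below applies the regression to air temperatures and scores the estimates against observations. The function names are hypothetical. The formulas are the commonly used definitions of the Nash-Sutcliffe efficiency, the squared Pearson correlation, and percent bias (with positive percent bias meaning underestimation); this sign convention is an assumption, as the text does not state which one was used.

```python
import math


def water_temp_from_air(t_air_c):
    """Eq. 1: estimated daily water temperature (deg C) from air temperature."""
    return 17.532 + 0.8264 * t_air_c


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def pbias(observed, simulated):
    """Percent bias; positive values indicate underestimation (assumed convention)."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)
```

In use, the simulated series would be `[water_temp_from_air(t) for t in air_temps]`, compared against the aggregated observed water temperatures for the same days.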

$$ {Q}_{temp,t}=\begin{cases}100, & T\le 11\,^{\circ}\mathrm{C}\\ 76.54+4.172\cdot T-0.1623\cdot {T}^2-2.0557\cdot {10}^{-3}\cdot {T}^3, & 11\,^{\circ}\mathrm{C}<T\le 29\,^{\circ}\mathrm{C}\\ 10, & T>29\,^{\circ}\mathrm{C}\end{cases} $$
(2)

where Qtemp,t is the subindex value for water temperature of day t, and T is temperature in degrees Celsius.
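Eq. 2 translates directly into a short function. The sketch below also shows one way a back-calculation could be done, using bisection to find the temperature at which the cubic branch yields a given rating. The function names and the choice of bisection are illustrative assumptions, not the study's procedure. Bisection works here because the cubic branch decreases steadily between 11 °C and 29 °C.

```python
def temperature_subindex(t_c):
    """Eq. 2: water temperature subindex for temperature t_c in deg C."""
    if t_c <= 11.0:
        return 100.0
    if t_c <= 29.0:
        return 76.54 + 4.172 * t_c - 0.1623 * t_c**2 - 2.0557e-3 * t_c**3
    return 10.0


def temperature_at_rating(target, lo=11.0, hi=29.0, tol=1e-6):
    """Back-calculate the temperature whose subindex equals target.

    Uses bisection on the cubic branch, which decreases over (11, 29].
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if temperature_subindex(mid) > target:
            lo = mid  # rating still too high, so the root is warmer
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Calling `temperature_at_rating(40.0)` returns a value close to 25.9 °C, the temperature at which the curve reaches the lower limit used for passing ratings of the other determinants.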

2.5 Water Quality Index Computation

The overall water quality index used in this study (Eq. 3) is an adaptation of the works by Harkins (1974) and Hallock (2002), which utilize an unweighted multiplicative model and implement a nonparametric multivariate ranking procedure (Kendall 1957; Harkins 1974; Landwehr and Deininger 1976; Hallock 2002):

$$ {UMWQI}_i={\left(\prod \limits_{k=1}^p{Q}_{i,k}\right)}^{\frac{1}{p}} $$
(3)

where Qi,k is the subindex value (from 0 to 100) of determinant k on day i, p is the number of key determinants within the WQI, all with data availability on day i, and UMWQIi is the geometric mean of the subindex values for that day.

The UMWQI model minimizes ambiguity between the overall index and the subindex values and provides flexibility by allowing removal and addition of determinants (Gitau et al. 2016). As configured in this study, the UMWQI does not calculate the overall index for any one day unless all the key determinants have data points available for that given day. To ensure the accuracy of development of the subindex equations for TSS, TP, SRP, and NO2–3, it was necessary to verify that excluding a portion of the data would not cause the piecewise subindex equation to be substantially different from that obtained when the entire dataset was used. The Maumee River dataset (USGS 04193500) was used to validate the subindex equations. It had the longest and most consistent record and was hence the most appropriate dataset with which to validate the equations. For this assessment, the dataset (1975–2015) was broken up into two portions: data from the years 1975 to 2005 and data from the years 2006 to 2015. The earlier dataset was used to calculate the subindex curve parameters, while the second portion was used to verify the subindex curve parameters, checking that they did not change drastically when this latter section of the data was removed.
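The daily index computation in Eq. 3, together with the rule that an index is only produced when every key determinant has data, can be sketched as follows. The function name and the dictionary input are hypothetical; the geometric mean is computed through logarithms, which is numerically equivalent to the product form.

```python
import math


def umwqi(subindices):
    """Unweighted multiplicative WQI for one day.

    subindices maps determinant name -> subindex value (0-100), or None
    when no observation is available. Returns None unless every
    determinant has a value, mirroring the configuration in this study.
    """
    values = list(subindices.values())
    if not values or any(v is None for v in values):
        return None
    if any(v <= 0 for v in values):
        return 0.0  # a single zero subindex drives the geometric mean to zero
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

For example, subindices of 80, 50, 40, and 100 give an index of about 63.2, which is lower than their arithmetic mean of 67.5. This is why a multiplicative form is less likely to let one strong determinant mask a weak one.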

3 Results

3.1 Subindex Formulations and Verification

Figure 2 shows the subindex equations and associated classifications and ranges as developed through this study and in relation to ecological associations based on Dinius (1972). Associated water quality ratings as derived through this study are shown in the footnote for Fig. 2. Based on the analysis, the only substantially different curve parameters once the data were separated into two portions (1975–2005 and 2006–2015) were the subindex curve parameters describing the data that would be ranked 40–49 (f and g in Fig. 2) for the TP subindex curve. This was because there was only a small difference between the Pthreshold and P90 values, yet the corresponding index values covered the range between 40 and 49. The percent change is the difference between Pthreshold and the maximum filtered value, Pmax,f, and P90, Pmin, and P10, which play crucial roles in the linear portion of the piecewise subindex equations for each of the potential key determinants. It should be noted that for this portion of the analysis, Pmax,f was given a value of 40 (rather than the Pthreshold) which, in the final formulations, represents the lower limit for “passing” values. This was because the verification was conducted during the development process before the subindex equations were finalized.

Fig. 2

Subindex formulations and classifications as developed and in relation to ecological associations based on Dinius (1972)

3.2 Temperature Subindex

Because the WLEB is more susceptible to algal blooms in the summer when the air temperature is higher, it was expected that the Cude (2001) equation (Eq. 2) would give an accurate representation of the water quality status in the basin with respect to water temperature. According to the Michigan Department of Environmental Quality (MDEQ 2009; 2015), warm-water habitats (WWH) should have summer water temperatures of 60-70 °F (15.6-21 °C). Though it was originally developed for cold-water habitats, Eq. 2—with a warm water temperature threshold of 29 °C — could be considered conservative with respect to the range of values provided by the MDEQ. This is because it would rank values about 21 °C at about the maximum value of the acceptable range for summer water temperatures based on the exponential equation, rather than assigning them a flat value of 10. The water temperature threshold obtained by back-calculating from Eq. 2 and finding the water temperature at which the subindex equation gave a ranking of 40 (consistent with the lower limit for “passing” values for the other key determinants) was 25.9 °C (78.6 °F). Based on values for climate conditions and maximum water temperature tolerance of 57 different American fish species (Eaton and Scheller 1996) sampled throughout the U.S., the 29 °C water temperature threshold in Eq. 2 was indicative of fish species living under extreme stress and was, thus, considered a suitable threshold for maximum tolerance of water temperature. Consequently, Eq. 2 was adopted without change for this study.

3.3 Western Lake Erie Basin Case Study

To determine which water quality determinants were most crucial with respect to the water quality status of the WLEB, the concentrations of total suspended solids (TSS), total phosphorus (TP), soluble reactive phosphorus (SRP), and nitrate-nitrite (NO2–3) were assessed based on the thresholds listed in Table 1. Out of the four determinants, nitrate-nitrite (NO2–3) was the only determinant for which >90% of the data met its respective threshold. In comparison to the rest of the determinants across all four sites (SRP ranged from 0.3–19.8%, TP from 6 to 51%, and TSS from 62.9–84.9%), nitrate-nitrite concentrations in the WLEB tributaries were consistently lower, and the majority of the NO2–3 subindex values were thus ranked above 39.

Subindices for Soluble Reactive Phosphorus (SRP), Total Phosphorus (TP), Total Suspended Solids (TSS), and Water Temperature (°C), and the overall water quality index were calculated for all four selected sites in the WLEB. Figure 3 shows the monthly and annual averages of the overall daily WQIs. Monthly average subindex values for total suspended solids (TSS) ranged between 33 and 80 (ranking from “poor” to “good”). Those for total phosphorus ranged between 31 and 73 (“poor” to “good”). Soluble reactive phosphorus had the largest range, between 13 and 78 (ranking “unsuitable for all uses” to “good”). Overall index values ranged from 35 to 80 throughout the basin, indicating that water quality in the basin is generally “poor” to “good”, consistent with results from Sekaluvu et al. (2017) and USEPA (2017).

Fig. 3

Long-term WQIs for the four selected USGS gauge sites. The solid lines are annual averages and the markers show monthly values. Stations represented are: (a) River Raisin (USGS 04176500); (b) Blanchard River (USGS 04189000); (c) Maumee River (USGS 04193500); and, (d) Tiffin River (USGS 04185000). Dotted lines represent pass/fail threshold while the dashed lines show the boundaries for the different classifications

To provide a more accurate assessment of the trends, a nonparametric trend analysis using Kendall’s τ was conducted. All four sites had an increase in the monthly mean subindex for TSS, with significant increases seen at Maumee River (τ = 0.2138; p < 0.0001), River Raisin (τ = 0.0968; p = 0.0051), and Tiffin River (τ = 0.1843; p = 0.0061). For the River Raisin, Maumee River, and Tiffin River sites, there were increases in monthly mean TP subindex values, although trends in the Tiffin River were not significant. Monthly TP subindex values declined significantly at the Blanchard River site (τ = −0.1586; p = 0.0165). Significant negative trends in SRP subindex values for the River Raisin (τ = −0.0889; p = 0.0102) were consistent with results in Sekaluvu et al. (2017). The positive trends in TP subindex values at the Maumee River site (τ = 0.2257; p < 0.0001) were contrary to results in Sekaluvu et al. (2017), which showed that TP levels remained mostly stable but high. The Maumee River SRP subindex trends implied improvement in associated water quality status, contrary to the findings of Sekaluvu et al. (2017) of increasing SRP concentrations.
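For readers unfamiliar with this kind of trend test, the sketch below computes Kendall's τ of a series against time along with a two-sided p-value from the usual Mann-Kendall normal approximation. It is a simplified illustration under stated assumptions: the function name is hypothetical, no correction is made for tied values, seasonality, or autocorrelation, and the software and exact variant used in the study are not specified in the text.

```python
import math


def kendall_trend(values):
    """Kendall's tau (tau-a) of values against their time order.

    Returns (tau, p) where p is a two-sided p-value from the normal
    approximation of the Mann-Kendall S statistic, without tie correction.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # +1 concordant, -1 discordant
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))
    return tau, p
```

A positive τ on a subindex series indicates improving ratings over time, which corresponds to falling concentrations for the contaminant determinants.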

4 Discussion

Water quality thresholds and criteria were incorporated in subindex formulations developed for key water quality determinants. The enhanced subindex formulations were built into the UMWQI and tested for suitability, with the WLEB serving as a pilot site. Based on the modified WQI, the River Raisin and Maumee River sites showed statistically significant trends of increasing overall WQI values, while the Blanchard River site showed a trend of declining WQIs. The Tiffin River showed no significant trends in overall WQI. For all four sites, there was a positive trend for monthly mean subindex values for TSS. All but the Blanchard site showed a positive trend for the monthly mean TP subindices. The Maumee River site was the only one that showed a positive, albeit not statistically significant, trend in SRP monthly mean subindices. This result was contrary to findings in Sekaluvu et al. (2017) and was potentially reflective of periods of improvement rather than of current trends.

The Western Lake Erie Basin is a warm-water habitat, and rising air temperatures due to climate change play a role in the prevalence of HABs (Paerl and Huisman 2008, 2009; Paerl et al. 2011). The statistically-estimated water temperature dataset developed in this study did not necessarily capture the extremes that could occur; since the dataset was based upon median values of the daily averages for the USGS gauge sites and the NOAA weather stations, it could underestimate water temperatures during warmer periods. Furthermore, the equation does not account for temperature effects of industrial discharges and reduced tree canopy, both of which present water temperature-related concerns. It would be ideal if more water temperature data were collected. Dissolved oxygen may be an appropriate substitute for water temperature in the UMWQI model when water temperature measurements are not feasible, given its ties to the well-being of aquatic life (USGS 2017) and that it is amongst the most common key determinants in other WQIs (Gitau et al. 2016; Sutadian et al. 2016; Lumb et al. 2011a).

The most challenging part of developing the WQI was accurately representing the water quality status in terms of thresholds or water quality targets—some of which are regulatory, while some may be deemed arbitrary as they may not necessarily indicate the ecological threshold of a contaminant. Because ecological thresholds of contaminants generally differ by location, regulatory thresholds may not be the most accurate for assessing water quality status. In this study, nitrate-nitrite, or more commonly nitrate-N data, were removed from the assessment due to the indication that the associated water quality status was at least fair based on the 10 mg/L threshold set for drinking water through the Clean Water Act. However, this threshold may be too high, considering that nitrogen could affect the prevalence and toxicity of harmful algal blooms (Gobler et al. 2016; Anderson et al. 2002), implying a need for more stringent environmental thresholds. Further research is necessary to better determine more appropriate environmental thresholds that may be incorporated in subindex equation development.

An important consideration pertaining to the water quality index is the use of concentration levels of key determinants rather than their loads. Surface water quality guidelines are developed based on concentrations considering the need to protect aquatic life and agricultural uses, as well as recreation and aesthetics (El-Sadek et al. 2005; Koltun 2012). For Lake Erie (USEPA 2017), spring seasonal thresholds were developed both in terms of loads and flow-weighted mean concentrations (FWMC). The FWMC are a form of flow-adjusted concentrations. Concentrations are useful measures on which to base water quality assessments, as they play a critical role in biological productivity (Cahn and Hartz 2010). Loads are typically related to the accumulation of mass or volume, and/or gauging the effect of Best Management Practices (BMPs) on reducing pollutant delivery (Koltun 2012). Both measures are important when assessing the effects of contaminants in a watershed; the choice of data depends on the framework of the study.

The subindex computations for this study were based on statistical distributions. Because the overall index was based upon the distribution of the data, one concern is that the WQI may change as new data are added, depending on their impact on the statistical distribution. To prevent this from occurring and from favoring poor performing streams in the “failing” portion of the subindex, a “maximum” threshold could be created such that anything surpassing this would receive a value of 0. In the case of TP and SRP, for example, the “maximum” threshold for the data could be concentrations associated with hypereutrophic states for streams. However, most of the science and literature around hypereutrophic water bodies is for lakes; the concentrations that could be found in literature would be for lakes and not for streams.

Another option to consider is the possibility of bounding a “failing” standard using an exponential decay. This would standardize the bottom portion of the subindex equation and not favor lower-performing systems due to the distribution of the data being more concentrated in the “failing region.” This would keep the subindices from reaching zero and shift the focus to how quickly the decay would occur and how this could be assessed based on site-specific data. Though the ranking did not differ as greatly for the “passing” data, it is possible to set an optimal value which gets ranked at 100. As developed, the WQI in this study does not rank anything higher than 89 based on the assumption that there are very few water sources that would qualify as pristine.
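To make the exponential-decay option more concrete, the sketch below shows one hypothetical way such a bound could be written. Nothing in it comes from the study: the function name, the scaling of the exceedance by the threshold, and the decay rate `k` are all illustrative assumptions, and `k` would need to be calibrated against site-specific data as noted above.

```python
import math


def failing_subindex_decay(c, threshold, k=1.0, top=39.0):
    """Hypothetical rating for a concentration c above its threshold.

    The rating starts at top just above the threshold and decays
    exponentially with the relative exceedance (c - threshold) / threshold,
    so it approaches but never reaches zero. k is an assumed decay rate.
    """
    if c <= threshold:
        raise ValueError("decay form applies only to failing concentrations")
    return top * math.exp(-k * (c - threshold) / threshold)
```

With `k = 1`, a concentration at twice the threshold would be rated about 14. Because the curve depends only on the threshold and `k`, the rating of a given concentration would no longer shift as new observations change Pmax.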

The WQI that was developed in this study has great potential throughout the water quality decision-making process and offers several functionalities. Because varying concentrations of contaminants can indicate different water quality status, the WQI provides the ability to standardize how determinants affect the water quality status. Furthermore, WQIs can provide a method with which to efficiently communicate water quality concerns based on the common ranking system proposed. Finally, WQI can be used for predictive purposes based on short- and long-term trend analysis to establish overall water quality status and to pinpoint key determinants that need to be addressed in management initiatives.

5 Conclusion

This study was aimed at creating criteria-based and flexible subindex equations for the UMWQI, using the WLEB as a pilot site. The UMWQI was selected due to its flexibility and applicability beyond the WLEB. The subindex equations were developed by incorporating water quality thresholds based primarily on criteria and targets for Indiana, Ohio, and Michigan, and using statistical distributions to assign specific subindex values. The overall WQI values were calculated by taking the geometric mean from the subindex values for the Soluble Reactive Phosphorus (SRP), Total Phosphorus (TP), Total Suspended Solids (TSS), and Water Temperature (T,°C). Subindex equations developed for the respective determinant were found suitable based on the results from the River Raisin, Tiffin River, Blanchard River, and Maumee River stations. Water quality indices provide a way to report and assess large amounts of water quality data in an efficient manner. They have been tested and verified for their usefulness and effectiveness. The UMWQI provides flexibility, so that it can be applied to different sites with water quality data. Though the WQIs provide a “snapshot” of the water quality status by summarizing large amounts of water quality data, by no means do they replace water quality and environmental data. This study was conducted considering WLEB tributaries and results might not be directly applicable elsewhere. The methodology as developed is, nonetheless, applicable to other sites with similar water quality challenges.