Introduction

Watershed health is impacted by a number of variables, including climate, soils, hydrology, geomorphology, and land use/land cover (LULC). Watershed health is often evaluated by considering stream characteristics, such as sediment load (Jones et al. 2001; Mano et al. 2009; Hazbavi and Sadeghi 2017), aquatic ecosystems (Tiner 2004; Rodgers et al. 2012; Herman and Nejadhashemi 2015), and water quality (Olsen et al. 2012; Luo et al. 2013; Kim and An 2015; Jabbar and Grote 2018). Methods to assess watershed health vary, and in recent decades, different watershed assessment models have been developed to evaluate the cumulative impacts of human activities on watershed health and the condition of aquatic systems. These models have focused on different parameters that affect watershed health, such as identifying the impact of land use and land cover changes (Bateni et al. 2013; Calijuri et al. 2015; Deshmukh and Singh 2016; Peraza-Castro et al. 2018), climate change (Johnson et al. 2012; Fan and Shibata 2015; Neupane and Kumar 2015), and susceptibility to hydrologic alterations (Pyron and Neumann 2008; Marcarelli et al. 2010). Among the variety of approaches, statistical analyses and hydrological modeling have been widely performed because of the recent availability of large data sets and flexibility offered by these techniques.

In the current research, we propose a new approach for assessing watershed vulnerability to contamination based on spatial analysis using the geographic information system (GIS) and analytic hierarchy process (AHP) techniques. Due to its simplicity, the proposed method can easily be used to evaluate watershed vulnerability, with only a small amount of input information required and without field or lab work, which minimizes cost and time commitments. This procedure depends on six basic factors, which represent watershed characteristics, and it is designed to identify vulnerable zones. The proposed factors were land use/land cover, soil type, average annual precipitation, slope, depth to groundwater, and bedrock type. Using this approach to identify the vulnerable zones within river basins can improve decision-making for professionals in the area of environmental planning and management. In this research, we compare the vulnerability assessment from this method to water quality measurements from field data and water quality estimates from hydrological models.

The ability of hydrological models to simulate and predict real phenomena has increased considerably in recent years. Some of the models are based on simple empirical relationships with robust algorithms, while others use physically based governing equations that are solved numerically. With improvements in computational power and data availability, the number of empirical parameters and physically based functions used in many models has also grown, which can make calibration more difficult (Arnold et al. 2015).

The Soil and Water Assessment Tool (SWAT) is an effective model developed to assess hydrological processes, pollution problems, and environmental issues worldwide. It has been extensively used to investigate water quality and nonpoint source pollution problems and to predict the impact of changes in land management practices for a range of scales and environmental conditions (Behera and Panda 2006; Gassman et al. 2007; Zhu and Li 2014). This model is particularly useful for predicting future watershed health, especially in ungauged basins. The SWAT model is increasingly being applied to predict sediment yield (Xu et al. 2009; Liu et al. 2015), nutrient loadings (Hanson et al. 2017; Malagó et al. 2017), fecal coliform concentrations (Cho et al. 2012; Bai et al. 2017), and pesticide transport (Luo and Zhang 2009; Bannwarth et al. 2014; Boithias et al. 2014). SWAT also efficiently simulates hydrological processes (e.g., Im et al. 2007; Hoang et al. 2014). Im et al. (2007) modeled the Polecat Creek Watershed in Virginia and were able to simulate streamflow and sediment yields using both the SWAT and the Hydrological Simulation Program-Fortran (HSPF) models. Similarly, Hoang et al. (2014) found that SWAT provided highly accurate predictions of streamflow at both daily and monthly time steps, but that the nitrate flux simulations were highly accurate only at monthly time steps. When compared with the DAISY-MIKE SHE (DMS) model, Hoang et al. (2014) found that the SWAT-simulated streamflow and nitrate fluxes were identical to DMS ranges during high-flow periods but were moderately low during low-flow periods. In this research, SWAT was used to simulate total dissolved solids and nitrate concentrations in selected watersheds, as these parameters are useful for assessing watershed vulnerability.

Materials and methods

A case study in the Eagle Creek Watershed

The Eagle Creek Watershed (ECW) lies in Central Indiana, in the northern section of the Upper White River Watershed, within the Mississippi River Basin (Fig. 1). With a drainage area of about 459 km2, the ECW includes 10 sub-watersheds, which range in size from 26.9 to 70.7 km2. The ECW’s three major branches (i.e., School Branch, Fishback Creek, Eagle Creek Branch) flow into the Eagle Creek Reservoir. Indianapolis depends on the Eagle Creek Reservoir as one of its primary drinking water sources. Eight major tributaries (i.e., Dixon Branch, Finley Creek, Kreager Ditch, Mounts Run, Jackson Run, Woodruff Branch, Little Eagle Branch, Long Branch) feed these branches. The three primary branches have the following flow distributions: (1) Eagle Creek, with an average flow of approximately 2.85 m3/s, contributes 79% of the reservoir’s water; (2) Fishback Creek, with an average flow of 1.1 m3/s, contributes 14% of the reservoir’s water; and (3) School Branch, with an average flow of 0.5 m3/s, contributes 7% of the reservoir’s water (Tedesco et al. 2005).

Fig. 1 Location map of the study area in Indiana showing Eagle Creek Watershed

At 56%, agriculture is the chief land use in the Eagle Creek Watershed, with urban land use at 38%, mainly in the southeastern section. Most of the remaining land is either forested or grassland. In cooler times of the year, the area receives storms of long duration and moderate intensity, but precipitation is delivered in short, high-intensity storms during late spring and summer. The ECW receives an average annual precipitation of 1050 mm. February records the least rainfall, averaging 59.7 mm, whereas May records the most rainfall, averaging 115.5 mm. The ECW has a generally flat topography, with slopes of less than 3%. Agricultural areas are flatter, with steeper slopes observed near streams and rivers. In the upper part of the watershed, the soil is thin loess over loamy glacial till, which is deep and poorly drained. However, in the watershed’s northwest section, soils range from poorly to well drained. In addition, in the areas downstream, soils are generally deep and well drained to slightly poorly drained, and they form a thin, silty layer over the underlying glacial till (Hall 1999). In the extreme northeastern section of the ECW, the bedrock is mainly brown, fine-grained dolomite to dolomitic limestone. In contrast, in the southwest section, brown sandy dolomite to sandy dolomitic limestone and gray, shaley fossiliferous limestone predominate. Brownish-black, carbon-rich shale, greenish-gray shale, and small amounts of dolomite and dolomitic quartz sandstone characterize the southern part of the ECW (Shaver et al. 1986; Gray et al. 1987).

Data acquisition and processing

Thematic maps of the study area were generated based on remote sensing data. A 30-m resolution digital elevation model (DEM) of the topography was used to investigate key watershed characteristics, including topographic variability and slope. To calculate watershed characteristics (e.g., drainage networks, hydrologic units, catchment areas, and related features, including rivers and streams), the National Hydrography Dataset (NHD) and Watershed Boundary Dataset (WBD), both managed by the United States Geological Survey (USGS), were applied (USGS 2016). This study relied on the National Land Cover Database 2011 (Homer et al. 2015), with its 15 land use/land cover (LULC) classifications (Fig. 2a). In our analysis, some classifications were pooled so as to reduce the number of variables and to create more meaningful LULC categories. Categories that had been termed “developed” were combined to form one “urban” category, while all categories previously considered “forest” also became one group, as did all “wetland” categories (Fig. 2b). The data were analyzed using ArcGIS version 10.4.1, which also provided the averages of each parameter for every sub-watershed. To obtain the average annual precipitation raster for the period 1961–1990, the Parameter-elevation Regressions on Independent Slopes Model (PRISM) was used (Daly 1996).
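The pooling of LULC classes described above can be sketched as a simple lookup. This is only an illustration: the numeric codes follow the standard NLCD legend, but the text does not list which codes were merged, so the mapping `NLCD_TO_GROUP` and the helper `reclassify` are assumptions, and any further merging needed to reach the final set of categories is not reproduced here.

```python
# Hypothetical pooling of NLCD 2011 class codes, following the description
# in the text: all "developed" classes become "urban", all "forest" classes
# become "forest", and all "wetland" classes become "wetland".
NLCD_TO_GROUP = {
    21: "urban",    # Developed, Open Space
    22: "urban",    # Developed, Low Intensity
    23: "urban",    # Developed, Medium Intensity
    24: "urban",    # Developed, High Intensity
    41: "forest",   # Deciduous Forest
    42: "forest",   # Evergreen Forest
    43: "forest",   # Mixed Forest
    90: "wetland",  # Woody Wetlands
    95: "wetland",  # Emergent Herbaceous Wetlands
}


def reclassify(codes):
    """Map NLCD codes to pooled labels; unpooled codes keep a generic label."""
    return [NLCD_TO_GROUP.get(code, f"nlcd_{code}") for code in codes]


# Example: a short run of raster cell values.
print(reclassify([21, 24, 41, 82, 95]))
```

In a GIS workflow the same lookup would typically be applied to the whole raster with a reclassification tool rather than cell by cell.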

Fig. 2 Land use categories (a) before reclassification and (b) after reclassification and aggregation into eight categories

Methodology of watershed susceptibility assessment

Watershed susceptibility assessment factors

In this study, a decision hierarchy was employed to assign a relative weight to each factor that affects the watershed’s susceptibility, which involved two steps. First, categories were created using six factors judged to be significant: land use, soil type, precipitation, slope, depth to groundwater, and bedrock type (Fig. 3). Second, 46 sub-categories were created in order to assess the watershed’s health. This study synthesized the judgment of experts and literature reviews in this field (Blanchard and Lerch 2000; Eimers et al. 2000; Tran et al. 2004; Lopez et al. 2008; Jun et al. 2011; Furniss et al. 2013) with other required and available data about the study area to arrive at each factor, which was then divided into classes or sub-categories. Next, a suitability rating value was given to each sub-category. Factors were weighted between 0 and 1: factors with low weights have little impact on water quality, whereas factors with high weights have a large impact. Similarly, sub-categories were rated from 1 to 10, with 1 indicating a negligible impact on water quality and high scores indicating a very high impact.

Fig. 3 Thematic maps of the layers before rating for (a) soil type, (b) average annual precipitation, (c) slope (%), (d) depth to groundwater, and (e) bedrock type

Land use/land cover

The LULC can affect surface water quality as either point or nonpoint source (NPS) pollution, making the LULC one of the primary factors affecting water quality, and, therefore, watershed health (Brainwood et al. 2004; Carey et al. 2011). NPS pollution in surface water, especially increases in nitrogen (N) and phosphorus (P), is usually correlated with agricultural use (Heathwaite and Johnes 1996; Ma et al. 2011). Similarly, urban lands can produce great effects on surface water quality because they contain substantial amounts of point and nonpoint source contaminants (Wilson and Weng 2010). Contamination from nutrients, organic matter, and bacteria often results from the waste generated by city wastewater treatment plants as well as from a variety of anthropogenic sources (Chang et al. 2010). Based on their impact on watershed health, for this study, the LULC was separated into eight categories. Agricultural land uses with the highest impact were rated “10,” while land use classified as “water” received the lowest rating or “1” (Table 1).

Table 1 The relative weights and rating scores of the factors and sub-criteria used for watershed susceptibility assessment

Precipitation

Precipitation and increasing pollution levels in surface water are usually assumed to be directly related. For example, surface runoff of pollutants increases with rapid precipitation and can degrade the water quality of rivers and streams (Göbel et al. 2007; Kim et al. 2007). The high correlation of precipitation with watershed health results from the impact of rainfall magnitude and intensity on sediment and nutrient loading. Thus, in this study, precipitation was classified into 10 groups, with the highest amount of annual precipitation (> 75 in.) corresponding to a value of “10,” while the lowest precipitation was given a value of “1.”

Slope

When rapid precipitation combines with slopes, it can greatly affect surface water quality (El Kateb et al. 2013; Meierdiercks et al. 2017). A steep slope can increase the flow rate of a water body, which causes soil erosion and sedimentation, such that many types of pollutants (e.g., nutrients, pathogens, and pesticides) can be carried to nearby rivers (Bracken and Croke 2007). The amount of total suspended solids increases as eroded soil particles are transported to rivers, negatively affecting the water quality. Additionally, steep slopes have a considerable effect on the infiltration rate to groundwater, with Fox et al. (1997) finding that the amount of infiltration decreases as the slope increases. Therefore, this study formed six categories of slope to take into account their impact on the amount of rainfall that becomes overland flow, which eventually either reaches the surface water or recharges groundwater by infiltration. In these categories, gentle slopes were given a value of “1,” while steep slopes were given a value of “10.”

Depth to groundwater

A broad range of catchment processes connects surface water to groundwater (Brunner et al. 2009; Lehr et al. 2015). In addition, geological factors play a part in groundwater quality, predominantly through the chemical processes of water-rock interactions. Therefore, rock and soil components contribute significantly to water quality because these components change the physical and chemical properties of water (Singh et al. 2005; Varanka et al. 2014). Another category proposed by this study is depth to groundwater, which was classified into eight groups; shallow groundwater was given a rating of “10,” but deep groundwater was given a rating of “1.”

Bedrock type

Various types of geologic materials (e.g., sedimentary, igneous, and metamorphic rocks, as well as glacial deposits) have a large effect on water quality. Due to a variety of chemical processes, long-term geochemical interactions (i.e., between rock and water) can take place between groundwater and the aquifer (Adams et al. 2001). As water runs through fractured rock aquifers, especially those made of limestone or dolomite, the groundwater’s chemical properties can be considerably altered as some carbonate materials dissolve or evaporate. Thus, surface water quality can be altered when water is exchanged between rivers and shallow aquifers. This study classified rock types into six classes based on their resistance to weathering. Metamorphic and igneous rocks were given the low value “1” because these rocks are normally very hard and resist weathering, unlike limestone, which was given a high rating of “10” because it dissolves easily.

Soil type

Soluble materials and suspended sediments in water can also originate from soil. Overall, sediment is the water pollutant that has the greatest effect on the quality of surface water physically, chemically, and biologically. Larger, heavier sediments (e.g., pebbles and sand) tend to settle first, with smaller, lighter particles (e.g., silt and clay) remaining in suspension for a long time, thus contributing greatly to water turbidity. In addition, a variety of soluble salts in the soil can increase the electrical conductivity (EC) of water, thereby negatively affecting its quality (Chhabra 1996). For example, a high clay content increases the EC as a result of the high cation-exchange capacity (CEC) of clay minerals. In this study, soil types were grouped into eight soil classes relative to their impact on water quality. Sandy soil was given a low value (1), while clay loam was valued at “10,” because clay loam increases turbidity and salinity.

Analytical hierarchy process evaluation model

Multiple-criteria decision analysis (MCDA) problems include criteria that vary in importance, so the process determines the weights of these criteria to indicate the relative significance of each of the chosen criteria in relation to the result. Therefore, information about the relative importance of each criterion is needed prior to assigning weights. As shown in Fig. 4, the analytical hierarchy process (AHP) is one of the multi-criteria decision-making methods created by Saaty (1980). It uses pairwise comparisons that measure all factors (criteria and sub-criteria) matched to each other. This method is founded on three major principles: (1) pairwise comparison judgments, (2) decomposition, and (3) synthesis of priorities. Saaty (1980) recommended using a scale from 1 to 9 to compare the factors, with 1 signifying that the criteria are equally important, and 9 signifying that a particular criterion is highly significant. The consistency ratio (CR) is calculated to assess the differences between the pairwise comparisons and the reliability of the measured weights. To be accepted, the CR should be less than 0.1. If not, subjective judgments should be rethought prior to recalculating the weights (Saaty 2008).

Fig. 4 Flowchart of the procedures of AHP and watershed susceptibility assessment method

The decision-making problem for this study was structured as a matrix of dimensions m and n. The values aij (i = 1, 2, 3, …, m; j = 1, 2, 3, …, n) represent the performance values of the matrix for the ith row and jth column. The comparison values above the diagonal fill the upper triangle of the matrix, and the lower triangle holds the reciprocals of the upper-triangle values. In the pairwise comparison matrix A, the element aij indicates the relative importance of the ith alternative compared with the jth alternative with respect to the criterion under consideration, and aji is the reciprocal of aij, as shown in Eq. 1.

Below is an example of a decision matrix, which combines a typical comparison matrix for any problem with the relative importance of each criterion:

$$ A=\left(\begin{array}{cccc}1& {a}_{12}& \cdots & {a}_{1n}\\ {}1/{a}_{12}& 1& {a}_{23}& {a}_{2n}\\ {}\cdots & 1/{a}_{23}& \cdots & \cdots \\ {}1/{a}_{1n}& 1/{a}_{2n}& \cdots & 1\end{array}\right) $$
(1)

where aij (i, j = 1, 2, …, n) is the element in row i and column j of the matrix, and n is the number of alternatives.

The geometric mean in Eq. 2 was used to calculate the eigenvector element for each row:

$$ E{g}_i=\sqrt[n]{a_{i1}\times {a}_{i2}\times {a}_{i3}\times \cdots \times {a}_{in}} $$
(2)

where Egi represents the eigenvector element for row i, and n represents the number of elements in row i. The priority vector (pri) was found by normalizing these values so that they sum to 1, which yields numerical, comparable weights, using Eq. 3:

$$ p{r}_i=E{g}_i/\left(\sum \limits_{k=1}^nE{g}_k\right) $$
(3)

Lambda max (λmax) was evaluated based on the summation of the result of multiplying each element in the priority vector with the sum of the column of the reciprocal matrix:

$$ {\lambda}_{\mathrm{max}}=\sum \limits_{j=1}^n\left({W}_j\times \sum \limits_{i=1}^m{a}_{ij}\right) $$
(4)

where the inner sum over aij is the sum of the elements in column j of the matrix; Wj is the weight of criterion j taken from the priority vector of the decision matrix; and i = 1, 2, …, m and j = 1, 2, …, n.

The consistency ratio (CR) can be found using Eq. 5:

$$ CR=\frac{CI}{RI} $$
(5)

where CI is the consistency index:

$$ CI=\frac{\lambda_{\mathrm{max}}-n}{n-1} $$
(6)

where λmax represents the sum of the products between the sum of each column of the comparison matrix and the relative weights, and n is the size of the matrix.

RI signifies the random index, which describes the consistency of a randomly generated pairwise comparison matrix. In this study, weighted scores for each factor were obtained using the AHP model (Table 2), with a similar method employed to obtain rating values for each sub-criterion within the watershed susceptibility assessment.
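The chain from Eq. 2 to Eq. 6 can be illustrated with a short sketch. It is not the authors' implementation: the function name `ahp_weights` and the 3 × 3 example matrix are hypothetical (the values of Table 2 are not reproduced here), and the random index values are the commonly tabulated ones attributed to Saaty.

```python
import math

# Commonly tabulated random index (RI) values for matrix sizes 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Return (weights, lambda_max, CR) for a square pairwise comparison matrix."""
    n = len(matrix)
    # Eq. 2: geometric mean of each row.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    # Eq. 3: normalize so that the priority vector sums to 1.
    total = sum(geo_means)
    weights = [g / total for g in geo_means]
    # Eq. 4: sum over columns of (weight of column j) x (sum of column j).
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    lambda_max = sum(w * c for w, c in zip(weights, col_sums))
    # Eqs. 5 and 6: consistency index and ratio. Matrices of size 1 or 2
    # are always consistent, so CR is reported as 0 for them.
    if n < 3:
        return weights, lambda_max, 0.0
    ci = (lambda_max - n) / (n - 1)
    return weights, lambda_max, ci / RANDOM_INDEX[n]


# Hypothetical 3 x 3 comparison matrix (reciprocal by construction).
example = [[1.0, 3.0, 5.0],
           [1.0 / 3.0, 1.0, 2.0],
           [1.0 / 5.0, 1.0 / 2.0, 1.0]]
weights, lambda_max, cr = ahp_weights(example)
print(weights, lambda_max, cr)
```

A returned CR below 0.1 would pass the acceptance threshold stated earlier; otherwise the pairwise judgments would be revisited before the weights are used.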

Table 2 A pairwise comparison matrix developed for assessing the relative importance of the criteria for watershed susceptibility assessment

Watershed susceptibility values in the study area were calculated using weighted overlay analysis:

$$ WS=\sum \limits_{j=1}^n{W}_j\times {C}_{ij} $$
(7)

where WS represents the watershed susceptibility for area i, Wj represents the relative importance weight of criterion j, Cij represents the rating value of area i under criterion j, and n represents the total number of criteria.
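As a minimal sketch of Eq. 7 for a single map cell, the weighted sum can be written as follows. The factor names, weights, and ratings below are placeholders chosen for illustration only; they are not the values of Table 1.

```python
def watershed_susceptibility(weights, ratings):
    """Eq. 7: sum over criteria of weight W_j times rating C_ij for one area."""
    if set(weights) != set(ratings):
        raise ValueError("weights and ratings must cover the same criteria")
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical factor weights (summing to 1) and 1-10 ratings for one cell.
example_weights = {"lulc": 0.35, "soil": 0.20, "precipitation": 0.15,
                   "slope": 0.15, "groundwater_depth": 0.10, "bedrock": 0.05}
example_ratings = {"lulc": 10, "soil": 10, "precipitation": 4,
                   "slope": 1, "groundwater_depth": 6, "bedrock": 10}

print(watershed_susceptibility(example_weights, example_ratings))
```

In a raster workflow the same sum would be evaluated for every cell by overlaying the six rated layers, which is what a weighted overlay tool does.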

After the AHP analysis was completed, the maps needed for each layer were constructed as a shapefile (vector) or raster. Figure 5 shows the raster maps of the ratings of each of the six parameters considered: land use/land cover, soil type, average annual precipitation, slope, depth to groundwater, and bedrock type.

Fig. 5 Thematic maps of the layers after rating for (a) land use/land cover, (b) soil type, (c) average annual precipitation, (d) slope (%), (e) depth to groundwater, and (f) bedrock type

Hydrologic modeling using SWAT

SWAT is a hydrological model that quantifies the influence of changes in land management practices, land use and land cover changes, and climate change on water quality and hydrology for a range of scales with a daily time step (Neitsch et al. 2011). SWAT accounts for the local spatial heterogeneity of a study area by dividing a watershed into sub-basins according to topographic features. Each sub-basin has a specific geographic position in the watershed, and the sub-basins are spatially connected to each other. Sub-basins can in turn be divided into smaller hydrologic response units (HRUs), which consist of unique combinations of land cover, soil, and slope. Dividing sub-basins into multiple HRUs can provide higher accuracy and a better physical description. Ten sub-basins, with 3513 HRUs, were delineated within the ECW according to land use, soil type, and land slope. Applying SWAT requires specific data, such as weather, soil, land use, and topography.

The hydrological cycle can be simulated by the SWAT model using the water balance equation (Neitsch et al. 2011), as shown in Eq. 8.

$$ S{W}_t=S{W}_0+\sum \limits_{i=1}^{i=t}\left({P}_{day}-{Q}_{surf}-{E}_a-{W}_{seep}-{Q}_{gw}\right) $$
(8)

where SWt and SW0 are the final and initial soil water content (mm/d), respectively; t is the time (day); Pday is the amount of precipitation (mm/d); Qsurf is the surface runoff (mm/d); Ea is the evapotranspiration (mm/d); Wseep is the percolation (mm/d); and Qgw is the amount of return flow (mm/d).
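The bookkeeping in Eq. 8 amounts to a running daily sum, sketched below. The `DailyFluxes` container and the numbers in the example are hypothetical, and all quantities are treated as daily depths in mm; SWAT itself computes each flux internally rather than taking it as input.

```python
from typing import NamedTuple


class DailyFluxes(NamedTuple):
    """One day of water balance terms from Eq. 8, all as depths in mm."""
    precip: float   # P_day
    q_surf: float   # Q_surf, surface runoff
    et: float       # E_a, evapotranspiration
    w_seep: float   # W_seep, percolation
    q_gw: float     # Q_gw, return flow


def soil_water_content(sw0, days):
    """Eq. 8: initial soil water plus the accumulated daily net flux."""
    sw = sw0
    for day in days:
        sw += day.precip - day.q_surf - day.et - day.w_seep - day.q_gw
    return sw


# Hypothetical two-day record: one wet day followed by one dry day.
record = [DailyFluxes(10.0, 2.0, 3.0, 1.0, 0.5),
          DailyFluxes(0.0, 0.0, 4.0, 0.5, 0.5)]
print(soil_water_content(100.0, record))
```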

Surface runoff in the SWAT can be calculated using the Soil Conservation Service (SCS) curve number (CN) method (USDA – SCS 1972):

$$ {Q}_{surf}=\frac{{\left({R}_{day}-0.2S\right)}^2}{\left({R}_{day}+0.8S\right)} $$
(9)

where Qsurf and Rday are surface runoff (mm) and rainfall depth (mm) for the day, respectively; and S is the retention parameter (mm). In the current study, the SWAT model was run for 9 years, from 2010 to 2018, including a 2-year warm-up period (2010–2011).
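Equation 9 can be evaluated with a few lines of code. The sketch below adds two details that the text above does not state and that are taken here as assumptions from the standard SCS curve number formulation: the retention parameter is derived from the curve number as S = 25.4(1000/CN − 10) in mm, and runoff is zero until rainfall exceeds the initial abstraction of 0.2S. The function names and the example curve number are illustrative.

```python
def retention_parameter(curve_number):
    """Retention parameter S in mm from a curve number (standard SCS form)."""
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in (0, 100]")
    return 25.4 * (1000.0 / curve_number - 10.0)


def scs_runoff(rain_mm, curve_number):
    """Eq. 9: daily surface runoff depth in mm for a given rainfall depth."""
    s = retention_parameter(curve_number)
    initial_abstraction = 0.2 * s
    # No runoff is generated until rainfall exceeds the initial abstraction.
    if rain_mm <= initial_abstraction:
        return 0.0
    return (rain_mm - initial_abstraction) ** 2 / (rain_mm + 0.8 * s)


# Hypothetical example: a 50 mm storm on land with CN = 75.
print(scs_runoff(50.0, 75.0))
```

Higher curve numbers give a smaller S and therefore more runoff for the same storm, which is how land use and soil differences between HRUs enter the runoff calculation.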

Sensitivity analysis

Sensitivity analysis was employed to determine if key parameters could be used to calibrate and validate the SWAT model (Zhang et al. 2009; Arnold et al. 2012). For this study, global sensitivity analysis was utilized in the SWAT-CUP 2012 version 5.1.6 (Abbaspour 2015). To identify the significance of the sensitivity of each parameter, some indices were used, such as t tests (Abbaspour et al. 2017).

Calibration and validation of the SWAT model

Calibration adjusts model parameters based on field data so that the simulated results consistently match the observations (Arnold et al. 2012). Validation is a procedure for testing the accuracy of the identified parameters by simulating the observed data with a dataset not used in the calibration process, without modifying the model’s parameters (Govender and Everson 2005; Vilaysane et al. 2015). In the current study, calibration was performed using 5 years (2012–2016) of monthly observed data obtained from the monitoring gauge station at Zionsville (USGS 03353200) for both discharge and nitrate loads, but 4 years of data (2013–2016) for sediment loads, due to the availability of each of these data types.

Calibration and validation procedures were executed in the SWAT-CUP using the sequential uncertainty fitting (SUFI-2) algorithm. The SUFI-2 is a semi-automated procedure for calibration and an uncertainty analysis algorithm (Schuol et al. 2008; Kundu et al. 2016). The SUFI-2 has been applied in many studies, such as by Setegn et al. (2008) in the Lake Tana Basin or Rai et al. (2018) in the Brahmani and Baitarani river deltas.

During calibration, the parameters were modified to minimize the difference between the observed data and the simulated results. Calibration was executed for the period from 2012 to 2016, using 26 parameters (Table 3) selected on the basis of the sensitivity analysis and a review of previous studies (Heathman et al. 2008; Pyron and Neumann 2008; Yen et al. 2014; Teshager et al. 2015; Jang et al. 2018). Among these, 15 parameters were considered to be more related to streamflow calibration, six parameters to sediment load calibration, and five parameters to nitrate load calibration. The validation procedure was performed for the period from 2017 to 2018.

Table 3 The SWAT parameters for calibration of streamflow, sediment load, and nitrate

To check the performance of the SWAT model, many indices can be employed. In the current research, the Nash-Sutcliffe (NS) coefficient was used for statistical evaluation. Nash-Sutcliffe efficiency (NSE) values range between −∞ and 1; NSE = 1 indicates a perfect match of the simulated output data to the observed data.

The coefficient of determination (R2) was also employed in assessing the accuracy of the model. Percent bias (PBIAS) measures the average tendency of the simulated data to be larger or smaller than the observations. The optimal value of PBIAS is 0, and low-magnitude values indicate better model simulations. A positive value indicates that the model underestimates, while a negative value indicates that the model overestimates.

The values of NSE, R2, and PBIAS were computed using Eqs. 10, 11, and 12, respectively (Moriasi et al. 2007).

$$ NSE=1-\left[\frac{\sum \limits_{i=1}^n{\left({Q}_i^m-{Q}_i^s\right)}^2}{\sum \limits_{i=1}^n{\left({Q}_i^m-{Q}_{mean}^m\right)}^2}\right] $$
(10)
$$ {R}^2=\frac{{\left[\sum \limits_{i=1}^n\left({Q}_i^m-{Q}_{mean}^m\right)\left({Q}_i^s-{Q}_{mean}^s\right)\right]}^2}{\sum \limits_{i=1}^n{\left({Q}_i^m-{Q}_{mean}^m\right)}^2\sum \limits_{i=1}^n{\left({Q}_i^s-{Q}_{mean}^s\right)}^2} $$
(11)
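$$ PBIAS=\frac{\sum \limits_{i=1}^n\left({Q}_i^m-{Q}_i^s\right)\times 100}{\sum \limits_{i=1}^n{Q}_i^m} $$
(12)

where the superscripts m and s denote measured (observed) and simulated values, respectively; Qi is the value at time step i; Qmean is the mean over the n time steps; and n is the number of observations.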
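The three statistics can be computed directly from paired series, as in the sketch below. The function names are illustrative, and PBIAS is written in the Moriasi et al. (2007) form, whose optimum is 0 and whose positive values indicate underestimation, as described in the text. The sample series are invented for the example.

```python
def nse(obs, sim):
    """Eq. 10: Nash-Sutcliffe efficiency (1 is a perfect match)."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance  # undefined if obs is constant


def r_squared(obs, sim):
    """Eq. 11: squared linear correlation between observed and simulated."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)  # undefined if a series is constant


def pbias(obs, sim):
    """Eq. 12: percent bias; positive values mean the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


# Invented monthly series in which the model is 10% low in every month.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [0.9, 1.8, 2.7, 3.6]
print(nse(observed, simulated), r_squared(observed, simulated),
      pbias(observed, simulated))
```

The example shows why the statistics are reported together: a simulation that is uniformly 10% low keeps R2 at 1 because the series remain perfectly correlated, while NSE and PBIAS both register the bias.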

The simulated results from the SWAT model were compared with the observed data on a monthly basis. For the calibration period from 2012 to 2016 (Fig. 6a), the model performed well in the flow simulation, with values of 0.78, 0.73, and 14.4 for R2, NSE, and PBIAS, respectively. For the validation period, the streamflow statistics, R2 (0.76), NSE (0.72), and PBIAS (10.4), were slightly lower than the calibration results. In the linear regression of simulated against observed flows, the values of R2 and NSE (for both the calibration and validation periods) exceeded 0.70, i.e., 70% of the maximum possible value (Fig. 7a), which is statistically acceptable.

Fig. 6 Comparing the results of the simulated and observed monthly data at Zionsville (USGS 03353200) for (a) discharge for the calibration period (2012–2016) and validation period (2017–2018), (b) suspended sediment for the calibration period (2013–2016) and validation period (2017–2018), and (c) nitrate load for the calibration period (2012–2016) and validation period (2017–2018)

Fig. 7 Regression relationship between the monthly observed and simulated data for (a) streamflow, (b) total suspended solids (TSS), and (c) nitrate loads

When calibrating the monthly sediment production from 2013 to 2016, the SWAT model showed a slight underestimation of sediment production during the rainy season. The monthly total suspended solids (TSS) simulated by the model showed a lower R2 of 0.67, with an NSE of 0.64 and a PBIAS of 16.4, which indicates a weaker correspondence between the observed and calculated values than for streamflow. Figure 6b indicates that the model underestimated the materials in suspension during the rainy season in most years. The validation procedure revealed that the coefficient of determination fell slightly to 0.65 and NSE to 0.62, while PBIAS rose to 22.6 (Fig. 7b), which indicates a lower predictive capacity of the SWAT model during the validation period. This lower correlation between the observed and simulated sediments is possibly associated with changes in the vegetation cover. As illustrated in Fig. 6c, the statistical analysis of the calibration of nitrate loads from 2012 to 2016 showed a good fit, with values of 0.74, 0.69, and 18.3 for R2, NSE, and PBIAS, respectively. In the validation results, R2 fell to 0.70 and NSE to 0.63, while PBIAS rose to 23.4 (Fig. 7c).

To assess the reliability of the proposed technique, the SWAT model was applied. For this study, the water quality parameters simulated and predicted with the SWAT model (e.g., TSS and nitrate) were selected based on the availability of the data needed. Both methods produced good results for predicting these water quality loads, which is essential for validating the suggested method.

Results and discussion

This study uses a watershed susceptibility assessment tool that allows for the calculation of a single vulnerability index value for the watershed area being investigated, using simple features that are weighted relative to their influence on surface water pollution. Based on the index, the vulnerability to pollution can be determined: watershed vulnerability categories are as follows—extremely high (70–100), high (50–70), moderate (30–50), low (10–30), and very low (0–10).
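The category thresholds listed above can be expressed as a small lookup, sketched here. The stated ranges share their boundary values (for example, 70 appears in both "extremely high" and "high"), so this sketch assumes that each boundary belongs to the higher category; that choice, like the function name, is an assumption rather than something stated in the text.

```python
def vulnerability_class(index):
    """Map a 0-100 vulnerability index to the categories listed in the text.

    Assumption: a value on a shared boundary is assigned to the higher class.
    """
    if not 0 <= index <= 100:
        raise ValueError("index must be between 0 and 100")
    if index >= 70:
        return "extremely high"
    if index >= 50:
        return "high"
    if index >= 30:
        return "moderate"
    if index >= 10:
        return "low"
    return "very low"


for value in (5, 10, 49.9, 50, 75):
    print(value, vulnerability_class(value))
```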

After evaluating each watershed for its vulnerability, maps were generated that displayed the relative vulnerabilities of each sub-watershed. The differences in vulnerability to pollution between the sub-watersheds in the Eagle Creek Watershed can be seen in Fig. 8. The upper portion of the watershed (e.g., Lion Creek and Finley Creek sub-watersheds) was predicted to have a very high vulnerability to potential contaminants, as were the Dixon Branch, Mounts Run, and Jackson Run sub-watersheds. Thus, about 37.6 km2 (8%) of the total area of the ECW was considered to be very highly vulnerable to contamination, with 284.5 km2 (57%) having a high vulnerability. The greatest area of vulnerability to contamination lies in the north and center of the study area, which is composed primarily of agricultural land (85% of the total area within the northern sub-watershed). In the ECW, the area of low vulnerability is 73.8 km2 (14%), while 7.3 km2 (1%) has a very low vulnerability.

Fig. 8 Watershed vulnerability distribution map of the Eagle Creek Watershed

The areas predicted to have very high vulnerability are primarily agricultural, so this high vulnerability is to some degree the result of agricultural runoff. Another relevant factor might be the soil type. The most widespread type of soil near the drainage channels in the northern portion of the Eagle Creek Watershed is silty clay loam. In this segment of the study area, the steepest slopes occur in proximity to riverbanks. Thus, the slope can raise the surface runoff rate as well as the rate of soil erosion, which increases the amount of sediments and pollutants deposited in neighboring streams (Tedesco et al. 2005). Additionally, according to Walter et al. (2017), the bedrock (in this case, limestone), which is near the surface in northern watersheds, can also contribute to declining water quality. In the southern part of the study area, the vulnerability of the watersheds ranged from moderate to low, especially in the nearby portions of the sub-watersheds bordering School Branch, Eagle Creek at Grande Avenue, and Little Creek at 30th Street.

Both the TSS and nitrate loads followed a similar increasing trend whether they were assessed using the SWAT model or this study's proposed method. For sediment load, the comparison of the two methods indicated high total sediment loads in the central and northern portions of the ECW (Fig. 9a). The high concentration of suspended solids in the central and upper parts of the basin suggests that the capacity for erosion and transport was highest in these areas, where a large amount of sediment is carried by streamflow and eventually deposited before reaching the lower part of the basin. Sediment production increased on agricultural land because the loss of natural forest and shrub vegetation reduced the protection that this vegetation provides for soils, leaving them more vulnerable to erosive processes (Bakker et al. 2008; Lenhart et al. 2011).

Fig. 9 Spatial distribution maps of the ECW showing loads of (a) TSS and (b) nitrate

Likewise, the difference in land use between the upper and lower parts of the ECW had a significant effect on the nitrate loads simulated by both the SWAT model and the proposed method. Both methods estimated high nitrate loads in the central and upper parts of the ECW. This occurred because agriculture is the major type of land use there, representing up to 80% of the total land, which reflects the impact of agricultural activities on surface water quality (Schilling and Spooner 2006; Laurent and Ruelland 2011). Driscoll et al. (2003) found that rivers within watersheds in New York and New England received a significant proportion (from 6 to 45%) of their total nitrogen (N) from agricultural runoff. As shown in Fig. 9b, the nitrate load in the sub-watersheds ranged from 75 to nearly 30,000 kg/month. The northern part of the ECW had greater nitrate loads than the sub-watersheds in the southern extent of the watershed. Therefore, both types of modeling results confirmed that the high potential nitrate loads in the ECW are primarily associated with agricultural activities, such as fertilizer input and manure application. Hence, the evaluation of the predictive reliability of the watershed vulnerability assessment method indicated that the proposed approach is suitable as a decision-making tool for predicting watershed health.

Conclusions

In this research, the primary parameters affecting watershed vulnerability were identified based on the AHP technique. The vulnerability evaluation of each watershed was used to create maps showing the relative vulnerabilities of the basins. This method showed a significant difference among the basins of the ECW in their vulnerability to pollution. The basins in the upper portion of the study area were classified as likely to have very high vulnerability to potential contaminants. Similarly, the basins in the central part were identified as highly vulnerable to contamination based on their average vulnerability value. Low and very low vulnerability were observed only in the southern portion of the ECW. To test the reliability of the proposed approach, the SWAT model was used. In this study, parameters such as total suspended solids (TSS) and nitrate were used to calibrate and validate the SWAT model. The monthly TSS simulated by the SWAT model showed a comparatively weak fit, with an R² of 0.67 and an NSE of 0.64, indicating a weaker correspondence between the observed and simulated values than for nitrate. For the nitrate load modeling results, statistical analysis of the calibration for the period from 2012 to 2016 showed a good fit, with values of 0.74 and 0.69 for R² and NSE, respectively. Hence, these values are statistically acceptable for predicting the water quality status of the ECW. Both methods produced good results for predicting water quality. Hence, the evaluation of the predictive reliability of the watershed vulnerability assessment method indicated that the proposed approach is suitable as a decision-making tool for predicting watershed health.
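
The two goodness-of-fit statistics reported above can be illustrated with a minimal Python sketch. It assumes that R² is the squared Pearson correlation between observed and simulated monthly loads and that NSE is the Nash-Sutcliffe efficiency in its standard form; the text does not state which formulation was used. The function names and the sample values are hypothetical and are not data from this study.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - residual / variance


def r_squared(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and simulated values (assumed form of R²)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    if var_obs == 0 or var_sim == 0:
        raise ValueError("both series must have non-zero variance")
    return cov ** 2 / (var_obs * var_sim)


if __name__ == "__main__":
    # Made-up monthly loads for illustration only; NOT measurements from the ECW.
    observed_load = [12.0, 30.5, 18.2, 44.1, 25.0, 9.8]
    simulated_load = [15.1, 26.0, 20.3, 38.7, 29.4, 12.2]
    print(f"NSE = {nash_sutcliffe(observed_load, simulated_load):.2f}")
    print(f"R2  = {r_squared(observed_load, simulated_load):.2f}")
```

An NSE of 1 indicates a perfect match, and an NSE of 0 means the model predicts no better than the mean of the observations. Because R² in this form is insensitive to systematic bias, a simulation can have a high R² and still a low NSE, which is one reason the two statistics are reported together.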