1 Introduction

With significant advances in data processing technology and the development of various auto-calibration algorithms, the technical challenges of calibrating high-dimensional and computationally intensive models have been well addressed in recent years (Duan et al. 1992; Tolson and Shoemaker 2007; Vrugt et al. 2009; Yen 2012). One shortcoming of traditional calibration processes is the assumption that model uncertainty is attributed to parameter errors (Ajami et al. 2007), and uncertainty analysis has become unavoidable in hydrologic modeling research (Balin et al., 2010). Uncertainty in hydrologic models comes from many sources, such as model structure, parameters, and measured data (climatic, flow, and water quality) (Salamon and Feyen, 2009; Balin et al., 2010; Yen et al., 2014d, e). Incomplete knowledge of natural processes and inadequate mathematical and statistical techniques often lead to model structural uncertainty and parameter uncertainty. Random or systematic errors present in forcing data, along with poor initial conditions, also introduce uncertainty (Salamon and Feyen, 2009). Though often neglected in modeling studies, measurement uncertainty should be identified because it is as important as the other sources of error (Yen et al., 2014d). Failure to consider one or more sources of uncertainty may bias the results and thus lead to incorrect conclusions in a watershed modeling study (McMillan et al., 2011).

In a number of studies, measurement errors associated with rainfall data have been suggested as an important source of uncertainty in hydrologic and water quality models (Balin et al., 2010; McMillan et al., 2011). Rainfall is generally measured with a tipping bucket at a point weather station. Actual rainfall varies greatly in time and space, and this variability is often aggregated into an areal average in the calibration process (Balin et al., 2010). The aggregation generally reduces how well the measured rainfall represents the spatial pattern over the entire catchment. Other causes of uncertainty in rainfall data include systematic and random errors, wind effects and evaporation losses, and mechanical limitations (McMillan et al., 2011). The uncertainty in rainfall data has been shown to have critical impacts on hydrologic model predictions, as watershed models typically use weather input as the primary driver at runtime. The challenge is that it is difficult to identify other sources of uncertainty without a clear understanding of the uncertainty in rainfall data (McMillan et al., 2011).

In previous work, input uncertainty was explicitly incorporated into watershed simulation models by implementing latent variables during model calibration (Kavetski et al. 2002; Ajami et al. 2007). Input data (e.g. precipitation) are multiplied by noise factors that are randomly generated from a normal distribution with mean θ and variance σ². In the first latent variable application (Kavetski et al. 2002), θ is fixed at 1 and σ² is varied from 10⁻⁵ to 10⁻³. In another application by Ajami et al. (2007), θ is allowed to vary between 0.9 and 1.1 while the range for σ² remains the same as in Kavetski et al. (2002). The use of latent variables has improved the quality of calibration, with model predictions that match observed data more closely. However, the sensitivity of the results to the latent variables has not been explored comprehensively, because only the default latent variable ranges were implemented in previous work. In addition, the ranges of the latent variables may have a significant impact on model predictions and on the corresponding uncertainty analysis.

The goal of this study is to evaluate the significance of input uncertainty in precipitation data in modeling hydrologic and water quality processes at the watershed scale. The Soil and Water Assessment Tool (SWAT) (Arnold et al. 2012) is implemented as the watershed simulation model on the Arroyo Colorado watershed (ACW), a lowland agricultural watershed in Texas, USA. The basic concept of the uncertainty in weather data using latent variables proposed by Ajami et al. (2007) is implemented as a reference model. Then, the significance of uncertainty in rainfall data other than parametric uncertainty is investigated. Specifically, the following objectives are defined: (i) To quantify how much improvement in SWAT calibration can be achieved by introducing certain ranges of latent variables into observed precipitation data; and (ii) To explore if the predictive uncertainty can be reduced by including input uncertainty.

2 Materials and Methods

2.1 The SWAT Model

The Soil and Water Assessment Tool (SWAT) (Arnold et al., 2012) is a physically based, spatially distributed, watershed-scale simulation model developed by the USDA-ARS to evaluate the impact of land management and climate change on water quantity and quality (Gassman et al., 2007; Arnold et al., 2012). Major components of the model include hydrology, weather, erosion, soil temperature, crop growth, nutrients, pesticides, and agricultural management. SWAT can predict changes in hydrology and in sediment, nutrient, pesticide, dissolved oxygen, bacteria, and algae loadings under different management conditions in large ungauged basins. SWAT has been successfully applied to model water quality issues including sediments, nutrients, and pesticides in watersheds (Rocha et al., 2013; Al-Mukhtar et al., 2014; Tessema et al., 2014; Yen et al., 2014c,d). In addition, modifications of SWAT have been developed to serve various purposes (Yen et al., 2014a).

2.2 Study Area

As shown in Fig. 1, the Arroyo Colorado watershed (ACW) is located in southern Texas along the border between the USA and Mexico. The watershed (1,692 km²) consists largely of agricultural land that is irrigated during dry seasons from the Arroyo Colorado River through a network of canals, ditches, and pipes operated by a system of irrigation districts to achieve the desired crop yields. In addition, the watershed is extensively urbanized along the main stem of the Arroyo Colorado River, particularly in the western and central parts of the basin, which include the cities of Mission, McAllen, Pharr, Donna, Weslaco, Mercedes, Harlingen, and San Benito.

Fig. 1 Location of the Arroyo Colorado Watershed

The predominant land use type in the ACW is agriculture (54 %, of which 99 % is cropland and 1 % other agricultural uses), followed by range land (18.5 %) and urban areas (12.5 %). The major cultivated crops include grain sorghum, cotton, sugar cane, and citrus, as well as some vegetables and fruit. The major soil series within the watershed are Harlingen, Hidalgo, Mercedes, Raymondville, Rio Grande, and Willacy (USDA-SCS 1972). Soils in the watershed are mostly clays, fine loams, and clay loams (clay: 20.4 %, fine loam: 13.2 %, clay loam: 7.4 %, fine silt: 6.3 %, silty clay: 4.0 %, fine clay: 4.0 %), with soil depths ranging between 1,600 and 2,000 mm. The watershed has a semi-arid climate, with annual rainfall ranging from about 530 to 680 mm, generally increasing from west to east, and an average annual temperature of 22.7 °C, with mean monthly temperatures ranging from 14.5 °C in January to 28.9 °C in July.

2.3 Input Data

Input data for the SWAT model of the Arroyo Colorado watershed comprised three datasets: a digital elevation model (DEM) with 30-meter resolution, downloaded from the National Elevation Dataset of the U.S. Geological Survey (USGS) (Gesch et al., 2009) (accessible online at http://ned.usgs.gov/; last accessed on September 18, 2013); a land use map created from remote sensing data and field surveys to represent land cover conditions for 2004–2007; and soil properties for the soils in the watershed, acquired from the SSURGO soil database of the USDA-NRCS. In the SWAT model, the watershed was divided into 17 subbasins, which were further subdivided into 475 Hydrological Response Units (HRUs) based on combinations of land use, soil, and slope.

Daily weather data, including precipitation and minimum and maximum air temperature, collected at three stations over four years (2000–2003) were used in the model (see Fig. 1). The weather data were obtained from the Texas State Climatologist Office located at Texas A&M University at College Station (COOPID 419588 near Weslaco, COOPID 415836 near Mercedes, COOPID 413943 near Harlingen). The International Boundary and Water Commission provided the streamflow data for two stations, one near Llano Grande at FM 1015 south of Weslaco and the other near US 77 in southwest Harlingen. Moreover, the Arroyo Colorado Basin has 21 permitted point source discharges, of which 16 are municipal, three are industrial, and two are shrimp farms. The discharge permit limits of the municipal plants range from 0.4 to 10 million gallons per day. The shrimp farms discharge infrequently (Rains and Miranda, 2002).

Water quality data from a limited number of grab samples were obtained for suspended sediment (SS), nitrogen (NH4+-N, NO3-N, and TN), and total phosphorus (TP). The grab samples were converted to time series of loads with the LOAD ESTimator (LOADEST) developed by the USGS (Runkel et al. 2004). The resulting monthly average water quality time series, with its 95 % confidence interval, was used as observation data for calibration of the SWAT model. Details of the ACW can also be found in Seo et al. (2014).

2.4 Incorporation of Input Uncertainty

Input data include forcing inputs such as precipitation, temperature, and land use types, which are essential drivers of the simulation processes in watershed models. Few studies incorporate input uncertainty explicitly during watershed model calibration (Ajami et al., 2007). Two recently proposed approaches, the Bayesian total error analysis (BATEA) (Kavetski et al., 2002) and the integrated Bayesian uncertainty estimator (IBUNE) (Ajami et al., 2007), apply Bayes' theorem to evaluate the uncertainty introduced by input data. Taking precipitation as an example, the incorporation of input uncertainty in calibration is expressed in Equation (1).

$$ {R}_i^{adjusted}=k\times {R}_i^{observed} $$
(1)

where R_i^adjusted and R_i^observed are the adjusted and the observed precipitation depths at time step i, and k is a normally distributed random multiplier whose mean θ and variance σ² are defined as latent variables. In BATEA, θ is assumed to be 1 and σ² must be predefined for the precipitation data at every time step (σ²_1, σ²_2, …, σ²_T, where T is the number of time steps). The total number of σ² values therefore grows with the length of the simulation, which may cause dimensionality problems in calibration. In IBUNE, the problem is resolved by treating θ as a parameter and applying the same pair of θ and σ² throughout a given model evaluation. The number of latent variables is thereby reduced to two regardless of the length of the simulation.
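A minimal sketch of Equation (1) in Python may help make the role of the two latent variables concrete. It assumes that one multiplier k is drawn independently for every time step from a normal distribution with mean θ and variance σ², and it clips negative multipliers to zero so that rainfall stays non-negative; both choices, the function name `adjust_precipitation`, and the rainfall values are illustrative assumptions, not the implementation used in BATEA, IBUNE, or this study.

```python
import random


def adjust_precipitation(observed, theta, sigma2, seed=None):
    """Apply Equation (1): R_adjusted = k * R_observed, with k ~ N(theta, sigma2).

    Assumptions (not from the paper): one k per time step, and negative
    multipliers are clipped to zero so rainfall remains non-negative.
    """
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5  # random.gauss expects a standard deviation
    adjusted = []
    for depth in observed:
        k = max(0.0, rng.gauss(theta, sigma))
        adjusted.append(k * depth)
    return adjusted


# Hypothetical daily rainfall depths in mm, for illustration only.
rain = [0.0, 12.4, 3.1, 0.0, 25.0]
print(adjust_precipitation(rain, theta=1.0, sigma2=1e-3, seed=42))
```

With θ below 1 the adjusted series is, on average, scaled down, while σ² controls how much the multiplier scatters from one time step to the next. Dry days stay dry because the error model is multiplicative.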

By default in the literature (Ajami et al. 2007), θ ranges from 0.9 to 1.1 and σ² ranges from 10⁻⁵ to 10⁻³. In this study, the ranges of θ and σ² are extended by 10 percent (θ ∈ [0.81, 1.21]; σ² ∈ [9 × 10⁻⁶, 1.1 × 10⁻³]) and by 20 percent (θ ∈ [0.73, 1.33]; σ² ∈ [8.1 × 10⁻⁶, 1.21 × 10⁻³]) to explore the sensitivity of the results to the choice of latent variable ranges.
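The extended bounds can be reproduced with a short calculation. The reported values are consistent with lowering each lower bound by 10 percent and raising each upper bound by 10 percent, applied once for the 10 percent case and twice in succession for the 20 percent case. This reading is inferred from the numbers themselves and is not stated as the procedure; the helper `widen` is hypothetical.

```python
def widen(bounds, fraction=0.10, times=1):
    """Lower the lower bound and raise the upper bound by `fraction`, `times` times."""
    lower, upper = bounds
    for _ in range(times):
        lower *= 1.0 - fraction
        upper *= 1.0 + fraction
    return lower, upper


theta_default = (0.9, 1.1)     # default range of the mean (Ajami et al. 2007)
sigma2_default = (1e-5, 1e-3)  # default range of the variance

for label, times in [("default", 0), ("10 % extension", 1), ("20 % extension", 2)]:
    t = widen(theta_default, times=times)
    s = widen(sigma2_default, times=times)
    print(f"{label}: theta in [{t[0]:.2f}, {t[1]:.2f}], "
          f"sigma2 in [{s[0]:.2e}, {s[1]:.2e}]")
```

Under this reading, the second extension is compounded on the first, so the θ bounds of the widest case are 0.9 × 0.9 × 0.9 ≈ 0.73 and 1.1 × 1.1 × 1.1 ≈ 1.33.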

2.5 Description of Case Scenarios

Input uncertainty may have a considerable impact on model predictions (Ajami et al., 2007), and calibration results can be improved by incorporating input uncertainty. However, previous studies applied only one set of latent variable ranges, and this practice has become established without solid scientific justification. Furthermore, it is still unknown whether the improvement in model performance gained by incorporating input uncertainty can be analyzed quantitatively. To investigate how altering the default ranges of latent variables in precipitation data affects model calibration, four scenarios are implemented in a case study. As summarized in Table 1, Scenario 01 is the basic calibration without input uncertainty in precipitation data; Scenario 02 uses the default ranges of latent variables (Ajami et al. 2007); Scenario 03 represents a 10 % increase from the default ranges; and Scenario 04 doubles the increase (20 %). Gradually relaxing the ranges of the latent variables makes it possible to examine the influence of the different sets of latent variables.

Table 1 Case scenarios and the associated abbreviations

2.6 Model Calibration

The SWAT model for the Arroyo Colorado watershed was calibrated for the period 2002 to 2003. Streamflow was calibrated at a daily time step, while water quality data (sediment and ammonium) were calibrated at a monthly time step, using the auto-calibration algorithm Dynamically Dimensioned Search (DDS) (Tolson and Shoemaker, 2007). DDS is a stochastic search method that applies Bayes' theorem. DDS has been shown to outperform many other optimization techniques, such as Shuffled Complex Evolution (SCE-UA) (Duan et al., 1992), DiffeRential Evolution Adaptive Metropolis (DREAM) (Vrugt et al., 2009), the Metropolis-Hastings algorithm (MHA) (Metropolis et al., 1953), the Gibbs sampling algorithm (GSA) (Geman and Geman, 1984), and Uniform Covering by Probabilistic Rejection (UCPR) (Klepper and Hendrix, 1994), in terms of computational efficiency and the ability to find relatively better objective function values (Yen et al. 2014b). The comparison is made on relative terms because it is not mathematically possible to guarantee the global optimum for highly nonlinear problems such as the calibration of complex watershed simulation models. Therefore, DDS is adopted as the parameter estimation algorithm in the Integrated Parameter Estimation and Uncertainty Analysis Tool (IPEAT) (Yen et al., 2014d) to conduct model calibration in this study. The framework of model calibration and uncertainty analysis incorporating input uncertainty is shown in Fig. 2. The model was not validated for a different time period because this study focused on evaluating the different ranges of latent variables and their relative performance rather than on matching the model output rigorously to the observed flow and water quality in the ACW. Furthermore, the record of observed data in the ACW was too short to be split into separate calibration and validation periods.
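For readers unfamiliar with DDS, the search loop can be sketched as follows. This is a simplified illustration based on the general description of the algorithm by Tolson and Shoemaker (2007), not the IPEAT implementation: the function name `dds_minimize`, the random initial point, and the toy objective are assumptions, and the neighborhood size r = 0.2 is the commonly cited default.

```python
import math
import random


def dds_minimize(objective, lower, upper, max_iter=1000, r=0.2, seed=0):
    """Greedy DDS-style search that minimizes `objective` within box bounds."""
    if max_iter < 2:
        raise ValueError("max_iter must be at least 2")
    rng = random.Random(seed)
    n = len(lower)
    best_x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    best_f = objective(best_x)
    for i in range(1, max_iter + 1):
        # Probability of perturbing each parameter shrinks as the search proceeds.
        p_select = 1.0 - math.log(i) / math.log(max_iter)
        dims = [j for j in range(n) if rng.random() < p_select]
        if not dims:
            dims = [rng.randrange(n)]  # always perturb at least one parameter
        candidate = list(best_x)
        for j in dims:
            span = upper[j] - lower[j]
            value = best_x[j] + r * span * rng.gauss(0.0, 1.0)
            # Reflect at the bounds; fall back to the bound if reflection overshoots.
            if value < lower[j]:
                value = lower[j] + (lower[j] - value)
                if value > upper[j]:
                    value = lower[j]
            elif value > upper[j]:
                value = upper[j] - (value - upper[j])
                if value < lower[j]:
                    value = upper[j]
            candidate[j] = value
        f = objective(candidate)
        if f <= best_f:  # greedy acceptance: keep only non-worsening moves
            best_x, best_f = candidate, f
    return best_x, best_f


# Toy objective (sum of squares) standing in for a watershed model run.
x, f = dds_minimize(lambda p: sum(v * v for v in p), [-5.0] * 3, [5.0] * 3, max_iter=500)
print(x, f)
```

Early iterations perturb most parameters and search globally; late iterations perturb only one or two and refine locally. In a latent variable calibration, θ and σ² would simply be appended to the parameter vector with their own bounds.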

Fig. 2 Framework of model calibration and uncertainty analysis incorporating input uncertainty

The outlet of the ACW is close to the Gulf of Mexico, and therefore any flow measured near the outlet would be affected by diurnal tidal fluctuations. To avoid this tidal effect, the SWAT model for the ACW was calibrated using the observed data from the gauge station located near Llano Grande at FM 1015 south of Weslaco. For calibrating the ACW SWAT model, daily streamflow data were available from 2002 to 2003, whereas water quality data for sediment and ammonium nitrogen included only several grab samples. The USGS Load Estimator (LOADEST) program (Runkel et al., 2004) was used to generate monthly data from the grab samples to calibrate sediment and ammonium. LOADEST used the maximum likelihood estimation (MLE) method to create the monthly water quality data. In all case scenarios, 31 parameters for the related processes were selected for calibrating flow and water quality. The parameters and their recommended ranges are listed in the Appendix.

The Nash-Sutcliffe efficiency coefficient (NSE) is the only objective function included in this study. It is one of the most commonly used statistical measures of model performance (ASCE 1993; Servat and Dezetter 1991) and ranges from −∞ to one. The statistic normalizes the sum of squared residuals between observed and simulated values by the sum of squared deviations of the observations from their mean. In Equation (2), y_i^Obs is the observed response at time step i; y_i^Sim is the simulated response at time step i; y_i^Mean is the mean of the observed responses, which takes the same value at every time step; and N is the total number of time steps. An NSE of one indicates a perfect match between observation and simulation, whereas a negative or small NSE indicates poor performance. Therefore, in the auto-calibration process, the objective function (OF) is defined so that it is minimized toward zero, which corresponds to a perfect match (i.e., NSE = 1).

$$ NSE=1-\frac{\sum_{i=1}^N{\left({y_i}^{Obs}-{y_i}^{Sim}\right)}^2}{\sum_{i=1}^N{\left({y_i}^{Obs}-{y_i}^{Mean}\right)}^2} $$
(2)
$$ OF={\sum}_{v=1}^V\left(1-NS{E}_v\right) $$
(3)

As shown in Equation (3), the objective function is calculated as the sum of 1-NSE for the output variables, where OF is the final objective function value; NSE v is the NSE value for output variable v; and V is the total number of output variables. In this case study, the output variables were the variables that were calibrated (e.g. streamflow, sediment, and ammonia).
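Equations (2) and (3) translate directly into a few lines of Python. The sketch below is illustrative only: the function names and the sample values are hypothetical, and each output variable is assumed to be supplied as a pair of equally long observed and simulated series.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency, Equation (2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def objective_function(series_by_variable):
    """Sum of (1 - NSE) over all calibrated output variables, Equation (3)."""
    return sum(1.0 - nse(obs, sim) for obs, sim in series_by_variable.values())


# Hypothetical observed/simulated pairs, for illustration only.
data = {
    "streamflow": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),  # perfect match, NSE = 1
    "sediment": ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]),    # no better than the mean, NSE = 0
}
print(objective_function(data))  # 0 + 1 = 1.0
```

Because every variable contributes 1 − NSE with equal weight, a gain in one variable can offset a loss of the same size in another without changing the objective function value.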

3 Results and Discussion

3.1 Comprehensive Comparisons

Latent variables have a direct impact on the convergence speed because simulation output is highly sensitive to weather input. This is reflected in the progressive improvement of the objective function with the number of iterations during the DDS optimization in all scenarios, as shown in Fig. 3. The convergence patterns are similar across scenarios, especially after 4,000 iterations, beyond which no significant improvement is achieved in any case. The best objective function values achieved in Scenarios 01, 02, and 03 do not differ significantly. However, Scenario 04 (the case with the largest ranges of latent variables) resulted in relatively poor performance. The performance statistics of the optimized results are summarized in Table 2. Among the four scenarios evaluated in this study, the best streamflow result is achieved with Scenario 02, in which the ranges of the latent variables (mean and variance of the normal distribution) are defined as 0.9–1.1 for the mean and 10⁻⁵–10⁻³ for the variance. The same scenario achieved the best result in predicting ammonia nitrogen as well, partly because ammonia load is highly influenced by streamflow (i.e., ammonia is dissolved in water). However, Scenario 02 also resulted in the worst performance statistic for the sediment prediction. With input uncertainty incorporated, the performance of SWAT in predicting streamflow and ammonia deteriorated as the ranges of latent variables increased. In contrast, the best results for sediment predictions improved as the ranges of latent variables increased. The statistical results of Scenario 01 were close to those of Scenario 03, which has a 10 % increase in the ranges of latent variables. In general, two of the three scenarios with input uncertainty incorporated (Scenarios 03 and 04) produced worse results than the baseline scenario with no latent variables (Scenario 01).
The result implies that the inclusion of input uncertainty does not always improve model performance. In other words, predictive uncertainty in a complex watershed model may not always benefit from the consideration of uncertainty in weather input, perhaps because of other significant sources of error that are not considered in the current study. Note that this result does not agree with the earlier study by Ajami et al. (2007), which reports that the inclusion of input uncertainty (precipitation) enhances the quality of calibration. A direct comparison between the current study and the previous work (Ajami et al., 2007) is difficult because the modeling practices are configured differently. However, Ajami et al. (2007) drew their conclusions from only one set of latent variables and a simple rainfall-runoff model, whereas the current, more comprehensive study uses the complex watershed-scale SWAT model and evaluates various ranges of latent variables. A number of uncertainty sources may be involved and thus influence model predictions (Yen et al., 2014d). Input uncertainty (precipitation) alone may not be the most significant source of error, and whether accounting for it improves model predictions may depend on the characteristics of the watershed, the quality of the data, and the modeler's knowledge of the watershed hydrology.

Fig. 3 Overall performance of objective function values versus model iterations in four scenarios

Table 2 Error statistics of output variables corresponding to the best results of four scenarios

3.2 Evaluation of Adjusted Precipitation Data and the Corresponding Latent Variables

The convergence of the latent variables θ and σ² is shown in Fig. 4 (a) and (b), respectively. The converged values of θ (Scenario 01: 0.86; Scenario 02: 0.78; Scenario 03: 0.77) decreased, while the converged values of σ² (Scenario 01: 6.4 × 10⁻⁴; Scenario 02: 8.6 × 10⁻⁴; Scenario 03: 1.1 × 10⁻³) increased, as the upper and lower bounds were widened. Consequently, the adjusted average annual precipitation decreased relative to the original data as the ranges of latent variables increased from zero to 20 % (Fig. 5). The percent decreases in the adjusted precipitation for Scenarios 02, 03, and 04 were 14.5 %, 21.6 %, and 25.6 %, respectively. It is evident that increasing the ranges of the latent variables reduced the precipitation data. The reduced precipitation produced contrasting results in the performance statistics for flow, sediment, and ammonia loads. As noted earlier, the objective function was formulated so that the NSE values for flow, sediment, and ammonia were equally weighted (Equation 3). Because the selection of latent variables in each scenario was directly tied to the result of the optimization, the deteriorating results for streamflow and ammonia were compensated by improving results for sediment. After testing various sets of latent variables, the NSE for sediment (0.8) improved by 0.02 in Scenario 04 with the 25.6 % decrease in rainfall. This could be the only optimum solution if the sediment NSE decreased significantly relative to the streamflow and ammonia performance in other attempts. Therefore, the result for the latent variables does not necessarily indicate that the observed precipitation is overestimated.

Fig. 4 Convergence processes of latent variables (Scenario 02, 03, and 04 are the cases with inclusion of latent variables during calibration): (a) Convergence of latent variable θ; (b) Convergence of latent variable σ²

Fig. 5 Annual precipitation in four scenarios of three gauge stations (precipitation data of Scenario 01 was not adjusted; precipitation data of Scenario 02, 03, and 04 are the adjusted precipitation)

3.3 Evaluation of Model Performance and Uncertainty Analysis

Predictive uncertainty may be quantitatively evaluated with the inclusion rate and the spread, as summarized in Table 3. The inclusion rate is the percentage of observed data points located within the 95 % confidence interval of the predicted outputs. The spread is the average width of the corresponding uncertainty band along the predicted time series. The units of spread for streamflow, sediment, and ammonia are cms (cubic meters per second), ton/ha (tons per hectare), and kg/ha (kilograms per hectare), respectively. According to Table 3, the predictive uncertainty of streamflow and ammonia is affected by the increasing ranges of latent variables, as seen in the variations of inclusion rate and spread. However, no substantial changes can be found for sediment predictions.
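The two measures defined above can be computed as in the following sketch, which assumes that the lower and upper limits of the 95 % band are already available for every time step (how the band is derived from the calibration runs is outside the scope of this example). The function names and the sample numbers are hypothetical.

```python
def inclusion_rate(observed, lower, upper):
    """Percentage of observations that fall inside the uncertainty band."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return 100.0 * inside / len(observed)


def spread(lower, upper):
    """Average width of the uncertainty band, in the units of the output variable."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Hypothetical streamflow observations and 95 % band limits (cms).
obs = [1.0, 2.0, 3.0, 4.0]
band_lower = [0.0, 2.5, 2.0, 3.0]
band_upper = [2.0, 3.0, 4.0, 3.5]

print(inclusion_rate(obs, band_lower, band_upper))  # 50.0
print(spread(band_lower, band_upper))               # 1.25
```

The two measures pull in opposite directions: a wide band captures more observations but says little, while a narrow band is informative only if the observations still fall inside it. They are therefore best read together.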

Table 3 Inclusion rate of observed streamflow, sediment, and ammonia within the 95 % confidence interval and the corresponding spread for the simulation period (2002–2003)

As depicted in Fig. 6 (a)–(d), the uncertainty band narrowed as wider ranges of input uncertainty were incorporated. Similarly, the estimated inclusion rate decreased as the ranges of latent variables increased. However, input uncertainty applied to precipitation did not have the same influence on sediment and ammonia predictions. For sediment and ammonia, neither the inclusion rate nor the spread was noticeably affected by the inclusion of input uncertainty in precipitation, as shown in Fig. 6 (e)–(h) and (i)–(l). The incorporation of input uncertainty in precipitation had a clear impact on streamflow predictions but comparatively less on sediment and ammonia.

Fig. 6 Time series of streamflow processes corresponding to four case scenarios (a) Scenario 01; (b) Scenario 02; (c) Scenario 03; (d) Scenario 04. Time series of sediment processes corresponding to four case scenarios (e) Scenario 01; (f) Scenario 02; (g) Scenario 03; (h) Scenario 04. Time series of ammonia processes corresponding to four case scenarios (i) Scenario 01; (j) Scenario 02; (k) Scenario 03; (l) Scenario 04

4 Conclusion

In this study, input uncertainty in precipitation data was explicitly incorporated during calibration. Results indicate that the influence of latent variables was reflected mostly in streamflow predictions and much less in sediment or ammonia results. In general, error statistics improved relative to the baseline (Scenario 01), where no input uncertainty was applied, only with the default ranges of latent variables (0.9–1.1). The performance in streamflow and ammonia predictions declined as the ranges of latent variables increased. In addition, a significant impact in the uncertainty analysis was found only in the streamflow responses. Increasing the ranges of latent variables had no noticeable effect on the predictive uncertainty of sediment and ammonia predictions.

The results are somewhat different from the findings of previous research (Ajami et al., 2007). Since the previous work was conducted with a simple rainfall-runoff model rather than a comprehensive watershed simulation model such as SWAT, with complex interactions among numerous physical and empirical equations, there could have been fewer confounding factors to consider in the uncertainty analysis. As demonstrated in this study, however, the calibration results were not significantly improved by the application of latent variables. Calibration results did not improve as the default ranges of latent variables suggested by Ajami et al. (2007) were increased (Scenarios 03 and 04). In general, the incorporation of latent variables in precipitation data may not yield a noticeable improvement in a sophisticated watershed simulation model. In addition, consideration of more sources of input uncertainty (e.g. daily temperature, solar radiation) may improve the overall quality of uncertainty analysis and calibration of watershed models.