1 Introduction

The SWAT model has undergone several code modifications to solve local problems and address specific conditions. For example, SWAT-G is a modified version of SWAT99.2 (Eckhardt et al. 2002) for application to low mountain range catchments. Sophocleous and Perkins (2000) linked SWAT with the groundwater model MODFLOW (McDonald and Harbaugh 1988) to produce a new model (SWATMOD) for the analysis of surface water and groundwater interactions. Kim et al. (2008) also combined the two models, producing SWAT-MODFLOW for low-flow simulation.

In the literature, many researchers have compared the performance of HSPF and SWAT in catchments with different characteristics, different scales of flow, and different sediment and nutrient loads (Nasr 2004). Yet no research has been conducted to link the SWAT model with HSPF in order to produce a better understanding of phosphorus losses from agricultural land.

Many years of programming effort have been invested in the development of the SWAT and HSPF codes. Moreover, the codes have been calibrated and validated in a wide variety of conditions because of their wide use, particularly in the USA. Thus, these packages can be regarded as stable and mature in the sense that most of the programming issues and bugs have been discovered and addressed. The present research is not intended to replace either of these codes but rather to develop a new combination of their best components. Moreover, the research concentrates on model development, coding and testing. Hence, any involvement in developing a new graphical user interface (GUI) would have distracted from the research objective. This project explores the hypothesis that a new phosphorus export model that combines the hydrological component of HSPF with the P-modelling component of SWAT might be better than either of these existing models separately.

The main objective of this study is to replace the water module of SWAT with the hydrological component of HSPF while maintaining the overall structure of the SWAT model, including its water routing, sediment and water quality components. The resulting model, called NCM, is evaluated in terms of its capability to predict flow and total P loads.

2 Study Catchments

Three Irish catchments, covering a range of climates and agricultural land uses typical of Ireland, were used to test the NCM and SWAT models. The Oona catchment has an area of 88 km2 and is located in Co. Tyrone in North-East Ireland; the Bawn catchment, which forms part of the Oona catchment, has an area of 5 km2; and the Dripsey catchment (15 km2) is located in Co. Cork in the south of Ireland (see Fig. 1). For the Oona catchment, the available rainfall, flow and phosphorus measurements cover the period 1/10/2001 to 31/12/2002; the separate dataset for the Bawn catchment covers the period 1/4/2006 to 1/4/2009; and the data for the Dripsey catchment cover the whole of 2002.

Fig. 1 Left: Location of Bawn Catchment within Oona catchment; Right: Dripsey Catchment (Ali 2010)

3 Methodology

In order to develop the NCM, several approaches were examined. The chosen approach was to use and improve an existing GIS-based package rather than to build a new package from scratch. Effectively, this means that the combined model is based on one of the existing packages, HSPF or SWAT.

3.1 Approach Adopted

After examining the structure and coding of both packages, the best approach was found to be integrating the relevant hydrological modules from HSPF into the existing GIS and GUI of SWAT. HSPF (version 12.0; Bicknell et al. 2001) is coded in a modular structure, so it was possible to extract the hydrological modules as independent FORTRAN subroutines. In this project, the interest was only in simulating agricultural land surfaces, so the work concentrated on that part of the HSPF code, in particular on section PWATER of module PERLND.

In SWAT (Arnold et al. 1998), the stream channel flow consists of surface/overland flow, lateral flow (if any), and baseflow. The time lag of the three components may differ, depending on catchment size and characteristics such as soil type and slope (the delay is negligible for small catchments and longer for larger catchments). In cases where the soil has cracks, the water moves downward quickly and then fills the available pore space from below. In HSPF, the output to the stream/channel consists of the sum of surface water flow, interflow and baseflow. To link the two models and facilitate the integration, the original flow modelling components in SWAT were disabled and new components, based on HSPF, were added. The relevant modifications were then made to the codes to link the corresponding variables (storages, flux parameters etc.) and the required input/output time series.

The NCM begins by starting SWAT to read the required input data. The PWATER module was implemented in the sub-program ‘simulate’, which contains the main computation of SWAT. Its output flow components (groundwater, interflow outflow and surface water) and the actual evapotranspiration were summed for each day and each HRU and passed to SWAT (see Fig. 2). In the NCM, the flow rate and its components from HSPF, as well as the storage fluxes and percolation from each layer, are used as input data for the subsequent computation of sediment and phosphorus processes.

Fig. 2 Schematic diagrams for the NCM

3.2 Dealing with Differences in the Spatial Representation of the Catchment

Within HSPF, the catchment is represented in terms of land segments and river reaches/reservoirs. There are two types of land segment: (i) pervious (with the capacity to allow enough infiltration to influence the water budget) and (ii) impervious. The PERLND module controls the modules simulating hydrological processes for pervious land segments. The main part of this is in the PWATER module, which implements the simulation of the water budget. SWAT, on the other hand, represents catchments in terms of hydrological response units (HRUs) and sub-catchments. The subdivision of the catchment allows the model to reflect differences in evapotranspiration for various crops and soils. Runoff is predicted separately for each HRU and routed to obtain the total runoff for the catchment. This increases accuracy and gives a much better physical description of the water balance.

For this project, the PWATER section of the HSPF hydrology module was modified to accept catchment descriptor data provided in the SWAT format, in which each HRU has an assigned set of initial parameters and catchment characteristics that are representative of its land use.

3.3 Exclusions/Modifications

The snow accumulation and melt components of HSPF have not been considered in this project as these are not important in Ireland. Also, HSPF has neither a tile flow component nor a plant growth component, so the effects of vegetation type, root growth, density, stage of maturity and soil moisture content are lumped into the parameter (LZETP) that controls actual ET from the lower zone storage. The simulation of water outflow from a field due to tile drains is lumped into the parameters (LZSN and UZSN) that control lower and upper zone storage (Singh et al. 2004). Two input time series are required by the HSPF PWATER section: potential evapotranspiration and precipitation. It calculates three main output time series, all contributing to streamflow: overland flow (surface runoff), interflow outflow and groundwater (baseflow). These three components are common to both SWAT and HSPF. PWATER also determines the actual evapotranspiration, as well as the storage fluxes and percolation for each individual layer, at the end of each time step and for each HRU.

3.4 Treatment of the Hydrological Processes

In NCM, the delineation of the catchment is done using the GIS interface for SWAT2005. The catchment is divided into multiple sub-catchments, which are then further subdivided into non-spatial hydrologic response units (HRUs) that consist of combinations of areas with homogeneous soil types, land use and management practices. Several internal storage elements represent the water budget of each HRU, e.g., interception storage, moisture in the soil profile, active groundwater storage and inactive groundwater storage. The soil profile is modelled as four storages, the upper 10 mm layer surface detention storage (SURS), interflow storage (IFWS), the upper zone storage (UZS) and lower zone storage (LZS). Table 1 compares these with the SWAT conceptualization. In HSPF, the total moisture storage in the pervious land segment (PERS) is the sum of the moisture in the storages listed in Table 1 and is calculated as:

$$ \mathrm{PERS} = \mathrm{CEPS} + \mathrm{SURS} + \mathrm{IFWS} + \mathrm{UZS} + \mathrm{LZS} + \mathrm{AGWS} $$
(1)
Table 1 Comparison of storage volumes that were used for the water balance in both SWAT and HSPF

3.5 Treatment of Actual Evapotranspiration (ET)

In HSPF, actual ET is calculated from the potential ET demand (an input to the model) and the amount of available water at the surface and in the soil. In SWAT, however, the actual ET is computed as the sum of actual evaporation from bare soil and from plants. The actual bare soil evaporation is estimated using an exponential function of soil depth and moisture content, and the plant evaporation is simulated as a linear function of potential ET, leaf area index and rooting depth, and can be limited by the soil moisture content. SWAT has three options to compute the potential ET, namely the Penman-Monteith (Allen et al. 1989), Hargreaves (Hargreaves and Samani 1985) and Priestley-Taylor (Priestley and Taylor 1972) methods. The model will also read in daily PET values if the user prefers to apply a different potential evapotranspiration method.

For the NCM, the three potential ET methods in SWAT were used and the corresponding time series were calculated, disaggregated to hourly time step records and supplied as input to the model. The method that gave the best flow fit was chosen. For the Oona catchment, the Priestley-Taylor equation was used, and for the Dripsey and Bawn catchments, the Hargreaves method.
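The text does not state how the daily potential ET series were disaggregated to hourly records. The following is a minimal sketch of one plausible scheme, assuming a half-sine distribution over daylight hours that preserves the daily total. The function name `disaggregate_daily_pet` and the fixed 06:00 to 18:00 daylight window are illustrative assumptions, not part of the NCM code.

```python
import math

def disaggregate_daily_pet(daily_pet_mm, sunrise=6.0, sunset=18.0):
    """Split one daily PET total (mm) into 24 hourly values (mm).

    Assumption: PET follows a half-sine curve between sunrise and sunset
    and is zero at night. The hourly values sum to the daily total.
    """
    if sunset <= sunrise:
        raise ValueError("sunset must be later than sunrise")
    weights = []
    for hour in range(24):
        midpoint = hour + 0.5  # centre of the hourly interval
        if sunrise <= midpoint <= sunset:
            phase = math.pi * (midpoint - sunrise) / (sunset - sunrise)
            weights.append(math.sin(phase))
        else:
            weights.append(0.0)
    total_weight = sum(weights)
    return [daily_pet_mm * w / total_weight for w in weights]

hourly = disaggregate_daily_pet(2.4)
```

Because the weights are normalised, the daily water balance is unchanged by the disaggregation; only the timing within the day is assumed.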

3.6 Data and Parameters (Initial and Final Values)

Both the SWAT and HSPF models use meteorological data (rainfall, evapotranspiration) together with soil and land use maps. HRUs were determined after selecting the threshold criteria for the percentage of soil within each land use. Once the number of HRUs was determined, the associated soil characteristics and land uses, with their default parameters, were identified and used as initial values for the common parameters and for the PWATER module. These parameters can then be optimised by varying them within a predefined acceptable range to find the best fit between modelled and observed outputs. Subroutine “changepar” was modified to include the PWATER parameters, which are varied between their lower and upper limits during automatic calibration, sensitivity and uncertainty analysis.
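The idea of varying parameters only within predefined limits can be sketched as follows. This is an illustration in Python, not the FORTRAN “changepar” subroutine itself; the function names and the numerical bounds are hypothetical placeholders and are not the ranges used in the study.

```python
def clamp_to_range(value, lower, upper):
    """Return value limited to the closed interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower limit exceeds upper limit")
    return max(lower, min(upper, value))

def apply_candidate(params, candidate, bounds):
    """Return a copy of params updated with a candidate parameter set.

    Each candidate value is clamped to its (lower, upper) limits so that
    the optimiser can never push a parameter outside its accepted range.
    """
    updated = dict(params)
    for name, value in candidate.items():
        lower, upper = bounds[name]
        updated[name] = clamp_to_range(value, lower, upper)
    return updated

# Hypothetical limits for two PWATER parameters (placeholders only).
bounds = {"UZSN": (0.1, 2.0), "IRC": (0.3, 0.85)}
initial = {"UZSN": 1.0, "IRC": 0.5}
trial = apply_candidate(initial, {"UZSN": 5.0, "IRC": 0.1}, bounds)
```

Returning a new dictionary rather than modifying `params` in place keeps the initial values available for later runs.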

3.7 Sensitivity, Calibration, and Uncertainty Analysis

Sensitivity analysis is the process of calculating the rate of change in model output with respect to changes in model inputs or parameters (Moriasi et al. 2007). It is extremely important as it identifies the most sensitive model parameters and input data series that can influence the calibration process. Accordingly, sensitivity analysis is a useful tool for the assessment of the input parameters with respect to their impact on model output not only for model development, but also for model validation and reduction of uncertainty (Eckhardt et al. 2002).

The SWAT2005 model has routines for automated sensitivity, calibration, and uncertainty analysis added by Van Griensven and Bauwens (2003). The calibration was completed using the Shuffled Complex Evolution (SCE-UA) algorithm (Duan et al. 1992). Uncertainty analysis for the model parameters was performed using the PARASOL method implemented in SWAT 2005, which considers the information from all simulation results and identifies their uncertainty bounds.

3.8 Implementation

When all of the above programming steps were completed, the original, modified and new routines were compiled into a new calculation module. The equivalent module was removed from the SWAT package and replaced with the new combined calculation module. As explained above, this allowed the use of the SWAT interface with the NCM and also made it easier to simulate the test catchments with both the original SWAT and the NCM through the same interface.

The main features of the NCM, compared with SWAT and HSPF are summarised as follows:

  • The surface flow simulation module in the NCM was derived from the HSPF model, PWATER module. Both models simulate the same three flow components for the same catchment delineation and the same parameter set.

  • There are some differences between the original HSPF PWATER module and the NCM. The latter has been adapted to run with hourly input data (rainfall and evapotranspiration) and calculation time steps. The snowmelt component was not implemented here.

  • Both the SWAT model and PWATER module were changed to work with hourly inputs in the PWATER module format and to pass on the output as daily aggregate values as required by the SWAT phosphorus modelling component.

  • The SWAT subroutine “changepar”, used in parameter optimisation, has been modified and HSPF PWATER hydrological parameters have been included in the optimization with pre-specified minimum and maximum ranges as constraints.

  • Any differences in the average daily flow given by the NCM and the original HSPF are due to: (i) using the SWAT channel flow routing component; (ii) applying a time lag to surface flow and interflow as is done in SWAT; and (iii) adding the baseflow components later as is done in SWAT. These make a small difference in the discharge simulation.
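The hourly-to-daily hand-over between the PWATER calculations and the SWAT phosphorus component described above can be sketched as below. This is a simplified illustration in Python, assuming that the hourly values are depths (e.g. mm) that can be summed; the function name and the component keys are hypothetical and do not correspond to identifiers in the NCM source code.

```python
def aggregate_hourly_to_daily(hourly_values, hours_per_day=24):
    """Sum an hourly series into daily totals.

    The series length must be a whole number of days.
    """
    if len(hourly_values) % hours_per_day != 0:
        raise ValueError("series does not contain a whole number of days")
    return [
        sum(hourly_values[start:start + hours_per_day])
        for start in range(0, len(hourly_values), hours_per_day)
    ]

def daily_components_for_hru(hourly_components):
    """Aggregate every hourly flow component of one HRU to daily totals."""
    return {
        name: aggregate_hourly_to_daily(series)
        for name, series in hourly_components.items()
    }

# Two days of hypothetical hourly output for a single HRU.
hourly = {
    "surface": [1.0] * 24 + [0.5] * 24,
    "interflow": [0.25] * 48,
    "baseflow": [0.125] * 48,
}
daily = daily_components_for_hru(hourly)
```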

3.9 Criteria

3.9.1 Bias

$$ \mathrm{Bias} = \frac{1}{n}\sum_{i=1}^{n}\left(xo_i - xm_i\right) = \left(\frac{1}{n}\sum_{i=1}^{n} xo_i\right) - \left(\frac{1}{n}\sum_{i=1}^{n} xm_i\right) $$

This criterion indicates the performance in relation to the water balance, i.e., getting the total amounts of water correct, and is appropriate for water resources management. Note that in the paper this is expressed as a percentage of the observed mean flow.

3.9.2 Mean Absolute Error (MAE)

$$ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|xo_i - xm_i\right| $$

By taking the absolute value of the differences, this criterion gives an overall assessment of the differences, without focussing particularly on flood or low flow conditions. Note that in the paper this is expressed as a percentage of the observed mean flow.

3.9.3 Root Mean Square Error (RMSE)

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(xo_i - xm_i\right)^2} $$

This criterion is widely used, particularly in fitting catchment models. It particularly penalises the larger deviations between model and measured values as the difference is squared. As the larger differences tend to occur in high flows (floods), this criterion has a tendency to favour getting the high flows right. Note, in the paper this is expressed as a percentage of the standard deviation of observed flow series.

3.9.4 Nash-Sutcliffe (NS)

$$ \mathrm{NS} = 1 - \frac{\sum_{i=1}^{n}\left(xo_i - xm_i\right)^2}{\sum_{i=1}^{n}\left(xo_i - \overline{xo}\right)^2} $$

The Nash-Sutcliffe criterion takes account of the existing variability in the measured data and measures the model performance in relation to this natural variability. The best possible value of this criterion is 1, when the model matches the data perfectly. A value of zero indicates the model is only as good as a constant value estimate equal to the mean. The NS can be negative when the model is worse than the mean value as a predictor.

3.9.5 Mathevet et al. (C2M)

$$ \mathrm{C2M} = \frac{1 - \dfrac{\sum_{i=1}^{n}\left(xo_i - xm_i\right)^2}{\sum_{i=1}^{n}\left(xo_i - \overline{xo}\right)^2}}{1 + \dfrac{\sum_{i=1}^{n}\left(xo_i - xm_i\right)^2}{\sum_{i=1}^{n}\left(xo_i - \overline{xo}\right)^2}} = \frac{\mathrm{NS}}{2 - \mathrm{NS}} $$

Mathevet et al. (2006) developed this criterion to avoid an undue influence of individual NS values calculated for a few problematic catchments when combining large numbers of data sets. It was applied to comparing hydrological models by Mockler et al. (2016). As with NS, the best value is 1.
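The five criteria of Section 3.9 can be sketched in code as follows, reading xo as the observed series, xm as the modelled series and n as the number of time steps (an interpretation of the symbols, which the formulas leave implicit). The function names are illustrative. The values are returned in the units of the data, without the percentage scaling mentioned in the notes above.

```python
import math

def bias(obs, sim):
    """Mean of (observed - modelled)."""
    return sum(o - s for o, s in zip(obs, sim)) / len(obs)

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variability = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variability

def c2m(obs, sim):
    """Bounded criterion of Mathevet et al., computed as NS / (2 - NS)."""
    ns = nash_sutcliffe(obs, sim)
    return ns / (2.0 - ns)

observed = [1.0, 2.0, 3.0, 4.0]
modelled = [2.0, 3.0, 4.0, 5.0]
scores = {
    "bias": bias(observed, modelled),
    "mae": mae(observed, modelled),
    "rmse": rmse(observed, modelled),
    "ns": nash_sutcliffe(observed, modelled),
    "c2m": c2m(observed, modelled),
}
```

Since NS lies in (−∞, 1], the transformation NS / (2 − NS) maps it to (−1, 1], which is why a few very poor catchments cannot dominate an average of C2M values.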

4 Results

This section applies the NCM and SWAT 2005 models to the test catchments (Oona, Bawn and Dripsey) and compares the results. Four parameters in the HSPF code (LZETP, CEPSC, NSUR and IRC) can vary seasonally, with different values for winter and summer months. In the calibration of the flow parameters in the PWATER module, some parameters were assigned a single value for the whole year in order to reduce the number of parameters being optimised. Table 2 shows the overall summary of the flow and total P simulation results, and compares the performance of the SWAT 2005, HSPF and NCM models. Tables 3, 4, 5 and 6 show the different criteria for calibration and validation for the Oona catchment.

Table 2 Overall Summary of the flow and total P results compared to previous results from Nasr and Bruen (2005)
Table 3 Oona Calibration Period (Flows)
Table 4 Oona Calibration Period (P-annual load)
Table 5 Oona Validation Period (Flows)
Table 6 Oona Validation Period (P-annual load)

Note that the new model is again substantially better than either SWAT or HSPF on all criteria except one. This is for the bias in which HSPF by itself is slightly better than either of the other models.

For the important validation period both HSPF and the new proposed model are substantially better than SWAT. However, HSPF is slightly better than the proposed model.

However, the most important result is that the proposed combined model is very much better at simulating the annual phosphorus load than either SWAT or HSPF for all criteria, and does particularly well at reducing bias in the estimates.

Figure 3 shows the comparison of the observed flow and the NCM and SWAT 2005 model simulated outputs for the Bawn Catchment. The NCM underestimates the low flow period while it reproduces most of the peaks adequately and better than SWAT 2005 except for the peak of 25 October 2006.

Fig. 3 Comparison between SWAT and NCM model performance in simulating flow for calibration period 1/4/2006–30/11/2007 in Bawn catchment

The calibration of the Dripsey catchment was done first for flow and thereafter for total P. The result was poor when both criteria were taken into consideration together (minimising the sum of squared errors and hence the global optimisation function). The model produced better results for total P (R2 = 0.71) than for flow (R2 = 0.57). The sensitivity analysis undertaken for the Dripsey catchment shows that UZSN and IRC are the most influential parameters for the flow objective function, as well as for total P outputs, ranking first and second globally, respectively. Further manual calibration to tune these parameters revealed that the R2 for flow can be improved from 0.57 to 0.73, which is comparable with what has been published for this catchment by Nasr and Bruen (2005), while maintaining the good result for total P (0.71).

4.1 Sensitivity Analysis

The sensitivity analysis was done on 23 of the parameters for flow and total phosphorus loads for the three catchments (Oona, Bawn and Dripsey), which differ in size, soil characteristics and hydrological response. Tables 7 and 8 provide details of the calibrated values, sensitivity range and uncertainty bounds in parameter estimates for flow and total P for the Oona catchment. The Oona has a flashy response (surface runoff is dominant), and this is reflected in the fact that the most sensitive parameters are UZSN, IRC and LZSN. These parameters are also the most sensitive for the Dripsey catchment, while the interflow and baseflow parameters are sensitive with ranks between 5 and 14 (see Tables 9 and 10). Figure 4 shows the flow parameters and their importance. For the Bawn catchment, the most sensitive parameters are DEEPFR, LZETP and KVARY.

Table 7 Details of the calibrated values, sensitivity range and uncertainty bounds in parameter estimates for flow for Oona Catchment
Table 8 Details of the calibrated values, sensitivity range and uncertainty bounds in parameter estimates for Total P for Oona Catchment
Table 9 Details of the calibrated values, sensitivity range and uncertainty bounds in parameter estimates for flow for Dripsey
Table 10 Details of the calibrated values, sensitivity range and uncertainty bounds in parameter estimates for Total P for Dripsey
Fig. 4 Comparison of the sensitive flow and total P parameters and their importance for NCM in Dripsey, Oona and Bawn catchments. * Higher values indicate more sensitive parameters

For SWAT 2005, the most sensitive parameter is CN2 for the Dripsey and Bawn catchments, and it ranks second after SURLAG in the Oona catchment (see Fig. 5). Comparing the sensitivity results for the Bawn with those obtained for the larger Oona catchment shows that parameter SURLAG has little influence on the Bawn, where it has “rank 5”, while it is the most sensitive parameter in the Oona catchment with “rank 1”, followed by CN2 with “rank 2”. This was expected since the time of concentration for the Bawn catchment is less than 1 day.

Fig. 5 Comparison of the sensitive flow and total P parameters and their importance for SWAT model in Dripsey, Oona and Bawn catchments

4.2 Uncertainty Analysis

SCE-UA produces a file of all the parameter sets investigated in the calibration process and the corresponding values of the objective function for both flow and total P. The uncertainty analysis establishes a threshold value of the objective function, which is used to distinguish between “acceptable” and “unacceptable” simulation performance. This is done by the PARASOL routine using the χ2 statistic, where the selected simulations correspond to the 97.5 % confidence region (CR). Zhaling et al. (2010) stated that the more observations fall inside this CR, the greater the contribution of parameter uncertainty to simulation uncertainty.
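The threshold step can be sketched as follows. The sketch assumes the form of the χ2 threshold commonly given for PARASOL, c = OF_best · (1 + χ2(P, q) / (N − P)), where OF_best is the best objective function value found, P the number of free parameters, N the number of observations and q the confidence level. This form, the function names and the convention of passing the χ2 quantile in as an argument are assumptions for illustration and have not been checked against the SWAT 2005 source code.

```python
def parasol_threshold(best_objective, chi2_quantile, n_observations, n_parameters):
    """Objective-function threshold separating acceptable from unacceptable runs.

    chi2_quantile is the chi-squared quantile for n_parameters degrees of
    freedom at the chosen confidence level; it is supplied by the caller.
    """
    degrees_of_freedom = n_observations - n_parameters
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than parameters")
    return best_objective * (1.0 + chi2_quantile / degrees_of_freedom)

def select_acceptable(runs, threshold):
    """Keep the parameter sets whose objective value does not exceed the threshold.

    runs is a list of (parameter_set, objective_value) pairs, with smaller
    objective values meaning a better fit (e.g. a sum of squared errors).
    """
    return [params for params, objective in runs if objective <= threshold]

# Illustrative numbers only.
threshold = parasol_threshold(100.0, 10.0, n_observations=110, n_parameters=10)
runs = [({"UZSN": 1.0}, 105.0), ({"UZSN": 2.0}, 120.0)]
acceptable = select_acceptable(runs, threshold)
```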

In the Bawn catchment, the PARASOL uncertainty band covers only 50 % of the measured flows and 45 % of the total P loads during the calibration period (April 2006–November 2007), and it performs poorly for extreme events, as shown in Fig. 6.

Fig. 6 Uncertainty for flow using PARASOL for calibration period in Bawn catchment (NCM)

For the validation period (1 October 2008–31 March 2009) the percentage of coverage decreases for flow and total P to 31 and 39 %, respectively.
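A coverage percentage of this kind can be computed as the share of observations that fall between the lower and upper uncertainty bounds at the same time step. The sketch below is illustrative; the function name and the sample numbers are hypothetical and are not data from the study.

```python
def coverage_percent(observed, lower_bound, upper_bound):
    """Percentage of observations lying within [lower, upper] at each time step."""
    if not (len(observed) == len(lower_bound) == len(upper_bound)):
        raise ValueError("all three series must have the same length")
    if not observed:
        raise ValueError("series must not be empty")
    inside = sum(
        1
        for value, low, high in zip(observed, lower_bound, upper_bound)
        if low <= value <= high
    )
    return 100.0 * inside / len(observed)

# Illustrative numbers only: two of the four observations are bracketed.
coverage = coverage_percent(
    observed=[1.0, 2.0, 3.0, 4.0],
    lower_bound=[0.0, 2.5, 2.0, 5.0],
    upper_bound=[2.0, 3.0, 4.0, 6.0],
)
```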

In the case of the Dripsey catchment, there is only one set of parameters that gives a good fit to both flow and total P. All the other parameter sets are good for either flow or total P but not both. For this reason, PARASOL does not have a range of simulations with which to generate uncertainty bounds. Here, the uncertainty bounds are instead generated from a statistical analysis of the whole simulation run (9803 simulations), for which the mean, maximum, minimum and standard deviation were determined. Figure 7 shows the flow uncertainty bounds given by the mean and maximum values of the simulated flows over the entire run, together with the observed flows.
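Deriving bounds from the statistics of all simulations can be sketched as below, computing the mean, minimum, maximum and standard deviation across runs at each time step. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
import statistics

def ensemble_bounds(simulations):
    """Per-time-step statistics across a set of simulated series.

    simulations is a list of runs, each a list of values of equal length.
    Returns one dict per time step with mean, min, max and std.
    """
    if not simulations:
        raise ValueError("at least one simulation is required")
    n_steps = len(simulations[0])
    if any(len(run) != n_steps for run in simulations):
        raise ValueError("all simulations must have the same length")
    summary = []
    for step_values in zip(*simulations):  # values of all runs at one time step
        summary.append({
            "mean": statistics.mean(step_values),
            "min": min(step_values),
            "max": max(step_values),
            "std": statistics.pstdev(step_values),  # population form (assumed)
        })
    return summary

# Three short illustrative runs with two time steps each.
bounds = ensemble_bounds([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```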

Fig. 7 Uncertainty for flow using mean and maximum simulations estimate in Dripsey catchment (NCM)

5 Discussion

NCM simulates the Total Phosphorus loads better than SWAT in the case of Oona for both calibration and validation periods, and for Dripsey calibration period. For the Bawn catchment, SWAT performs marginally better during calibration while the NCM is better in the validation period.

  • In the simulation of Total P loads care was taken to use the same fertilizer scenarios in SWAT and the NCM to allow a fair comparison.

  • SWAT 2005 has limitations in modelling phosphorus in groundwater. It has introduced a new parameter in the “gwnutr” subroutine for the concentration of P in groundwater and critically it is assumed to be constant throughout the simulation period. In catchments where baseflows are significant (e.g., Dripsey) good modelling of groundwater P is very important and this assumption may be a serious limitation.

  • Using multi-objective optimization methods for optimizing both flow and Total P loads together gave better simulated results than sequential optimization.

  • The NCM reproduces most of the measured Total P peaks very well and any underestimation coincides with an underestimation of the corresponding flow peaks by the hydrological component of the model.

  • The relationship between flows and Total P loads in different simulation runs in terms of their Nash-Sutcliffe model efficiency (R2) for the Bawn catchment shows that achieving good flow simulation results does not guarantee good phosphorus results.

  • The uncertainty analysis method used in this study (PARASOL) produces unrealistically small uncertainty bounds that generally cover less than 70 % of the observed values. For other methods (for instance, the 50 % rank of all the simulation runs, as done in the Oona catchment), the range was much better. Note that PARASOL uses only the good simulation results that gave R2 better than 0.7 and ignores all other simulations. In the multi-objective case, it uses the simulations that gave good results (weighted combination of R2) for both flow and Total P.

  • The most sensitive flow parameters for the NCM are UZSN (Upper zone nominal storage), IRC (Interflow recession constant), LZSN (Lower zone nominal storage), LZETP (Lower zone evapotranspiration parameter) and DEEPFR (fraction of flow that goes to Deep groundwater) in the three catchments.

6 Conclusions

A new combined, semi-distributed, dynamic model of phosphorus export from agricultural catchments was constructed, and applied and tested in three Irish catchments ranging in size from 5 to 88 km2. The overall performance of the NCM model, during both calibration and validation periods, shows that it performs well with R2 greater than 0.7 for daily average flow in all the three catchments. For discharge simulations, it performs better than SWAT and similarly to HSPF, as might be expected.

Specific points for individual catchments are given below:

  • Oona catchment: The NCM was tested with the data from the Oona catchment and produced a better flow simulation than SWAT alone when used with hourly time steps and similar results to HSPF. For total phosphorus loads, the NCM performed better than either HSPF or SWAT for calibration and for the longer validation period. However, it was not better than HSPF for the shorter validation period.

  • Bawn catchment: At 5 km2, this catchment is very small and forms part of Oona water. The NCM simulated the flow better than SWAT2005 alone in both calibration and validation periods. The total P loads simulated were relatively poor during calibration but improved in the validation period.

  • Dripsey catchment: The flow calibration gave similar results to Nasr and Bruen (2005), but the NCM is much better than both SWAT and HSPF for total P loads after adjusting the best parameter set with a few manual runs.