1 Introduction

In hydrology, modeling is an effective tool, and it is used extensively in soil and water management. The models permit fast processing and lower cost for evaluating management strategies and preventing undesirable outcomes (Thompson et al. 2004; Awchi 2014).

The three types of hydrological models are data-driven, conceptual, and physically-based models (Chow et al. 2005; Nyeko 2014). Conceptual models are further classified as lumped or semi-distributed models (Jajarmizadeh et al. 2012). Geethalakshmi et al. (2008) stated that a full understanding of catchment hydrology, especially for large-scale catchments, is unachievable via fully-distributed models because the required data are not accessible. In addition, the lumped model is not capable of considering various land uses and the variety of hydrological processes (Ghavidelfar et al. 2011). Hence, sub-dividing watersheds into smaller parcels of similar hydrologic units is a reasonable way to overcome the problems that arise with fully-distributed and lumped models. A hydrological model that can represent small individual units is the semi-distributed model (Wilby 1997).

In a semi-distributed model, the algorithms are simple but physically based, meaning that the hydrologic system is derived from theories and principles of physics. Spatial heterogeneity is considered by observing the physical characteristics of the catchment (Valeo and Moin 2001). The semi-distributed model predicts the average behaviour of a catchment based on several small homogeneous units, which are then aggregated for a few defined positions (Wilby 1997). Several semi-distributed models have been used successfully worldwide, e.g., TOPMODEL (Beven et al. 1995), Soil and Water Assessment Tool (SWAT) (Arnold et al. 1998) and the Global Hydrology Model (GHM) (Anderson and Kavvas et al. 2002). In this study, the TOPMODEL and SWAT models were compared for modelling streamflow. Reed et al. (2004) stated that TOPMODEL and SWAT were popular for the simulation of large catchments.

Concerning the comparison of hydrological models, Singh and Woolhiser (2002) reported that the World Meteorological Organization (WMO) had sponsored comparisons of watershed model studies (WMO 1975, 1986, 1992). Such models have also been compared for different topics, such as regional climate and flow analysis (Michaud and Sorooshian 1994; Bell and Moore 1998). TOPMODEL has been compared with several hydrological models, e.g., the fully-distributed model MIKE SHE (Yang et al. 2000) and the semi-distributed Xinanjiang model (Li and Zhang 2008; Peng et al. 2008). The comparisons showed that TOPMODEL provided good simulation performance. Several comparisons of data-driven models and SWAT have been published over the past decade (Demirel et al. 2009; Srivastava et al. 2006), including a comparison of SWAT and Hydrologic Simulation Program-Fortran (HSPF) (Im et al. 2007; Singh et al. 2004; Hua and Lian 2013). Several studies have investigated the capability of different hydrological models with SWAT (Nasr et al. 2007; Sommerlot et al. 2013). However, there is a need to compare SWAT with hydrological models based on fully-distributed or semi-distributed structures (El-Nasr et al. 2005; Borah et al. 2007; Shi et al. 2011).

Arid regions are becoming vulnerable around the world, and statistics show that the consumption of natural resources is proceeding at an alarming rate (Faramarzi et al. 2010). The main reasons are increased population, industrialization of rural catchments, and the impacts of global climate change. Therefore, new techniques, such as hydrological models, are important in planning future watersheds and managing water resources. There have been few academic investigations of southern Iran, where the climate is arid and semiarid. This area has an important role in the production of agricultural crops, but water is scarce. Obviously, sparse data and the limited accessibility of data are governing obstacles in the use of hydrological tools. Hence, there is a lack of recognition of the capability of hydrological tools and their importance in future applications and studies.

In this research, Roodan in southern Iran was chosen to evaluate streamflow modeling via hydrological models. Two semi-distributed models, i.e., TOPMODEL and SWAT, were developed for Roodan. To the best of the author’s knowledge, no similar study has been performed previously in southern Iran. Consequently, the benefit of this study for the future will be the assessment of the applicability of surface flow modelling via SWAT and TOPMODEL in Iran. Also, the results of this study might be useful for similar regional climates. The objectives of this study are (i) the development of TOPMODEL for Roodan; (ii) the use of the SWAT model for Roodan; and (iii) a comparison of the results of the two models for predicting daily streamflow.

2 Material and Methods

2.1 Study Area

The study area, called Roodan, is located in southern Iran between the Hormozgan and Kerman Provinces. The area and perimeter of the watershed are 10,570 km² and 700 km, respectively. The mean elevation is 781 m and the average slope is approximately 20 % (Fig. 1). Roodan is mountainous in the north and east, with lowland in the central part. For the period of 1989 to 2007, the average annual precipitation and temperature were 215 mm and 25 °C, respectively. The heaviest precipitation occurs between October and March. The predominant soil type is a heterogeneous mix of clay, silt, and sand. The land cover in Roodan includes shrub land, shrub land mixed with grassland, and rock. The land uses are primarily irrigated agriculture, orchards, and urban areas. The hydro-meteorological data for this study covered the years 1989–2007; the models were calibrated with the 1989–2001 data and validated using the 2002–2007 data. Government organizations provided the hydrological data for this research (Ab Rah Saz Shargh 2009), and Jajarmizadeh (2013) found their quality acceptable for modelling.

Fig. 1 Location of Roodan and meteorological stations

2.2 Hydrologic Simulator TOPMODEL (IDL Version)

The topographic index (TI) is based on an analysis of the topography and digital elevation data of the catchment to predict the response to rainfall using the hydrological similarity theory (Suliman et al. 2014). Its distribution is given through ln(α/tan β), which was discussed in detail by Quinn et al. (1991), where α is the area drained through a square grid cell per unit length of contour and tan β is the local surface slope (see Beven and Kirkby 1979; Beven et al. 1995). The TI of the entire catchment area can be obtained from the digital elevation model (DEM). The assumptions used in TOPMODEL, as reported in Beven (1997), are (1) an exponential decline of transmissivity with depth or deficit; (2) approximation of the local hydraulic gradient by the local surface topographic slope (tan β); and (3) a quasi-steady-state condition for uniform recharge through the catchment. This simple, steady-state theory helps to develop a relationship between the TI and the local saturation of the soil for both the wet and dry conditions of the catchment, which can be used to predict non-linear runoff contributing areas (Romanowicz 1997). Romanowicz (1997) presented basic equations (Equations 1 and 2) that include the sensitive recession curve parameter (m) and the soil transmissivity (\( T_o \)). Equation 1 describes the subsurface drainage to streams, and Equation 2 illustrates the calculation of the runoff production areas for the catchment area (Beven et al. 1995).

$$ {Q}_b(t)={T}_o{e}^{-\overline{\lambda}}{e}^{-\overline{S}(t)/m} $$
(1)
$$ \varDelta {S}_i=\left({S}_i\kern0.30em -\kern0.30em \overline{S}(t)\right)=m\left(\overline{\lambda}\kern0.30em -\kern0.30em {\lambda}_i\right) $$
(2)

where \( Q_b \) is the subsurface flow; \( T_o \) is the saturated soil transmissivity; m is the recession curve parameter; \( \overline{S} \) is the average soil moisture deficit for the catchment; \( S_i \) is the saturation deficit at any point i in the catchment area; \( \varDelta S_i \) is the difference between the local deficit and the average deficit; \( \lambda_i \) is the local topographic index; and \( \overline{\lambda} \) is the average catchment topographic index. The initial flow, \( Q_o \), corresponds to the product \( T_o e^{-\overline{\lambda}} \) in Equation 1, i.e., the subsurface flow when \( \overline{S} \) is zero. A more detailed description of TOPMODEL was provided by Beven et al. (1994).
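As a minimal sketch of Equations 1 and 2, the following Python snippet evaluates the subsurface flow and the local deficits for a handful of topographic index values. The function names and every numerical value are hypothetical and chosen only for illustration; they are not the calibrated values for Roodan. The catchment-average index is taken here as a simple mean, which assumes that each index value represents an equal share of the catchment area.

```python
import math

def subsurface_flow(t_o, mean_ti, mean_deficit, m):
    """Equation 1: Q_b = T_o * exp(-mean_ti) * exp(-mean_deficit / m)."""
    return t_o * math.exp(-mean_ti) * math.exp(-mean_deficit / m)

def local_deficits(mean_deficit, mean_ti, ti_values, m):
    """Equation 2 rearranged: S_i = mean_deficit + m * (mean_ti - ti_i)."""
    return [mean_deficit + m * (mean_ti - ti) for ti in ti_values]

# Hypothetical inputs for illustration only.
ti_values = [6.0, 9.0, 12.0, 14.0]         # local topographic index values
mean_ti = sum(ti_values) / len(ti_values)  # simple mean (equal-area assumption)
m = 0.05                                   # recession curve parameter (m)
t_o = 5.0                                  # saturated transmissivity (assumed units)
mean_deficit = 0.10                        # average deficit (m)

q_b = subsurface_flow(t_o, mean_ti, mean_deficit, m)
deficits = local_deficits(mean_deficit, mean_ti, ti_values, m)

# Points whose local deficit is zero or negative are treated as saturated.
saturated = [s <= 0.0 for s in deficits]
```

In this toy case only the point with the highest index has a non-positive deficit, which illustrates how high index values mark the likely runoff contributing areas.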

The DEM and spatial entities, such as the locations of hydro-meteorological stations, were processed in GIS. Thiessen polygons were generated for the rain gauge stations to compute the average precipitation in the catchment. A DEM with 90-m resolution was prepared for Roodan from 1:25,000 topographic maps and used to derive the features required to set up TOPMODEL’s inputs, such as flow direction, flow accumulation, stream network, and drainage areas. The multiple direction algorithm (D8) was used to calculate the direction of flow in this study (Wolock and McCabe 1995). Based on that, the topographic index map of Roodan was generated (Fig. 2a). High topographic index values were found for the areas that contributed to high runoff. Figure 2b shows the histogram of the topographic index; the pixels with the most frequent index values, i.e., 12, 13, and 14, were found mainly in the north of the study area and around the channels.
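The Thiessen-polygon averaging of precipitation mentioned above amounts to an area-weighted mean of the gauge readings. The sketch below shows that calculation in Python; the function name, the three polygon areas, and the rainfall values are hypothetical (the areas are simply an illustrative split of the 10,570 km² watershed, not the actual polygons used in the study).

```python
def areal_precipitation(polygon_areas, gauge_rainfall):
    """Area-weighted mean rainfall over Thiessen polygons."""
    if len(polygon_areas) != len(gauge_rainfall):
        raise ValueError("one rainfall value is needed per polygon")
    total_area = sum(polygon_areas)
    weighted = sum(a * p for a, p in zip(polygon_areas, gauge_rainfall))
    return weighted / total_area

# Hypothetical polygon areas (km2) and daily rainfall (mm) for three gauges.
areas = [4000.0, 3500.0, 3070.0]
rain = [12.0, 4.0, 0.0]
mean_rain = areal_precipitation(areas, rain)
```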

Fig. 2 (a) Distribution of the topographic index, (b) Histogram of the topographic index and (c) HRU visualization in the SWAT model

The model was calibrated manually by adjusting the parameters until the shape of the simulated discharge fitted the observed discharge (see Suliman et al. 2014). The model’s parameters were taken at the highest efficiencies achieved; Table 1 provides their values. The recession curve parameter, m, and the saturated transmissivity, \( T_o \), were the most sensitive parameters, in agreement with Romanowicz (1997) and Beven and Kirkby (1979). The value of m usually ranges from 0.01 to 0.1 m (Beven 1997), and \( T_o \) depends on the catchment’s condition (Romanowicz 1997). The less-sensitive parameters were calibrated using the topographic and soil maps.

Table 1 Values of the calibrated parameters for TOPMODEL and SWAT

2.3 Hydrologic Simulator SWAT (Version 2009)

SWAT is a semi-distributed model designed to support alternative management decisions on water resources in small and large watersheds. SWAT was produced by the United States Department of Agriculture’s (USDA’s) Agricultural Research Service (ARS) and has been upgraded over a period of 30 years (Arnold et al. 2012). SWAT was developed for hydrologists as a secondary tool for evaluating the impact of management on water, sediment, and agricultural chemical concentrations in catchments. While the SWAT model is suitable for simulating long-term phenomena, it is not well suited to detailed, single-event evaluations, such as flood routing. Recently, SWAT was combined with GIS to provide a better understanding of catchment geometry and the scientific visualization of data and maps. SWAT is semi-distributed in the sense that a given watershed is divided into a number of subbasins, which are in turn composed of hydrologic response units (HRUs). In SWAT, HRUs are treated as lumped units within each subbasin, each with a unique combination of land cover, soil, and management. The water balance is then computed for each HRU as shown in Equation 3:

$$ S{W}_t=S{W}_0+{\displaystyle \sum_{i=1}^t\left({R}_i\kern0.30em -\kern0.30em {Q}_{surf,\kern0.28em i} - E{T}_i\kern0.30em - {W}_{seep} - {Q}_{gwrf,\kern0.28em i}\right)} $$
(3)

where \( SW_t \) = final soil water content (mm); t = time (days); \( SW_0 \) = initial soil water content (mm); \( R_i \) = precipitation on day i (mm); \( Q_{surf,i} \) = surface runoff on day i (mm); \( ET_i \) = evapotranspiration on day i (mm); \( W_{seep} \) = percolation on day i (mm); and \( Q_{gwrf,i} \) = groundwater return flow, or base flow, on day i (mm). A detailed description of the SWAT model and its equations is available in Neitsch et al. (2011).
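The bookkeeping in Equation 3 can be illustrated with a short Python sketch that accumulates the daily net balance on top of the initial soil water content. The function name and the three days of input values are hypothetical, and the sketch only reproduces the summation; it does not simulate any of the processes that SWAT uses to compute the individual terms.

```python
def soil_water_content(sw0, daily_terms):
    """Equation 3: add each day's net water balance to the initial content.

    daily_terms is an iterable of (R, Q_surf, ET, W_seep, Q_gw) tuples in mm.
    """
    sw = sw0
    for rain, q_surf, et, w_seep, q_gw in daily_terms:
        sw += rain - q_surf - et - w_seep - q_gw
    return sw

# Hypothetical daily terms (mm) for a three-day period.
days = [
    (10.0, 2.0, 3.0, 1.0, 0.5),  # wet day: net gain of 3.5 mm
    (0.0, 0.0, 2.5, 0.5, 0.5),   # dry day: net loss of 3.5 mm
    (5.0, 0.5, 3.0, 0.5, 0.5),   # light rain: net gain of 0.5 mm
]
sw_t = soil_water_content(50.0, days)
```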

The development of the SWAT model for Roodan included automatic watershed delineation, land use/soil/slope definition and HRU visualization (Fig. 2c), determination of weather stations, and the adjustment of watershed data. To run SWAT for Roodan, the SWAT database was modified for visualization of input files, such as the DEM map, stream map, outlet of the watershed, land use/crops, soil types, and meteorological stations. The model was then calibrated with respect to the sensitive flow parameters. As suggested by Winchell et al. (2010), sensitivity analysis was conducted on 26 input parameters to which flow is sensitive in the SWAT model. The final selection of the sensitive parameters was based on a Latin Hypercube one-factor-at-a-time (LH-OAT) analysis (Nossent and Bauwens 2012). Table 1 shows the sensitive parameters and their initial ranges for calibration via SWAT. To avoid repeating the calibration procedure and the detailed input data and parameter ranges for the SWAT model, readers are referred to Jajarmizadeh et al. (2014, 2015).
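To give a feel for how a Latin Hypercube one-factor-at-a-time screening ranks parameters, the following Python sketch draws stratified base points and perturbs one parameter at a time around each of them. It is a simplified illustration under stated assumptions, not the routine built into SWAT: the published LH-OAT procedure may differ in detail (for example, in how perturbations are sequenced and normalized), and the function names, the 5 % perturbation, and the toy model are all hypothetical.

```python
import random

def latin_hypercube(n_points, bounds, rng):
    """Draw n_points samples with one sample per stratum in every dimension."""
    columns = []
    for low, high in bounds:
        strata = list(range(n_points))
        rng.shuffle(strata)  # pair strata randomly across dimensions
        columns.append(
            [low + (s + rng.random()) / n_points * (high - low) for s in strata]
        )
    return [list(point) for point in zip(*columns)]

def lh_oat(model, bounds, n_points, fraction=0.05, seed=0):
    """Mean absolute change in model output per parameter (simplified LH-OAT)."""
    rng = random.Random(seed)
    effects = [0.0] * len(bounds)
    for base in latin_hypercube(n_points, bounds, rng):
        y0 = model(base)
        for j, (low, high) in enumerate(bounds):
            perturbed = list(base)
            # Perturb only parameter j, staying inside its range.
            perturbed[j] = min(high, base[j] + fraction * (high - low))
            effects[j] += abs(model(perturbed) - y0) / n_points
    return effects

def toy_model(x):
    """Hypothetical stand-in for a flow simulation: strongly driven by x[0]."""
    return 10.0 * x[0] + 0.1 * x[1]

toy_bounds = [(0.0, 1.0), (0.0, 1.0)]
effects = lh_oat(toy_model, toy_bounds, n_points=10)
ranking = sorted(range(len(effects)), key=lambda j: effects[j], reverse=True)
```

In practice the model call would be a full simulation run, so the number of base points and parameters determines the computational cost.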

2.4 Assessment of the Model’s Performance

Discharge data measured at the outlet of the basin were used to assess the model’s performance by exploring and evaluating several efficiency criteria. The mean absolute error (MAE), coefficient of determination (R²), Nash-Sutcliffe efficiency (NS) (Nash and Sutcliffe 1970), Pearson’s correlation coefficient (r), and relative error (RE%) were the five criteria used to provide dynamic and systematic error information concerning the simulation.

2.4.1 Mean Absolute Error (MAE)

MAE is a measure of the average error of a time-series simulation. Lower values of MAE indicate better performance of the model.

$$ MAE=\frac{\sum_{i=1}^{n}\left|{Q}_{sim,i}-{Q}_{obs,i}\right|}{n}\kern2em 0\le MAE $$
(4)

2.4.2 Coefficient of Determination, R²

R² describes the proportion of the total variance of the observed data that is explained by the simulation. The possible range is 0.0–1.0, and a higher value of R² indicates better agreement.

$$ {R}^2=\frac{{\left(\sum_{i=1}^{n}\left({Q}_{obs,i}-{\overline{Q}}_{obs}\right)\left({Q}_{sim,i}-{\overline{Q}}_{sim}\right)\right)}^2}{\sum_{i=1}^{n}{\left({Q}_{obs,i}-{\overline{Q}}_{obs}\right)}^2\sum_{i=1}^{n}{\left({Q}_{sim,i}-{\overline{Q}}_{sim}\right)}^2}\kern2em 0\le {R}^2\le 1 $$
(5)

2.4.3 Nash-Sutcliffe Efficiency, NS

NS is commonly used in time series data to compare simulated and measured flows. It ranges from (−∞ to 1), and higher values indicate better agreement.

$$ NS=1-\frac{{\displaystyle {\sum}_{i=1}^n{\left({Q}_{sim,i}-{Q}_{obs,i}\right)}^2}}{{\displaystyle {\sum}_{i=1}^n{\left({Q}_{obs,i}-{\overline{Q}}_{obs}\right)}^2}}\kern2em -\infty <NS\le 1 $$
(6)

2.4.4 Pearson’s Correlation Coefficient, r

Pearson’s r is the most common coefficient used to measure the linear correlation between two variables. Values range between −1 and +1, which indicate total negative and total positive correlation, respectively, and a value of zero indicates no linear correlation.

$$ r=\frac{\sum_{i=1}^{n}\left({Q}_{obs,i}-{\overline{Q}}_{obs}\right)\left({Q}_{sim,i}-{\overline{Q}}_{sim}\right)}{\sqrt{\sum_{i=1}^{n}{\left({Q}_{obs,i}-{\overline{Q}}_{obs}\right)}^2}\sqrt{\sum_{i=1}^{n}{\left({Q}_{sim,i}-{\overline{Q}}_{sim}\right)}^2}}\kern2em -1\le r\le +1 $$
(7)

2.4.5 Relative Error, RE%

RE is an indication of the magnitude of the absolute error relative to the measured values.

$$ RE\;\left(\%\right)=\frac{1}{n}\sum_{i=1}^{n}\frac{\left|{Q}_{sim,i}-{Q}_{obs,i}\right|}{Q_{obs,i}}\times 100\kern2em 0\le RE $$
(8)

where n is the number of days; \( Q_{obs,i} \) and \( Q_{sim,i} \) are the observed and simulated discharges on day i; and \( {\overline{Q}}_{obs} \) and \( {\overline{Q}}_{sim} \) are the means of the observed and simulated discharges.
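The five criteria of Equations 4 to 8 can be sketched in plain Python as follows. The function names and the four-value sample series are hypothetical. The relative error follows one reading of Equation 8, in which the absolute error of each day is divided by that day's observed flow before averaging; this form assumes that no observed flow is zero, which would need special handling for intermittent streams.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def mae(obs, sim):
    """Equation 4: mean absolute error."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)

def pearson_r(obs, sim):
    """Equation 7: Pearson's correlation coefficient."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def r_squared(obs, sim):
    """Equation 5: coefficient of determination (square of Pearson's r)."""
    return pearson_r(obs, sim) ** 2

def nash_sutcliffe(obs, sim):
    """Equation 6: Nash-Sutcliffe efficiency."""
    mo = _mean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den

def relative_error_pct(obs, sim):
    """Equation 8 (assumed reading): mean absolute relative error in percent."""
    return 100.0 * sum(abs(s - o) / o for o, s in zip(obs, sim)) / len(obs)

# Hypothetical daily discharges (m3/s) for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 37.0]

scores = {
    "MAE": mae(obs, sim),
    "R2": r_squared(obs, sim),
    "NS": nash_sutcliffe(obs, sim),
    "r": pearson_r(obs, sim),
    "RE%": relative_error_pct(obs, sim),
}
```

Note that Equation 5 is the square of Equation 7, so R² and r carry the same information about linear association, whereas NS and MAE also penalize systematic bias.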

3 Results and Discussion

3.1 Performances of the TOPMODEL and SWAT Models in Trend Analysis

The simulations of TOPMODEL and SWAT were compared with the observed time series at the outlet station for the calibration and validation periods. Better performance is indicated by values of R², NS, and r closer to one and by MAE and RE values closer to zero. The TOPMODEL and SWAT models were calibrated and validated using the daily data of the periods 1989–2001 and 2002–2007, respectively. Figure 3 compares observed, TOPMODEL, and SWAT data during the calibration period. Both models followed the fluctuations of the observed data acceptably. However, overestimation and underestimation were observed for some years. For instance, both models tended to underestimate the peak flows that occurred early in the simulation period. The highest recorded flow of 4209 m³/s was predicted by TOPMODEL and SWAT as 3997 and 3315 m³/s, respectively. Trend analysis showed that TOPMODEL slightly overestimated low flows, as can be seen for 1996 in Fig. 3b for the calibration period.

Fig. 3 (a) Daily observed and simulated streamflow by TOPMODEL and SWAT in m³/s for the calibration period from 1989 to 2001 and (b) Selected period from 1995 to 1996

Generally, SWAT outperformed TOPMODEL for peak flows in the validation period, as shown in Fig. 4a. Also, both models had roughly the same trends in predicting flows (Fig. 4b and c). The trend analysis showed that SWAT outperformed TOPMODEL in capturing the occurrence and magnitude of high flows in a large basin over a long simulation. However, SWAT underestimated low flows, while TOPMODEL slightly overestimated them in the validation period. TOPMODEL, similar to SWAT, provided a fair assessment of high-flow events. Figure 5, derived from Figs. 3 and 4, presents the high flow predictions (including flood observations). For general comparison, Borah et al. (2007) reported that one of the weaknesses of physically-based models is the prediction of low storm events. Hence, the underestimations of TOPMODEL and SWAT in the validation period might be related to the dry climate of Roodan with its limited rainfall and intermittent flows.

Fig. 4 (a) Daily observed and simulated streamflow by TOPMODEL and SWAT in m³/s for the validation period from 2002 to 2007, (b) Selected period from 2002 and (c) Selected period from 2004 to 2005

Fig. 5 Selected observed peak flows and TOPMODEL and SWAT predictions for the calibration and validation periods (discharge in m³/s)

Figure 6 shows observed and simulated daily flows for both SWAT and TOPMODEL during the calibration and validation periods. The two scatter plots show that most flows were concentrated between 0 and 500 m³/s in the calibration period and between 0 and 200 m³/s in the validation period. Figure 6 also shows that the dispersions of the SWAT and TOPMODEL values were in agreement for both calibration and validation.

Fig. 6 Observed and simulated streamflow by TOPMODEL and SWAT (a) calibration period and (b) validation period

The results of TOPMODEL and SWAT were compared statistically. Table 2 shows that SWAT performed better than TOPMODEL, with lower relative error for calibration (0.61) and validation (4.19). Also, the SWAT predictions had better MAE and NS values than TOPMODEL for both periods. However, the R² and r values provided by TOPMODEL, i.e., 0.69 and 0.84, respectively, were better than those provided by SWAT. Generally, SWAT simulated streamflow in a manner consistent with the observations. The peaks of the storms were represented adequately by both models. Due to the long record of available data used in calibrating and validating the models, their outputs were biased, especially for MAE and RE, in comparison with the results reported by El-Nasr et al. (2005). TOPMODEL obtained good values for both calibration and validation regarding the quality of R² and the NS coefficient (Parajuli et al. 2009). SWAT obtained very good values for calibration and, like TOPMODEL, good values for NS in validation.

Table 2 Observed and simulated flows for all of the modeling periods and values of the statistical indices used to compare TOPMODEL and SWAT for the calibration and validation periods

Table 2 compares the statistical values of mean, median, standard deviation, and minimum and maximum values that were obtained for SWAT, TOPMODEL, and the observed data. Table 2 indicates that SWAT had closer values of mean, median, and standard deviation of flows in the calibration period to observed data. Evaluation of the minimum and maximum values showed that SWAT did not predict the minimum flow perfectly. Singh et al. (2004) made the same observation about SWAT for the Iroquois River watershed. TOPMODEL slightly overestimated the minimum flow, but it had a closer value to observed data for maximum flow in the calibration period. The values obtained via SWAT modelling based on statistical analysis were slightly closer to the observed data in the validation period. Table 2 shows that SWAT and TOPMODEL had consistent tendencies for minimum flow prediction in the calibration and validation periods. Generally, differences in the simulation of low flows for both TOPMODEL and SWAT might be related to insufficient information for representation of sub-surface flow and subsequent release of water from that water storage, thereby contributing to the base flow for Roodan.

3.2 Performances of TOPMODEL and SWAT in Predicting Peak Flows

The performances of TOPMODEL and SWAT were examined for predicting peak streamflow. Peak flows are important in water resource issues, especially for design and analysis purposes; therefore, the 20 maximum peaks of each of the calibration and validation periods, with exceedance probabilities of less than 20 %, were chosen for analysis. Figure 7(a) shows the peaks that were selected from the calibration period and the absolute relative error (RE %) for each year. Both the TOPMODEL and SWAT predictions fluctuate between underprediction and overprediction. Both SWAT and TOPMODEL underestimated several events (e.g., events 1–6), and they overestimated several other events, such as 12, 13, 14, and 16. Also, there were some events for which the two models did not exhibit the same trends, e.g., 17, 18, and 20. TOPMODEL was better able than SWAT to simulate the highest peak, e.g., peak number 11. The average relative errors of the 20 selected maximum peak predictions were 36 and 36.5 % for TOPMODEL and SWAT, respectively.
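The peak selection described above can be sketched as ranking the observed flows, assigning each an empirical exceedance probability, and computing the relative error of the corresponding simulated value. In the Python sketch below, the function name and the sample series are hypothetical, the Weibull plotting position rank/(n + 1) is an assumed choice of exceedance formula, and every daily value is treated as a candidate peak, so no separation of individual storm events is attempted.

```python
def top_peaks(obs, sim, count):
    """Return (obs, sim, exceedance probability, RE %) for the largest observed flows."""
    n = len(obs)
    order = sorted(range(n), key=lambda i: obs[i], reverse=True)
    peaks = []
    for rank, i in enumerate(order[:count], start=1):
        p_exceed = rank / (n + 1)  # Weibull plotting position (assumed)
        re_pct = 100.0 * abs(sim[i] - obs[i]) / obs[i]
        peaks.append((obs[i], sim[i], p_exceed, re_pct))
    return peaks

# Hypothetical daily discharges (m3/s) for illustration only.
obs = [5.0, 400.0, 20.0, 1000.0, 50.0, 3.0, 250.0, 8.0, 2.0]
sim = [6.0, 300.0, 25.0, 900.0, 40.0, 4.0, 300.0, 9.0, 3.0]

peaks = top_peaks(obs, sim, 3)
mean_re = sum(p[3] for p in peaks) / len(peaks)
```

Averaging the per-peak relative errors, as in `mean_re`, corresponds to the average relative error reported for the selected peaks.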

Fig. 7 Selected maximum peaks of observed, TOPMODEL and SWAT predictions (a) calibration period and (b) validation period (discharge in m³/s)

Figure 7(b) shows the peak flows and the associated relative errors (RE %) for each event in the validation period. The figure shows that both TOPMODEL and SWAT were unable to predict the observed values, especially at the peaks (1, 3, 7–9, 19, 20). The 20 maximum peaks that were investigated in validation had average relative errors of 64 and 70 % for TOPMODEL and SWAT, respectively. Based on the mean relative errors for the calibration and validation periods, TOPMODEL performed slightly better than SWAT for peak flows. TOPMODEL succeeded in capturing the peak timing and magnitude of the hydrograph, and it did a reasonable job of simulating the variability of the observed values.

3.3 TOPMODEL and SWAT Performances in Predicting Runoff Volume

The annual volume of runoff for each year (i.e., the summation of the runoff for every day in the year) was assessed. Figure 8 compares the annual volumes of runoff for the calibration and validation periods. The results showed that TOPMODEL in the calibration period underestimated the annual runoff volume for three years and overestimated it for 10 years. In contrast, SWAT overestimated it for six years and underestimated it for seven years. The reason that TOPMODEL provided obvious overestimations could have been due to the contribution of low flows. Moreover, the underestimation of low flows by SWAT also could be the reason that this model provided lower values for annual runoff for each year in calibration.
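As a minimal sketch of the annual aggregation described above, the Python snippet below sums daily discharges into a volume per year and labels each year as over- or underestimated. It assumes that the daily values are daily mean discharges in m³/s, so each is multiplied by 86,400 s to obtain a daily volume in m³; the function names and the sample records are hypothetical.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400

def annual_volumes(daily_records):
    """Sum daily mean discharge (m3/s) into annual runoff volume (m3) per year."""
    totals = defaultdict(float)
    for year, discharge in daily_records:
        totals[year] += discharge * SECONDS_PER_DAY
    return dict(totals)

def classify(obs_volumes, sim_volumes):
    """Label each year as over- or underestimated by the simulation."""
    result = {}
    for year, obs in obs_volumes.items():
        sim = sim_volumes[year]
        result[year] = "over" if sim > obs else "under" if sim < obs else "equal"
    return result

# Hypothetical (year, daily mean discharge) records for illustration only.
observed = [(2002, 10.0), (2002, 0.0), (2003, 5.0)]
simulated = [(2002, 12.0), (2002, 1.0), (2003, 4.0)]

obs_vol = annual_volumes(observed)
sim_vol = annual_volumes(simulated)
labels = classify(obs_vol, sim_vol)
```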

Fig. 8 Observed and predicted annual runoff volume (m³) for each year

For the validation period, Fig. 8 shows a five-year overestimation and a one-year underestimation by TOPMODEL, whereas the SWAT model underestimated the values for two years and overestimated them for four years. SWAT’s overestimation of the values for 2002, 2003, 2006, and 2007 resulted from its overprediction of several events (Fig. 4).

For the calibration period, both models had the same trend of underestimation for three years, i.e., 1989–1991. Figure 8 shows that annual runoff volume usually was overpredicted in six years. In contrast, the SWAT and TOPMODEL models similarly underestimated the events in 2004. However, they had the same overestimation trends for four years, i.e., 2002, 2003, 2005, and 2006.

In this research, it was found that TOPMODEL required fewer parameters than the SWAT model for implementation. Moreover, the TI, which is derived from the DEM, provided the semi-distributed feature in TOPMODEL. SWAT is semi-distributed as well, but this results from the use of HRUs, which are created from the DEM, land-use, and soil maps. The simplifications in the TOPMODEL and SWAT models concerning hydrological processes and the laws of physics can lead to discrepancies between observed and simulated flows, peak flows, and annual runoff volumes. These processes include surface runoff, evapotranspiration, percolation, lateral subsurface flow, tile flow, groundwater flow, and channel flow routing.

4 Conclusions

Comparisons of hydrological models have always been challenging, but they are beneficial in determining the availability of water, which is essential information for spatial planning and management. Semi-distributed models are well suited to catchment modelling under an arid climate and on a large-scale plain. Consequently, in this study, two semi-distributed models, i.e., TOPMODEL and SWAT, were compared for their ability to predict flows. The data used in the models included daily observations of streamflow over a 19-year period; the first 13 years were used for calibration, and the remaining six years were used for validation. The data required for both models were collected, and the study included investigations of the performance of the two models. The main conclusions that resulted from this study are listed below:

  • Both the TOPMODEL and SWAT models provided reasonable simulations of streamflow. However, some discrepancies between observed and simulated streamflows were evident for both high and low flows.

  • The results showed that TOPMODEL predicted the highest flow better than SWAT in calibration, while SWAT had the better performance for the highest flow in the validation period.

  • SWAT’s predictions were more compatible with the trend of observed data for high flows, including floods. TOPMODEL predicted peak flows with slightly larger discrepancies, both underestimating and overestimating observed values in the calibration period. In validation, both models tended to underestimate the observed data.

  • The quality of TOPMODEL’s performance, based on R² and NS, was good over the modeling period. In addition, SWAT had very good performance in calibration and good performance in validation.

  • The SWAT and TOPMODEL models provided different behaviors in predicting the minimum flow. SWAT underpredicted these flows, while TOPMODEL slightly overestimated them. In general, in the validation period, the statistical values are slightly better for SWAT when compared to observed flows.

  • TOPMODEL mostly overestimated the annual runoff volume in calibration, but SWAT provided a more balanced estimation of annual runoff volume. In the validation period, both models generally did a slightly better job of estimating the annual runoff volume.