1 Introduction

The study of flood events, and specifically of flash flood events, is of crucial importance for both hydrologists and engineers. Floods occur when high rain intensity is recorded, in conjunction with the basin’s characteristics (Sene 2013). However, these two factors vary in time. It is reported that rain intensity has increased as an effect of climate change (Alfieri et al. 2012; Madsen et al. 2014; Caloiero et al. 2017), while the ongoing urbanization and land cover change, such as deforestation, have also negatively impacted the basins’ flood generation characteristics, thus contributing to flood occurrence overall (Batelis and Nalbantis 2014). These effects have led to the result that areas that once did not flood are now prone to flooding, especially in the Mediterranean where recent flash floods have not only caused economic problems to the affected areas (Pistrika et al. 2014), but have also resulted in increased human casualties (Diakakis et al. 2012; Pereira et al. 2017; Feloni et al. 2019).

In order to prevent such flood damages, current practice dictates the use of non-structural measures. Legislation initiatives, such as the European Flood Directive 2007/60, are valuable tools for flood management, but the derived estimations can suffer from high uncertainty when applied in ungauged or poorly gauged basins, which eventually affect decision-making of the proposed flood measures (Yannopoulos et al. 2015). To that end, real-time flood warning and prevention systems, such as Early Warning Systems (EWS), can be deployed, which allow for real-time detection and forecasting of flood events in an adequate lead time. These systems provide with valuable information for proper decision making and damage mitigation actions (Alfieri et al. 2012; Cools et al. 2016); however, the quality and quantity of this information is mostly depending upon data availability of the model used and the overall system framework. Basic systems rely on monitoring current weather information, in order to estimate precipitation forecasts and issue warnings based on historical weather states that led to flooding. More advanced systems include hydrological modelling in order to estimate current and forecasted discharge values, which then allows for the detection of the specific areas within a basin that will probably flood (Chen et al. 2013).

The selection of the proper hydrological model is a daring question since there is an all increasing number of hydrological models to be used, varying upon the number of data they need and the study area characteristics. Process-based models, such as LISFLOOD used in the European flood alert system (Thielen et al. 2009), MIKE-SHE (Ivanescu and Drobot 2016) and TOPKAPI (Plate 2007), and data-based models, such as feature transfer functions, artificial neural networks (Sušanj et al. 2016) and fuzzy logic techniques, are usually fully distributed models, which provide detailed information and suit best with weather radar or satellite rainfall data, but tend to require a significant amount of data, which is inapplicable in data scarce basins. Conceptual models, such as MIKE11/NAM (Singh et al. 2014), SAC-SMA (Georgakakos 2006; Basha et al. 2008; Wu et al. 2010) and HBV (Kobold and Brilly 2006) are lighter, less data hungry and easier to configure models that work well in lumped or semi-distributed modelling context, but most importantly, in areas where data availability is scarce, such as in ungauged basins.

In this study, the deterministic conceptual, Sacramento Soil Moisture Accounting (SAC-SMA) model was used. The model was developed by the National Weather Service (NWS), Sacramento, California (Burnash et al. 1973; Burnash 1995) for use by the National Weather Service River Forecast System (NWSRFS), and it mainly focuses on the water distribution among the different soil layers and their ability to constrain or release water. The model was designed to be a deterministic, lumped based model, modelling the daily input of precipitation and potential evapotranspiration values into runoff and soil moisture accounting, but it has since been used as semi-distributed model (Boyle et al. 2001) and as a distributed model (Koren et al. 2004; Smith et al. 2004; Reed et al. 2007) as well. Moreover, the wide range of successful applications across different basins (Koutroulis et al. 2013; Posner and Georgakakos 2015; Katsanou and Lambrakis 2017; Uliana et al. 2019) highlight the versatility of the model and its low limitation regarding data availability and different basin characteristics, making it a valid choice for ungauged basins. The downside of the model is its large number or parameters, making calibration a complex procedure, but significant research has been conducted towards that area (Boyle et al. 2001; Ajami et al. 2004; Hogue et al. 2006; Zhang et al. 2011; Wu et al. 2012; Koren et al. 2014). Although, the model dates over 50 years and clearly reflects the technology of its time of appearance, i.e., the large number of parameters, the model is still well used, especially in the United States, mainly due to the NWSRFS adaptation and the gained experience of the users. In a recent study evaluating the usage of hydrology models through text mining techniques of over 1500 journal related abstracts (Addor and Melsen 2019), it is reported that the importance of legacy in model selection, i.e., the preference and experience gained on a particular model is usually more important when selecting a model than its adequacy, i.e., the use of the most adequate model for a specific research question that depends upon data characteristics such as the landscape of the region, and the extend and the temporal and spatial scales of the datasets. In the same study, it was reported that the SAC-SMA has been used mostly for flood and low-flow modelling for prediction and forecasting applications, rather than water-resources and drought modelling. Finally, the model is often used as a benchmark for testing new models (Nearing et al. 2016; Daliakopoulos and Tsanis 2016; Birhanu et al. 2018).

Concerning event-based rainfall-runoff applications, as stated above, the model is used by the NWSRFS as a flood forecasting tool, and specifically, as a core element of the Flash Flood Guidance system (FFG), a forecasting system of flash floods (Ntelekos et al. 2006; Georgakakos 2006). In the system, the model is used as a rainfall-runoff model in order to provide with current and forecasted soil moisture conditions and correlate them with the forecasted rainfall observations and predetermined thresholds to issue warnings. The system is well established (Norbiato et al. 2009) and has been also expanded outside the USA through the global flash flood guidance initiative (Clark et al. 2014; Putra et al. 2021).

In this context, the main objective of this study was to shed light into the model functions for event-based simulation with the intent of real time calibration and simulation applications. To that end, a simplified version of the model was introduced. The overall scope of this study was to simplify the original model by removing model functions, and therefore, parameters that do not affect the model performance in event-based rainfall runoff simulations, but yet contribute negatively into the overall model uncertainty and computing time, mainly due to the extensive number of equations and parameters involved. A mountainous basin in Greece was used as the study area, while the analysis was performed on a one-hour time interval.

2 Materials and Methods

2.1 Study Area

The Karitaina Basin is the upper part of the Alfeios river basin located in Southern Greece, in the center of the Peloponnesus prefecture (Fig. 1). The basin is surrounded by Mount Taygetus in the south, Mount Lykaion on the west and Mount Mainalo on the northeast, while its outlet is located in the northwest of the basin, near the Karitaina settlement. The basin total area is 892 km2, while the mean, maximum and lowest elevation heights are 750 m, 1854 m and 288 m, respectively. Within the basin lies the agricultural plane of Megalopolis, as well as, the large Megalopolis lignite excavation site, which produces nearly 8% of the lignite production of Greece and covers a total of 2.4% of the basin. Based on the 2012 CORINE Land Cover (CLC) dataset classification, featured in Table 1, the dominant land uses are forest and semi natural areas, 64% of the basin, while 32% of the basin, mostly in the lowlands, is covered by high agriculture activity. The most significant stream is the Alfeios river, whose source lie in the Mount Taygetus, while the smaller Elissonas stream springs from Mount Mainalo and joins Alfeios near the Megalopoli settlement (Fig. 1). The basin has a nearly symmetric shape, close to an oval shape rather than an elongated shape, which has the effect of generating hydrographs with steep rising limb; as a result, high peak discharges are often observed.

Fig. 1
figure 1

The study area, Karitaina basin

Table 1 Land cover percentages according to Corine 2012

The discharge at the Karitaina station was calculated from stage measurements. The stage measurements (H) were transformed into runoff (Q) by using the following derived rating curve:

$$\begin{array}{c}Q=7.798{ H}^{1.906}\end{array}$$
(1)

The precipitation data used in this analysis were obtained from the National Observatory of Athens Automatic Network (NOANN). The network provides high quality precipitation values for a given time step of 10 min (Lagouvardos et al. 2017). A total of nine rain gauge stations, shown in Fig. 1, were used, while the spatial interpolation of the precipitation input was conducted using the Thiessen polygon method. The NOANN network is an always expanding station, with the addition of new stations, therefore the actual gage weights used for each station were adjusted based on the available rain gauge stations at the time of each of the studied event.

2.2 Hydrological Model

The SAC-SMA model is a deterministic, nonlinear, conceptual model that describes with analytic linear and nonlinear equations the mechanics of soil moisture transfer between the soil layers. The model introduces two concepts: the tension and free water. Within a single soil block, tension water is the amount of water that is upheld by the soil particles, while free water is the water that remains within the soil mantle but can move freely between the soil particles. Runoff is defined as the amount of water that eventually runs out of the soil block, due to gravity or saturation. In the SAC-SMA model, this concept is repeated in two different soil zones, the upper and lower zone. The upper zone is responsible for the estimation of surface runoff, while the lower zone describes the groundwater storage and the baseflow generation. The two zones are connected through the percolation equation, which controls the water transfer between the upper zone and the lower zone, based on the water deficit of each zone and the soil infiltration characteristics. The model features 16 parameters that control the different stages of the flow generation, six state variables and five runoff components, as seen in the flowchart of the model (Fig. 2).

Fig. 2
figure 2

The SAC—SMA flowchart

The model was developed for the US river forecasting system, and the time interval was originally intended to be daily, or up to 6-h, since it was designed for use in large sized basins, while the available datasets and computer power of that time were also limited. However, the model is capable of accepting any time-scale input, as long as, it has been calibrated for that scale because the model parameter values are found to be timescale sensitive (Finnerty et al. 1997). Table 2 presents the description of each parameter along with its suggested value ranges.

Table 2 The SAC-SMA model parameters

2.3 Simplified Model

In flood events and simulations of small basins, where the time intervals are small, it is a fact that the impact of the lower zone processes on the observed flood hydrograph is negligible, and the generated runoff is mainly the product of the upper zone processes, where the major part of the precipitation losses and the streamflow, i.e., the total channel flow, are calculated in the form of direct, surface and interflow runoff. Specifically, the upper tension zone, controlled by parameter UZTWM, is responsible for the initial precipitation losses, while the upper free zone, controlled mainly by UZFWM and UZK, is responsible for the surface and interflow runoff. An amount of the water stored in the free water tank is then percolated to the lower zone, mainly as groundwater storage and observed as runoff in the form of baseflow only several days after the rain event ends. Since in flood events the baseflow is only a fraction of the generated flow peak, in order to increase the efficiency of the SAC-SMA model in these cases, a simplified model was introduced, where these components, i.e., the lower zone processes, are removed or replaced (Fig. 3).

Fig. 3
figure 3

The simplified SAC-SMA flowchart

In order to remove the lower zone processes, the percolation process must first be changed. By design, the water transfer from the upper zone to the lower zone is related to both the upper zone deficit and the lower zone percolation demand. The lower zone percolation demand is calculated through a nonlinear equation involving the lower zone water deficit and two model parameters, ZPERC and REXP, which describe the soil infiltration characteristics. In the simplified model, we assume that the lower zone demand is constant throughout the flood event, described by a new model parameter \({P}_{0}\)(mm/h), and as a result, the amount of water transferred into the lower zone is only depending upon the upper zone water deficit. By applying this change, the lower zone can be completely removed and replaced by a single groundwater tank, mainly for keeping some consistency with the original concept, which results in a significant reduction of the model’s total processes, since half of model functions are removed, thus making the model lighter and faster to run. Moreover, the model parameters are reduced from 16 to only 6, which provides with an overall better control over the model generated uncertainty, and also reduces the calibration and simulation computing processes.

The two models were reprogrammed using the Matlab software, based upon the original FORTRAN computer code (Peck 1976). The shuffled complex evolution (SCE-UA) optimization algorithm (Duan et al. 1992, 1994) was used for the calibration of the parameters, and the Nash–Sutcliffe efficiency coefficient (NSE; Nash and Sutcliffe 1970) was used as the model performance criterion since it emphasizes the high flows, which is the most important characteristic in event based rainfall-runoff simulations, and as a result, it is the most widely used measure in hydrology (Gupta et al. 2009; Moussa and Chahinian 2009; Pushpalatha et al. 2012).

3 Results and Discussion

A total of six events, which caused extensive flooding within the Karitaina basin, were used in this study. The maximum observed discharge and main rain characteristics are presented in Table 3. The first four events were used for the calibration of the model parameters, while the last two for its validation. Since the events occurred during the same time period, being between winter to early spring, the initial conditions were assumed of the same magnitude for each event, as a fixed percentage of the estimated water zone capacity. The potential evapotranspiration was also set as a low fixed value, since evapotranspiration is usually low and negligible in rainfall-runoff events, where cloud conditions are present and time intervals are small. Moreover, the process is identical in both models, occurring before any runoff generation is calculated, and unless the soils are completely dry it does not affect the comparison made. Finally, the model parameters for the validation process were taken as the mean values of the parameters calculated by the calibration process.

Table 3 Characteristics of the events

In Fig. 4, the calibration for both the original and the simplified models for Events 1, 2, 3 and 4 are presented, while in Fig. 5 the validation simulations for Events 5 and 6. The corresponding model parameter values are presented in Table 4, along with and the corresponding calculated NSE value, for each event. The results showed that the SAS-SMA can be used for event-based simulations, since the NSE values for the calibration process varied between 0.81 to 0.96, while for the validation process between 0.83 and 0.78, respectively, which shows high correlation between the simulated and observed hydrographs.

Fig. 4
figure 4

Calibration runs of the original and the simplified SAC-SMA

Fig. 5
figure 5

Validation runs of the original and the simplified SAC-SMA

Table 4 Model parameter values derived from calibration

First, we addressed the original and simplified model differences on model performance in short timescale events. In all occasions, the two models behaved identically, especially in the calibration processes, since, as discussed, the main source of runoff generation is the upper zone processes which are described almost identically in both model versions. The model parameter values between the two models differ slightly as a result of the optimization algorithm and the change in the percolation curve equation. A once at a time (OAT) sensitivity analysis was performed, where the sensitivity of each parameter was estimated by observing the change in the output, i.e., the NSE criterion, when altering the optimal value of each parameter in equal percentages and keeping the rest fixed in their optimal values. It was found that the most sensitive parameters for both models were the upper zone parameters UZTWM, UZK and UZFWM, while the majority of the lower zone parameters had a low or no impact upon the generated results, thus justifying the simplification.

Concerning the validation runs, a small deviation of under 6% was observed in the calculated peak discharge between the two models. This difference is probably associated with the different model parameter values used in each case, which were calculated from the mean values of the calibration events and the related set of initial conditions. The most obvious difference is found in the recessing limb. This difference is mainly the result of the different starting point, i.e., the peak discharge value, from where the recession is beginning, while no obvious effect of the different percolation process is seen. The results were overall satisfactory, and any differences found were small, compared to the benefit in the computing speed of the model being up to three times faster with the simplified model compared to the original model. However, regarding the absolute computation demands, it should be noted that model runs by modern computers are usually fast, if not nearly instant, as they depend mostly upon the size of the inputs, i.e., the duration of the event and the optimization algorithm in the calibration phase. Therefore, in a lumped scheme, as in this research work, the differences in computational demands might be non-worthy. However, in a semi-distributed or a fully-distributed scheme, where the number of calculations is multiplied based on the number of subbasins or total grids, the differences would be substantially larger. It is estimated that in the current study area with a total area of 892 km2, by applying a 500 m cell size for a distributed model, a total of 3568 cells would have to be calibrated and simulated, and thus, the computational time difference could have been over one hour.

Concerning the lower zone parameters, in both the original and simplified models, their values showed increased variability, which is the result of their negligible impact on the overall performance. In order for the optimization algorithm to properly estimate these parameters, the calibration period should be extended several days after the event for both models. However, this is not usually the case in event-based modelling, where only the period of surface runoff generation is important. Therefore, the high uncertainty of these values was expected.

Finally, we compared the model performance in contrast with the observed hydrograph. As already mentioned, the NSE values showed the high correlation between the simulation and the observed values. We now focus on the form of the rising limb, the peak discharge and the time to peak, since these values are the crucial in any flood forecasting and management system. In all events, the rising limb was correctly simulated, concerning the starting point of the hydrograph and the overall slope. A small difference is observed in the discharge peak, where in the calibration events the simulated peak is lower than the observed, while in the validation the opposite is seen. Nevertheless, the difference in most cases is under 5%, while in Events 1 and 3 is under 10%. The most obvious difference is observed in the time to peak values. In almost all cases, the time to peak value is calculated sooner than what is observed, ranging from 2 to 6 h which is noticeable in these basin sizes. For a flood forecasting system, the model would be on the safe side, since earlier times are considered the worst-case scenario. Furthermore, the model result seems to be highly correlated with the precipitation datasets, since runoff peaks are found in high correlation with the precipitation datasets, even in the recessing limb, e.g., in Event 6, while such behavior is not always found in the observed data. This difference is a result of: first, the observed hydrograph derivation method, i.e., the Q-H curve method; and second, the actual model application which is taking into account the entire rainfall pattern.

In general, deviations from the observed hydrograph may not be linked to the model inability to correctly simulate the observed hydrograph but more likely to the scale of the datasets included, such as the spatial detail of the rainfall input. In small time intervals, the storm movement within the basin is noticeable, and thus, the time to peak can change, depending on the actual direction of the storm. This detail cannot be examined when analyzing using a lumped model but rather a semi-distributed, or fully distributed approach should be favored, since they allow for better estimation of the rainfall field. Finally, it should be stated that the application of the simplified model, is suitable in basins where the upper zone processes are expected to produce the major part of storm runoff. In cases where the lower zone processes and parameters are expected to have a significant impact, such as in Karst basin cases (Katsanou and Lambrakis 2017), a calibration–validation analysis must be first performed, prior to the application of the model, in order to avoid simulation errors and unrealistic parameter values.

4 Conclusions

The SAC-SMA model was used in order to perform rainfall-runoff simulations with a fine timescale dataset. The results indicate that model performed well as a rainfall runoff model for simulation of short timescale events, as low as one hour, provided that the model parameters were calibrated for that timescale. The tension and free water concepts are valid and applicable on short timescale data simulations. Specifically, the model, in these cases, was highly sensitive to the upper zone parameters UZTWM, UZFWM and UZK, since they control the major part of the produced storm runoff. The most sensitive parameter was UZTWM, which is the upper zone tension water capacity and controls the initial losses, followed by parameter UZK, the percentage of the upper free water content, and UZFWM which controls the interflow and surface runoff generation.

In order to improve the efficiency of the model, a simplified version was introduced by modifying the original model based on the assumption that lower zone processes do not affect the generated flood hydrograph since they occur in different timescales, and thus, do not affect the flood hydrograph characteristics. The simplified model was evaluated by comparing the produced hydrographs of the original and the simplified models with the observed hydrograph for a total of six rainfall events used in model calibration and validation. In all cases, the generated hydrographs of the two models were almost identical, which proved the simplified model accuracy and the assumption made. Moreover, by reducing the number of parameters from 16 down to only six, the simplified model showed an important improvement against the original model concerning simplicity and computing time. Overall the simplified model, was lighter, easier to configure and faster to calibrate and run, which is crucial for modern early warning systems that rely on massive real time and data ensemble calculations for detailed flood forecasting applications.

Future research should focus on analyzing the performance of the simplified model in a semi-distributed and a fully distributed design, since, as stated, the overall model generated uncertainty due to the large number of parameters to be estimated, as well as the computational demand due to the size of input datasets are substantially larger and the benefit will be more rewarding. Moreover, a global sensitivity analysis in both the original, as well as the simplified model should be applied, in order to assess in detail, the sensitivity of each parameter and reach conclusions regarding the effectiveness and the performance of the simplified model. Finally, a complete flood forecasting system based on the simplified SAC-SMA model could be evaluated. In such a system, the original model could be used to determine the initial conditions in longer time intervals, such as 3 and 6 h, prior to the rainfall event, whereas the simplified model could be used during the rainfall event, in order to provide with accurate and well spatially and temporally scaled information regarding the current soil moisture conditions and generated runoff estimates for further flood forecasting analysis.