1 Introduction

Mountains in semi-arid areas like the Mediterranean region are an important source of surface water (Latron et al. 2009; Garcia-Ruiz et al. 2011). In this context, the catchments of the High Atlas Mountains located in South Morocco are a typical case studied for several years in the frame of the SUDMED and LMI TREMA programs (Chehbouni et al. 2008; Jarlan et al. 2015). Moroccan High Atlas catchments have precipitation totals varying from 300 to 900 mm, depending on the year and the location, and constitute the main contribution for water resources of the arid Haouz plain area around Marrakech (Hanich et al. 2003). In these catchments, the contribution of floods to the annual runoff yield is important (Saidi et al. 2003; Bouaicha and Benabdelfadel 2010), but these flood events are often violent and can cause many damages (Saidi et al. 2010).

Frequency analysis methods are widely used to describe the occurrence and the magnitude of extreme flood events to improve water resources management (Stedinger et al. 1993; Rao and Hamed 2001). They consist in fitting a statistical distribution to a series of observations of flood events, with an aim of determining the return periods for these events (Katz et al. 2002). Using a regional approach by combining different sites with a homogeneous behavior regarding floods, the flood quantiles can be estimated for ungauged sites (Anctil et al. 1998; Ouarda et al. 1999). In addition, several studies have shown that regional estimation of return levels can be more robust than at-site estimation (Ouarda et al. 2008; Lang et al. 2014). These methods have been applied in many regions of the world (GREHYS 1996; Svensson and Jones 2010); however, only a few studies have used regional flood frequency analysis in Morocco and in semi-arid regions in general (Farquharson et al. 1992). Among these papers, Ahattab et al. (2015) elaborated the cartography of Gradex values for estimating flood peaks to better size hydraulic structures in the Tensift basin. Similarly, Zemzami et al. (2013) used the frequency analysis to determine the design flood for sub-catchments of the Upper Moulouya basin by the method of Gradex. Zoglat et al. (2014) proposed an integrated approach that combines graphical tools and analytical approaches to identify the optimal threshold to be used in the frequency analysis of floods in north Morocco.

The goal of this study is to analyze the floods records available for the High Atlas mountainous catchments in Morocco and to test whether a regional flood frequency analysis could improve the estimation of the return levels for these extreme events. First, an analysis of the seasonal patterns is performed, and then regional distributions are fitted to the time series available. The following section describes the study area and the data used. In the third section, the regional flood frequency analysis methods are presented, and the fourth section is dedicated to results. Finally, conclusion is drawn in the last section.

2 Study area and data

2.1 The High Atlas mountainous catchments

The Moroccan High Atlas mountainous range is located in the Tensift basin (20,380 km2) where the economic development is primarily based on the agriculture, which requires an increasingly important mobilization of water resources (Schyns and Hoekstra 2014). The present work is focusing on the five mountainous watersheds: N’Fis, Rheraya, Ourika, Zat and R’dat (Fig. 1), defined by morphological characteristics and climate favoring significant floods due to mostly impermeable soils and steep slopes (Chaponnière et al. 2008; Saidi et al. 2012). Elevations are considerable in these basins, varying from 1000 to 4000 m. The highest altitudes are found in the Ourika and Rheraya catchments, and lowest elevations are found in the Zat and R’dat catchments (Table 1). These catchments are feeding the Haouz plain under arid conditions (250 mm/year) and located a few tens of kilometers in the southeast of the city of Marrakech (Boudhar et al. 2009). These basins are representative of mountainous basins in Morocco that provide most of the water in the plains, such as those located in the foothills of the Atlas mountains (Saidi et al. 2012; Zemzami et al. 2013).

Fig. 1
figure 1

Geographic location of the High Atlas watersheds and gauging stations

Table 1 Description of the five High Atlas catchments and available discharge data

No dams or regulations are present within the catchments; dams or reservoirs are located downstream of the outlets of the selected catchments. In all catchments, the presence of snow is observed during the winter, usually between November and March and for altitudes above 2000 m, with a strong inter-annual variability in snow amounts, location and the duration of the snow season (Boudhar et al. 2010; Marchane et al. 2015). The floods are usually caused by heavy rainfall events, but during spring, they can also be influenced by snowmelt (Saidi et al. 2003).

2.2 Discharge data

The discharge data are provided by the Agence de Bassin Hydraulique du Tensift (ABHT). The measurement protocol is based on a standard stage–discharge relationship, and the selected stations have not been relocated during the time period of measurements. We use data from five stations located at the outlets of the watersheds of the High Atlas: N’Kouris in N’Fis, Tahanaout in Rheraya, Aghbalou in Ourika, Zat in Taferiat and Sidi Rahal in R’dat (Fig. 1). Record lengths vary from 35 to 47 years (Table 1), depending on the station.

3 Methods

The method to estimate the return levels of extreme discharge is based on flood frequency analysis techniques (Rao and Hamed 2001). The use of frequency analysis is becoming increasingly common because it provides robust estimation of extreme quantiles, and information concerning the probability that a given event can be exceeded (El Adlouni et al. 2008).

3.1 Statistical tests

The time series used for frequency analysis must comply with the hypothesis of homogeneity, stationarity and randomness (Rao and Hamed 2001). To verify these hypotheses, three nonparametric tests were used since the distribution of the data is unknown, but assumed non-normal, as it is most frequently the case for extreme values (Tramblay et al. 2008):

  • The Wald–Wolfowitz (1943) test the hypothesis of independence (i.e., the absence of autocorrelation)

  • The Mann–Kendall test (1945) evaluates the presence of a trend in the data.

  • The Wilcoxon (1945) tests for homogeneity, by comparing the median of two samples.

In addition, to check the homogeneity of different distributions, the Anderson–Darling test (Scholz and Stephens 1987) is considered. It is a regional test that verifies whether k independent samples belong to the same population, without specifying their common distribution. The test is based on the comparison between local and regional empirical distribution functions, and the Anderson–Darling statistic is obtained by bootstrap (Viglione et al. 2007).

3.2 Fitting of extreme values distributions

A variety of statistical distributions exist to be fitted to series of annual maximum floods (Stedinger et al. 1993; Rao and Hamed 2001; El Adlouni et al. 2008). The choice of the generalized extreme value distribution to adjust the annual maximum discharge is based on the theorem of Fisher–Tippett (1928), which relates the maximum values to this law, as the limiting distribution of the maxima of a sequence of independent and identically distributed (i.i.d.) random variables. The GEV distribution is a continuous probability distribution and combines the Gumbel, Fréchet and Weibull distributions of extreme values (Jenkinson 1955):

$$\begin{aligned} F(x) & = \exp \left[ { - \left( {1 - \frac{\kappa }{\alpha }(x - \mu )} \right)^{1/\kappa } } \right]\quad \kappa \ne 0 \\ F(x) & = \exp \left[ { - \exp \left( { - \left( {\frac{x - \mu }{\alpha }} \right)} \right)} \right]\quad \kappa = 0 \\ \end{aligned}$$
(1)

where μ is the location parameter, α the scale parameter, and κ the shape parameter. If κ < 0, the distribution belongs to the Fréchet family; if κ = 0, to the Gumbel family; and if κ > 0, to the Weibull family of extreme values distributions. The GEV distribution is frequently used to model hydrological extremes, such as extreme precipitation or discharge (Katz et al. 2002). Here the maximum likelihood method is considered for parameter estimation.

3.3 Regionalization with the index flood method

The analysis of dimensionless regional flood frequency curves, built from records of a homogeneous area, could help to overcome the variability of at-site records. As noted by Farquharson et al. (1992), this at-site variability can be important in semi-arid areas.

The index flood method (Darlymple 1960) is a technique originally developed to estimate the flood frequency distribution at ungauged catchments. It is performed by scaling a regional flood frequency distribution by the index flood of the catchment, Q index:

$$\hat{Q}_{i} \left( T \right) = q_{R} \left( T \right)Q_{\text{index}}$$
(2)

with \(\hat{Q}_{i} \left( T \right)\) representing the estimated T-year flood peak discharge for a given catchment i and \(q_{R}\) the dimensionless regional T-year flood, also called growth factor for a given region. The regional growth factor is computed by pooling the normalized flood samples of a group of homogeneous catchments q i (j):

$$q_{i} \left( j \right) = Q_{i} \left( j \right)/Q_{\text{index}}$$
(3)

where \(Q_{i} \left( j \right)\) is the observed maximum flood for gauged catchment i and year j. The underlying assumption is that the normalized probability distribution functions, derived from the normalized flood samples at different locations within a region, are identical. Usually, the mean of annual maximum flood discharge is used as the index flood. For gauged catchments, the mean annual flood can be computed from measured data. For ungauged catchments, the mean annual flood can be estimated using regression with physiographic and climatic descriptors.

In the present work, due to the small number of catchments (5), the regionalization is performed considering that these catchments belong to the same homogeneous region. Prior to computing the regional distribution, the homogeneity is tested using the Anderson–Darling test.

3.4 Validation

A split-sample procedure has been implemented to validate both the local and regional quantile estimates. Split-sample procedures compare local and regional frequency analyses in the context they are designed for, where the objective is to predict upcoming events (Renard et al. 2013). The split-sample validation is based on the split of observations into calibration and validation samples. Here, due to the relative short length of available records and the strong inter-annual variability of floods, the split of observations into smaller sub-samples could introduce additional uncertainties. Therefore, a bootstrap sampling approach has been adopted that consists in randomly selecting for calibration n/2 years, n being the number of years available, and the validation is performed on the remaining years. This random selection is repeated 100,000 times in order to obtain stable results. The relative root-mean-square error (RRMSE) and the relative bias (RBIAS) are used to compare in validation the empirical quantiles obtained from the Hazen formula (1914) and the local or regional GEV quantiles.

$${\text{RRMSE}} = \sqrt {\frac{1}{N} \mathop \sum \limits_{i = 1}^{N} \left( {\frac{{Q_{T}^{i} - {\text{Qe}}_{T}^{i} }}{{{\text{Qe}}_{T}^{i} }}} \right)^{2} } \times 100$$
(4)
$${\text{RBIAS}} = \frac{1}{N} \mathop \sum \limits_{i = 1}^{N} \left( {\frac{{Q_{T}^{i} - {\text{Qe}}_{T}^{i} }}{{{\text{Qe}}_{T}^{i} }}} \right) \times 100$$
(5)

where \(Q_{T}^{i}\) is the local (or regional) estimate of the T-year quantile at site i, \({\text{Qe}}_{T}^{i}\) is the T-year empirical quantile at site i, and N is the number of sites in the region.

The stability of the quantile estimates obtained for different calibration periods is also quantified using the SPAN T index proposed by Garavaglia et al. (2011). This index is a measure of the span between two T-year quantiles estimated with distinct calibration data sets. For a given site i, it is defined as follows (Kochanek et al. 2014):

$${\text{SPAN}}_{T}^{\left( i \right)} = 2\frac{{\left| {\hat{q}_{T}^{\left( i \right)} \left( {c1} \right) - \hat{q}_{T}^{\left( i \right)} \left( {c2} \right)} \right|}}{{\hat{q}_{T}^{\left( i \right)} \left( {c1} \right) + \hat{q}_{T}^{\left( i \right)} \left( {c2} \right)}}$$
(6)

The closest SPAN T is to zero; the most stable is the quantile estimation between the two calibration samples.

In addition, to validate the regional distributions, a leave-one-out procedure (also called jackknife) is adopted. It consists in removing each basin in turn, computing the regional distribution with the remaining basins, and comparing the empirical quantiles with the at-site and regional estimates using the RBIAS and RRMSE aforementioned.

4 Results

4.1 Analysis of floods in the High Atlas catchments

The annual cycle of discharge in Fig. 2 shows maximum values during the month of April in all catchments and very low discharge values during the summer months. In all the catchments, floods are caused either by heavy rainfall events leading to short events, occurring mainly between September and January, or by a combination of heavy rainfall and snow melt, during the spring. The R’dat, N’Fis and Zat catchments have a very similar annual cycle, whereas in the Rheraya and Ourika, the catchments with the highest altitudes, a stronger influence of snow-driven events is observed. To evaluate the importance of extreme events for water resources, in Fig. 3 is shown the contribution on annual runoff of the 1 and 10% highest daily discharges. It can be seen that 1% of daily discharge represents on average 10–20% of total runoff and 10% of daily discharge can represent up to 50–70% of total runoff. Therefore, the contribution of extremes to annual runoff is important.

Fig. 2
figure 2

Average annual cycle of discharge in the Moroccan High Atlas basins

Fig. 3
figure 3

Contribution of extremes in annual runoff

Due to this marked seasonal behavior, we extracted the seasonal maximums for autumn (September to January) and spring (February to May), to distinguish the events caused solely by rain, and the events caused by a mixture of rain and snowmelt. As shown in Fig. 4, the distribution of floods during autumn is highly skewed and much variable between catchments compared to spring events. There is less variability in spring events, associated with snowmelt, than in autumn events caused by torrential rainfalls. For most catchments, the median of flood events during the two seasons is similar, except for the Rheraya and Ourika basins, where the Wilcoxon test indicates that the median of spring events is significantly higher than the median of autumn events.

Fig. 4
figure 4

Boxplots of seasonal maximum discharge. For each box, the central mark indicates the median, and the bottom and top edges of the box indicate the first and the third quartiles, respectively. The ‘+’ symbol is for data points outside the interval of +1/−1 inter-quartile range

4.2 Seasonal regionalization

The tests of Wilcoxon, Mann–Kendall and Wald–Wolfowitz applied to the seasonal maximum values confirm the hypothesis of homogeneity, stationarity and randomness at the 1% significance level. Overall, there is no change point or trends, but for both seasons, a very strong inter-annual variability of seasonal maximum discharge is observed. The local fitting of GEV distributions to seasonal maximum discharge at each station (Figs. 5, 6) indicates that the most important floods are observed in the N’Fis and Ourika catchments, mostly due to their important area. The GEV parameter values are shown in Table 2. As shown in Fig. 4, the maximum discharge is higher during autumn, mostly for N’Fis and Ourika basin. A good local fit of the GEV distribution is found for all basins and both seasons, except for some of the largest floods in autumn.

Fig. 5
figure 5

GEV fit to autumn maximum discharge

Fig. 6
figure 6

GEV fit to spring maximum discharge

Table 2 GEV parameters for each catchment and each season

From the local frequency analysis described above, the spring and the autumn floods are regionalized separately. All the spring and autumn events of each station are divided by their average, and then the five flood seasonal series put together form the regional sample, one for each season. Prior to fitting GEV distributions to the seasonal regional samples, their homogeneity has been tested with the Anderson–Darling test. For autumn and spring, the Anderson–Darling test does not reject the null hypothesis (at the 5% level), indicating that the seasonal flood samples are homogeneous. Finally, GEV distributions are fitted to the standardized seasonal flood samples. As shown in Fig. 7, there is a good fit for the spring events, whereas for autumn, the regional GEV underestimates extreme floods with return levels above 20 years. It can be seen that the 12 largest floods in autumn are completely outside of the fitted regional GEV distribution (Fig. 7). These 12 events correspond to floods that occurred at N’Kouris (N’Fis catchment), Sidi Rahal (R’dat) and Taferiat (Zat) between 1967 and 1991. There are two potential explanations for this singular behavior. First, autumn events are usually short, caused by heavy rainfall events, and therefore, it is difficult to obtain flow velocity measurements during these events to update the rating curves. Second, the N’Fis, R’dat and Zat catchments are less monitored than the Rheraya and Ourika catchments, and it is hypothesized that the rating curves are not frequently revised in order to take into account changes in the channels cross sections.

Fig. 7
figure 7

A-dimensional regional samples fitted with GEV distributions for autumn (left) and spring (right)

4.3 Validation of the local and regional models

The validation of the local and regional distributions is carried out for both seasons (autumn and spring), in order to evaluate whether the regional approach provides better quantile estimates than the local approach, by comparing the regional and local GEV quantiles to the empirical quantiles computed for different return levels between 2 and 50 years. Since the goal here is not to test the model for ungauged catchment, but to test whether the regional model increases the estimation of extreme quantiles, the index flood (Q index) is computed from observed data. Indeed, the case of different stations with various record lengths within a region considered homogeneous is the most common case encountered in Morocco.

4.3.1 Split-sample validation of local and regional distributions

The validation procedure described in Sect. 3.4 is applied separately for spring and autumn floods with the data of the five catchments. The results are presented in Table 3, where are shown the mean RBIAS, RRMSE and SPAN T over the 100,000 random selections of calibration and validation samples, between the local and regional models. Small differences in terms of RBIAS and RRMSE between local and regional models are found for the shorter return periods (2–10 years). The uncertainties for both local and regional estimates are becoming larger with longer return periods (20 and 50 years). However, for all return levels, the RRMSE and RBIAS are lower with the regional models compared to the local models. There is a contrasted behavior between the two seasons; the validation results are significantly lower for the autumn with very large RBIAS and RRMSE values, in particular for the local models. The low scores for the local models are strongly impacted by the results of the N’Kouris station. Indeed, the split-sample approach performed at N’Kouris with only 35 years of data provides the worst results in validation. This is due to the short length of records and, as shown in Fig. 5, the poor agreement of the local GEV model fitted at this station. Finally, the SPAN T index is always smaller for the regional estimates than for the local ones, indicating a much stronger stability of quantile estimates between two different periods when using a regional approach. This highlights the benefits of considering regional flood frequency methods in the semi-arid context of Morocco where a strong inter-annual variability of flood magnitude is observed.

Table 3 Split-sample validation results of the local and regional quantile estimates

4.3.2 Leave-one-out validation of the regional distributions

For this second validation experiment, each station is removed in turn, the a dimensional regional distribution is computed with the remaining stations, and the differences between the regional and empirical quantiles, local and empirical quantiles are compared using the RRMSE and RBIAS. As shown in Table 4, the errors are much greater for autumn events than for spring events. This result has to be related to the relatively low agreement of the regional distribution with high return levels in autumn, but also to the great variability of autumn floods between catchments. In autumn, the RBIAS is exceeding 50% on average and the RRMSE is equal to 110% on average, making the estimation of return levels for autumn events not very reliable. On the opposite, for spring the errors remain limited, with RBIAS less than 25% and RRMSE less than 29%. It can be seen that for both seasons there is an improvement of the regional estimation in comparison with the at-site approach, with lower RBIAS and RRMSE values for the regional approach. This result confirms the findings of other studies such as Anctil et al. (1998) in Canada and Lang et al. (2014) in France that a regional flood frequency analysis can improve the estimation of extreme flood quantiles.

Table 4 Leave-one-out validation results of the regional quantile estimates

5 Conclusions and perspectives

The goal of this study was to apply frequency analysis methods in the High Atlas mountainous catchments of Morocco in order to evaluate whether a regional flood frequency approach could improve the estimation of the magnitude and the occurrence of floods. Results have shown that floods provide a large contribution to annual runoff and a contrasted seasonal pattern is observed in these catchments, with floods occurring either in autumn, driven only by heavy rainfall, or in spring caused by a mixture of heavy rainfall and snowmelt. Therefore, regional distributions have been computed for each season separately, the autumn encompassing the months from September to January and spring from February to May. The validation of the seasonal regional distributions by comparing the regional and at-sites quantile estimations has shown that the regional estimation provides more robust and reliable results than the classical at-site local estimation of flood quantiles. However, there is a much greater uncertainty in the estimation of flood quantiles during autumn than during spring. This has practical implications for water resources management since spring runoff is a very important contribution to surface water resources, allowing to fill up dams and reservoirs located downstream of the catchments before the dry summer season.

Further work on the use of regional flood frequency analysis in Morocco should consider the use of a larger sample of catchments. In the present work, the definition of homogeneous regions was not possible due to the limited number of basins. By including a larger number of sites in Morocco, representing a variety of soils, topography and climate, it would be feasible to test regional methods tailored for ungauged catchments. In addition, due to the strong inter-annual variability of climate in Morocco, it could be interesting to complete the frequency analysis performed on annual sampling of floods with peaks-over-threshold sampling (Renard et al. 2006) that may better capture the distribution of extremes caused by heavy rainfalls that could have several occurrences within a year. Finally, these methods are adapted to the stationary context, but could be easily adapted to the non-stationary case (Tramblay et al. 2012), linked to climate change induced by either anthropogenic activities or natural climatic oscillations.