1 Introduction

Information on the spatio-temporal variability of extreme rainfall characteristics is of critical importance for many types of hydrologic studies related to the estimation of runoff for the planning, design, and management of various water resources systems [1, 2]. In particular, for urban and small rural watersheds that are generally characterized by fast response, the design of hydraulic structures such as small dams, culverts, storm sewers, and detention basins requires extreme rainfall input with very short durations (e.g., a few minutes or hours) for runoff simulation models [1]. More specifically, this information is extracted from the “intensity–duration–frequency” (IDF) relations, which provide extreme rainfall intensities for various durations and return periods at a given site of interest [2, 3].

In current engineering practice, the IDF relations are derived from statistical frequency analyses of annual maximum (AM) rainfall series for different durations, where available rainfall records of adequate length are used to estimate the parameters of a selected probability distribution model. However, AM rainfall records for short durations (e.g., less than one day) are often limited or unavailable at the location of interest because of the high measurement costs involved, whereas daily records are widely available. Hence, there is an urgent need to develop new methods for modeling extreme rainfall processes over a wide range of time scales, such that information on sub-daily extreme rainfalls can be inferred from the daily extreme rainfalls available at a given site.

In Canada, Environment Canada (EC) provides short-duration extreme rainfall data for nine different rainfall durations (D = 5, 10, 15, 30, 60, 120, 360, 720, and 1440 min) and the IDF relations for approximately 650 stations across Canada with at least a 10-year record [4]. The estimated extreme rainfalls for six different return periods (T = 2, 5, 10, 25, 50, and 100 years) were obtained by fitting the two-parameter Gumbel distribution to the AM series for each rainfall duration independently using the method of moments. However, it is widely recognized that the three-parameter Generalized Extreme Value (GEV) distribution fitted with the L-moment estimation method can provide more accurate extreme rainfall estimates [5,6,7]. In addition, rather than fitting the GEV distribution to the AM rainfall series for each duration independently, as in the traditional method, the scale-invariance GEV model has been shown to provide more robust estimates of extreme rainfalls since it takes into account the relationship between the statistical properties of the extreme rainfall processes over different durations [8, 9]. The scale-invariance concept has increasingly become a promising methodology for modeling various extreme hydrological processes across a wide range of time scales and, in recent years, has been applied to updating IDF curves in consideration of climate change impacts [8,9,10,11,12,13,14].
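
For reference, the Gumbel method-of-moments fit described above can be sketched as follows. This is a minimal illustration using the standard textbook formulas, not EC's actual processing code; the function names `gumbel_mom` and `gumbel_quantile` and the sample AM values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_mom(ams):
    """Method-of-moments estimates (location, scale) of a Gumbel distribution."""
    n = len(ams)
    mean = sum(ams) / n
    var = sum((x - mean) ** 2 for x in ams) / (n - 1)  # sample variance
    scale = math.sqrt(6.0 * var) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale

def gumbel_quantile(loc, scale, T):
    """Rainfall quantile for return period T (years), T > 1."""
    return loc - scale * math.log(-math.log(1.0 - 1.0 / T))

# Hypothetical 60-min annual maxima (mm) for one station
ams_60min = [18.2, 22.5, 15.1, 30.4, 19.8, 25.0, 17.3, 21.1, 27.6, 16.4]
loc, scale = gumbel_mom(ams_60min)
quantiles = {T: gumbel_quantile(loc, scale, T) for T in (2, 5, 10, 25, 50, 100)}
```

Because each duration is fitted separately in this way, nothing ties the quantiles of neighboring durations together, which is the limitation the scale-invariance GEV model is meant to address.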

In view of the above-mentioned issues, this study presents the development of new at-site IDF relations as well as regional extreme rainfall maps for urban water infrastructure design for Canada using the scale-invariance GEV model. All available historical AMS data for nine different rainfall durations from a network of 651 raingauges located across Canada were used in this study. It has been found that the newly developed IDF relations and regional extreme rainfall maps based on the proposed scaling GEV model can provide more accurate and more robust extreme design rainfalls than those given by the traditional method used by EC with the Gumbel model. Details regarding the study sites and data are described in Sect. 2. The methodology used in the development of these new products is presented in Sect. 3. Section 4 provides the results and discussion. Research findings are summarized in Sect. 5.

2 Study Sites and Data

In the present study, all available historical AM rainfall data for nine different rainfall durations ranging from 5 min to 24 h from a network of 561 raingauges located across Canada were selected, as shown in Fig. 1a. These AM data were obtained through the IDF_v3-10 program of the engineering climate dataset from the website of the Government of Canada [4]. The record length of these AM rainfall series varies from 10 to 82 years. Most of the data were recorded after 1960 and are updated to 2017. Approximately two thirds of these stations have less than 30 years of record: stations with 10 to 20 years of record account for 40% and those with 20 to 30 years account for 27%. Stations with very long records (at least 50 years) account for only 6%. In total, more than 5000 AM rainfall series were considered in this study.

Fig. 1 (a) Locations and record lengths of 561 stations. (b) Correlogram of means of annual maximum (AM) rainfalls of nine different rainfall durations (D = 5, 10, 15, 30, 60, 120, 360, 720, and 1440 min). Note: meanD1 denotes the mean of the 5-min AM rainfall series and meanD9 denotes the mean of the 1440-min AM rainfall series

As shown in Fig. 1a, the selected stations are located across the vast land of Canada to capture, as far as possible, the spatial variations of extreme rainfall events induced by the diverse climatic regions and the complex topography of Canada. However, the raingauge density is not uniform across the Canadian regions. In fact, only 5% of the stations are scattered over the three Northern territories, whereas 95% are located in the Southern provinces of Canada where most people live. Even among these provinces, the raingauge density varies significantly. The three largest provinces, namely the Pacific province (British Columbia) and the Central provinces (Ontario and Quebec), hold 66% of the stations, while the three Prairie provinces (Alberta, Saskatchewan, and Manitoba) hold 18% and the remaining four Atlantic provinces (New Brunswick, Nova Scotia, Prince Edward Island, and Newfoundland and Labrador) hold only 11%.

The mean extreme rainfall values of the AM series available at different locations were computed for each rainfall duration. The Pearson correlation coefficients (CC) between these computed mean values for all nine rainfall durations were analyzed and plotted in Fig. 1b. In general, the results show strong correlations between these mean values, especially between any pair of successive rainfall durations (the minimum CC is 0.87, between the 120-min and 360-min extreme rainfalls). In particular, the correlations between extreme rainfalls for very short durations (from 5 min to 1 h) are very strong (the minimum CC is 0.92, between the 5-min and 1-h extreme rainfalls). Therefore, the strong correlations between extreme rainfalls over different rainfall durations identified in this study should be taken into consideration when performing frequency analyses of these extreme rainfall series, rather than treating the series independently as in existing traditional methods.
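
As an illustration of how such a correlogram can be computed, the following minimal sketch builds the Pearson correlation matrix between the station means of the nine durations. The station means here are synthetic, and the function name `duration_correlogram` and all numerical settings are assumptions made for the example only.

```python
import numpy as np

def duration_correlogram(station_means):
    """Pearson correlation matrix between durations.

    station_means: array of shape (n_stations, n_durations), where each
    column holds the mean AM rainfall of one duration at every station.
    """
    x = np.asarray(station_means, dtype=float)
    return np.corrcoef(x, rowvar=False)  # columns are the variables

# Synthetic station means (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
durations = np.array([5, 10, 15, 30, 60, 120, 360, 720, 1440])  # minutes
station_effect = rng.gamma(shape=4.0, scale=2.0, size=50)
means = np.column_stack([
    station_effect * (d / 5.0) ** 0.3 * rng.lognormal(0.0, 0.1, size=50)
    for d in durations
])
cc = duration_correlogram(means)  # cc[0, 1]: CC between meanD1 and meanD2
```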

3 Methodology

3.1 Point-Scale Rainfall IDF Estimation

The scale-invariance approach based on the GEV distribution [8, 9] was used to derive point-scale extreme rainfall IDF relations. The procedure consists of the following steps:

  (i) calculate probability weighted moments (PWMs) of rainfall amounts for each AMS from 5-min to 1-day duration,

  (ii) fit regression models to PWMs over different rainfall durations,

  (iii) estimate the scaling exponents and derive moments of sub-daily and sub-hourly durations from those of daily duration,

  (iv) derive the distribution of each sub-daily and sub-hourly annual maximum series (AMS),

  (v) use the derived distribution to compute rainfall intensity for a given rainfall duration and return period of interest.
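
Steps (i) to (iii) can be sketched as follows for the simplest case of a single scaling regime. The sketch assumes the standard unbiased sample PWM estimators and a power-law relation between each PWM and duration, so that the scaling exponent is the slope of a log–log regression; the function names are hypothetical and this is not the exact implementation of [8, 9].

```python
import numpy as np

def sample_pwms(ams):
    """Unbiased sample PWMs b0, b1, b2 of one annual maximum series."""
    x = np.sort(np.asarray(ams, dtype=float))  # ascending order
    n = x.size
    j = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((j - 1) / (n - 1) * x) / n
    b2 = np.sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * x) / n
    return np.array([b0, b1, b2])

def scaling_exponents(durations, pwms):
    """Slope of log(PWM) versus log(duration) for each PWM order.

    pwms: array of shape (n_durations, 3), one row per duration.
    """
    log_d = np.log(np.asarray(durations, dtype=float))
    pwms = np.asarray(pwms, dtype=float)
    return np.array([np.polyfit(log_d, np.log(pwms[:, r]), 1)[0]
                     for r in range(pwms.shape[1])])

def downscale_pwms(pwm_daily, exponents, duration, daily=1440.0):
    """Derive the PWMs of a shorter duration (min) from the daily PWMs."""
    return np.asarray(pwm_daily) * (duration / daily) ** np.asarray(exponents)
```

Under simple scaling the three slopes are expected to be approximately equal, which offers a quick consistency check before the sub-daily PWMs are derived from the daily ones.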

First, PWMs of each AMS from 5-min to 24-h rainfall duration are calculated for a given location of interest. Next, regression models are constructed based on the relationships between the computed PWMs of rainfall amounts for different rainfall durations. Based on these statistical relationships, the scaling exponents can then be estimated. Previous studies have indicated that extreme rainfall processes for durations ranging from a few minutes to several days can be well approximated by only one or two scaling regimes (see, e.g., [8, 9, 12, 14, 15]). By inspecting the log–log plot of rainfall PWMs versus rainfall durations, one can identify a single scaling regime (if the plot displays a straight line without any break point) or two different scaling regimes (if the plot shows a break point) over the selected time scales. For a large number of stations, it is simpler to assume one break point and then check the scaling exponents of the two scaling regimes. If they are equal or approximately equal, then only one scaling regime exists. A numerical criterion was employed to automate the identification of the break point location and to avoid subjectivity [8]. The procedure first fits a piecewise (or segmented) linear regression model to the log–log plot of the first three statistical moments versus rainfall durations with the break point near one end and computes the model residuals. It then iterates through all intermediate points toward the other end and finally compares the ranking of the residuals to determine the best fit, such that the group-wide and total residuals are minimized. Only the first three statistical moments are used in the search because they are sufficient to estimate the three parameters of the GEV distribution. Once the scaling exponents are computed, a scaling model is constructed that allows the moments of rainfall amounts for sub-hourly and sub-daily durations to be derived from those of daily values.
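
The break point search can be illustrated with the simplified sketch below, which tries every interior duration as a candidate break point, fits a separate straight line on each side in log–log space, and keeps the candidate with the smallest total sum of squared residuals. This is a stand-in for the ranking-based criterion of [8], whose exact form is not reproduced here; the function names and the minimum of three points per segment are assumptions.

```python
import numpy as np

def _fit_line(x, y):
    """Least-squares line; returns (slope, sum of squared residuals)."""
    coef = np.polyfit(x, y, 1)
    resid = y - np.polyval(coef, x)
    return float(coef[0]), float(np.sum(resid ** 2))

def find_breakpoint(durations, moment, min_pts=3):
    """Search for a single break point in a log-log moment-duration plot.

    Returns (break duration, slope below, slope above, total SSE).
    The break point itself is shared by both segments.
    """
    d = np.asarray(durations, dtype=float)
    x, y = np.log(d), np.log(np.asarray(moment, dtype=float))
    best = None
    for k in range(min_pts - 1, x.size - min_pts + 1):
        s1, e1 = _fit_line(x[:k + 1], y[:k + 1])  # short-duration regime
        s2, e2 = _fit_line(x[k:], y[k:])          # long-duration regime
        if best is None or e1 + e2 < best[3]:
            best = (float(d[k]), s1, s2, e1 + e2)
    return best
```

If the two returned slopes are approximately equal, the station can be treated as having a single scaling regime, as described above.
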

From the derived moments, the parameters of the theoretical distribution model are estimated using the method of L-moments for each sub-daily and sub-hourly AMS. The rainfall intensity for a given rainfall duration and return period of interest can then be obtained from the derived distribution, even for an “unobserved” duration between 5 min and 1 day. Since this scaling GEV model considers the extreme rainfall data for all durations available at a site, together with the relationships between these durations, the results are more consistent and more robust than those given by the traditional technique, which fits the chosen probability model to the observed AM data for each duration independently [3, 7].
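
Steps (iv) and (v) can be sketched as follows. The sketch converts the first three PWMs to L-moments and uses Hosking's widely used approximation for the GEV shape parameter; the shape parameter `k` follows Hosking's sign convention (negative `k` gives a heavy upper tail). Function names are hypothetical, and the sketch is an illustration under these assumptions rather than the authors' implementation.

```python
import math

def gev_from_pwms(b0, b1, b2):
    """GEV parameters (location xi, scale alpha, shape k) from PWMs."""
    l1 = b0                         # first L-moment (mean)
    l2 = 2.0 * b1 - b0              # second L-moment (L-scale)
    l3 = 6.0 * b2 - 6.0 * b1 + b0   # third L-moment
    t3 = l3 / l2                    # L-skewness
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c ** 2  # Hosking's approximation
    if abs(k) < 1e-6:               # Gumbel limit
        alpha = l2 / math.log(2.0)
        xi = l1 - 0.5772156649 * alpha
    else:
        g = math.gamma(1.0 + k)
        alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
        xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k

def gev_quantile(xi, alpha, k, T):
    """Rainfall quantile for return period T (years), T > 1."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(k) < 1e-6:
        return xi - alpha * math.log(y)
    return xi + alpha / k * (1.0 - y ** k)
```

Dividing the resulting rainfall depth by the duration gives the intensity plotted on an IDF curve.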

3.2 Grid-Scale Extreme Rainfall Estimation

The at-site IDF estimation procedure provides the extreme design rainfall quantiles for a given location with available extreme rainfall data (i.e., a gauged site). However, given the low density of IDF sites over Canada's very large territory, observed data are commonly unavailable at the location of interest (i.e., an ungauged site). Consequently, it is necessary to transfer information from the gauged locations to the ungauged locations. Spatial interpolation techniques are frequently used in practice to tackle this issue by fitting a surface (represented by a raster dataset) to all IDF sample measurement points (i.e., control points). Based on the fitted surface, the value at any given location in the output raster can be predicted. The number and distribution of control points can greatly influence the accuracy of the spatial interpolation.

There are several procedures for performing spatial interpolation, and each method is often referred to as a geospatial fitting model. Each model applies a different computational scheme and comes with its own assumptions. In general, these spatial interpolation techniques are categorized into non-geostatistical and geostatistical approaches [16]. The non-geostatistical methods assign values to a target location (or cell) based on the surrounding measured values using a specific mathematical equation that determines the smoothness of the resulting surface. Popular methods in this category include inverse distance weighting, proximity (or Thiessen method), natural neighbor, and spline. The geostatistical methods rely on the statistical relationship among the measured points (often known as the semi-variogram, or variogram for short) to estimate the value for a target location (or cell). With the variogram, geostatistical techniques are able not only to predict the value at a target site but also to provide some measure of the uncertainty associated with the prediction. Popular techniques in this category include the ordinary kriging and universal kriging methods.

In this study, the inverse square distance weighting (IDW) approach is used for the spatial interpolation. This simpler approach has been commonly used in comparative studies of spatial interpolation of extreme rainfalls [17, 18] and has been shown to provide reasonable results compared to more complicated approaches such as ordinary kriging. The IDW method assumes that each measured point has a local influence that diminishes with distance. It uses a simple mathematical model to predict the value at an unmeasured location based on the values of the control points surrounding the target location and the distances between them. The control points closer to the prediction location have more influence on the predicted value than those farther away.
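
A minimal sketch of the inverse square distance weighting estimator is given below. It assumes planar (projected) coordinates with Euclidean distances and uses all control points for every target cell; search radius, neighbor limits, and the raster handling of the actual product are not specified in the text and are omitted. The function name `idw_interpolate` is hypothetical.

```python
import numpy as np

def idw_interpolate(xy_ctrl, values, xy_target, power=2.0):
    """Inverse distance weighting; power=2 gives inverse square weights.

    xy_ctrl:   (n_ctrl, 2) coordinates of control points (gauged sites)
    values:    (n_ctrl,) design rainfall at the control points
    xy_target: (n_target, 2) coordinates of the cells to estimate
    """
    xy_ctrl = np.asarray(xy_ctrl, dtype=float)
    values = np.asarray(values, dtype=float)
    xy_target = np.atleast_2d(np.asarray(xy_target, dtype=float))
    # Pairwise distances, shape (n_target, n_ctrl)
    dist = np.linalg.norm(xy_target[:, None, :] - xy_ctrl[None, :, :], axis=2)
    out = np.empty(len(xy_target))
    for i, row in enumerate(dist):
        hit = row == 0.0
        if hit.any():
            out[i] = values[hit][0]  # target coincides with a control point
        else:
            w = 1.0 / row ** power   # closer points receive larger weights
            out[i] = np.sum(w * values) / np.sum(w)
    return out
```

Because the estimate is a weighted average, interpolated values always lie between the smallest and largest control values, so the method cannot extrapolate beyond the observed range.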

3.3 Model Assessment Criteria

In this study, two dimensionless indices are used for comparing the performance of the selected models. These criteria are the coefficient of determination (\(R^2\)) and the mean absolute relative deviation (MADr), as given in the following equations:

$$R^2 = \left[ \frac{\sum_{i=1}^{n} (x_i - \overline{x})(y_i - \overline{y})}{\left\{ \sum_{i=1}^{n} (x_i - \overline{x})^2 \sum_{i=1}^{n} (y_i - \overline{y})^2 \right\}^{1/2}} \right]^2$$
(1)
$${\text{MADr}} = \frac{1}{n}\sum_{i=1}^{n} \frac{|x_i - y_i|}{x_i}$$
(2)

where \(x_i, i = 1, 2, \ldots, n\) are the observed values and \(y_i, i = 1, 2, \ldots, n\) are the estimated values; \(n\) is the sample size; \(\overline{x}\) and \(\overline{y}\) denote the average values of the observed and estimated quantiles, respectively.
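
Equations (1) and (2) translate directly into code. The following sketch assumes that the observed values are strictly positive, as is the case for rainfall quantiles; the function names `r_squared` and `madr` are hypothetical.

```python
import numpy as np

def r_squared(observed, estimated):
    """Coefficient of determination as defined in Eq. (1)."""
    x = np.asarray(observed, dtype=float)
    y = np.asarray(estimated, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return (np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))) ** 2

def madr(observed, estimated):
    """Mean absolute relative deviation as defined in Eq. (2)."""
    x = np.asarray(observed, dtype=float)
    y = np.asarray(estimated, dtype=float)
    return np.mean(np.abs(x - y) / x)
```

Note that \(R^2\) as defined here measures linear association only, so a systematic bias in the estimates would show up in MADr rather than in \(R^2\).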

4 Results and Discussion

4.1 Model Performance Assessment

To assess the performance of the scaling model in deriving the distribution of sub-daily and sub-hourly AM rainfalls, the estimated short-duration extreme rainfall quantiles for different durations (D = 5 min to 12 h) and different return periods (T = 2 to 50 years) were compared with the observed data obtained from the at-site frequency analysis using the GEV distribution. Since the average record length of all stations is 25.5 years, only extreme rainfall quantiles for return periods up to T = 50 years were chosen for comparison in order to obtain reliable rainfall estimates. The estimation of rainfall quantiles for return periods longer than twice the sample length involves a high degree of uncertainty due to extrapolation and should be used with caution. The two indices \(R^2\) and MADr were used to investigate model performance for different rainfall durations, return periods, geographical locations, and sample record lengths.

For different rainfall durations and return periods, Table 1 shows that the estimated extreme rainfall quantiles given by the proposed GEV scaling model agree very well with the observed data. In particular, for rainfall quantiles of return periods within the average record length, the lowest \(R^2\) is 0.93 and the highest MADr is 7.3%. For those of return periods equal to twice the average record length, \(R^2\) is 0.89 and MADr is 9.7%. The exception is the 6-h duration, for which the model performance is slightly lower. This could be due to the complicated atmospheric conditions that cause mixed extreme rainfall processes at this duration; that is, severe thunderstorm events of very short durations lasting minutes to hours are embedded in a low-pressure or frontal system lasting from a few hours to a day. Note that these extreme rainfall quantiles are also often underestimated in the traditional approach when a linear regression model is used between rainfall intensities over different rainfall durations.

Table 1 \(R^2\) and MADr results for different rainfall durations (5 min to 12 h) and different return periods (2–50 years)

Comparison between the estimated and observed data for each province and territory of Canada indicates that the scaling approach works well for the different climate regions in Canada. In general, the two indices show that the proposed scaling GEV model provides similar performance for all provinces and territories, with none standing out. In particular, \(R^2\) values are very high (at least 0.95) and MADr values are relatively low (less than 12%) for most stations. Only a few stations show slightly lower accuracy, with \(R^2\) and MADr of approximately 0.9 and 16%, respectively; these stations mainly have less than 20 years of record. Investigation of the effect of record length on the agreement between the estimated and observed data also shows that, as the sample length increases, the correlation tends to be higher and the difference between them decreases. In detail, \(R^2\) gradually increases from 0.9 (for stations with less than 20 years of record) to 0.98 (for stations with more than 50 years of record), while MADr decreases from 16 to 7%.

4.2 Point- and Grid-Scale IDF Relations

The point-scale IDF relations were generated for all study stations across Canada. Results for several selected durations and return periods frequently used in practice are summarized in Fig. 2. Similar to the distributions of the means of extreme rainfalls in different provinces and territories across Canada, the results show three main patterns in the spatial distribution of extreme rainfall events, corresponding to durations of (i) 5 min to less than 1 h, (ii) 1 h to less than 6 h, and (iii) 6 h to 24 h.

Fig. 2 Boxplots of extreme design rainfalls of different durations (D = 5 min, 1 h, 24 h in rows) and return periods (T = 2, 10, 50 years in columns) for each Canadian province and territory. Note: YT = Yukon, NT = Northwest Territories, NU = Nunavut, BC = British Columbia, AB = Alberta, SK = Saskatchewan, MB = Manitoba, ON = Ontario, QC = Quebec, NB = New Brunswick, NS = Nova Scotia, PE = Prince Edward Island, NL = Newfoundland and Labrador

For extreme rainfalls of very short durations ranging from 5 min to less than 1 h, the results show that the extreme design rainfalls decline when moving from the South to the North (see Figs. 2 and 3a). Moving from the West to the Central regions, extreme rainfalls increase, and they then decrease when moving further from the Central regions to the East. In fact, the highest values occur in the Northwest Territories for Northern Canada and in Ontario for the Southern part. For instance, on average, the 5-min design rainfalls of 10-year return period increase from 4 mm in Yukon to 6 mm in the Northwest Territories and then decrease to 2.5 mm in Nunavut. In the South, they rise from 5 mm in British Columbia to 12.5 mm in Ontario and then decline to 7 mm in Newfoundland and Labrador. The 10-min, 15-min, and 30-min extreme rainfalls display the same behavior as that observed for the 5-min design values.

Fig. 3 An example of (a) point-scale and (b) grid-scale extreme design rainfalls of 5-min duration and 10-year return period. For (a), at-site rainfall depths are represented by circle markers with continuous diameter and color scales (larger diameters and darker colors represent higher values). For (b), rainfall depths are plotted using a continuous color scale (darker colors represent higher values), while the locations of control points are represented by red dots

For extreme rainfalls of short durations ranging from 1 h to less than 6 h, the results for the Northern territories show a pattern similar to that observed for the shorter-duration events. The Southern provinces, however, display a relative increase in extreme rainfall amounts in the Atlantic region compared to the other regions. For example, for the Northern part, on average, the 1-h extreme design rainfalls of 10-year return period increase from 12 mm in Yukon to 16 mm in the Northwest Territories and then decrease to 8 mm in Nunavut. For the Southern regions, there are two peaks of extreme design rainfalls when moving from the West to the East Coast. For instance, on average, the 1-h design rainfalls of 10-year return period rise from 15 mm in British Columbia to a first peak of 35 mm in Ontario and then decline to 30 mm in Quebec and New Brunswick. They then increase again to a second peak of 31 mm in Nova Scotia (slightly higher than in New Brunswick) before dropping to 22 mm in Newfoundland and Labrador.

For extreme rainfalls of durations ranging from 6 to 24 h, a similar pattern was found for Northern Canada, with the peak in the Northwest Territories, but the differences in rainfall depths between the three territories are much smaller. For the Southern part, the largest extreme design rainfalls shift to Nova Scotia. Except for British Columbia, the provinces follow a distinct pattern. For example, the 24-h design rainfalls of 10-year return period increase from 60 mm in Alberta to 90 mm in Nova Scotia and then decrease to 75 mm in Newfoundland and Labrador. Compared to other regions, British Columbia displays exceptional behavior, with a very large variation (i.e., a very large box and long whiskers) of design rainfall values between its gauged stations (i.e., design rainfalls range between 25 and 175 mm). Furthermore, the largest station values also occur in British Columbia rather than in Nova Scotia. Several stations show values as large as 200–300 mm, compared to the largest value in Nova Scotia of only 130 mm. These stations are mainly located in the Northwest of Moresby Island, on the Southwest side of Vancouver Island, and in the South of North Vancouver.

For ungauged sites where the developed point-scale IDF relations are unavailable, spatial interpolation based on the inverse square distance weighting method was used to estimate the values at these locations by transferring information from the neighboring IDF stations. The computed extreme design rainfall atlas for each rainfall duration and return period is presented in the form of a raster. These rasters have the same resolution and spatial extent as the ANUSPLIN daily rainfall series product produced by Natural Resources Canada [19]. For illustrative purposes, Fig. 3b shows the computed grid-scale extreme design rainfalls for 5-min duration and 10-year return period. It can be seen that in the Southern part of Canada, where the gauge density is high, the interpolated values are very consistent and agree closely with the general trend in the spatial distribution and variation of extreme rainfalls from the West to the East Coast discussed earlier based on the sample values at the control points. However, in Northern Canada, due to the low density of observing stations, most of the interpolated values do not follow the general trend and are often overestimated. More control points are required to enhance the accuracy of the estimation for these regions. These could come from other data sources and products such as radar rainfall, although merging these products could be another challenge. In addition, other geospatial interpolation techniques, such as thin-plate spline and ordinary kriging, could be used and compared with the current IDW-based product to improve the results.

5 Conclusions

Extreme rainfall IDF relations are considered a critical tool for the design of various urban water infrastructures. In Canada, these IDF relations are derived from statistical frequency analyses of annual maximum rainfall series (AMS) for nine different durations (ranging from 5 min to one day) by fitting the two-parameter Gumbel distribution to the observed extreme rainfall data independently for each rainfall duration. The present study proposed a new procedure for developing at-site IDF relations as well as regional extreme rainfall maps for urban infrastructure design for Canada. The proposed procedure is based on an improved three-parameter scaling GEV model to provide more accurate and more robust estimates of extreme design rainfalls.

In this study, a data set of more than 5000 annual maximum rainfall series (AMS) for nine different rainfall durations from a network of 651 stations located across Canada was selected. First, a detailed analysis of these AMS was performed to identify the scaling behaviour of these extreme rainfall processes. In general, it was found that the extreme rainfall events exhibit different scaling regimes over different regions with diverse climatic conditions. On the basis of this scaling investigation, the scaling GEV distribution was selected to describe the distribution of extreme rainfalls over a wide range of time scales (from 5 min to one day). By accounting for the strong correlation between statistical moments of rainfall amounts over different rainfall durations and based on the PWM method, it has been shown that the proposed scaling GEV model can provide more accurate and more robust extreme rainfall estimates than those given by traditional methods. More specifically, the computed extreme rainfalls for different rainfall durations, return periods, geographical locations, and sample lengths were found to agree closely with the observed data. For example, for rainfall quantiles of return periods within the average record length, the lowest \(R^2\) is 0.93 and the highest MADr is 7.3%.

The present study provided the at-site IDF relations for all 561 study sites located across Canada. General trends in the spatial distribution and variation of extreme design rainfalls were also investigated. In general, there are three main patterns, associated with extreme events of less than 1 h, from 1 h to less than 6 h, and from 6 h to 1 day. In particular, moving from South to North, extreme rainfalls decrease for all three patterns. Moving from the West Coast to the Central regions, extreme rainfalls of less than 1-h duration increase, and they then decrease when moving further from the Central regions to the East Coast. Those of 1-h to less than 6-h duration first peak in Ontario and then rise again to a second peak in Nova Scotia. For extreme rainfalls of 6-h duration and longer, the peak shifts from Ontario to Nova Scotia. However, British Columbia is where many of the largest extreme events occurred.

Grid-scale regional extreme rainfall maps were also developed for locations without observed data in Canada. The spatial interpolation technique based on the inverse square distance weighting model was applied to transfer the values from control points (i.e., observed IDF stations) to the ungauged sites. Results show that for the Southern regions, where the control point density is high, the interpolated values agree closely with the spatial distribution and variation trend from West to East. However, for the Northern territories, where the gauge density is low, the interpolated values are often overestimated. It is suggested that more data and other interpolation techniques be considered in order to improve the estimation results in these regions.

Finally, it is expected that the development of improved at-site IDF relations and regional extreme rainfall maps as presented in this study could provide more cost-effective design of urban water infrastructures in Canada.