Abstract
Design rainfall intensity–frequency–duration data are a basic input to many water-related development projects. To derive design rainfalls, one needs long period of recorded rainfall data. Although daily rainfall data are generally widely available, short-duration rainfall data are scarce. For many urban applications, design rainfalls for much shorter durations are needed, which cannot be obtained directly from daily read rainfall data. This paper presents a simple approach that can be adopted to derive design rainfalls of short durations using daily rainfall data and other physio-climatic characteristics using a novel ‘index frequency combined with parameter regression technique’. This uses L moments to reduce the impacts of sampling variability in the analysis. Furthermore, this adopts generalised least squares regression to account for the inter-station correlation of the rainfall data in the analysis. The proposed method is applied to a pilot data set consisting of 203 rainfall stations across Australia. An independent Monte Carlo cross-validation test shows that the proposed method is capable of generating consistent and accurate design rainfall estimates from 6-min to 12-h duration. The developed technique can be adapted to other countries where there is a scarcity of short-duration rainfall data, but daily rainfall data are abundant.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Design rainfall data are used in a wide variety of hydrological analysis and modelling, e.g. sizing of hydraulic structures, hydrological modelling, flood plain modelling and urban storm water drainage design. Design rainfall data are generally expressed in the form of intensity–frequency–duration (IFD) curves, which give the annual exceedance probability (AEP) or average recurrence interval (ARI) for a particular rainfall intensity of a given duration. Examples of the derivation of IFD data can be seen in the Australian Rainfall and Runoff (ARR) (I. E. Aust. 2001) for Australia, Madsen et al. (2002, 2009) for Denmark, Baldassarre et al. (2006) for Italy, Ben-Zvi (2009) for Israel and Bonnin et al. (2006) for USA.
In ARR2001, ordinary product moments were used with log Pearson type 3 distribution to derive IFD data (I. E. Aust. 2001). The limitation of this approach was that the short record length when used with ordinary product moments introduced notable sampling variability in estimation. In recent years, use of L moments-bases regionalisation (Hosking and Wallis 1993) has become popular in rainfall estimation (e.g. Alila 1999; Koutsoyiannis and Baloutsos 2000; Abolverdi and Khalili 2010; Lee et al. 2010; Sarker et al. 2010; Yang et al. 2010; Gabriele and Chiaravalloti 2012; Zakaria et al. 2012). This is due to the fact that L moments are less sensitive to outliers and extremes in the data as compared to the ordinary product moments. In Australia, Jakob et al. (2005, 2007, 2008) adopted method of L moments and GEV distribution to derive IFD curves. A further improvement can be made to the methods of Jakob et al. (2008) by employing the generalised least squares regression (GLSR) (Stedinger and Tasker 1985) as demonstrated by Haddad et al. (2011). The advantages of the GLSR are that it explicitly accounts for differences in record lengths from site to site and inter-site correlation of the rainfall events data to enhance the accuracy of quantile estimates and to better quantify the associated estimation error. The GLSR differentiates between model error and sampling error in contrast to ordinary least squares which considers all the error in regression as ‘model error’. The combination of L moments and GLSR presents an attractive regional frequency analysis tool as demonstrated by Madsen et al. (2002) in an application of regional rainfall frequency analysis in Denmark. They found that the GLSR framework allowed for easier representation of regional homogeneity and regional variability in the partial duration series parameters, while also allowing for the quantification of the associated uncertainty with IFD estimates at any arbitrary site in the region. More recently, the GLSR has been successfully applied for the regionalisation of flood data in Australia and this has been identified as the ‘preferred method’ for the regional flood frequency analysis in Australia (Haddad et al. 2012; Haddad and Rahman 2012).
In a previous paper by Haddad et al. (2011), the applicability of the GLSR method to estimate design rainfalls has been demonstrated, which focused on the longer-duration rainfall events. The emphasis of this paper is to extend the method to shorter durations, e.g. 6 min. The urban hydrology applications often need to estimate short-duration design rainfalls (e.g. Borga et al. 2005; Egodawatta et al. 2007; Aly et al. 2009). For example, design of urban drainage systems, the catchment size is generally smaller (e.g. few hectares), thus making the catchment response time too short, which then need IFD data of relatively shorter durations (e.g. 6 min) with hydrological and hydraulic models. This paper develops a technique that can be used to derive short-duration design rainfalls using long-duration rainfall statistics utilising L moments and GLSR. The proposed method is tested using a Monte Carlo cross-validation procedure (details can be seen in Haddad et al. 2013).
2 Study area and data
A total of 203 rainfall stations across Australia were used in this study. This data set was originally prepared and used by Jakob et al. (2005) and obtained from Australian Bureau of Meteorology. The rationale of selecting these stations was that they had a reasonably longer record lengths, data of good quality and provided a good spatial coverage of the Australian continent, with a particular emphasis on the east coast. The geographical distribution of the selected rainfall stations is shown in Fig. 1. The data set although covers all the Australian states, a greater number of stations are included from the east coast. The record lengths of the annual maximum series of these stations range from 30 to 115 years (mean 39 years and SD 12 years). The annual maximum rainfall event series data from these stations were abstracted. The rainfall event durations considered in this study are: (a) 6 min (sub-hourly duration); and (b) 1, 2, 3 and 12 h (sub-daily durations). The predictor variables considered in the regression analysis are: (a) latitude (in decimal degrees); (b) longitude (in decimal degrees); (c) distance from coast (in decimal degrees); and (d) various statistics such as the 24-, 12- and 1-h duration mean rainfall and the 24-, 12- and 1-h duration L-coefficient of variation (L-CV) and L-skewness (L-SK). These predictor variables were found to be important in predicting design rainfalls in Australia by Jakob et al. (2007) and hence adopted in this study. In developing the index rainfall model, stations with a record length equal to or greater than 30 years were used, while stations with a minimum of 45 years were used for L-CV and L-SK models since L-CV and L-SK are more susceptible to sampling variability.
3 Methods
In estimating the design rainfall quantiles, the GEV distribution was used for sub-daily durations following Jakob et al. (2005, 2007). For the 6-min duration, the generalised Pareto (GP) distribution was adopted as this was found to be the best fit distribution for the study data set based on goodness-of-fit testing as described in Sect. 4.2.
In estimating the rainfall quantiles for a given duration, index rainfall approach was used; details of this can be seen in Hosking and Wallis (1997) and Haddad et al. (2011). Using the mean value as the index rainfall parameter, the regional T-year event estimator can be written:
where \(\hat{I}_{{T_{i} }}\) is design rainfall intensity for a given duration i and T, \(\hat{\mu }_{i}\) is regional mean rainfall (index rainfall) for a given duration I and T-year, and \(\hat{z}_{T}\) is regional growth factor for T-year.
In this paper, we adopted an index frequency approach combined with parameter regression technique (Haddad and Rahman 2012; Haddad et al. 2012) in that the L-CV and L-SK varied from site to site to reflect the at-site physiographic and climatic characteristics unlike the typical index frequency approach of Hosking and Wallis (1993). The advantage of our approach was that it did not require a strict criterion of homogeneity like the quantile regression technique as adopted by USGS (Thomas and Benson 1970).
The prediction equations for index rainfall, L-CV and L-SK were developed using GLS regression. The detailed methodology is provided in the supplementary section.
It was found that the L-SK model did not satisfy the model assumptions for the 6-min duration data, and hence, it was decided to search among the one- and two-parameter distributions for a suitable parent distribution for the 6-min duration. The candidate distributions considered were the generalised Pareto (GP), gamma (GAM), extreme value type 1 (EV1) and exponential (EXP). The Bayesian information criterion (BIC) and the Akaike information criterion (AIC) were used to evaluate the goodness-of-fit of the candidate distributions. Further details on this can be seen in the supplementary section.
The design rainfall estimates obtained as above were compared with the ARR2001 and at-site estimates. At-site estimates were obtained by fitting the GEV/GP distribution (depending on the duration) using the method of L moments.
In assessing the model performances, we compared the regional estimates obtained by the method presented here with the at-site estimates for a set of randomly selected test stations. It should be noted here that at-site estimates are subject to error such as sampling variability due to short record lengths, selection of the probability distribution and the method of parameter estimation. Regional values are preferred (e.g. generalised methods) to derive IFD data, which considers spatial smoothing and regional characteristics in the estimation. Nevertheless, comparison with at-site estimates has some merits, e.g. to assess whether the regional estimates show large differences from the at-site values, which might indicate gross errors in the adopted regionalisation method.
Here, a Monte Carlo cross-validation was adopted; from the 203 stations, 100 stations were left for the model validation. These 100 stations were selected randomly. The Monte Carlo cross-validation was then achieved by random sampling (with replacement) 30 stations (as test stations) from the 100 stations and applying the developed GLSR/regional method. This procedure was repeated 1,000 times. This enabled a sufficient degree of independent validation by including stations in the validation data set from all over Australia with different physiographic and meteorological characteristics. It should be mentioned here that instead of 100 stations (which is about 50 % of the data set), some other proportion of validation stations could have been adopted; however, use of 100 stations was deemed to be sufficient to capture the variability and representativeness of the study stations.
We assessed the relative differences firstly between two different growth curves and quantile estimates, i.e. GLSR/regional and at-site values. We used a relative bias statistic (BIAS r ) given by Eq. 2 or comparing the growth curves and the relative root mean square error (RMSE r ) by Eq. 3 for comparing the quantiles. Both the BIAS r and RMSE r provide an indication of the overall accuracy of the regional model, with the RMSE r looking at the predictive variance of the rainfall quantiles estimated by the new method. The BIAS r provides an indication of whether a model is over or under predicting as compared to the site values. Positive (negative) values of BIAS r occur when the regionalized estimates are higher (lower) than the site values.
where I obs refers to at-site estimates, and I new refers to new GLSR estimates.
4 Results
4.1 Development of regional prediction equations using GLSR
The inter-site correlation was found to depend on the duration, e.g. the correlation being higher for the 24-h duration as compared to the 1-h duration. The developed prediction equations for the index rainfall for the 1-, 2-, 3- and 12-h durations along with the associated model error variances are presented in Table 1. The \(\bar{R}_{\text{GLS}}^{2}\) values of index rainfall, L-CV and L-SK models are shown in Table 2. As seen in Table 2, the \(\bar{R}_{\text{GLS}}^{2}\) values of the developed prediction equations are quite high, in particular for the index rainfall and L-CV, which indicate a relatively good fit. Table 1 shows that the model error variances of the regression models are either zero or close to zero, which means that the sampling error dominated the total error in the regional analyses.
The standardised residual versus fitted values plots did not show any particular pattern, and there was no true outlier sites having undue influence on the regression. Figure 2 shows an example plot for the 1-h duration index rainfall and L-CV. For practical purposes, the standardised residuals should fall between ± 2, which has been mostly satisfied in Fig. 2.
The QQ plots also revealed pleasing results for the index rainfall and L-CV models for all the durations (e.g. Fig. 3 shows the results for 1-h duration). If the standardised residuals were indeed normally and independently distributed with mean 0 and variance 1, the slope of the best fit line in the QQ plot, which can be interpreted as the standard deviation of the sample, should approach one and the intercept, which is the mean of the sample, should approach 0 as the number of sites increases. Figure 3 indeed shows that the fitted line passes through the origin (0, 0) and it has a slope approximately equal to one. These results indicate that the developed prediction equations satisfy the model assumptions quite well. The QQ plots of the standardised residuals for L-SK for all the daily and sub-daily durations showed reasonably good results. The above results indicate that the GLSR method performs reasonably well for the study data set.
Table 1 presents the prediction equations for the index rainfall, L-CV and L-SK for the 6-min duration. As can be seen in Table 2, the 6-min duration index rainfall model performs relatively well as indicated by the reasonably high \(\bar{R}_{\text{GLS}}^{2}\) (75 %). The L-CV model has provided an \(\bar{R}_{\text{GLS}}^{2}\) of 55 % which is 10 % lower than the lowest \(\bar{R}_{\text{GLS}}^{2}\) of the sub-daily durations (2 h). The L-SK model shows a much lower \(\bar{R}_{\text{GLS}}^{2}\) value of 34 %. In saying this, however, its \(\bar{R}_{\text{GLS}}^{2}\) value is still comparable to the 3-h duration L-SK model.
Model assumptions were checked for normality and outliers for the 6 min as was done for the sub-daily durations. For the index rainfall, the standardised residuals versus fitted values plot in Fig. 2 reveal no true outlier site; also, no trend could be detected. The QQ plot of the standardised residuals in Fig. 3 shows that the slope is 0.99. The standardised residuals versus fitted values plot for L-CV did not show any outlier, and the QQ plot had a slope approximately 0.97. For L-SK (figures not shown), the standardised residuals showed a number of low and high outliers, with some of the standardised residual values being beyond ±2.5. The QQ plot also revealed a few high and low outliers suggesting a poor fit for L-SK. This indicates that fitting a three-parameter distribution might be inappropriate, which provided the motivation in this paper to explore the one- and two-parameter distributions for the 6-min duration.
4.2 Goodness-of-fit test for 6-min duration
In selecting a suitable distribution for the 6-min duration, the distribution that achieved the best score considering the two tests (i.e. AIC c and BIC) was taken as the parent distribution. The AIC c test favoured the GP, EV1, GAM and EXP for 50, 18, 18 and 13 % of cases, respectively. The BIC test favoured the GP, GAM, EXP and EV1 distributions for 43, 28, 17 and 12 % of the cases, respectively. Considering these results, the GAM and EXP distributions showed the poorest fit while the GP distribution showed the best fit, as summarised in Table 3. Since the GP distribution showed the best fit among the distributions considered here, it was adopted in this paper as the regional parent for the 6-min duration rainfall event.
4.3 Growth curve and quantile estimation
The growth curves were estimated for each of the sets of 30 test sites for the ARIs of 5, 20 and 100 years for the sub-daily and sub-hourly durations using index rainfall approach. To assess whether the developed regional growth curves under or over estimated the at-site growth curves, we calculated the relative bias (BIAS r ). Results are summarised in Table 4.
For the 5-year ARI and 1–12-h durations, the BIAS r ranged 1–1.9 % indicating the regional growth curves slightly overestimated the at-site ones. This bias may be considered to be negligible. From Table 4, the results for the 20-year ARI also reveal slight over estimation for the 1–12 h durations. For the 100-year ARI, however, both the 3 and 12 h showed little underestimation. All the three ARIs considered here seem to slightly underestimate the at-site values for the 6-min duration. For the 5-year ARI, taking the last 50 simulation results, out of 350 cases (50 simulations and 7 durations considered), 30 % of cases (i.e. 102/350) showed slight underestimation (with—6 % in the worst case, but for the most cases, values were falling between −6 and 0 %). For the 20-year ARI, 36 % of the cases showed underestimation (with—12 % in the worst case), similarly for the 100-year ARI, 38 % of cases showed underestimation (with—12 % in the worst case. Overall, the results suggest that there is no major difference between the regional GLSR estimated growth curves and at-site ones, which suggests that the proposed GLSR method provides reasonable estimation in most cases and does not contain any gross error.
Using the regional growth curves estimated above, we derived the rainfall quantile estimates using Eq. 1; here \(\hat{\mu }\) (the mean rainfall intensity value for a given duration at a site) was obtained from the developed regional prediction equations presented in Tables 1 and 2 (instead of at-site mean value) following the approach by Madsen et al. (2002).
4.4 Comparison of rainfall quantiles
Quantile estimates obtained by the new method developed here (regional estimates referred to as new estimates) were compared with the at-site estimates. As mentioned before, 30 test stations were selected randomly, which have been used for this comparison (these stations were not used in the derivation of the prediction equations). The selected ARIs and durations for the comparison were 5, 20 and 100 years and 6 min and 1–12 h, respectively.
Figure 4 shows both estimates (at-site and regional GLSR) on one plot for station 41359 (Oakley Aero in Queensland). It can be noticed from Fig. 4 that the new estimates compare very well with the at-site estimates. The plots for other stations showed similar results. It is worth noting that the estimates derived by at-site analysis have slightly more kinks in the curves as compared to the new estimates. As seen in Fig. 4, the transitions in durations are quite smoother with the new estimates. This shows that the new method provides design rainfall estimates with greater consistency.
The RMSE r calculated by Eq. 3 was used to compare the new rainfall quantile estimates with at-site ones. For the 5-year ARI, the RMSE r ranged from 7 to 10 % for the 6-min and 1–12-h durations. For the 20-year ARI, the RMSE r ranged from 6 to 11 %. The 100-year ARI also showed plausible results, with the RMSE r ranging from 10 to 13 %. These RMSE r values suggest that the new rainfall quantiles are consistent enough to reproduce similar results to those obtained by at-site analysis.
5 Conclusions
The paper develops a method to derive IFD data for short-duration rainfalls using L moments and GLS regression methods. This develops regional prediction equations for the index rainfall, L-coefficient of variation and L-coefficient of skewness for the 6-min and 1-, 2-, 3- and 12-h rainfall durations, which are then used to fit the GEV or GP distribution. It has been found that the prediction equations developed using GLSR satisfy the underlying model assumptions quite well for all the duration ranges and average recurrence intervals considered in this study. Goodness-of-fit tests have been carried out to find a regional parent distribution for the 6-min duration rainfall events. It has also been found that the two-parameter GP distribution approximates the at-site data ‘reasonably well’ as indicated by two goodness-of-fit tests based on the second order variant Akaike information criterion (AIC c ) and Bayesian information criterion (BIC). A Monte Carlo cross-validation is adopted to compare the at-site and GLSR/new design rainfall quantiles (Haddad et al. 2013). It has been found that the new design rainfall estimates are consistent with the at-site ones. The developed prediction equations can be used to estimate short-duration design rainfalls using daily rainfall statistics for Australia. The method can be adapted to other countries where there is a scarcity of short-duration rainfall data.
References
Abolverdi J, Khalili D (2010) Development of regional rainfall annual maxima for southeastern Iran by L moments. Water Resour Manage 24:2501–2526
Alila Y (1999) A hierarchical approach for the regionalization of precipitation annual maxima in Canada. J Geophys Res 4(24):3645–3655
Aly A, Pathak C, Teegavarapu R, Ahlquist J, Fuelberg H (2009) Evaluation of improvised spatial interpolation methods for infilling missing precipitation records. World Environ Water Resour Congr 1–10. doi: 10.1061/41036(342)598
Baldassarre DG, Brath A, Montanari A (2006) Reliability of different depth–duration–frequency equations for estimating short-duration storms. Water Resour Res 42:W12501. doi:10.1029/2006WR004911
Ben-Zvi A (2009) Rainfall intensity–duration–frequency relationships derived from large partial duration series. J Hydrol 367:104–114
Bonnin GM, Martin D, Lin B, Parzybokt T, Yekta M, Riley D (2006) Precipitation–frequency atlas of the United States, vol 1, NOAA Atlas 14, NOAA, USA
Borga M, Vezzani C, Fontana GD (2005) Regional rainfall depth–duration–frequency equations for an Alpine region. Nat Hazards 36:221–235
Egodawatta P, Thomas E, Goonetilleke A (2007) Mathematical interpretation of pollutant wash-off from urban road surfaces using simulated rainfall. Water Res 41(13):3025–3031
Gabriele S, Chiaravalloti F (2012) Using the meteorological information for the regional rainfall frequency analysis: an application to Sicily. Water Resour Manage. doi:10.1007/s11269-012-0235-6
Haddad K, Rahman A (2012) Regional flood frequency analysis in eastern Australia: Bayesian GLS regression-based methods within fixed region and ROI framework—quantile regression vs. parameter regression technique. J Hydrol 430–431:142–161
Haddad K, Rahman A, Green J (2011) Design rainfall estimation in Australia: a case study using l moments and generalized least squares regression. Stoch Environ Res Risk Assess 25(6):815–825
Haddad K, Rahman A, Stedinger JR (2012) Regional flood frequency analysis using Bayesian generalized least squares: a comparison between quantile and parameter regression techniques. Hydrol Process 26:1008–1021
Haddad K, Rahman A, Zaman M, Shrestha S (2013) Applicability of Monte Carlo cross validation technique for model development and validation using generalised least squares regression. J Hydrol 482:119–128
Hosking JRM, Wallis JR (1993) Some statistics useful in regional frequency analysis. Water Resour Res 29(2):271–281
Hosking JRM, Wallis JR (1997) Regional frequency analysis: an approach based on L moments. Cambridge University Press, Cambridge
Institution of Engineers Australia (I. E. Aust.) (2001) Australian rainfall and runoff: a guide to flood estimation, vol 1. I. E. Aust., Canberra
Jakob D, Taylor B, Xuereb K (2005) A pilot study to explore methods for deriving design rainfalls for Australia. In: Proceedings of 29th Hydrology and Water Resource Symposium. The Institution of Engineers Australia, 21–23 Feb Canberra
Jakob D, Xuereb K, Taylor B (2007) Revision of design rainfalls over Australia: a pilot study. Aust J Water Resour 11(2):153–159
Jakob D, Meighen J, Taylor B, Xuereb K (2008) Methods for deriving design rainfall estimates at sub-daily durations. In: Proceedings of 31st Hydrology and Water Resource Symposium. The Institution of Engineers Australia, 14–17 April 2008 Adelaide
Koutsoyiannis D, Baloutsos G (2000) Analysis of a long record of annual maximum rainfall in Athens, Greece, and design rainfall inferences. Nat Hazards 29:29–48
Lee CH, Kim T, Chung G, Choi M, Yoo C (2010) Application of bivariate frequency analysis to the derivation of rainfall–frequency curves. Stoch Environ Res Risk Assess 24:389–397
Madsen H, Mikkelsen PS, Rosbjerg D, Harremoes P (2002) Regional estimation of rainfall intensity duration curves using generalised least squares regression of partial duration series statistics. Water Resour Res 38(11):1–11
Madsen H, Arnbjerg-Neilsen K, Mikkelsen PS (2009) Update of regional intensity–duration–frequency curves in Denmark: tendency towards increased storm intensities. Atmos Res 92:343–349
Sarker S, Goel NK, Mathur BS (2010) Development of isopluvial map using L-moment approach for Tehri-Garhwal Himalaya. Stoch Environ Res Risk Assess 24:411–423
Stedinger JR, Tasker GD (1985) Regional hydrologic analysis, 1. Ordinary, weighted, and generalised least squares compared. Water Resour Res 22(9):1421–1432
Thomas DM, Benson MA (1970) Generalization of streamflow characteristics from drainage basin characteristics. US Geological Survey Water Supply Paper 1975
Yang T, Yu Xu C, Xi Shao Q, Chen X (2010) Regional flood frequency and spatial pattern analysis in the Pearl River Delta region using L-moments approach. Stoch Environ Res Risk Assess 24:165–182
Zakaria ZA, Shabri A, Ahmad UN (2012) Regional frequency analysis of extreme rainfalls in the west coast of Peninsular Malaysia using partial L moments. Water Resour Manage 26:4417–4433
Acknowledgments
We would like to thank Australian Bureau of Meteorology (BOM), Federal Department of Climate Change in Australia and Engineers Australia for providing financial support to the Project, BOM for providing necessary data for the study, Ms. Janice Green, BOM and Mr. Erwin Weinmann from Monash University for providing suggestions/comments on the work and Mr. Tarik Ahmed for assisting in data compilation.
Author information
Authors and Affiliations
Corresponding author
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Haddad, K., Rahman, A. Derivation of short-duration design rainfalls using daily rainfall statistics. Nat Hazards 74, 1391–1401 (2014). https://doi.org/10.1007/s11069-014-1248-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11069-014-1248-7