1 Introduction

Information regarding accurate estimation of extreme events such as flood magnitudes and their frequency of occurrence are of great importance in the planning, design, and management of hydraulic structures such as dams, spillways, culverts, and storm water management systems. A novel approach to the prediction of flood flows and also applicable to other hydrologic processes such as rainfall is the statistical method of regional frequency analysis. This approach promises a more reliable analysis by using information from several sites with identical behavior of flood, rather than only single-site information. With these, regional frequency analysis becomes a popular and practical means of providing flood information at sites with little or no flow data available for the purposes of flood control and engineering economics Jingyi and Hall (2004).

The major developments in flood frequency analysis revolved the idea of probability weighted moments (PWM) introduced by Greenwood et al. (1979) and the theory of L moments proposed by Hosking (1990). The approach of L moments in regional frequency analysis has been applied successfully to model floods by a number of cases studied in Malaysia (Lim and Lye 1998; Zin et al. 2009), New Zealand (Pearson 1991), Southern Africa (Kjeldsen et al. 2002), Egypt (Atiem and Harmancioglu 2006), Turkey (Saf 2009), Iran (Rahnama and Rostami 2007), China (Chen et al. 2006), Italy (Noto and Logggia 2009; Cannarozzo et al. 2009), Pakistan (Hussain and Pasha 2009), Tunisia (Abida and Ellouze 2008), Canada (Glaves and Waylen 1997; Yue and Wang 2004), the UK (Fowler and Kilsby 2003), and India (Parida et al. 1998; Kumar et al. 1999, 2003).

Introducing partial probability weighted moments (PPWM), Wang (1990a) extended the definition of PWM for fitting distribution functions to censored samples. Partial L moments (PL moments) are variants of L moments and also analogous to the PPWM. In the case of flood estimation, the interest is focused mainly on the estimation of the right-hand tail of a distribution function. Because the concern of data on small flood events can sometimes be only of little relevance to the larger ones, the PL moments method are introduced for characterizing the larger events in data. Using PL moments may reduce undesirable influences that small sample events may have on the estimation of large return period events.

The PPWM and PL moments approach related to censor data sets has been employed in a number of studies. Previous researches have been done by Wang (1990a, b; 1996) and Bhattarai (2004) which utilized PL moments in fitting generalized extreme value (GEV) distribution to the censored flood samples. Bhattarai (2004) explored different censoring levels of PL moments and found that sampling properties of PL moments, with censoring flood samples up to 30 %, are similar those of simple L moments. Some other researches can be found in Kroll and Stedinger (1996), Koulouris et al. (1998), and Moisello (2007) and recently by Kochanek et al. (2008). However, literature review reveals limited usage of the proposed PL moments in regional frequency analysis.

Wang (1997) introduced another method which is also a generalization of the L moments, called LH moments for GEV. Since then, LH moments have been used by several authors in flood frequency analysis. Lee and Maeng (2003) analyzed design floods derived through LH moments using annual maximum floods in Korea watersheds. Meshgi and Khalili (2009a, b) developed regional flood frequency analysis based on LH moments of the Kharkhe watershed, located in Western Iran. They used GEV, generalized logistic (GLO), and generalized Pareto (GPA) distributions, and a comparative study had been made between LH moments and L moments method. Deka et al. (2011) studied the statistical modeling of annual maximum daily rainfall data in Northeast India fitted using LH moments. Bhuyan et al. (2010) used LH moments for regional flood frequency analysis of the north bank region of the river Brahmaputra, India.

In this study, the regional frequency analysis of the PL moments approach is developed, by first revisiting regional frequency analysis establishment based on the L moments by Hosking and Wallis (1997) and LH moments by Meshgi and Khalili (2009a, b). For this purpose, the previous developed relationship between PL moments and GEV distribution by Wang (1996) are revisited. Next, a new relationship for GLO and GPA distributions are developed. A total number of 37 stations within Selangor Malaysia are used for PL moments regional frequency analysis, and a comparative study has been made between L moments and LH moments of order two.

2 Methodologies

2.1 Method of L moments

The L moments, introduced by Hosking (1990), are another way of summarizing the statistical properties of hydrological data. L moments can be expressed as linear combinations of PWM. The PWM of order r was formally defined by Greenwood et al. (1979) as

$$ {\beta_r} = \int\limits_0^1 {x(F){F^r}} dF $$
(1)

where F = F(x) is a cumulative distribution function, x(F) is an inverse distribution function or so-called quantile function of random variables x, and r = 0, 1, 2,… is a nonnegative integer. The first four L moments, expressed as linear combinations of PWM, are

$$ \matrix{ {{\lambda_1} = {\beta_0}} \\ {{\lambda_2} = 2{\beta_1} - {\beta_0}} \\ {{\lambda_3} = 6{\beta_2} - 6{\beta_1} + {\beta_0}} \\ {{\lambda_4} = 20{\beta_3} - 30{\beta_2} + 12{\beta_1} - {\beta_0}} \\ }<!end array> $$
(2)

The L moments ratios (L coefficient variation, L skewness, L kurtosis, respectively) are defined as

$$ \matrix{ {{\mathrm{L}} - {\mathrm{Cv}} = {\tau_2} = \frac{{{\lambda_2}}}{{{\lambda_1}}}} \\ {{\mathrm{L}} - {\mathrm{Cs}} = {\tau_3} = \frac{{{\lambda_3}}}{{{\lambda_2}}}} \\ {{\mathrm{L}} - {\mathrm{Ck}} = {\tau_4} = \frac{{{\lambda_4}}}{{{\lambda_2}}}} \\ }<!end array> $$
(3)

2.2 Method of LH moments

Wang (1997) introduced the concept of LH moments as a generalization of the L moments. The LH moments are the linear function of the expectation of the highest order statistic. The first four LH moments are given as

$$ \matrix{ {\lambda_1^{\eta } = E\left[ {{X_{{\left( {\eta + 1} \right):\left( {\eta + 1} \right)}}}} \right]} \\ {\lambda_2^{\eta } = \frac{1}{2}E\left[ {{X_{{\left( {\eta + 2} \right):\left( {\eta + 2} \right)}}} - {X_{{\left( {\eta + 1} \right):\left( {\eta + 2} \right)}}}} \right]} \\ {\lambda_3^{\eta } = \frac{1}{3}E\left[ {{X_{{\left( {\eta + 3} \right):\left( {\eta + 3} \right)}}} - 2{X_{{\left( {\eta + 2} \right):\left( {\eta + 3} \right)}}} + {X_{{\left( {\eta + 1} \right):\left( {\eta + 3} \right)}}}} \right]} \\ {\lambda_4^{\eta } = \frac{1}{4}E\left[ {{X_{{\left( {\eta + 4} \right):\left( {\eta + 4} \right)}}} - 3{X_{{\left( {\eta + 3} \right):\left( {\eta + 4} \right)}}} + 3{X_{{\left( {\eta + 2} \right):\left( {\eta + 4} \right)}}} - {X_{{\left( {\eta + 1} \right):\left( {\eta + 4} \right)}}}} \right]} \\ }<!end array> $$
(4)

For η = 0, LH moments will be Hosking (1990) L moments. The LH moments ratios (LH-Cv, LH-Cs, and LH-Ck, respectively) are defined as

$$ \tau_2^{\eta } = \frac{{\lambda_2^{\eta }}}{{\lambda_1^{\eta }}},\,\tau_3^{\eta } = \frac{{\lambda_3^{\eta }}}{{\lambda_2^{\eta }}}\,{\mathrm{and}}\,\tau_4^{\eta } = \frac{{\lambda_4^{\eta }}}{{\lambda_2^{\eta }}} $$
(5)

The details on the estimation of parameters and regional flood frequency analysis can be found in Bhuyan et al. (2010), Meshgi and Khalili (2009b), and Deka et al. (2011).

2.3 Method of PL moments

There have been numbers of research discussing the definition of partial PWM by Wang (1990a, b; 1996), Hosking (1995), and Koulouris et al. (1998). In the present study, the definition of partial PWM by Wang (1996) is employed. Wang (1996) defined partial PWM as extended from the concept of PWM to be applied to a censored sample

$$ \beta _{r}^{\prime }=\frac{1}{{1-F_{0}^{{r+1}}}}\int\limits_{{{{F}_{0}}}}^{1} {x(F)} {{F}^{r}}dF $$
(6)

where F 0 = F(x 0), x 0 being the censoring threshold. When F 0 = 0, the partial PWM becomes the ordinary PWM. Given a complete sample x (1) ≤ x (2) ≤ … ≤ x (n), the following statistic is defined by Wang (1990a) as an unbiased estimator of \( \beta _{r}^{\prime } \)

$$ b_{r}^{\prime }=\frac{1}{{\left( {1-F_{0}^{{r+1}}} \right)n}}\sum\limits_{{i=1}}^{n} {\frac{{\left( {i-1} \right)\left( {i-2} \right)\ldots \left( {i-r} \right)}}{{\left( {n-1} \right)\left( {n-2} \right)\ldots \left( {n-r} \right)}}} x_{{(i)}}^{*} $$
(7)

where

$$ \matrix{ {x_{{\left( i \right)}}^{ * } = 0\,{\text{for}}\,{x_{{\left( i \right)}}} \leqslant {x_{0}}} \\ {x_{{\left( i \right)}}^{ * } = {x_{{\left( i \right)}}}\,{\text{for}}\,{x_{{\left( i \right)}}} > {x_{0}}} \\ }<!end array> $$

The level of censoring, F 0, determines the number of the sample data points to be censored as

$$ {F_0} = \frac{{{n_0}}}{n} $$
(8)

where n is the length of the uncensored sample and n 0 is the number of occurrence of values which do not exceed x 0 in the sample (censored data points). The first four PL moments (ξ 1, ξ 2, ξ 3, and ξ 4) have the same definition and interpretations as the first four L moments.

The PL-Cv, PL-Cs, and PL-Ck are defined as

$$ {\varsigma_2} = \frac{{{\xi_2}}}{{{\xi_1}}},{\varsigma_3} = \frac{{{\xi_3}}}{{{\xi_2}}}\,{\mathrm{and}}\,{\varsigma_3} = \frac{{{\xi_4}}}{{{\xi_2}}} $$
(9)

2.4 Development of relationships between the PL moments and probability distribution

Many statistical distributions for regional frequency analysis have been investigated for extreme hydrologic variables. In this study, three probability distributions were considered: GEV, GLO, and GPA. The short-listed distributions were chosen based on previous studies such as those by Zin et al. (2009) and Zalina et al. (2002) of which these distributions were more prominent for tropical regions and by Kysely (2010) for modeling precipitation extremes. The details on the estimation of parameters of these distributions can be found by Hosking and Wallis (1993, 1997) for the L moments method and Bhuyan et al. (2010), Meshgi and Khalili (2009a, b), and Deka et al. (2011) for the LH moments method.

For the case of PL moments with the definition on Eq. (6), only the GEV distribution has been developed by Wang (1996). However, the development of relationships between PL moments and other distributions has not yet been available. In this section, the PL moments of the GEV distribution are revisited. Next, the PL moments for the GLO and GPA distributions are developed in this study. Upon the issues of censoring would improve the estimation of high return period, this study emphasizes on the censoring at 3 % of the complete data. This level of censoring would contribute to censoring level at F 0 = 0.03.

2.5 PL moments for GEV distribution

The cumulative distribution function of GEV is given by

$$ F(x) = \exp \left\{ { - {{\left[ {1 - k\left( {\frac{{x - b}}{a}} \right)} \right]}^{{\frac{1}{k}}}}} \right\}\quad k \ne 0 $$
(10)

and quantile function

$$ Q(F) = b + \frac{a}{k}\left\{ {1 - {{\left[ { - \ln (F)} \right]}^k}} \right\}\quad k \ne 0 $$
(11)

The expression of the first two PL moments and PL skewness of GEV are

$$ {\xi_1} = b + aH\left( {0,{F_0},k} \right) $$
(12)
$$ {\xi_2} = a\left[ {H\left( {1,{F_0},k} \right) - H\left( {0,{F_0},k} \right)} \right] $$
(13)
$$ {\varsigma_3} = \frac{{2H\left( {2,{F_0},k} \right) - 3H\left( {1,{F_0},k} \right) + H\left( {0,{F_0},k} \right)}}{{H\left( {1,{F_0},k} \right) - H\left( {0,{F_0},k} \right)}} $$
(14)

where

$$ H\left( {r,{F_0},k} \right) = \frac{1}{k}\left\{ {1 - \frac{{{\text P}\left[ {1 + k, - \left( {r + 1} \right)\ln {F_0}} \right]}}{{\left( {1 - F_0^{{r + 1}}} \right){{\left( {r + 1} \right)}^k}}}} \right\} $$
(15)

In Eq. (15), P(.,.)is an incomplete gamma function

$$ {\text{{\text P}}}\left( {1 + k, - \left( {r + 1} \right)\ln {F_0}} \right) = \int_0^{{ - \left( {r + 1} \right)\ln {F_0}}} {{\theta^k}} {e^{{ - \theta }}}d\theta $$
(16)

In practice, solving of the k parameter requires development of approximate methods based on Eq. (14). Equation (14) does not give an explicit solution for k and has to be solved numerically, solving using an iterative method. For this purpose, the polynomial function of computation of the following equation with good accuracy has been constructed based on Ϛ 3 as

$$ k = 0.54 - 2.292{\varsigma_3} + 1.773\varsigma_3^2 - 1.894\varsigma_3^3 + 0.954\varsigma_3^4 $$
(17)

Once the value of k is obtained, a and b can be estimated successively from Eqs. (12)–(13) as

$$ a = \frac{{{\xi_2}}}{{H\left( {1,{F_0},k} \right) - H\left( {0,{F_0},k} \right)}} $$
(18)
$$ b = {\xi_1} - aH\left( {0,{F_0},k} \right) $$
(19)

2.6 PL moments for GLO distribution

The cumulative distribution function and quantile function of GLO are

$$ F(x) = {\left[ {1 + \left[ {1 + {{\left\{ {1 - \frac{k}{a}\left( {x - b} \right)} \right\}}^{{\frac{1}{k}}}}} \right]\,} \right]^{{ - 1}}} $$
(20)
$$ Q(F) = b + \frac{a}{k}\left\{ {1 - {{\left( {\frac{{1 - F}}{F}} \right)}^k}} \right\} $$
(21)

The expression of the first two PL moments and PL skewness of GLO are

$$ {\xi_1} = b + \frac{a}{k}\left[ {1 - \frac{{{{\text B}_{{1 - {F_0}}}}\left( {1 + k,1 - k} \right)}}{{1 - {F_0}}}} \right] $$
(22)
$$ {\xi_2} = - \frac{a}{k}\left[ {\frac{{2{{\text B}_{{1 - {F_0}}}}\left( {1 + k,2 - k} \right)}}{{1 - F_0^2}} - \frac{{{{\text B}_{{1 - {F_0}}}}\left( {1 + k,1 - k} \right)}}{{1 - {F_0}}}} \right] $$
(23)
$$ {\varsigma_3} = \frac{{\frac{{6{{\text B}_{{1 - {F_0}}}}\left( {1 + k,3 - k} \right)}}{{1 - F_0^3}} - \frac{{6{{\text B}_{{1 - {F_0}}}}\left( {1 + k,2 - k} \right)}}{{1 - F_0^2}} + \frac{{{{\text B}_{{1 - {F_0}}}}\left( {1 + k,1 - k} \right)}}{{1 - {F_0}}}}}{{\frac{{2{{\text B}_{{1 - {F_0}}}}\left( {1 + k,2 - k} \right)}}{{1 - F_0^2}} - \frac{{{{\text B}_{{1 - {F_0}}}}\left( {1 + k,1 - k} \right)}}{{1 - {F_0}}}}} $$
(24)

where \( {{\text B}_{{1 - {F_0}}}}(.,.) \)is an incomplete beta function

$$ {{\text{B}}_{{1-{{F}_{0}}}}}\left( {1+k,r-k+1} \right)=\int_{0}^{{1-{{F}_{0}}}} {{{\theta }^{k}}{{{\left( {1-\theta } \right)}}^{{r-k}}}d} \theta $$
(25)

The parameter k of GLO can be computed using numerical solving of Eq. (24) in interval [−1, 1]. The estimate of parameter k is given by

$$ k = 0.184 - 1.333{\varsigma_3} + 0.514\varsigma_3^2 - 0.575\varsigma_3^3 + 0.274\varsigma_3^4 $$
(26)

Once the value of k is obtained, a and b can be estimated successively and are then given by

$$ a = \frac{{ - {\xi_2}k}}{{\frac{{2{{\text B}_{{1 - {F_0}}}}\left( {1 + k,2 - k} \right)}}{{1 - F_0^2}} - \frac{{{{\text B}_{{1 - {F_0}}}}\left( {1 + k,1 - k} \right)}}{{1 - {F_0}}}}} $$
(27)
$$ b = {\xi_1} - \frac{a}{k}(1 - \frac{{{{\text B}_{{1 - {F_0}}}}(1 + k,1 - k)}}{{1 - {F_0}}}) $$
(28)

2.7 PL moments for GPA distribution

The cumulative distribution function and quantile function of GPA are

$$ F(x) = 1 - {\left[ {1 - \frac{k}{\alpha}\left( {x - \xi } \right)} \right]^{{\frac{1}{k}}}} $$
(29)
$$ Q(F) = \xi + \frac{\alpha }{k}\left[ {1 - {{\left( {1 - F} \right)}^k}} \right] $$
(30)

The expression of the first two PL moments and PL skewness of GPA are

$$ {\xi_1} = \xi + \frac{\alpha }{k}\left( {1 - {g_{{1,1}}}} \right) $$
(31)
$$ {\xi_2} = - \frac{\alpha }{k}\left( {2{g_{{1,2}}} - 2{g_{{2,2}}} - {g_{{1,1}}}} \right) $$
(32)
$$ {\zeta_3} = \frac{{6{g_{{1,3}}} - 12{g_{{2,3}}} + 6{g_{{3,3}}} - 6{g_{{1,2}}} + 6{g_{{2,2}}} + {g_{{1,1}}}}}{{2{g_{{1,2}}} - 2{g_{{2,2}}} - {g_{{1,1}}}}} $$
(33)

where \( {g_{{r,s}}} = \frac{{{{\left( {1 - {F_0}} \right)}^{{k + r}}}}}{{\left( {k + r} \right)[1 - {{\left( {{F_0}} \right)}^s}]}} \)

The parameter k of GPA can be computed using numerical solving of Eq. (33) in interval [−1, 1]. The estimates of parameter k is given by

$$ k = - 0.2705 + 0.1009{\varsigma_3} + 1.1075\varsigma_3^2 - 0.2858\varsigma_3^3 + 0.0576\varsigma_3^4 $$
(34)

Once the value of k is obtained, a and b can be estimated successively and are then given by

$$ \alpha = \frac{{ - {\lambda_2}k}}{{2{g_{{1,2}}} - 2{g_{{2,2}}} - {g_{{1,1}}}}} $$
(35)
$$ \xi = {\lambda_1} - \frac{\alpha }{k}\left( {1 - {g_{{1,1}}}} \right) $$
(36)

3 Regional frequency analysis based on L moments

Hosking and Wallis (1993, 1997) provided step-by-step guidelines for performing regional frequency analysis, using the L moments. The four steps involved in the regional frequency analysis are outlined as follow: (a) screening of the data using discordancy test, (b) identification of homogeneous regions, (c) choice of a regional distribution, and (d) estimation of the regional frequency distribution. A discussion of the first three steps is given next. The same procedure has been applied for LH moments (Bhuyan et al. 2010; Mesgi and Khalili 2009; and Deka et al. 2011).

4 Regional frequency analysis based on PL moments

The procedures discussed in Section 4 are similarly employed for the PL moments. PL-Cv, PL-Cs, and PL-Ck are equally replaced by L-Cv, L-Cs, and L-Ck for the discordancy and the homogeneity test. Selection of an adequately fitted distribution is carried out based on the PL ratio diagram and Z test using the regional PL-Cs and PL-Ck.

4.1 Discordance test

The main goal of the discordancy measure D test is to identify those sites for which point sample PL moments are markedly different from most of the other sites. Sites with great errors in data will stand out from the other sites and be flagged as discordant. The discordancy test, D i , for site i is defined by Hosking and Wallis (1997) as

$$ {D_{i}} = \frac{1}{3}N{\left( {{u_{i}} - \overline u } \right)^{T}}{S^{{ - 1}}}\left( {{u_{i}} - \overline u } \right) $$
(37)

where \( {u_i} = {[\hat{\varsigma}_2^i\;\hat{\varsigma}_3^i\;\hat{\varsigma}_4^i]^T} \) is a vector containing the three sample PL moment ratios for site i, N is the number of sites in the region, and \( \overline u \)represents the unweighted regional average of L moments ratio for each region

$$ \overline {u} = \frac{1}{N}\sum\limits_{{i = 1}}^N {{u_i}} $$
(38)

and S is the sample covariance matrix expressed by

$$ S = \sum\limits_{{i = 1}}^N {\left( {{u_i} - \overline{u}} \right){{\left( {{u_i} - \overline{u}} \right)}^T}} $$
(39)

Generally, a site is declared as discordant from the group if the D i value is greater than a critical value. Hosking and Wallis (1997) tentatively suggested D i  ≥ 3 as the critical value for N ≥ 15 sites. If the D statistic of a site exceeds 3, its data are considered to be discordant from the rest of the regional data.

4.2 Heterogeneity test

The next step in regional frequency analysis is the assignment of the sites to regions. Hosking and Wallis (1997) proposed a heterogeneity measure H i that aims to estimate the degree of heterogeneity in a group of sites and to assess whether the sites might reasonably be treated as a homogeneous region. The heterogeneity test is then computed as

$$ H = {{{\left( {V - {\mu_V}} \right)}} \left/ {{{\sigma_V}}} \right.} $$
(40)

where μ V and σ V represent the population mean and standard deviation of the simulated V value.

$$ V = \sqrt {{\sum\nolimits_{{i = 1}}^N {{{{{n_i}{{\left( {\hat{\varsigma}_2^{{(i)}} - \hat{\varsigma}_2^R} \right)}^2}}} \left/ {{\sum\nolimits_{{i = 1}}^N {{n_i}} }} \right.}} }} $$
(41)

Here, \( \hat{\varsigma}_2^R \) is the regional average PL moments ratio, calculated using the following formula

$$ \hat{\varsigma}_2^R = \sum\nolimits_{{i = 1}}^N {{{{{n_i}\hat{\varsigma}_2^{{(i)}}}} \left/ {{\sum\nolimits_{{i = 1}}^N {{n_i}} }} \right.}} $$
(42)

where N is the number of sites and n i is the record length at sites i. The four-parameter kappa distribution is used to generate a homogeneous region with population parameters equal to the regional average sample L moments ratios.

The criteria established by Hosking and Wallis (1997) for assessing heterogeneity of a region are

  • H < 1—the region is acceptably homogeneous

  • 1 ≤ H < 2—the region is possibly homogeneous

  • H ≥ 2—the region is definitely heterogeneous

4.3 Selection of a regional frequency distribution

Hosking and Wallis (1997) suggested two approaches in testing whether the given distributions fit the data acceptably closely and hence choose the one that gives best fit to the data. The PL moments ratio diagram and Z test are employed for these purposes. The PL moments ratio diagram is a plot of PL-Cs and PL-Ck of the observed values and the calculated values from the distribution functions. Table 1 shows the coefficients for the newly developed relationships of PL-Cs and PL-Ck of the GEV, GLO, and GPA distributions based on PL moments for the range −1 ≤ Ϛ 3 ≤ 1.

Table 1 Equation coefficient of PL moment ratio for the GEV, GLO, and GPA distributions

However, direct visual inspection of the PL moments ratio diagram is somewhat subjective. Hosking and Wallis (1997) preferred an alternative approach based on goodness-of-fit test, Z test, which works directly with the regional average of L moments statistics.

For each selected distribution, the Z test is calculated as follows:

$$ {Z^{{\mathrm{DIS}}}} = {{{\left( {\varsigma_4^{{\mathrm{Dis}}} - \varsigma_4^R} \right)}} \left/ {{{\sigma_4}}} \right.} $$
(43)

where \( \varsigma_4^R \) and σ 4 are the simulated regional mean and standard deviation values obtained by kappa distribution, and \( \varsigma_4^{{\mathrm{Dis}}} \) is the regional value of the distribution function in interest.

Details computation is provided in Hosking and Wallis (1993, 1997). A calculated value of zero for |Z DIS| indicates a perfect fit. The value of the Z statistic is considered to be acceptable at 90 % confidence level if |Z DIS| ≤ 1.64. If more than one candidate distribution is acceptable, the one with lowest |Z DIS| is regarded as the best-fit distribution.

5 Case study

Records of daily rainfalls from 37 stations in Selangor with record lengths of 22 to 38 years were acquired from the Department of Irrigation and Drainage, Malaysia. The statistics and basic information of the data are listed in Table 2. All the stations, numbered 1 to 37, are located in Selangor with latitudes ranging from 26° up to 38° and longitudes from 8° to 18°, as shown in Fig. 1.

Table 2 Station name and statistics of maximum daily rainfalls for all the stations in Selangor
Fig. 1
figure 1

Location of rain gauge sites used in the study

As noted in Table 2, the means for the maximum daily rainfalls for the 37 sites in Selangor range from 83.91 mm (site 3610014) to 132.76 mm (site 2615131). Meanwhile, their standard deviations are from 21.49 mm (site 3217001) to 76.60 mm (site 2913001).

6 Results and discussions

This study would emphasize the regional frequency analysis of L moments, LH moments of order two, and PL moments at censoring level, F 0 = 0.03. When F 0 = 0, the PL moments become L moments as there are no data being censored.

Initially, the whole of Selangor was assumed as one homogeneous region, and the discordancy test was used for data verification and quality control. Results of the discordancy test, D i , are given in Table 2. It is observed that for the L moments method, D critical = 3.0 is exceeded at three locations: stations 18, 26, and 34, with D statistic values of 3.12, 3.50, and 3.75, respectively. After the second round of discordancy test, station 13 is discarded for having a D statistic value greater than 3. Therefore, these four stations are excluded from the regional frequency analysis. In order to better illustrate the discarded stations, the D values in Table 2 were marked in italic. The values of heterogeneity measures computed by carrying out the 500 simulations based on the data of 33 stations are H =−1.1129.

For the LH moments method, the D statistic values for the 37 stations vary from 0.00 to 0.94. The largest D statistic value is 0.94 for station 5; hence, none of the stations have a D statistic exceeding the critical value. The heterogeneity measure H computed for all stations in this region was−0.0618, which suggested that the region was homogeneous.

For the PL moments method, it is observed that the D statistic values exceed at two stations of 12 and 26 with D statistic values of 4.22 and 3.10, respectively. After having several similar discordancy tests, another two stations are eliminated: stations 22 and 34. Thus, the value of heterogeneity measure based on the data of 33 stations is H = 0.6493 which demonstrates acceptable homogeneity.

The regional average L moment, LH moment, and PL moment ratios of the respective study regions are calculated, and the corresponding parameter values of the fitted kappa distribution are found as presented in Table 3. Results of the H tests for the L moments, LH moments, and PL moments are given in Table 4.

Table 3 Moment ratios and parameter values of the fitted kappa distribution
Table 4 Results of the homogeneity test and goodness-of-fit test

After confirming the homogeneity of the study region, an appropriate distribution needs to be selected for the regional frequency analysis. Diagrams in Fig. 2 show a comparison of the observed and theoretical relations between the L moments, LH moments, and PL moments, respectively. In the L moment and LH moment ratio diagrams of Fig. 2, the point defined by the sample average values lies closest to the L moments and LH moments of the GEV distribution followed by GLO and GPA distributions. Analysis of the PL moment ratio diagram reveal that the sample average values of \( t_3^R = 0.3272 \) and \( t_4^R = 0.1419 \) in the diagram are better described by the GLO distribution rather than the GEV and GPA distributions.

Fig. 2
figure 2

a L moments, b LH moments, and c PL moments ratio diagram

Results of the Z test for the three distributions are given in Table 4. It has been observed that the values of Z test for both GEV and GLO distributions for all methods are less than the critical value of 1.64. On the other hand, GPA distribution failed the test with the Z test value exceeding the critical value of 1.64 for all methods. In general, the GEV and GLO distributions should be considered as the preferred distribution, as both distributions exhibit acceptable Z test values for the L moments, LH moments, and PL moments methods. The regional parameters and the quantiles estimated for both selected distributions for T = 2, 5, 10, 20, 50, and 100 years, using L moments, LH moments, and PL moments, are presented in Table 5.

Table 5 Regional parameter and quantile estimates of the GEV and GLO distributions for L moments, LH moments, and PL moments

6.1 Test for the robustness of the distribution

In regional frequency analysis, the final and imperative objective is to verify the robustness of the distribution in producing reasonably reliable estimates at all stations in the homogeneous region. The robustness of the selected regional frequency distribution is further investigated for estimation of proposed flood quantiles.

In this study, the accuracy of the estimates for the selected region is assessed using a Monte Carlo simulation procedure. In this simulation, flood quantile estimates for preferred distributions, in this case GEV and GLO distributions. In each simulation, 10,000 samples were generated from regional distributions for sample size n = 20, 40, 60, and 100. Two of the common error measures of performance used in such cases are the relative bias (RBIAS) and relative root mean square error (RRMSE) represented by

$$ {\mathrm{RBIAS}} = \frac{1}{N}\sum\limits_{{i = 1}}^N {\left( {\frac{{Q_i^S - Q_i^C(F)}}{{Q_i^C(F)}}} \right)} $$
(44)
$$ {\mathrm{RRMSE}} = \sqrt {{\frac{1}{N}\sum\limits_{{i = 1}}^N {{{\left( {\frac{{Q_i^S - Q_i^C(F)}}{{Q_i^C(F)}}} \right)}^2}} }} $$
(45)

where Q i S and Q i C(F) are the simulated and calculated quantiles of design flood, respectively. The robustness of the candidate distribution is evaluated by comparing the RBIAS and RRMSE of the estimated flood quantiles, whether the distribution is correctly determined or not.

Tables 6 and 7 present the RRBIAS and RRMSE values of quantiles computed using the L moments, LH moments, and PL moments methods, respectively. In order to better illustrate the results, the minimum achieved values are marked in bold. The results show that the RBIAS and RRMSE values generally increase with a reduction in the sample size and an increase in the return periods.

Table 6 RBIAS values for different quantiles of the best distribution for L moments, LH moments, and PL moments
Table 7 RRMSE values for different quantiles of the best distribution for L moments, LH moments, and PL moments

As shown by Table 6, PL moments of GLO and GEV distributions contribute to the smallest RBIAS values at low quantile, T = 10 years. At T = 20 and 50 years, L moments of GEV at corresponding sample sizes have produced minimum RBIAS values. At higher quantiles of T = 100 and 200 years, the minimum RBIAS values are exhibited by LH moments of GEV distribution and L moments of GLO distribution.

As the results of Table 7, almost all PL moments of GEV distribution of corresponding sample sizes produced smaller RRMSE values compared with L and LH moments. The minimum RRMSE values of L moments appears at n = 20 for return periods, T = 10 and 20 years under GEV distribution. The minimum RRMSE values of LH moments appear also under GEV distribution at n = 20 for T = 50 years, and n = 60 for T = 10 and 20 years. In this case, the minimum RRMSE values are generally best described by GEV distribution for all the three methods of L, LH, and PL moments compared to GLO distribution.

It is interesting to note that from these results, the estimation of quantiles at higher return period is best estimated by PL moments of GEV distribution. These can be found at a high return period of T = 100 and 200 years for all sample sizes which produce the lowest RRMSE values compared to L and LH moments. On the other hand, the minimum RRMSE values at lower return period are produced by L and LH moments of GEV distribution. This implies that L and LH moments appear to be more preferred in the estimation of low quantiles compared to the PL moments method.

7 Conclusions

The study provides a comprehensive evaluation of the L moments, LH moments, and PL moments, by first revisiting regional frequency analysis based on the L moments by Hosking and Wallis (1993, 1997) and LH moments by Meshgi and Khalili (2009a, b). Regional homogeneity was investigated by first assuming the entire study area as one homogeneous regional cluster. The corresponding relationships for regional homogeneity analysis by the PL moments are developed. PL moments for the GEV, GLO, and GPA distributions are also developed and used to provide the corresponding PL moments ratio diagrams and the goodness-of-fit test.

The results of this study have shown that from 37 stations in the study region, 33 stations based on L and PL moments and all of 37 stations based on LH moments are accepted statistically to be homogeneous. The Z test has shown that the GEV and GLO distributions of L, LH, and PL moments can be considered as the preferred distributions for modeling daily annual maximum rainfall in Selangor, Malaysia. Finally, Monte Carlo simulations used for performance evaluation show that the PL moments method is more efficient than L and LH moments methods of large return period events particularly at all tested sample sizes.

This work can be extended by including all the rain gauge stations of Malaysia in order to identify the most suitable region distribution for the whole country.