1 Introduction

Frequency analysis provides the probable rainfall data required for the hydraulic design of various infrastructure. Historical rainfall observations of annual maximums are required as input data; however, oftentimes there is insufficient rainfall data due to missing data and/or a short record length (Trinh et al. 2016). According to the Flood Estimation Handbook (Handbook 1999), regional frequency analysis (RFA) is recommended if the duration of the observational record is less than or equal to the design frequency (T) at a rainfall station. An at-site frequency analysis is recommended if the data period is more than or equal to two times the design frequency (2 T). Despite these recommendations, lack of data has often led to frequency analysis with short datasets. Frequency analysis for datasets over a short period is unreliable due to the large dispersion and the parameter biases. To compensate for these shortcomings, many studies have been carried out to generate rainfall data.

Daily rainfall was simulated with a stochastic model, the first-order Markov Chain, which was first used by Haan et al. (1976). Estimated transitional probabilities apply historical data and tend to produce similar statistical outputs. Srikanthan et al. (2005) proposed the various transition probabilities matrix model (TPM) based on a first-order Markov Chain. They also compared the model performance by using data from North America, South Africa and Australia. Chandler et al (2002) applied the Gaussian Displacements Spatial–Temporal Model (GDSTM) using radar data that generated stochastic simulations. This process included probabilistic assumptions that were expressed using physical parameters when constituting rainfall structures. Viglione et al. (2012) suggested locally calibrated stochastic rainfall generators to estimate extreme rainfall quantiles. Piantadosi et al. (2009) suggested a stochastically generating model that provides annual totals to assess the effects of rainfall variability. Researchers are also developing models in terms of probabilities such as general circulation models; however, these have been found to provide poor rainfall estimates with low regional accuracy (Guttorp 1996). Although various probabilistic studies have been conducted, stochastic generation does not include the physical rainfall processes affected by meteorological conditions. Therefore, we used models that deviate from this stochastic process to estimate physical rainfall generation. The Weather Research and Forecasting (WRF) model is designed for research and various applications such as mesoscale numerical weather prediction and atmospheric simulation. As WRF is a regional scale model, we used it in an attempt to interpret output for the region; this is advantageous if regional assessment is required. In general, WRF has been used to evaluate and analyze event-centric simulations. Maussion et al. (2011) evaluated the WRF capacity of three domains (30, 10, and 2 km – grid sizes) of the Tibetan Plateau. Caldwell et al. (2009) compared a WRF-based dynamic downscaling experiment at 12 km grids. Kumar et al. (2008) simulated the Mumbai event in 2005 using WRF and interpret approximately four parameters (microphysics parameters: Thompson, Lin, The WRF Single-Moment 6-class (WSM6), and Morrison). Jalowska and Spero (2019) tried to develop precipitation intensity–duration–frequency (PIDF) curves using different grid sizes (e.g. 36, 12 km) of WRF simulation. Most of these have been developed based on a short-term event as opposed to long-term simulations. Unlike previous studies, we used the WRF to generate long-term rainfall data rather than focusing on events.

The main objective of this study was to investigate whether rainfall generation using various physical systems of the WRF may be used as long-term hydrologic input data. Notably, none of the aforementioned approaches have utilized the WRF physical factors other than for estimations and statistical output during the rainfall generation process. The physically based model we used contains parameters with definite physical meanings; these parameters may be measured or determined from prior experimental data. The inputted physical option in the rainfall generation model includes cloud modeling, ice, snow, and graupel processes, and column moisture adjustment schemes (Skamarock et al. 2008; Jung and Lin 2016). In this paper, these physical options were adjusted to generate rainfall without using stochastic and statistical methods. To evaluate the reliability of the generated rainfall data, frequency analysis was conducted between the observations and generated data. By using this approach, we will generate physical rainfall in the Chung-Ju dam (CJD) basin.

2 Study area

The CJD basin, located in northeastern South Korea, was the selected region for this study, as shown in Fig. 1. This site serves as a representation of the weather in the Korean Peninsula, containing a sufficient amount of hydrologic information. The CJD is one of the largest multi-purpose dams in South Korea; it was built in the south part of the Han river basin and is a source of industrial/domestic water supply, while also contributing to flood control and power supply. The size of CJD basin is 6648 km2, with a total storage of 2.750 × 109 m3 and a flood control capacity of 6.16 × 108 m3. The longest length of river in the CJD is ~ 385 km and its average slope is 0.35 (Kim et al. 2021). The CJD has 32 rainfall gauge stations that are well-distributed in throughout the basin; Table 1 summarizes the characteristics of these rain gauge stations (STN). The highest elevation rain gauge station is the Sabuk station (705 m), while the maximum elevation difference is 558 m; this means this area may be categorized as a mountainous region. The rainfall record length is over 22 year (y) with maximum 31 y by 2017, and approximately 30% of the rain gauge stations contain the statistically required record length (~ 30 y). The rainfall in this region is concentrated in the summer season and follows the mountainous regional rainfall characteristics such as fluctuations with moisture and temperature instability. The CJD basin information and observed rainfall records were obtained at the Water Resources Management Information System (WAMIS: http://www.wamis.go.kr) in Korea.

Fig. 1
figure 1

Location of Chung-Ju dam basin and 32 rainfall stations

Table 1 List of 32 rainfall stations in the Chung-Ju dam basin

3 Methodology

Figure 2 summarizes a procedure to analyze the adaptability of physically generated rainfall data for hydrologic applications. We first selected event periods based on hourly maximum rainfall with rainfall independency (rainfall event interval: 8–10 h) and proper rainfall duration (over 12 h) over the 1999–2017 period; in total, 19 rainfall events were selected that largely occurred during summer in Korea (from July to September). Based on the selected events the WRF version 3.6.1 was utilized to generate rainfall data with a range of diverse parameter combinations. These combinations represent different physical atmospheric conditions that form rainfall droplets; details on these combinations are presented in Sect. 2.1. The generated rainfall was validated with observed rainfall data and modified using a bias correction process. This process provided a statistically similar rainfall pattern to the observed rainfall data. Then, comparative analyses were conducted using at-site and regional frequency analyses to observe the applicability of the physically generated rainfall data after bias correction (GRA).

Fig. 2
figure 2

Flow chart illustrating key steps involved in this study

3.1 Rainfall data generation

3.1.1 Weather research forecasting model

The WRF has been designed for research and various applications such as real-time mesoscale numerical weather prediction and regional climate simulation (Skamarock et al. 2008). Originally, the WRF was developed as next generation of non-hydrostatic numerical weather prediction models based on the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5) and more widely used in a regional weather prediction (Kusaka et al. 2005). For real-time numerical weather predictions (NWP), the WRF model provides a range of options in numerical schemes such as physical representations, parameterizations, and data assimilation. In addition, prognostic equations in the WRF model generate spurious oscillations and numerical smoothing of mass, momentum, and entropy of the atmosphere (Pattanayak and Mohanty 2008). As such, we managed the physical values for atmospheric conditions to generate rainfall estimates based on global weather observations. In particular, atmospheric heat and moisture flux conditions were represented by two substantial physical parameters, cumulus and microphysics parameterizations, for quantitative precipitation forecasts (QPF) (Jung and Lin 2016). The cumulus parameterizations provide atmospheric temperature and humidity, cloud tendency profiles, and surface sub-grid scale (convective) rainfall. Generally, cumulus parameterizations were not adopted when the grid size was smaller than 4 km as a smaller grid size is able to resolve mesoscale dynamics, updrafts and downdrafts, and local sub-grids that also involve other physical functions (Gilliland and Rowe 2007). Some researchers have recommended using cumulus parameterizations concurrently with microphysics parameters in the grey zone (1–10 km) to avoid accumulated energy at grid points (Gerard 2007). For this study, we fixed the Betts-Miller-Janjic (BMJ) scheme (Betts and Miller 1993; Janjić 1994) for cumulus parameterization; this is well developed to adjust column moisture in the relaxation profile for an optional Eta structure.

With respect to microphysics parameterizations, 17 different parameters were applied to provide possible conditions using a range of numerical representations. Cloud properties and structures within the meso and cloud-scale were represented by microphysical parameterizations. These manage the formation processes of precipitation such as generation, growth, decay, and fall using various phases of ice in the cloud. During the microphysical processes, ice phases (i.e., cloud ice, graupel/hail, and snow) hydrometers were appropriately displayed for mid-latitude deep convective clouds and smaller grid simulations (Jung and Lin 2016). Table 2 presents the selected parameter options with acronyms for the different combinations of parameters; in total, there were 17 combinations with diverse microphysics and one cumulus parameterization. To initialize and update the boundary conditions of the WRF model, the National Center for Environmental Prediction (NCEP) and the Final (FNL) operational global analysis data were obtained from the research data archive (www.rda.ucar.edu). These NCEP FNL data were generated by the same model for the Global Forecast System (GFS) data generation with continuously updated observational data from the Global Telecommunications System (GTS). These data have a 1º × 1º grid size, which updates every 6 h for weather predictions. These FNL data include data on surface pressure, sea level pressure, sea surface temperature, geopotential height, temperature, soil moisture, ice cover, relative humidity, and u and v-winds (wind velocities toward the east and north, respectively) at 26 obligatory stretching levels between 10 and 1000 mb (www.rda.ucar.edu). We used FNL data for rainfall data generation to create a rainfall data archive based on the potential physical formation of rainfall.

Table 2 Combinations of microphysics and cumulus parameterizations for regional scale weather model simulations of rainfall generation

3.1.2 Bias correction

Generated rainfall data may be directly input into the hydrologic model for flood estimation (Jung and Lin 2016); however, with the generation of the rainfall Intensity–Duration–Frequency (IDF) curve, Kuo et al. (2014) produced rainfall data using the 5th Generation National Center for Atmospheric Research (NCAR)/Penn State Mesoscale Model (MM5). They found that the meteorological model was limited in terms of simulating storm intensity and the potential movement of a storm center which potentially leads to over-estimates in rainfall. In addition, regional climate models (RCMs) also have considerable systematic biases due to conceptualization problems, the erroneous prediction of rainfall intensity, and spatial resolution issues (Caldwell et al. 2009; Teutschbein and Seibert 2012; Requena et al. 2019). To address this systematic bias, the Quantile Mapping (QM) algorithm has been used as one of the well-developed methods for bias correction (Cannon et al. 2015; Lee and Singh 2019). A number of studies have demonstrated that the QM outperforms simple statistical bias correction (Jakob Themeßl et al. 2011). Block et al. (2009) and Piani et al. (2010) also suggested that QM was an adequate method for use in rainfall bias correction. For this study, the QM with a gamma distribution method was applied to correct biased rainfall data from the WRF simulation. The conventional QM is displayed in Eq. (1) with inverse function of the cumulative distribution function (CDF) for OBS (\(F_{0}^{ - 1}\)) and CDF of the model output (\(F_{m,base}\)). The gamma distribution contains two parameters, the shape parameter (α) and scale parameter (β); this includes an independent variable (\(x\)) and gamma function (\({\Gamma }\)) according to Eq. (2). The distribution parameters are obtained from Eqs. (3) and (4) with expectation (\(E\left( x \right) = {\upmu }\)) and variance (\(Var\left( x \right) = \sigma^{2}\)). The shape and scale parameters were related to the profile and dispersion of the distribution:

$$\hat{y} = F_{0}^{ - 1} \left( {F_{m,base} \left( x \right)} \right)$$
(1)
$$F_{G} \left( x \right) = \mathop \int \limits_{0}^{x} \frac{{\beta^{ - \alpha } }}{\Gamma \left( a \right)}x^{a - 1} \cdot \exp \left( { - \frac{x}{\beta }} \right){\text{d}}x; \quad x,\alpha ,\beta > 0$$
(2)
$$\beta = \frac{{\sigma^{2} }}{\mu }$$
(3)
$${\upalpha } = \frac{\mu }{\beta }$$
(4)

3.2 Frequency analysis

The aim of frequency analysis is to estimate design flood quantities to appropriately design infrastructure for hydrologic applications. Frequency analysis may be categorized into at-site and regional frequency analyses; the procedures involved with frequency analysis are presented in Fig. 3. Overall, this process is composed of data preparation, the selection of an appropriate probability distribution, and the estimation of quantiles for a range of storm sizes. For this process, a sufficient period of rainfall data is required for frequency analysis, where the input data period is greater than the return period (Hosking and Wallis 2005). In many cases, frequency analysis is carried out using a short data length, potentially providing an incorrect design flood due to the over/under estimation of rainfall with biases (Rahman et al. 2013). For regions with a shortage of rainfall data, at-site frequency analysis is limited in terms of generating the design storms. RFA is another means to overcome this rainfall data shortage; a number of studies have presented the merit of RFA (Lettenmaier and Potter 1985; Lettenmaier et al. 1987; Potter and Lettenmaier 1990; Stedinger and Lu 1995; Hosking and Wallis 2005). RFA utilizes possible rainfall data around a region, despite the region having insufficient observational rainfall data. The homogeneity of regions is a substantial part of RFA because the data dispersion relies on this characteristic. RFA may be categorized into four different methods: the index flood method, regional shape estimation method, hierarchical Bayesian modeling, and the focused rainfall growth extension method (Heo and Kim 2019). One of the well-developed index flood methods is an L-moment based method (Hosking and Wallis 2005); L-moment is the linear combinations of the probability weighted moment (Greenwood et al. 1979). The L-moment was selected for RFA because it is minimally influenced by unusually biased data.

Fig. 3
figure 3

Flow chart of at-site (left)/regional (right) frequency analyses

3.2.1 At-site frequency analysis

Generally, at-site frequency analysis using antecedent hydrologic data requires an estimation of the statistical characteristics of the data. One of the major processes is the selection of a representative statistical distribution for a site; the selected statistical distribution provides a design storm at-site frequency analysis. Prior to selecting a representative distribution, the randomness of data was measured using a range of methods (e.g., Correlogram Test, Run Test, Rank Correlation Coefficient Test, and Turning Point Test). Randomness presents the unbiasedness, independence, and stationarity of rainfall data. During the selection of statistical distribution, parameters with varying distributions were estimated by methods of moment, maximum likelihood, and/or probability weighted moments. The goodness-of-fit tests to select the most optimal distribution include the \(x^{2}\), Kolmogorov–Smirnov, Cramer Von Mises, and/or Probability Plot Correlation Coefficient tests. A representative statistical distribution was selected to operate at-site frequency analysis for selected frequencies.

3.2.2 Regional frequency analysis

For preparation of regional frequency analysis, data discordancy and heterogeneity tests were required to carry out the RFA using L-moment. The data discordancy test identifies discord sites using L-moment ratios based on basic statistical analysis between sites (i.e., t = standard deviation(std)/mean: L-CV, t3 = skewness/std: L-skewness, t4 = kurtosis/std: L-kurtosis) using discordancy (Di). In general, a discord site has a Di value over the criterion based on the given number of sites (e.g., for > 15 sites: Di = 3.0). Following the discordancy tests for each site, the selected region was validated for data homogeneity using a heterogeneity measure (H). A homogeneous region refers to a group of sites with identical statistical characteristics. Data from a homogeneous region can produce more reliable estimates compared to at-site estimations because of the effects of substituting spatial/regional rainfall data for limited temporal rainfall data for a site (Cunnane 1988). The criteria of homogeneous, possibly heterogeneous, and heterogeneous regions are: \(H < 1\), \(1 \le H < 2\), and \(H \ge 2\), respectively. After data preparation, we estimated the rainfall quantiles based on the selected distribution. We obtained five candidate distributions (generalized extreme values (GEV), generalized logistic (GLO), generalized normal (GNO), generalized pareto (GPA), and Pearson type III (PE3)) to represent a homogeneous region using the R package (Hosking 2019). During identification, the goodness of fit, \(Z^{DIST}\), was measured to illustrate similarities between the L-skewness and L-kurtosis of the candidate distribution with the regional average values. For example, the \(\left| {Z^{DIST} } \right| \le 1.64\) means that the fit is reasonable at a confidence level of 90%; the most optimal fit distribution has a \(Z^{DIST}\) that is close to 0.

A growth curve using the index-flood method suggested by Dalrymple (1960) was adopted to produce quantile estimates. In terms of the assumption regarding the homogeneity of a region, the rainfall quantile for each site was produced by multiplying between the average rainfall of the points and the regional growth curve. The values of a growth curve within region were similar/the same, because of regional homogeneity. The ratio of quantile estimate \(Q_{i} \left( F \right)\) and index flood (i.e., average annual rainfall quantiles: \(\mu_{i}\)) for the return period (T) is presented in the regional growth curve. In Eq. (5), \(q\left( f \right)\) is a dimensionless quantile function (i.e., regional growth curve) that is typically applied to every site. For an accurate estimation of the rainfall quantile to compare two or more different data applications for frequency analysis, the Monte Carlo simulation method was applied as shown in Eq. (6). Here, \(\hat{Q}_{i}^{\left[ m \right]} \left( F \right)\) is the site, i, quantile estimate for the non-exceedance probability F in the mth Monte Carlo simulation for a total number of simulations (M). The regional average relative RMSE (\({\text{RRMSE}}:{ }R^{R} \left( F \right)\)) of the estimated quantiles was obtained using Eq. (7):

$$Q_{i} \left( F \right) = \mu_{i} q\left( f \right)$$
(5)
$$R_{i} \left( F \right) = \left[ {M^{{ - 1}} \sum\limits_{{m = 1}}^{M} {\left\{ {\frac{{\hat{Q}_{i}^{{\left[ m \right]}} \left( F \right) - Q_{i} \left( F \right)}}{{Q_{i} \left( F \right)}}} \right\}^{2} } } \right]^{{1/2}}$$
(6)
$$R^{R} \left( F \right) = N^{ - 1} \mathop \sum \limits_{i = 1}^{N} R_{i} \left( F \right)$$
(7)

4 Results and discussion

4.1 WRF simulations

In terms of rainfall data generation, 17 different parameter combinations (i.e., microphysics and cumulus parameterizations) were applied in the WRF model, as discussed in Sect. 2.2. These various combinations rely on the initial scheme, planetary boundary layer (PBL) parameterization, and fixed domain with buffer zones. PBL parameterization calculates the vertical temperature motion, moisture conditions, and the resultant energy; this study selected the Yonsei University (YSU) PBL parameterization (Hong et al. 2006). This parameterization has an eddy coefficient scheme with a definitive entrainment layer and figurative K-profiles in the mixed layers. Figure 4 shows the Korean Peninsula with the Mercator map projection in the WRF simulation domain. For the domain layers, three nested domains including two-way interactions were adopted for a finer resolution. The Nest 1, 2, and 3 in Fig. 4 shows grid sizes of 36, 12, and 4 km, respectively. Nest 1 has a guide point for south-north (SN) and west–east (WE) at 128.058°E, 36.316°N. There were 43 and 49 grids at Nest 2 for SN and WE, respectively, while Nest 3 contained 82 (SN) and 58 (WE) grids; the generated rainfall data were from Nest 3 (4 km resolution). The selected grid points to compare rainfall generation were near to the rainfall observation points based on the longitudinal and latitudinal information.

Fig. 4
figure 4

Three nested domains in WRF simulations

To understand the WRF operations for rainfall generation, two observation stations (STN 11 and STN 13) were selected to identify the most optimal parameter combinations (i.e., microphysics and cumulus parameterization) for each event. Figure 5(a) illustrates the most optimal parameter combinations (P1–P17) based on the Nash Sutcliffe efficiency (NSE) values with horizontal (STN 11) and vertical (STN 13) patterns over 19 y. For STN 13, the P1 parameter combination were selected most frequently (26.3%); P11 was the most optimal combination for STN 11 with 15.8% selection. Based on the Fig. 5, it was difficult to report that one combination is able to represent a location or region for accurate rainfall estimation; however, a number of researchers have focused on identifying the most optimal combinations for rainfall simulations (Giorgi and Mearns 1999; Yerramilli et al. 2010; Nasrollahi et al. 2012). As opposed to identifying the most optimal parameter combinations for a STN, we investigated the potential to adopt this WRF model to generate historical rainfall data for a region for which adequate rainfall data are unavailable. After constructing the hourly rainfall of observed and generated data for selected events, the maximum rainfall of 12 h duration was plotted in (see Fig. 5b). This figure shows the maximum rainfall for observations (OBS) rainfall stations and those generated rainfall data by WRF (GRW). The observed maximum rainfall is denoted as a dark line, while the 17 GRW data from different parameters combinations are represented in a lighter gray color. Figure 5b shows that the rainfall estimations from the GRW sets were underestimated by 22.53–79.12% of the observational data as other applications of regional climate models (Mearns et al. 1999; Teutschbein and Seibert 2012). In the simulations of Kuo et al. (2014) using MM5 for central Alberta, rainfall estimations were over-simulated about 49–77%. Thus, bias correction was required in order to apply GRW for frequency analysis.

Fig. 5
figure 5

a Selected parameter combinations for two observed rainfall stations (1999–2017, vertical pattern: STN 11 and horizontal pattern: STN 13); b maximum rainfall of 12 h duration events for the OBS and GRW at 31 stations

4.2 Frequency analysis

4.2.1 At-site frequency analysis

At-site frequency analysis is preferred if there is sufficient rainfall data with a record length over two times of the return period for a watershed (Handbook 1999). For the CJD basin, the current status of rainfall information is unsatisfactory because of a shortage in rainfall information (Table 1, maximum number of years: 31 y). We selected a grid point near a rainfall observation station. All generated rainfall data following GRA were added into the observed rainfall data and treated the same way as the OBS. The generated rainfall data were 17 times the length of the simulation period (e.g. 10 y × 17 parameters combinations = 170 y data at a point). For frequency analysis, the rainfall data consisted of the annual maximum rainfall for given durations (e.g., 1 to 72 h). At-site frequency analysis was carried out following the preparation of rainfall data. To select the best statistical distribution fit for the rainfall information, a range of distributions were compared. We selected the method of probability weighted moments to estimate parameters; once the parameters were identified, the GEV distribution was selected through the goodness of fit test. Figure 6 (left) shows the IDF curve at STN 1 generated by an estimated rainfall intensity equation. The GRA with observed data were of a sufficient length for at-site frequency analysis because the GRA data at each station compensates for the lack of a decent length of observed rainfall data. Figure 6 (right) displays the relative RMSE (RRMSE) comparison between OBS only and GRA with OBS at STN 1. With respect to RRMSE, the GRA with OBS showed lower values because the larger data set potentially provided more information to generate the probability of rainfall estimation. The GRA may compensate historical rainfall data using various parameter combinations presenting physically possible atmospheric conditions.

Fig. 6
figure 6

At-site IDF (left) and the relative RMSE in Monte Carlo simulation (right)

4.2.2 Regional frequency analysis

Following bias correction for the WRF simulation data (i.e., GRA data), RFA was conducted for the CJD basin. At first, statistics for each site and discordancy measure were calculated for OBS and GRA to identify whether each site differed from the entire region; discordancy measures were obtained from L-CV, L-skewness, and L-kurtosis. A discord region was based on given criterion (i.e., > 3) dependent on the number of sites (i.e., 32 sites for this study) in the region. For each observed event, the WRF generated 17 different events attributed to 17 parameter combinations (i.e., P1–P17, Table 2). The number of total events was 544 from WRF simulations. During discordancy analysis, there were very low discordances (0.92%) from the WRF simulations. Five parameter sets (i.e., P1, P3, P8, P9, and P10) showed only one discordance station of 32 the stations with slightly higher discordancy values. In this study, a small percentage of discordant stations were included in the RFA because more data will minimize the rainfall data variances, despite the data having discrepancies. The heterogeneity measure identified the homogenous region of rainfall data; Table 3 displays the results of heterogeneities for H1, H2, and H3. The OBS and GRA were accepted as homogenous regions. The range of H-values was from -0.35 to -4.28 in which the GRAs showed larger amount of cross-correlation between sites compared to observations (-1.33– -1.60). The WRF simulation may contain some regularities during physical rainfall generation and the potentially short observation location distances produce strong cross-correlation (Hosking and Wallis 2005).

Table 3 The heterogeneity results for 32 sites of the OBS and GRA (P1–P17)

We selected feasible distributions based on the L-moment ratio diagram and the Z value; five different distributions were investigated (i.e., GEV, GLO, GNO, GPA, and PE3) in the goodness of fit test. Figure 7 illustrates the L-moment ratios between L-Skewness and L-Kurtosis for 32 sites in the CJD basin for annual maximums of 12 h rainfall events. In this figure, the cross (black) and triangle (red) points denote the L-moment ratios of 32 sites and the average ratio of 32 sites, respectively. The selection of the most optimal distribution of the data was based on the closeness between the average ratio (i.e. triangle) and representative lines or points of statistical distributions. The confirmation of preferable distributions may be estimated by the Z statistics; a statistical distribution was selected when the number of Z-values is within the criterion and close to the zero (presented as distance in Table 4). The satisfactory criterion was |Z|≤ 1.64, which corresponds to an acceptance of the hypothesized distribution at a confidence level of 90%. PE3 was the most optimal statistical distribution for GRA based on the distance from Fig. 7. The GEV was selected as a representative statistical distribution because of the number of Z-value over the criterion; however, GNO and PE3 showed higher pass ratios where the Z-values were a shorter distance from zero.

Fig. 7
figure 7

L-moment ratio of OBS and selected parameter combinations (i.e., P2, P3, P7, P9, P12)

Table 4 Z-values for goodness of fit (distance measures from zero to the Z-value)

Figure 8 displays the simple boxplots of averaged probable rainfall for a given return period by OBS, GRA, and GRW. These were generated by multiplying the quantile function of growth curves and averaged rainfall of station as per Eq. (5). The boxplot for each return period was averaged from probable rainfall for each parameter combination (P1–P17). For the probable rainfall for a 2 y return period, the GRW showed a relatively low median as expected in Sect. 4.1. With an increased return period, the variations of probable rainfall for GRW were also elevated with the presence of unusual values. The GRA showed median differences at a lower return period with higher variances. After a 10 y return period, the GRA slightly underestimated the probable rainfall, where the variances were similar to the probable rainfall distribution of a 2 y return period. The GRA and GRW for a higher return period (after 100 y) shared similar median values; the size differences between the 75th and 25th percentiles of GRA and GRW were also similar across all return periods. No substantial quartile skews were observed for the GRW and GRA as OBS. For higher return periods (> 100 y), the bias correction process may not be necessary to generate probable rainfall in terms of the median and interquartile range.

Fig. 8
figure 8

Averaged probable rainfall for specific return periods in the CJD basin

The Monte Carlo simulation was used to compare the accuracy of estimated rainfall quantile results. Figure 9 shows the averaged RRMSEs of GRW and GRA for 17 parameter sets including 32 rainfall stations during the rainfall quantile generation for non-exceedance probabilities. The averaged RRMSE for GRA had a higher accuracy compared to the averaged RRMSE for GRW; the range of RRMSEs for GRW, GRA, and OBS were 0.064–0.218, 0.046–0.181, and 0.031–0.175, respectively. The averaged RRMSEs of GRA and OBS were remarkably lower than GRW. This suggests that the GRA calculated reliable probability hydrologic quantiles. However, the GRA had a higher RRMSE compared to OBS; this differed from the Monte Carlo simulations for at-site frequency analysis. For a selected site, the at-site frequency analysis had more data as all WRF generated rainfall data were included for different parameter combinations (P1–P17) and considered observations. Figure 9 shows the comparison of the same number of rainfall data (32 rainfall data = 32 stations per event) for OBS and GRA generated by WRF for each parameter combination set (i.e., P1, P2, …, or P17). Thus, the same dataset number produced lower accuracy in terms of comparing Monte Carlo simulations. With a higher return period, the accuracies of GRW, GRA and OBS requires consideration to maximize the accuracy in future research. Based on the results, it is recommended that bias correction for WRF simulation rainfall data is carried out for frequency analysis that will estimate the design storm.

Fig. 9
figure 9

RRMSE of observation and average of GRA, GRW

4.2.3 Possible combinations for a regional frequency analysis

Figure 10 displays the possible combinations of a RFA using WRF simulated rainfall data following bias correction. The first combination involved each parameter set having its own rainfall data at 32 rainfall observation stations (method 1) as previously analyzed. Generated rainfall data for each rainfall station were recognized as regional rainfall information. Another combination was where all rainfall data generated for different parameter sets (P1–P17) were added into a station displayed as Fig. 10—method 2. Each parameter set may be considered a different region to carry out a RFA for method 2. The at-site frequency analysis in Sect. 4.2.1 forms one part of this method; however, we did not place different parameter sets as distinct regions.

Fig. 10
figure 10

Possible combinations of regional frequency analysis using GRA. Method 1: rainfall stations based combinations; method 2: parameter set based combinations

Figure 11(a) shows the possible range of the growth curves for the different parameter combinations. The range of method 2 is displayed between dotted lines, showing the larger range of the growth curves. Method 2 contained a range of physical conditions compared to the method 1 based on only one parameter set; this will generate various storm events. Method 1 showed higher growth curves for all stations. The range of growth curves for method 2 were ranged widely, including the growth curves of OBS and Method 1. With a lower return period (< 20 y), the quantiles of the two methods underestimated the regional growth curve compared to OBS (small graphs in Fig. 11a). Figure 11b presents the averaged probable rainfall for specific return periods using two different methods. For the 2 y return period, methods 1 and 2 showed a lower median of possible rainfall with larger variances. When the return period was increased, the medians of possible rainfall were well-matched. Methods 1 and 2 demonstrated higher variances compared to the OBS for all return periods; method 2 showed larger variances for all return period with some outliers. Based on the generated possible rainfall, methods 1 and 2 have potential to provide more rainfall information to generate design storms.

Fig. 11
figure 11

a Growth curves of different methods of rainfall data combinations for regional frequency analysis; b averaged possible rainfall using different rainfall data combination methods for specific return periods in the CJD basin

5 Summary and conclusions

This study applied the WRF model to generate rainfall data using parameters (i.e., microphysics and cumulus parameterization) representing a range of physical conditions in terms of moisture and heat flux in the atmosphere. Simulated rainfall data using 17 parameter combinations were evaluated in terms of at-site/regional frequency analyses; these were then compared with observed data at 32 rainfall stations. The findings from the study are as follows:

  1. 1.

    Direct usage of WRF generated rainfall information using a specific parameter combination is not recommended. In some cases, various parameter combinations may produce the best fit for observations; this means it is difficult to specify one parameter combination that is able to represent a specific region for rainfall simulations. In terms of the averaged maximum rainfall data for a given duration at each rainfall station, WRF underestimated this value for most rainfall stations. For hydrologic applications, a bias correction process was required as other regional climate models;

  2. 2.

    To analyze the applicability of generated rainfall data after bias correction (GRA), at-site and regional frequency analyses were carried out, and compared with the generated rainfall by WRF without bias correction (GRW) and/or observations (OBS). These results showed that:

    • With at-site frequency analysis, the GRA with OBS showed higher accuracy compared to only OBS in terms of RRMSEs using the Monte Carlo simulation method. GRA provided more rainfall data by the various parameter sets based on globally simulated atmospheric conditions (FNL operational global analysis data);

    • With RFA, the GRA, GRW, and/or OBS produced averaged probable rainfall for specific return periods. For a short return period (2 y), the medians of averaged probable rainfall for GRW and GRA were lower than the OBS. The simplicity of the WRF model in complex rainfall phenomena potentially underestimated rainfall. With increased return periods, the median of GRW approached the OBS with larger variances. For higher return periods (> 100 y), a bias correction process for GRW may not be required.

  3. 3.

    There were two possible methods for RFA using WRF generated rainfall data: method 1 was based on physical locations; and method 2 was based on a range of parameter sets. With a comparison of averaged possible rainfall, methods 1 and 2 showed similar medians of averaged probable rainfall, although method 2 contained more variances and some outliers.

Consequently, we conclude that 1) the rainfall generation using physical systems (i.e. microphysics and cumulus parameterization) of WRF can be applied to produce long-term hydrologic input data with a bias correction process; 2) different parameter combinations may be applied as different regions for regional frequency analysis. In future studies, physically generated rainfall data may be used to provide runoff data to analyze the spatial characteristics of mid/small watersheds.