1 Introduction

The uncertainties in the initial conditions and systematic errors in numerical weather and climate forecast models are among the main causes of inaccurate weather and climate predictions. Because the atmosphere is highly nonlinear and chaotic, a small change in the initial state can lead to a significant variation in the future (Lorenz 1969). The same can be said for atmospheric models. As part of the efforts to reduce the uncertainties in the initial conditions, the ensemble method is used widely for weather and climate predictions by configuring the forecast ensemble members (EMs) obtained from different physical processes or from allowing different small perturbations in the initial conditions (e.g., Stensrud et al. 1999; Stensrud et al. 2000). The initial perturbation methods, such as singular vector, ensemble transform Kalman filter, and ensemble transform with rescaling, have been widely used in operational centers [e.g., European Centre for Medium-Range Weather Forecasts (ECMWF), UK Met Office (UKMO), and National Centers for Environmental Prediction (NCEP)] (Buizza 1997; Richardson 2000; Wei et al. 2006; Hunt et al. 2007; Bowler et al. 2008; Wei et al. 2008).

Coupled general circulation models (CGCMs) are normally used for long-range seasonal forecasts. These models allow various interactions and feedback among the atmosphere, oceans, sea ice, and land surface (Meehl 1995). Various multi-model ensemble (MME) methods, which are considered an effective means to improve seasonal predictability by offsetting the biases in individual models, have been introduced in several operational centers [e.g., ECMWF, NCEP, Predictive Ocean Atmosphere Model for Australia, and Asia-Pacific Economic Cooperation Climate Center (APCC)] for quasi-real-time seasonal predictions (e.g., Molteni et al. 2011; Lim et al. 2012; Kirtman et al. 2014; Ham et al. 2019).

Although CGCMs are used widely for long-term climate forecasting, they are unsuitable for investigating regional-scale phenomena because of their coarse spatial resolution. Therefore, a dynamical downscaling method that nests a regional climate model (RCM) within a global climate model (GCM) has commonly been used to overcome this limitation (e.g., Ahn et al. 2012; Ahn et al. 2016a; Hur and Ahn 2017; Im et al. 2017b; Lee et al. 2019; Ahn et al. 2021). The high-resolution RCM allows a detailed description of regional-scale atmospheric processes with complex geographical and topographic information. Cocke and LaRow (2000) and Cocke et al. (2007) reported that precipitation downscaled by an RCM provided better regional representations than a GCM, with higher predictability for the frequency of heavy rainfall events.

Recent studies have utilized a multi-RCM ensemble to meet the demands for high-resolution climate prediction. The multi-RCM ensemble downscaling (MRED) initiated by the Climate Prediction Program for the Americas (CPPA) is one example of the multi-RCM ensemble prediction studies (https://rcmlab.agron.iastate.edu/mred/). In the MRED project, NCEP Climate Forecast System (CFS) reforecasts were downscaled using seven different RCMs with 10 EMs during the boreal winter season (December through April) from 1982 to 2003 (e.g., Yoon et al. 2012; De Sales and Xue 2013; Shukla and Lettenmaier 2013; De Haan et al. 2015). The results indicated that the prediction skills of the MRED (i.e., multi-RCM mean) were higher than those of the CFS, mainly in terms of the finer-scale distributions of atmospheric variables and statistical characteristics of daily mean precipitation (Yoon et al. 2012). Shukla and Lettenmaier (2013) concluded that a range of combination strategies, such as giving higher weights for RCMs with the highest prediction skills, are needed because several biases were still found in the MRED according to specific variables and regions.

More than half of the annual precipitation over South Korea occurs in summer (June–July–August). Therefore, it is important for a model to simulate the precipitation during that period adequately. Furthermore, South Korea requires a high-resolution model because its climate exhibits large spatiotemporal variations owing to the combined effects of factors such as local topography and the monsoon (e.g., Kang and Hong 2008; Hong and Ahn 2015). As part of an ongoing effort to simulate the summer precipitation over South Korea, many researchers have investigated the reproducibility of and future changes in the precipitation characteristics using downscaled high-resolution multi-RCM data (e.g., Hong and Ahn 2015; Ahn et al. 2016b; Im et al. 2017a). On the other hand, insufficient research has been conducted on seasonal predictions of regional-scale precipitation over South Korea using multi-RCM data.

Massive computing resources are needed to produce multiple RCM ensemble seasonal predictions. Therefore, one institution typically produces an ensemble set using one model, a so-called single-model ensemble (SME), for quasi-real-time seasonal predictions. A designated institution then collects the SME sets produced by different institutions and combines them into an MME. In producing seasonal predictions using the MME, each institution must produce its SME within an appropriate time. However, because of limited computing resources, producing even a single SME with a large number of EMs using one RCM is time-consuming for an institution.

This paper proposes an ensemble mean method (EMM) to increase the prediction efficiency by shortening the computational time needed to produce an SME. This method obtains the SME by integrating the RCM once using initial and lateral boundary conditions obtained by arithmetically averaging the outputs of the GCM EMs. The EMM was first mentioned by Yoshimura and Kanamitsu (2013). They argued that the EMM constructed by averaging the GCM ensembles could dampen the high-frequency variations in the wind fields, resulting in an underestimation of the transient components of moisture divergence and precipitation. However, their analysis focused on the whole global domain using a global dynamical downscaling model. The evaluation of simulated variables in specific regions by applying the EMM to an RCM has not been adequately discussed so far. Recent studies proposed that correcting systematic biases inherent to the GCM outputs could improve dynamical downscaling simulations (e.g., Xu et al. 2019; Adachi and Tomita 2020). Many researchers have utilized various sophisticated modified boundary dynamical downscaling (MBDDS) methods, such as the mean bias correction method (e.g., Peng et al. 2013; Bruyère et al. 2014; Ratnam et al. 2016), mean and variance bias correction method (e.g., Xu and Yang 2012; Hoffmann et al. 2016), and quantile-quantile correction method (e.g., Michelangeli et al. 2009; Colette et al. 2012). Lim et al. (2019) suggested an MBDDS approach by applying the mean bias correction method to the GCM ensemble mean fields. The approach improved the downscaled winter climate over East Asia in terms of the climatological mean, interannual variability, and extreme events. Nonetheless, the MBDDS corrects each variable individually, which means that the physical relationships between variables, such as hydrostatic equilibrium and geostrophic wind balance, may not be preserved (e.g., Meyer and Jin 2016; Hernández-Díaz et al. 2017). In addition, it is unclear whether the corrected GCM outputs will help improve the seasonal predictability of downscaled precipitation.

The main purpose of this study is to apply the EMM, which uses the ensemble mean GCM outputs as the initial and lateral boundary conditions of an RCM, to the seasonal prediction of summer precipitation in South Korea. The seasonal predictions obtained using the EMM are compared with those obtained by the conventional method, which produces predictions by applying boundary and initial conditions obtained from each GCM EM to an RCM. Unlike the conventional method, which requires multiple RCM integrations for an ensemble prediction, the EMM requires only one integration. If the two methods yield similar results, the EMM can be a potentially better alternative for regional-scale seasonal predictions because it significantly reduces the computing time and costs. The remainder of this paper is organized as follows. Section 2 introduces the observational data, model description, experimental design, and evaluation methodology. The results are presented in Sect. 3. A summary and conclusions are given in Sect. 4.

2 Data and experimental design

2.1 Observation data

The monthly mean enhanced reanalysis data with a 2.5° horizontal resolution provided by the Climate Prediction Center Merged Analysis of Precipitation (CMAP) (Xie and Arkin 1997) are used to verify the precipitation simulated by the CGCM. The daily observational data from 72 in situ weather stations obtained from the Automated Surface Observing System (ASOS) of the Korea Meteorological Administration are used to validate the downscaled results. The daily precipitation data with a 0.25° horizontal resolution obtained from the fifth-generation European Centre for Medium-Range Weather Forecasts Reanalysis (ERA5) are also utilized. The analysis period of this study is the summer (June–July–August, JJA) from 2000 to 2021. The summer precipitation over South Korea obtained from the two gridded precipitation datasets (i.e., CMAP and ERA5) is broadly consistent with that from ASOS. CMAP and ASOS show similar climatological precipitation, while ERA5 tends to underestimate the precipitation compared to the other two datasets (Fig. 1a–c). In terms of interannual variability, however, both CMAP and ERA5 show good agreement with ASOS (Fig. 1d).

Fig. 1

Climatological precipitation in South Korea from 2000 to 2021 (June–July–August) obtained from a CMAP, b ERA5, and c 72 weather station data of the Korea Meteorological Administration Automated Surface Observing System (ASOS). d Time series of precipitation averaged over South Korea derived from the three precipitation datasets

2.2 Coupled general circulation model

The CGCM used in this study is the Pusan National University (PNU) CGCM v2.0 (hereafter, PNUv2.0), which is one of the models participating in the APCC MME long-range prediction system. A detailed CGCM description and the process of producing predictions are presented elsewhere (e.g., Kim and Ahn 2015; Sun and Ahn 2015; https://www.apcc21.org/ser/global/modelDescription.do?lang=en). The model data used are the hourly forecast datasets during boreal summer (June–August) of five EMs with the initial dates of 7, 9, 11, 13, and 15 April (i.e., with a 1.5-month lead time). Five EMs may not be sufficient for seasonal prediction, but we expect that the number of EMs will not have much effect on the conclusions of this study.

2.3 Regional climate model and experimental design

The RCM used in this study is the Weather Research and Forecasting (WRF) model version 4.0. The model configuration consists of two-way interactive triple-nested domains with resolutions of 60 km (domain 1), 12 km (domain 2), and 2.4 km (domain 3), i.e., a 5:1 downscaling ratio (Fig. 2). Only the output from domain 3 is used for analysis (Fig. 2b). The initial and lateral boundary conditions are derived from the atmospheric and land variables of PNUv2.0, such as geopotential heights, horizontal wind components, temperatures, relative humidities, soil moistures, and soil temperatures, and the lateral boundary conditions are updated every hour (e.g., Hur and Ahn 2015; Ahn et al. 2018; Kim et al. 2019; Kim et al. 2021; Song et al. 2021). The model is integrated from 00 UTC on May 29 to 00 UTC on September 1 each year. The first 3 days are treated as a spin-up period to allow for the dynamic adjustment of the lateral forcing and internal physical dynamics of the model (e.g., Ahn et al. 2012). The following model physics schemes are selected: WRF single-moment 6-class microphysics scheme (Hong and Lim 2006), Dudhia shortwave radiation scheme (Dudhia 1989), rapid radiative transfer model longwave radiation scheme (Mlawer et al. 1997), revised Monin-Obukhov surface-layer scheme (Jiménez et al. 2012), unified Noah land-surface model scheme (Chen and Dudhia 2001), Yonsei University planetary boundary layer scheme (Hong et al. 2006), and Kain-Fritsch convection scheme (Kain 2004). The convection scheme is not used in domain 3 because the resolution of this domain is at a convection-permitting scale. In a study over South Korea, Seo and Ahn (2020) reported that a convection-permitting WRF experiment (i.e., an experiment in which the cumulus parameterization scheme is turned off) simulated distributions of the mean and extreme summer precipitation that were closer to the observed ones than those from an experiment in which the cumulus parameterization scheme was turned on. Table 1 lists the detailed WRF configuration.

Fig. 2

Topography heights (unit: m) for the a domain 1 and b domain 3. The inner boxes in a indicate the nested domains (domains 2 and 3). The dots in b represent the location of in situ weather observational stations (72 stations)

Table 1 Configuration of the WRF used in this study

Two experiments are designed in this study. In the first experiment, five EMs are downscaled dynamically using the WRF. In this case, the initial and lateral boundary conditions of each EM are obtained from the forecast dataset of the corresponding PNUv2.0 EMs. Hereafter, five WRF forecasts obtained from the first experiment are referred to as EXP1_EM1, EXP1_EM2, EXP1_EM3, EXP1_EM4, and EXP1_EM5, respectively. The arithmetic average of those downscaled forecasts obtained by the simple composite method is called EXP1. The second experiment is carried out in the same manner as the first, but integration is performed only once using the initial and lateral boundary conditions obtained by arithmetically averaging the EMs of PNUv2.0. The WRF forecast obtained from the second experiment is referred to as EXP2. This approach is similar to Lim et al. (2019), but the bias correction method is not applied to the driving PNUv2.0 variables. Figure 3 shows a schematic diagram of the overall experimental design.
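The difference between the two experiments can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the preprocessing code used in the study: the function name `ensemble_mean_forcing` and the toy wind values are hypothetical, and a real workflow would average gridded PNUv2.0 fields at every boundary update time before they are passed to the WRF preprocessing step.

```python
from statistics import fmean

def ensemble_mean_forcing(members):
    """Average one forcing variable across ensemble members, time step by time step.

    `members` holds one list per GCM ensemble member; each list contains the
    variable at successive boundary-update times. (Hypothetical layout: real
    fields would also carry latitude, longitude, and level dimensions.)
    """
    n_times = len(members[0])
    if any(len(m) != n_times for m in members):
        raise ValueError("all members must cover the same time steps")
    return [fmean(m[t] for m in members) for t in range(n_times)]

# Toy zonal wind (m/s) at one grid point: five members, four update times.
members = [
    [10.0, 12.0, 9.0, 11.0],
    [11.0, 10.0, 10.0, 12.0],
    [9.0, 11.0, 12.0, 10.0],
    [12.0, 9.0, 11.0, 9.0],
    [8.0, 13.0, 8.0, 13.0],
]

exp2_forcing = ensemble_mean_forcing(members)  # drives the single EXP2 run
n_runs_exp1 = len(members)                     # EXP1: one RCM run per member
n_runs_exp2 = 1                                # EXP2: one RCM run in total
```

In this toy case the averaged series varies less from step to step than any individual member, which is the smoothing effect that motivates the comparison between EXP1 and EXP2.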

Fig. 3

Schematic diagram of experiments used in this study

2.4 Evaluation methodology

The summer precipitation over South Korea is generally influenced by northward moisture transport associated with the westward extension of the western North Pacific subtropical high (e.g., Baek et al. 2017; Kim et al. 2017; Song and Ahn 2022). According to the moisture budget equation for an atmospheric column, precipitation is related to the convergence of the vertically integrated moisture flux (VIMF). The VIMF is calculated as follows:

$$\textrm{VIMF}=-\frac{1}{g}{\int}_{p_s}^{p_t}\left(q\times \textbf{V}\right) dp$$

where g is the gravitational acceleration; p_s (p_t) is the pressure at the surface (top) of the atmosphere (p_t is chosen as 300 hPa in this study); q is the specific humidity; and V is the horizontal wind vector.
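As a rough illustration of the integral above, the following Python sketch evaluates the VIMF for a single column. The function name `vimf`, the trapezoidal rule, the value of g, and the toy profile are all assumptions made for this example; the text does not state how the integral is discretized.

```python
G = 9.80665  # standard gravity (m s-2); the value used in the study is not stated

def vimf(p_pa, q, u, v):
    """Return (VIMF_X, VIMF_Y) in kg m-1 s-1 for one atmospheric column.

    p_pa : pressure levels in Pa, ordered from the surface (p_s) to the top (p_t)
    q    : specific humidity (kg kg-1) on those levels
    u, v : zonal and meridional wind (m s-1) on those levels
    """
    fx = fy = 0.0
    for k in range(len(p_pa) - 1):
        dp = p_pa[k + 1] - p_pa[k]  # negative, because pressure decreases upward
        fx += 0.5 * (q[k] * u[k] + q[k + 1] * u[k + 1]) * dp
        fy += 0.5 * (q[k] * v[k] + q[k + 1] * v[k + 1]) * dp
    # The leading minus sign cancels the negative dp, so eastward and
    # northward moisture transport come out positive.
    return -fx / G, -fy / G

# Toy column from 1000 to 300 hPa (illustrative values, not model output).
p = [100000.0, 85000.0, 70000.0, 50000.0, 30000.0]
q = [0.016, 0.012, 0.007, 0.003, 0.0005]
u = [2.0, 5.0, 8.0, 12.0, 18.0]
v = [3.0, 4.0, 4.0, 3.0, 2.0]

vimf_x, vimf_y = vimf(p, q, u, v)
```

The VIMF convergence discussed in the text would then follow from horizontal finite differences of `vimf_x` and `vimf_y` on the model grid.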

The mean bias error (MBE), root mean square error (RMSE), temporal correlation coefficients (TCC), and hit rate (HR) are used to evaluate the performance of the simulated precipitation. The MBE, RMSE, and TCC are, respectively, defined as follows:

$$\textrm{MBE}=\frac{1}{N}\sum\limits_{n=1}^N\left({M}_n-{O}_n\right),$$
$$\textrm{RMSE}=\sqrt{\frac{1}{N}\sum\limits_{n=1}^N{\left({M}_n-{O}_n\right)}^2},$$
$$\textrm{TCC}=\frac{\sum_{n=1}^N\left({M}_n-\overline{M}\right)\left({O}_n-\overline{O}\right)}{\sqrt{\sum_{n=1}^N{\left({M}_n-\overline{M}\right)}^2}\sqrt{\sum_{n=1}^N{\left({O}_n-\overline{O}\right)}^2}},$$

where M_n (O_n) represents the modeled (observed) value in year n, N is the number of years in the analysis period, and overbars denote averages over the sample of size N.
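The three scores translate directly into code. The sketch below uses only the Python standard library; the function names and the four-value sample series are hypothetical and serve only to show the arithmetic.

```python
import math

def mbe(model, obs):
    """Mean bias error: average of (M_n - O_n)."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def rmse(model, obs):
    """Root mean square error of (M_n - O_n)."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def tcc(model, obs):
    """Temporal (Pearson) correlation coefficient between model and observation."""
    m_bar = sum(model) / len(model)
    o_bar = sum(obs) / len(obs)
    cov = sum((m - m_bar) * (o - o_bar) for m, o in zip(model, obs))
    var_m = sum((m - m_bar) ** 2 for m in model)
    var_o = sum((o - o_bar) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)

# Illustrative seasonal means (mm/day) for four years.
model = [6.0, 8.0, 7.0, 9.0]
obs = [7.0, 9.0, 6.0, 10.0]

print(mbe(model, obs))   # -0.5
print(rmse(model, obs))  # 1.0
```

A negative MBE corresponds to a dry bias, which is how the sign is interpreted throughout Sect. 3.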

The HR is defined as the proportion of observed events that are correctly forecast:

$${\textrm{HR}}_{\textrm{Above}\ \textrm{Normal} \ \textrm{(AN)}}=\frac{A}{\left(A+B+C\right)}$$
$${\textrm{HR}}_{\textrm{Near}\ \textrm{Normal}\ \textrm{(NN)}}=\frac{E}{\left(D+E+F\right)}$$
$${\textrm{HR}}_{\textrm{Below}\ \textrm{Normal}\ \textrm{(BN)}}=\frac{I}{\left(G+H+I\right)}$$
$${\textrm{HR}}_{\textrm{Total}}=\frac{\left(A+E+I\right)}{N}$$

In the contingency table for calculating the HR (Table 2), the observed and simulated values are each classified as above normal, near normal, or below normal using a threshold of ±0.43 standard deviations. An HR greater than 0.33 (i.e., the reference value for a random prediction) is considered skillful.

Table 2 Contingency table (3 × 3) for calculating the hit rate. N is the total number of years
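The hit rate calculation can be sketched as follows. This is an illustrative reading of the definitions above, with two assumptions that the text does not spell out: each series is standardized with its own mean and population standard deviation, and a value exactly on the threshold counts as near normal. The function names and the six-year sample are hypothetical.

```python
from statistics import fmean, pstdev

THRESHOLD = 0.43  # in standard deviations, as stated in the text

def categorize(series):
    """Label each year 'AN', 'NN', or 'BN' from its standardized anomaly."""
    mean, std = fmean(series), pstdev(series)
    labels = []
    for x in series:
        z = (x - mean) / std
        labels.append("AN" if z > THRESHOLD else "BN" if z < -THRESHOLD else "NN")
    return labels

def hit_rates(obs, model):
    """Return the HR for each observed category and the total HR."""
    obs_cat, mod_cat = categorize(obs), categorize(model)
    rates = {}
    for cat in ("AN", "NN", "BN"):
        n_obs = sum(1 for o in obs_cat if o == cat)
        hits = sum(1 for o, m in zip(obs_cat, mod_cat) if o == cat and m == cat)
        rates[cat] = hits / n_obs if n_obs else float("nan")
    rates["Total"] = sum(o == m for o, m in zip(obs_cat, mod_cat)) / len(obs)
    return rates

# Illustrative six-year series (mm/day).
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
model = [3.0, 1.0, 2.0, 6.0, 4.0, 5.0]
rates = hit_rates(obs, model)
```

For this sample, one of the two observed events in each of the above-normal and below-normal categories is hit and neither near-normal event is, so the total HR is 2/6, which sits at the random-prediction reference value.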

3 Results

3.1 Predictability of summer precipitation over South Korea in PNUv2.0

The seasonal prediction skill of PNUv2.0 for summer precipitation is examined first. Figure 4 presents the spatial distribution of the MBE and TCC obtained from PNUv2.0. The original PNUv2.0 data are interpolated onto the CMAP grid using bi-linear interpolation for comparison with CMAP. The simple composite method, in which equal weighting is assigned to each EM, is used in this analysis. PNUv2.0 has significant dry biases over the Korean Peninsula (Fig. 4a). The area-averaged precipitation near South Korea (i.e., five grid points) obtained from CMAP and PNUv2.0 is 7.62 mm ∙ day−1 and 4.36 mm ∙ day−1, respectively, indicating that the model underestimates the precipitation over that region. The area-averaged TCC over the same region obtained from PNUv2.0 is 0.20, which is not significant at the 95% confidence level based on a two-sided Student’s t test; the short analysis period (22 years) also makes it difficult to attain statistical significance (Fig. 4b). Simulating the interannual variability of precipitation has been a major challenge for climate models. According to previous studies, many climate models participating in operational seasonal forecast systems exhibit relatively low performance in predicting precipitation compared to temperature. In particular, the prediction skills of precipitation are lower in the extra-tropics than in the tropics (e.g., Kim et al. 2012; Min et al. 2014; Ham et al. 2019).

Fig. 4

Spatial distribution of a mean bias error and b temporal correlation coefficients of summer mean precipitation (unit: mm ∙ day−1) from 2000 to 2021 (JJA) derived from PNUv2.0. The value of the upper-right corner above each plot indicates the area averaged value over South Korea (black dots; five grid points)

3.2 Comparison of performance on precipitation between EXP1 and EXP2

Figure 5 shows the spatial distribution of the 22-year (2000–2021) averaged summer precipitation derived from the observation, EXP1, and EXP2 at 72 in situ observational sites. The precipitation datasets obtained from EXP1 and EXP2 are interpolated to the locations of the in situ observational stations using the inverse distance weighting interpolation method. In the observations, high precipitation is concentrated in two regions of South Korea: a southern coastal region, and a region extending from the northwestern part to the northeastern part of South Korea (Fig. 5a). This is similar to Qiu et al. (2020), even though the data period is not the same. EXP1 exhibits dry biases over most of South Korea but wet biases over the northwestern regions. As a result, it captures only one of the two regions with high observed precipitation (i.e., the northern part of South Korea). For PNUv2.0, dry biases are found over the entire South Korea region, as shown in Fig. 4a. EXP1 retains the dry biases seen in PNUv2.0 relative to the observation, but they tend to be reduced. In addition, EXP1 simulates regional-scale details better than PNUv2.0 (Fig. 5b, d). These results also hold for each EM (figures not shown). The all-station averaged precipitation and MBE in EXP1 (ensemble spread) are 5.76 mm ∙ day−1 (5.10~6.49 mm ∙ day−1) and −2.03 mm ∙ day−1 (−2.69~−1.29 mm ∙ day−1), respectively. Here, the ensemble spread (from minimum to maximum) is based on the results obtained from the five EMs (i.e., EXP1_EM1, EXP1_EM2, EXP1_EM3, EXP1_EM4, and EXP1_EM5). The spatial distribution pattern of EXP2 is similar to that of EXP1. On the other hand, EXP2 alleviates the dry biases found in EXP1 and is quantitatively closer to the observed precipitation (Fig. 5c, e). The all-station averaged precipitation in EXP2 is 7.59 mm ∙ day−1, which is comparable to the observation (7.79 mm ∙ day−1).
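The interpolation of gridded model output to station locations can be sketched with a simple inverse distance weighting routine. The distance exponent, the choice of neighboring grid points, and the use of plain longitude-latitude distance are assumptions made for this example; the text names the method but not its settings, and the function name `idw` is hypothetical.

```python
import math

def idw(station_lon, station_lat, grid_points, power=2.0):
    """Inverse-distance-weighted estimate at a station.

    grid_points : iterable of (lon, lat, value) tuples near the station
    power       : distance exponent (2 is a common default; the value used
                  in the study is not stated)
    """
    num = den = 0.0
    for lon, lat, value in grid_points:
        # Distance in degrees; adequate for a few neighboring grid cells,
        # although a great-circle distance would be more rigorous.
        dist = math.hypot(lon - station_lon, lat - station_lat)
        if dist == 0.0:
            return value  # station coincides with a grid point
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

# Four surrounding grid points with illustrative precipitation (mm/day).
neighbors = [
    (127.0, 36.0, 4.0),
    (127.1, 36.0, 6.0),
    (127.0, 36.1, 8.0),
    (127.1, 36.1, 10.0),
]
at_center = idw(127.05, 36.05, neighbors)  # equidistant, so close to the mean
at_corner = idw(127.0, 36.0, neighbors)    # returns the grid value itself
```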

Fig. 5

Spatial distribution of summer mean precipitation (unit: mm ∙ day−1) derived from a ASOS, b EXP1, and c EXP2 during 2000–2021 (JJA). d, e The same as b and c, respectively, but for mean bias error. The averaged values over 72 weather stations are shown in the upper-right corner above each panel

The moisture flux is investigated to understand the different precipitation amounts in EXP1 and EXP2. Figure 6 shows the spatial distribution of the 22-year (2000–2021) averaged summer precipitation and the meridional and zonal components of VIMF (hereafter, VIMF_Y and VIMF_X, respectively) in the inland areas of South Korea derived from EXP1 and EXP2. Consistent with Fig. 5, EXP2 simulates more precipitation over the entire region of South Korea than EXP1. The area-averaged precipitation of EXP1 (ensemble spread) and EXP2 is 6.36 mm ∙ day−1 (5.68~7.14 mm ∙ day−1) and 8.17 mm ∙ day−1, respectively (Fig. 6a, b). Regarding VIMF_Y, EXP1 and EXP2 are similar in both the spatial distribution and the area-averaged value. The area-averaged VIMF_Y of EXP1 (ensemble spread) and EXP2 is 123.54 kg ∙ m−1 ∙ s−1 (109.69~138.48 kg ∙ m−1 ∙ s−1) and 126.38 kg ∙ m−1 ∙ s−1, respectively (Fig. 6c, d). On the other hand, EXP2 tends to simulate a stronger VIMF_X than EXP1, which may contribute to the enhanced convergence of VIMF. The area-averaged VIMF_X in EXP1 (ensemble spread) and EXP2 is 95.25 kg ∙ m−1 ∙ s−1 (84.40~111.66 kg ∙ m−1 ∙ s−1) and 134.28 kg ∙ m−1 ∙ s−1, respectively, and the area-averaged VIMF convergence in EXP1 (ensemble spread) and EXP2 is 7.94 × 10−5 kg ∙ m−2 ∙ s−1 (6.83~8.65 × 10−5 kg ∙ m−2 ∙ s−1) and 10.34 × 10−5 kg ∙ m−2 ∙ s−1, respectively (Fig. 6e, f). The comparison with EXP1 demonstrates that EXP2 simulates a stronger convergence of VIMF, resulting in more abundant precipitation.

Fig. 6

Spatial distribution of summer mean a precipitation (unit: mm ∙ day−1), c meridional, and e zonal components of vertically integrated moisture flux (unit: kg ∙ m−1 ∙ s−1) in the inland areas over South Korea derived from EXP1 during 2000–2021 (JJA). b, d, f The same as a, c, and e, respectively, but for EXP2. The area-averaged values are shown in the upper-right corner above each panel

Figure 7 shows the vertical profiles of the main variables at each pressure level from 1000 to 300 hPa obtained from the two simulations, to determine whether the different VIMF_X in EXP1 and EXP2 can be attributed to differences in specific humidity or wind. The differences in the zonal wind between EXP1 and EXP2 appear to be marginal in the lower atmosphere but are amplified in the middle and upper atmosphere (Fig. 7a). Regarding the meridional wind, although EXP2 tends to simulate lower values in the middle atmosphere than EXP1, the two simulations show similar profiles (Fig. 7b). The differences in specific humidity between the two simulations are small (Fig. 7c). The different VIMF_X in EXP1 and EXP2 is therefore attributed to the difference in the intensity of the upper-level jet stream, which results in a difference in precipitation.

Fig. 7

Vertical distribution of a zonal wind, b meridional wind, and c specific humidity at each pressure level from 1000 to 300 hPa averaged over inland areas of South Korea obtained from EXP1 (blue line; blue shading indicates the ensemble spread) and EXP2 (red line) during 2000–2021 (JJA)

The summer precipitation of South Korea is composed largely of two peaks: Changma (late June–late July), which is a component of the East Asian summer monsoon along with Baiu over Japan and Mei-yu over China (the so-called BCM front, e.g., Hong and Ahn 2015), and post-Changma (mid-August–early September) (e.g., Ha et al. 2012; Lee et al. 2017). It is important for a model to simulate this intra-seasonal variability of precipitation adequately. Figure 8 shows the temporal variation of the daily mean precipitation averaged over all stations obtained from in situ observation, EXP1 (with ensemble spread), and EXP2 for 2000–2021. Both EXP1 and EXP2 simulate the first precipitation peak (i.e., Changma) earlier than the peak in the observed data, which leads to an overestimation (underestimation) of the precipitation during June (July). Both simulations exhibit limited ability to capture the amount and timing of the second precipitation peak (i.e., post-Changma) (Fig. 8a). The MBEs in EXP1 (ensemble spread) for June, July, and August are 1.87 mm ∙ day−1 (1.21~2.67 mm ∙ day−1), −3.32 mm ∙ day−1 (−4.35~−2.18 mm ∙ day−1), and −4.52 mm ∙ day−1 (−4.81~−4.06 mm ∙ day−1), respectively, and that for the summer mean is −2.03 mm ∙ day−1 (−2.69~−1.29 mm ∙ day−1). The MBEs in EXP2 for June, July, and August are 3.24 mm ∙ day−1, −0.64 mm ∙ day−1, and −3.08 mm ∙ day−1, respectively, and that for the summer mean is −0.20 mm ∙ day−1 (Fig. 8b). These results show that EXP2 simulates more precipitation than EXP1 in all three months. As a result, the MBEs in EXP2 are reduced during July and August, leading to a smaller MBE for the entire summer than in EXP1, although the wet bias in June is larger.
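The monthly and summer mean MBEs quoted above are mutually consistent if the summer mean is taken as a day-weighted average of the three monthly values. The short check below assumes that weighting, which the text does not state explicitly; the names `DAYS` and `summer_mean` are hypothetical.

```python
DAYS = {"June": 30, "July": 31, "August": 31}

def summer_mean(monthly):
    """Day-weighted June-August mean of monthly values."""
    total_days = sum(DAYS.values())
    return sum(monthly[m] * DAYS[m] for m in DAYS) / total_days

# Monthly MBEs (mm/day) quoted in the text.
exp1_mbe = {"June": 1.87, "July": -3.32, "August": -4.52}
exp2_mbe = {"June": 3.24, "July": -0.64, "August": -3.08}

print(round(summer_mean(exp1_mbe), 2))  # -2.03
print(round(summer_mean(exp2_mbe), 2))  # -0.2
```

The near-zero summer mean bias of EXP2 therefore reflects a partial cancellation between its wet bias in June and its dry biases in July and August.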

Fig. 8

a Daily mean precipitation (unit: mm ∙ day−1) averaged over 72 weather stations derived from the ASOS (grey bars), EXP1 (blue line; blue shading indicates the ensemble spread), and EXP2 (red line) during 2000–2021 (JJA). The two vertical lines represent the starting date of July and August. b Same as a, but for the mean bias error of monthly precipitation

Figure 9 shows the time-latitude cross section of the 3-day moving average of daily precipitation zonally averaged from 124°E to 131°E. Both EXP1 and EXP2 precipitation data are interpolated onto the ERA5 grid using the bi-linear interpolation method to facilitate a comparison with ERA5. The observed precipitation peaks appear twice during summer (i.e., the Changma and post-Changma periods). The observed Changma rainband gradually advances northward into South Korea from mid-June to late July (Fig. 9a). Both EXP1 and EXP2 capture the northward march of the Changma rainband but overestimate the intensity north of 35°N during the onset phase. In particular, EXP1 tends to underestimate the precipitation intensity during the entire Changma period, whereas EXP2 is close to the observation. Although neither simulation captures the timing and intensity of the post-Changma phase, as shown in Fig. 8, the distribution in EXP2 is much closer to the observed pattern than that in EXP1 (Fig. 9b, c).
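The processing behind a time-latitude cross section of this kind can be sketched as a zonal average followed by a moving average in time. The example below is illustrative only: it uses a centered 3-day window with shortened windows at the two ends of the record, which is an assumption, and the function names are hypothetical.

```python
def moving_average(series, window=3):
    """Centered moving average; edge values use only the days available."""
    half = window // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def hovmoller(precip):
    """Turn precip[day][lat][lon] into a time-latitude array [day][lat].

    The longitude axis is assumed to be already restricted to 124°E-131°E.
    """
    # Zonal mean for every day and latitude.
    zonal = [[sum(row) / len(row) for row in day] for day in precip]
    n_lat = len(zonal[0])
    # Smooth each latitude band in time.
    smoothed_by_lat = [
        moving_average([day[j] for day in zonal]) for j in range(n_lat)
    ]
    # Transpose back so that the first index is time.
    return [list(col) for col in zip(*smoothed_by_lat)]

# Three days, one latitude, two longitudes (illustrative values, mm/day).
sample = [[[1.0, 3.0]], [[4.0, 6.0]], [[7.0, 9.0]]]
print(hovmoller(sample))  # [[3.5], [5.0], [6.5]]
```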

Fig. 9

Hovmöller diagram of the 3-day moving average of zonally (124°E to 131°E) averaged daily precipitation (unit: mm ∙ day−1) derived from a ERA5, b EXP1, and c EXP2 during 2000–2021 (JJA)

Figure 10a shows a time series of the summer mean precipitation averaged over all stations obtained from in situ observation, EXP1 (with ensemble spread), and EXP2. The TCC in EXP2 (0.45), which is significant at the 95% confidence level based on a two-sided Student’s t test, is higher than that in EXP1 (0.01). The RMSE decreases from 2.81 mm ∙ day−1 in EXP1 to 1.88 mm ∙ day−1 in EXP2, which is related mainly to the larger precipitation amounts simulated in EXP2 than in EXP1. In addition, the HR_AN, HR_NN, HR_BN, and HR_Total increase from 0.17, 0.33, 0.30, and 0.27 in EXP1 to 0.67, 0.50, 0.70, and 0.64 in EXP2, respectively (Fig. 10b). These results indicate that the prediction skills in EXP2 are better than those in EXP1.
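The significance statements for the TCC can be checked with the usual t statistic for a correlation coefficient, t = r sqrt(N − 2) / sqrt(1 − r²). The sketch below assumes N = 22 independent years and therefore 20 degrees of freedom, for which the two-sided 95% critical value of Student's t is 2.086; the function names are hypothetical.

```python
import math

N_YEARS = 22       # 2000-2021
T_CRIT_95 = 2.086  # two-sided 95% critical value of Student's t, 20 d.o.f.

def t_statistic(r, n=N_YEARS):
    """t statistic for testing a correlation coefficient against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def is_significant(r, n=N_YEARS, t_crit=T_CRIT_95):
    """True if |t| exceeds the critical value (t_crit must match n - 2)."""
    return abs(t_statistic(r, n)) > t_crit

for name, r in [("PNUv2.0", 0.20), ("EXP1", 0.01), ("EXP2", 0.45)]:
    print(name, round(t_statistic(r), 2), is_significant(r))
# PNUv2.0 0.91 False
# EXP1 0.04 False
# EXP2 2.25 True
```

Under these assumptions a correlation of roughly 0.42 or higher is needed for significance with 22 years of data, which is consistent with the TCC of 0.45 in EXP2 being significant while the TCC of 0.20 in PNUv2.0 is not.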

Fig. 10

a Time series of summer mean precipitation (unit: mm ∙ day−1) averaged over 72 weather stations derived from the ASOS (grey bars), EXP1 (blue line; blue shading indicates the ensemble spread), and EXP2 (red line) during 2000–2021 (JJA). b The same as a, but the hit rate derived from EXP1 (blue bars; blue lines indicate the ensemble spread) and EXP2 (orange bars)

For a more detailed analysis, Fig. 11 presents the spatial distributions of the skill scores and their averaged values. EXP1 and EXP2 show a similar spatial distribution of RMSE, with large RMSE over the two regions where high precipitation is observed, as shown in Fig. 5. The area-averaged RMSE in EXP2 is similar to that in EXP1 (Fig. 11a, b). Although the spatial details of the TCC and HR_Total differ somewhat between the two simulations, the area-averaged TCC and HR_Total in EXP2 are slightly higher than those in EXP1 (Fig. 11c–f). Generally, the ensemble mean (i.e., EXP1) provides a better prediction of precipitation than its individual EMs. These results suggest that EXP2 can provide comparable or better results than EXP1 and can be used as an alternative method for seasonal predictions on the regional scale because of the reduced time and costs of integration.

Fig. 11

Spatial distribution of a root mean square error (unit: mm ∙ day−1), c temporal correlation coefficients, and e hit rate derived from EXP1 during 2000–2021 (JJA). b, d, and f The same as a, c, and e, respectively, but for EXP2. The averaged values over 72 weather stations are shown in the upper-right corner above each panel

Yoshimura and Kanamitsu (2013) noted that using ensemble mean GCM fields as the initial and boundary conditions of an RCM may not be suitable for short-term forecasts because it may underestimate the variations of hydrological variables. In addition, Erfanian et al. (2017) suggested that the ensemble forcing approach, which derives the initial and boundary conditions of the RCM from the ensemble average of multiple GCMs, may be unsuitable for weather forecasts because it can smooth out the temporal variations of the individual GCMs. Unlike in the previous studies, however, the experiment using the ensemble mean fields (i.e., EXP2) simulates similar or slightly more precipitation than the conventional experiment (i.e., EXP1). These results may be due mainly to the convection-permitting model (CPM) simulations. The CPM no longer relies on convection parameterization schemes and has been shown to offer a more realistic representation of convection that is not captured at coarser resolutions (e.g., Ban et al. 2014; Berthou et al. 2020; Yun et al. 2020). In this study, the Kain-Fritsch convection scheme is used in the coarse domains (horizontal resolutions of 60 km and 12 km) but not in the innermost nested domain (horizontal resolution of 2.4 km). The EMM, which dampens the high-frequency variations in the wind fields, may have little impact on the precipitation simulation because convection parameterization schemes with trigger functions based on atmospheric variables (e.g., Betts and Miller 1986; Grell 1993; Kain 2004) are not applied to the innermost nested domain. In addition, the spread among the EMs of a single GCM is smaller than the spread among different GCMs, so averaging the EMs smooths the atmospheric variables less strongly. This is because the EMs of a single model share a similar systematic bias, defined as the difference in the mean state between the simulation and observation. The results of the present study suggest that a combination of the EMM and CPM may be useful in producing fine-scale precipitation for seasonal predictions.

4 Summary and conclusions

This study investigates the advantages of the EMM in regional-scale seasonal forecasting. For this purpose, two WRF experiments are carried out to obtain the simulated precipitation over South Korea from 2000 to 2021 (June to August). In the first experiment, five EMs are dynamically downscaled using the initial and lateral boundary conditions obtained from the output of each PNUv2.0 EM, and the simple composite method is applied to the results of each member for ensemble prediction. In the second experiment, the WRF integration is performed only once using the initial and lateral boundary conditions obtained by arithmetically averaging the outputs of the PNUv2.0 EMs. The data obtained from the first and second experiments are referred to as EXP1 and EXP2, respectively.

EXP2 produces precipitation amounts closer to the observation than EXP1. This improvement is attributed to the stronger simulated zonal wind from the middle to the upper atmosphere, which influences the VIMF_X and the convergence of VIMF. According to the moisture budget equation for an atmospheric column, proper convergence of VIMF can lead to reasonable precipitation. Both EXP1 and EXP2 simulate the Changma onset earlier than observed and show limited ability to capture the precipitation during the post-Changma period. On the other hand, compared to EXP1, the MBEs in EXP2 are reduced during July–August, leading to a smaller MBE for the entire summer period. In addition, EXP2 shows comparable or better performance than EXP1 in simulating the interannual variability of summer precipitation.

These results suggest that the EMM can be a potentially powerful tool because it can decrease the prediction time significantly by reducing the number of RCM integrations to one. Massive computing resources are needed for quasi-real-time seasonal ensemble predictions on a regional scale (below 3-km spatial resolution). The EMM can therefore be used as an alternative method for seasonal predictions on the regional scale because it reduces the time and costs of integration.