8.1 Introduction

Hydrologic forecasting is a crucial non-structural flood mitigation measure and provides an essential basis for flood warning, flood control and reservoir operation (Guo et al. 2004; Calvo and Savi 2009; Chen et al. 2014a; Zhang et al. 2015; Fan et al. 2016; Liu et al. 2017; Wu et al. 2017). Forecasting models that are widely used at present are typically deterministic, and model outputs are provided to users in the form of deterministic values (Chen and Yu 2007; Coccia and Todini 2011; Ma et al. 2013; Bergstrand et al. 2014; Li et al. 2014). However, a hydrological forecasting model is only a simulation of the real hydrological processes and is therefore imperfect and imprecise (Ravines et al. 2008; Wetterhall et al. 2013). These models take hydrological and meteorological inputs and rely on conceptualized model parameters; together, these complex factors inevitably introduce uncertainty into the hydrologic forecasts (Freer et al. 1996; Montanari 2007; Montanari and Grossi 2008; Renard et al. 2010; Chen et al. 2014b). The principle of rational decision-making under uncertainty indicates that when a deterministic forecast turns out to be wrong, the consequences will probably be worse than a situation where no forecast is available (Krzysztofowicz 1999; Wetterhall et al. 2013; Ramos et al. 2013). A rational decision maker who wants to make optimal decisions should therefore take forecast uncertainty explicitly into account (Verkade and Werner 2011; Ramos et al. 2013). Quantitative assessment of this inherent uncertainty is thus a critical issue, and hydrologic forecasting services are trending toward providing users with probabilistic forecasts in place of traditional deterministic forecasts.

The transition from a deterministic forecast to a probabilistic forecast is based on quantification of the uncertainty inherent in the deterministic forecast. The Bayesian Forecasting System (BFS) proposed by Krzysztofowicz (1999) provides a general framework to produce probabilistic forecasts via any deterministic hydrological model. Various probabilistic forecasting systems suited to different purposes have been developed within this framework (Reggiani and Weerts 2008; Calvo and Savi 2009; Biondi et al. 2010; Weerts et al. 2011; Sikorska et al. 2012; Pokhrel et al. 2013).

In the BFS, the total uncertainty is decomposed into input uncertainty and hydrological uncertainty. The hydrological uncertainty processor (HUP) is the component of the BFS that quantifies hydrological uncertainty and produces probabilistic forecasts under the hypothesis that there is no input uncertainty (Krzysztofowicz and Kelly 2000). Through Bayes’ theorem, the HUP combines a prior distribution, which describes the natural uncertainty about the realization of a hydrologic process, with a likelihood function, which quantifies the uncertainty in model forecasts, and outputs a posterior distribution conditional upon the deterministic forecasts. This posterior distribution provides a complete characterization of uncertainty, including quantiles, prediction intervals and probabilities of exceedance for specified thresholds, which are needed by rational decision makers and by information providers who want to extract forecast products for their customers.

The HUP can be implemented in many ways, as different mathematical models for prior distribution and likelihood function can be developed. Krzysztofowicz and Kelly (2000) introduced a meta-Gaussian HUP, which was developed by converting both original observations and model forecasts into a Gaussian space by using the Normal Quantile Transform (NQT). This meta-Gaussian HUP has been widely used by many researchers in the fields of hydrology and meteorology (Chen and Yu 2007; Biondi et al. 2010; Biondi and De Luca 2013; Chen et al. 2013a).

The prior density and likelihood function are conditional probability distributions. Copula functions have an outstanding capability to model joint distributions: they allow an arbitrary choice of marginal distribution (e.g., a non-Gaussian form) as well as nonlinear and heteroscedastic dependence structures, and the conditional probability distribution can be expressed in explicit form using a copula (Favre et al. 2004; Nelsen 2006; Zhang and Singh 2006, 2007a, b, c; Genest and Favre 2007; Bárdossy and Li 2008; Chen et al. 2010; Zhang et al. 2011, 2012). These advantageous characteristics motivate us to develop the prior distribution and likelihood function models in the original space directly, without transforming the data into Gaussian space. Liu et al. (2017) proposed a copula-based post-processor for deterministic forecast models that produces probabilistic forecasts within the general framework of the HUP.

Despite the tremendous amount of resources invested in developing hydrologic models, no one can convincingly claim that any particular model in existence today is superior to other models for all types of applications and under all conditions (Wu et al. 2015; Liu et al. 2016; Ba et al. 2017). Different models are capable of capturing different aspects of the hydrologic processes. The uncertainty of each model arises from parameter calibration, the design of the model structure, and input measurements, all of which exert an underlying, imprecisely known influence (Götzinger and Bárdossy 2008; Li et al. 2011; Hemri et al. 2015). One of the primary techniques to reflect different uncertainties in hydrological forecasts is to create an ensemble of forecast trajectories (Seo et al. 2006; Madadgar et al. 2014).

The Bayesian Model Averaging (BMA) method introduced by Raftery et al. (2005) is a statistical technique for combining the advantages of different models. Unlike other multi-model methods, BMA presents a more realistic description of predictive uncertainty, since the BMA predictive variance can be decomposed into two components: between-model variability and within-model variability (Ajami et al. 2007). BMA infers consensus predictions by weighting individual predictions according to their probabilistic likelihood measures, with better performing predictions receiving higher weights than worse performing ones. The method has been explored to improve both the accuracy and reliability of streamflow predictions (Vrugt and Robinson 2007; Liang et al. 2011). Duan et al. (2007) concluded that combining multi-model ensemble strategies within the BMA framework could quantify prediction uncertainty and significantly improve verification performance. Zhou et al. (2016) compared the mean prediction of BMA with its individual parameter transfer method (physical similarity approach) and demonstrated that the probabilistic predictions of BMA could reduce uncertainty to a significant degree. Nevertheless, the standard BMA method imposes many pseudo variation requirements, which hinders a precise understanding of data variations and has given rise to further theoretical development (Madadgar and Moradkhani 2014).

Klein et al. (2016) used a mixture of marginal density distributions to estimate the predictive uncertainty of hydrologic multi-model ensembles by means of pair-copula construction. Similar research shows that the copula technique is an effective tool for reflecting unclear and complex relationships, because it can flexibly adopt marginal distributions of arbitrary type instead of the Gaussian distribution (Carreau and Bouvier 2016; Khajehei and Moradkhani 2017). Building on the promising results of copula functions in post-processing hydrologic forecasts, Madadgar and Moradkhani (2014) first integrated copula functions with BMA to estimate the posterior distribution and found that Copula-BMA (CBMA) is an effective post-processor that relaxes any assumption on the form of the conditional probability density function (PDF). The CBMA not only displayed better deterministic skill than BMA but also confirmed the impact of the posterior distribution in calculating the weights of individual models by the EM algorithm. Results indicated that the predictive distributions are more accurate and reliable, and that the post-processed forecasts correlate better with observations after CBMA application. In meteorological applications, the CBMA method need not assume the shape of the posterior distribution, leaves out the data-transformation procedure, and yields predictive distributions that are less biased and sharper, with smaller uncertainty (Möller et al. 2013). Inspired by the ideas of Madadgar and Moradkhani (2014), a general framework combining a copula Bayesian processor with BMA (CBP-BMA) was proposed by He et al. (2018).

8.2 Hydrologic Uncertainty Processor Based on Copula Function

8.2.1 Hydrologic Uncertainty Processor

Let H be the predictand: the observed discharge whose realization h is being forecasted. Let estimator S be the output discharge generated by a corresponding deterministic forecast model, whose realization s constitutes a point estimate of H. Let random variable H0 represent the observed discharge at the time n = 0 when the forecast is prepared; then Hn (n = 1, 2, …, N) is the observed discharge at lead time n, and Sn (n = 1, 2, …, N) is the corresponding deterministic forecast discharge at lead time n. What the rational decision maker needs is not a single number sn, but the distribution function of the predictand Hn, conditional on H0 = h0 and Sn = sn. The purpose of the HUP is to supply such a conditional distribution function through Bayesian revision (Liu et al. 2017).

The Bayesian procedure for information revision of uncertainty involves two steps. First, the expected conditional density function of deterministic forecast discharge Sn given that H0 = h0 is derived via the total probability law:

$$\kappa_{n} (s_{n} | h_{0} ) = \int\limits_{ - \infty }^{ + \infty } {f_{n} (s_{n} | h_{0} ,h_{n} ) \cdot g_{n} (h_{n} | h_{0} )dh_{n} }$$
(8.1)

Second, the posterior density function of predictand Hn conditional on a deterministic forecast Sn = sn and observed discharge at the forecasting time H0 = h0, is derived via Bayes’ theorem (Krzysztofowicz and Kelly 2000):

$$\phi_{n} (h_{n} | h_{0} ,s_{n} ) = \frac{{f_{n} (s_{n} | h_{0} ,h_{n} ) \cdot g_{n} (h_{n} | h_{0} )}}{{\kappa_{n} (s_{n} | h_{0} )}}$$
(8.2)

In concept, Bayes’ theorem revises the prior density function gn(hn|h0), which characterizes the prior uncertainty about Hn given the observed discharge H0 = h0 at the forecasting time. The extent of the revision is determined by the likelihood function fn(sn|h0, hn), which characterizes the degree to which Sn = sn reduces the uncertainty about Hn. The result of this revision is the posterior density function ϕn(hn|h0, sn), which quantifies the uncertainty about Hn that remains after the deterministic forecast model generates forecast Sn = sn.

8.2.2 Meta-Gaussian HUP

As can be seen from Eq. 8.2, the posterior density depends on the prior density function and the likelihood function. The most widely used technique to describe them is the meta-Gaussian model. In this model, the NQT method (Bogner et al. 2012) is applied to convert both the actual flow Hn and the predicted flow Sn into Gaussian space. The transformed {hn|h0} and {sn|h0, hn} are then assumed to be linear and normally distributed, and linear regression is employed to determine the posterior density of Hn in the transformed Gaussian space, from which the posterior density function of Hn in the original space can be recovered. For the sake of the following comparison, the detailed procedure is presented as follows (Krzysztofowicz and Kelly 2000).

8.2.2.1 Normal Quantile Transform

The first step is to specify and determine the marginal distributions of the actual flows H0, Hn and the predicted flows {Sn: n = 1, …, N}. The actual flows {Hn: n = 0, 1, …, N} are considered as random variables. Given only such a record, there is usually no basis for assigning a probability distribution to flow Hn that differs from the distribution assigned to flow H0 for any n = 1, 2, …, N within a few days. In other words, there is no statistical difference between these 1 + N flow series (Koutsoyiannis and Montanari 2015). Therefore, the variables Hn are taken to follow the same marginal cumulative distribution function (CDF) as H0, and thus only the CDF of H0 needs to be fitted. The predicted flows {Sn: n = 1, …, N} are considered as distinct random variables, and a separate CDF needs to be fitted for each Sn.

Let \(\Gamma\) and \(\overline{\Lambda }_{n}\) be the CDFs of H0 and Sn with corresponding densities \(\gamma\) and \(\bar{\lambda }_{n}\), respectively. The NQT of a variate is defined as the composition of the inverse of the standard normal distribution Q with the CDF of the variate, which is assumed to be strictly increasing. The transformed variates are

$$W_{n} = Q^{ - 1} [\Gamma (H_{n} )],\quad n = 0,1, \ldots ,N$$
(8.3)
$$X_{n} = Q^{ - 1} [\overline{\Lambda }_{n} (S_{n} )],\quad n = 1, \ldots ,N$$
(8.4)

where Wn and Xn are the normal quantiles of Hn and Sn, respectively. Q−1 is the inverse function of Q.
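
To make the transform concrete, the sketch below implements an empirical NQT in Python (a minimal illustration, not the chapter's exact implementation). The Weibull plotting position i/(m + 1) is an assumption chosen here to keep probabilities strictly inside (0, 1); the parametric variant would use the fitted CDF \(\Gamma\) or \(\overline{\Lambda }_{n}\) instead.

```python
import numpy as np
from scipy.stats import norm, rankdata

def nqt(series):
    """Empirical Normal Quantile Transform (Eqs. 8.3-8.4), a minimal sketch.

    Ranks are turned into non-exceedance probabilities with the Weibull
    plotting position i/(m + 1), then mapped through Q^{-1} (norm.ppf).
    """
    m = len(series)
    p = rankdata(series) / (m + 1)   # probabilities strictly inside (0, 1)
    return norm.ppf(p)               # W = Q^{-1}[Gamma(H)]

# Hypothetical flows; w holds the transformed (Gaussian-space) values
h = np.array([4200.0, 5100.0, 4800.0, 7600.0, 6900.0, 5500.0])
w = nqt(h)
```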

8.2.2.2 Modeling in the Transformed Space

(1) Prior density

The model for the prior density rests on the assumption that the actual river discharge process in the transformed space is governed by the normal-linear equation.

$$W_{n} = cW_{n - 1} +\Xi$$
(8.5)

where c is a parameter and \(\Xi\) is a variate stochastically independent of Wn−1 and normally distributed with mean zero and variance 1 − c2. Consequently, the conditional mean and variance are

$$E(W_{n} |W_{n - 1} = w_{n - 1} ) = cw_{n - 1}$$
(8.6)
$$Var(W_{n} |W_{n - 1} = w_{n - 1} ) = 1 - c^{2}$$
(8.7)

The prior density for lead time n takes the form

$$g_{{Q_{n} }} (w_{n} |w_{0} ) = \frac{1}{{(1 - c^{2n} )^{1/2} }} \cdot q\left[ {\frac{{w_{n} - c^{n} w_{0} }}{{(1 - c^{2n} )^{1/2} }}} \right]$$
(8.8)

where q denotes the standard normal density and the subscript Qn denotes a density in the space of transformed variates.

(2) Likelihood function

The model for the likelihood function rests on the assumption that the stochastic dependence between the transformed variates is governed by the normal-linear equation

$$X_{n} = a_{n} W_{n} + d_{n} W_{0} + b_{n} +\Theta _{n}$$
(8.9)

where \(a_{n} ,b_{n}\) and \(d_{n}\) are parameters and \(\Theta _{n}\) is a variate stochastically independent of \((W_{n} ,W_{0} )\) and normally distributed with mean zero and variance \(\delta_{n}^{2}\). Consequently, the conditional mean and variance are

$$E(X_{n} |W_{n} = w_{n} ,W_{0} = w_{0} ) = a_{n} w_{n} + d_{n} w_{0} + b_{n}$$
(8.10)
$$Var(X_{n} |W_{n} = w_{n} ,W_{0} = w_{0} ) = \delta_{n}^{2}$$
(8.11)

The conditional density function is

$$f_{{Q_{n} }} (x_{n} |w_{n} ,w_{0} ) = \frac{1}{{\delta_{n} }} \cdot q\left( {\frac{{x_{n} - a_{n} w_{n} - d_{n} w_{0} - b_{n} }}{{\delta_{n} }}} \right)$$
(8.12)
(3) Posterior density

The posterior density derived from the prior density and likelihood function takes the form as follows

$$\varphi_{{Q_{n} }} (w_{n} |x_{n} ,w_{0} ) = \frac{1}{{T_{n} }}q\left( {\frac{{w_{n} - A_{n} x_{n} - D_{n} w_{0} - B_{n} }}{{T_{n} }}} \right)$$
(8.13)

in which \(A_{n} = \frac{{a_{n} t_{n}^{2} }}{{a_{n}^{2} t_{n}^{2} + \delta_{n}^{2} }},B_{n} = \frac{{ - a_{n} b_{n} t_{n}^{2} }}{{a_{n}^{2} t_{n}^{2} + \delta_{n}^{2} }}\), \(D_{n} = \frac{{c^{n} \delta_{n}^{2} - a_{n} d_{n} t_{n}^{2} }}{{a_{n}^{2} t_{n}^{2} + \delta_{n}^{2} }}\), \(T_{n}^{2} = \frac{{t_{n}^{2} \delta_{n}^{2} }}{{a_{n}^{2} t_{n}^{2} + \delta_{n}^{2} }}\), and \(t_{n}^{2} = 1 - c^{2n}\).
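
As an illustration of how the posterior parameters follow from the likelihood regression, the sketch below is a direct transcription of the coefficient formulas above; the parameter values in the example are hypothetical.

```python
import numpy as np

def posterior_coefficients(a_n, b_n, d_n, delta_n, c, n):
    """Coefficients A_n, B_n, D_n and spread T_n of Eq. 8.13."""
    t2 = 1.0 - c ** (2 * n)                 # t_n^2 = 1 - c^{2n}
    denom = a_n ** 2 * t2 + delta_n ** 2
    A = a_n * t2 / denom
    B = -a_n * b_n * t2 / denom
    D = (c ** n * delta_n ** 2 - a_n * d_n * t2) / denom
    T = np.sqrt(t2 * delta_n ** 2 / denom)  # posterior standard deviation
    return A, B, D, T

# Hypothetical regression estimates for a 1-day lead time
A1, B1, D1, T1 = posterior_coefficients(a_n=0.9, b_n=0.05, d_n=0.1,
                                        delta_n=0.3, c=0.95, n=1)
```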

8.2.2.3 Posterior Density and Distribution in the Original Space

With all densities in the transformed space belonging to the Gaussian family, all densities in the original space belong to the meta-Gaussian family. The meta-Gaussian posterior density of actual river discharge, conditional on model output discharge Sn = sn and observed river discharge H0 = h0, takes the form

$$\begin{aligned} \phi_{n} (h_{n} |s_{n} ,h_{0} ) & = \frac{{\gamma (h_{n} )}}{{T_{n} \cdot q\left\{ {Q^{ - 1} [\Gamma (h_{n} )]} \right\}}} \\ & \quad \cdot q\left\{ {\frac{{Q^{ - 1} [\Gamma (h_{n} )] - A_{n} \cdot Q^{ - 1} [\overline{\Lambda }_{n} (s_{n} )] - D_{n} \cdot Q^{ - 1} [\Gamma (h_{0} )] - B_{n} }}{{T_{n} }}} \right\} \\ \end{aligned}$$
(8.14)

The corresponding meta-Gaussian posterior distribution takes the form

$$\Phi _{n} (h_{n} |s_{n} ,h_{0} ) = Q\left\{ {\frac{{Q^{ - 1} [\Gamma (h_{n} )] - A_{n} \cdot Q^{ - 1} [\overline{\Lambda }_{n} (s_{n} )] - D_{n} \cdot Q^{ - 1} [\Gamma (h_{0} )] - B_{n} }}{{T_{n} }}} \right\}$$
(8.15)

8.2.3 Copula-Based HUP

The copula function is an effective tool for developing the prior distribution and likelihood function models, in which the predictand and the deterministic forecasts are allowed to have distribution functions of any form, along with nonlinear and heteroscedastic dependence structures. Therefore, the HUP can be implemented in the original space directly, without a data transformation into Gaussian space. Copula functions and their theory are introduced in detail in Chap. 2.

8.2.3.1 Prior Density

The prior CDF of Hn given H0 = h0 can be expressed as

$$G_{n} (h_{n} |h_{0} ) = P(H_{n} \le h_{n} |H_{0} = h_{0} )$$
(8.16)

where Gn(hn|h0) is the conditional CDF, and P is the non-exceedance probability.

The prior density function gn(hn|h0) is the probability density function (PDF) corresponding to Gn(hn|h0) and can be defined as

$$g_{n} (h_{n} |h_{0} ) = \frac{{dG_{n} (h_{n} |h_{0} )}}{{dh_{n} }}$$
(8.17)

Let H0 and Hn be random variables with marginal CDFs U1 = FH(H0) and U2 = FH(Hn). Then U1 and U2 are uniformly distributed random variables, with u1 and u2 denoting specific values of U1 and U2, respectively.

Using the copula function, the joint CDF is expressed by \(G_{n} (h_{n} ,h_{0} ) = C(F_{H} (h_{0} ),F_{H} (h_{n} )) = C(u_{1} ,u_{2} )\).

The conditional CDF Gn(hn|h0) and PDF gn(hn|h0) can be rewritten as follows (Zhang and Singh 2006)

$$G_{n} (h_{n} |h_{0} ) = P(U_{2} \le u_{2} |U_{1} = u_{1} ) = \frac{{\partial C(u_{1} ,u_{2} )}}{{\partial u_{1} }}$$
(8.18)
$$g_{n} (h_{n} |h_{0} ) = \frac{{\partial^{2} C(u_{1} ,u_{2} )}}{{\partial u_{1} \partial u_{2} }} \cdot \frac{{du_{2} }}{{dh_{n} }} = c(u_{1} ,u_{2} ) \cdot f_{H} (h_{n} )$$
(8.19)

where \(c(u_{1} ,u_{2} )\) is the density function of \(C(u_{1} ,u_{2} )\), and \(c(u_{1} ,u_{2} ) = \partial^{2} C(u_{1} ,u_{2} )/\partial u_{1} \partial u_{2}\); \(f_{H} (h_{n} )\) is the PDF of \(H_{n}\). Equation 8.19 is the expression of the prior PDF.
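
A minimal sketch of Eq. 8.19 in Python follows, assuming (for illustration only) a Frank copula for the pair (H0, Hn) and a Gumbel marginal, both of which the case study below does in fact select; the parameter values are hypothetical.

```python
import numpy as np
from scipy.stats import gumbel_r

def frank_density(u, v, theta):
    """Density c(u, v) of the bivariate Frank copula."""
    num = theta * (1.0 - np.exp(-theta)) * np.exp(-theta * (u + v))
    den = ((1.0 - np.exp(-theta))
           - (1.0 - np.exp(-theta * u)) * (1.0 - np.exp(-theta * v))) ** 2
    return num / den

def prior_pdf(h_n, h_0, marg, theta):
    """Prior density g_n(h_n | h_0) of Eq. 8.19: c(u1, u2) * f_H(h_n)."""
    u1, u2 = marg.cdf(h_0), marg.cdf(h_n)
    return frank_density(u1, u2, theta) * marg.pdf(h_n)

marg = gumbel_r(loc=5000.0, scale=1500.0)     # hypothetical marginal of H
g = prior_pdf(h_n=6200.0, h_0=5800.0, marg=marg, theta=8.0)
```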

8.2.3.2 Likelihood Function

It is considered that Sn is a random variable with marginal CDF \(u_{3} = F_{{S_{n} }} (s_{n} )\) and PDF \(f_{{S_{n} }} (s_{n} )\). The conditional CDF of Sn given H0 = h0 and Hn = hn can be expressed as

$$F_{n} (s_{n} |h_{0} ,h_{n} ) = P(S_{n} \le s_{n} |H_{0} = h_{0} ,H_{n} = h_{n} )$$
(8.20)

where \(F_{n} (s_{n} |h_{0} ,h_{n} )\) is the conditional CDF.

The corresponding PDF of \(F_{n} (s_{n} |h_{0} ,h_{n} )\) is defined as

$$f_{n} (s_{n} |h_{0} ,h_{n} ) = \frac{{dF_{n} (s_{n} |h_{0} ,h_{n} )}}{{ds_{n} }}$$
(8.21)

Using the copula function, the joint CDF of H0, Hn and Sn, denoted Fn(h0, hn, sn), can be expressed as \(F_{n} \left( {h_{0} ,h_{n} ,s_{n} } \right) = C(F_{{H_{0} }} (h_{0} ),F_{{H_{n} }} (h_{n} ),F_{{S_{n} }} (s_{n} )) = C(u_{1} ,u_{2} ,u_{3} )\). Thus, the conditional CDF \(F_{n} \left( {s_{n} |h_{0} ,h_{n} } \right)\) and PDF \(f_{n} \left( {s_{n} |h_{0} ,h_{n} } \right)\) are rewritten as follows (Zhang and Singh 2007c)

$$F_{n} \left( {s_{n} |h_{0} ,h_{n} } \right) = P(U_{3} \le u_{3} |U_{1} = u_{1} ,U_{2} = u_{2} ) = \frac{{\partial^{2} C(u_{1} ,u_{2} ,u_{3} )/\partial u_{1} \partial u_{2} }}{{c(u_{1} ,u_{2} )}}$$
(8.22)
$$f_{n} \left( {s_{n} |h_{0} ,h_{n} } \right) = \frac{1}{{c(u_{1} ,u_{2} )}} \cdot \frac{{\partial^{3} C(u_{1} ,u_{2} ,u_{3} )}}{{\partial u_{1} \partial u_{2} \partial u_{3} }} \cdot \frac{{du_{3} }}{{ds_{n} }} = \frac{{c(u_{1} ,u_{2} ,u_{3} )}}{{c(u_{1} ,u_{2} )}} \cdot f_{{S_{n} }} (s_{n} )$$
(8.23)

where \(c(u_{1} ,u_{2} ,u_{3} ) = \partial^{3} C(u_{1} ,u_{2} ,u_{3} )/\partial u_{1} \partial u_{2} \partial u_{3}\) is the density function of C(u1, u2, u3). From another point of view, given H0 = h0 and Sn = sn, the likelihood function of Hn can be calculated by Eq. 8.23.

8.2.3.3 Posterior Density

Substituting Eqs. 8.19 and 8.23 into Eqs. 8.1 and 8.2, the posterior density function of Hn can be rewritten as follows

$$\phi_{n} (h_{n} |h_{0} ,s_{n} ) = \frac{{c(u_{1} ,u_{2} ,u_{3} )}}{{\int_{0}^{1} {c(u_{1} ,u_{2} ,u_{3} )du_{2} } }} \cdot f_{H} (h_{n} )$$
(8.24)

For fixed realizations \(H_{0} = h_{0}\) and \(S_{n} = s_{n}\), \(u_{1}\) and \(u_{3}\) are constants, while \(u_{2}\) varies from 0 to 1. Since the denominator \(\int_{0}^{1} {c(u_{1} ,u_{2} ,u_{3} )du_{2} }\) cannot be obtained directly by analytic methods, the Monte Carlo sampling technique (Yu et al. 2014; Xiong et al. 2014) is applied in the following steps: (1) generate M random numbers \(u_{2}\) from the uniform distribution U(0, 1); (2) compute the values of the copula density c(u1, u2, u3); (3) the mean of the M computed values of c(u1, u2, u3) approximates the definite integral \(\int_{0}^{1} {c(u_{1} ,u_{2} ,u_{3} )du_{2} }\) (Robert and Casella 2013; Kroese et al. 2013). Subsequently, the posterior density function \(\phi_{n} (h_{n} |h_{0} ,s_{n} )\) can be estimated.
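
The three steps above translate directly into code. The sketch below estimates the normalizing integral for a user-supplied trivariate copula density c3 (an assumed callable, since the fitted copula depends on the case study).

```python
import numpy as np

rng = np.random.default_rng(42)

def mc_denominator(c3, u1, u3, M=10_000):
    """Monte Carlo estimate of the denominator in Eq. 8.24.

    c3(u1, u2, u3) is the trivariate copula density; u1 and u3 are fixed
    by H0 = h0 and Sn = sn, while u2 is integrated out over (0, 1).
    """
    u2 = rng.uniform(0.0, 1.0, size=M)   # step (1): M uniform draws
    vals = c3(u1, u2, u3)                # step (2): evaluate the density
    return vals.mean()                   # step (3): sample mean ~ integral
```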

8.2.3.4 Candidate Marginal Distributions and Trivariate Copulas

The main purpose of this study is to extrapolate extreme events far beyond the observations. The probability distribution of daily flows is related to the flow duration curve, which summarizes flow variability at a site and is interpreted as the relationship between any discharge value and the percentage of time that this discharge is equaled or exceeded during a given period (Vogel and Fennessey 1994; Castellarin et al. 2004; Shao et al. 2009). The flow-duration curve has been widely used by engineers and hydrologists around the world in numerous applications, such as hydropower generation, inflow forecasting, and the design of irrigation systems (Vogel and Fennessey 1995; Yokoo and Sivapalan 2011; Gottschalk et al. 2013).

Even though a flow-duration curve can be defined and constructed for different time scales, such as daily, weekly or monthly streamflows, this study focuses on the daily flow-duration curve. If daily streamflow is treated as a random variable, the flow-duration curve may be viewed as the complement of the cumulative distribution function used in hydrologic frequency analysis, with the percentage of time identified with probability (Castellarin et al. 2004). As a consequence, the flow-duration curve is also a very practical tool for describing hydrological regimes and represents the relationship between the magnitude and frequency of flow (Vogel and Fennessey 1995; Liucci et al. 2014; Xiong et al. 2015).

Six distributions commonly used in hydrology, namely Normal, Gamma, Gumbel, Pearson type III (P-III), Log-Normal and Log-Weibull, are selected as candidate models for H0 and Sn (n = 1, …, N). These univariate probability distributions are summarized in Table 1.1 of Chap. 1. The L-moment method is used to estimate the distribution parameters for the given data series (Hosking 1990). The Kolmogorov-Smirnov statistic D is used to measure the goodness of fit between the hypothesized distribution and the empirical distribution (Tsai et al. 2001; Arya et al. 2010). In this study, the 95% confidence level is used to reject or accept a fitted distribution, and the distribution providing the minimum D value is chosen as the best fitting distribution.
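
As a sketch of the fitting-and-testing step (hedged: scipy estimates parameters by maximum likelihood, whereas the chapter uses the L-moment method, and the synthetic sample is for illustration only):

```python
import numpy as np
from scipy.stats import gumbel_r, kstest

rng = np.random.default_rng(0)
flows = gumbel_r.rvs(loc=5000.0, scale=1500.0, size=200, random_state=rng)

# Fit the Gumbel distribution (maximum likelihood here, L-moments in the text)
loc, scale = gumbel_r.fit(flows)

# Kolmogorov-Smirnov statistic D against the fitted distribution
D, p_value = kstest(flows, "gumbel_r", args=(loc, scale))
print(f"D = {D:.4f}, p = {p_value:.3f}")
```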

To estimate the posterior density functions expressed in Eq. 8.24, the three-dimensional joint distributions of H0, Hn and Sn need to be constructed. Symmetric copulas are not considered because the dependence among the three variable pairs (H0, Hn), (H0, Sn) and (Hn, Sn) is not the same, which will be tested against data in the case study. Hence, three widely used asymmetric trivariate Archimedean copulas, namely Gumbel-Hougaard, Frank and Clayton, are taken as candidates. These three trivariate Archimedean copulas are described in Table 2.2 of Chap. 2. Dependence parameters of the trivariate copula functions are estimated using the maximum pseudo-likelihood method (Zhang and Singh 2007b, c; Chen et al. 2010). The RMSE is used to measure the goodness of fit of the copula distribution (Zhang and Singh 2007a), and the copula with the smallest RMSE value is preferred.

8.2.4 Evaluation Criteria

8.2.4.1 Performances of Deterministic Forecasts

Two widely applied criteria, namely Nash-Sutcliffe efficiency (NSE) and Relative Error (RE) are adopted to evaluate the performance of the deterministic forecast model (Xiong and Guo 1999; Liu et al. 2016).

(1) Nash-Sutcliffe efficiency

The first criterion is the Nash-Sutcliffe efficiency (NSE) coefficient (Nash and Sutcliffe 1970) which is defined by

$$NSE = \left[ {1 - \frac{{\sum\nolimits_{t = 1}^{T} {(h_{t} - s_{t} )^{2} } }}{{\sum\nolimits_{t = 1}^{T} {(h_{t} - \bar{h})^{2} } }}} \right] \times 100{\% }$$
(8.25)

where t is the time step, T is the total number of time steps, \(h_{t}\) and \(s_{t}\) are the observed and simulated discharges at time t, and \(\bar{h}\) is the mean of the observed discharge. The Nash-Sutcliffe efficiency can range from −∞ to 1. An efficiency of 1 (NSE = 1) corresponds to a perfect match of simulated discharge to the observed data. An efficiency of 0 (NSE = 0) indicates that the model predictions are as accurate as the mean of the observed data, whereas an efficiency less than zero (NSE < 0) occurs when the observed mean is a better predictor than the model. Essentially, the closer the model efficiency is to 1, the more accurate the model is.

(2) Relative error

The second criterion used is the relative error (RE) of the total runoff amount fit between the observed and simulated discharge series, defined as (Xiong and Guo 1999)

$$RE = \left[ {\frac{{\sum\nolimits_{t = 1}^{T} {(s_{t} - h_{t} )} }}{{\sum\nolimits_{t = 1}^{T} {h_{t} } }}} \right] \times 100{\% }$$
(8.26)

RE represents the systematic error of the water balance simulation; a value of RE close to zero indicates good agreement between observed and simulated runoff volumes. In this study, NSE is ranked as the primary criterion, while RE is an auxiliary criterion: only among simulated discharge series yielding the same (or higher) NSE value is the one with the smaller RE value preferred; otherwise, a smaller RE value does not by itself reveal any superiority (Liu et al. 2016). For instance, a model with all simulated discharges equal to the mean of the observed values trivially yields RE = 0; unfortunately, in this case NSE = 0, which clearly indicates an undesirable simulation.
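
Both criteria are straightforward to compute; a minimal sketch follows, with hypothetical data and with RE written in the volume-error form discussed above.

```python
import numpy as np

def nse(h, s):
    """Nash-Sutcliffe efficiency (Eq. 8.25), in percent."""
    h, s = np.asarray(h), np.asarray(s)
    return (1.0 - np.sum((h - s) ** 2) / np.sum((h - h.mean()) ** 2)) * 100.0

def re(h, s):
    """Relative error of total runoff volume (Eq. 8.26), in percent."""
    h, s = np.asarray(h), np.asarray(s)
    return (s.sum() - h.sum()) / h.sum() * 100.0

h = np.array([4200.0, 5100.0, 4800.0, 7600.0, 6900.0])   # observed
s = np.array([4000.0, 5300.0, 4900.0, 7200.0, 7100.0])   # simulated
print(f"NSE = {nse(h, s):.2f}%, RE = {re(h, s):.2f}%")
```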

8.2.4.2 Performances of Probabilistic Forecasts

A probabilistic forecast technique is expected to provide (a) accurate forecast probabilities, hereafter termed reliability; and (b) narrow forecast intervals, hereafter termed resolution. Several methods, e.g., the predictive quantile-quantile (QQ) plot, the α-index and the π-index, have been proposed in the literature to evaluate probabilistic forecasts (see e.g. Gneiting et al. 2007; Laio and Tamea 2007; Thyer et al. 2009; Engeland et al. 2010; Renard et al. 2010; Madadgar et al. 2014; Smith et al. 2015) and are used in this study.

(1) Predictive QQ plot

The predictive quantile-quantile (QQ) plot provides an overall assessment of whether the total predictive uncertainty is consistent with the observations. This requires a diagnostic approach that compares a time-varying distribution (the predictive distribution at all times t) to a time series of observations (Thyer et al. 2009; Evin et al. 2014). The predictive QQ plot provides a simple, intuitive and informative summary of the performance of probabilistic prediction frameworks (Gneiting et al. 2007; Laio and Tamea 2007).

The predictive QQ plot is constructed as follows: Let Ft be the CDF of the predictive distribution of runoff at time t, and ht the corresponding observed runoff. If the hypotheses in the calibration framework are consistent with the data, the observed value ht should be consistent with the distribution Ft. Hence, under the assumption that the observation ht is a realization of the predictive distribution, the p-value Ft(ht) is a realization of a uniform distribution on [0,1]. The predictive QQ plot compares the empirical CDF of the sample of p values Ft(ht) (t = 1, …, T) with the CDF of a uniform distribution to assess whether the hypotheses are consistent with the observations.

As illustrated in Fig. 8.1, the predictive QQ plot can be interpreted as follows (Thyer et al. 2009): (1) if all points fall on the 1:1 line, the predicted distribution agrees perfectly with the observations; (2) If the observed p values cluster around the mid-range (i.e., a low slope around theoretical quantile 0.4–0.6), the predictive uncertainty is overestimated; (3) If the observed p values cluster around the tails (i.e., a high slope around theoretical quantile 0.4–0.6), the predictive uncertainty is underestimated; (4) If the observed p values at the theoretical median are higher/lower than the theoretical quantiles, the modeled predictions systematically under/over predict the observed data.

Fig. 8.1 Interpretation of the predictive QQ plot

Supportive quantitative scores can be derived from the predictive QQ plot (Laio and Tamea 2007; Thyer et al. 2009; Madadgar et al. 2014). The α-index assesses the reliability of forecasts, and the π-index indicates the resolution (precision, sharpness) of the predictive distribution (PD).

(2) Reliability

Reliability means that the forecast should be well calibrated. This can be checked graphically: deviations from the bisector (the 1:1 line) denote deficiencies (see Fig. 8.1). To simplify the comparison of QQ plots, each is summarized by an index that quantifies the reliability of the PD (Renard et al. 2010; Madadgar et al. 2014):

$$\alpha {\text{-index}} = 1 - \frac{2}{T}\sum\limits_{t = 1}^{T} {\left| {q^{em} (p_{t} ) - q^{th} (p_{t} )} \right|}$$
(8.27)

where \(p_{t}\) is the observed p-value at time t; \(q^{em} (p_{t} )\) is the empirical quantile of \(p_{t}\); \(q^{th} (p_{t} )\) is the theoretical quantile of \(p_{t}\) obtained from the uniform distribution U[0, 1]; and T is the number of \(p_{t}\) values.

The α-index measures the closeness of quantile plot of the observations to the corresponding uniform quantiles and reflects the overall reliability of the PD. According to Thyer et al. (2009), as the area between the empirical CDF of the observed p-values and the CDF of the uniform distribution in the predictive QQ plot becomes larger, the value of α-index decreases towards zero. It varies between 0 (worst reliability) and 1 (perfect reliability).
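
A minimal sketch of the α-index computation follows; using the Weibull plotting position for the theoretical uniform quantiles is an assumption of this illustration.

```python
import numpy as np

def alpha_index(p_values):
    """Reliability index of Eq. 8.27 from the predictive p-values F_t(h_t)."""
    p = np.sort(np.asarray(p_values))        # empirical quantiles q_em
    T = len(p)
    q_th = np.arange(1, T + 1) / (T + 1)     # theoretical U[0, 1] quantiles
    return 1.0 - 2.0 / T * np.sum(np.abs(p - q_th))

# p-values from a well-calibrated forecast are ~uniform, so alpha ~ 1
rng = np.random.default_rng(1)
print(alpha_index(rng.uniform(size=500)))
```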

(3) Resolution

“Resolution” denotes the sharpness (effectively, the average precision) of the PD. Note that two inferences can both yield reliable PDs but with different resolutions. Sharpness refers to the spread of the forecast PDFs and is a property of the predictions only: the more concentrated the forecast PDF, the sharper the forecast, and the sharper the better, subject to calibration (Gneiting et al. 2005). In this chapter, resolution is quantified by the π-index, defined as the average relative precision of the predictions (Renard et al. 2010; Madadgar et al. 2014):

$$\pi {\text{-index}} = \frac{1}{T}\sum\limits_{t = 1}^{T} {\frac{{E[H_{t} ]}}{{Sdev[H_{t} ]}}}$$
(8.28)

where E[Ht] and Sdev[Ht] are the expected value and standard deviation of Ht obtained from the predictive distribution at time t.

A greater π-index value indicates greater resolution (lower uncertainty) of the forecasts. However, comparing sharpness may not be meaningful when the methods do not perform comparably on the α-index metric. Assuming that precision has lower priority than reliability, the method with greater resolution (lower uncertainty) is preferred only given similar forecast reliability; otherwise, higher resolution does not by itself reveal any superiority. Most of the literature ranks reliability as the primary criterion, with sharpness secondary to reliability (Madadgar et al. 2014).

(4) Continuous rank probability score

The goal of probabilistic forecasting is to maximize the sharpness of the forecast PDFs subject to calibration. However, the trade-off between reliability and sharpness has been discussed in previous research (Xiong et al. 2009; Li et al. 2010a; Kasiviswanathan et al. 2013), which shows that these two desirable objectives cannot be achieved simultaneously. It is not adequate to judge the performance of probabilistic forecasts by reliability or sharpness alone. The continuous rank probability score (CRPS) is a standard measure that combines reliability and sharpness (Hersbach 2000; Gneiting et al. 2005) and is used for selecting the preferred model.

The CRPS measures the average distance between the predicted and the observed CDFs over the entire period. It is the integral of the Brier scores at all possible threshold values r for the continuous predictand (Hersbach 2000). Specifically, if F is the predictive CDF and ht is the verifying observation, the CRPS is defined as (Hersbach 2000; Gneiting et al. 2007; Pappenberger et al. 2015)

$$CRPS = \frac{1}{T}\sum\limits_{t = 1}^{T} {\int\limits_{ - \infty }^{ + \infty } {\left[ {F_{t} (r) - H_{s} (r - h_{t} )} \right]^{2} dr} }$$
(8.29)

where Hs(r − ht) denotes the Heaviside step function, taking the value 0 when r < ht and the value 1 otherwise.

For a deterministic forecast system, the CRPS reduces to the mean absolute error (MAE). Thus, the CRPS is sometimes interpreted as a generalized version of the MAE (Zhao et al. 2015), which allows the comparison of deterministic and probabilistic forecasts (Gneiting et al. 2007; Pappenberger et al. 2015). The smaller the CRPS value, the better the prediction performance; its minimal value of zero is achieved only in the case of a perfect deterministic forecast.
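
When the predictive distribution is represented by a sample (as with the Monte Carlo procedures in this chapter), the CRPS for one time step can be computed from the standard identity CRPS = E|X − h| − ½E|X − X′|; a sketch with hypothetical data:

```python
import numpy as np

def crps_sample(ensemble, obs):
    """Sample-based CRPS for one time step: E|X - h| - 0.5 * E|X - X'|."""
    x = np.asarray(ensemble)
    term1 = np.mean(np.abs(x - obs))
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return term1 - term2

# Hypothetical record: 50 observations, 200 predictive samples each
rng = np.random.default_rng(2)
obs = rng.gamma(2.0, 2000.0, size=50)
ens = obs[:, None] + rng.normal(0.0, 500.0, size=(50, 200))
print(np.mean([crps_sample(e, o) for e, o in zip(ens, obs)]))   # Eq. 8.29
```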

8.2.5 Case Studies

8.2.5.1 Study Area and Data

The Three Gorges Reservoir (TGR) is a vitally important backbone project in the development and harnessing of the Yangtze River in China. The annual average discharge and runoff volume at the dam site are 14,300 m3/s and 4510 × 108 m3, respectively. The total storage capacity of the TGR is 393 × 108 m3, of which 221.5 × 108 m3 is flood control storage. The reservoir has a surface area of about 1080 km2, an average width of about 1100 m, a mean depth of about 70 m and a maximum depth near the dam of about 170 m. With all profiles being narrow and deep, the TGR retains the long narrow belt shape of the original river section and is a typical river channel-type reservoir.

As shown in Fig. 8.2, the intervening basin of the TGR has a catchment area of 55,907 km2, about 5.6% of the upstream Yangtze River basin. There are 40 rainfall gauging stations in the intervening basin and two hydrological stations (Cuntan and Wulong), which control the upstream inflow and tributary inflow, respectively. The data set for TGR inflow forecasting includes the daily runoff data of the Cuntan, Wulong and Yichang hydrological stations and the arithmetic mean of observed rainfall in the intervening basin during the flood period (June 1–September 30) from 2003 to 2009. The period 2003–2007 is used for deterministic forecast model calibration and 2008–2009 for validation (Li et al. 2010b; Chen et al. 2015).

Fig. 8.2 Sketch map of the TGR’s intervening basin

8.2.5.2 Deterministic Inflow Forecasts of the TGR

The inflow of TGR consists of three components, i.e., the main upstream inflow, the tributary inflow from the Wu River, and the lateral flow from the TGR intervening basin as shown in Fig. 8.2. A multiple-input single-output linear systematic model is chosen for the inflow forecasting of the TGR (Liang et al. 1992). The total inflow to the TGR can be expressed by the following equation

$$\widehat{Q}_{t} = A\sum\limits_{j = 1}^{{m_{1} }} {R_{t - j + 1}^{(1)} h_{j}^{(1)} } + \sum\limits_{j = 1}^{{m_{2} }} {R_{t - j + 1}^{(2)} h_{j}^{(2)} }$$
(8.30)

where \(R_{j}^{(1)}\) is the lateral flow from the TGR intervening basin, calculated via the Xinanjiang model (Zhao 1992); \(R_{j}^{(2)}\) is the upstream inflow (the inflow at Wulong added to the inflow at Cuntan); A is the area of the TGR intervening basin; \(m_{1}\) and \(m_{2}\) are the memory lengths of the system corresponding to \(R_{j}^{(1)}\) and \(R_{j}^{(2)}\); and \(h_{j}^{(1)}\) and \(h_{j}^{(2)}\) are the jth ordinates of the pulse response functions relating the inputs \(R_{j}^{(1)}\) and \(R_{j}^{(2)}\), calculated by the Nash model as follows

$$h_{j}^{(i)} = \frac{1}{T}\int\limits_{(j - 1)T}^{jT} {\left[ {S_{i} (t) - S_{i} (t - T)} \right]/T\,dt} \quad (i = 1,2)$$
(8.31)
$$S_{i} (t) = \int\limits_{0}^{t} {\frac{1}{{NK_{i}\Gamma (N_{i} )}}e^{{ - (\tau /NK_{i} )}} \left( {\tau /NK_{i} } \right)^{{N_{i} - 1}} d\tau } \quad (i = 1,2)$$
(8.32)

where Si(t) is the step response function of the ith input, Ni and NKi are the parameters, and T is the time-step. \(\Gamma ( \bullet )\) is the gamma function.

The Xinanjiang model was developed in the mid-1970s for forecasting flows into the Xinanjiang reservoir, China. The model has been widely applied for flood forecasting in a large number of basins all over the world, especially in China, where it remains the most popular rainfall-runoff model for streamflow forecasting in humid and semi-humid areas. Its main feature is the concept of runoff formation on repletion of storage, which means that runoff is not produced until the soil moisture content of the aeration zone reaches field capacity (Zhao 1992; Xu et al. 2013). The Xinanjiang model comprises two components, runoff generation and runoff routing, and has 17 parameters: seven runoff generation parameters and ten runoff routing parameters. These parameters are abstract conceptual representations of non-measurable watershed characteristics that have to be calibrated by an optimization method. Figure 8.3 shows the flowchart of the Xinanjiang model for three water sources. All symbols inside the blocks are variables, including inputs, outputs, state variables and internal variables, while those outside the blocks are parameters (Zhao 1992; Cheng et al. 2006; Li et al. 2011; Lin et al. 2014; Si et al. 2015).

Fig. 8.3 The flow chart of the Xinanjiang model for three water sources

The deterministic forecast model is calibrated by taking NSE and RE as objective functions via automatic multi-objective calibration (Madsen 2000). Table 8.1 presents the calibrated parameters of the Xinanjiang model for the TGR intervening basin and of the multiple-input single-output linear systematic model for the TGR. The NSE and RE in the calibration period are 97.72% and −1.04%, respectively; in the verification period they are 95.84% and −0.21%. These results show that the deterministic forecast model is quite efficient in simulating the TGR inflow series. The deterministic forecasts obtained from the well-calibrated model are subsequently used to produce probabilistic forecasts through the meta-Gaussian HUP and the copula-based HUP.

Table 8.1 Estimated parameters of the Xinanjiang model in the TGR intervening basin

8.2.5.3 Determination of Marginal Distributions

In this study, future rainfall is treated as perfectly known, rather than using real rainfall forecasts, when the established deterministic forecast model is operated in real-time forecasting mode; this is for illustration purposes only, and in practice available forecast rainfalls would be used. The forecast lead times are 24 h (n = 1), 48 h (n = 2), and 72 h (n = 3). Specifically, for each forecasting time in the record, the rainfall recorded 24, 48 and 72 h after that forecasting time is treated as the “deterministic rainfall forecast” (i.e., rainfall in the future is assumed perfectly known). These perfect rainfall forecasts are input to the well-calibrated deterministic forecast model, which in turn produces model inflows (s1, s2, s3). They are attached to the actual inflows (h0, h1, h2, h3) to obtain one joint realization of the model-actual inflow process. The dataset from 2003 to 2009 is used to calibrate and compare the meta-Gaussian HUP and the copula-based HUP.

The sample series of H0 is taken from June 1 to September 27 of each year, S1 from June 2 to September 28, S2 from June 3 to September 29, and S3 from June 4 to September 30; thus all four variables have a data length of 833. The parameters of the six candidate distributions are estimated by the L-moment method, and K-S tests are used to verify the null hypothesis. The null hypothesis could not be rejected at the 95% confidence level (critical value 0.0471) for all candidate distributions except the Normal distribution. For each of the four hydrological variables, the Gumbel distribution provides the minimum D value and is chosen as the best fitting distribution. Figure 8.4 shows the empirical CDF values obtained from the Gringorten plotting-position formula (Zhang and Singh 2006) and the theoretical CDF values calculated from the fitted Gumbel distributions; the theoretical values fit the empirical values very well. For comparison purposes, the copula-based HUP uses the same marginal distributions as the meta-Gaussian HUP.

Fig. 8.4 Empirical and theoretical values fitted by Gumbel distributions

8.2.5.4 Calibration of Meta-Gaussian HUP

For the given climatic record of actual flows, the joint sample {(h0, h1)} is formed from realizations on two consecutive days. Each joint realization (h0, h1) is processed through the empirical NQT to obtain the transformed joint sample {(w0, w1)}, which is used to estimate Pearson's correlation coefficient c. The advantage of using the empirical distributions in the NQT (instead of the parametric distributions) is that the estimate of c remains unaffected by the choice and goodness of fit of the parametric model. The estimated Pearson's correlation coefficient is 0.951.

The procedure for validating the meta-Gaussian dependence structure of the likelihood function parallels that described for the prior density above. The NQT performs adequately, as the empirical dependence structure between Xn, Wn and W0 appears to be linear and homoscedastic in the transformed space. The meta-Gaussian model for the likelihood function thereby captures the nonlinearity and heteroscedasticity of the dependence structure between Sn, Hn and H0 in the original space.

8.2.5.5 Calibration of Copula-Based HUP

The rank-based correlation (Kendall's coefficient) matrix of the variables H0, Hn and Sn is shown in Table 8.2. It demonstrates that the dependence among the three variable pairs (H0, Hn), (H0, Sn) and (Hn, Sn) is not the same; furthermore, the highest correlation is exhibited by the pair (Hn, Sn). This indicates that asymmetric, rather than symmetric, trivariate copula functions are more appropriate for the three-dimensional joint distributions of H0, Hn and Sn. When constructing these joint distributions with asymmetric copula functions, the structure ((Hn, Sn), H0) is applied: a copula is first built for (Hn, Sn), and then for H0 and \(C(F_{H_n}(h_n), F_{S_n}(s_n))\).

Table 8.2 Rank-based correlation matrix of the variables

The three-dimensional joint distributions of H0, Hn and Sn (n = 1, 2, 3) are constructed using the three candidate trivariate copula functions. Dependence parameters are estimated using the maximum pseudo-likelihood method, and the results are listed in Table 8.3. The Frank copula performs best, with the smallest RMSE values for all three joint distributions. Empirical CDFs obtained from the Gringorten plotting-position formula and theoretical CDFs calculated from the Frank copula are plotted in Fig. 8.5, showing an overall satisfactory agreement between the empirical and theoretical values. Hence, the asymmetric trivariate Frank copula performs well in modeling the joint distributions of H0, Hn and Sn.

Table 8.3 Estimated parameters of the three candidate copulas
Fig. 8.5 Plots of empirical and theoretical values estimated by Frank copulas for three joint CDFs. Note Rank represents the number of each ordered pair, ranked in ascending order in terms of theoretical joint CDF

8.2.5.6 Comparison of the Meta-Gaussian HUP and Copula-Based HUP

(1) Posterior median forecasts

For 24-, 48- and 72-h lead times, the NSE and RE values calculated for the deterministic forecast model and for the posterior median forecasts of the meta-Gaussian HUP and copula-based HUP are listed in Table 8.4. Both HUPs perform slightly better than the deterministic forecast model, and the copula-based HUP is comparable to the meta-Gaussian HUP. Compared with the deterministic forecasts, the NSE of the copula-based HUP for 24-, 48- and 72-h lead times is improved by 1.24, 1.26 and 1.26%, and the RE is reduced by 0.17, 0.57 and 1.72%, respectively. It is also noted that the accuracy of the posterior median forecasts of both HUPs decreases as lead time increases.

Table 8.4 Comparison of performance evaluation criteria for deterministic forecasts
(2) Probabilistic forecasts

The predictive QQ plot, α-index, π-index and CRPS are adopted to evaluate the probabilistic forecasts. Figure 8.6 presents the predictive QQ plots for the meta-Gaussian HUP and copula-based HUP at 24-, 48- and 72-h lead times. Using Fig. 8.1 as a guide, the overall performance in all predictive QQ plots is acceptable. Both HUPs systematically under-predict the inflows, since the observed p-values at the theoretical median are slightly higher than the theoretical quantiles. The observed p-values also cluster around the tails (i.e., a high slope around theoretical quantiles 0.4–0.6), meaning that the predictive uncertainty is somewhat underestimated for both HUPs. The overall behaviors of the two HUPs are similar; the QQ plot for the copula-based HUP lies slightly closer to the 1:1 line, i.e., the copula-based HUP performs marginally better in terms of reliability. Nonetheless, the underestimation for both HUPs occurs in zones where the p-values are relatively high, indicating that such differences may not be statistically significant.

Fig. 8.6 The predictive QQ plots of meta-Gaussian HUP and copula-based HUP

The α-index, π-index and CRPS results are summarized in Table 8.5. For both HUPs, the α-index value increases (higher reliability) as lead time increases; however, this comes at the expense of a decreasing π-index value (lower resolution). In addition, the copula-based HUP has slightly larger α-index values but smaller π-index values than the meta-Gaussian HUP. Regarding the CRPS, both HUPs outperform the deterministic forecasts, which demonstrates the effectiveness of probabilistic forecasts. The comparison also indicates that the copula-based HUP is marginally better than the meta-Gaussian HUP: its CRPS for 24-, 48- and 72-h lead times is improved (decreased) by 16.6, 21.2 and 23.3%, respectively.

Table 8.5 Comparison of performance evaluation criteria for probabilistic forecasts

Although this marginally better performance is not obtained in every year, for illustration purposes the observed and median discharges, together with the 90% inflow prediction intervals estimated by the meta-Gaussian HUP and copula-based HUP in 2004, are presented in Figs. 8.7 and 8.8, respectively. Most observed inflows are contained within the 90% prediction intervals, demonstrating that these intervals can effectively capture the forecast uncertainty and provide more information for decision-making in flood control and reservoir operation. As lead time increases, the 90% prediction intervals become wider (i.e., greater uncertainty).

Fig. 8.7 The 90% prediction intervals, median and observed discharges in 2004 (meta-Gaussian HUP)

Fig. 8.8 The 90% prediction intervals, median and observed discharges in 2004 (copula-based HUP)

8.3 Uncertainty Analysis of Hydrological Multi-model Ensembles Based on CBP-BMA Method

Inspired by the ideas of Madadgar and Moradkhani (2014), a general framework combining a copula Bayesian processor with BMA (CBP-BMA) was proposed by He et al. (2018), in which Bayesian theory is applied in the derivation of the posterior distribution. The flowchart of the different probabilistic forecast methods based on deterministic models is shown in Fig. 8.9.

Fig. 8.9 Flowchart of hydrologic multi-model ensembles for uncertainty analysis

8.3.1 Description of the Hydrological Models

Three well-known conceptual hydrological models are implemented in the Mumahe catchment: the Xinanjiang (XAJ), HBV and SIMHYD models. The XAJ model has been used in humid and semi-humid regions worldwide (Zhao 1992). It consists of a runoff generation component with seven parameters and a routing component with ten parameters; these parameters are abstract conceptual expressions of watershed features. The HBV model is a synthetic flow model with 13 parameters to be calibrated; it comprises routines for snow accumulation and melt, evapotranspiration, soil moisture accounting, and a response function. Its core concept is that runoff volume changes exponentially with soil humidity (Montero et al. 2016). The SIMHYD model is a lumped conceptual hydrological model with seven parameters to be calibrated. It divides runoff into three components: surface flow, interflow and base flow. Surface flow is infiltration-excess runoff, interflow is estimated as a linear function of soil wetness, and base flow is simulated as a linear recession from the groundwater store (Chiew et al. 2009; Yu and Zhu 2015). The infiltration rate is at the core of the model.

8.3.2 Bayesian Model Averaging (BMA)

Raftery et al. (2005) successfully extended BMA to the statistical post-processing of forecast ensembles. The BMA method addresses total model uncertainty by conditioning not on a single outstanding model but on the entire model ensemble. It was originally proposed as a way of combining several competing models (Duan et al. 2007; Liang et al. 2011).

According to BMA (Duan et al. 2007), the ensemble predictive density of the actual flow variable q, given the simulations of K hydrologic models [S1, S2, …, SK] and the observations Q during the training period, can be expressed via the law of total probability:

$$p(q|S_{1} ,S_{2} , \ldots ,S_{K} ,Q) = \sum\limits_{i = 1}^{K} {p(S_{i} |Q) \cdot p_{i} (q|S_{i} ,Q)}$$
(8.33)

where p(Si|Q) is the posterior probability of the ith model prediction. This static term can also be expressed as the weight ωi, reflecting how well the ith ensemble member fits the observation dataset; the weights range from 0 to 1, and the posterior model probabilities add up to one. Before implementation of the BMA algorithm, the expected difference between observation and forecast for each model should equal zero (E[q − Si] = 0). A bias-correction method, such as linear regression, should therefore be applied, substituting the bias-corrected forecast fi for the original deterministic forecast:

$$f_{i} = a_{i} + b_{i} \cdot S_{i}$$
(8.34)

where {ai, bi} are the coefficients of the linear regression model.

The term pi(q|fi, Q) is the conditional PDF of q based on the bias-corrected simulation fi and the observation dataset. A power Box-Cox transformation is applied for the computational convenience of using a Gaussian distribution, so the posterior distribution pi(q|fi, Q) is mapped to a Gaussian space with mean fi and variance \(\sigma_{i}^{2}\); i.e., pi(q|fi, Q) ~ g(q|fi, \(\sigma_{i}^{2}\)). The BMA predictive mean and variance of q are defined as follows (Raftery et al. 2005):

$$E(q|Q) = \sum\limits_{i = 1}^{K} {p(f_{i} |Q) \cdot E[p_{i} (q|f_{i} ,Q)]} = \sum\limits_{i = 1}^{K} {\omega_{i} f_{i} }$$
(8.35)
$$Var(q|Q) = \sum\limits_{i = 1}^{K} {\omega_{i} \left( {f_{i} - \sum\limits_{i = 1}^{K} {\omega_{i} f_{i} } } \right)^{2} } + \sum\limits_{i = 1}^{K} {\omega_{i} \sigma_{i}^{2} }$$
(8.36)

Successful application of the BMA method requires estimation of the weight \(\omega_{i}\) and variance \(\sigma_{i}^{2}\) of each individual PDF. The log-likelihood function, rather than the likelihood function itself, is optimized for reasons of both numerical stability and algebraic simplicity. If the BMA parameters are denoted by \(\theta = \left\{ {\omega_{i} ,\sigma_{i} ,i = 1,2, \ldots ,K} \right\}\), the log-likelihood function of \(\theta\) is:

$$l(\theta ) = \log \left( {\sum\limits_{i = 1}^{K} {\omega_{i} \cdot p_{i} (q|f_{i} ,Q)} } \right)$$
(8.37)

After BMA parameter estimation is completed by the EM algorithm (Duan et al. 2007), another feature of the BMA method is the use of Monte Carlo simulation to derive the BMA probabilistic ensemble prediction for any time t (Kuczera and Parent 1998). The procedure is described as follows (Zhou et al. 2016).

(1) Select the probabilistic ensemble size, M (M = 100).

(2) Randomly generate a value of k from the numbers [1, 2, …, K] with probabilities \([\omega_{1} ,\omega_{2} , \ldots ,\omega_{K} ]\), as follows: (a) initialize the cumulative weight \(\omega_{0}^{{\prime }} = 0\) and compute \(\omega_{i}^{{\prime }} = \omega_{i - 1}^{{\prime }} + \omega_{i}\) for \(i = 1,2, \ldots ,K\); (b) generate a random number u between 0 and 1; (c) if \(\omega_{k - 1}^{{\prime }} \le u \le \omega_{k}^{{\prime }}\), the kth member of the ensemble predictions is chosen.

(3) Generate a value of q from the pdf \(p_{k} (q|f_{k} ,\sigma_{k}^{2} )\) of the selected model.

(4) Repeat steps (2) and (3) M times.

The results are sorted in ascending order, and the 90% confidence interval is derived as the range between the 5 and 95% quantiles; a minimal code sketch of this sampling procedure follows.
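
The sketch below uses numpy's weighted choice in place of the explicit cumulative-weight search of step (2); the forecasts, spreads and weights in the example are hypothetical.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def bma_ensemble(f, sigma, w, M=100):
    """Draw an M-member BMA predictive ensemble for one time step.

    f[k], sigma[k]: bias-corrected forecast and spread of model k;
    w: BMA weights summing to one. Implements steps (2)-(4) above.
    """
    k = rng.choice(len(w), size=M, p=w)            # pick a model per member
    return np.sort(norm.rvs(loc=np.asarray(f)[k],
                            scale=np.asarray(sigma)[k],
                            random_state=rng))     # sample its Gaussian pdf

ens = bma_ensemble(f=[5200.0, 4900.0, 5400.0],
                   sigma=[400.0, 550.0, 380.0], w=[0.5, 0.2, 0.3])
lo, hi = np.percentile(ens, [5, 95])               # 90% confidence interval
```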

8.3.3 The Hybrid Copula-BMA (CBMA)

As illustrated above, the BMA predictive distribution is a weighted average of simulation PDFs, each generally assumed to follow a parametric distribution, e.g., a Gaussian distribution after the Box-Cox transformation. Madadgar and Moradkhani (2014) employed copulas to estimate the posterior distribution of the forecast variable for each model, i.e., \(p_{i} (q|f_{i} ,Q)\), and found that hydrological forecasts improve after this integration of copulas and BMA (CBMA). Subsequent research demonstrates that the CBMA procedure not only eliminates the upfront bias correction and the external calculation of variance but also simplifies the weighted-average calculation and the probability model structure by means of the copula (Möller et al. 2013).

Alternatively, in statistical applications, the conditional probability distribution of q given \(s_{i}\) (i = 1, 2, 3) is expressed as (Madadgar and Moradkhani 2014):

$$f(q|s_{i} ) = \frac{{f(q,s_{i} )}}{{f(s_{i} )}} = \frac{{c(u,v_{i} ) \cdot f(q) \cdot f(s_{i} )}}{{f(s_{i} )}} = c(u,v_{i} ) \cdot f(q)$$
(8.38)

where \(c(u,v_{i} )\) is computed for each pair (u, vi), with u and vi the marginal CDF values of q and si, and \(f(q)\) represents the marginal distribution of the actual flow. Although many copula families have been proposed and described in current studies (Chebana and Ouarda 2007), several Archimedean families, including Frank, Gumbel and Clayton, have been popular choices for dependence models in hydrologic analyses due to their simplicity and generation properties.

The predictive distribution of CBMA is modified as follows (Madadgar and Moradkhani 2014):

$$f(q|s_{1} ,s_{2} , \ldots ,s_{K} ) = \sum\limits_{i = 1}^{K} {\omega_{i} f(q|s_{i} )} = \sum\limits_{i = 1}^{K} {\omega_{i} \cdot c(u,v_{i} ) \cdot f(q)}$$
(8.39)

It can be seen from Eq. 8.39 that CBMA relaxes any assumption on the type of posterior distribution \(f(q|s_{i} )\), which can be inferred directly with the help of copula functions. Once the terms \(f(q|s_{i} )\) are defined, their weights are estimated by the EM algorithm with a few adjustments; see Madadgar and Moradkhani (2014) for details.

The hybrid CBMA model applies the idea of “pair and ensemble”: the pair of the observation q and the ith model simulation is modeled probabilistically by well-developed copula theory, while the ensemble formulates a consensus probability interval. A sketch of the resulting predictive density is given below.
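The following minimal sketch of Eq. 8.39 assumes Gumbel copulas (the family selected in Sect. 8.3.6.3) and scipy frozen marginal distributions (e.g., the Log-Normal marginals of Sect. 8.3.6.2); the function names and arguments are illustrative, not part of the original method description.

```python
import numpy as np

def gumbel_density(u, v, theta):
    """Closed-form density c(u, v) of the Gumbel copula (theta >= 1)."""
    x, y = -np.log(u), -np.log(v)
    s = x**theta + y**theta
    a = s**(1.0 / theta)
    return (np.exp(-a) * (x * y)**(theta - 1.0)
            * s**(1.0 / theta - 2.0) * (a + theta - 1.0) / (u * v))

def cbma_density(q, sims, weights, thetas, marg_q, marg_s):
    """CBMA predictive density of Eq. (8.39) at flow values q.

    sims : K deterministic simulations s_i for the same time step;
    marg_q, marg_s : fitted scipy frozen marginals (e.g. lognorm).
    """
    u = marg_q.cdf(q)
    dens = marg_q.pdf(q)                       # f(q)
    total = 0.0
    for w, th, s_i, m_i in zip(weights, thetas, sims, marg_s):
        v_i = m_i.cdf(s_i)
        total += w * gumbel_density(u, v_i, th) * dens  # w_i c(u, v_i) f(q)
    return total
```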

8.3.4 Copula Bayesian Processor Associated with BMA (CBP-BMA) Method

8.3.4.1 Copula Bayesian Processor (CBP)

The copula Bayesian processor (CBP) is developed as another component of the probabilistic forecasting system through the integration of Bayesian theory and copula functions. The CBP procedure generates a probabilistic result and quantifies the hydrologic uncertainty under the assumption that input uncertainty is negligible, as in the hydrologic uncertainty processor (Krzysztofowicz and Kelly 2000). This method also has the advantage of omitting the transformation of the data into Gaussian space. The Bayesian procedure based on the law of total probability involves two parts for the revision of uncertainty information (Zhang and Singh 2007a, b, c):

  1. (1)

The expected density function of the deterministic simulation \(S_{i}\), obtained by integrating over the predictand \(Q\), is expressed as:

$$\kappa (s_{i} ) = \int {f(s_{i} |q) \cdot g(q)dq}$$
    (8.40)

where \(f(s_{i} |q)\) is the likelihood function, with the same conception as before, and \(g(q)\) represents the prior density function.

  2. (2)

    The posterior density function conditional on a deterministic result \(S_{i} = s_{i}\) is derived via Bayes’ theorem:

$$\phi (q|s_{i} ) = \frac{{f(s_{i} |q) \cdot g(q)}}{{\kappa (s_{i} )}}$$
    (8.41)

Equations 8.40 and 8.41 can be rewritten using copula functions, i.e., the CBP form of the posterior density is mathematically expressed by:

$$\phi (q|s_{i} ) = \frac{{f(s_{i} |q) \cdot g(q)}}{{\int {f(s_{i} |q) \cdot g(q)dq} }} = \frac{{c(u,v_{i} )}}{{\int_{0}^{1} {c(u,v_{i} )du} }} \cdot g(q)$$
(8.42)

The final CBP output is a posterior distribution of the process, conditional upon the deterministic simulation. Since an analytical solution of the integral term \(\int_{0}^{1} {c(u,v_{i} )du}\) is very complex, the Monte Carlo technique is used to estimate the posterior density function \(\phi (q|s_{i} )\) (Robert and Casella 2011; Kroese et al. 2013), as sketched below.
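A sketch of this estimate follows. It reuses the gumbel_density helper from the previous sketch, and the sample size n_mc is an arbitrary choice for the example; a deterministic quadrature rule would work equally well for this one-dimensional integral.

```python
import numpy as np

def cbp_posterior(q, s_i, theta, marg_q, marg_s, n_mc=10_000, rng=None):
    """CBP posterior density phi(q | s_i) of Eq. (8.42).

    The normalizing term int_0^1 c(u, v_i) du is estimated by crude
    Monte Carlo with n_mc uniform draws of u.
    """
    rng = np.random.default_rng() if rng is None else rng
    v_i = marg_s.cdf(s_i)
    u_mc = rng.uniform(0.0, 1.0, n_mc)
    denom = gumbel_density(u_mc, v_i, theta).mean()  # ~ int_0^1 c(u, v_i) du
    u = marg_q.cdf(q)
    return gumbel_density(u, v_i, theta) * marg_q.pdf(q) / denom
```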

8.3.4.2 The CBP-BMA Method

The difference between the CBP-BMA and CBMA methods lies in the estimation procedure of the posterior density function; in the CBP-BMA method, the predictive density is:

$$\phi (q|s_{1} ,s_{2} , \ldots ,s_{K} ) = \sum\limits_{i = 1}^{K} {\omega_{i} \phi (q|s_{i} )} = \sum\limits_{i = 1}^{K} {\omega_{i} \frac{{c(u,v_{i} )}}{{\int_{0}^{1} {c(u,v_{i} )du} }}g(q)}$$
(8.43)

It is rational to assign weights on the basis of the multiple deterministic results. The weights are calculated with the EM algorithm (Montanari and Grossi 2008). The three main steps of each iteration can be summarized as follows:

$$\begin{aligned} z_{i,t}^{Iter} & = \frac{{w_{i}^{Iter - 1} \cdot \phi (q_{t} |s_{i,t} )}}{{\sum\nolimits_{i = 1}^{K} {w_{i}^{Iter - 1} \cdot \phi (q_{t} |s_{i,t} )} }} = \frac{{w_{i}^{Iter - 1} \cdot c(u_{t} ,v_{i,t} )g(q_{t} )/\int_{0}^{1} {c(u,v_{i,t} )du} }}{{\sum\nolimits_{i = 1}^{K} {w_{i}^{Iter - 1} \cdot c(u_{t} ,v_{i,t} )g(q_{t} )/\int_{0}^{1} {c(u,v_{i,t} )du} } }} \\ w_{i}^{Iter} & = \frac{1}{T}\sum\limits_{t = 1}^{T} {z_{i,t}^{Iter} } \\ l(\theta_{Iter} ) & = \sum\limits_{t = 1}^{T} {\log \left( {\sum\limits_{i = 1}^{K} {w_{i}^{Iter} \cdot \frac{{c(u_{t} ,v_{i,t} )g(q_{t} )}}{{\int_{0}^{1} {c(u,v_{i,t} )du} }}} } \right)} \\ \end{aligned}$$
(8.44)

where T is the length of the training period and z is a latent variable. Compared with the standard BMA method, the calculation of variances and the data transformations are eliminated in Eq. 8.44. Moreover, the posterior density of \(q_{t}\) is calculated only once, whereas it must be re-calculated at every iteration in the standard BMA method.
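The weight iteration can be sketched as follows, assuming the posterior densities \(\phi (q_{t} |s_{i,t} )\) have already been evaluated once (e.g., with cbp_posterior above) and stored in a T × K array; only the weights are then updated, which is exactly the simplification noted in the text.

```python
import numpy as np

def em_weights(phi, tol=1e-6, max_iter=500):
    """EM weight estimation of Eq. (8.44).

    phi : (T, K) array, phi[t, i] = phi(q_t | s_{i,t}), computed once.
    """
    T, K = phi.shape
    w = np.full(K, 1.0 / K)
    old_ll = -np.inf
    for _ in range(max_iter):
        mix = w * phi
        ll = np.log(mix.sum(axis=1)).sum()        # log-likelihood l(theta)
        if ll - old_ll < tol:
            break
        old_ll = ll
        z = mix / mix.sum(axis=1, keepdims=True)  # latent variable z
        w = z.mean(axis=0)                        # weight update
    return w
```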

8.3.5 Evaluation Criteria for Multi-model Techniques

8.3.5.1 Deterministic Model Assessment Indices

To evaluate the quality of the deterministic models, three metrics are used; a joint sketch of all three is given after this list.

  1. (1)

    Nash-Sutcliffe efficiency coefficient (NSE), see Eq. 8.25

  2. (2)

    Daily root mean square error (DRMS)

$$DRMS = \sqrt {\frac{{\sum\nolimits_{i = 1}^{N} {(q_{o}^{i} - q_{m}^{i} )^{2} } }}{N}}$$
    (8.45)

Since DRMS is sensitive to the differences between observations and simulations, values of DRMS approaching zero indicate better performance.

  3. (3)

    Kling-Gupta efficiency (KGE)

    $$\begin{aligned} KGE & = 1 - \sqrt {(r - 1)^{2} + (\beta - 1)^{2} + (\gamma - 1)^{2} } \\ \beta & = \overline{{q_{m} }} /\overline{{q_{o} }} \\ \gamma & = CV_{m} /CV_{o} = \frac{{\sigma_{m} /\overline{{q_{m} }} }}{{\sigma_{o} /\overline{{q_{o} }} }} \\ \end{aligned}$$
    (8.46)

where r is the Pearson correlation coefficient between observation and simulation, \(\beta\) is the bias ratio, and \(\gamma\) is the variability ratio (Kling et al. 2012). When the ensemble methods are evaluated, the simulated variables are replaced by the expected values of the estimated predictive distributions.
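The three indices can be sketched as follows; Eq. 8.25 is assumed to be the standard Nash-Sutcliffe form, and numpy's population standard deviation is used for the coefficients of variation.

```python
import numpy as np

def nse(qo, qm):
    """Nash-Sutcliffe efficiency (Eq. 8.25)."""
    return 1.0 - np.sum((qo - qm)**2) / np.sum((qo - qo.mean())**2)

def drms(qo, qm):
    """Daily root mean square error (Eq. 8.45)."""
    return np.sqrt(np.mean((qo - qm)**2))

def kge(qo, qm):
    """Kling-Gupta efficiency and its components (Eq. 8.46)."""
    r = np.corrcoef(qo, qm)[0, 1]
    beta = qm.mean() / qo.mean()                       # bias ratio
    gamma = (qm.std() / qm.mean()) / (qo.std() / qo.mean())  # CV ratio
    return 1.0 - np.sqrt((r - 1)**2 + (beta - 1)**2 + (gamma - 1)**2), r, beta, gamma
```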

8.3.5.2 Verification of Probabilistic Simulations

With regard to assessing the uncertainty of simulation intervals, Xiong et al. (2009) and Dong et al. (2013) presented multiple verification indices and applied them in hydrologic practice. Three main metrics are selected here to evaluate the simulation uncertainty intervals generated by the BMA, CBMA and CBP-BMA methods; a joint sketch of all three follows the definitions below.

  1. (1)

    Containing ratio (CR)

The containing ratio is utilized as a significant index for assessing the goodness of the uncertainty interval. It is defined as the percentage of observed data points falling between the prediction bounds, and thus directly reflects the interval performance.

$$CR = \frac{{\mathop C\nolimits_{i = 1}^{N} (q_{l}^{i} \le q_{o}^{i} \le q_{u}^{i} )}}{N} \times 100{\% }$$
(8.47)

where \(q_{l}^{i}\) denotes the lower bound corresponding to the 5% quantile at time i, and \(q_{u}^{i}\) denotes the upper bound corresponding to the 95% quantile. \(\mathop C\nolimits_{i = 1}^{N}\) counts the observed data points \(q_{o}^{i}\) that satisfy the inequality condition.

2. (2)

    Average bandwidth (BW)

$$BW = \frac{1}{N}\sum\limits_{i = 1}^{N} {(q_{u}^{i} - q_{l}^{i} )}$$
(8.48)

where BW measures the average width of the estimated uncertainty interval, as its name indicates. Smaller values of BW indicate greater precision: of two forecasts with the same containing ratio, the one with the smaller BW is preferred because it carries less uncertainty.

3. (3)

    Average deviation amplitude (DA)

The average deviation amplitude DA quantifies the average deflection of the curve of the midpoints of the prediction bounds from the observed streamflow hydrograph. It is defined as

$$DA = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| {\frac{1}{2}(q_{u}^{i} + q_{l}^{i} ) - q_{o}^{i} } \right|}$$
(8.49)

where the notations are as defined previously.
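A joint sketch of the three interval indices is given below; the array names are illustrative, with ql and qu holding the 5% and 95% quantile bounds produced by any of the ensemble methods.

```python
import numpy as np

def interval_metrics(qo, ql, qu):
    """CR, BW and DA of Eqs. (8.47)-(8.49) for one uncertainty band.

    qo : observations; ql, qu : lower and upper bounds, all shape (N,).
    """
    inside = (ql <= qo) & (qo <= qu)
    cr = 100.0 * inside.mean()                   # containing ratio, %
    bw = np.mean(qu - ql)                        # average bandwidth
    da = np.mean(np.abs(0.5 * (qu + ql) - qo))   # average deviation amplitude
    return cr, bw, da
```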

8.3.6 Case Study

The Mumahe catchment (Fig. 8.10), a sub-basin of the Hanjiang River basin in China, is selected as a case study. The catchment lies in Shaanxi Province, has an area of 1224 km2, and is located in the subtropical monsoon region, with a humid climate and abundant precipitation. The annual mean precipitation and runoff are 1070 mm and 687 mm, respectively. The available dataset contains daily precipitation, runoff, and evaporation with a length of 11 years (1980–1990). The first year (1980) is used as the spin-up period for each hydrologic model. The remaining years (1981–1990) are divided into two sub-periods: seven years (1981–1987) for calibration and three years (1988–1990) for validation.

Fig. 8.10
figure 10

Sketch map of the Mumahe catchment

The different multi-model techniques, i.e., BMA, CBMA, and CBP-BMA, are applied to combine the ensemble flow simulations. The three hydrologic models must first be calibrated, as their deterministic results are crucial to the final uncertainty analysis. As mentioned above, the calibration parameters of the BMA method are \(\omega_{k}\) and \(\sigma_{k}^{2}\); in the CBMA method, they are the parameters of the marginal distributions, the weights \(\omega_{k}\), and the parameters of the copula; in the CBP-BMA method, the Monte Carlo sampling technique is additionally used to evaluate the integral term.

8.3.6.1 Deterministic Hydrologic Model Simulations

The genetic and simplex algorithms are used for model calibration on account of their flexibility and good convergence. The genetic algorithm can find the global optimum independently of the initial parameter values, while the simplex algorithm is highly accurate but converges slowly; combining the merits of the two methods yields approximately optimal model parameters. Three deterministic assessment indices, NSE, DRMS and KGE, are calculated over the calibration period (1981–1987) and the validation period (1988–1990) for the XAJ, HBV, and SIMHYD models. Table 8.6 indicates that the XAJ model performs best, the HBV model takes second place, and the SIMHYD model performs worst of the three. The differences can be attributed to the dissimilar calibration processes of the models (Nasonova et al. 2009) and, in practice, to inaccurate parameter estimation, errors in model structure, the abstract formulation of physical processes, and the different sources of forcing data for each model. In general, these simulation results are suitable inputs for the multi-model ensemble in terms of the NSE and KGE values, which are about 85% and above 0.82, respectively, apart from the poor KGE value of the HBV model.

Table 8.6 Deterministic accuracy assessment of different hydrological models

8.3.6.2 Determination of the Marginal Distributions

The marginal distributions of the random variables H and Si (i = 1, 2, 3) need to be determined. Five common candidate distributions, namely Normal, Gamma, Gumbel, P-III and Log-Normal, are fitted to the daily mean streamflow values as well as to the XAJ, HBV and SIMHYD model simulations.

For the random variable H, the parameters of the five candidate distributions are estimated by the method of L-moments (Hosking 1990), and the parameter values are listed in Table 8.7. The K-S test is used to verify the null hypothesis, and the corresponding \(D_{K-S}\) statistics are also listed in Table 8.7. The null hypothesis cannot be rejected at the 95% confidence level (threshold value \(D_{\text{n}} = 1.36/\sqrt N\), where N is the number of sampling points), and the Log-Normal distribution provides the minimum \(D_{K-S}\) value. Meanwhile, Fig. 8.11 shows that the Log-Normal distribution is also satisfactory on visual inspection: the CDF curves of the fitted theoretical Log-Normal distributions match the empirical CDF values obtained from the Gringorten plotting-position formula (Zhang and Singh 2006) relatively well. The marginal distributions of Si are estimated by the same procedure, and the Kolmogorov-Smirnov statistics \(D_{K-S}\) indicate that the Log-Normal distribution also gives the best fit.
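For illustration, the fit and test for the Log-Normal candidate can be sketched as follows. Note that scipy's maximum-likelihood fit is used here instead of the chapter's L-moments estimator, so the parameter values would differ slightly from Table 8.7 while the selection logic is unchanged.

```python
import numpy as np
from scipy.stats import lognorm, kstest

def fit_lognormal_ks(flow):
    """Fit a Log-Normal marginal and run the K-S test.

    Returns the fitted parameters, the D_{K-S} statistic, and whether
    the null hypothesis is retained at the 95% confidence level.
    """
    shape, loc, scale = lognorm.fit(flow, floc=0.0)   # two-parameter form
    d_ks, _ = kstest(flow, lognorm(shape, loc, scale).cdf)
    threshold = 1.36 / np.sqrt(len(flow))             # 95% critical value
    return (shape, loc, scale), d_ks, d_ks < threshold
```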

Table 8.7 Estimated parameters and statistic test Dk-s of five candidate marginal distributions
Fig. 8.11
figure 11

Comparison of the empirical and theoretical cumulative distribution functions

8.3.6.3 Archimedean Copula Selection and Estimation

In the application of the CBMA and CBP-BMA methods, a copula function linking the CDFs of the observation and the model simulations needs to be defined. The Gumbel, Clayton and Frank copulas, belonging to the Archimedean family, are chosen and tested for flexibility and universality (Madadgar and Moradkhani 2014; Chen et al. 2015).

For the Archimedean copulas, the Kendall correlation coefficient τi (i = 1, 2, 3) between the observed and the different simulated flows is first derived; a higher τi reflects a stronger correlation between observation and model simulation. The corresponding copula parameter θi is then calculated by inversion of τi, following Table 2.3 of Chap. 2 (see the sketch below). The parameter estimates and goodness-of-fit tests (RMSE and AIC) are used to determine the copula that best integrates the streamflow properties. The results illustrate that the copulas perform well in capturing the association between observed and simulated flows. All variables passed the null hypothesis for the Gumbel and Frank copulas, and the Gumbel copula yields the lowest RMSE and AIC values.
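The inversion of Kendall's τ can be sketched as follows. The analytic relations for the Gumbel and Clayton copulas are standard; the Frank copula requires numerical inversion of its Debye-function relation and is omitted here.

```python
from scipy.stats import kendalltau

def copula_parameter(qo, qs, family="gumbel"):
    """Estimate the copula parameter by inversion of Kendall's tau."""
    tau, _ = kendalltau(qo, qs)
    if family == "gumbel":
        return 1.0 / (1.0 - tau)        # theta = 1 / (1 - tau)
    if family == "clayton":
        return 2.0 * tau / (1.0 - tau)  # theta = 2 tau / (1 - tau)
    raise ValueError("only 'gumbel' and 'clayton' are inverted analytically")
```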

8.3.6.4 Deterministic Assessment of Three Ensemble Methods

We check the mean simulations of the hydrologic multi-model ensembles using the three criteria described in Sect. 8.3.5.1. The results for the BMA, CBMA and CBP-BMA methods are listed in Table 8.6. The performance of each multi-model method is better than that of the individual XAJ model regarding NSE. The BMA method outperforms the reference model at the cost of the DRMS and KGE indicators, while the CBMA and CBP-BMA methods improve slightly in all aspects during the calibration period and perform excellently in the validation period. The improvement achieved by the CBMA and CBP-BMA methods can be attributed to copula functions being efficient tools for removing bias, in contrast to the simple bias correction, such as linear regression, used in the BMA method (Madadgar and Moradkhani 2014). In particular, the copula parameters are reliably estimated before the model averaging procedure. Another reason may be the weight assigned to each individual model, which is directly influenced by the estimation of the posterior distributions.

Figure 8.12 shows bar plots of the KGE score and its components. Compared with the best-performing XAJ model, the KGE score decreases slightly under the BMA and CBMA methods and increases slightly under the CBP-BMA method. The correlation coefficients between observation and simulation of the individual models reach 0.93 in the calibration period and 0.92 in the verification period, indicating strong correlation since the values exceed 0.9. However, the β indicator of the deterministic models varies from 0.64 for the HBV model to 0.97 for the XAJ model. Values less than 1 indicate that each individual model simulates a smaller total amount of streamflow than observed, which can cause a general underestimation of the mean streamflow (negative bias) in hydrological multi-model ensemble applications. The BMA method is promising in this respect, bringing the simulation closer to the observation with a β closer to 1. Regarding the variability ratio, all methods except HBV perform well, but no particular method is superior to the others, with \(\gamma \approx 1\) in all cases.

Fig. 8.12
figure 12

The simulation results of KGE score and its components

8.3.6.5 Probabilistic Verification of Three Ensemble Methods

For probabilistic verification of the simulations, Figs. 8.13 and 8.14 show the uncertainty bands of the different methods for representative years of the calibration and verification periods. These two plots indicate that the observed values fall approximately within the 5–95% uncertainty range and follow the mean flow hydrograph for all multi-model ensembles. In this case, the 90% confidence interval captures the flood peaks but misses more of the low-flow values.

Fig. 8.13
figure 13

The 90% uncertainty interval, observed, mean simulation for the Mumahe catchment in 1987 during the calibration period

Fig. 8.14
figure 14

The 90% uncertainty interval, observed, mean simulation for the Mumahe catchment in 1990 during the validation period

Three probabilistic verification measures (CR, BW, DA) are presented in Table 8.8. These quantitative indices show a good performance regarding the containing ratio, which should correspond to the confidence level: over many independent statistical experiments, the probability of an observed value falling within the interval should accord with the nominal percentage of the confidence interval. The CBP-BMA method performs better than the CBMA method regarding the CR index because it covers roughly 91% of the sample points, more than CBMA does. The combination of CR and BW provides a basis for judging probabilistic model performance. The comparison of the CBMA and CBP-BMA methods with the BMA method clearly illustrates that the CBMA method outperforms the BMA method in CR, BW and DA; in particular, the containing ratios of the CBP-BMA method in the two periods reach 91.17 and 91.33%, respectively. The smaller BW values of the CBMA and CBP-BMA methods indicate that the total predictive variance is reduced by using the pdfs generated by the copula functions rather than the Gaussian posterior distributions obtained via the Box-Cox transformation. Since the between-model variance remains identical when the same EM algorithm is used in all three methods, it is inferred that the reduction comes from the within-model variance.

Table 8.8 Uncertainty assessment of different hydrological multi-model ensembles

The CBMA and CBP-BMA methods are two flexible and robust approaches to uncertainty estimation in terms of bandwidth and average deviation amplitude. They have an intuitive and simple structure conditional on several model simulations, achieved by integrating BMA with copula tools, which makes them promising for deriving uncertainty. The difference between them is reflected in the procedure for deriving the posterior distribution. Further improvement might be achieved through the weight allocation of each model or a nonparametric posterior distribution.

8.4 Conclusion

Hydrological forecasting services are trending toward providing users with probabilistic forecasts, and adequate assessment of forecast uncertainty is therefore an important issue and task. A copula-based HUP for probabilistic forecasting and the CBP-BMA method for evaluating uncertainties of hydrologic multi-model ensembles are proposed. The Three Gorges Reservoir (TGR) and Mumahe basins are selected as case studies. The main conclusions are summarized as follows:

  1. (1)

    The output of the HUP is a posterior distribution of the process, conditional upon the deterministic forecast. This posterior distribution provides the complete and well-calibrated characterization of uncertainty needed by rational decision makers who use formal decision models and by information providers who want to extract various forecast products for their customers (e.g., quantiles with specified exceedance probabilities, prediction intervals with specified inclusion probabilities, probabilities of exceedance for specified thresholds).

  2. (2)

Based on copula functions, the prior density and likelihood function of the HUP are explicitly expressed, and the corresponding posterior density and distribution can be obtained using the Monte Carlo sampling technique. This copula-based HUP can be implemented directly in the original space, without transformation of the data into Gaussian space, and allows any form of marginal distribution of the predictand and the deterministic forecast variable, as well as a nonlinear and heteroscedastic dependence structure.

  3. (3)

The proposed copula-based HUP is comparable to the meta-Gaussian HUP regarding the posterior median forecasts. It is also shown that the probabilistic forecasts produced by the copula-based HUP have slightly higher reliability and lower resolution than those of the meta-Gaussian HUP. According to the CRPS values, both HUPs are superior to deterministic forecasts, which highlights the effectiveness of probabilistic forecasts, and the copula-based HUP is marginally better than the meta-Gaussian HUP.

  4. (4)

Deterministic results of the multi-model ensembles outperform those of the individual models. The CBMA and CBP-BMA methods slightly outperform the BMA method regarding NSE, DRMS, and KGE. With the CBMA method as a reference, the CBP-BMA method improves the NSE and KGE values but enlarges the DRMS values. Underestimation by all individual models may cause a negative bias in the multi-model ensemble.

  5. (5)

The combination of the containing ratio and the bandwidth index, together with the auxiliary average deviation amplitude, characterizes probabilistic model performance. The containing ratio is found to be approximately equal to the nominal confidence level. The CBMA and CBP-BMA methods outperform the BMA method on these evaluation criteria, with a high containing ratio, a small bandwidth, and a small average deviation amplitude.