
1 Introduction

The main objective of this chapter is to give a general overview of the present state of ensemble forecasting methods based on the existing literature. Ensemble approaches applied on short-range and climate time scales are introduced in detail, together with the ensemble visualization and interpretation possibilities used at the Hungarian Meteorological Service (HMS).

The present chapter contains five sections. After the introduction, Sect. 11.2 provides a general description of uncertainties in atmospheric weather prediction models and climate models and gives the motivation for using probabilistic forecasts. Section 11.3 focuses on ensemble designs that quantify the previously described uncertainties; specific techniques are detailed which define different kinds of perturbations in an ensemble system. Section 11.4 describes how operational ensemble systems can be constructed and how ensemble methods can be applied in the practice of climate modelling research. Furthermore, this section gives examples of the interpretation and visualization of ensemble probabilistic products. Section 11.5 is a short summary of the chapter.

2 Uncertainties in Numerical Weather Predictions and Climate Projections

Theoretically, the error sources in NWP can be divided into two main groups [39]. The first group is called “God-given errors”, which refers to the intrinsic chaotic character of the atmosphere and the climate system. Similarly to simple low-dimensional systems described by non-linear equations [30], the atmosphere is very sensitive to its initial conditions. Small differences in the current initial state can cause large differences among the future ones; in other words, even small uncertainties can grow rapidly and might have a significant impact on the forecast outputs. Since perfect initial conditions cannot be given, predictability has always been limited in numerical weather prediction. The evolution of the climate system is also sensitive to its initial state, but in this case the initial condition includes the description of the oceans and ice sheets as well. The other group of errors can be called “man-made errors”, which refers to the incomplete human knowledge about the system to be described and to the technical limitations of its modelling. Numerical models are not perfect counterparts of the Earth system and they contain many approximations; for instance, the underlying mathematical equations are solved numerically with temporal and spatial discretization. In practice these two main types of errors cannot be separated and they affect each other in a very complex way.

The initial conditions of NWP models are mostly produced by complex data assimilation methods which use observations and background information. This background is usually a short-range forecast valid at the analysis time and is consequently imperfect. Observations can also contain errors, since there might be instrument errors or they might not be representative of their vicinity. Additionally, there might be significant spatial and temporal inhomogeneities in the observations. The assimilation algorithms themselves also use approximations, providing another source of errors. Specifying initial conditions for climate simulations faces similar challenges, but it requires measurements and background information about the whole climate system (e.g., the deep ocean), making the data assimilation procedure even more complex.

The governing model equations are partial differential equations which cannot be solved analytically, thus they are discretized and then solved numerically. Taking the available computer resources into account, the discretization is limited. Although current supercomputers are extremely powerful, the model grid is still unable to resolve directly all meteorological phenomena at the desired spatial scales, consequently some of the processes have to be parameterized. These parametrizations can only give an estimation of the net effect of sub-grid scale processes. Models also need lower and upper boundary conditions, and their specification can be particularly difficult for the surface. Additionally, in limited area models the proper treatment of lateral boundary conditions (LBCs), which connect the processes inside and outside the regional domain, is non-trivial and a potential source of error. In climate modelling, not only are natural processes represented in an approximate way, but human activity also has to be taken into account as a forcing factor of future climate change. Anthropogenic activity is quantified in climate model simulations via hypothetical emission scenarios (discussed in Sect. 11.3.3).

It is important to underline that atmospheric predictability varies strongly with the weather situation (Fig. 11.1a, b): it is higher in stable conditions. It also depends on the forecast parameter. For instance, the 500 hPa geopotential field describes the large synoptic-scale motions and is more predictable than precipitation, which is influenced by local effects and small-scale phenomena like convection. Although non-hydrostatic models can describe convection explicitly, predictability is overall lower towards smaller scales.

Fig. 11.1
figure 1

(a) An example of the plume diagram from the results of LAMEPS at the HMS. It shows the time evolution of the 6 h total precipitation values predicted by the ensemble members. Forecasts were run at 18UTC 15 March 2014. Blue curves belong to the perturbed members while orange denotes the control member and grey is the ensemble mean. (b) The same as (a) but forecast was started at 18UTC 15 May 2014

All these uncertainties reveal the necessity of providing not only single forecasts and projections but also probabilistic information corresponding to the predictability of the given atmospheric state and the limitations of modelling. Since smaller-scale phenomena have lower predictability, the importance of probabilistic forecasts is growing along with the increasing model resolution and continuous model improvements.

3 Ensemble Methods

Nowadays the only feasible way to produce probabilistic forecasts is to conduct an ensemble of model integrations. In ensemble prediction systems (EPS) not a single model run predicts the future state, but an ensemble of forecasts gives many possible realizations of the atmospheric (climate) system. In NWP the members of such an ensemble can differ slightly from each other in their initial conditions or model formulations. These small differences are called perturbations and they are supposed to be large enough to produce sufficient ensemble spread within the existing uncertainties. There are many perturbation generation methods, each dedicated to specific types of the uncertainties mentioned in the previous section. These methods are divided into two main groups: the first one focuses on initial condition perturbations (see Sect. 11.3.1), while the second one represents model uncertainties (see Sect. 11.3.2). There are also some practical ways to generate ensemble systems, which are detailed in Sects. 11.3.2–11.3.4.

Generally an EPS contains 10–50 members, which would imply an enormous growth in computational cost if the high resolution operational model versions were applied in the ensemble system. To avoid this extraordinary cost, a compromise is needed and the EPS members usually run at a coarser resolution. The member using the unperturbed initial condition and model formulation is called the control. Usually an EPS is designed in such a way that the perturbed initial conditions have a symmetric structure around the control.

For a better understanding of the perturbation generation methods detailed in this section, let us write the partial differential equation system of the atmosphere in a very schematic way.

$$ \frac{dx}{dt}=F\left(x;t\right), \qquad x\left(t=0\right)={x}_0 $$
(11.1)

In Eq. (11.1) vector x contains the state variables (e.g., pressure, temperature, wind components, humidity) describing the atmospheric state, F denotes the forecast model and x_0 is the corresponding initial condition. The model state at time T is the time integral of Eq. (11.1):

$$ x(T)=\int_{t=0}^{T}F\left(x;t\right)dt=\int_{t=0}^{T}\left(A\left(x;t\right)+P\left(x;t\right)\right)dt $$
(11.2)

In Eq. (11.2) F can be divided into two parts: the explicitly handled non-parameterized processes (A) and the parameterized small-scale processes (P). The latter are typically convection (in models that do not resolve it explicitly), turbulence, microphysics and radiation. This separation is important because the second term (P) is more uncertain than the first one (A). The unperturbed control member of an ensemble system can be directly described by Eqs. (11.1) and (11.2), while modified equations are needed to explain the various perturbation generation methods.
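As a minimal illustration of how small initial differences behave under Eq. (11.1), the sketch below integrates a low-dimensional chaotic system (the Lorenz-63 model, standing in for F) from a control initial state and a few slightly perturbed ones. The model choice, perturbation size and integration length are illustrative assumptions, not part of any operational system.

```python
import numpy as np

def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side F(x) of a toy chaotic system standing in for Eq. (11.1)."""
    return np.array([sigma * (x[1] - x[0]),
                     x[0] * (rho - x[2]) - x[1],
                     x[0] * x[1] - beta * x[2]])

def integrate(x0, dt=0.01, nsteps=1500):
    """Simple Runge-Kutta 4 integration of dx/dt = F(x)."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(nsteps):
        k1 = lorenz63(x)
        k2 = lorenz63(x + 0.5 * dt * k1)
        k3 = lorenz63(x + 0.5 * dt * k2)
        k4 = lorenz63(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(x.copy())
    return np.array(traj)

rng = np.random.default_rng(0)
x0 = np.array([1.0, 1.0, 1.0])          # control initial condition x_0
control = integrate(x0)
# perturbed members: control initial state plus tiny random perturbations
members = [integrate(x0 + 1e-3 * rng.standard_normal(3)) for _ in range(10)]

spread = np.std([m[-1] for m in members], axis=0)
print("ensemble spread at final time:", spread)   # tiny initial differences have grown
```

The growth of the spread printed at the end mirrors the rapid error growth that motivates ensemble forecasting.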

3.1 Initial Condition Perturbations

Historically the first, and currently the most commonly used, methods to create ensemble prediction systems are based on perturbing the initial conditions of NWP models. These methods mostly rely either on finding the most unstable perturbations, which grow fastest during the model forecasts, or on determining and quantifying the error sources of the model initial condition (analysis). In the next sections the most popular methods are briefly summarized: the computation of Singular Vectors (SV), the determination of Breeding Vectors (BV) and the application of the Ensemble of Data Assimilations (EDA) method.

If these methods are applied, the initial condition equation of Eq. (11.1) has to be modified with an additional perturbation term. Consequently, the initial condition of an arbitrary j-th ensemble member can be written as

$$ {x}_j\left(t=0\right)={x}_0+{y}_j\left(t=0\right). $$
(11.3)

Below different ways of defining y_j are explained.

3.1.1 Singular Vectors

The computation of singular vectors is one of the first perturbation generation methods and was developed at the European Centre for Medium-Range Weather Forecasts (ECMWF) in the early 1990s [6]. The basic idea is to find those directions of the phase space (defined by the state variables of the model) in which perturbations grow fastest during the early forecast evolution, when the linear approximation is still valid (normally the first 12–48 h of the forecast).

Let us consider the system described by Eq. (11.1) and its initial condition perturbation y(t = 0) as defined by Eq. (11.3), which results in the solution x(t) + y(t) at time t. If y(t) is sufficiently small, the Taylor series of the right-hand-side function F around x(t) can be written as

$$ F\left(x(t)+y(t)\right)=F\left(x(t)\right)+\frac{dF}{dx}y(t)+O\left({y}^2(t)\right). $$
(11.4)

Equations (11.1) and (11.4) can be combined as

$$ \frac{d\left(x(t)+y(t)\right)}{dt}=\frac{dx(t)}{dt}+\frac{dF}{dx}y(t)+O\left({y}^2(t)\right), $$
(11.5)

which can be further simplified into the tangent linear equation considering the linear approximation:

$$ \frac{dy(t)}{dt}=\frac{dF}{dx}y(t). $$
(11.6)

The general solution of the tangent-linear equation can also be formulated with the propagator matrix (denoted by M in Eqs. (11.7)–(11.11)), which connects the perturbations at the initial time t_0 and the final time t_1:

$$ y\left({t}_1\right)= My\left({t}_0\right). $$
(11.7)

As mentioned above, the main idea of the SV method is to find the fastest growing perturbations in a linear system (so the assumption of linearity is essential when considering the SVs). This linear perturbation growth in the [t_0; t_1] time interval can be quantified with a properly selected norm: the ratio described by Eq. (11.8) must be maximized.

$$ \frac{{\left\Vert y\left({t}_1\right)\right\Vert}_E}{{\left\Vert y\left({t}_0\right)\right\Vert}_E}=\frac{{\left\Vert My\left({t}_0\right)\right\Vert}_E}{{\left\Vert y\left({t}_0\right)\right\Vert}_E} $$
(11.8)

The proper choice of the norm E is crucial in practice. Note that the norms defined at the initial and final time instants might be different. The norm can be defined in association with an inner product ⟨·;·⟩_E as follows:

$$ {\left\Vert y\right\Vert}_E^2={\left\langle y;Ey\right\rangle}_E. $$
(11.9)

In Eq. (11.9) E is a positive definite Hermitian matrix. In the case of the Euclidean norm this E matrix becomes the identity and consequently all the state vector variables are combined with the same weight. The Euclidean norm therefore provides an unphysical metric, since the state variables with larger numerical values (e.g., temperature) would dominate the norm. A norm is desirable which has a physical meaning when combining the various model state variables. For instance, the total energy norm is widely used as a physically sound choice, where the weights are given according to the contribution of the given variable to the total energy. There are also experiments with a CAPE (convective available potential energy) norm in limited area models [42].

It can be noted that the norm might also contain a geographic projection operator computed over a given area of interest. The definition of such target areas can help to focus e.g. on the tropics, where the perturbations are often improperly represented by the global models [40]. Targeted SVs also allow focusing on areas where dynamically downscaled limited area ensemble systems are run (see the next section).

If in Eq. (11.8) the size of the initial perturbation ||y(t_0)||_E is set to unity, then the goal is to find the maximum of ||y(t_1)||_E. The formula of Eq. (11.10) can be deduced by considering the propagator Eq. (11.7), transforming the norms into scalar products and using the definition of the adjoint of the propagator matrix M*:

$$ {\left\Vert y\left({t}_1\right)\right\Vert}_E^2={\left\Vert My\left({t}_0\right)\right\Vert}_E^2={\left\langle My\left({t}_0\right); My\left({t}_0\right)\right\rangle}_E={\left\langle M^{*}My\left({t}_0\right);y\left({t}_0\right)\right\rangle}_E. $$
(11.10)

Equation (11.10) shows that the search for the fastest growing perturbations is equivalent to finding the eigenvectors v_i(t_0) of the M*M matrix with the largest eigenvalues σ_i²:

$$ M^{*}M\,{v}_i\left({t}_0\right)={\sigma}_i^2\,{v}_i\left({t}_0\right). $$
(11.11)

The square roots of the eigenvalues σ_i² are called singular values and the eigenvectors v_i(t_0) are the singular vectors of M. The eigenvectors belonging to the largest eigenvalues show those directions of the phase space where the perturbations grow fastest in the [t_0; t_1] time interval with respect to the E norm. In realistic atmospheric models the dimension of the eigenvalue problem is huge, therefore its solution is non-trivial and is obtained through special numerical algorithms; in meteorology generally the Lanczos algorithm is applied [29]. The initial condition perturbations of Eq. (11.3) can then be computed by combining the leading singular vectors of the different target areas:

$$ {x}_j={x}_0+{y}_{SV}={x}_0+\sum_{k=1}^{N_{TA}}\sum_{i=1}^{N_{SV}}{\alpha}_{ki}{v}_{ki}. $$
(11.12)

In Eq. (11.12) N_TA is the number of target areas, N_SV is the number of singular vectors used and the α_ki coefficients scale the perturbation to the size of the estimated analysis error.
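For a small, explicitly known propagator matrix, the singular vectors of Eqs. (11.8)–(11.12) can be computed directly. The sketch below uses the Euclidean norm (E equal to the identity) and a random matrix in place of a real tangent-linear propagator; the matrix, the dimension and the scaling factor are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                               # toy phase-space dimension
M = rng.standard_normal((n, n))     # stand-in for the tangent-linear propagator

# Singular value decomposition: the rows of Vt are the initial-time singular
# vectors v_i(t0); the singular values s_i are the square roots of the
# eigenvalues sigma_i^2 of M* M (cf. Eq. (11.11)).
U, s, Vt = np.linalg.svd(M)
v1 = Vt[0]                          # fastest-growing direction under the Euclidean norm

growth = np.linalg.norm(M @ v1) / np.linalg.norm(v1)
print("leading singular value:", s[0], "growth of v1:", growth)

# A perturbed initial state in the spirit of Eq. (11.12); alpha is an assumed
# scaling factor standing in for a real analysis-error estimate.
x0 = np.zeros(n)
alpha = 0.1
n_sv = 3
x_perturbed = x0 + alpha * Vt[:n_sv].sum(axis=0)
```

In a real system the propagator is never formed explicitly; the leading singular vectors are obtained iteratively (e.g. with the Lanczos algorithm) using the tangent-linear and adjoint models.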

3.1.2 Breeding Method and Kalman Filter

The breeding method [46] was developed in the US simultaneously with the above-mentioned singular vector technique. The main conceptual difference of the breeding method with respect to the singular vectors is that the largest uncertainties are sought in the past (in the assimilation cycle) and not in the near future. This is achieved by “breeding” past perturbations and retaining only the most unstable ones. The applied procedure is iterative, as sketched below. First, some small, random perturbations are generated and added to the NWP analysis. Then short-range numerical forecasts are run from the unperturbed control and the perturbed initial conditions. The evolution of these initial perturbations is monitored by tracking the differences between the control and the perturbed forecasts. Cyclically, these perturbations are rescaled and then added again to a new analysis. After that new forecasts are started from the newly perturbed initial conditions and the process restarts. In such an iterative procedure, after a few steps the system is able “to breed” the necessary perturbations (Fig. 11.2) by selecting the perturbations growing fastest during the assimilation cycle. They can be used for perturbing the model initial conditions and creating a forecast ensemble.
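A minimal sketch of such a breeding cycle is given below; the toy nonlinear model, the rescaling amplitude, the cycle length and the number of cycles are all illustrative assumptions, and the control forecast simply plays the role of the next analysis.

```python
import numpy as np

def model(x, dt=0.05):
    """One 'forecast' step of a toy nonlinear model; any nonlinear
    forecast model could be substituted here."""
    return x + dt * (x * (1.0 - x**2))

rng = np.random.default_rng(2)
n = 5
amplitude = 1e-2                          # size to which perturbations are rescaled
x_analysis = rng.standard_normal(n)       # stand-in for an initial analysis
d = amplitude * rng.standard_normal(n)    # small random starting perturbation

for cycle in range(20):                   # breeding cycles (e.g. every 6 h)
    x_ctrl = x_analysis
    x_pert = x_analysis + d
    for _ in range(10):                   # short-range forecast within the cycle
        x_ctrl = model(x_ctrl)
        x_pert = model(x_pert)
    d_f = x_pert - x_ctrl                 # bred perturbation at the end of the cycle
    d = amplitude * d_f / np.linalg.norm(d_f)   # rescale and re-add to the next analysis
    x_analysis = x_ctrl                   # toy stand-in for the next regular analysis

print("bred perturbation:", d)
```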

Fig. 11.2
figure 2

The schematic description of the breeding method in the case of two members. x_1 denotes the unperturbed and x_2 the perturbed member, d_i refers to the rescaled perturbations and d_f to the bred perturbations at the end of each breeding cycle

The original implementation of the breeding method was built on top of a data assimilation cycle, so forecast perturbations were rescaled every 6 h to have the same size as at initial time and were added to a regular analysis. In further tests even longer time periods were applied (12 or 24 h) to find the fastest growing modes of the perturbations [47]. These tests also underlined the weakness of the method, namely that a globally constant rescaling factor cannot reflect the geographical variation and accuracy of the observing system. One possible solution is the so-called masked breeding method, where a latitude- and longitude-dependent rescaling factor is defined. However, none of the above-described breeding realizations can correctly take into account the forecast error variances.

This problem can be handled by Kalman Filter (KF) based methods. The classic KF concept provides a relationship between forecast and analysis error covariances via the linear model, its transpose and the model error covariance matrix. Such a relationship helps to evolve the analysis and background error covariance matrices iteratively through the data assimilation cycles and to take flow-dependent errors into account. Since the classic KF method is computationally expensive, the Ensemble Transform Kalman Filter (ETKF) was introduced. In the ETKF a special transformation matrix is defined from the estimate of the background error covariance matrix (given by the forecast perturbations of an ensemble system) and the observation error covariance matrix. Such a transformation matrix can be used to update the error covariance matrices in a data assimilation system, and moreover it can be used to transform forecast perturbations into analysis perturbations again [1]. This transformation has the advantage over the breeding method that it can reflect the background error variances.

The Ensemble Kalman Filter (EnKF) can also be mentioned as a data assimilation related application of ensemble systems. In the EnKF both the forecast and the analysis error covariances are estimated from the spread of the forecast and analysis perturbations, respectively. Unlike the KF, the EnKF uses the nonlinear model operator to evolve the analysis state into the forecast state [20].

3.1.3 Ensemble of Data Assimilations

The main idea behind the Ensemble of Data Assimilations technique is the simultaneous execution of several data assimilation cycles [24]. The differences among these data assimilation cycles are provided by the quantified uncertainties in the data assimilation system. The knowledge of these uncertainties gives a handle on the realistic error sources of the system and makes it possible to compute analysis and short-range error statistics. The analysis error statistics can be used to define suitable perturbations for an ensemble prediction system, and the short-range error statistics can be used for computing the background error covariances of the data assimilation system.

For a better understanding, we first show how a variational data assimilation system can be formulated by defining a cost function, which measures the deviations of the analysis from the various information sources used in the assimilation process. The solution of the variational problem is obtained by minimization of this cost function, ensuring that the meteorological analysis is optimally close to all the ingredients of the assimilation system, taking into account their corresponding reliabilities. The two most important sources of information in a data assimilation system are the observations and the background fields (short-range NWP forecasts valid at the analysis time). The cost function of the variational system can be written as:

$$ J(x)={\left(x-{x}_b\right)}^T{B}^{-1}\left(x-{x}_b\right)+{\left(o-H(x)\right)}^T{R}^{-1}\left(o-H(x)\right). $$
(11.13)

In Eq. (11.13) x is the model state, B is the background error covariance matrix, x_b is the background state, o contains the observations, R is the observation error covariance matrix and H is the observation operator (which establishes the relationship between model space and observation space). The B and R covariance matrices are essential ingredients of the system, providing the proper weighting between the observation and background information.

In such a data assimilation system, considering the linear approximation, the analysis update can be written as follows:

$$ {x}_a^k={x}_b^k+{K}_k\left({o}^k-{H}_k{x}_b^k\right), \qquad {x}_b^{k+1}={M}_k{x}_a^k. $$
(11.14)

In Eq. (11.14) the superscript k refers to the assimilation cycle index and the gain matrix K_k can be written as:

$$ {K}_k={B}_k^b{H}_k^T{\left({H}_k{B}_k^b{H}_k^T+{R}_k\right)}^{-1}. $$
(11.15)
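As a sketch of how the cost function of Eq. (11.13) is handled in practice, the toy example below minimizes it numerically for a two-variable state with a linear observation operator and cross-checks the result against the explicit gain-matrix solution of Eqs. (11.14)–(11.15). All matrices and values are illustrative assumptions; a real system works with enormous state vectors and iterative minimizers.

```python
import numpy as np
from scipy.optimize import minimize

# Toy 2-variable assimilation problem (all values are illustrative assumptions).
x_b = np.array([1.0, 2.0])                 # background state
B = np.array([[0.5, 0.1], [0.1, 0.5]])     # background error covariance
H = np.array([[1.0, 0.0]])                 # observation operator: observe 1st variable
o = np.array([1.4])                        # observation
R = np.array([[0.2]])                      # observation error covariance

B_inv, R_inv = np.linalg.inv(B), np.linalg.inv(R)

def J(x):
    """Variational cost function of Eq. (11.13)."""
    db = x - x_b
    do = o - H @ x
    return db @ B_inv @ db + do @ R_inv @ do

x_a = minimize(J, x_b).x                   # analysis minimizing J
print("analysis from minimization:", x_a)

# Cross-check with the explicit gain-matrix form of Eqs. (11.14)-(11.15).
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
print("analysis from gain matrix: ", x_b + K @ (o - H @ x_b))
```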

In practice the uncertainties taken into account in EDA are related to the observations, to the background fields, to the model formulation and to the lower boundary conditions. In an EDA system the observations are usually perturbed by a random number η_k drawn from a Gaussian distribution which has zero mean (no systematic errors are assumed) and whose standard deviation equals the estimated standard deviation of the observation error. Consequently, the perturbed analysis can be written as a modification of Eq. (11.14):

$$ {\left(\tilde{x}\right)}_a^k={\left(\tilde{x}\right)}_b^k+{K}_k\left({o}^k+{\eta}_k-{H}_k{\left(\tilde{x}\right)}_b^k\right), \qquad {\eta}_k\sim N\left(0;R\right). $$
(11.16)

The background fields are not explicitly perturbed, since they automatically differ during the assimilation cycles through the evolved perturbations coming from the previous step (Fig. 11.3). Additionally, model uncertainties can also be quantified in the assimilation cycle by perturbing the model formulation (M′); consequently, the forecast step of Eq. (11.14) is modified as:

Fig. 11.3
figure 3

The schematic representation of the Ensemble of Data Assimilations (EDA) system. Only the control member (x) and an arbitrary perturbed member (x′) are visualized with the corresponding unperturbed (y) and perturbed (y′) observations

$$ {\left(\tilde{x}\right)}_b^{k+1}=M^{\prime}{\left(\tilde{x}\right)}_a^k. $$
(11.17)

Model error representation methods will be detailed in Sect. 11.3.2. It has to be noted that boundary condition uncertainties can be taken into account in the above-mentioned forecast step. A possible lower boundary condition perturbation method is described and the perturbed lateral boundary conditions of the limited area models are mentioned in Sect. 11.4.1.

If the R observation error covariance matrix is properly estimated and the perturbed model formulation M′ correctly reproduces the model errors, then the perturbations of the data assimilation system realistically represent the uncertainties of the system, and the analysis and background perturbations can be defined as

$$ {y}_a\equiv {\tilde{x}}_a-{x}_a, \qquad {y}_b\equiv {\tilde{x}}_b-{x}_b $$
(11.18)

In practice, perturbed EPS members can be initialized directly from the perturbed analyses. In this case Eq. (11.3) can be rewritten in a very simple way:

$$ {x}_j={\mathrm{x}}_0+{y}_{EDA}={x}_a+{y}_a. $$
(11.19)

Another possibility is to define the EDA perturbations as the difference between the perturbed background fields and their ensemble mean. These perturbations can be added to an analysis which is produced independently of the EDA system (denoted by x_A). This procedure has the advantage that the additional analysis can have better quality (finer resolution or a more sophisticated assimilation method) and there is no need to wait for the most recent analysis of the EDA members. In this case the perturbed initial condition of the arbitrary j-th member of an EPS containing N members can be written as:

$$ {x}_j={x}_0+{y}_{EDA}={x}_A+\left({\tilde{x}}_j-{\overline{x}}_b\right)={x}_A+{\tilde{x}}_j-\frac{1}{N}{\displaystyle \sum_{i=1}^N{\tilde{x}}_{bi}}. $$
(11.20)

If the perturbations of the EDA are correctly defined, using adequate observation and background error statistics, then the EDA shows those directions of the phase space where the data assimilation uncertainties are the largest. Therefore these perturbations can effectively contribute to the initial condition perturbations used in an EPS. Some more details of the ECMWF-specific application are described in Sect. 11.4.1.
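A minimal sketch of constructing EDA-based initial perturbations in the spirit of Eqs. (11.16)–(11.20) is given below. The Gaussian observation perturbations, the toy "member backgrounds" standing in for full per-member assimilations, and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_members, n_state = 10, 4
obs_std = 0.3                                # assumed observation error std. dev.

x_A = rng.standard_normal(n_state)           # independent high-resolution analysis x_A
x_b = x_A + 0.1 * rng.standard_normal((n_members, n_state))  # perturbed backgrounds

# Perturb the observations with eta ~ N(0, R) as in Eq. (11.16); here R = obs_std^2 * I
o = x_A[:2]                                  # toy observations of the first two variables
o_pert = o + obs_std * rng.standard_normal((n_members, 2))

# (A real EDA would now run one assimilation per member with o_pert;
#  here the perturbed backgrounds themselves stand in for the EDA members.)

# EDA perturbations as departures from the ensemble mean, cf. Eq. (11.20)
y_eda = x_b - x_b.mean(axis=0)
x_init = x_A + y_eda                         # perturbed initial conditions x_j
print("initial-condition spread:", x_init.std(axis=0))
```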

3.2 Representation of Model Uncertainties

As already mentioned in Sect. 11.2, there are many sources of uncertainty related to the atmospheric models. Although in principle inflated initial condition perturbations can partly account for model imperfections, they are not designed for that purpose and therefore other methods should be used to represent model-related uncertainties. For that purpose, the model formulations are generally not identical for every ensemble member, and their variety results in perturbations which represent the model uncertainties in an ensemble system. Based on these model formulation differences, Eq. (11.2) can be modified for the perturbed ensemble members as follows:

$$ {x}_j(T)={\displaystyle \underset{t=0}{\overset{T}{\int }}F^{\prime}}\left({x}_j;t\right)dt. $$
(11.21)

In Eq. (11.21) the forecast model F is replaced by its perturbed counterpart F′. The multi-model method simply uses several models and then combines their results, which means that in Eq. (11.21) F′ represents a set of the applied NWP models (see below). Other methods try to derive perturbations from the most uncertain parts of the model, which are the various parameterizations of the sub-grid scale processes. Following this concept, the parts of F′ can be separated as was done in Eq. (11.2).

$$ {x}_j(T)={\displaystyle \underset{t=0}{\overset{T}{\int }}F^{\prime}}\left({x}_j;t\right)dt={\displaystyle \underset{t=0}{\overset{T}{\int }}\left(A\left({x}_j;t\right)+P^{\prime}\left({x}_j;t\right)\right)}dt $$
(11.22)

In Eq. (11.22) P′ represents the perturbed contribution of the parametrization schemes, while the contribution of the non-parametrized processes A remains unchanged. Similarly to the multi-model method, these perturbations can be generated simply by using several parametrization schemes (multi-physics method, see below), by using the same schemes but with perturbed settings (perturbed parameter method, see below), or by applying identical schemes with stochastic modifications of their net contribution (stochastic physics, see below).

3.2.1 Multi-Model, Multi-Physics and Perturbed Parameter Method

In current NWP modelling practice there is no single superior model which performs best in all conditions: all models have strengths and weaknesses. Different models are better or worse depending on multiple factors, like the current weather situation or the forecast variable. This variety of model performances motivates experts to use several numerical models at the same time and to provide information from all of them to the users of forecast and climate model outputs (e.g., forecasters, end users). The integrations of these models can be handled as members of an ensemble system and they can provide information about forecast uncertainty [10]. This ensemble generation method is called the multi-model approach, which is rather a practical way to express model uncertainties without defining model perturbations in a scientifically rigorous manner. This technique is often applied to estimate uncertainties in climate projections, since running simulations on decadal or longer time frames requires huge computational capacity, especially on the global scale. Therefore, climate ensembles are usually created by merging single (or at best a few) climate experiments of individual institutes. Results of the most typical multi-model climate ensemble are published in the assessment reports of the Intergovernmental Panel on Climate Change (IPCC; e.g., IPCC AR5 WGI, 2013), and several ensemble systems composed of regional climate simulations are also available (see Sect. 11.4.1).

It should be mentioned that even within a single NWP model several parameterization schemes are available, whose performance is also situation- and variable-dependent. As already mentioned and described by Eq. (11.22), these parameterized processes are the most uncertain parts of the model formulation and they can be perturbed while the non-parameterized processes stay unchanged. A practical way to take this uncertainty into account is the multi-physics method, where different parameterization schemes are assigned to different members of an ensemble system [51].

A practical disadvantage of the multi-model and multi-physics methods is that forecast centres cannot easily maintain many models at the same time, or construct a large number of equally reliable parametrization schemes, and consequently cannot ensure a sufficient ensemble population.

Even in a well-designed model and parametrization system there is a large number of tuning parameters which are defined empirically and whose precise values are uncertain. The main idea behind the perturbed parameter approach is to keep the same model and physical parameterization schemes for every ensemble member and to perturb only the most uncertain parameters. These parameters can be set differently for each member or their values can vary stochastically between realistic thresholds [8].
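A short sketch of the perturbed parameter idea is given below: each member receives its own value of an uncertain tuning parameter drawn between assumed realistic bounds. The parameter name, the bounds and the model launcher are hypothetical, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)
n_members = 10

# Hypothetical uncertain parametrization parameter (e.g. an entrainment-rate
# coefficient); the bounds below are assumptions for illustration only.
low, high = 0.5e-4, 2.0e-4
member_params = rng.uniform(low, high, size=n_members)

for j, value in enumerate(member_params):
    print(f"member {j}: entrainment_coeff = {value:.2e}")
    # run_member(member=j, entrainment_coeff=value)  # hypothetical model launcher
```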

3.2.2 Stochastic Physics

The original stochastic physics scheme was developed at ECMWF and was later referred to as the BMP scheme [5]. Similarly to the previously described approaches, it is assumed that the sub-grid scale processes (described by the model physics) are more uncertain than the large-scale motions (described by the model dynamics on the model grid). For this reason the total contribution of the parametrization schemes is perturbed by multiplying its original value by a random number. In this case P′ of Eq. (11.22) can be written as follows:

$$ P^{\prime}\left({x}_j;t\right)={\left\langle {r}_j\left(\lambda, \phi, t\right)\right\rangle}_{D,T}\ast P\left({x}_j;t\right). $$
(11.23)

In the BMP scheme the r_j values are uniformly distributed in the [1 − β; 1 + β] interval. β is an important parameter of the scheme which controls the size of the perturbation and in practice it is usually set to 0.5. The r_j values are kept constant over several grid boxes within a D × D geographical domain and for several time steps over a T time interval. Typical values range from a few hundred kilometers for D and between 3 and 12 h for T. A disadvantage of the BMP scheme is that the r_j values are independently picked random numbers, which might lead to unphysical spatial and temporal jumpiness in the perturbed tendency fields.
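The following sketch generates a BMP-style multiplier field r_j: uniform random numbers in [1 − β, 1 + β] held constant over blocks of grid points, then applied to a parametrized tendency as in Eq. (11.23). The grid and block sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
beta = 0.5                     # perturbation amplitude, typically 0.5
nlat, nlon = 60, 90            # toy grid (assumption)
block = 10                     # r_j constant over block x block grid boxes (~ D x D area)

# One random multiplier per block, expanded to the full grid
coarse = rng.uniform(1.0 - beta, 1.0 + beta,
                     size=(nlat // block, nlon // block))
r = np.kron(coarse, np.ones((block, block)))

# Perturbed physics tendency, cf. Eq. (11.23): P' = r * P
P = np.ones((nlat, nlon))      # stand-in for a parametrized tendency field
P_perturbed = r * P
print(r.min(), r.max())        # multipliers stay within [1 - beta, 1 + beta]
```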

This jumpiness deficiency is addressed by the revised version of the stochastic physics scheme, called the Stochastically Perturbed Parametrized Tendencies (SPPT) scheme. Its main aim is to ensure well-defined temporal and spatial correlations between the r_j values of the different model grid boxes:

$$ P^{\prime}\left({x}_j;t\right)=\left(1+\alpha {r}_j\right)\ast P\left({x}_j;t\right) $$
(11.24)

If the SPPT scheme is applied in spectral models, then the r_j fields can be generated in spectral space and then transformed to grid-point space, where the actual parameterization computations are performed. Therefore r_j is described by spherical harmonics in a spectral global model [38] and by bi-Fourier functions in a spectral limited area model [3]. The r_j field is evolved by a so-called spectral pattern generator, where its spectral coefficients (r_j′)_mn are described by a first-order auto-regressive [AR(1)] process, which ensures the temporal correlation.

$$ {\left({r}_j^{\prime}\right)}_{mn}\left(t+\varDelta t\right)=\varphi {\left({r}_j^{\prime}\right)}_{mn}(t)+\sigma {\mu}_{mn}(t), \qquad \varphi = \exp \left(-\varDelta t/\tau \right) $$
(11.25)

In the AR(1) process described by Eq. (11.25), every new (r_j′)_mn value is calculated from two parts. The first part is the previous value multiplied by φ, which is the one-timestep correlation set by the decorrelation time scale τ. In the second part the μ values are independent random numbers picked from a Gaussian distribution with zero mean and unit variance, bounded to the [−2; 2] interval. These values are multiplied by the σ parameter, which is responsible for the size of the perturbations and is (similarly to the original BMP scheme) most commonly set to 0.5. Although the r_j fields are represented in spectral space, the horizontal correlation of the grid-point values is ensured after the spectral transformation. In the spectral pattern generator the so-called spatial correlation length (L) controls the “smoothness” of the r_j fields (Fig. 11.4). In practice the horizontal and temporal correlations are set according to the characteristic scale of the errors in the atmospheric processes represented by the scheme. There are experiments where two r_j fields are combined [38]: one of them represents fast-evolving synoptic-scale errors (σ = 0.5, τ = 6 h, L = 500 km) and the other one slowly evolving planetary-scale errors (σ = 0.2, τ = 30 days, L = 2500 km).
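A minimal sketch of the AR(1) evolution of Eq. (11.25) for a set of spectral coefficients is given below. The decorrelation time and amplitude follow the values quoted in the text, while the model time step, the number of coefficients and the omitted spectral-to-grid transform are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n_coeff = 64                   # number of spectral coefficients (assumption)
dt = 450.0                     # model time step in seconds (assumption)
tau = 6 * 3600.0               # decorrelation time scale, e.g. 6 h
sigma = 0.5                    # perturbation amplitude

phi = np.exp(-dt / tau)        # one-timestep autocorrelation, Eq. (11.25)
r_spec = np.zeros(n_coeff)

for step in range(100):
    # Gaussian noise with zero mean and unit variance, bounded to [-2, 2]
    mu = np.clip(rng.standard_normal(n_coeff), -2.0, 2.0)
    r_spec = phi * r_spec + sigma * mu

# In a spectral model r_spec would now be transformed to grid-point space
# (spherical harmonics or bi-Fourier transform) before multiplying the tendencies.
print("std of spectral coefficients:", r_spec.std())
```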

Fig. 11.4
figure 4

An example of the r_j field used in the SPPT scheme and evolved by the spectral pattern generator of the ALADIN model. The horizontal correlation length is set to 500 km

In Eq. (11.24) α is an additional height-dependent function which can modify the vertical structure of the perturbations. In recent implementations it is set to 1, except near the surface and near the model top, where it smoothly relaxes to 0. This relaxation is necessary to avoid numerical instabilities coming from inconsistencies between the surface (top of the model) and the perturbed low-level (high-level) atmospheric tendencies. However, experiments have recently started to apply SPPT in the boundary layer as well [37].

As expected and experienced, the SPPT scheme is able to improve ensemble systems by ensuring sufficient spread throughout the model integration through the perturbed model formulations. It can also be noted that its positive impact can be measured in the quality improvement of the model climatology (especially for precipitation and in the tropics; [38]). This is related to the fact that SPPT not only takes into account the model uncertainty, but can also recover the variety of the sub-grid scale process tendencies, which is often hidden by the “deterministic” nature of the parameterization schemes.

3.3 Representation of Uncertainties Related to Anthropogenic Activity

On multi-decadal and longer time scales, besides the natural drivers, anthropogenic activity is also an important forcing of the climate system, consequently climate models must take it into account. Human activity can contribute to climate change in several ways: e.g., through the emissions of greenhouse gases (GHG) and aerosol particles, land use or demographic change. These effects can be considered in climate models only through meteorological parameters, e.g., via equivalent carbon dioxide concentration prescribed as an external forcing. An important type of climate simulation is when the CO2 level is changed to a fixed value (e.g., to its double) and the model is run until a new equilibrium is reached. Such an experiment does not provide information about the temporal evolution and the dynamics of climate change, however, it allows exploring (and possibly explaining) the sensitivity of different models to a given radiative forcing.

With the increasing computational capacity the so-called transient method was introduced: climate model integrations are forced with time-dependent atmospheric greenhouse gas and aerosol concentration levels. Transient model runs can simulate a number of important aspects of climate variability, like the North Atlantic Oscillation, monsoon systems or El Niño events. Most importantly, the method can be applied to study future climate change trends and their impacts. Time series of concentrations are derived by Integrated Assessment Models (IAMs), which calculate GHG concentrations as responses to the assumed environmental and economic processes (and vice versa). Since there are several possible pathways of future global socio-economic development, the most likely future concentration equivalents can be described only with limitations. Therefore, climate simulations based on these scenarios are called and treated as projections (instead of forecasts).

An important scenario family is SRES (Special Report on Emissions Scenarios; [36]), which consists of four basic scenario sets differing in the assumed global population change and in the main features of the economic and technological developments from 2001 onwards along the twenty-first century. It was widely applied in global climate model (GCM) experiments providing the scientific basis for the third and fourth assessment reports of the Intergovernmental Panel on Climate Change [21, 22]. Measurements of the anthropogenic emissions in the last decade urged the need to review the SRES scenarios. The RCP (Representative Concentration Pathways; [35]) scenarios were constructed following a new methodology: using selected pathways of radiative forcings or equivalent CO2 concentration levels, Earth System Models (i.e., climate models) and IAMs are integrated simultaneously and interactively to estimate the future response of climate and socio-economic conditions to the varying atmospheric and radiative forcings. RCPs cannot be identified with any given socio-economic scenario: they are referred to by their radiative forcing value for 2100, which can result from several socio-economic development paths. There are four representative RCP versions depending on their radiative forcing levels considered for 2100 relative to the pre-industrial value (Fig. 11.5). The RCP scenario family has been used in the GCM simulations providing results for [23].

Fig. 11.5
figure 5

Time evolution of the total anthropogenic radiative forcing relative to pre-industrial (about 1765) level between 2000 and 2300 for RCP scenarios, and SRES scenarios (until 2100) as computed by the Integrated Assessment Modelling Consortium (IAMC) [23]

3.4 Other Methods

Multi-model and multi-physics methods have already been described in Sect. 11.3.2. This list can be supplemented with other methods following a similar basic idea. For instance, multi-analysis methods start forecasts from various analyses computed by different forecast centres. This technique can also be combined with the multi-model method or with the multi-LBC approach used in limited area ensemble systems, where EPS members can be coupled to different global models [10, 12]. Such multi-LBC methods address the uncertainties of the lateral boundary conditions.

Considering that many meteorological services and forecast centres run their own EPS, a logical step is to combine them and generate a more populous ensemble. Such systems are able to represent many types of uncertainties due to the big variety of the applied methods and the large number of EPS members. In practice, the setup of multi-ensembles (ensembles of ensembles) can be technically challenging because of the significant data transfers between forecast centres. In the case of limited area ensembles, the different integration domains can add further difficulties. Due to these issues multi-ensembles are mostly used for research and quality control purposes [14].

4 Applications of Ensemble Forecasts

In the past decades ensemble systems have become increasingly popular tools to provide probabilistic forecasts and projections both in numerical weather prediction and in climate applications.

The ensemble method was first implemented in medium-range global models (see below). Later many national meteorological services started to run ensembles with their limited area models (LAMEPS) to refine the global probabilistic forecasts over a shorter time range and a smaller area of interest (see below). Recently the focus of research and development has shifted towards the so-called convection-permitting ensembles, where fine-resolution, non-hydrostatic models are used that are able to resolve deep convection explicitly [12, 34, 50]. The prediction of small-scale meteorological events is very uncertain due to their low predictability, which motivates the use of probabilistic forecasts at finer resolution even more.

Adaptation to climate change impacts requires high-cost efforts from economies and societies. Therefore, the credibility of the climate information providing input for these actions is of great importance. Due to the long-term consequences of the adaptation strategies, the most essential aspect of this credibility is to quantify the uncertainties of climate model simulations. In climate projections targeting multi-decadal and centennial time scales, uncertainties mainly originate from the approximations used in the description of physical and anthropogenic processes. In practice this means that climate ensembles are constructed by choosing different anthropogenic scenarios and different climate models. The huge computational requirements and the limited national resources motivate international co-operation in establishing climate ensembles. The first climate ensemble system was composed of GCM simulations in 1995. Although limited area models have been used for climate purposes since the 1990s [13], the first ensemble system consisting of regional climate model simulations was organized only in the mid-2000s.

4.1 Some Examples of Ensemble Systems

4.1.1 The ECMWF Ensemble Prediction System

The ECMWF operational Ensemble Prediction System (ENS) produces 51 forecasts (1 control and 50 perturbed members) for the quantification of the forecast uncertainties in the Integrated Forecasting System (IFS). The forecast uncertainties are quantified as the result of initial and model perturbations.

The initial perturbations of the ENS are determined by adding a combination of EDA and SV perturbations to the unperturbed analysis (the high resolution ECMWF analysis), as described by Eq. (11.26).

$$ {x}_j={x}_0+{y}_{EDA}+{y}_{SV} $$
(11.26)

EDA perturbations (y_EDA) are generated by computing differences between the 6 h EDA forecasts and the EDA mean, as in Eq. (11.20). The 6 h EDA forecasts are chosen since the latest EDA is not yet available at the time of the analysis. The SVs are computed by optimizing the total energy growth over a 48 h time interval, using various target areas for the extra-tropics and the tropics. The SVs are linearly combined (see Eq. (11.12)) and the perturbations are scaled to have an amplitude locally similar to the analysis error estimate obtained from 4D-Var (4-dimensional variational data assimilation).

The uncertainties of the lower boundary conditions can also be considered in an ensemble of data assimilation cycles. The method applied at ECMWF generates perturbations correlated with the errors of the sea surface temperature fields [49]. Model uncertainties are taken into account by adding stochastic perturbations to the physics parameterization tendencies using the SPPT (see Sect. 11.3.2) and Spectral Kinetic Energy Backscatter (SKEB) schemes [38].

4.1.2 Limited Area EPS Activity at Hungarian Meteorological Service

The operational regional EPS of the Hungarian Meteorological Service (HMS) is based on the hydrostatic ALADIN model [18] and runs at 8 km horizontal resolution over a continental European domain (Fig. 11.6a). The system has a control and 10 perturbed members, which are dynamical downscalings of the first 11 members of the French global EPS, called PEARP (Prévision d'Ensemble ARPEGE). In that global system the initial condition perturbations are generated as a combination of EDA and SV perturbations and model uncertainty is taken into account by the multi-physics approach [11]. The global perturbations have an impact in the limited area system through the downscaled initial and lateral boundary conditions. The operational LAMEPS can provide useful probabilistic guidance for the forecasters and the end-users, as shown by some examples in Sect. 11.4.2. Some experiments demonstrated the efficiency of targeted singular vectors, which can inject locally efficient perturbations into the global system. These perturbations can also penetrate into the limited area model domain through the downscaling process [15]. The slightly positive impact of an EDA implementation was also shown, where only near-surface observations were perturbed in an ensemble of surface optimal interpolations [19].

Fig. 11.6
figure 6

(a) ALADIN model domain. (b) AROME model domain

HMS has also started its convection-permitting ensemble research based on the AROME non-hydrostatic model [41, 44]. Integrations run at 2.5 km horizontal resolution over a domain covering the Carpathian Basin (Fig. 11.6b). Most of the tests were launched (similarly to the operational LAMEPS) with 10+1 members coupled to PEARP or, in some cases, to the ECMWF EPS. Such a convection-permitting ensemble system is able to properly describe the small-scale structure of thunderstorms and helps in the early warning of hazardous events, as will be demonstrated in Sect. 11.4.2.

The EDA scheme was extensively tested and its positive impact was quantified [44]. In the applied configuration 10+1 EDA members were used to initialize the 10+1 EPS members in accordance with Eq. (11.19). During the data assimilation cycles all the observations were perturbed, both in the atmospheric variational assimilation and in the surface optimal interpolation. The quality of the individual members was improved by the impact of the data assimilation itself, and the spread of the ensemble system was increased by the injected initial condition perturbations.

The influence of the SPPT scheme was also examined in AROME-EPS [44]. The parameters of the SPPT scheme were tuned to attribute smaller-scale errors to the perturbations, adequate to the finer resolution of a non-hydrostatic model (σ = 0.5, τ = 2 h, L = 500 km or 125 km). The impact of the scheme proved to be more neutral than generally found in global systems.

4.1.3 Coupled Model Intercomparison Projects (CMIPs)

The climate system is composed of the atmosphere, hydrosphere, cryosphere, land surface and biosphere, including highly non-linear feedbacks between them. Weather prediction concentrates primarily on the short- and medium-range description of the atmosphere, which is the most well-known and most rapidly changing part of the Earth system. Climate models simulate the asymptotic behaviour of the complex climate system, whose components have a variety of adjustment time scales ranging from years to hundreds of thousands of years. Consequently, the response of the climate system to an external forcing can be determined by coupled models, which incorporate mainly atmospheric and ocean model components, simulating not only the atmospheric and ocean motions but also sea-ice processes and the interactions between them. Even though the first realistic atmosphere–ocean general circulation model (AOGCM) experiment dates back to 1975 [31], the systematic collection of AOGCM outputs of leading climate centres was started in the mid-1990s by the Working Group on Coupled Modelling of the WCRP. Simulations were based on a common protocol in order to establish a database supporting the climate community to study, validate, evaluate and intercompare AOGCM results. While CMIP1 [28] and CMIP2 [9] were composed of control runs (i.e., experiments for the past climate with observed forcing) and idealized forcing scenario runs (i.e., experiments with 1 % CO2 concentration increase per year), respectively, series of realistic climate change simulations were started with CMIP3 [32] in 2005. These model runs described not only the natural forcings for the past, but the future projections were also preceded by comprehensive scenario constructions resulting in the SRES emission scenarios. Experiments focused on three emission scenarios (SRES A2, A1B and B1), each of them representing a substantially different future pathway of anthropogenic activity (indicating approximately 850, 700 and 550 ppm CO2 concentration by 2100, respectively). Results are freely available in the CMIP3 database and provided input to the IPCC Fourth Assessment Report [22]. CMIP3 was followed directly by CMIP5 [45] in 2010; the new numbering was introduced to refer to the corresponding IPCC reports (since CMIP5 results served as input for IPCC AR5; [23]). CMIP5 model simulations already applied RCP scenarios for prescribing future anthropogenic forcings. The experiments addressed three main issues: (1) to assess the scientific background of model differences in carbon cycle and cloud feedbacks, (2) to examine climate predictability on decadal time scales, (3) to identify the reasons for different responses produced by similarly forced models. The sixth phase of CMIP [33] is still under design: simulations will be carried out with Earth System models extended with additional model components and their main focus will be on model biases, predictability and uncertainty issues.

4.1.4 Ensembles of Regional Climate Model Simulations

The first ensemble of regional climate model simulations in Europe was produced in the PRUDENCE FP5 project [7]. The time horizon of the RCM experiments was 2071–2100 and 1961–1990 was chosen as reference. Due to limited computer resources, time-slice simulations were performed, meaning that the RCM runs concentrated only on the two selected time frames. This is scientifically sound in regional modelling (especially if the RCM contains exclusively an atmospheric model component), since regional models provide a dynamical downscaling of the GCM outputs and the downscaled outcomes are basically independent of the initial date of the integration. The regional experiments focused on Europe at 50 km horizontal resolution, using two largely different SRES emission scenarios (A2 and B2, with approximately 850 and 600 ppm CO2 concentration levels in 2100, respectively). Contrary to PRUDENCE, in the ENSEMBLES FP6 project (2004–2009) transient climate simulations (cf. the transient method in Sect. 11.3.3) were accomplished for the period 1951–2100, covering Europe at 25 km horizontal resolution [48]. The simulations were conducted with various RCMs driven by the outputs of various GCMs. More focus was put on precipitation projections: the finer resolution and the improved model features led to a better representation of the related fine-scale structures and temporal distribution [2]. The main target of the studies was 2021–2050, and it is known from Hawkins and Sutton [16, 17] that the choice of emission scenario has no significant impact on the range of climate projection uncertainties in this time frame. Therefore, the same scenario forcing was applied in most RCM experiments, namely SRES A1B, considered a medium scenario by the end of the century. Consequently, the ensemble of ENSEMBLES represents the model uncertainties, which is essential in the case of precipitation projections (see also Chap. 12 of Szabó and Szépszó in the same volume, [43]). At the same time, this ensemble was not fully balanced, because the majority of the RCMs were driven by only two GCMs. Since lateral boundary conditions have a great impact on the regional outcomes, the over-representation of one or two selected GCMs in the ensemble may bias the probabilistic information.

Recently, the most important co-operation in regional climate modelling is CORDEX [26], initiated by the WCRP in 2009. Its original objective was to cover the poorly researched continents (especially Africa) with high-resolution (12–50 km) regional climate model experiments. Nowadays CORDEX has many branches focusing on different regions of the Earth, for instance EURO-CORDEX [25] for Europe. The unprecedented fine-resolution simulations with the most recent climate models show improved performance over Europe with respect to the ENSEMBLES outputs [27]. Forcings and lateral boundary conditions for the CORDEX RCM experiments are provided by CMIP5 results using different RCP scenarios. As a result, the CORDEX ensemble represents both model and scenario uncertainties, and it also makes it possible to study the impact of the emission scenario families (through inter-comparisons with earlier results obtained by PRUDENCE or ENSEMBLES).

4.2 Visualization Methods

In this section some visualization methods are shown which are connected to the above-described applications. The primary aim of these interpretation methods is to concisely summarize all the information provided by the ensemble members. They can, for instance, quantify the uncertainty of the forecasts, represent them in a probabilistic way, or highlight the likelihood of any meteorological event of interest.

4.2.1 Plume Diagram

Plume diagrams have already been referred to in Sect. 11.2 as a demonstration of flow-dependent uncertainty. These plots show the values of a given meteorological variable in all the ensemble members as a function of time for a given geographic location. They are very informative about the growth of the forecast uncertainty and the range of possible future values of a given variable.
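A sketch of drawing such a plume diagram with matplotlib is given below; the synthetic precipitation values stand in for real ensemble output, and the member count and lead times are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
lead_times = np.arange(0, 60, 6)            # forecast lead time in hours
n_members = 11

# Synthetic 6 h precipitation forecasts (stand-in for real ensemble output)
members = np.abs(rng.standard_normal((n_members, lead_times.size))
                 * np.linspace(0.5, 3.0, lead_times.size))
control = members[0]
mean = members.mean(axis=0)

plt.plot(lead_times, members[1:].T, color="tab:blue", alpha=0.6)   # perturbed members
plt.plot(lead_times, control, color="tab:orange", lw=2, label="control")
plt.plot(lead_times, mean, color="grey", lw=2, label="ensemble mean")
plt.xlabel("lead time (h)")
plt.ylabel("6 h precipitation (mm)")
plt.legend()
plt.show()
```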

The precipitation values predicted by the LAMEPS of the HMS can be compared for a forecast started at 18UTC on 15 March 2014 (Fig. 11.1a) and for another case exactly 2 months later (Fig. 11.1b). In the first case a weak cold front crossed Hungary, whose precipitation pattern was rather certain and consequently similar in all the EPS members. In the second case the so-called Yvette storm hit the whole Central European region, causing damage with its strong wind gusts and large precipitation amounts. A very complex precipitation field belonged to this cyclone, which had low predictability, and therefore the ensemble members showed a large spread.

Climate projections can also be visualized similarly to plume diagrams. Figure 11.7 shows the evolution of the global annual mean temperature as projected by an ensemble of climate models. The first panel depicts the temperature change relative to 1961–1990 based on the results of 15 GCM simulations, in which the future CO2 concentration values were uniformly prescribed according to the SRES A1B emission scenario. So the 15 projections were conducted with different global climate models taking the same external forcing into account. The annual mean temperature change is foreseen to be in the ranges of 1.0–2.2 °C by 2050 and 2.2–3.8 °C by 2100. In the second panel of Fig. 11.7, the projections are extended with the outputs of 30 additional experiments achieved with the same GCMs, but applying two additional SRES emission scenarios, A2 and B1. It can be noticed that the uncertainty grows when significantly different scenarios are used. This growth is not uniform in time; the scenario choice has a greater impact during the second part of the twenty-first century: the projected interval of the mean temperature change does not increase significantly (0.5–2.2 °C) until 2050, while the warming is expected to be between 1.7 and 4.5 °C until 2100 considering all three emission scenarios. This means that in the projections for the next few decades there is a larger departure between the results of simulations obtained with different GCMs but with the same emission scenario than vice versa. (This is not surprising considering that the CO2 concentration levels in the different scenarios start to diverge only from around 2030.)

Fig. 11.7
figure 7

(a) Global annual mean temperature change (°C) relative to 1961–1990 based on results of 15 global climate model simulations using SRES A1B emission scenario for description of future anthropogenic activity. (b) Same as (a), but results are based on 45 global climate model simulations using three different SRES emission scenarios (red: A2, green: A1B, blue: B1). Thick curves represent the multi-model means within the given scenarios, grey and black curves indicate the results of control runs and their multi-model mean, respectively

4.2.2 Probabilistic Map

Probabilities can be computed from the individual EPS members, where the members are mostly taken into account with equal weight. First a meteorological variable (or a climate parameter) and a corresponding threshold value should be defined; for instance, temperature below zero degrees (or mean precipitation change above zero percent, i.e., precipitation increase). Then the probability of reaching such a threshold can be calculated from the ensemble members at every point of a given domain. The geographical visualization of these probabilities represents the likelihood of the occurrence of a given meteorological event or climatological tendency.
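A sketch of this computation with equal member weights is given below; the grid, the threshold and the synthetic data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
n_members, nlat, nlon = 11, 50, 70
threshold = 1.0                      # e.g. 6 h precipitation above 1 mm (assumption)

# Synthetic ensemble forecast fields, dimensions (member, lat, lon)
fields = np.abs(rng.standard_normal((n_members, nlat, nlon)))

# Equal-weight probability of exceeding the threshold at every grid point
probability = (fields > threshold).mean(axis=0) * 100.0   # in percent
print("max probability on the map: %.0f %%" % probability.max())

# A joint probability (e.g. snowstorm conditions) is obtained the same way by
# combining the per-variable exceedance masks before averaging over members:
# joint = ((temp < 0) & (snow > 10) & (gust > 15)).mean(axis=0)
```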

Probabilistic maps can draw attention to extreme or dangerous meteorological events. On 14 March 2013 the probability of a devastating snowstorm was studied in the LAMEPS of the HMS (see Fig. 11.8). The probability of this event can be defined by the joint probability distribution of several meteorological variables (such as temperature, the amount of fresh snow and wind gust) which provide the necessary conditions for the occurrence of a snowstorm. For every variable a different threshold can be defined and some of their combinations can be used. In this way the probability and strength of such a complex weather event can be determined together with its geographic extent. From the top left panel to the bottom right panel of Fig. 11.8 the thresholds of the fresh snow amount and the wind gust values increase, i.e., the joint probabilities show the likelihood of conditions with increasing threat.

Fig. 11.8
figure 8

Probability of a devastating snowstorm defined as the joint probability of reaching given thresholds for temperature, fresh snow and wind gust. On all maps the temperature threshold was set to 0 °C, while the amount of fresh snow in 12 h increases from left to right (5, 10, 15 cm) and the wind gust increases from top to bottom (10, 15, 20 m/s). Colors refer to the level of the threat and orange shows the probability of reaching the highest thresholds. Figures were drawn from a 12 h forecast of the LAMEPS run at 18UTC 14 March 2013

A probabilistic map can also be used in climate applications and its construction is based on the same methodology. Nevertheless, one has to be careful not to interpret this information in the same way as in weather prediction: while ensemble members in NWP represent equally likely forecasts, this cannot be assumed for climate projections. In long-term projections, the uncertainty due to the scenario-type description of anthropogenic activity becomes more and more important with increasing lead time. However, probabilities cannot be associated with these scenarios, since the future aspects of human activity strongly depend on socio-economic decisions and cannot be specified with any accuracy [4]. Consequently, the resulting projections are evaluated rather as possible (instead of probable) outcomes under given conditions. Figure 11.9 was created using the results of 17 RCM simulations of ENSEMBLES, each of them applying 25 km horizontal resolution and the A1B emission scenario. The percentage values correspond to the ratio between the numbers of model experiments producing winter mean precipitation increase and decrease from 1961–1990 to 2021–2050. Assuming the A1B emission pathway to be a realistic and probable one, it can be stated that the probability of winter mean precipitation increase exceeds 70 % north of Hungary, whereas in Southern and Eastern Europe increase and decrease are equally likely.

Fig. 11.9
figure 9

Probability of winter mean precipitation increase (%) for 2021–2050 with respect to 1961–1990 based on results of 17 RCM experiments available in ENSEMBLES database

4.2.3 Stamp Diagram

It is possible to visualize all the ensemble members next to each other for a given meteorological variable. These diagrams cannot be informative about the details, but they are able to warn forecasters of the possibility of hazardous weather, even if it appears only in a limited number of members.
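A short sketch of such a stamp diagram is given below: all members plotted side by side as small panels, with synthetic fields standing in for real model output.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(9)
n_members = 11
fields = np.abs(rng.standard_normal((n_members, 40, 60)))   # synthetic precipitation fields

fig, axes = plt.subplots(3, 4, figsize=(10, 7))
for j, ax in enumerate(axes.flat):
    if j < n_members:
        ax.pcolormesh(fields[j], vmin=0, vmax=3)             # one small panel per member
        ax.set_title(f"member {j}", fontsize=8)
    ax.set_xticks([]); ax.set_yticks([])
plt.tight_layout()
plt.show()
```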

As already mentioned, the predictability of small-scale phenomena like thunderstorms is rather low, but at the same time they might pose a risk in terms of disaster management. That was the case on the evening of 20 August 2013, when several events with mass public participation were held in Hungary, which were threatened by convective activity. In the test version of the convection-permitting EPS of the HMS (AROME-EPS, see Sect. 11.4.1) almost all the members predicted thunderstorms with small-scale structure (Fig. 11.10). While the existence of precipitation seemed very certain, its localization and intensity showed a large variability from member to member. A stamp diagram can easily warn the decision-makers if any of the members predicts a hazardous thunderstorm for a given area and possibly suggest the cancellation or postponement of an event.

Fig. 11.10
figure 10

The stamp diagram of the forecast for 3 h precipitation amount between 21UTC on 20 and 00UTC on 21 August 2013. The top left panel shows the precipitation estimated from radar measurements, while the other panels represent the members of the convection-permitting ensemble system tested at the HMS (AROME-EPS)

5 Summary

In this work the recent ensemble approaches have been reviewed both in the numerical weather prediction and in the climate projection fields. The uncertainties of atmospheric and Earth system modelling were underlined, giving the motivation for using probabilistic forecasts. Ensemble methods were presented as the only feasible way to obtain probabilistic information, meaning that not a single model run but an ensemble of model runs is taken into account.

The key issue in ensemble prediction systems is how the differences between the members of an ensemble are defined. Various methods perturb the initial conditions of the atmosphere, while other methods represent the model formulation uncertainties. In climate projections the initial state of the system is less important, but anthropogenic activity is an additional source of uncertainty, taken into account through different emission scenarios.

Some examples have been given of how the described ensemble approaches can be used in NWP and climate projection systems. It was noted that despite the recent model improvements, uncertainties cannot be neglected. As the resolution of the applied models becomes finer, smaller-scale motions are resolved explicitly by the dynamical equations. The predictability of these motions is also limited, which means that probabilistic forecasts will remain important in the future.