1 Introduction

Changes in the sea level on the global and local scale are one of the emerging topics in present climate research (Munk 2002; Church et al. 2008), as a large number of coastal mega cities, populations and regions will be threatened by sea level rises in the future (Woodworth 2006; IPCC 2007; Hallegatte et al. 2010). Much of the work is done on the assessment, hindcasting and projection of mean sea level, which can be found through tide gauges (Hupfer et al. 2003; Woodworth and Blackman 2004; Vilibic and Sepic 2010; Haigh et al. 2010a), satellite altimetry (Beckley et al. 2007; Cazenave et al. 2009) or with numerical models (Flather et al. 1998; Gregory et al. 2001; Church et al. 2004; Landerer et al. 2007). Sea level extremes have also gained attention recently as they are directly responsible for coastal hazards and floods, and property damage. In addition to these dangers, there are also some serious concerns about the financial impact of changes in weather extremes and their resulting losses (West et al. 2001; Tol et al. 2008; Munich Re 2008; Hallegatte et al. 2010).

To study the projected implications of climate change, a consistent scenario of future climate is needed. Such scenarios are produced by globally coupled atmosphere-ocean circulation models. However, for shallow seas like the Baltic Sea or the North Sea, the present generation of such global models does not have the necessary resolution to properly resolve the complex topography. As the global climate models have a typical resolution on the order of 250 km, they are of limited use for describing the dynamics in the Baltic Sea with dimensions of 1000 × 2000 km. Typically, they also lack important shelf sea physical processes like turbulent mixing, overflows and fronts, as well as a realistic description of the bathymetry and the coastline. Hence, there is an increasing need to use regional/local ocean models to provide valuable, high-resolution information to governments, stakeholders and coastal engineers (Ådlandsvik and Bentsen 2007; Melsom et al. 2009; Holt et al. 2010; Brown et al. 2010).

Meteorologists have hypothesised that climate change may be leading to increased storminess within the NE Atlantic, and a string of empirical studies (Schinke 1993; Schmith et al. 1998; Alexandersson et al. 2000; Kjellström 2004; Beniston et al. 2007; Kjellström et al. 2011) have investigated whether the storm climate of the NE Atlantic and northern Europe has actually changed over the past century or will change within the next century. The results of these studies show a wide spread and do not even agree in sign. Nevertheless, there is some suggestion that the magnitude and frequency of storms may indeed have increased over the past century, and especially during the past thirty years. Lehmann et al. (2011) recently showed, using reanalysis data for the Western Baltic Sea, a shift in the peak of the number of strong wind days from November/December to January/February. However, the analysis did not show a change in the total number of strong wind days. A further recent study by Nikulin et al. (2011), taking an ensemble of global models into account, confirmed the high uncertainty in possible changes of extremes over Europe. Within the ensemble, only the southern part of the Baltic Sea showed a consistent increase in strong wind events.

This study focuses on changes in the sea level extremes for the western part of the Baltic Sea. The Baltic Sea is a marginal, semi-enclosed water body with a highly stratified water column. The mean circulation of the Baltic is governed by a weak estuarine flow driven by the fresh water excess, which is mainly supplied by the discharge of the rivers surrounding the Baltic Sea (Wyrtki 1954; HELCOM 1986; Feistel et al. 2008). The Western Baltic Sea in particular, the transition region between the North Sea and the Baltic Sea, is a challenging region characterised by complex bathymetry, intermittent stratification, gravity flows and fronts (Fennel and Sturm 1992; Schmidt et al. 1998; Lass and Mohrholz 2003; Burchard and Rennau 2008). The intermittent inflows of high-saline water from the Kattegat, passing the Western Baltic Sea, maintain the salinity balance in the whole Baltic (Matthäus and Franck 1992). These inflows are also an important driver of the deep water ventilation and are necessary to establish and maintain the permanent salinity stratification in the Baltic Sea (Meier et al. 2006b). The general stratification of the Baltic Sea consists of brackish surface water, separated by a permanent halocline at a depth of 35–70 m from a dense bottom water pool in the different sub-basins. The salinity decreases from the entrance of the Baltic (Kattegat, 33 g/kg) toward the northern part of the Baltic Sea (Bothnian Bay, 3 g/kg). This difference causes a constant sea level difference of approximately 35 cm between both ends of the Baltic Sea (Ekman and Mäkinen 1996). The water level within the Baltic Sea is mainly controlled by large-scale atmospheric patterns, with a significant correlation to the North Atlantic Oscillation (NAO) index (Heyen et al. 1996; Gustafsson 1997; Anderson 2002; Jevrejeva et al. 2005).
Moreover, there is a pronounced annual cycle of the sea level variability, with maximum variance in late autumn to early winter (Samuelsson and Stigebrandt 1996; Hünicke et al. 2008), which is related to winter storm activity. As the Baltic Sea is a semi-enclosed basin, standing waves with maximum amplitudes at the extreme ends and a node in between (Neumann 1941; Samuelsson and Stigebrandt 1996) are possible; such waves can also occur in sub-basins (Jönsson et al. 2008). These seiches additionally contribute to extreme sea levels.

In recent years, numerous studies dealing with sea level extremes in the Baltic Sea have been published. Some of these studies were based purely on the analysis of observations (Baerens and Hupfer 1999; Omstedt et al. 2004; Barbosa 2008; Hanson and Larson 2008; Kowalewska-Kalkowska and Wisniewski 2009), but the majority used a combination of observations and numerical modelling (Suursaar et al. 2003; Johansson et al. 2004; Meier et al. 2004; Hünicke et al. 2008). The projections based on these studies vary. Baerens and Hupfer (1999) found that storm surges on the German Baltic coast would not change significantly. Similar results have been reported for the North Sea (see Langenberg et al. 1999). However, Meier et al. (2004) discussed the problem that the models of Baerens and Hupfer (1999) or Langenberg et al. (1999) were based on a statistical description and could therefore not reproduce the change in the high intra-monthly percentiles relative to the mean level. Therefore, these studies found no significant changes in storm surges at all. Recent downscaling experiments to estimate changes in storm surge heights for the North Sea showed significant changes (Woth et al. 2006; Weisse et al. 2009). Furthermore, Johansson et al. (2004) concluded that the past negative trend in mean sea level in the Gulf of Finland will not continue in the future, because an accelerated global average sea level rise will offset the land uplift. Meier et al. (2004) showed that the uncertainty in the large scale forcing is still high, but they concluded that sea level extremes might increase more than monthly mean sea level. Furthermore, their results showed that medium scale models are able to reproduce surge heights for most parts of the Baltic Sea. The only region lacking agreement was the Western Baltic Sea. Thus, they suggested using a higher spatial resolution for this part of the Baltic Sea.

This paper uses dynamical downscaling to regionalise present climate and future global climate scenarios for the Western Baltic Sea. The final spatial resolution of the model is approximately 1 km to allow a realistic description of topographic features like sills, sounds, and coastlines. Although the model is used to give a statistical description of the surge levels, a recent hindcast simulation (Burchard et al. 2009) with a setup identical to the one used in this study gave good agreement with observed levels. In the first part of the paper, downscaling experiments of the last 40 years are discussed. Even though results from climate projections of two greenhouse gas emission scenarios proposed by the Intergovernmental Panel on Climate Change (IPCC 2007) are shown, in a later section the focus is on robust measures for handling uncertainties in the projections. Because the uncertainty of mean sea level rise is still high (Rahmstorf 2007; Lowe and Gregory 2010; Radic and Hock 2011), sensitivity experiments are used to estimate the impact of changes in mean sea level and wind speed on the storm surge heights. Furthermore, the analysis of present day surge levels and projections does not only focus on stations, but rather tries to give a spatial description of the surges.

2 Methods and data

2.1 The forcing scenarios

The meteorological forcing for our simulations was provided by the dynamic downscaling carried out with the Climate Limited-area Model, CLM (CLM 2008), the climate version of the operational weather forecast model of the German Weather Service. The horizontal resolution of the CLM is about 18 km (this is high enough to capture the effect of land/sea transition) and the time resolution is 3 h for all necessary meteorological variables (10 m wind, air temperature, dew point temperature, cloud cover, air pressure and precipitation). The global climate model is ECHAM5/MPI-OM (Jungclaus et al. 2006). The forcing data set covers the period from 1960 to 2100. It is divided into the reference period (C20) covering the years 1960–2000 and the two greenhouse-gas emission scenarios, A1B and B1 (2001–2100). Here, the A1B scenario is the more pessimistic one, whereas the B1 scenario is more optimistic, with a reduction in CO2 emissions. For each scenario, two realisations are available; thus, 480 years of simulation data are accessible.

The oceanic boundary conditions are taken from the transient Modular Ocean Model, MOM-3.1 (Griffies et al. 2001) simulations of Neumann (2010), covering the entire Baltic Sea and parts of the North Sea with a horizontal resolution of 3 nm and 77 vertical (geopotential) grid layers, with a near-surface resolution of 2 m, increasing with depth.

2.2 The local ocean model

The General Estuarine Transport Model [GETM, Burchard and Bolding (2002); Burchard et al. (2009)], which has been used for the present numerical study, combines the advantages of bottom-following coordinates with the turbulence module of the General Ocean Turbulence Model [GOTM, Umlauf et al. (2006)]. GETM is a three-dimensional free-surface primitive equation model using the Boussinesq and boundary layer approximations. Vertical mixing is parameterised by means of a two-equation \(k-\varepsilon\) turbulence model coupled to an algebraic second-moment closure (Canuto et al. 2001; see also Burchard and Bolding 2001). Due to the high importance of gravity flows in the Western Baltic Sea, bottom-following coordinates are advantageous, as they do not need an additional overflow parameterisation as in geopotential ocean models (Beckmann and Döscher 1997). For the climate simulations we employ the same setup as described by Burchard and Rennau (2008) or Burchard et al. (2009), but we briefly summarise the main settings. For horizontal discretisation, a bathymetry with a resolution of approximately 1 km (426 × 469 × 35 grid points, 60% active water points) is used. Since numerical mixing is significant even at such a resolution (Burchard and Rennau 2008), explicit horizontal mixing is neglected. Further, bottom- and surface-fitted vertical coordinates with 35 vertical layers and a horizontally homogeneous bottom layer thickness of 0.4 m are applied, such that the flow can smoothly advect along the bed. Although we use an ocean model with a wetting and drying algorithm, no changes in the coastline due to the increase in sea level are considered. Thus, the inundation of land due to flooding is not taken into account (fixed coastline). We further use a bottom roughness length of \(z_0 = 0.001\) m, which gave the right propagation speed of Baltic inflows (Burchard et al. 2009).

To force our high-resolution local model along the open boundaries (Fig. 1), four-hour mean profiles of temperature and salinity are extracted from the MOM simulations. Additionally, the sea surface elevation and the depth-averaged currents are extracted with a temporal resolution of one hour. The depth-averaged currents and sea surface elevation are combined in a Flather boundary condition (Flather 1976). Although the eastern open boundary is close to the region of interest, this coupling of MOM and GETM gave good agreement between observations and simulations in previous studies (Burchard and Rennau 2008; Burchard et al. 2009). The simulations of Neumann (2010) do not include the effect of sea level rise; it therefore has to be added explicitly. A sensitivity run of MOM in which the sea level rise was included did not yield significantly different results compared to the MOM run without sea level rise (T. Neumann, personal communication). To include the sea level rise, we follow the projections of the IPCC, where the possible range for the A1B scenario is given as 21–50 cm and for the B1 scenario as 18–38 cm. In our experiments, we have chosen a sea level rise of 50 cm for the A1B scenario and of 25 cm for the B1 run. However, as the results of Radic and Hock (2011) show, the contribution of melting glaciers and ice caps is still associated with large error bars. In contrast to the original settings of Burchard and Rennau (2008) or Burchard et al. (2009), the time step was increased to 15 s for the barotropic and to 375 s for the baroclinic mode. These settings are close to the stability criterion, but allow for fast time stepping. To keep the simulation computationally feasible, the whole domain was decomposed into 251 active sub-domains (21 × 22 × 35 grid points).

Fig. 1

Model domain, open boundaries, and location of the Western Baltic Sea. The grey shading indicates the depth below mean sea level in metres, with a contour level spacing of 10 m. Upper panel: a map of the whole Baltic Sea with the location of the station SLA-Landsort. Additionally, the boundaries of the local model domain are indicated. The Danish observation stations are: DBA-Ballen, DHO-Hornbæk and DGE-Gedser. The Swedish stations are SYS-Ystad and SSI-Simrishamn. The stations located in Germany are: GFL-Flensburg, GKI-Kiel, GLU-Lübeck, GWI-Wismar, GWA-Warnemünde, GST-Stralsund, GGR-Greifswald, GSA-Sassnitz, GKO-Koserow and GUE-Ückermünde. Further, GB denotes the Great Belt and OS the Øresund. The thick green lines indicate the location of the open boundaries. The dashed thin green line shows the region where spatial data are analysed

In this study, we use only the sea surface elevation data from the model output. They have an hourly resolution at the stations indicated in Fig. 1. For the spatial analysis, three-hourly fields are used. To reduce the storage space, the spatial data are only stored for a subregion of the whole model domain. This subregion is indicated in Fig. 1 by the green dashed line. For an analysis/validation of the baroclinic fields, the interested reader is referred to Gräwe and Burchard (2011).

The downscaling simulations presented here are fully baroclinic. Thus, changes in the water balance due to changes in the fresh water fluxes or due to steric effects are included. As the scope of this paper is sea level extremes, we believe that changes due to steric effects or changes in water fluxes are of minor importance compared to changes in storm surge heights due to sea level rise or changes in winds. Furthermore, as GETM has two open boundaries, it is not fully mass/volume conservative. Therefore, these changes will not be discussed in this manuscript.

2.3 Sensitivity experiments

To understand the impact of different forcing factors on the simulation results, sensitivity experiments are carried out. Here, the focus is on the variation of the sea level rise and mean wind speed. In a first set of experiments, the sea level at the boundary is raised by a constant offset of 40 cm (C20SL04) and of 80 cm (C20SL08) for the period 1970–2000. This procedure keeps the barotropic gradient between the Kattegat and the Baltic Sea constant. The intention of these sensitivity experiments is to study the interaction between variations in mean sea level and surge height. As shown by Prandle and Wolf (1978) or Jones and Davies (2007), the nonlinear interaction can lead to significant deviations from a linear assumption.

The second set of sensitivity studies deals with the increase in wind speed. Because there is no overall agreement on future changes in the storminess (duration, frequency or intensity), in the experiment C20U05 the wind speed is increased by 5%, which is in the range of possible changes (Räisänen et al. 2004; Jacob et al. 2007; Christensen and Christensen 2007; Kjellström et al. 2011).

As we cannot rerun the outer model (MOM), these sensitivity studies are limited to the Western Baltic Sea setup. For changes in mean sea level, this limitation can be regarded as unproblematic. However, for the variation of wind speed (C20U05), there are significant contributions, especially for NE-wind, in surge height, because for NE-wind, we have the longest possible fetch in the Baltic Sea. However, the results of Nikulin et al. (2011) indicate that only the southern part of the Baltic Sea might be affected by a future increase in wind speeds and thus the length of the fetch is of less importance. Nevertheless, limiting the changes in wind speed (C20U05) only to the Western Baltic Sea, might lead to an underestimation of storm surge heights in the local model. In Table 1 the settings of the sensitivity experiments are summarised.

Table 1 Settings of the sensitivity experiments covering the period 1970–2000

2.4 Data

Sea level records have been obtained for 15 sites around the Western Baltic Sea (Fig. 1). The sea level records have been converted into the same format and referenced to universal time +0 h and chart datum. The mean of all time series has been removed to avoid biases due to different national reference heights, as we are only interested in the deviations from the mean and not in the absolute height. The data are available with an hourly resolution and have been rigorously checked for common errors such as data spikes. Spurious records have been excluded. As missing data can bias trend estimates, particularly in short records, years have only been included in the analysis if they were at least 75% complete. The observed time series have been separated into their three components: mean sea level, astronomical tide, and non-tidal residuals. This separation is performed by means of a separate tidal analysis for each calendar year, using the harmonic tidal analysis MATLAB toolbox T_TIDE of Pawlowicz et al. (2002).
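The completeness criterion and the mean removal can be sketched in a few lines of Python. This is a minimal illustration only and not the processing chain applied to the gauge records: the function names, the use of `None` for missing hourly values and the fixed 8760-hour year are assumptions made here.

```python
from statistics import fmean

HOURS_PER_YEAR = 8760  # assumption: a non-leap year of hourly samples


def is_year_usable(hourly_levels, min_fraction=0.75, expected=HOURS_PER_YEAR):
    """Return True if at least `min_fraction` of the expected hourly values are present."""
    valid = sum(1 for v in hourly_levels if v is not None)
    return valid / expected >= min_fraction


def remove_mean(levels):
    """Subtract the record mean (ignoring gaps) so that only deviations remain."""
    mean = fmean(v for v in levels if v is not None)
    return [None if v is None else v - mean for v in levels]


# Hypothetical year with 30% of the hours missing: rejected by the 75% rule.
year = [10.0] * 6132 + [None] * 2628
print(is_year_usable(year))
print(remove_mean([1.0, None, 3.0]))
```

Gaps are kept in place rather than dropped, so that the time axis of the hourly record stays intact for the later tidal analysis.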

In Fig. 2, the coverage of available observation data is summarised. The shortest time series cover 22 years; the longest has a length of 45 years. The gauge data at DBA are only used for tidal analysis and not for estimation of surge levels.

Fig. 2

Availability of the sea level observation records. Only years that are at least 75% complete are counted. See Fig. 1 for the location of the gauges

2.5 Estimation of return periods

To estimate the return periods and return levels of storm surges, it is common to use Extreme Value Theory (Coles 2001), by applying, for instance, the Annual Maxima (AM) method. Here, the highest annual value of the surge level is picked out of a time series and, afterwards, an appropriate extreme value distribution is fitted to the annual maxima. A possible choice is the generalized extreme value distribution (GEV) with the cumulative distribution function

$$ F(x;\mu,\sigma,\xi) = \exp\left\{-\left[1+\xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}, $$
(1)

where μ is the location parameter, σ the scale parameter and ξ the shape parameter. The shape parameter ξ governs the tail behaviour of the distribution. The sub-families defined by ξ→ 0, ξ > 0 and ξ < 0 correspond, respectively, to the Gumbel, Fréchet, and Weibull families (Coles 2001), whose cumulative distribution functions are displayed in Eqs. 2–4. In Fig. 3a the density functions are visualised.

  1. Gumbel distribution or type I extreme value distribution for ξ→ 0:

    $$ F(x;\mu,\sigma)=\exp\left\{-\exp\left\{-\frac{x-\mu}{\sigma}\right\} \right\}. $$
    (2)
  2. Fréchet distribution or type II extreme value distribution for ξ > 0:

    $$ F(x;\mu,\sigma,\alpha)=\left\{ \begin{array}{ll} 0, & x\leq \mu ,\\ \exp\left\{-\left(\frac{x-\mu}{\sigma}\right)^{-\alpha}\right\}, & x>\mu. \end{array} \right. $$
    (3)
  3. Weibull distribution or type III extreme value distribution for ξ < 0:

    $$ F(x;\mu,\sigma,\alpha)=\left\{ \begin{array}{ll} \exp\left\{-\left(-\frac{x-\mu}{\sigma}\right)^{\alpha}\right\}, & x<\mu, \\ 1, & x\geq \mu. \end{array} \right. $$
    (4)
Fig. 3

a Probability distributions of limiting distributions of the GEV for an arbitrary choice of parameters, with μ = 0.6, σ = 0.2, and ξ varying as 0.4, 0, −0.4 and b visualisation of tail behaviour in a return-value plot

A further important property of the three types of limiting distributions is the tail behaviour, which is shown in Fig. 3b. If the shape parameter ξ > 0, then the GEV distribution is said to be heavy tailed. Because its probability density function decreases at a slow rate in the upper tail, the moments of the GEV are infinite for orders greater than 1/ξ (e.g., the variance is infinite if ξ > 1/2; the mean is infinite if ξ > 1). If ξ < 0, then the distribution has a bounded upper tail and thus a limiting maximum extreme value. The case of ξ = 0 in Eq. 1, obtained by taking the limit of the general expression as ξ→ 0, is termed the Gumbel distribution (i.e., an unbounded, thin tail) with an exponential tail roll-off.
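The three tail regimes can be made concrete with a short Python sketch of Eq. 1. It is only an illustration: the function names are chosen here, the parameter values are the arbitrary ones from the caption of Fig. 3, and the upper end point μ − σ/ξ for ξ < 0 follows from requiring the bracket in Eq. 1 to remain positive.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function (Eq. 1), with the Gumbel limit for xi == 0."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_upper_bound(mu, sigma, xi):
    """Finite upper end point mu - sigma/xi for xi < 0; unbounded otherwise."""
    return mu - sigma / xi if xi < 0.0 else math.inf


mu, sigma = 0.6, 0.2  # values from the caption of Fig. 3
for xi in (0.4, 0.0, -0.4):
    print(xi, round(gev_cdf(1.0, mu, sigma, xi), 4), gev_upper_bound(mu, sigma, xi))
```

For ξ = −0.4 the upper end point is 0.6 + 0.2/0.4 = 1.1, so no level above 1.1 can occur under this parameter choice, whereas the other two cases have no upper limit.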

Although the AM-method is widely used (Carter and Challenor 1981; Coles 2001; Katz et al. 2002; Bernier et al. 2007), there have been several developments to improve extreme value estimation by incorporating more data than just the annual maximum. One such approach is based on the r-largest order statistics within a block (i.e. a year) for small values of r (Sobey and Orloff 1995; Soares and Scotto 2004; An and Pandey 2007; Haigh et al. 2010b). The origins of the use of the asymptotic distribution of the r-largest-order statistics method can be traced back to Weissman (1978). Smith (1986) proposed a method for extending the classical analysis for the case when the r-largest values are available for each year, but he only considered these ideas for the limiting Gumbel distribution. Extensions for the GEV distribution for the r-largest-order statistics have been considered by Tawn (1988).

An alternative approach that enables the modelling of extreme value data is the peak over threshold method (POT) (Coles 2001; Naess and Clausen 2002; van den Brink et al. 2005; Letetrel et al. 2010). The POT approach originated in hydrology several decades ago (Shane and Lynn 1964; Todorovic and Zelenhasic 1970). Its rationale is that if additional information about the extreme upper tail were used, besides the annual maxima (i.e., other relatively high values in the sample), then more accurate estimates of the parameters and quantiles of extreme value distributions would be obtained. Thus, all data that are above a certain threshold are used and fitted by a Generalised Pareto distribution (GPD). The GPD has the cumulative distribution function

$$ F(x;\mu,\sigma,\xi)=\left\{ \begin{array}{lll} 1 - \left(1+ \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi} &\hbox{for} & \xi \neq 0, \\ 1 - \exp \left(-\frac{x-\mu}{\sigma}\right) & \hbox{for} & \xi = 0, \end{array} \right. $$
(5)

where μ is the threshold, σ the scale parameter and ξ the shape parameter as defined above. Statistical properties of the threshold approach have been discussed in detail by Davison and Smith (1990) or by Leadbetter (1991). However, the POT approach converges to the GEV as shown by Pickands (1975).
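Both branches of Eq. 5 can be written as one small Python function. This is a sketch under stated assumptions: the name `gpd_cdf` and the handling of values outside the support are choices made here, not part of the original method description.

```python
import math


def gpd_cdf(x, threshold, sigma, xi):
    """Generalised Pareto CDF (Eq. 5) for a level x relative to the threshold."""
    y = (x - threshold) / sigma
    if y < 0.0:
        return 0.0  # below the threshold no excess is defined
    if xi == 0.0:
        return 1.0 - math.exp(-y)
    t = 1.0 + xi * y
    if t <= 0.0:
        return 1.0  # only reachable for xi < 0: x lies above the finite end point
    return 1.0 - t ** (-1.0 / xi)


# Hypothetical threshold of 1.0 m and scale of 0.2 m.
for xi in (0.1, 0.0, -0.1):
    print(xi, round(gpd_cdf(1.5, 1.0, 0.2, xi), 4))
```

As for the GEV, the sign of ξ decides whether the fitted excess distribution is heavy tailed, exponential or bounded.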

Unlike the AM-method, the r-largest value and POT methods require the choice of an additional parameter: the value of r for the r-largest value method and the excess threshold for POT. These parameters have to be defined in advance and will influence the surge level estimates.

As with the POT method, the quality of the r-largest statistic depends heavily on the choice of r (Coles 2001). The values used in the literature vary strongly: Tawn (1988) used values of r in the range 3 < r < 7 in the North Sea, Coles (2001) used r = 10 for Venice surge data, Soares and Scotto (2004) used r = 5 for the North Sea, Butler et al. (2007) used r = 20 for the southern North Sea, and Haigh et al. (2010b) used r = 8 for extremes in the English Channel. Thus, the number of annual largest surge levels has to be estimated separately for the Western Baltic Sea. A possible way to estimate r is to look at the convergence of the parameters μ, σ and ξ (Coles 2001). Such a convergence table is given in Table 2 for a single station. Here one can see that for r = 5, ξ and σ show a local minimum; hence r = 5 might be the appropriate choice for this station. A second criterion is that r should be selected such that it minimizes the variance associated with a required quantile estimate. Thus, there is a bias-variance trade-off associated with the number of order statistics r used in the analysis. A small value of r can result in large variance, but a large r is likely to cause a bias (Smith 1986).

Table 2 Convergence of GEV-parameters ξ, σ and μ, for station GWA
Table 3 Amplitude of tidal constituents at different sea level gauges in centimetre

The POT is a useful alternative to the popular GEV method in the field of extreme value estimation. However, the threshold sensitivity of quantile estimates is a topic of concern. Experience suggests that a very high threshold resulting in a small POT sample would increase the sampling uncertainty (variance) associated with a quantile estimate. On the other hand, as the threshold is lowered to include more data, quantile bias tends to increase (Smith 1986). In this sense, it is expected that an optimal threshold might exist that would minimise both bias and variance. However, popular choices for the threshold are, for instance, the 95, …, 99, 99.5th percentile of annual sea level elevations (Table 3).
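How quickly the POT sample shrinks with a rising percentile threshold can be illustrated with a small Python sketch. The helper names and the linear-interpolation percentile rule are assumptions made for this illustration, the input series is synthetic, and no declustering of dependent exceedances is performed.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def exceedances(levels, q):
    """Return the threshold at percentile q and all values above it."""
    u = percentile(levels, q)
    return u, [v for v in levels if v > u]


# Synthetic record of 1000 values: a higher threshold leaves fewer values to fit.
levels = [float(i) for i in range(1, 1001)]
for q in (95.0, 99.0, 99.5):
    u, exc = exceedances(levels, q)
    print(q, u, len(exc))
```

With a record of realistic length, the same trade-off applies: the 99.5th percentile retains only a tenth of the sample that the 95th percentile does, which is the source of the increased variance discussed above.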

To finally estimate the return level \(\zeta_{m}\) associated with the return period 1/m, where m is the annual exceedance probability (to a reasonable degree of accuracy, the level \(\zeta_{m}\) is expected to be exceeded on average once every 1/m years), it is necessary to invert Eq. 1 by solving \(F(\zeta_m) = 1-m\):

$$ \zeta_m=\left\{ \begin{array}{lll} \mu - \frac{\sigma}{\xi}\left(1-\left[-\log(1-m)\right]^{-\xi}\right), &\hbox{for} & \xi \neq 0, \\ \mu - \sigma\log\left(-\log(1-m)\right), & \hbox{for}& \xi = 0. \end{array} \right. $$
(6)
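As a cross-check of the inversion, the return level can be computed directly by solving F(ζ) = 1 − m with F from Eq. 1. The following Python sketch does this; the function name and the parameter values are hypothetical and serve only as an illustration.

```python
import math


def return_level(m, mu, sigma, xi):
    """Level exceeded with annual probability m (return period 1/m years).

    Obtained by solving F(zeta) = 1 - m, with F the GEV distribution of Eq. 1.
    """
    if not 0.0 < m < 1.0:
        raise ValueError("m must lie strictly between 0 and 1")
    y = -math.log1p(-m)  # -log(1 - m), accurate for small m
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu - sigma / xi * (1.0 - y ** (-xi))


# Hypothetical GEV parameters in metres; m = 0.01 corresponds to 100 years.
for m in (0.1, 0.02, 0.01):
    print(round(1.0 / m), round(return_level(m, 1.0, 0.2, 0.1), 3))
```

Inserting the result back into Eq. 1 should return 1 − m, which is a convenient test of any implementation.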

2.6 Seasonality

Most of the storm activity takes place during the autumn and winter periods (Fig. 4a) (Samuelsson and Stigebrandt 1996; Hünicke et al. 2008; Lehmann et al. 2011). If the analysis is based on calendar years, there is a risk that extreme events that occur during the same season (December to February) will be assigned to different years. In addition, events occurring during January-February in one year will belong to the same year as an event occurring in December of the same year. Thus, the division of the time series into calendar years is not justified. Instead, the data series were divided into what is often referred to as a climatic year. In our case, a climatic year starts on June 1st of a particular year and ends on May 31st of the following year.
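The assignment of a date to its climatic year is a one-line rule, sketched here in Python. Labelling a climatic year by the calendar year in which it starts is an assumption of this sketch; the text only fixes the start and end dates.

```python
from datetime import date


def climatic_year(day, start_month=6):
    """Label of the climatic year (1 June to 31 May) that contains `day`.

    The label is the calendar year in which the climatic year starts.
    """
    return day.year if day.month >= start_month else day.year - 1


print(climatic_year(date(1995, 12, 20)))  # 1995
print(climatic_year(date(1996, 2, 3)))    # 1995, same winter season
print(climatic_year(date(1996, 6, 1)))    # 1996
```

With this rule a storm in December and one in the following February fall into the same block, so the annual maximum of a winter season is not split across two years.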

Fig. 4

a 95th percentile of observed monthly sea level and b deviation of the C20 simulations from the observations in percent

In Fig. 4b the deviations of the C20 simulations from the observed annual cycle are given. For all three stations, the simulations show a consistent underestimation of the 95th percentile in the winter season of up to 8%, whereas in summer the deviations are less than 6%. The bias is most pronounced for the station GKO, where the impact of the eastern model boundary is greatest. Thus, this mismatch might be an indication that the large scale Baltic Sea model (MOM) deviates from the observed annual cycle, thereby causing the bias in the local model. However, for the inner waters of the Western Baltic Sea (GFL and GWA) the agreement is good.

To further investigate whether the deviations in the annual cycle are caused by the driving oceanic model or by the atmospheric model, Fig. 5a shows the annual cycle of the mean wind speed and its 95th percentile for the station GWA. Clearly, the wind speeds (mean and 95th percentile) are higher in the winter season. Looking at the deviations of CLM compared to the observations (Fig. 5b), the mean wind speed shows a nearly constant underestimation of 2%, which, however, can be expected from a "coarse" atmospheric climate model. For the 95th percentile, a tendency to give lower values during winter and higher values during summer can be seen. Thus, the deviation in monthly mean surge levels as seen in Fig. 4 can consistently be explained by the deviations of the annual cycle in the atmospheric model.

Fig. 5

a Mean wind speed and 95th percentile of observed monthly wind data. b Deviation of the C20 simulations from the observations in percent. Data are shown for the station GWA for the period 1961–2000

3 The present day climate 1961–2000

Generally, the observed sea level \(\zeta(t)\) at a given location and time can be considered as:

$$ \zeta(t) = \langle\zeta\rangle + \zeta_{tide} + \zeta_{surge} + \zeta_{NL} + \zeta_{ext} $$
(7)

where \(\langle\zeta\rangle\) is the mean sea level over some suitably long period (steric, i.e. density induced effects, variations due to changes in the evaporation/precipitation balance), \(\zeta_{tide}\) is the astronomically generated component, \(\zeta_{surge}\) is the meteorologically generated component due to storms, \(\zeta_{NL}\) is the contribution arising from interactions between the components due to non-linear dynamical processes in shallow water, and \(\zeta_{ext}\) is the contribution due to large scale effects, like the filling state of the Baltic Sea. One might also include in \(\zeta_{ext}\) the contributions of seiches or baric lows as discussed by Wiśniewski and Wolski (2011). This separation is useful but imperfect because the mean sea level also includes averaged effects of storm surges, and long-period tides include meteorologically generated contributions, so that care is needed to avoid double accounting.

3.1 Tides

Typically, the tides show no important influence along the German Baltic Sea coast, with an M2 tidal amplitude of less than 5 cm (Fig. 6). Only in the Great Belt and in the Kattegat can significant contributions be seen, with an M2 tidal amplitude greater than 15 cm.

Fig. 6

Amplitude of the simulated M2 tide in centimetres. The contour level spacing is 1 cm

To show that GETM can reproduce the observed tidal amplitudes, a comparison of observations and simulations of four tidal constituents (M2, N2, S2 and K1) is given in Table 3. The comparison shows that the local model is able to reproduce the tidal amplitudes in the Western Baltic Sea.

3.2 Correlation time

Consider a time series of sea levels comprising a sequence of separate, independent storms, each having an average standard storm length T. Then, if only the maximum value within each storm is extracted, the r largest such values for the year are the required r largest independent annual events, under the T-hour storm length assumption. Clearly, it is important to estimate T accurately: if the estimate is too small, events which are actually dependent may be included, whereas if the estimate is too large, events which are actually independent may be excluded, such that the precision of the procedure would be reduced.
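One simple way to realise this selection is a greedy pass over the samples of one year, sketched below in Python. It is an illustrative variant and not necessarily identical to the procedure used in this study; the function name, the argument names and the synthetic series are assumptions.

```python
def r_largest_independent(times_h, levels, r, storm_length_h):
    """Pick the r largest values that are at least `storm_length_h` hours apart.

    Greedy selection: visit the samples from highest to lowest and keep a
    sample only if no already accepted peak lies within one storm length.
    """
    order = sorted(range(len(levels)), key=lambda i: levels[i], reverse=True)
    picked = []
    for i in order:
        if all(abs(times_h[i] - times_h[j]) >= storm_length_h for j in picked):
            picked.append(i)
            if len(picked) == r:
                break
    return [(times_h[i], levels[i]) for i in picked]


# Synthetic hourly series with two storms, 100 h apart.
levels = [0.0] * 200
levels[49], levels[50], levels[51], levels[150] = 1.4, 1.5, 1.4, 1.2
times = list(range(200))
print(r_largest_independent(times, levels, 2, 72))  # [(50, 1.5), (150, 1.2)]
```

The two flanks of the first storm (1.4 m at hours 49 and 51) are higher than the second storm peak, but they are discarded because they lie within one storm length of the accepted maximum.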

In order to extract the r-largest independent surge levels and to distinguish between two surge events, it is necessary to use the concept of the standard storm length, adopted by Tawn (1988). The standard storm length T can be estimated through the analysis of the autocorrelation function R(τ) of the process (Fig. 7a). A first approach to computing an integral time scale is simply to integrate the autocorrelation function:

$$ T_{int} = \int\limits_0^\infty R(\tau) d\tau. $$
(8)

Fig. 7 Autocorrelation R(τ) of hourly sea surface elevation at Rostock: a autocorrelation function with error bars and b decadal logarithm of R(τ) and fit of two exponential functions with time scales \(T_1\) and \(T_2\)
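
As an illustration of Eq. 8, the following minimal Python sketch estimates an integral time scale from an equally spaced series. The function names, the trapezoidal rule, and the truncation of the integral at the first zero crossing of R(τ) are assumptions made for this example (a finite record cannot be integrated to infinity); they are not taken from the analysis described here.

```python
def autocorrelation(x, max_lag):
    """Sample autocorrelation R(k) for lags k = 0..max_lag (in samples)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev)  # lag-0 sum, so that R(0) = 1
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


def integral_time_scale(r, dt=1.0):
    """Trapezoidal integral of R(tau), stopped at the first non-positive value.

    Stopping at the first zero crossing is an assumption of this sketch.
    """
    total = 0.0
    for k in range(1, len(r)):
        if r[k] <= 0.0:
            break
        total += 0.5 * (r[k - 1] + r[k]) * dt
    return total


# Hypothetical check with an idealised autocorrelation R(k) = 0.9**k.
r_ideal = [0.9 ** k for k in range(200)]
t_int = integral_time_scale(r_ideal, dt=1.0)
```

For hourly sea levels, `dt=1.0` returns the time scale in hours.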

For three stations on the German coast, \(T_{int}\) is approximately 2–3 days (Table 4), based on the analysis of observed time series (Fig. 2). The estimates based on the simulations agree well with this time scale. A second method to estimate this important time scale assumes that the autocorrelation function decays exponentially, so that the decay time gives the time scale we are looking for. For the Western Baltic Sea, the autocorrelation can be modelled by two exponential functions, as the logarithmic plot in Fig. 7b suggests. Thus, the autocorrelation function is expressed as:

$$ R(\tau) = a_1 e^{-\tau/T_1} + a_2 e^{-\tau/T_2}, $$
(9)

where τ is the time lag, \(T_1\) and \(T_2\) are the individual decay time scales, and \(a_1\) and \(a_2\) are fitting constants. The estimates based on the observations indicate that the two time scales are 2 days and 8–12 h. Again, the estimates based on the simulations yield similar results (Table 4). The fast decay, with values of 8–12 h, can be associated with the daily cycle (for instance land/sea breeze). The slow decay, with a decay time of 2 days, might be the time scale we are looking for: the decay of single sea level events.
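
One simple way to separate the two time scales of Eq. 9 is "exponential peeling": fit a straight line to ln R(τ) at large lags, where the slow term dominates, subtract that term, and fit the remainder at small lags. The sketch below is only an illustration under these assumptions; the function names, the lag windows, and the synthetic coefficients (0.6, 48 h, 0.4, 10 h) are hypothetical, and the fitting procedure actually used for Table 4 is not specified here.

```python
import math


def log_linear_fit(lags, values):
    """Least-squares fit of ln(values) = ln(a) - lag / T; returns (a, T)."""
    n = len(lags)
    logs = [math.log(v) for v in values]
    mx = sum(lags) / n
    my = sum(logs) / n
    sxx = sum((x - mx) ** 2 for x in lags)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lags, logs))
    slope = sxy / sxx
    return math.exp(my - slope * mx), -1.0 / slope


def peel_two_exponentials(lags, r, tail_start, head_end):
    """Estimate (a1, T1, a2, T2) of R = a1*exp(-t/T1) + a2*exp(-t/T2), T1 > T2."""
    # Slow component: fitted where the fast term has decayed away.
    tail = [(t, v) for t, v in zip(lags, r) if t >= tail_start and v > 0.0]
    a1, t1 = log_linear_fit([t for t, _ in tail], [v for _, v in tail])
    # Fast component: what remains at short lags after removing the slow term.
    head = []
    for t, v in zip(lags, r):
        rest = v - a1 * math.exp(-t / t1)
        if t <= head_end and rest > 0.0:
            head.append((t, rest))
    a2, t2 = log_linear_fit([t for t, _ in head], [v for _, v in head])
    return a1, t1, a2, t2


# Synthetic autocorrelation with hypothetical time scales of 48 h and 10 h.
lags = list(range(0, 151))
r = [0.6 * math.exp(-t / 48.0) + 0.4 * math.exp(-t / 10.0) for t in lags]
a1, t1, a2, t2 = peel_two_exponentials(lags, r, tail_start=60, head_end=20)
```

A nonlinear least-squares fit of both terms at once would be the more common choice; peeling is used here only because it needs no external library.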

Table 4 Correlation time of sea surface elevation at different gauges in hours

The estimation of T, the average surge duration, yields values of less than 3 days, based on two different methods. Stigge (1994) estimated the average duration of a storm surge in the Southern Baltic as 12–24 h. This is shorter than our estimates, but he considered the duration of the storm surge itself, whereas we are looking for a correlation time. Taking the larger value (\(T_{int}\) = 3 days) ensures that two events are not double counted. Thus, to separate two surge events and to extract the r-largest independent surge levels, the method is as follows:

  1. identify the largest value which is extracted from the series

  2. pick out the largest remaining three-hour values from within the climatic year of interest

  3. discard values with a lag of T/2 or less on either side of the value chosen in (2)

  4. repeat until the r-largest value is extracted.
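
The steps above can be sketched in Python as follows. This is a minimal illustration, assuming an equally spaced series for one climatic year and a storm length expressed in number of samples; the function name `r_largest_independent` and the example values are hypothetical.

```python
def r_largest_independent(levels, r, storm_length):
    """Pick up to r maxima that are more than storm_length / 2 samples apart.

    levels       : sea levels for one climatic year, equally spaced in time
    r            : number of independent events to extract
    storm_length : standard storm length T, in number of samples
    Returns a list of (index, value) pairs in order of decreasing value.
    """
    half = storm_length / 2.0
    available = [True] * len(levels)
    picked = []
    while len(picked) < r:
        candidates = [i for i, ok in enumerate(available) if ok]
        if not candidates:
            break  # fewer than r independent events in this year
        best = max(candidates, key=lambda i: levels[i])
        picked.append((best, levels[best]))
        # Discard everything within T/2 on either side of the chosen value.
        for i in range(len(levels)):
            if abs(i - best) <= half:
                available[i] = False
    return picked


# Hypothetical series with a storm length of 4 samples.
events = r_largest_independent([0, 1, 5, 4, 0, 0, 3, 0, 0, 2], r=3, storm_length=4)
```

With `r=1` the function returns only the annual maximum, which corresponds to the AM case.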

The same procedure is valid for the AM method. The surge duration criterion is not required for the POT-method because all data above the threshold are used.

3.3 Validation of surge heights

To estimate surge levels along the German coast (Fig. 1), we first compare surge level estimates from three methods (AM, r-largest, and POT) with varying parameters. The results are given in Table 5.

Table 5 Comparison of different methods to estimate the 30-year return surge level based on observations

The comparison indicates that all methods give similar results, except the r-largest method with r = 10, which shows a systematic underestimation. One can also see that the 95% confidence intervals are smallest for r = 10. Thus, the gain in confidence comes at the cost of a bias in the surge level estimate (Smith 1986). The POT estimates are stable under variation of the threshold (μ = 99th percentile or μ = 99.9th percentile of mean annual sea surface elevation). This is an indication that the tail behaviour is well sampled by the generalised Pareto distribution.

Because the r-largest approach with r = 5 gives reasonable results and the narrowest confidence intervals, and because the equivalence of the POT and the r-largest methods has been shown (Coles 2001; An and Pandey 2005; Haigh et al. 2010b), we use this approach in the remainder of the paper.

In Table 6, a comparison of estimated return levels, based on observation and simulation, is given for the 14 stations indicated in Fig. 1. For most of the stations, the difference between observed and simulated levels is less than ±5%. This shows that GETM can reproduce the present-day statistics of storm surges. However, a closer look at selected stations reveals some interesting differences. Results from stations GKO, SYS, SSI, and DTE agree well with the observed surge levels, indicating that the boundary conditions are correct. We conclude that MOM provides the right large-scale forcing in the Bornholm Basin (except for the deviations in the annual cycle, Fig. 4). One station that shows significant deviations is GWI, with an underestimation of over 15%, although the surge levels of the two nearest stations, GLU and GWA, are well reproduced. For GWI, problems might be caused by the still coarse topography. A similar issue can be seen for the station GUE, where the model overpredicts the surge levels by 10–20%. Because the comparison for the nearby station GKO shows excellent agreement, we believe that the overestimation is caused by the too-wide connection between the Baltic Sea and the Odra Lagoon. The connecting channels have a width of 1 km (grid resolution) in the model, whereas in reality this might be less than 150 m. Thus, the greater cross-sectional area leads to higher surge levels. A further point worth mentioning is that even with a resolution of 1 km, we can see the effect of the grid and of missing topographic features (Jones and Davies 2009), such as narrow, curvy fairways/channels or nearly dry/wet areas. Whereas the deviations from the observations vary in sign for most stations surrounding the Arkona Basin, the surge levels are underestimated at all western stations (GFL, GKI, GLU, and GWI). Thus, the surge is too strongly damped as it propagates westwards. However, a second possible explanation is the "coarse" resolution of the CLM: not all effects of the land-sea transition are resolved, so the peak velocities are underestimated, lowering the estimated surge heights.

Table 6 Comparison of surge heights for different return level at gauge stations in the Western Baltic Sea

Nevertheless, Table 6 indicates that GETM is able to reproduce the present surge levels. Again, we note that GETM is used to give a statistical description of the surge levels; therefore, we do not compare time series. However, a recent hindcast simulation with a setup identical to the one used in this study (Burchard et al. 2009) showed good agreement when observed and simulated sea levels were compared directly.

3.4 Spatial distribution of surge heights

In Fig. 8a, the spatial variation of the 30-year return height is shown. The highest surges in the Western Baltic Sea occur along the German coast, with high-impact regions around the stations GKO, GWI, and GLU. One can clearly see that the highest values occur at south-westerly coasts (peak values of 1.9 m), which are caused by north-easterly storms (Stigge 1994; Baerens and Hupfer 1999). In Fig. 8b, the differences between the two realisations C201 and C202 are shown. In most regions the difference is rather small (±2%). The highest values, of up to 6%, can be seen around stations GLU and GWI.

Fig. 8 a Spatial variation of the 30-year return level in metres for the period 1961–2000. The contour level spacing is 5 cm. b Difference of the 30-year return level in % between the two realisations of C20. The contour level spacing is 1%. c Spatial variation of the 100-year return level in metres for the period 1961–2000. d Spatial variation of the shape parameter ξ (Eq. 1) for the period 1961–2000

To show surge levels different from the often-used 30-year level, Fig. 8c depicts the spatial distribution of the 100-year return level. The pattern follows closely the distribution of the 30-year level, but with higher values (peak values of 2.1 m). Now also higher surges occur close to the Øresund and north of DGE, which might be caused by storms from SE/S.

Figure 8d shows the spatial distribution of the GEV shape parameter ξ. As discussed in Sect. 2.5, the sign of ξ controls the tail behaviour of the GEV. In most parts of the Western Baltic Sea ξ is negative, indicating an upper limit (an asymptotic value) for the surge height. However, in the Great Belt and in the entrance of the Øresund, ξ is positive, which leads to power-law (heavy-tailed) behaviour and slow convergence in the upper tail. There, no theoretical upper bound on the maximum surge height exists and high surge levels are likely.
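
The role of the sign of ξ can be made concrete with a short Python sketch. It assumes that Eq. 1 follows the common convention of Coles (2001), in which a negative ξ gives a bounded upper tail at μ − σ/ξ; the function names and all parameter values below are hypothetical and are not fitted values from this study.

```python
import math


def gev_return_level(mu, sigma, xi, period):
    """Level exceeded on average once every `period` years (Coles convention)."""
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)  # Gumbel limit for xi -> 0
    return mu - sigma / xi * (1.0 - y ** (-xi))


def gev_upper_bound(mu, sigma, xi):
    """Upper end point of the GEV: finite for xi < 0, infinite otherwise."""
    return mu - sigma / xi if xi < 0.0 else math.inf


# Hypothetical parameters (metres): a bounded tail and a heavy tail.
bounded_limit = gev_upper_bound(1.0, 0.2, -0.2)
bounded_30 = gev_return_level(1.0, 0.2, -0.2, 30)
bounded_100 = gev_return_level(1.0, 0.2, -0.2, 100)
heavy_limit = gev_upper_bound(1.0, 0.2, 0.1)
```

For the bounded case, return levels approach the limit as the return period grows; for a positive ξ they grow without bound.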

4 Benefit of high-resolution models

In the climate simulations of Meier et al. (2004), their 6 nm setup could not reproduce the surge level at the station GGR. Further, in their nearly 100-year hindcast simulations, the maximum sea level at the station GWA was 82 cm, which is only half the value observed (Table 6). They concluded that the deviations are caused by the coarse horizontal resolution: "Another obstacle is the underestimation of extremes in the western Baltic Sea. This problem will be solved when an increased grid resolution of the ocean model is used...". To better quantify the benefit of our high-resolution local model, we compare the return values for two stations, DHO and GWA, computed by GETM and by the outer driving model MOM (Neumann 2010). Additionally, we compare the return values for Landsort (SLA, Fig. 1), based on observations and MOM. The station Landsort is commonly used as a proxy for the filling state of the Baltic Sea; here, however, we use it to quantify the large-scale performance of MOM. Because the MOM results are only available every 6 h, we subsampled the GETM and observation time series accordingly. Table 7 indicates that MOM underestimates the surge heights at DHO and GWA. We assume that this is due to its coarse resolution. For instance, the MOM setup, with its B-grid, needs at least two grid cells to resolve a channel, hence a width of 6 nm. The same holds for the Great Belt. This implies that either the cross-sectional area of the Danish straits is changed (keeping the depth constant) or the depth of the straits has to be changed (keeping the cross-sectional area constant). Both changes alter the flow structure and the volume transport through the Danish straits. The piling up of water in the Danish straits is therefore reduced, because the cross-sectional area in the MOM setup is larger than in GETM. Thus, for an accurate modelling of storm surges in the Western Baltic Sea, a resolution of better than 1 km in the Danish Straits is required.

Table 7 Comparison of 30-year return surge heights for stations DHO, GWA, and SLA

The comparison for SLA indicates that MOM matches the 30-year return level. Therefore, the large scale forcing is well represented.

5 A historical surge revisited

The most severe storm surge along the German coast occurred in 1872, with peak values of over 3 m above mean sea level (Baensch 1875; Baerens and Hupfer 1999). During this event, 275 people drowned. Recently, there have been some successful attempts to reconstruct the atmospheric and oceanic conditions during this surge (Rosenhagen and Bork 2009). We use the results of Sect. 3.3 to estimate the return period of this event. In Table 8, the observed surge levels at three stations in the Western Baltic Sea are given. Using the parameter estimates underlying Table 6, the return periods for these stations can be estimated. The resulting values, larger than 100,000 years, indicate that the return periods cannot be computed based on present-day values. If one takes the possible uncertainty in the parameter estimation into account (at the 95% level), the return period ranges between 500 and 5000 years. This is still a wide range, but it underlines the exceptional character of this event.
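
The return period of a given level follows from the fitted distribution as 1/(1 − G(z)). The Python sketch below illustrates this, again assuming the Coles (2001) convention for the GEV; the function name and the parameter values are hypothetical. It also shows one possible reason why very large or undefined return periods can arise: with a negative ξ, a level above the upper end point has a formally infinite return period.

```python
import math


def gev_return_period(z, mu, sigma, xi):
    """Return period in years of level z under GEV(mu, sigma, xi)."""
    if abs(xi) < 1e-9:
        s = math.exp(-(z - mu) / sigma)  # Gumbel limit
    else:
        t = 1.0 + xi * (z - mu) / sigma
        if t <= 0.0:
            # Outside the support: above the upper bound (xi < 0) the level
            # is never exceeded; below the lower bound (xi > 0) it always is.
            return math.inf if xi < 0.0 else 1.0
        s = t ** (-1.0 / xi)
    exceedance = -math.expm1(-s)  # 1 - G(z), computed without cancellation
    return math.inf if exceedance == 0.0 else 1.0 / exceedance


# Hypothetical parameters (metres); the upper end point is 1.0 + 0.2 / 0.1 = 3.0.
period_moderate = gev_return_period(1.5, 1.0, 0.2, -0.1)
period_high = gev_return_period(2.0, 1.0, 0.2, -0.1)
period_beyond = gev_return_period(3.2, 1.0, 0.2, -0.1)
```

Because the result depends strongly on ξ near the end point, parameter uncertainty translates into a very wide range of return periods for extreme levels.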

Table 8 Estimation of return period of the storm surge of 1872 for the stations GFL, GLU and GGR

6 Climate projections for the Western Baltic Sea 2012–2100

6.1 Atmospheric forcing

To set the stage, an analysis of possible trends in annual wind speed deviates (mean speed \(\langle u \rangle\), standard deviation σ, 95th percentile and 99th percentile) for the model domain (water points only) is given in Table 9. At first glance, all wind deviates show a positive trend. To test whether the trends are significant, a Mann-Kendall trend test was used (Mann 1945; Kendall 1975). The results indicate that, especially for the A1B scenarios, a significant increase in the mean, variability, and strong wind events can be seen. This is not the case for the B1 scenarios. However, in the climate community there is no overall agreement on whether the frequency and intensity of storms will increase in a future climate (Räisänen et al. 2004; Christensen and Christensen 2007; Kjellström et al. 2011), although the A1B scenarios point in that direction. Nevertheless, recent downscaling experiments to estimate changes in storm surge heights for the North Sea also showed a significant increase in higher quantiles of wind speed and surge levels (Woth et al. 2006; Weisse et al. 2009).
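
For readers unfamiliar with the Mann-Kendall test, a minimal Python sketch is given below. It uses the normal approximation with continuity correction and, as a simplifying assumption, no correction for tied values; the function name and the synthetic wind series are hypothetical and do not reproduce Table 9.

```python
import math


def mann_kendall(x):
    """Mann-Kendall trend test without tie correction.

    Returns (S, Z, p), where p is the two-sided p-value from the
    normal approximation.
    """
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


# Hypothetical annual mean wind speeds (m/s) with a weak upward drift.
wind = [7.0 + 0.01 * k + 0.05 * ((k * 7) % 5 - 2) for k in range(30)]
s_wind, z_wind, p_wind = mann_kendall(wind)
```

A trend would be called significant at the 5% level if `p_wind < 0.05`.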

Table 9 Trends in wind speed deviates (mean speed \(\langle u \rangle,\) standard deviation σ, 95th percentile and 99th percentile) for the period 2001–2100

6.2 Nonlinear effects

An important factor affecting surge levels is the tide (Prandle and Wolf 1978; Jones and Davies 2007; Horsburgh and Wilson 2007). For instance, Prandle and Wolf (1978) and Horsburgh and Wilson (2007) showed that the timing between the tidal phase and the peak surge level can lead to significant nonlinear interaction, which can influence the surge level. Although the contributions of the tides are negligible in most regions of the Western Baltic Sea (Sect. 3.1), the mean sea level rise can be seen as a slow tide (with a period in the range of centuries). Thus, the expected rise in mean sea level might change the return level of present storm surges.

To study these nonlinear interactions, the two sensitivity experiments C20SL04 and C20SL08 are used (Table 1). Since the forcing in these experiments is the same as in the C20 run, with the exception of the mean sea level, the modification in surge heights can be studied.

In Fig. 9a, the 30-year return level is plotted against the rise in mean sea level for three selected stations (GFL, GWA, and GKO). In the simplest case, one could assume that a rise in mean sea level by 1 m also leads to a rise in surge level by 1 m, giving a linear relationship with a slope parameter of unity. As can be seen in Fig. 9a, this is not the case for the three stations GFL, GWA and GKO. For instance, the rise in surge level is much more pronounced for station GFL than for GKO. To better describe the modification of surge levels due to mean sea level rise, the spatial variation of the slope parameter is given in Fig. 9b. This can be read as follows: for the station GKO the slope is close to 0.9, indicating that a mean sea level rise of 1 m leads to a change in the 30-year return level of only 0.9 m, or that a 0.1 m mean sea level rise gives an increase in surge level of 0.09 m. For the western part, one can see slope parameters greater than unity and, therefore, higher surge levels due to mean sea level rise (at the station GFL, a sea level rise of 1 m gives surge levels that are 1.1 m higher than present-day values).
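
The slope parameter can be obtained by an ordinary least-squares fit of the return level against the prescribed rise. The Python sketch below illustrates this; the sea level rises of 0.4 m and 0.8 m are assumed from the experiment names C20SL04 and C20SL08 (Table 1 is not reproduced here), and the return levels are made-up numbers chosen to give a slope of 1.1.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Assumed mean sea level rise (m) for C20, C20SL04 and C20SL08.
rise = [0.0, 0.4, 0.8]
# Hypothetical 30-year return levels (m) at one station.
levels = [1.70, 2.14, 2.58]
slope, intercept = linear_fit(rise, levels)
```

A slope above unity, as in this made-up case, means that the surge level rises faster than the mean sea level; a slope below unity means the opposite.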

Fig. 9 a Linear fit of 30-year return level for sensitivity experiments C20, C20SL04, and C20SL08. b Spatial variation of the linear slope parameter. The contour level spacing is 1 cm

6.3 Sensitivity to changes in wind speed

Because there is no overall agreement whether the frequency and intensity of storms will increase in future climate or the mean wind speed will change (Räisänen et al. 2004; Christensen and Christensen 2007; Kjellström et al. 2011), the sensitivity experiment C20U05 (Table 1) is used to study the effect of a possible increase of wind speed by 5%. This value is motivated by the trend analysis in Table 9; specifically the A1B scenarios gave similar values. One can assume that an increase in wind speed will lead to higher surge levels, because more water is pushed against the coasts. To study the impact of changes in wind speed on surge levels, Brown et al. (2010) did similar experiments for the Irish Sea, as did Meier et al. (2004) for the whole Baltic Sea.

The changes in the 30-year return level for the Western Baltic Sea are shown in Fig. 10. One can clearly see the increase in surge levels at the south-westerly coasts, with values of 20 cm around GFL and 10 cm around GKO. Furthermore, the western part is more affected by changes in the wind speed than the region surrounding the Arkona Basin.

Fig. 10 Increase in 30-year return surge level in metres, compared to C20, when increasing the wind speed by 5%. The contour level spacing is 1 cm

6.4 Non-stationary extremes

In the reference runs C20, it was assumed that the statistics of the extremes could be described by a stationary process. Figure 11a, b indicates that for the period 1960–2000 this is a valid assumption. However, for the period 2000–2100, both scenarios show a trend in the annual maxima (due to mean sea level rise), therefore violating the stationarity assumption (Coles 2001).

Fig. 11 Time series of annual maximum surge heights for a the A1B scenarios and b the B1 scenarios, for the station GWA

For the Western Baltic Sea it seems plausible that the basic level of the annual maximum (r-largest) is changing linearly in time due to the mean sea level rise, but that in other respects the distribution is unchanged. Using the notation GEV(μ, σ, ξ) to denote the GEV distribution (Eq. 1) with parameters μ, σ and ξ, it follows that a suitable model for \(\zeta_t,\) the annual maximum (r-largest) sea level in year t, might be

$$ \zeta_t \sim GEV(\mu(t),\sigma,\xi), $$
(10)

where

$$ \mu(t) = \mu_0 + \beta\,t, $$
(11)

with the additional parameters \(\mu_0\) and β. In this way, variations through time in the observed/simulated process are modelled as a linear trend in the location parameter μ(t) (Coles 2001; Katz et al. 2002; Mínguez et al. 2010). The parameter β is therefore the annual rate of change in the annual maximum (r-largest) sea level. The simplest approach is to fix β to the rate of mean sea level rise. A slightly more sophisticated approach, and the one we have used, is to take the results from Sect. 6.2 and assign the values from Fig. 9b to β.
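
Under Eqs. 10 and 11, only the location parameter moves in time, so every return level shifts by β per year while σ and ξ stay fixed. The Python sketch below illustrates this, assuming the Coles (2001) convention for the GEV. The function name and all numbers are hypothetical; in particular, forming β as a slope parameter multiplied by an annual rate of mean sea level rise is an interpretation made for this example, not a statement of the exact procedure used.

```python
import math


def nonstationary_return_level(mu0, beta, sigma, xi, year, period):
    """Return level in a given year for GEV(mu0 + beta * year, sigma, xi)."""
    mu_t = mu0 + beta * year  # Eq. 11: linear trend in the location parameter
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-9:
        return mu_t - sigma * math.log(y)  # Gumbel limit
    return mu_t - sigma / xi * (1.0 - y ** (-xi))


# Hypothetical values: slope 0.9 and 0.5 m of sea level rise per century.
beta = 0.9 * 0.5 / 100.0  # metres per year
level_start = nonstationary_return_level(1.0, beta, 0.2, -0.1, year=0, period=30)
level_end = nonstationary_return_level(1.0, beta, 0.2, -0.1, year=100, period=30)
```

The difference `level_end - level_start` equals 100 β for any return period, which is the sense in which the distribution is otherwise unchanged.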

6.5 Projected changes in surge heights

In this section, the projected changes in return levels for the end of the century are analysed. In Fig. 12a, the expected 30-year return level for A1B is shown. The highest surge levels can be seen around the stations GLU and GKO, with peak values of 2.2 m. The general pattern is similar to the C20 run (Fig. 8a); only the maximum elevation is higher. However, at the station DGE, the surge level is now comparable to that at the stations GLU and GKO, which was not seen in the C20 simulations. The differences between the two realisations (Fig. 12b) indicate higher variability in the Great Belt and around station GKI, with differences of up to 8%. The higher variability at station GKI can be explained by the sensitivity of this region to changes in the wind forcing (Fig. 10). The variability in the Great Belt can also be seen in the C20 run (Fig. 8b). A similar pattern is visible in the 30-year return level for the B1 scenario (Fig. 12d, e); only the maximum elevation, with peak values of 2.0 m, is somewhat smaller. Furthermore, the 100-year return levels for the B1 scenario (Fig. 12f) show the highest impact of storm surges at the stations GLU, GKO and DGE, similar to Fig. 12c.

Fig. 12 Projections for 2071–2100: a 30-year return level A1B, b difference between the two A1B realisations, c 100-year return level A1B, d 30-year return level B1, e difference between the two B1 realisations and f 100-year return level B1

An important question arises when discussing the projections in surge levels: are these changes driven by the change in mean sea level, by an increase in wind speed, or by the occurrence of new atmospheric patterns that change the preconditioning of surges in the Baltic Sea (Baerens and Hupfer 1999; Rosenhagen and Bork 2009)? To investigate the changes in the 30-year return level, we use the findings of Sects. 6.2 (sensitivity to changes in mean sea level) and 6.3 (sensitivity to changes in wind speed) to remove these effects from the anticipated changes in surge levels. The underlying assumption is that they add up in a linear superposition. The residual changes are depicted in Fig. 13a and b. For the A1B scenario, one can see additional changes of up to 10 cm in the region around station GKO, which cannot be explained by the changes in wind speed and sea level rise. At the station GKI, surge levels are approximately 10 cm smaller than linear superposition predicts. A similar pattern can be seen for the B1 scenario, only with lower values. Both scenarios show a reduction in the western part, whereas the southwestern part shows an increase in surge levels. However, the relative changes, based on linear superposition (Fig. 13c, d), indicate that the changes are less than ±2% of the 30-year return level. Especially for the B1 scenario, the deviations are rather homogeneous (Fig. 13d). For the A1B scenario (Fig. 13c), the deviations again show the east-west pattern discussed above, with differences of up to ±4%. Nevertheless, these deviations of ±4% are still in the range of the inter-realisation differences (Fig. 12b, e). Thus, the increase in surge level for both scenarios can consistently be explained by the mean sea level rise and the increase in wind speed. However, the pattern visible in scenario A1B (Fig. 13c) may indicate that the driving mechanism causing the storm surges might change. A second explanation could be the interaction between the wind-induced changes and the contribution due to sea level rise.
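
The linear-superposition bookkeeping described above amounts to simple arithmetic, sketched here in Python. The function name, the choice of the projected level as the reference for the percentage, and all numbers are assumptions made for illustration; they are not values from Figs. 12 or 13.

```python
def superposition_residual(projected, baseline, slope, sea_level_rise, wind_effect):
    """Residual change not explained by sea level rise and wind increase.

    projected      : projected return level (m)
    baseline       : present-day return level (m)
    slope          : slope parameter from the sea level sensitivity runs
    sea_level_rise : prescribed mean sea level rise (m)
    wind_effect    : surge increase from the wind sensitivity run (m)
    Returns (residual in m, residual in % of the projected level).
    """
    expected = baseline + slope * sea_level_rise + wind_effect
    residual = projected - expected
    return residual, 100.0 * residual / projected


# Hypothetical station: the projection falls 0.1 m short of linear superposition.
residual_m, residual_pct = superposition_residual(
    projected=2.4, baseline=1.9, slope=1.0, sea_level_rise=0.5, wind_effect=0.1
)
```

A residual close to zero supports the superposition assumption; a spatially coherent pattern of residuals would hint at an additional mechanism or at an interaction between the two effects.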

Fig. 13 Projections of residual changes in 30-year return level after removing effects of sea level rise (Fig. 9) and increase in wind stress (Fig. 10) for 2071–2100: a A1B in metres, b B1 in metres. Projections of residual changes in 30-year return level in percent: c A1B and d B1

7 Conclusion

In this paper, transient climate simulations covering the period 1960–2100 were carried out using a high-resolution ocean model (GETM) for the Western Baltic Sea. These simulations are based on the IPCC scenarios A1B and B1, each with two realisations. Although this study is based on boundary conditions from only one regional atmospheric model (CLM) and one medium-scale ocean model (MOM), and not on a full ensemble, the analysis can offer a valuable description of a changing environment. However, uncertainty estimates based on ensemble prediction (Jacob et al. 2007; Meier et al. 2006a) cannot be given with the presented analysis. As we used a dynamical downscaling from only one global model (ECHAM5/MPI-OM), we cannot sample the possible spread as presented by Nikulin et al. (2011). To cope with these limitations, we used sensitivity analyses to estimate the impact of sea level rise and changes in wind speed. In particular, for changes in mean sea level, we could show that the changes in storm surge height scale linearly with the mean sea level. The findings of this study can be summarised as follows:

  1. The contribution of the tides to surge levels can be neglected in most parts of the Western Baltic Sea. Exceptions are the Kattegat, the Great Belt, and the Øresund.

  2. The results of present-day simulations are close to the observations. GETM could reproduce the storm surge return levels with deviations of less than ±5%. Around the Arkona Basin, the bias has varying sign, indicating no systematic error. For the westernmost stations, the model generally underestimates the surge levels due to the still coarse resolution of 1 km, which leads to too strong a damping of the surge. An optimisation of the bottom roughness might help to resolve this issue. Although we used GETM as the last model component in a downscaling chain, the whole nesting approach provided reasonable boundary conditions to force the next model, leading to these results.

  3. The modelled annual climatological cycle deviates from the observations, with an underestimation during the winter season and an overestimation during summer. However, these deviations are smaller than 8%. This seasonal bias is mainly caused by the atmospheric model, which shows a systematic underestimation of the mean wind speed and of the higher percentiles (Fig. 5). As we use a model system where several boundary values are passed from one model to the next, it is difficult to estimate the contributions due to error propagation.

  4. An analysis of the projected wind speed for the scenarios A1B and B1 revealed that, especially for the A1B scenarios, an increase in mean wind speed by approximately 4% and in the 95th and 99th percentiles by 3–5% can be seen. A sensitivity study of the effects of an increase in wind speed by 5% indicated that the additional wind stress leads to surge levels that are up to 20 cm higher than in the unperturbed simulation.

  5. In the simulations, a mean sea level rise of 50 cm for A1B and 25 cm for B1 during the 21st century is prescribed. This increase in mean sea level also increases the surge heights. Due to the nonlinear interaction of mean water level and surges, the increase in surge level does not directly follow the change in mean water level. A spatial analysis showed that in the western part of the model domain, the surge levels show a stronger increase than the mean sea level rise, whereas in the eastern part, the ratio is reversed. The results show that the sea level rise has greater potential to increase surge levels than increased wind speeds.

  6. The projections for the end of the century, 2071–2100, show an increase of the 30-year return level to 2.4 m for A1B and 2.2 m for B1. At 0.6 and 0.4 m above present-day values, these levels are significantly higher. However, after removing the effects of mean sea level rise and increase in wind speed, based on linear superposition, residual changes of approximately ±2% are left unexplained. These deviations are within the range of inter-realisation differences. Thus, changes in surge levels can consistently be explained by the increase in mean sea level and the anticipated increase in wind speed.

Finally, we emphasise again that this paper presents only one downscaling of two IPCC scenarios, based on one regional atmospheric model and one medium-scale ocean model. A broader ensemble of regionalised scenarios is necessary to give a more reliable assessment of the future state of the Western Baltic Sea and the uncertainties involved.