1 Introduction

General circulation models (GCMs) are the primary tool available today for climate simulation and future climate change assessment (IPCC 2001). Although they capture the main characteristics of the general circulation pattern, their simulation of the present climate is rather poor when the output is applied at regional scale. This is mainly due to the typical horizontal resolution of GCMs (1°–3°), largely imposed by computational restrictions, which is sufficient to resolve the large-scale forcings but not the impact of local effects. The large-scale circulation is itself modified by the upscale energy transfer from shorter-scale motions, and at least some kind of parameterization of the local forcings (e.g. mesoscale mountain-induced drag) is necessary to obtain realistic results at the large scale. However, the lack of explicit representation of local circulations prevents the accurate simulation of the subregional spatial gradients of the meteorological variables that are needed to characterize the climate of a region.

In the Spanish Mediterranean area (Fig. 1), local effects exert a particularly strong influence on the distribution of meteorological variables. This is due to the characteristically complex distribution of land and sea (both the Atlantic Ocean and the Mediterranean Sea are important sources of moisture and thermal regulators), orography (dominated by prominent coastal mountain systems) and even vegetation and soil characteristics. Among all surface meteorological parameters, precipitation is undoubtedly the most critical variable for past, present and future social and economic impacts on Mediterranean Spain. Its scarcity during the summer months, along with increasing tourist activity, places strong stress on water resources, especially after abnormally dry years. On the other hand, extreme precipitation events are common in the region and damaging flash floods occur virtually every year (see, for example, Amengual et al. 2007).

Fig. 1 The Spanish Mediterranean area (outlined in Fig. 2). It includes a smoothed version of its orography and the position of the stations included in the daily rainfall data base (410 in total)

At large scale, rainfall generation in Mediterranean Spain can be associated with the incidence of either Atlantic low pressure systems and fronts (which are effective in the western and northern areas), or Mediterranean disturbances (crucial for the eastern flank of the region). These general rain-bearing flow types and their episodic nature determine the sequence of rainfall events in the region, while the specific mesoscale features of the circulation and the complex topographical forcing are responsible for the detailed spatial structure of the rainfall. Many studies, particularly those focused on flash flood events, have shown the important role of orography through its modulation of the large-scale flow in the form of mesoscale perturbations (e.g. Mediterranean cyclogenesis; Doswell et al. 1998; Ramis et al. 1998; Romero et al. 2000), and through the enhancement of coastal rainfall due to mechanically forced upward vertical motions and the associated downstream sheltering effect (Jansà et al. 1991; Romero et al. 1998). A deficient representation of topography and of these mesoscale aspects in current GCMs would therefore imply overly smoothed, inaccurate information on future regional rainfall scenarios.

Dynamical downscaling applied to GCM outputs attempts to account for the effects of mesoscale forcings by nesting a higher resolution limited area model over the specific area of interest (Giorgi 1990; Giorgi and Mearns 1999). This idea was originally based on the concept of one-way nesting, in which large-scale meteorological fields from general circulation model runs provide initial and time-dependent lateral boundary conditions (LBCs) for high-resolution regional climate model (RCM) simulations. Issues of major importance in this technique are: numerical nesting strategy, spatial resolution difference between the driving data and the nested model, spin-up, update frequency of the LBCs, model physics, domain size, etc. Many of these issues have been addressed in subsequent studies, such as Denis et al. (2002) and Beck et al. (2004), which conclude that, in spite of the sensitivity of the results to the RCM set-up, the one-way nesting strategy has skill in downscaling large-scale information to regional scales.

In spite of the rapid growth of computing power during the last decade, the global nature of future climate simulation efforts and the wide range of greenhouse gas emission scenarios to consider still pose serious challenges for existing and foreseen computational capability. Current GCMs represent a balance among sufficiently realistic physical parameterizations, well-behaved numerical schemes, and grid resolution. Any effort directed toward enhancing the value of GCM downscaling results should consider the relative benefits expected from improving each of these model aspects. An increase in the horizontal grid resolution and temporal frequency of the GCM output data ingested by limited area models seems to be a reasonable strategy to follow, but it is not clear that this is the optimum approach, especially in areas with such a dominant orographic role as is the case for Mediterranean Spain. Denis et al. (2003) explored for north-eastern America the sensitivity of a one-way nested RCM to the spatial resolution and to the temporal updating frequency of the LBCs. The one-way nesting approach was found to produce satisfactory results for most of the fields investigated with a combination of up to T30 spatial resolution and up to 12 h temporal update interval.

Antic et al. (2004) addressed the downscaling ability over the west coast of North America, a region with complex topography. Orography and coastline were found to have a positive impact on the quality of the downscaled fields in comparison with the results of Denis et al. (2003). In particular, the nesting technique produced significantly improved cold season precipitation fields over the Rocky Mountains area. Dimitrijevic and Laprise (2005) extended the previous study to the summer season, in which physical processes of local convective origin play a predominant role. They found lower skill than during the winter, when precipitation generation is dominated by large-scale dynamical processes and orographic forcing. The sensitivity of the downscaled precipitation to the spatial and temporal resolution of the LBCs was found to be weaker in summer than in winter, owing to the different nature of the responsible physical mechanisms. On the other hand, some studies have found a relative insensitivity of mesoscale model forecasts to the precise structure of initial and boundary conditions in areas with complex terrain (see, for example, Mass and Kuo 1998). In this work we evaluate the sensitivity of mesoscale numerical simulations of rainfall for Mediterranean Spain to the resolution of the large-scale model input data, to help answer the question of whether higher GCM resolution would provide improved dynamically downscaled information in that region in the context of climate change research.

Starting with a description of the mesoscale model used and of the meteorological and rainfall data bases, the next section explains the methodology followed to assess the simulated rainfall quality as a function of input data resolution. The rest of the paper is organized in three main parts: first, results for the whole of Mediterranean Spain are presented and discussed; second, the subdomain spatial variability is examined; and third, the results are evaluated as a function of six characteristic circulation types derived in earlier work (Romero et al. 1999b; Sotillo et al. 2003). To conclude, implications of the results for the dynamical downscaling task in the region are summarized.

2 Data base and methodology

This study is based on numerical simulations of 165 daily rainfall events in Mediterranean Spain during the period 1984–1993. Romero et al. (1998) used a homogeneous and complete data base comprising daily rainfall series at 410 stations (Fig. 1) to show that 1,275 significant rain days occurred in the region during that decade. A total of 165 days attained the heavy rainfall threshold as defined in that study (2% of the stations registered at least 50 mm). To avoid excessive computing time, the study was restricted to this reduced set, under the hypothesis that the results obtained for the heavy rainfall limit would also apply to categories with lower daily rainfall amounts. Obviously, a heavy rainfall day, in our definition, does not imply heavy rainfall in all parts of Mediterranean Spain, but comprises a spatial range of rainfall intensities, from weak or no rain at all to large values. The seasonal distribution of the selected events follows the typical pattern of the Mediterranean climate: 45% in autumn, 35% in winter, 15% in spring and only 5% in summer.

Six simulations are performed for each heavy rainfall day by nesting the HIRLAM mesoscale model within large-scale analyses. The HIRLAM model, a co-operative project of several European national weather services that is used operationally (Källén 1996), is formulated using an Arakawa C-grid in the horizontal and a hybrid p–σ vertical coordinate system. An Eulerian semi-implicit time scheme (Källén 1996) and fourth order explicit linear horizontal diffusion were chosen to integrate the model equations. The model includes parameterizations of short and long wave radiation processes (Savijärvi 1990), turbulent fluxes of momentum, sensible heat and soil moisture data (Louis 1979), shallow convection (Källén 1996), cloud microphysical processes and moist convection (Sundqvist et al. 1989). In addition, soil temperature and soil water variations over land are solved using a force-restore model. Soil moisture and sea surface temperatures are initialized to climatological values. Sea surface temperature is kept constant throughout the simulation. The model is applied over the geographical window extending from 29.45 W to 19.45 E and from 20.00 N to 59.30 N (Fig. 2), with a horizontal grid resolution of 0.3° (about 30 km).

Fig. 2 Geographical domain considered for the HIRLAM model simulations. The Spanish Mediterranean area is highlighted

Large-scale meteorological analyses used to nest the HIRLAM model are constructed from the European Centre for Medium Range Weather Forecasts (ECMWF) ERA-15 spectral reanalysis of geopotential height, temperature, relative humidity and horizontal wind components at eleven standard pressure levels. These analyses are available at 00, 06, 12 and 18 UTC. For the first three experiments, the spectral analyses are gridded onto three different meshes with 1°, 2°, and 3° horizontal resolutions. Note that the equivalent spatial resolution of the ERA-15 fields is 1.125° (approximately 125 km in the region of interest), so the 2° and 3° experiments imply a coarsening of the information contained in the reanalyses. An implicit nonlinear normal mode initialization, following Temperton's scheme (Temperton 1988), is used to remove fast gravity modes from the model integration. The model is then run over a 54 h period, starting at 00 UTC on the day before the cataloged heavy rainfall day. The accumulated precipitation during the last 24 h of simulation is verified against rain-gauge observations valid for the same 06–06 UTC period. The first 30 h of simulation allow the boundary conditions to spread across the domain, which makes the sensitivity tests to boundary conditions relevant (Alpert et al. 1996; Homar 2001; Denis et al. 2002). This resembles the archetypal configuration of dynamical downscaling experiments from GCMs, where the memory of the initial conditions in the model is rapidly lost.
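The timing of each run can be checked with a few lines of Python. This is only a sketch of the arithmetic stated above (54 h run from 00 UTC on the previous day, verification over the last 24 h); the function name and the example date are hypothetical and do not refer to any particular event in the data base.

```python
from datetime import datetime, timedelta

def simulation_windows(rain_day):
    """Return (start, verification_start, end) for one 54 h simulation.

    rain_day is the calendar date (at 00 UTC) of the cataloged heavy
    rainfall day; the run starts at 00 UTC on the day before.
    """
    start = rain_day - timedelta(days=1)
    end = start + timedelta(hours=54)
    verification_start = end - timedelta(hours=24)  # last 24 h of the run
    return start, verification_start, end

# Arbitrary example date, used only to illustrate the arithmetic
start, verif_start, end = simulation_windows(datetime(1990, 1, 10))
print("run starts:         ", start)
print("verification starts:", verif_start)
print("run ends:           ", end)
print("spin-up (h):        ", (verif_start - start) / timedelta(hours=1))
```

The verification window runs from 06 UTC on the rainfall day to 06 UTC on the following day, which leaves the first 30 h of the run as spin-up.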

Three experiments are thus defined, referred to in the paper as 1°, 2°, and 3° according to the resolution used for the analyses. These experiments are considered representative of the current range of horizontal resolutions used in GCMs for climate simulations. Fourth, fifth and sixth experiments, referred to as 1° + 12, 1° + 24 and 1° + 48 h, are run using 1° resolution input data but less frequent boundary updates (12, 24 and 48 h apart, respectively) for the large-scale meteorological fields. In HIRLAM, the time-varying boundary conditions during model integration are defined by linear interpolation between large-scale data at consecutive boundary update times. This last set of experiments attempts to analyze the effects of a low-frequency GCM output system, a characteristic often required owing to data storage limitations. The effect of the LBC update interval probably depends to some extent on the dimension of the integration domain, in the sense that the smaller the domain, the larger and quicker the impact of the error arising from the linear interpolation between two boundary conditions that are too far apart. To explore this issue, a sensitivity analysis of the robustness of the results with respect to the integration domain dimension could be carried out, but this kind of analysis is beyond the scope of the study.
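The linear interpolation between consecutive boundary update times can be illustrated with a short Python sketch. It is not HIRLAM code: the function name, the scalar boundary value and the sinusoidal test signal are all hypothetical, chosen only to show how the interpolation error may grow as the update interval lengthens.

```python
import math
from bisect import bisect_right

def boundary_value(t, update_times, values):
    """Linearly interpolate a boundary value at time t (hours).

    update_times must be increasing and hold at least two entries;
    values[i] is the large-scale value valid at update_times[i].
    """
    if not update_times[0] <= t <= update_times[-1]:
        raise ValueError("t lies outside the available boundary data")
    # Index of the update time that closes the bracketing interval
    i = min(bisect_right(update_times, t), len(update_times) - 1)
    t0, t1 = update_times[i - 1], update_times[i]
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * values[i - 1] + w * values[i]

def signal(t):
    """Hypothetical boundary signal with a 24 h period."""
    return math.sin(2.0 * math.pi * t / 24.0)

# Largest interpolation error over a 48 h window for several update intervals
for step in (6, 12, 24, 48):
    times = list(range(0, 49, step))
    vals = [signal(s) for s in times]
    err = max(abs(boundary_value(t, times, vals) - signal(t)) for t in range(49))
    print(f"update every {step:2d} h: max error = {err:.3f}")
```

In this toy case, any variation faster than the update interval is lost between two updates, which is the kind of information loss the 1° + 24 and 1° + 48 h experiments are meant to probe.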

Precipitation forecasts are compared against observed rainfall. For each day, observations from the network of 410 rain-gauge stations (Fig. 1) are interpolated onto the 408 model grid points that lie inside the study area, using the kriging method. Several verification scores are derived from these individual comparisons to assess the model performance for each experiment: the mean spatial correlation between modeled and observed precipitation over the 165 events and its standard deviation (\({\bar{r}}\) and \({\sigma_r}\), respectively); analogously, the mean and standard deviation of the root mean square error (\({\bar{\varepsilon}}\) and \({\sigma_\varepsilon}\), respectively); and the area under the relative operating characteristic curve (ROC score). The event-average spatial distributions of precipitation from the observations and from the set of experiments are also compared.
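As a minimal sketch of how these event-based scores can be computed, the following Python code evaluates the spatial correlation and root mean square error for each event and then averages over events. The function names are hypothetical, the correlation is assumed to be the Pearson coefficient, and the dispersion is assumed to be the sample standard deviation; the text does not specify either choice.

```python
import math
from statistics import mean, stdev

def spatial_correlation(model, obs):
    """Pearson correlation between two fields sampled at the same grid points."""
    mx, my = mean(model), mean(obs)
    cov = sum((x - mx) * (y - my) for x, y in zip(model, obs))
    vx = sum((x - mx) ** 2 for x in model)
    vy = sum((y - my) ** 2 for y in obs)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant field")
    return cov / math.sqrt(vx * vy)

def rmse(model, obs):
    """Root mean square error over the grid points (same units as the input)."""
    return math.sqrt(mean((x - y) ** 2 for x, y in zip(model, obs)))

def summarize(events):
    """events: list of (model_field, observed_field) pairs, one per rain day.

    Returns (r_bar, sigma_r, eps_bar, sigma_eps); needs at least two events.
    """
    r = [spatial_correlation(m, o) for m, o in events]
    e = [rmse(m, o) for m, o in events]
    return mean(r), stdev(r), mean(e), stdev(e)

# Two tiny made-up "events" on a three-point grid, for illustration only
events = [([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
          ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])]
print(summarize(events))
```

In the study each field would hold 408 grid-point values and the list would hold 165 events.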

The ROC curve method, based on signal detection theory, is a relatively new approach in the atmospheric sciences, having been brought into the field as a verification tool by Mason (1982). The method combines the false alarm rate (FAR) and the probability of detection (POD) for a discrete number of predefined thresholds, giving an equal number of points on a graph of POD (vertical axis) against FAR (horizontal axis) to form the ROC curve. The area under the curve, or ROC score, is then used to assess the skill of the forecast system (Stanski et al. 1989). A perfect system yields an area of 1, whereas a curve lying along the diagonal (ROC score = 0.5) would reflect essentially worthless random forecasts. For the present study, ROC curves were constructed using precipitation thresholds set at 0, 1, 2, 4, 8, 16, 32, 64 and 128 mm.
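The construction described above can be sketched in Python as follows. This is an illustration under assumptions that the text does not spell out: an event at a given threshold is taken to be strict exceedance (> threshold) applied to both forecast and observation, FAR is taken as false alarms divided by all observed non-events, the end points (0, 0) and (1, 1) are added to close the curve, and the area is obtained with the trapezoidal rule. The function names are hypothetical.

```python
THRESHOLDS_MM = [0, 1, 2, 4, 8, 16, 32, 64, 128]

def roc_points(forecast, observed, thresholds=THRESHOLDS_MM):
    """Return one (FAR, POD) point per precipitation threshold."""
    points = []
    for thr in thresholds:
        hits = misses = false_alarms = correct_negatives = 0
        for f, o in zip(forecast, observed):
            f_yes, o_yes = f > thr, o > thr   # assumed event definition
            if f_yes and o_yes:
                hits += 1
            elif o_yes:
                misses += 1
            elif f_yes:
                false_alarms += 1
            else:
                correct_negatives += 1
        n_events = hits + misses
        n_non_events = false_alarms + correct_negatives
        pod = hits / n_events if n_events else 0.0
        far = false_alarms / n_non_events if n_non_events else 0.0
        points.append((far, pod))
    return points

def roc_score(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

# Made-up daily rainfall values (mm) at six grid points, for illustration only
observed = [0.0, 0.5, 3.0, 10.0, 40.0, 200.0]
print(roc_score(roc_points(observed, observed)))        # perfect forecast
print(roc_score(roc_points([0.0] * 6, observed)))       # forecast of no rain
```

With these definitions a forecast identical to the observations reaches the upper-left corner of the diagram, while a forecast that never predicts rain collapses onto the diagonal.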

3 Results and discussion

A multiperspective approach has been followed to summarize the model performance for the six types of simulations. First, the overall performance for the bulk of Mediterranean Spain is examined, and then the analysis is refined by taking into account the spatial variability of the results and six heavy rain-bearing flow regimes.

3.1 Whole of Mediterranean Spain

The performance of the six sets of simulations is evaluated for all heavy rainfall days and the whole of Mediterranean Spain. Results are summarized in Table 1 for the correlation and root mean square error measures. The mean spatial correlation values for the 165 events do not show large differences among the experiments dealing with spatial resolution; for all of them the correlation is close to 0.4. However, a small degradation of the forecasts is obtained as the spatial resolution of the input data decreases. Moreover, as the boundary update interval increases the forecasts have less skill, with similar correlation values for the 3° and the 1° + 12 h experiments and significantly lower scores for the 1° + 24 and 1° + 48 h experiments.

Table 1 Average spatial correlation (\({\bar{r}}\)), average root mean square error (\({\bar{\varepsilon}}\), in mm) and their dispersions, \({\sigma_r}\) and \({\sigma_\varepsilon}\) (in mm), for the six experiments with regard to the observed rainfall

These results suggest that the update frequency of the LBCs has a larger impact on the downscaling results than their spatial resolution. The values of \({\sigma_r}\) do not show high sensitivity to data resolution, although a decrease of this parameter with higher spatio-temporal resolution is observed. Similar results are obtained for the root mean square error: the model is more sensitive to the boundary condition update frequency than to the horizontal resolution. On the other hand, no significant differences in \({\sigma_\varepsilon}\) are found across the set of experiments.

As an alternative and more appropriate verification method, ROC scores for the considered resolutions have been calculated. A single ROC curve is obtained for each experiment after comparing simulations against observations at all grid points and for all events. Figure 3 shows the resulting ROC curves for the six experiments. All curves lie well above the diagonal, and the only appreciable difference among them is the lower score attained by the 1° + 24 and 1° + 48 h sets (Table 2). To examine the significance of the differences among the curves, a bootstrap test (Diaconis and Efron 1983) with 1,000 repetitions is applied for each experiment. Table 2 shows the 95% confidence intervals for the ROC scores, confirming that the 1°, 2°, 3° and 1° + 12 h experiments do not produce significantly different ROC curves, whereas the 1° + 24 and 1° + 48 h experiments produce significantly degraded precipitation fields. Therefore, from the ROC method perspective, it cannot be concluded that a significant improvement of forecast skill in Mediterranean Spain is obtained by initializing the mesoscale model with high resolution meteorological data within the considered spatial range. Rainfall downscaling products might benefit somewhat more from a higher boundary condition update frequency. These results can be compared with those obtained by Antic et al. (2004) for the west coast of North America. They examined the sensitivities to spatial resolutions of T30, T60 and T360 (roughly 5°, 2.5° and 0.5°, respectively) and temporal resolutions of 12, 6 and 3 h. In their experiments, changes in the spatial resolution of the driving data from T30 to T60 had a greater effect on the downscaling ability than a change from T60 to T360, and the improvements derived from increasing the LBC update frequency became more evident when using high spatial resolution (T360) than with the coarser input data.
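A percentile bootstrap of the kind used for these confidence intervals can be sketched as below. The text does not state which units are resampled; this sketch assumes that whole events are drawn with replacement and that the score is recomputed for each resample. The function name, the fixed seed and the made-up per-event scores are hypothetical.

```python
import random
from statistics import mean

def bootstrap_ci(samples, statistic, n_rep=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for statistic(samples).

    samples: list of items (assumed here to be one item per event).
    statistic: function mapping a list of items to a number.
    """
    rng = random.Random(seed)          # fixed seed only for reproducibility
    n = len(samples)
    stats = sorted(
        statistic([samples[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_rep)
    )
    alpha = (1.0 - level) / 2.0
    lo_idx = max(0, round(alpha * n_rep))
    hi_idx = min(n_rep - 1, round((1.0 - alpha) * n_rep) - 1)
    return stats[lo_idx], stats[hi_idx]

# Made-up per-event scores, for illustration only
scores = [0.60 + 0.002 * k for k in range(165)]
low, high = bootstrap_ci(scores, mean)
print(f"mean = {mean(scores):.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

Two experiments would then be judged significantly different when their intervals do not overlap, which is one common reading of such a test.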

Fig. 3 ROC curves for 1°, 2°, 3°, 1° + 12, 1° + 24 and 1° + 48 h experiments

Table 2 ROC scores and the 95% percentile confidence intervals for the six experiments

3.2 Subdomain spatial variability

In this section, the quality of the forecasts is examined as a function of location within Mediterranean Spain by computing the ROC curve for each of the 408 model grid points over the area (Fig. 4) and by comparing the mean spatial distributions of observed and simulated rainfall fields (Fig. 5). A wide range of rainfall enhancement or suppression mechanisms have been identified in the region owing to its complex orography. These processes have high spatial variability and depend on the specific flow type (Romero et al. 1999a, b). Atlantic flows, mostly associated with large-scale low pressure systems, favour rainfall over the western and northern zones but are hardly effective in the east and southeast; Mediterranean air flows, less common and associated with smaller-scale disturbances, encourage rainfall in these latter zones but not in sheltered areas like western and central Andalusia. The northerly flows, often associated with Gulf of Genoa cyclogenesis, produce precipitation in the Balearics and eastern Catalonia but do not influence the other areas. Such a diversity of rainfall mechanisms and flow-orography interactions should logically be reflected to some degree in the spatial variability of the forecast quality.

Fig. 4 Spatial distribution of ROC scores for the six sets of experiments: a 1°, b 2°, c 3°, d 1° + 12 h, e 1° + 24 h and f 1° + 48 h

Fig. 5 Spatial distribution of mean precipitation (in mm) calculated from: a observations, b 1° experiment and c 1° + 48 h experiment

A direct comparison between the mean observed rainfall distribution (Fig. 5a) and the predicted fields (Fig. 5b, c) reveals that the model underestimates precipitation across the domain. However, the mean spatial distribution of the forecast fields resembles the observations, except in the high mountain ranges of eastern Andalusia, where the artificial orographic effect in the model is obvious. Comparing Figs. 4 and 5, it is evident that a direct relationship between model skill expressed in terms of ROC score and in terms of average precipitation cannot be established. For instance, higher ROC scores are found in western Andalusia than in the Aitana range area, but the event-average precipitation is better reproduced in the second zone. A detailed analysis of the results reveals that the day-by-day agreement between model and observations in western Andalusia is quite good, except for a systematic underforecast of rainfall amounts. In contrast, the daily performance of the model is more variable in the Aitana range area: a large proportion of the observed extreme daily rainfall values are not reproduced by the model, which lowers the probability of detection, and conversely some simulated daily values lie significantly above the observations, which increases the false alarm rate.

The results for the six sets of experiments (Fig. 4) reveal only slight differences in performance when the spatial resolution is changed or the boundary conditions are 12 h apart. Degradation is clearly visible for the 1° + 24 and 1° + 48 h experiments, as was observed in the previous section for Mediterranean Spain as a whole (Table 2). This general decrease in ROC scores is partially attributable to the weaker average precipitation amounts obtained as the boundary condition update frequency decreases (see Fig. 5c for the 1° + 48 h mean precipitation). Regarding the spatial distribution of ROC values, the degradation of the areal-averaged ROC score for the 1° + 24 and 1° + 48 h experiments is also observed in many subareas, particularly over Catalonia and most of Andalusia (Fig. 4e, f).

Notable contrasts in model performance emerge among different areas in Mediterranean Spain (Fig. 4a). The highest scores are obtained over Catalonia, central and western Andalusia, the Balearics and some areas of the southeast (ROC score > 0.75). In contrast, eastern Andalusia and many parts of Valencia are characterized by lower ROC values (ROC score < 0.75). The higher ROC scores over the western and northern regions (even exceeding 0.9 in mountainous areas of western Andalusia) can be associated with the relatively high forecast capability for Atlantic flow situations (see next section). These flows are generally associated with large-scale pressure systems which do not suffer appreciable orographic modification as they approach from open oceanic areas. Even if these disturbances contain some analysis or forecast error, and consequently uncertainties in the impinging flow direction are present, no significant effects on the rainfall pattern are likely, especially in western Andalusia, where the exposure to the Atlantic moist flows is effective for a wide range of flow directions. The relatively high scores in the central and eastern parts of Catalonia and, to a lesser extent, in most of the Balearics can also be attributed to the relatively good forecasts of northern Mediterranean cyclones (see next section). Many of these cyclones, particularly those developed near the Gulf of Genoa, are the result of Alpine cyclogenesis (Buzzi and Tibaldi 1978). The main ingredients of these cyclogenesis events, an Atlantic frontal system associated with a baroclinic trough and the extensive Alpine barrier, are well captured by numerical weather prediction models, frequently resulting in good rainfall forecasts.

In contrast, the lower ROC scores generally found in the east-facing regions of Mediterranean Spain (from eastern Andalusia to south Catalonia) can be attributed to the particular nature of the rainfall systems, often convectively driven, that develop over the Mediterranean Sea. Dimitrijevic and Laprise (2005) pointed out similar problems in reproducing the precise timing and location of the convective precipitation events that prevail during the summer season in western North America. Furthermore, Mediterranean disturbances are typically smaller than Atlantic systems, even of mesoscale size, and are often a consequence of a strong disruption of the westerly mid-latitude circulation. In addition, the closed character of the western Mediterranean basin and the prominent surrounding mountain chains strongly modulate the low level flow in the form of pressure dipoles, secondary cyclones and other mesoscale circulations (Reiter 1975). The mesoscale properties of the Mediterranean circulations and the complex physical processes involved in their genesis affect the predictability of these features. In particular, small errors in the near-surface flow direction can lead to appreciable rainfall modification, owing to the complexity of the orography and coastline pattern. Not surprisingly, then, poorer rainfall forecasts are obtained for Mediterranean rain-bearing flow regimes (see next section), and therefore over the sensitive east-facing areas (Fig. 4a).

The distinct model behavior for Atlantic and Mediterranean rain-bearing flow situations might have implications for the downscaling of GCM simulations if major changes in atmospheric pattern frequencies take place. For Mediterranean Spain, Sumner et al. (2003) found, using the ECHAM-OPYC3 GCM (Roeckner et al. 1998), marked decreases in frequency for many near-surface circulations with a westerly or northerly component during the twenty-first century, whereas a general increase was found for atmospheric patterns with an easterly component. Future downscaled precipitation fields may therefore be slightly more uncertain, owing to the increase of the relatively poorly handled easterly regimes.

3.3 Major rain bearing flow regimes

In order to complement the previous results, the data set has been broken down into six major rain bearing flow regimes that affect Mediterranean Spain. These regimes were also considered by Sotillo et al. (2003), after regrouping into a smaller set the 19 rainfall-producing atmospheric patterns derived in Romero et al. (1999b) from a large sample of rainy days which included our group of events. Each of the 165 simulated heavy rainfall days is thus assigned to one of the following flow types (Fig. 6):

  • A (Atlantic flows, 53 days), comprising surface circulations from the SW-W produced by Atlantic lows.

  • C (Cold front passage, 11 days), or winds from the NW-N over the Iberian Peninsula associated with the passage of a cold front.

  • SW (Southwestern disturbances, 31 days), that is, troughs or cut-off lows at mid-tropospheric levels to the west of Gibraltar Strait, with the surface low near the Gulf of Cadiz which induces winds from the SE-E.

  • S (Southern disturbances, 30 days), similar to the previous one but with the upper-level disturbance and surface low axis located about Gibraltar Strait.

  • SE (Southeastern disturbances, 19 days), with the low-level disturbance to the east of Gibraltar Strait.

  • N (Northerly flows, 21 days), normally associated with low pressure centres located over the western Mediterranean basin.

Model performance results as a function of flow regime are shown in Fig. 7 and Table 3, corresponding to the 1° input data resolution experiments. Weak but physically consistent differences emerge among the flow types, with the lowest ROC scores for SW, SE and N situations (0.758, 0.694 and 0.767, respectively). The remaining flow regimes produce scores close to 0.8: A (0.789), C (0.812) and S (0.787). Thus, as already emphasized in the previous section, a certain distinction can be made between Atlantic or northern Mediterranean disturbances (A, C and N) and low latitude disturbances that induce surface flows with a significant easterly component over Mediterranean Spain (SW, S and SE). The former situations support rainfall distributions of higher predictability; the latter flow types induce more complex rainfall responses that are not so easily handled by mesoscale models. An exception to this general rule, however, is that the southern disturbances outperform the northerly situations. The southern disturbances are typically associated with substantial rainfall focused around the Aitana range, a highly exposed area (Romero et al. 1999b) where a wide range of flow directions with an easterly component lead to essentially the same rainfall response, owing to its geographical nature as a mountainous cape (Fig. 5a). Such relative independence from the flow direction benefits the precipitation predictability. In contrast, our northerly flows category includes, in addition to the highly predictable Genoa-type cyclones, some low-pressure systems located to the south-southeast of the Balearic Islands (see the composite pattern in Fig. 6f), resembling the SE flow pattern except that the disturbance is located further east. The forecast uncertainty associated with this type of situation penalizes the overall score of the N pattern.

Fig. 6 Composites of the six major rain bearing flow regimes: a Atlantic flows (A); b Cold front passage (C); c Southwestern disturbances (SW); d Southern disturbances (S); e Southeastern disturbances (SE) and f Northerly flows (N). The continuous lines represent the geopotential height field at 925 hPa (contour interval is 20 m), and the dashed lines that at 500 hPa (same contour interval). Surface lows and highs are indicated

Fig. 7 ROC curves for the six major rain bearing flow regimes (for 1° input data resolution)

Table 3 ROC scores for the six major rain bearing flow regimes (for 1° input data resolution)

In analogy with the analysis presented in Sect. 3.2, the spatial dependence of the forecast accuracy as a function of flow type is examined. However, population sizes at domain grid points would be too low to produce useful results on model performance from the ROC statistic computations. To alleviate this problem, the circulation types have been further simplified by considering only two categories: northern disturbances, associated with a significant Atlantic or northerly component at low levels, composed of A, C and N situations (85 days); and southern disturbances, associated with a dominant easterly flow component over Mediterranean Spain, composed of SW, S and SE situations (80 days).
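The bookkeeping behind the two merged categories can be verified with a short Python sketch that uses the day counts listed for the six flow types in this section. The dictionary and function names are hypothetical labels introduced only for this check.

```python
# Number of heavy rainfall days per flow regime, as listed in this section
REGIME_DAYS = {"A": 53, "C": 11, "SW": 31, "S": 30, "SE": 19, "N": 21}

# Merged categories used for the grid-point ROC analysis
CATEGORIES = {
    "northern": ("A", "C", "N"),
    "southern": ("SW", "S", "SE"),
}

def category_totals(regime_days, categories):
    """Sum the day counts of the regimes that make up each category."""
    return {name: sum(regime_days[r] for r in members)
            for name, members in categories.items()}

totals = category_totals(REGIME_DAYS, CATEGORIES)
print(totals)
print("all events:", sum(REGIME_DAYS.values()))
```

The two categories together account for all 165 simulated days, so no event is dropped or counted twice by the regrouping.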

Although still not very large, these increased population categories appear to offer interpretable results. First of all, the results in Fig. 8 confirm that the overall skill of mesoscale predictions is favoured under Atlantic flows. The northern (high-latitude) disturbances (Fig. 8a) produce higher ROC scores towards the west and north of Mediterranean Spain, including the Balearic Islands, and over some orographic units of the south and east. All these areas are directly exposed to Atlantic and northerly flows. Lower values are found over sheltered areas, such as the Gulf of Valencia and areas of the southeast. In these latter areas, however, the southern disturbances offer better ROC values (Fig. 8b), which appears to be consistent with the Mediterranean nature of the associated flows. Nevertheless, the southern disturbances still exhibit their highest ROC values towards the north, some areas of the south-west and the Balearics, and not over the more exposed areas of Valencia, Murcia and eastern Andalusia as would have been expected. A reasonable explanation for this result is that the easterly rainfall regimes typically comprise many convective, low-predictability events over those provinces.

Fig. 8

Spatial distribution of ROC scores for the two flow categories defined by: (a) Atlantic flows (A), Cold front passage (C) and Northerly flows (N); and (b) Southwestern (SW), Southern (S) and Southeastern (SE) disturbances

3.4 Orographic influence

It has been previously suggested that the low sensitivity to the spatial and temporal resolution of the input datasets may be due to the dominant role of the orography in controlling the rainfall distribution over Mediterranean Spain, to the extent of overcoming the dynamical action induced by sub-synoptic features embedded in the circulation. It would be interesting to verify or reject this hypothesis by reproducing the kind of statistical analysis presented in the preceding sections for other remote regions with smooth orography. This is, of course, beyond the scope of the study, but as an alternative, two additional oceanic regions, besides our study zone (denoted ZONE in Fig. 9), have been considered. These regions have been defined over the Atlantic ocean (ATL) and the Mediterranean Sea (MED) (see Fig. 9), with the same areal extent as ZONE. If the hypothesis is true, then a greater degradation of the forecast quality with coarser input data resolution should be observed over these non-orographic areas than in the study zone. It is interesting to note that Denis et al. (2003) and Antic et al. (2004) did not find significant differences between western and eastern parts of North America in the sensitivity of downscaled precipitation to the spatial and temporal resolution jumps of LBCs.

Fig. 9

Geographical location of the three regions considered for the model performance analysis (see text): Mediterranean Spain (ZONE), Atlantic ocean area (ATL) and Mediterranean sea area (MED)

Since the observed rainfall over the ATL and MED zones is unknown on any of the 165 simulated days, the analysis has been carried out by considering the 1° experiment results as the “truth”, and comparing the 2°, 3°, 1° + 12, 1° + 24 and 1° + 48 h results with that truth, for each of the three zones. The areal mean precipitation was first examined to ensure that a large fraction of the 165 simulated days yields significant rainfall in both oceanic regions. This requirement, important for the statistical significance of the comparison between ZONE and the oceanic regions ATL and MED, was verified. As a brief summary, the areal mean precipitation values averaged over the 165 episodes are 6.8, 4.2 and 4.3 mm for ZONE, ATL and MED, respectively.

Results are summarized in Tables 4 and 5. In terms of spatial correlation averaged over the 165 events under study, higher and more uniform values are found for ZONE than for ATL and MED, and the decrease of this correlation when input data of coarser resolution (in space or time) are used is far more appreciable for the oceanic areas than for the study zone (Table 4). In terms of the relative root mean square error, lower values are obtained over ZONE than over MED for all experiments. Interestingly, however, for all experiments except 2° and 1° + 12 h, ATL offers lower values than ZONE, that is, closer agreement with the 1° results (Table 5). A remarkable feature reflected in both tables is that updating the LBCs at 12 h intervals instead of 6 h has a greater negative impact on the “forecast” than a decrease of horizontal resolution from 1° to 2° in the input data.

Table 4 Average spatial correlation \((\bar{r})\) between the six sets of simulations and the 1° experiment. The analysis is performed for the three regions shown in Fig. 9
Table 5 Average root mean square “error” relative to the 1° experiment, normalized by the mean precipitation \((\bar{\varepsilon}_{r})\), for the six experiments. The analysis is performed for the three regions shown in Fig. 9
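The two scores reported in Tables 4 and 5 can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that each experiment is stored as an array of shape (events, ny, nx) restricted to one region, that the spatial correlation is the Pearson coefficient over grid points, and that the root mean square difference is normalized by the areal mean of the 1° reference field for each event. The function names are hypothetical.

```python
import numpy as np


def spatial_correlation(test, ref):
    """Pearson correlation between two rainfall fields over the grid points of a region."""
    t = np.ravel(np.asarray(test, dtype=float))
    r = np.ravel(np.asarray(ref, dtype=float))
    return float(np.corrcoef(t, r)[0, 1])


def relative_rmse(test, ref):
    """Root mean square difference from the reference field.

    Assumption: the normalization uses the areal mean of the reference field.
    """
    t = np.asarray(test, dtype=float)
    r = np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((t - r) ** 2)) / np.mean(r))


def average_scores(test_events, ref_events):
    """Average both scores over all events; inputs have shape (events, ny, nx)."""
    r_vals = [spatial_correlation(t, r) for t, r in zip(test_events, ref_events)]
    e_vals = [relative_rmse(t, r) for t, r in zip(test_events, ref_events)]
    return float(np.mean(r_vals)), float(np.mean(e_vals))
```

Under these assumptions, `average_scores` would be called once per region (ZONE, ATL, MED) and per experiment, with the 1° fields passed as `ref_events`. Events with a spatially uniform field or zero mean reference rainfall would need to be screened out beforehand, since both scores are undefined in that case.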

4 Conclusions

This study examines the problem of dynamical downscaling of precipitation over Mediterranean Spain, a highly vulnerable region according to most of the climate change precipitation scenarios (Meteorological Office 2001; Watson and Zinyowera 2001), with respect to its sensitivity to the spatial and temporal resolution of GCM input fields. The methodological approach to the problem has been determined, first, by the availability of precipitation and meteorological data, and second, by limitations in computer time that precluded a large number of numerical simulations. Specifically, our conclusions have been drawn from various sets of 165 mesoscale numerical simulations of heavy rainfall events in Mediterranean Spain, initialized with real meteorological grid analyses at six different spatial and temporal resolutions, under the following assumptions: (i) heavy rainfall events are representative of the whole fraction of rainfall days with respect to the model sensitivity (or insensitivity) to input data resolution; (ii) the six considered resolutions (1°, 2° and 3° in space, plus 1° + 12, 1° + 24 and 1° + 48 h in time) are sufficient to describe the actual envelope of sensitivities of the forecast system; and most importantly, (iii) the use of smoothed meteorological analyses is equivalent to coarse grid GCM outputs.

Under these assumptions, at least a first estimate of the effects of GCM resolution on dynamical downscaling tasks in Mediterranean Spain can be derived from this work. The major finding, in general agreement with the results for other regions with complex orography, is that the forecast skill is relatively insensitive to the spatial resolution of the boundary fields, but that it diminishes significantly when the boundary fields are updated less frequently than every 12 h, at least for the examined range of spatial and temporal resolutions.

Some implications can be derived from the presented analysis of the spatial variability of model performance and its dependence on flow type. First, the best-behaved areas in terms of forecast accuracy are those more exposed to and dependent on Atlantic and northerly flows (western Andalusia, Catalonia and the Balearics) and highlands in general, whereas many areas of eastern Andalusia and the Iberian eastern flank, often dominated by convective rainfall, exhibit relatively large forecast uncertainties. This mapping provides valuable information for improving the definition of spatially dependent confidence intervals in Mediterranean Spain when dealing with precipitation downscaling products, and also with the real-time numerical model prediction of rainfall events. Second, the analysis has shown that the high latitude disturbances embedded in the midlatitude westerlies generally offer better rainfall forecasts than situations with a strongly negative (i.e. easterly) flow component.

These findings would imply a change in the reliability of the precipitation downscaled from GCMs if significant changes in flow type frequencies are to be expected in the area owing to climate change. In this respect, Watson and Zinyowera (2001) and Sumner et al. (2003), among others, note that the climate change signal in Mediterranean Spain could be associated with marked decreases in frequency for many near-surface circulations with a westerly or northerly component, and a general increase for flows with an easterly component.