1 Introduction

Nested Regional Climate Models (RCMs) have become standard tools for downscaling low-resolution atmospheric reanalyses or global climate simulations into high-resolution fields, and can be used for various purposes (for an introduction, see Giorgi and Mearns 1999). They provide an attractive approach to achieving finer spatial resolution of the atmospheric circulation as they make regional climate simulations and climate-change projections possible at an affordable computational cost. RCMs can be run at fairly high resolutions (with grid meshes of a few tens of kilometres) over an area covering some millions of square kilometres. RCM climate simulations have been performed for several regions of the world, including North America (e.g., Kunkel et al. 2002; Gutowski et al. 2004; Liang et al. 2004). In addition, climate-change projections have been performed for various parts of North America (e.g., Giorgi et al. 1998; Laprise et al. 1998, 2003; Plummer et al. 2006; see also references cited in the review by Wang et al. 2004). A noteworthy alternative approach to nested limited-area RCMs—but whose discussion is beyond the scope of this work—is that of stretched-grid global models (e.g., Déqué et al. 1998; Fox-Rabinovitz et al. 2001).

In recent years, coordinated efforts involving several countries have led to the generation of an increasing number of regional climate simulations over different areas of the world (Prediction of Regional scenarios and Uncertainties for Defining EuropeaN Climate change risks and Effects, PRUDENCE, Christensen et al. 2002; Project to Intercompare Regional Climate Simulations, PIRCS, Takle et al. 1999; North American Regional Climate Change Assessment Program, NARCCAP, Mearns 2004; Regional Climate Model Inter-comparison Project for Asia, RMIP, Fu et al. 2005; Arctic Region Model Intercomparison Project, ARCMIP, Tjernstrom et al. 2005). These coordinated efforts come as a consequence of lessons learned in recent years regarding the need for multi-model intercomparison studies to improve individual models, and to better understand model variability and uncertainties involved in the downscaling process. The study of these uncertainties, normally carried out with an ensemble of simulations, is not only expensive in computer resources but also a complex issue (see Moss and Schneider 2000; Webster 2003; Dessai and Hulme 2004). In this work we will concentrate on the study of uncertainty sources affecting the one-way nested RCM approach for long integration times (20 years). These sources of uncertainty are numerous, and here we will concentrate especially on those that are the consequence of the liberty given to the RCM user by the very nature of nested RCMs; that is, the liberties in the choice of configuration parameters such as domain size, driving information, and nesting method. While some works have been devoted to the development of recommendations regarding optimal setups of these parameters (e.g., Warner et al. 1997), these decisions are left to the user’s judgment in each particular situation. In order to assess the relative importance of the uncertainty introduced by changes in configuration parameters, a preliminary estimation of the uncertainty due to changes in driving data as well as in model physics is also carried out.

The examination of uncertainties attempted in this work is preliminary only since a thorough evaluation at the 20-year timescale could entail prohibitive demands on computer resources. The simulations discussed in this work, for example, account already for many years of computing time. Sources of uncertainty in the RCM’s dynamics and physics, as well as those introduced by Coupled General Circulation Models (CGCMs), must eventually be analysed in depth, though this seems possible only through collaborative projects such as those mentioned above. This estimation is of capital importance in the context of the climate change debate.

As part of this objective, the present work concentrates on the simulation of the North American climate using an ensemble of simulations generated by the Canadian Regional Climate Model (CRCM) with various configuration setups.

The rationale behind using an ensemble of RCM simulations to study uncertainty in the climate downscaling process is discussed in Sect. 2. The model and the experimental configurations are described in Sect. 3. Studies of uncertainty introduced into the North American downscaled climate by modifications of parameters in model configurations are carried out in Sect. 4. From the different climates produced by the ensemble of simulations, a spread is obtained and discussed in order to assess the simulated climate robustness to configuration changes (Sect. 5). Finally, discussion and conclusions are presented in Sect. 6.

2 Rationale

Over the last few years, researchers have suggested the need for ensemble techniques, in either single-model or multi-model versions, in order to achieve a meaningful simulation of regional climate (e.g., Jacob and Podzun 1997; Weisse et al. 2000; Yang and Arritt 2002; Wang et al. 2004). This suggestion follows from the use of these techniques in global model experiments, where the methodology is already well developed (e.g., Murphy et al. 2004). In some cases, the use of ensembles is restricted to the computation of the ensemble average, which commonly displays better skill scores than individual simulations (Hagedorn et al. 2005), and sometimes the ensemble is also used to explore the uncertainties associated with the estimation of the climate (Räisänen 2001). In this section, we discuss the reasons and the rationale behind the use of ensemble methods in this study.

When an RCM is used to simulate the climate over a given region, it is expected that model approximations, driving data imperfections and internal variability may limit the quality of the results, normally measured by one or more skill scores. The skill level attained by a single model is usually achieved after a process in which parameterizations are carefully developed and adjusted. In some cases, especially when parameter choice is not easily inferred from first principles or physical experimentation, an exploratory phase proceeds on a trial-and-error basis and, when results are satisfactory (reaching what the modeler considers a reasonable skill level or an acceptable reproduction of physical processes), the quest for the appropriate parameter values may end.

While this may be a reasonable approach, we should keep in mind that the selection of both the set of parameters and of the skill measures contains a component of subjectivity. A thorough study of a good sample of all possible parameters and skill scores is a gigantic task, and because of that, a sense prevails among many researchers that the values used for some parameters can be no more than educated guesses.

Under these circumstances, one may think that a number of different reasonable model configurations could produce simulations with similar abilities regarding reproduction of physical processes or skill score values by, for example, compensating errors from different processes or locations (e.g., Verhoff et al. 1999). Hence, we may regard any of these possible solutions as somehow interchangeable, and the choice of a particular one as arbitrary. This leads us to think of the simulated climate not as an individual statistical value (e.g., one average value) but as a set of values (e.g., the set of average climates produced with different configurations). This range or distribution of values represents part of the uncertainty associated with the simulated climate.

There are various sources of uncertainty present in an RCM simulation, and they can be divided according to their origin: (a) internal variability (e.g., triggered by differences in the initial conditions), (b) sensitivity to nesting configuration (e.g., domain size and location, relaxation technique, driving imperfections), (c) dependence on RCM physics and dynamics (e.g., type of convective parameterization), and (d) dependence on boundary forcing (e.g., type of GCM).

The internal variability holds a particular role among these sources of uncertainty in the sense that any change introduced in model configuration will trigger it. For this reason the uncertainty introduced by the internal variability will be considered as a sort of “noise level” against which all other sources will be compared (i.e., if a given modification in model configuration produces an uncertainty level similar to that of internal variability, we may consider that no significant effect is introduced by the modification). Internal variability will be discussed in Sect. 4.1.

Model sensitivity to nesting configuration will be developed in Sect. 4.2, in particular, the uncertainty introduced by nesting technique, domain size and errors in observational lateral forcing. These sources of uncertainty are inherent to the one-way nesting technique, and hence their exploration is fundamental for the evaluation of the one-way RCMs as a tool, since large sensitivity to any of these parameters would question the way RCMs are used.

A thorough evaluation of the uncertainty introduced by variety in the model physics of the RCM is an extremely demanding task that will not be performed here. In this work we make a preliminary examination of this issue by using only two different versions of the same model, which may provide us with a first order estimation of the expected changes (Sect. 4.3.1).

The influence of lateral boundary forcing on the simulated climate of RCMs has been stressed several times (e.g., Noguer et al. 1998). This influence becomes particularly important when the RCM is driven by CGCMs, due to the usually biased climate generated by these models. In the present work a preliminary estimate of the uncertainty introduced by lateral forcing from GCMs will be made by evaluating the error introduced by driving the CRCM with the CGCM2 (Sect. 4.3.2).

Each of the experiments discussed above sheds some light on individual sources of uncertainty, but it is also of interest to estimate their combined effect. Section 5 presents an evaluation of the combined effect of some of these sources by selecting a subset of simulations.

3 The CRCM and experiment design

3.1 Brief description of the CRCM

The model versions used for this study are evolutions of the Canadian Regional Climate Model (Caya and Laprise 1999; Laprise et al. 2003; Plummer et al. 2006). This limited-area nested model uses a dynamical kernel (Laprise et al. 1997) based on the fully elastic non-hydrostatic equations solved by a non-centred semi-implicit semi-Lagrangian three-time-level integration scheme with a weak running time filter. The horizontal grid is uniform in polar-stereographic projection and its vertical resolution is variable with a Gal-Chen scaled-height terrain-following coordinate (Gal-Chen and Sommerville 1975). An Arakawa C-type grid is used for the location of the atmospheric variables (with staggering in the horizontal as well as in the vertical). The lateral boundary conditions are provided through the one-way nesting method presented by Davies (1976) and modified by Robert and Yakimiw (1986), which is applied over a 10-point sponge zone. In all simulations except CANDAVIES (introduced in Table 1), an additional large-scale nudging technique developed by von Storch et al. (2000) and modified by Riette and Caya (2002) was applied within the regional domain to weakly force CRCM’s large-scale circulation towards that of the nesting data. The large-scale nudging was applied to horizontal winds of wavelengths larger than 1,400 km only. The intensity of large-scale nudging varies in the vertical, starting from zero just above 500 hPa and increasing to a maximum strength corresponding to a relaxation time of 10 h at the model top (∼10 hPa).

Table 1 Different configurations of CRCM simulations

Three versions of the CRCM were used, which are called 3.6.1, 3.6.3, and 3.7.1. Those in the 3.6 series are very similar to the version used in Laprise et al. (2003), but include a new convective scheme and a slight modification in a parameter associated with cloud formation (Bechtold et al. 2001; Paquin and Laprise 2003). The evolution from 3.6.1 to 3.6.3 includes the elimination of minor coding errors that affected the computation of ground water holding capacity, as well as the introduction of an interactive mixed-layer/thermodynamic-ice lake model developed by Goyette et al. (2000). This lake model simulates the evolution of surface water temperature and ice cover over the North American Great Lakes, and only comes into play when the CRCM is driven by a GCM (simulation NAGCM introduced in Table 1). These versions share some of the subgrid-scale physical parameterization package of the atmospheric GCMii (McFarlane et al. 1992) and of the second-generation Canadian Coupled General Circulation Model (CGCM2; Flato and Boer 2001).

The CRCM 3.7.1 is an important evolution from previous versions. Changes include modifications of soil water capacity and snow mask threshold for the disappearance of snow, a decrease in the bare ground evaporation factor, a new vertical diffusion scheme (Jiao and Caya 2006), a new cloud scheme (Lorant et al. 2002), a new radiation scheme (Puckrin et al. 2004), and a new interpolation technique for the topography (for a detailed description of each of the modifications, please see Plummer et al. 2006).

All simulations were performed using the same resolution, corresponding to a 45 km (true at 60°N) grid-size mesh. In the vertical, 29 unequally spaced levels were used with the lowest thermodynamic level located at about 25 m above the surface and the computational rigid lid near 29 km in height. A time step of 15 min is used for all simulations.
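To make the "true at 60°N" qualifier concrete, the following sketch estimates the ground distance spanned by one 45-km grid interval at other latitudes. It assumes a spherical Earth and the standard polar-stereographic map-scale factor m(φ) = (1 + sin φ₀)/(1 + sin φ); the function name and defaults are illustrative and are not taken from the CRCM code.

```python
import math

def ground_spacing_km(lat_deg, nominal_km=45.0, true_lat_deg=60.0):
    """Ground distance covered by one grid interval at latitude lat_deg.

    Assumption: spherical Earth, polar-stereographic projection with
    map-scale factor m = (1 + sin(true_lat)) / (1 + sin(lat)), so that
    the ground distance equals nominal_km / m.
    """
    m = (1.0 + math.sin(math.radians(true_lat_deg))) / (
        1.0 + math.sin(math.radians(lat_deg)))
    return nominal_km / m

for lat in (30, 45, 60, 75):
    print(f"{lat:2d}N: {ground_spacing_km(lat):.1f} km")
```

Under these assumptions the effective spacing is somewhat finer than 45 km south of 60°N and somewhat coarser north of it.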

3.2 Experiment design

The simulations presented in this work were generated with different model configurations (see Table 1). Changes between configurations come from modifications in the model code (version), domain size, driving data, and nesting technique.

Two domain sizes were used: (a) the North American domain (NA) that covers most of North America and a large portion of the three adjacent ocean bodies on a 201 by 193 gridpoint computational domain (e.g., Fig. 3), and (b) the pan-Canadian domain (CAN), which covers all of Canada and part of the United States on a 193 by 145 gridpoint computational domain (e.g., Fig. 5). Both domains share the location of their northern and western boundaries.

Three sets of data were used to supply the required atmospheric driving information to the CRCM: the National Centers for Environmental Prediction/National Center for Atmospheric Research global reanalyses (NCEP/NCAR; Kalnay et al. 1996), the ERA40 reanalyses from the European Centre for Medium-Range Weather Forecasts (ECMWF, Uppala et al. 2005), and simulated fields from a CGCM2 run (Flato and Boer 2001). In all cases, the atmospheric driving data were available at 6 h intervals and were linearly interpolated to the CRCM 15-min timestep. Both the NCEP/NCAR and the ERA40 reanalyses are publicly available on a 2.5° × 2.5° lat-lon grid, with the ERA40 dataset being degraded from its original grid equivalent to 1.125° × 1.125° lat-lon. The CGCM2 data were available on a Gaussian grid with T32 triangular truncation, approximately equivalent to a 3.75° × 3.75° lat-lon grid.
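The linear interpolation of 6-hourly driving fields to the 15-min timestep can be sketched as follows. This is a minimal illustration, not the CRCM implementation: the function name, the flat-list representation of a field and the sample wind values are all hypothetical.

```python
def interpolate_driving(field_t0, field_t1, step, steps_per_interval=24):
    """Linearly interpolate between two consecutive 6-hourly fields.

    field_t0, field_t1: flat lists of grid values valid at the start and
    end of the 6-h interval. step: index of the 15-min model step inside
    the interval (0 gives field_t0, steps_per_interval gives field_t1).
    """
    if not 0 <= step <= steps_per_interval:
        raise ValueError("step lies outside the driving interval")
    w = step / steps_per_interval
    return [(1.0 - w) * a + w * b for a, b in zip(field_t0, field_t1)]

# 6 h / 15 min = 24 model steps per driving interval
u_00z = [10.0, 12.0]   # made-up values at 00Z
u_06z = [16.0, 6.0]    # made-up values at 06Z
u_03z = interpolate_driving(u_00z, u_06z, step=12)  # halfway: [13.0, 9.0]
```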

When driven by reanalyses, CRCM’s sea surface temperatures and sea-ice cover were prescribed from the AMIP II monthly data (Fiorino 1997; Atmospheric Model Intercomparison Project). Monthly values were interpolated in space and time to serve as time-dependent lower boundary conditions over the oceans and Great Lakes (an algorithm inspired by that of Sheng and Zwiers (1998) was used to preserve the original monthly means). When driven by the CGCM2, CRCM’s sea surface temperatures and sea-ice cover were taken from CGCM2’s daily-simulated values. However, since the North American Great Lakes are not resolved by the CGCM2, the mixed-layer/thermodynamic-ice lake model (Goyette et al. 2000) was activated in the CRCM to simulate the evolution of surface temperature and ice cover over these lakes (simulation NAGCM).

The geophysical fields over land points, such as liquid and frozen soil water content, snow amount and ground temperature, were initialized with the monthly mean values from an existing CGCM climatology. A summary of all simulations discussed in this work is presented in Table 1.

The climate statistics were computed from 6 h output archives for all seasons (defined as 3-month periods); however, the analysis presented in this paper will focus mainly on the winter (December, January, February) and summer (June, July, August) seasons. A 10-gridpoint zone at the perimeter of the domain, corresponding to the Davies relaxation area, was removed for diagnostic and display purposes. At least a 2-year spinup period was allowed for each run.
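The two post-processing steps described above (removing the 10-gridpoint relaxation zone and averaging by season) can be sketched as follows. The helper names and the list-of-rows field layout are assumptions made for illustration. The sketch pools records by calendar month, so it does not handle the detail that a given winter spans December of one year and January and February of the next.

```python
def trim_sponge(field, width=10):
    """Drop `width` grid points on every side of a 2-D field (list of rows)."""
    if width <= 0:
        return [list(row) for row in field]
    return [row[width:-width] for row in field[width:-width]]

SEASONS = {"DJF": (12, 1, 2), "MAM": (3, 4, 5),
           "JJA": (6, 7, 8), "SON": (9, 10, 11)}

def seasonal_mean(records, season):
    """Average the 2-D fields whose month belongs to `season`.

    records: iterable of (month, field) pairs, e.g. one per 6-h archive.
    """
    months = SEASONS[season]
    selected = [f for m, f in records if m in months]
    if not selected:
        raise ValueError(f"no records for season {season}")
    n = len(selected)
    ny, nx = len(selected[0]), len(selected[0][0])
    return [[sum(f[j][i] for f in selected) / n for i in range(nx)]
            for j in range(ny)]

# The NA domain has 201 x 193 points; trimming leaves 181 x 173.
na = [[0.0] * 201 for _ in range(193)]
trimmed = trim_sponge(na)
print(len(trimmed[0]), len(trimmed))  # 181 173
```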

Gridded surface climate data from the Climatic Research Unit (CRU2; Mitchell and Jones 2005) were used to evaluate the simulated seasonal mean variables. The observational datasets were interpolated from their original 0.5° × 0.5° lat-lon global grid onto the CRCM 45 km polar-stereographic grid.

4 Sensitivity experiments

The analysis of uncertainty sources performed in this section is organized as a series of sensitivity studies and is divided into three parts. Section 4.1 discusses the experiments concerning sensitivity to initial conditions (internal variability). Section 4.2 presents experiments regarding sensitivity to changes in important parameters governing the one-way nesting configuration of the CRCM: nesting method, domain size, and driving analysis. Section 4.3 presents two additional sensitivity experiments: one that studies the effect of CRCM physics modifications (Sect. 4.3.1), and another that investigates the effect of large modifications in the information driving the RCM (Sect. 4.3.2). Comparisons are discussed in Sect. 4.4 and are summarized in Table 2.

Table 2 List of CRCM simulations compared in sensitivity tests. Differences between runs, analyzed period, as well as common model configuration, are mentioned

4.1 Uncertainty introduced by internal variability

Internal variability, which has been studied for several decades in global models, has also been detected in RCMs (e.g., Jacob and Podzun 1997; Weisse et al. 2000; Giorgi and Bi 2000; Christensen et al. 2001; Caya and Biner 2004; Alexandru et al. 2007). The presence of internal variability not only implies that instantaneous values from runs produced with a set of varying initial conditions may differ substantially, but also that their climate statistics may differ as well, although it is expected that statistics such as time-averaged values will tend to resemble each other as the number of years included in the average increases. However, uncertainty in the estimation of climate statistics from model simulations deserves attention, particularly with respect to variables having important small-scale variability such as precipitation (Räisänen 2001).

The estimation of internal variability, which entails the realization of an ensemble of climates for every new configuration of the model, is unfortunately very demanding in computer resources. A more affordable approach is to limit the ensemble to the realization of a “twin” experiment, that is, using only two runs differing in their initial conditions. This has been the approach most favored when long simulations are needed (e.g., Giorgi and Bi 2000; Caya and Biner 2004; Rinke et al. 2004). This approach may not provide a very precise estimation, but due to the large number of grid points present in the domain, a fair sense of the variability may be obtained from area-averaged values.

Figure 1 illustrates the spatial root-mean-squared difference (RMSD) between the 3-month averages of simulations NA2 and NA for all seasons over the continental region of the domain. The RMSD is shown for 20 consecutive years (1961–1980). These runs are based on CRCM version 3.6.3 and differ only in their initial conditions, the NA2 run starting 1 month later than the NA run, as mentioned in Table 1. A large inter-annual variation is present for the surface temperature RMSD for all seasons (Fig. 1b), but no trend seems to exist for the entire period. Some inter-annual variability can be seen for precipitation too, also with no apparent trend (Fig. 1a). The lack of trend in these time series reflects the fact that the two simulations are as correlated at the beginning (after a spinup period) as they are near the end of the period. This is expected since the loss of correlation between runs with differences in their initial conditions occurs within the first weeks of integration. As shown by several authors (e.g., Caya and Biner 2004), this loss of correlation is not complete, unlike in global models, but pulsates up and down. This lack of trend in the internal variability differs from results obtained by Wu et al. (2005), who found a monotonic decrease in monthly internal variability in a 4-month integration.

Fig. 1 Time evolution of the spatial root-mean-squared difference (RMSD) between seasonal averages (precipitation in panel a, and temperature in panel b) produced by the NA2 and the NA simulations (only land points are considered). The year is divided into spring (MAM), summer (JJA), fall (SON), and winter (DJF). The studied period is 1961–1980
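For reference, the spatial RMSD between two seasonal-mean fields restricted to land points can be sketched as below. This is an illustrative, unweighted version: whether grid-cell area weights were applied is not stated in the text, and the function name and the list-of-rows layout are assumptions.

```python
import math

def spatial_rmsd(field_a, field_b, land_mask):
    """Root-mean-squared difference over grid points where land_mask is True.

    All three arguments are 2-D lists of rows with identical shape.
    No area weighting is applied (an assumption of this sketch).
    """
    sq_sum, count = 0.0, 0
    for row_a, row_b, row_m in zip(field_a, field_b, land_mask):
        for a, b, is_land in zip(row_a, row_b, row_m):
            if is_land:
                sq_sum += (a - b) ** 2
                count += 1
    if count == 0:
        raise ValueError("land mask selects no grid points")
    return math.sqrt(sq_sum / count)

# Toy 2 x 2 example with one ocean point excluded from the statistic
na_mean = [[1.0, 2.0], [3.0, 4.0]]
na2_mean = [[1.0, 4.0], [3.0, 0.0]]
mask = [[True, True], [True, False]]
rmsd = spatial_rmsd(na2_mean, na_mean, mask)  # sqrt(4 / 3)
```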

The inter-annual variability of the seasonal RMSD may have two different origins. First, different years produce different anomalous circulations, and hence develop different levels of internal variability. Second, the estimation of spread with only two ensemble members is rather inaccurate, and this may still be the case despite averaging over the entire domain.

The internal variability of seasonal averages is relatively important in magnitude, and it is interesting to see how this magnitude is affected when the averaging period is lengthened, that is, when more than one season is included in the average.

Figure 2 depicts the domain-averaged RMSD between seasonally averaged fields for precipitation (a) and surface temperature (b), from “twin” simulations NA2 and NA. The RMSD is shown as a function of averaging-time length for the four seasons. The figure illustrates that the difference between the seasonal averages decreases as the averaging period increases. As shown in the appendix, for variables that are uncorrelated in time and that have little spatial correlation, the difference decreases following a function of the form

$$ \text{RMSD} \approx \frac{C}{\sqrt{N}}\,, $$
(1)

where C is a constant related to the variability of the field, and N is the number of years included in the sample. In order to illustrate the general shape, functions of this form are also plotted in Fig. 2 as thin, solid lines for different values of the constant C.

Fig. 2 Spatial root-mean-squared difference (RMSD) between temporal averages (precipitation in panel a, and temperature in panel b) produced by the NA2 and the NA simulations for the period 1961–1980 (only land points are considered). The length of the temporal average is taken as a variable, and the year is divided into spring (MAM), summer (JJA), fall (SON), and winter (DJF). The thin, solid lines are curves obtained from Eq. 1, for different values of the constant C
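The behaviour described by Eq. 1 can be reproduced with a small synthetic experiment. The sketch below is not the CRCM analysis: it models each "twin" run as independent Gaussian noise with standard deviation sigma at every grid point and year, which is the idealised case with no temporal or spatial correlation. Under that assumption the difference of two N-year means has variance 2σ²/N, so the constant is C = √2·σ. The function name, grid size and seed are arbitrary.

```python
import math
import random

def twin_rmsd(n_years, n_points=2000, sigma=1.0, seed=0):
    """RMSD between N-year means of two synthetic 'twin' runs.

    Each run is independent Gaussian noise (std sigma) per grid point
    and year: no temporal or spatial correlation is represented.
    """
    rng = random.Random(seed)
    sq_sum = 0.0
    for _ in range(n_points):
        mean_a = sum(rng.gauss(0.0, sigma) for _ in range(n_years)) / n_years
        mean_b = sum(rng.gauss(0.0, sigma) for _ in range(n_years)) / n_years
        sq_sum += (mean_a - mean_b) ** 2
    return math.sqrt(sq_sum / n_points)

for n in (1, 5, 10, 20):
    expected = math.sqrt(2.0) / math.sqrt(n)  # C = sqrt(2) * sigma here
    print(n, round(twin_rmsd(n), 3), round(expected, 3))
```

With a few thousand uncorrelated points the sampled RMSD should stay close to the theoretical curve. Fields with strong spatial correlation have fewer effective degrees of freedom, so a two-member estimate would be expected to scatter more widely around it.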

The RMSD for precipitation seems to follow the general form of Eq. 1 quite closely in all seasons. Summer months exhibit the largest internal variability, probably due to the more random character of convective precipitation. Winter depicts the least variability, while spring and fall show similar behavior of intermediate amplitude.

Surface temperature RMSD does not follow the curves defined by Eq. 1 as neatly as does precipitation, and some results (not shown) suggest that this departure may not be attributable to a slow mode of internal variability triggered by snow feedback mechanisms. One possibility is that since surface temperature has a higher spatial correlation than precipitation, the spatial average could be less effective in removing the sampling noise, hence leaving an important accumulation of errors. It is interesting to note that the surface temperature noise level produced by CRCM’s internal variability is only a fraction of the 0.5°C found by Giorgi and Francisco (2000) for several regions of North America, from a 30-year annual mean generated by an ensemble of global model simulations.

This experiment was repeated with CRCM version 3.7.1 (see simulations NA4 and NA3 in Tables 1, 2) and results (not presented) show that the internal variability is in general rather similar to that of version 3.6.3 described in this section. There are some differences worth mentioning, however. Internal variability in fall precipitation is around 30% more intense, while spring surface temperature internal variability is around 40% smaller, becoming comparable to that of summer and fall.

Differences found in this experiment suggest that individual studies may be needed for specific model configurations in order to reach a good estimation of internal variability. However, given the required computer resources for this approach, and the fact that both estimations of internal variability (using CRCM 3.6.3 and 3.7.1) give qualitatively similar results, it is assumed for the purpose of this work that the estimated values are representative of other model configurations presented in the next section. It should be noted, however, that internal variability in RCMs increases with domain size (e.g., Lucas-Picher et al. 2004). Hence, in order to avoid the risk of underestimating internal variability we have chosen results from the simulation that is performed over the largest domain.

In addition to the area-averaged values, it is also interesting to study the geographical distribution of internal variability. Figure 3 depicts the difference field NA2-NA for 20-year average precipitation and surface temperature, for the summer and winter months (period 1961–1980). These fields show the differences responsible for the RMSD of the longest averaging length presented in Fig. 2. In the case of precipitation, several noisy structures (small-scale features unrelated to surface forcing such as topography or land-water contrasts) are visible over a large part of the domain in both seasons. It is also worth noting that the amplitude of the noise increases from west to east, as is usually the case with internal variability growing away from the inflow western boundary. The noise maximum for precipitation is displaced further into the east during winter than during summer, which is probably related to the fact that strong convection occurs over the Atlantic Ocean and that stronger winds sweep the perturbations eastward. For surface temperature it is winter that presents the largest amplitude, although differences do not extend to the ocean surface (both simulations used prescribed SSTs; see Sect. 3.2). In addition, some structures in the northern regions (e.g., Baffin Island) seem to be more conspicuous than the surrounding noise. An analysis of the time series of several variables at several locations (not shown) suggests that the amplitude of these features is probably due to the small sample size and not to the existence of a wider range of timescales. For example, no evidence of a trend was found in either soil temperature, water content or snow accumulation.

Fig. 3 Internal variability experiment: differences of 20-year averages (period 1961–1980) between runs NA2 and NA (NA2-NA), for the summer (left-side panels) and winter (right-side panels), for precipitation (in mm/day, upper panels) and temperature (in °C, lower panels)

In the case of precipitation, a better idea of the magnitude of the internal variability can be obtained by studying the relative difference between the two simulations (Fig. 4). The length-scale of these structures seems to be smaller in summer than in winter, likely a consequence of summer convection. It can be seen that this relative difference is rarely larger than 10%.

Fig. 4 Internal variability experiment: relative differences in precipitation 20-year averages (period 1961–1980) between runs NA2 and NA [(NA2-NA)×NA⁻¹], for the summer (left-side panel) and winter (right-side panel)
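The relative difference (NA2-NA)×NA⁻¹ can be computed point by point as in the following sketch. The masking threshold `min_ref` is a hypothetical safeguard against division by very small reference precipitation. It is not described in the text, and the sample values are made up.

```python
def relative_difference(field_test, field_ref, min_ref=0.1):
    """Return (test - ref) / ref at each grid point.

    Points where the reference value is below min_ref (a hypothetical
    threshold, here in mm/day) are returned as None because the ratio
    would be ill-conditioned there.
    """
    return [[(t - r) / r if r >= min_ref else None
             for t, r in zip(row_t, row_r)]
            for row_t, row_r in zip(field_test, field_ref)]

# Made-up 20-year mean precipitation values (mm/day)
na = [[2.0, 4.0], [0.01, 5.0]]
na2 = [[2.1, 3.8], [0.02, 5.0]]
rel = relative_difference(na2, na)
# rel[0][0] is about +5%, rel[0][1] about -5%, rel[1][0] is masked
```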

These results provide an estimation of the noise level to which deviations originating from configuration changes should be compared, in order to establish their significance.

4.2 Uncertainty caused by nested-model configuration variety

4.2.1 Sensitivity to nesting technique

In the last few years, the idea of forcing large-scale patterns within the entire regional domain has gained momentum (e.g., von Storch et al. 2000; Riette and Caya 2002; Miguez-Macho et al. 2004). Here, simulations differing only in nesting method are compared: the CANDAVIES using only nudging in the sponge area (Davies 1976), and the CAN using, in addition, large-scale spectral nudging in the entire domain. See Table 2 and Sect. 3.1 for a description of the nesting technique.

Figure 5 shows the difference field CANDAVIES-CAN for 20-year average precipitation and surface temperature, during summer and winter (period 1975–1994). Summer precipitation reveals well-organized differences between the two runs, with the Canadian region being drier in the CANDAVIES simulation. The opposite effect is seen in most of the continental US. In the region of the outflow boundary (on the right-hand side of the domain) spurious precipitation in the large-scale nudged (CAN) simulation has clearly decreased. Relative differences (not shown) present a similar pattern, with precipitation being reduced by between 10 and 20% over Canada and being increased by a similar amount over most of the continental US. During winter, the CANDAVIES simulation seems to be drier, especially in British Columbia and over the Atlantic, to the east of Canada. Relative differences (not shown) indicate that a decrease in precipitation of around 10% covers almost all Canadian territory while in the polar region an increase of around 10% is found.

Fig. 5 Nesting technique experiment: differences of 20-year averages (period 1975–1994) between runs CANDAVIES and CAN (CANDAVIES-CAN), for the summer (left-side panels) and winter (right-side panels), for precipitation (in mm/day, upper panels) and temperature (in °C, lower panels)

For surface temperature, the CANDAVIES is warmer than CAN in a large part of the domain in all seasons. During summer, differences surpassing two degrees are concentrated in the north of North America, while in winter, as illustrated in the lower-right panel of Fig. 5, differences are important over the entire Canadian territory. It is worth noting that similar positive differences are also found between the surface temperature fields from CANDAVIES and those of the driving NCEP reanalyses (not shown). This is consistent with the fact that, due to large-scale nudging, CAN should be closer than CANDAVIES to the driving fields (note, however, that nudging was only applied to the wind field, and only above 500 hPa). The cause of the difference between CANDAVIES and CAN is not clear, but it is possible that it originates in systematic errors in the model physics. While it has become clear that large-scale nudging is beneficial for both diminishing the internal variability and preventing large departures from the driving fields (Miguez-Macho et al. 2004), not enough is yet known about possible negative effects such as diminished ability to produce small scales or distorted spectral power. For this reason we believe that the choice of nesting method is still open to the researcher’s judgment, and hence the results presented here are an estimation of the uncertainty related to this freedom of choice.

4.2.2 Sensitivity to domain size

The sensitivity of RCMs to domain size and location has been reported several times and constitutes an acknowledged drawback for regional climate simulations (e.g., Jones et al. 1995; Jacob and Podzun 1997; Seth and Giorgi 1998; Juang and Hong 2001; Pan et al. 2001; Rojas and Seth 2003; Vannitsem and Chomé 2005).

Figure 6 depicts the difference field CAN2-NA for 20-year average precipitation and surface temperature, for summer and winter (period 1975–1994). Precipitation during summer seems to be most affected in the southeastern part of the domain. Relative differences (not shown) present a similar pattern, with precipitation being reduced by around 10% over Canada and increased by more than 25% in the central US. During winter, differences in precipitation concentrate mostly on the New England states and the adjacent Atlantic waters. Here, the relative differences (not shown) indicate an increase reaching 50%. Surface temperature fields indicate that the simulation over the small domain is significantly cooler in most of the north and east of North America in both seasons. It is worth noting that these simulations have been performed with large-scale nudging, and hence large-scale winds are relaxed towards a common pattern.

Fig. 6 Domain-size experiment: differences of 20-year averages (period 1975–1994) between runs CAN2 and NA (CAN2-NA), for the summer (left-side panels) and winter (right-side panels), for precipitation (in mm/day, upper panels) and temperature (in °C, lower panels)

It is believed that the differences identified in this experiment could be reduced by using a more constraining large-scale nudging (see for example, Miguez-Macho et al. 2004). However, since the debate regarding the risks and benefits of a forceful nudging is far from over, we believe that the sensitivity shown in this test is a reasonable estimate of the uncertainty introduced by the freedom of choice of domain size.

It is known that simulations over larger domains deviate from the driving fields more than those over smaller domains (Lucas-Picher et al. 2004). This may explain the generally cooler surface temperature in CAN2 as a result of this simulation being closer to the driving data, as discussed in the previous section (see Fig. 5). This explanation, however, applies less clearly for precipitation during summer, and even less during winter. As shown in Fig. 5, differences due to nesting technique in winter precipitation do not seem as large as those of Fig. 6. As can be seen in Table 2, differences between the configurations of runs CAN2 and NA also include different starting times. Experiments indicate, however, that starting time accounts for little of the variability present in the field (see Fig. 3). For the case of slow-varying variables such as soil water content, experiments (not shown) indicate that they tend to stabilize after a spin-up time of around 2 years (simulation CAN2 has a 2-year spin-up time, while NA has 16 years). Hence, most differences between simulations may be attributed to the change in domain size.

4.2.3 Sensitivity to driving with different observational datasets (ERA40 vs. NCEP reanalyses)

This section studies the uncertainty introduced into the downscaled climate through the lack of precision/coverage in the atmospheric observation system, here represented by the use of different objective reanalyses from NCEP and ECMWF. The same monthly ocean data is used in both cases. Similar experiments have already been performed recently (e.g., Rinke et al. 2004; Wu et al. 2005), although for periods shorter than one year. In addition, Liang et al. (2004) presented results for precipitation over continental US for two particular summer seasons.

Figure 7 depicts the difference field NAERA-NA for 20-year average precipitation and surface temperature, for summer and winter (period 1961–1980). During summer, precipitation in northern Canada differs slightly, particularly in western Quebec. Although their structure is noisy, the differences are larger than those originating from internal variability (see Fig. 3). The southeastern part of the continent seems to be affected by a dipole of positive/negative precipitation anomaly, with the ERA40-driven run wetter over the Caribbean region and drier over a crescent-shaped area in the central United States and northern Mexico. A large difference of several mm/day is seen in the southeast of the domain, probably related to differences in the way humidity in this inflow region interacts with model physics at the edge of the sponge (the sponge zone is not included in the figure). Relative differences (not shown) present a similar pattern, with precipitation being reduced by between 5 and 10% in most drier areas and reaching values of around 30% reduction in northern Mexico. An increase of more than 20% is found in the Caribbean region. During winter, differences in precipitation are smaller and mostly located near the British Columbia Coast Mountains. However, inspection of the relative differences in this area (not shown) reveals that the increase is small, around 5%. Over northern Mexico, the reduction in precipitation reaches 25%. Most of the differences in precipitation between simulations are similar to those between driving data (not shown). Important exceptions are northern Quebec and Florida during summer, and the southeastern US during winter, where the ERA40 dataset is drier than NCEP. None of the characteristics mentioned here seem to agree with the results obtained by Liang et al. (2004) in their study of two single summer seasons over the continental US.

Fig. 7 ERA40/NCEP experiment: differences of 20-year averages (period 1961–1980) between runs NAERA and NA (NAERA-NA), for the summer (left-side panels) and winter (right-side panels), for precipitation (in mm/day, upper panels) and temperature (in °C, lower panels)

Surface temperature shows a positive difference for the ERA40-nested run during summer over most of the continent, which becomes particularly strong along the diagonal axis between Oregon and Labrador. Differences in surface temperature during winter become stronger over northern Canada. As shown in Rinke et al. (2004), daily temperature values in NCEP and ERA40 reanalyses may differ greatly, with even larger differences in polar regions during winter. The generally warmer values in NAERA relative to NA are also present in the driving data, that is, in ERA40 relative to the NCEP reanalysis (not shown); the structures, however, show important dissimilarities in summer over the western coast, and in winter over the continental US, where ERA40 is warmer than the NCEP reanalysis.

4.3 Other uncertainties affecting RCM downscaling

4.3.1 Sensitivity to driving data (GCM vs. NCEP reanalyses)

In this section a comparison is made between surface fields downscaled by the CRCM nested in two different ways: one with NCEP reanalyses and the other with CGCM2-generated fields (see NAGCM-NA in Tables 1, 2). Differences between these fields can be mostly seen as a CRCM-propagation of errors already present in the driving GCM data, these errors being the consequence of the difficulties of the driving GCM in simulating some of the climatic features of North America.

The error introduced by driving the CRCM with the CGCM2 may be assumed to be representative of errors (although not necessarily with the same sign) introduced by driving the CRCM with other CGCMs of the same complexity. As a result, we can think of these differences as a preliminary estimation of the uncertainty introduced by CGCMs when used to drive the CRCM.

Figure 8 shows that during summer, precipitation displays small differences in the North American continent north of Mexico, with a relatively dry bias in the GCM-driven run over the US, and a wet bias over Canada. Relative differences (not shown) present a similar pattern, with precipitation being reduced by around 10% over the central and western US, increased by 5% over Canada and by up to 25% over the New England region. During winter, a similar pattern persists, with differences in precipitation even less evident in absolute values. The relative differences, however, reach larger values over the central and western US, where precipitation is decreased by more than 25%.

Fig. 8 GCM/NCEP reanalysis experiment: differences of 20-year averages (period 1961–1980) between runs NAGCM and NA (NAGCM-NA), for the summer (left-side panels) and winter (right-side panels), for precipitation (in mm/day, upper panels) and temperature (in °C, lower panels)

Continental surface temperatures show little difference during summer, especially over the Canadian territory. The US, however, is affected by a cold-warm west-east dipole. Inspection of the driving fields revealed that, during summer, both downscaled simulations are much more similar to each other than are the driving fields. This suggests that, throughout summer, the CRCM is only weakly dependent on the atmospheric and ocean fields that drive it. This behavior has been reported in past publications (e.g., Noguer et al. 1998; Han and Roads 2004), and it is relevant because the opposite argument is generally used to suggest that any improvement in RCM climate change simulations is contingent on improvements in global models.

On the other hand, surface temperature differences are very large during winter, especially in the northern region. Differences between the CGCM2 and the NCEP reanalyses seem to be responsible for most of the bias (not shown). Contrary to the summer case, the information provided by the nesting data is of paramount importance during winter, especially for northern latitudes.

Over the oceans, differences in surface temperature and precipitation are strongly influenced by the different ocean information driving the NAGCM and NA simulations (see Sect. 3.2).

4.3.2 Sensitivity to RCM model version

In this section a comparison between simulations NA3 and NA, from CRCM versions 3.7.1 and 3.6.3, respectively, is performed (see Sect. 3.1 for description of differences, as well as Tables 1, 2). The differences in the surface fields produced by the two model versions can be thought of as a preliminary estimation of the uncertainty introduced by the users’ freedom of choice among RCMs.

Figure 9 displays the difference field NA3-NA for 20-year average precipitation and surface temperature, for summer and winter (period 1961–1980). During summer, in the updated version 3.7.1, precipitation is reduced over land while increased markedly over the Caribbean and Atlantic Ocean. Relative differences (not shown) present a similar pattern, with precipitation being reduced by around 50% over most of the continent. During winter a similar but weaker reduction of precipitation is found over the continent, although an increase is present in the states surrounding the Gulf of Mexico. The relative decrease over the continent is of around 10–20%, while the increase is also of around 10–20%.

Fig. 9 Model version experiment: differences of 20-year averages (period 1961–1980) between runs NA3 and NA (NA3-NA), for the summer (left-side panels) and winter (right-side panels), for precipitation (in mm/day, upper panels) and temperature (in °C, lower panels)

Summer surface temperature has changed substantially, with a large increase in the central and southern US, and a decrease over most of Canada. This decrease is particularly strong in northern Quebec, Labrador, Yukon and Alaska and over the high mountain peaks of British Columbia (the last being related, partially, to the increase in effective topography resolution discussed in Sect. 3.1, since corrections for height differences were not performed). During winter, a relative cooling occupies most of North America with the exception of the Baja California region and the Dakotas. Notably, this decrease extends into the Gulf of Mexico and the adjacent Atlantic waters despite the fact that sea-surface temperatures are provided by the same AMIP II data in both runs.

4.4 Summary

Figure 10 displays a summary of the sensitivity experiments presented in this section. It shows the RMSD between 20-year averaged pairs of simulations considering only land grid points. Figure 10a shows surface temperature and Fig. 10b precipitation. Experiments have been ordered with increasing RMSD values for surface temperature with the aim of suggesting a ranking among the uncertainties introduced by different sources. This order has been maintained for the display of precipitation as well, although the ranking is somewhat different. This ranking, however, should be understood as a first approximation; as was said in the introduction, a much larger number of ensembles, of parameter varieties and model types would be needed for a more reliable description, and hence Fig. 10 should be considered only as a hint of the overall relationships.
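For readers wishing to reproduce this kind of summary statistic, a minimal sketch of an RMSD restricted to land grid points is given below. It assumes an unweighted average over grid points (no area weighting, which the text does not specify); the function name, the mask, and the small arrays are hypothetical and serve only as an illustration.

```python
import numpy as np

def rmsd_land(field_a, field_b, land_mask):
    """Root-mean-square difference between two time-averaged fields.

    Only grid points where land_mask is True are used. Assumption: every
    land grid point carries equal weight.
    """
    a = np.asarray(field_a, dtype=float)
    b = np.asarray(field_b, dtype=float)
    mask = np.asarray(land_mask, dtype=bool)
    squared_diff = (a - b)[mask] ** 2
    return float(np.sqrt(squared_diff.mean()))

# Hypothetical 2 x 2 climatologies and land mask (illustrative only)
run_a = np.array([[1.0, 2.0], [3.0, 4.0]])
run_b = np.array([[1.0, 0.0], [0.0, 4.0]])
land = np.array([[True, True], [False, True]])
rmsd = rmsd_land(run_a, run_b, land)
```

In this toy case the ocean point, where the two fields differ by 3, is excluded, so only the three land points contribute to the result.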

Fig. 10 RMSD between 20-year seasonal average climatologies (land grid points only) for a surface temperature, and b precipitation. The different boxes indicate the pair of runs evaluated. “Inter. var.” compares NA2 and NA, which differ only in initial conditions; “ERA-NCEP” compares NAERA and NA, which differ only in the nesting fields; “Domain” compares CAN2 and NA, which differ only in the integration domain; “Nesting” compares CANDAVIES and CAN, which differ only in nesting technique; “GCM-NCEP” compares NAGCM and NA, which differ only in the nesting fields; and “CRCM ver.” compares NA3 and NA, which differ in model version. Note from Table 2 that “Domain” and “Nesting” are computed in a different temporal window from the others

The results presented in Fig. 10 show that, after a 20-year averaging, the largest sources of uncertainty are those introduced by the driving model and the RCM itself (choice of GCM and RCM), while those inherent to one-way nesting seem less important. Internal variability constitutes a small part of the uncertainty for both variables.

A change in driving reanalyses introduces a perceptible effect in the downscaled fields, especially for summer precipitation, which is affected by the third largest uncertainty after the GCM-driven and the modified CRCM version simulations. This result, also observed by Wu et al. (2005) for a 4-month integration, is important because it shows that even small differences in the observed fields (that is, our limited capacity to unambiguously define the state of the atmospheric system at a given time) can be more damaging than the uncertainty introduced by latitude in the choice of domain size and nesting technique.

The uncertainties introduced by the freedom of choice in domain size and in nesting technique seem to be comparable, with their relative importance depending on the season and the variable. It is important to mention that these two sensitivity experiments were performed over different time slices (see Table 2), and hence some of the differences could also be accounted for by differences in circulation patterns between periods (variability in the climate system).

In the case of surface temperature, the season that is most or least affected changes from experiment to experiment. For precipitation, on the other hand, owing to the important seasonal changes in absolute precipitation, summer is generally the most affected, and winter the least. An exception to this rule occurs in the experiment concerning sensitivity to domain size (see Sect. 4.2.2), where fall and spring become, respectively, the most and least affected.

5 Combined effects of uncertainties

As discussed in Sect. 1, the optimal way of studying the uncertainty introduced by the one-way nesting configuration would be to use a large ensemble of simulations, in which all the parameters under investigation are modified for all possible combinations. This approach is very demanding in computer resources and beyond the possibilities of our, and most, research centers.

However, with the resources at hand it is possible to consider the combined effect of some uncertainty sources by constituting an ensemble of simulations from some of those presented in the sensitivity studies (Sect. 4). Given the small size of the ensemble and the way configurations were varied, this ensemble will not capture the complex non-linear interactions acting when two or more configuration parameters are changed simultaneously. However, it will give a preliminary indication regarding which areas seem more sensitive to parameter modification.

The choice of ensemble members may be performed in different ways; here it was decided to include only those runs displaying comparable levels of uncertainty (as seen in Sect. 4.4), and that pertain mostly to CRCM configuration choices.

From the list of simulations presented in Table 1, four are chosen (CAN, CANDAVIES, NA, NAERA). The choice of the members participating in the ensemble is done by selecting only those that cover the 20-year period of study (1980–1999), with the condition that no more than one member of a given configuration participates. The reason for this is that simulations differing only in initial conditions were shown to be quite similar (see Sect. 4.1 and Fig. 3), and hence, the inclusion of more than one member with the same configuration would unfairly give that configuration more weight in the average. No attempt has been made to optimize the ensemble mean by minimizing the contribution of those runs with poorer performance (as suggested by Giorgi and Mearns 2002, for example).

For the sake of completeness, the ensemble mean as well as the corresponding observed fields are also presented, although the discussion will be centered mostly on ensemble spread.

Figure 11 displays 20-year average surface temperature during summer and winter for the CRU2 observed mean (upper panels), the CRCM-modeled ensemble mean (central panels), and the ensemble spread, which is the standard deviation between time averages (lower panels). CRCM-modeled surface temperature fields represent rather well the structure of the observed fields, particularly in the small-scale orographic details over the Rocky Mountains. Some biases are present, however, and they are unevenly distributed regionally and seasonally, particularly in the mountainous region during summer (cold bias) and in the polar region during winter (cold bias). For the ensemble spread, the largest values in summer are found in the western and northern regions of Canada, reaching values higher than 1.0°C in western Nunavut. There are areas of very low spread in western Alaska and the central US, where values smaller than 0.1°C are reached. During winter, much of northern Canada and Alaska have the largest spreads, and areas in the northern and western US also have large values that exceed 1.0°C. A large area of low values is located in the eastern US, while an isolated, well-defined minimum in Oregon and Washington states is lower than 0.1°C.
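The ensemble mean and the ensemble spread described above (the standard deviation between the members' time averages) can be sketched as follows. This is an illustrative example only: the function name is hypothetical, the numbers are made up, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the normalization used for the four-member ensemble is not stated.

```python
import numpy as np

def ensemble_mean_and_spread(member_means, ddof=1):
    """Grid-point ensemble mean and spread from per-member time averages.

    member_means: sequence of arrays with identical shape, one per run.
    Assumption: spread is the sample standard deviation (ddof=1); pass
    ddof=0 for the population form.
    """
    stack = np.stack([np.asarray(m, dtype=float) for m in member_means], axis=0)
    return stack.mean(axis=0), stack.std(axis=0, ddof=ddof)

# Four hypothetical members, two grid points each (temperatures in °C)
members = [
    np.array([10.0, 5.0]),
    np.array([12.0, 5.0]),
    np.array([14.0, 5.0]),
    np.array([16.0, 5.0]),
]
mean, spread = ensemble_mean_and_spread(members)
```

With only four members, the choice between `ddof=0` and `ddof=1` changes the spread by roughly 15%, so the convention should be stated whenever such maps are compared.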

Fig. 11 Upper panels display observed 20-year average surface temperature (CRU2 in °C), while central panels show model ensemble mean. Lower panels display the ensemble spread. Summer (left-side panels) and winter (right-side panels) seasons are presented (period 1980–1999)

Figure 12 displays 20-year seasonal average precipitation during summer and winter for the CRU2 observed mean (upper panels), and the CRCM-modeled ensemble mean (central panels). The lower panels depict the spread between 20-year seasonal averages. The CRCM-produced fields are very rich in small scales especially over mountainous regions. It is worth mentioning that these small scales are not the product of sampling noise (they are much larger than those seen in the internal variability experiment presented in Fig. 3), but the product of factors such as topographic features. Small scales are not so clearly present in the CRU2 database. The lack of a trustworthy high-resolution precipitation dataset makes the task of quality evaluation of small-scale spatial patterns rather difficult for most areas, and almost impossible in northern regions, where the inter-station distance is much larger than model resolution. The CRCM produces more precipitation than observed almost everywhere, especially during summer, although the gradients at the continental scale are well represented: an increase in precipitation from west to east for most of North America, and a decrease toward the pole in northern Canada and especially in Quebec.

Fig. 12 Upper panels display observed 20-year average precipitation (CRU2 in mm/d), while central panels show model ensemble mean. Lower panels display the ensemble spread. Summer (left-side panels) and winter (right-side panels) seasons are presented (period 1980–1999)

The ensemble precipitation spread during summer follows somewhat the pattern of the ensemble mean, which is an expected result due to the nature of precipitation frequency distributions. This is not, however, the case for orographic precipitation in western Canada, where low spread is present despite intense precipitation. A region of low spread and high precipitation is also found in the Carolinas. During winter, precipitation spread is also correlated with precipitation intensity, with the exception, again, of orographic precipitation in northern British Columbia and Alaska. A noteworthy spread maximum is also found south of New England. Inspection of the coefficient of variation (not shown; defined as precipitation ensemble spread over precipitation ensemble mean) reveals summer values below 10% over most of the continent, but reaching 15% in the area centered on northern Mississippi. For winter, the coefficient of variation is also lower than 10% over most of the domain, although values surpassing 20% can be found in an area encompassing Nebraska and Kansas as well as south of New England.
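The coefficient of variation defined above (ensemble spread divided by ensemble mean) can be computed as in the following minimal sketch. The function name, the masking threshold `min_mean`, and the sample values are assumptions introduced for illustration; they are not taken from the simulations.

```python
import numpy as np

def coefficient_of_variation(spread, mean, min_mean=0.1):
    """Ensemble spread over ensemble mean, expressed in percent.

    Assumption: grid points with a mean below min_mean (mm/day) are set to
    NaN, since the ratio is not meaningful where precipitation is near zero.
    """
    spread = np.asarray(spread, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cv = np.full(mean.shape, np.nan)
    valid = mean >= min_mean
    cv[valid] = 100.0 * spread[valid] / mean[valid]
    return cv

# Hypothetical precipitation spread and mean at three grid points (mm/day)
cv = coefficient_of_variation([0.2, 0.45, 0.05], [4.0, 3.0, 0.01])
```

The third grid point is masked because its mean falls below the threshold, which keeps very dry areas from dominating a map of the ratio.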

6 Conclusions

The objective of this work is to study some of the uncertainties related to freedom of choice in RCM configuration, and to evaluate these uncertainties in light of those introduced by other sources such as internal variability, driving (GCM) and regional (RCM) model selection. This study has been carried out from simulations performed with the Canadian Regional Climate Model (CRCM) over North America. The approach followed consisted in the analysis of a set of simulations performed with different model configurations. Before analyzing the effect of parameter changes on model results, the internal variability was investigated with the aim of estimating the intrinsic noise level against which all other experiments should be compared. Internal variability is shown to be small: except for the polar region, a 20-year simulation may be considered to have an internal noise of the order of 0.2°C in surface temperature and relative differences in precipitation of less than 10%. Experiments also show that the estimation of internal variability is weakly dependent on model version. Experiments concerning changes in nesting technique, domain size, and observational driving data, displayed sensitivities greater than those generated by internal variability. This study did not aim to understand the causes of the mentioned sensitivities, but to estimate the relative uncertainty introduced by each modification.

Additional sensitivity experiments were performed with the aim of gauging the relative importance of the uncertainties introduced by RCM configuration liberty with respect to those dependent on model choice (driving GCM and driven RCM).

A sensitivity test was performed by comparing simulations where the driving data were changed from NCEP reanalyses to GCM-generated fields. This experiment aimed to estimate the uncertainty produced by the possible use of a variety of GCMs. The difference between fields downscaled from the observed data and those downscaled from the GCM simulations is a first order estimation of the uncertainty introduced by the diversity of existing GCMs. The last sensitivity test was performed by changing the control CRCM for an updated version that included several modifications. The aim of this test was to estimate the uncertainty introduced by the availability of multiple imperfect RCMs of similar skill.

Results from the sensitivity studies suggest a ranking regarding the relative importance of uncertainty sources due to model configuration changes; although by no means applicable on a general basis, the ranking provides some good news. For example, it is interesting to see that the uncertainty introduced by changes in domain size or nesting technique is comparable to that of a change in atmospheric reanalyses (which are our best approximation to reality). In addition, these three sensitivities (to nesting technique, to domain size, and to driving observational dataset) are considerably smaller than those depending on the GCM and RCM chosen to perform the downscaling. These results reduce the relative importance of the fact that RCMs show in general sensitivity to domain and nesting technique choices.

The comparatively large effect of a change in RCM version, which in certain places surpasses the effect of changing the nesting atmospheric reanalyses for GCM-simulated data, indicates that substantial inter-version differences are to be expected, and hence the use of multi-version or multi-model ensembles becomes imperative. Results also suggest that during summer the CRCM behaves somewhat independently of the driving data and hence much of the responsibility for a successful simulation belongs to the RCM.

The ensemble spread estimated from just four members is intended to be no more than a qualitative guide to the robustness of the estimated climate in different regions. Areas of large spread may not only identify regions of variable skill in the model, but also regions that are sensitive to small configuration changes in general, and hence, that may pose particular difficulties to numerical modeling. It is expected that future studies taking advantage of newly available simulations over North America will be able to make more reliable estimations of uncertainty than those presented in this work.