1 Introduction

Available global climate projections provided in the Fourth and Fifth Assessment Reports (AR4 and AR5) of the Intergovernmental Panel on Climate Change (IPCC 2007, 2013) suggest an increase of several degrees in global temperatures by the end of the 21st century. This primary effect of human greenhouse-gas emissions is likely to be accompanied by a global acceleration of the water cycle (Bosilovich et al. 2005; Huntington 2006) due to changes in evaporation and precipitation patterns. Overall, climate change should have an impact on water resources management, agriculture and other crucial activities from local to global scale. However, climate projections are based on General Circulation Models (GCMs) that are run at too coarse a resolution to capture heterogeneities found at smaller scales. Grid cells of several degrees in resolution smooth out regions such as mountain ranges, where they fail to represent topography-induced gradients in hydro-climatic variables (Buytaert et al. 2010). Therefore, GCMs do not provide directly usable output for local climate change impact studies, particularly in topographically varied areas such as steep mountain regions.

In recent years, bias correction and downscaling techniques have been developed to adapt the output of GCMs for impact studies (Gleick 1986; Maraun et al. 2010; Teutschbein et al. 2011). Among them, statistical downscaling is a straightforward approach that establishes a relationship between climate model output as independent variables and in situ observations of desired parameters as dependent variables (Teutschbein et al. 2011). The fitted relationship is then extrapolated in time to predict future conditions at the same site from climate model projections. Although these methods are useful to create data relevant to the impact community from global projections, some drawbacks exist. First, the relationship between GCM coarse grid cell predictions and point data is assumed to remain steady with time (Leung et al. 2003). Second, some of the most elaborate methods like SDSM (Wilby et al. 2002) require long time series of observational data to establish a robust statistical relationship. Such data are not available in many parts of the world, thereby reducing the choice of suitable downscaling methods in data-scarce environments like high mountain headwater catchments.

Two climate projections can lead to heterogeneous weather time series when downscaled using the same approach. Hydrologists have addressed this source of uncertainty by driving single hydrological models with sets of boundary conditions derived from ensembles of GCM simulations. Both single climate models initialized from different conditions (Wood et al. 2002) and multiple climate models (Buytaert et al. 2009, 2010; Andersson et al. 2011; Bennett et al. 2012; Immerzeel et al. 2012; Andres et al. 2013) have been used to create ensembles of weather projections.

Along with the uncertainty introduced by GCMs and downscaling processes, the use of rainfall-runoff models to translate meteorological variables into hydrological variables introduces further uncertainties. Dobler et al. (2012) concluded that the uncertainty introduced by different hydrological parameter sets was minor compared to climate modelling and downscaling. However, several model inter-comparison projects have pointed out that conceptual differences between hydrological model structures dwarfed the parameter uncertainty despite the availability of high quality calibration datasets (e.g. Chiew et al. 1993; Smith et al. 2004; Breuer et al. 2009). Little research is available that addresses the relative contribution of GCMs and hydrological model structures to runoff projection uncertainty, despite the increasing concern of the hydrological community about the reliability of downscaling approaches (Blöschl and Montanari 2010; Ehret et al. 2012). Bastola et al. (2011) applied 4 conceptual hydrological models to predict discharge forced by inputs from 3 regionalized GCMs driven by 2 emission scenarios in temperate oceanic Irish catchments. They concluded that the uncertainty introduced by hydrological models themselves is substantial and should be included in any impact study. A combination of 8 GCMs, 2 emission scenarios and 4 hydrological models of various complexities applied in the oceanic Tualatin catchment of Oregon, US, showed that hydrological model selection was less important than GCM uncertainty for total runoff (Najafi et al. 2011), but hydrological model uncertainty gained importance during low flows. Similar conclusions were drawn by Velázquez et al. (2013) in two temperate catchments located in Bavaria and Québec, and by Bosshard et al. (2013) for a catchment in the Swiss Alps.
These studies highlight the need to consider multiple hydrological models when projecting water resources availability under increasing drought or flood risk with climate change (Bastola et al., 2013). A previous study by Lavado Casimiro et al. (2011) compared two monthly water balance models in combination with three GCMs and two emission scenarios for conditions similar to our study. They aimed to identify the most suitable hydrological model for various catchments in the Peruvian Andes and did not provide estimates of the contribution of different elements to the final uncertainty in runoff projections.

All these studies have been carried out in well monitored environments where runoff calibration data were available over time periods long enough to reliably constrain model parameters. In this paper we address the sources of uncertainty in runoff projections for a remote, poorly monitored, headwater catchment of the Ecuadorian Andes that is representative of tropical mountain regions. Here, we investigate the relative contribution of GCMs and hydrological models to the final uncertainty in runoff projections to assess how these two sources of uncertainty in runoff projections compare to each other in places where data available for hydrological, or impact, model calibration is scarce.

2 Materials and methods

2.1 Study area and available data

A brief description of the San Francisco catchment follows. For more details, the reader is referred to the Electronic Supplemental Material. The San Francisco catchment covers an area of 75 km² on the eastern side of the Andean Cordillera in Southern Ecuador. Elevation ranges from 1,720 m to 3,155 m above sea level. Annual rainfall rates vary from 1,500 to 4,900 mm yr−1 (Rollenbeck and Bendix 2011) and it never snows. Due to the topography, precipitation is highly variable in both space and time and poorly covered by the four climate stations. Land use within the catchment is dominated by forests (68%) and is organized in two major sectors: a southern part with pristine tropical montane rainforest (Podocarpus National Park) and northern sections that have been partly converted to pasture. The highest altitudes are covered by Páramos, which are neo-tropical alpine grass and wetlands. Data processing for model setup is detailed in the Electronic Supplemental Material and was described by Plesca et al. (2012).

Runoff in the Rio San Francisco is characterized by extremely high rainfall-runoff coefficients ranging from 0.74 to 0.81 for different subcatchments, corresponding to an annual total runoff between 2,041 and 3,090 mm (Crespo et al. 2012). Daily observations of discharge and climate data (precipitation, air temperature, wind speed, humidity and solar radiation) are available for the period from 4th April 2007 to 31st May 2008, which represents a cumulative rainfall of about 3,980 mm, with the wettest month being June 2007 with 580 mm. While the monitoring period, in combination with its poor spatial coverage, may be too short to effectively constrain model parameters, we consider it representative of the data availability in many remote areas of the world, especially montane headwater catchments. Moreover, some previous studies indicate that about a year of data may be sufficient to efficiently calibrate a hydrological model (Brath et al. 2004; Perrin et al. 2007), although the timing of the measurements of course plays an important role (Seibert and Beven, 2009). A decade of research has contributed to methods for Predictions in Ungauged Basins (PUB initiative; Sivapalan et al. 2003; Hrachowitz et al. 2013). We propose the multi-forcing (section 2.2), multi-model (section 2.3) procedure described hereafter as an approach to such predictions, especially since climate change impacts are to be felt globally, including in regions that have so far been poorly monitored.

2.2 Climate projections and downscaling

We downloaded 21st century projections of 8 GCMs (Table S1 in the Electronic Supplemental Material) driven by two AR4 emission scenarios (A1B and A2) from the Data Distribution Centre of the IPCC (www.ipcc-data.org, accessed 14th August 2010). For computational efficiency, we rely on a subset of the CMIP3 ensemble that is representative of the spread and distribution of the variables of interest in future projections. We selected monthly precipitation, maximum and minimum temperature for our research domain. While the A1B scenario corresponds to a globalised world with a population peaking during the 21st century and a balanced use of fossil and renewable energy, the A2 scenario focuses on a regionally oriented development of the economy and a continuously increasing population.

In headwater catchments of the Ecuadorian Andes, rain gauge density is often too low to allow a good representation of topographical gradients (Buytaert et al. 2010) and available time series are short. Therefore, we have to rely on a simple downscaling procedure with low data requirements. In this study, we apply the widely used delta change method (Gleick 1986; Maraun et al. 2010). Average monthly differences between a reference 30-year control period (1960–1990) and three 30-year time slices of the 21st century (2010–2039; 2040–2069; 2070–2099) are first extracted for each GCM. Then, these are applied to existing time series recorded at the four stations shown in Figure S1 to generate a total of 48 projections of weather data resulting from 2 emission scenarios, 8 GCMs and 3 time slices. In our case, the relative anomaly is used for precipitation and the absolute anomaly for temperature. The most important assumptions behind this method are that (1) no changes will occur in the temporal pattern of precipitation events (i.e. number of dry and wet days) and temperature, and (2) the magnitude of the change is independent of the intensity of the precipitation and the absolute temperature. While these are very large assumptions, the limited availability of observational data does not allow for a more complex approach such as SDSM. Furthermore, we are mostly interested in the uncertainty within the hydrological ensemble as compared to the one introduced by GCM projections. It is worth noting that the obtained time series that we ultimately use to drive our ensemble of hydrological models reflect average changes for the given time slices but not for a particular year within the time slice.
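As an illustration of the delta change procedure described above, the following minimal Python sketch derives monthly deltas from GCM climatologies and applies them to an observed series, using a relative anomaly for precipitation and an absolute anomaly for temperature. The function names, the data layout and all numbers are hypothetical and chosen for illustration only; this is not the code used in the study.

```python
def delta_factors(control, future, relative):
    """Monthly change between control and future GCM climatologies.

    control, future: {month: mean value} for the control period and a time slice.
    relative=True  -> ratio future / control (relative anomaly, precipitation)
    relative=False -> difference future - control (absolute anomaly, temperature)
    """
    if relative:
        return {m: future[m] / control[m] for m in control}
    return {m: future[m] - control[m] for m in control}


def apply_delta(observed, deltas, relative):
    """Perturb an observed series [(month, value), ...] with the monthly deltas."""
    if relative:
        return [(m, v * deltas[m]) for m, v in observed]
    return [(m, v + deltas[m]) for m, v in observed]


# Hypothetical June climatologies of one GCM (not values from the study).
precip_delta = delta_factors({6: 400.0}, {6: 500.0}, relative=True)
temp_delta = delta_factors({6: 15.0}, {6: 16.5}, relative=False)

# Hypothetical observed daily values for two June days.
future_precip = apply_delta([(6, 10.0), (6, 0.0)], precip_delta, relative=True)
future_temp = apply_delta([(6, 14.0), (6, 13.0)], temp_delta, relative=False)
```

In this sketch a dry day remains dry after scaling, which reflects assumption (1) above: the temporal pattern of wet and dry days is inherited unchanged from the observations.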

2.3 Hydrological model ensemble

In remote environments the applicability of detailed hydrological models is often challenged by the scarcity of information on landscape features like soil and vegetation types, making it harder to identify a priori more suited model structures (Plesca et al. 2012). Therefore, one usually has to rely on non-parsimonious conceptual models, which may require extensive calibration of empirical parameters. Where streamflow observations are not available over a period of time that is representative of the full variability met in the catchment, parameter sets become less identifiable. The potentially high equifinality (Beven 2006) introduces the risk of over-fitting model parameters to calibration data (Huisman et al. 2009; Wade et al. 2008) and can lead to inconsistent simulations under change (Exbrayat et al. 2013). Several ways exist to address this problem at least partially, such as transferring parameter sets from neighbor catchments in proxy-basin tests (Klemeš, 1986), relying on global or regional values (Seibert 1999), or using surrogates of runoff data like remotely sensed river width (Sun et al., 2010). Here, instead we rely on several independent model structures applied to simulate the behaviour of the studied catchment.

Our corresponding ensemble of hydrological models is composed of 7 different model structures: HBV-light (Seibert 1997), LASCAM (Sivapalan et al. 1996), CHIMP (Exbrayat et al. 2010), HEC-HMS (USACE 2001), NAM (Nielsen and Hansen 1973), SWAT (Arnold et al. 1998) and HBV-D (Lindgren et al. 2007). Table S2 and the corresponding text in the Electronic Supplemental Material provide a summary of their main features and a more detailed description of these models. The hydrological behavior of the basin and the hydrographs simulated by all hydrological models during the available period are reported in Plesca et al. (2012) and are not reproduced here. To generate projections of water resources availability, each hydrological model was driven by the whole dataset of 48 downscaled projections (section 2.2) using optimal parameter sets previously determined in Plesca et al. (2012), thus resulting in a total of 336 simulations. Therefore, for the same input data, differences between models only depict differences in the way the models make use of driving conditions, i.e. the structural uncertainty, starting with discrepancies in spatial distribution (Table 2). However, since the most lumped (HBV-light) and most distributed (HBV-D) models perform similarly well in simulating the runoff at the outlet of the San Francisco catchment while integrating the same processes, differences in the representation of infiltration excess processes seem to be the major source of uncertainty in the hydrological modelling of this catchment. Despite the poorer performance of some models during calibration, and of the over-parameterised LASCAM during validation, we decided to keep all models in the ensemble as they may contribute to the overall ensemble performance (Viney et al. 2009).
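The size of the experimental design can be checked with a short enumeration. The sketch below is illustrative only: the GCM labels are placeholders (the actual models are listed in Table S1), and the variable names are assumptions.

```python
from itertools import product

SCENARIOS = ["A1B", "A2"]
GCMS = [f"GCM{i}" for i in range(1, 9)]  # placeholder labels, see Table S1
TIME_SLICES = ["2010-2039", "2040-2069", "2070-2099"]
HYDRO_MODELS = ["HBV-light", "LASCAM", "CHIMP", "HEC-HMS", "NAM", "SWAT", "HBV-D"]

# Every downscaled forcing is one (scenario, GCM, time slice) combination.
forcings = list(product(SCENARIOS, GCMS, TIME_SLICES))

# Every simulation pairs one hydrological model with one forcing.
runs = list(product(HYDRO_MODELS, forcings))

print(len(forcings), len(runs))
```

Each of the 7 model structures sees exactly the same 48 forcings, so differences between models for a given forcing can be read as structural differences.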

3 Results

3.1 Downscaled climatic projections

Several studies on climate projections in the Andes have shown that precipitation is likely to increase on the Amazonian slopes of the Ecuadorian Andes (Urrutia and Vuille 2009; Buytaert et al. 2010; Buytaert and Bièvre 2012). The mean increase in precipitation is 9.9% for the A1B scenario and the 2010–2039 period, increasing to 27.0% for the A2 scenario and the 2070–2099 period. However, the anomaly shows a very strong seasonality, with the largest changes concentrated during the peak of the wet season (May, June and July) of the monomodal precipitation regime. These changes may be attributed to an intensification of the continental low-level jet over the Andes but are subject to a strong inter-GCM uncertainty that grows through time, as shown in Figs. 1 and 2. From a hydrological perspective, this means a very strong increase in seasonality with more extreme events during the wet season.

Fig. 1

Distribution of the delta changes between historic (20C3M scenario) and future (A1B scenario) simulations for the different variables and time periods used in this study as indicated. Boxplot markers correspond to quartiles and median values are shown (●). The length of whiskers is limited to 1.5 times the width of the box, and values located further below the first quartile or above the third quartile are considered extreme values (○)

Fig. 2

Distribution of the delta changes between historic (20C3M scenario) and future (A2 scenario) simulations for the different variables and time periods used in this study as indicated. Boxplot markers correspond to quartiles and median values are shown (●). The length of whiskers is limited to 1.5 times the width of the box, and values located further below the first quartile or above the third quartile are considered extreme values (○)

Changes in temperature are also in line with earlier studies on climate change in the region (Buytaert et al. 2009). Very little seasonality in the changes is detected (Figs. 1 and 2), which is in line with the tropical climate of the study region. The average increase in temperature is 1.15 °C for the A1B scenario and the 2010–2039 period, increasing to 3.35 °C for the A2 scenario and the 2070–2099 period. The changes in minimum temperature tend to be slightly lower than the changes in maximum temperature, also indicating an increase in the daily temperature variability. Similarly to precipitation, the inter-GCM variability increases toward the end of the 21st century.

3.2 Hydrological projections

The panels of Fig. 3 represent the exceedance probabilities of changes in total streamflow for each scenario and period. In all 6 cases, between 89% (A2 between 2070 and 2099) and 95% (A1B between 2010 and 2039) of projections agree on an increase of discharge from the San Francisco catchment. Furthermore, these exceedance probability curves are quite similar regardless of the emission scenario when considering the same future time period, consistent with the typical observation that GCM ensemble variability is larger than emission scenario variability, although scenario A2 always projects slightly higher extremes. Meanwhile, the probability curves (Fig. 3) steepen by the end of the century, meaning that projections agree less with each other as variability increases.

Fig. 3

Exceedance probability of total modelled runoff relative to observed one for each scenario and time slice as indicated
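For readers who wish to reproduce this kind of curve, an empirical exceedance probability can be sketched as follows. The rank-based estimate i/n is an assumption (the plotting position actually used is not stated here), and the runoff ratios are hypothetical values, not results from the ensemble.

```python
def exceedance_curve(values):
    """Return [(value, exceedance probability)] with values sorted in descending order.

    The probability attached to the i-th largest value is i / n, i.e. the share
    of projections that equal or exceed it (a simple empirical estimate).
    """
    ranked = sorted(values, reverse=True)
    n = len(ranked)
    return [(v, (i + 1) / n) for i, v in enumerate(ranked)]


def fraction_at_or_above(values, threshold):
    """Share of projections whose value is at or above the threshold."""
    return sum(1 for v in values if v >= threshold) / len(values)


# Hypothetical ratios of projected to observed total runoff for five runs.
ratios = [0.9, 1.1, 1.3, 1.6, 2.2]

curve = exceedance_curve(ratios)
share_doubling = fraction_at_or_above(ratios, 2.0)    # runs that at least double runoff
share_increase = sum(1 for r in ratios if r > 1.0) / len(ratios)
```

With a ratio as the plotted quantity, a value above 1 indicates an increase in total runoff and a value of 2 a doubling, so statements such as "x% of projections predict a doubling" can be read directly from the curve.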

As shown in the cumulative runoff curves (Fig. 4), this is due to a larger scatter between projections at the higher end of the uncertainty bounds. More precisely, while only 2% of the models predict a doubling in total runoff for the earliest period (2010–2039) regardless of the scenario, 36% and 32% of them do so at the end of the 21st century (2070–2099) for scenarios A1B and A2, respectively. Most of this uncertainty in projected runoff can be attributed to discrepancies in flow regimes simulated around simulation day 75, i.e. in mid-June, during the wet period. This corresponds to the large uncertainty of GCM projections for rainfall in June as shown in Figs. 1 and 2. Later in the simulation, projections still diverge but less dramatically, especially between days 100 and 400 (Fig. 4).

Fig. 4

Cumulative runoff predicted by the ensemble of 7 hydrological models driven by the downscaled GCM projections for each scenario and time slice as indicated. Solid lines correspond to cumulative measured runoff (April 2007 to May 2008, in black), dashed lines indicate the average cumulative projections, light gray shadings represent the full range of runoff projections, and the inner 50% is represented in dark gray

3.3 Sources of variability

Although the range of projected changes does not vary much between scenarios, a finer investigation of the other sources of uncertainty is required. We sequentially look at the total spread of projections caused by differences between the GCMs and the hydrological models. Figure 5 depicts the average contribution of hydrological models to the total range in projected cumulative runoff for each scenario and time period as indicated. For each period and scenario, the grey area represents the ratio of the average range obtained by driving all hydrological models with the downscaled projection of one GCM to the total range (Fig. 5). During the first study period (2010–2039), the average proportion of the total range of runoff projections (Fig. 5) that is covered by the structural uncertainty in the hydrological models is as large as the uncertainty arising from downscaled GCM projections, regardless of which scenario is considered. This ratio decreases through the century, to 42.6% and then 37.4% for scenario A1B, and to 35.1% and then 31.3% for scenario A2. This indicates that the large uncertainty observed in late 21st century runoff projections (Figs. 3 and 4) is related more to a growing disagreement between GCM projections downscaled using the delta change method than to disagreement between hydrological models.

Fig. 5

Average fraction (in grey) of the total range of projections (grey + white) introduced by hydrological models when driven by the same GCM
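The ratio shown in grey can be sketched as below. For simplicity this example works on a single total runoff value per simulation rather than on the cumulative curves, which is an assumption; the function name, the data layout and the numbers are hypothetical.

```python
def hydro_model_fraction(runoff):
    """Average range across hydrological models for a fixed GCM, over the total range.

    runoff: {gcm: {hydro_model: total runoff}} for one scenario and time slice.
    """
    # Range spanned by the hydrological models when all are driven by the same GCM.
    per_gcm_ranges = [max(r.values()) - min(r.values()) for r in runoff.values()]
    mean_range = sum(per_gcm_ranges) / len(per_gcm_ranges)

    # Range spanned by every GCM and hydrological model combination.
    all_values = [v for r in runoff.values() for v in r.values()]
    total_range = max(all_values) - min(all_values)
    return mean_range / total_range


# Hypothetical total runoff (mm) for two GCMs and two hydrological models.
example = {
    "GCM1": {"M1": 3000.0, "M2": 3500.0},
    "GCM2": {"M1": 4000.0, "M2": 5000.0},
}
fraction = hydro_model_fraction(example)
```

In this toy case the per-GCM ranges are 500 and 1,000 mm, their mean is 750 mm and the total range is 2,000 mm, so hydrological models account for 37.5% of the spread.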

4 Discussion

We focus on three sources of uncertainty in future runoff projections of our Ecuadorian catchment: emission scenario, choice of a GCM and choice of a hydrological model. First, runoff projections do not seem to be very sensitive to the choice of an emission scenario, as emphasized by comparable patterns between projections regardless of the scenario. This observation confirms previous studies showing that GCM sensitivities were larger than emission uncertainties in various contexts (Wilby and Harris 2006; Christensen and Lettenmaier 2007; Teutschbein et al. 2011; Jung and Chang 2011; Harding et al. 2012). This is despite the fact that the A2 scenario involves a much larger global radiative forcing than the A1B scenario, especially towards the end of the 21st century (Arnell 2004). In a more regional context, Boulanger et al. (2006) reported that the A1B scenario only reaches 80 to 90% of the warming displayed in the A2 scenario. For precipitation, the amplitude of change was also larger for A2 than A1B (Boulanger et al. 2007), as shown in Figs. 1 and 2, and indirectly in the range of predicted runoff in Fig. 5. In another context, Bennett et al. (2012) found that, despite the larger uncertainty introduced by the GCM, emission scenarios represented a non-negligible source of predictive uncertainty in two catchments of British Columbia (Canada), where snow-dominated hydrology is more sensitive to tiny differences in air temperature that can affect the timing of spring flood, as opposed to our snow-free catchment. The uncertainty linked to the choice of an emission scenario therefore seems to depend on the region where projections are considered. With regard to our simulations, we conclude that the large variability in projected precipitation and runoff between models masks the general differences between the two climate change scenarios in the research area.

However, Fig. 5 illustrates the larger contribution of GCMs to the total uncertainty. The observed predominance of GCMs in the breakdown of the uncertainty in total runoff projections corroborates recent results obtained following a similar approach (several GCMs, several hydrological models) for drier environments in south-east Australia (Teng et al. 2012) or in a snow-dominated catchment in Québec, Canada (Chen et al. 2011). However, Bosshard et al. (2013) showed that seasonal variations existed in the relative contribution of GCMs and hydrological models to the total uncertainty in runoff projections in a Swiss alpine catchment. While differences between GCMs explained most of the range over the full study period, hydrological models were the main source of uncertainty for low flow periods. Our results, although obtained in a very different climatic context, are consistent with these previous studies. In effect, the range in runoff projections is dominated by GCMs, but a non-negligible part of the total uncertainty is contributed by hydrological models. This contribution should therefore not be disregarded, as also shown by Buytaert et al. (2010), and represents up to a two-fold difference in total runoff predicted by different hydrological models forced by the same input data. Previous results with the same ensemble of models applied to this catchment (Plesca et al. 2012) indicated that differences in infiltration excess explain most of the variability in the hydrological modelling of this catchment. Therefore, the uncertainty could likely be reduced with a more detailed knowledge of soil hydraulic properties and their spatial distribution.

We are aware that we do not address some other significant sources of uncertainty, like the choice of a downscaling approach or the accuracy of weather and runoff data used for previous calibration. Moreover, our hydrological models have only been calibrated for a short period of time. We acknowledge this as a weakness in our study, although studies by Brath et al. (2004) and Perrin et al. (2007) showed that short runoff calibration datasets may be sufficient to reach calibration results comparable to those obtained with a longer dataset. Discharge measurements used to calibrate the models (Plesca et al., 2012) simply correspond to the only available data in this region. Climate change effects will be felt globally, but managing water resources will require local and regional decisions to address their impact. The non-negligible part of the projected range that is contributed by differences in hydrological models (Fig. 5) shows that catchment scale studies, and by extension any impact assessment study, in places where calibration data are scarce cannot confidently rely on a single model that may be over-fitted. As previously argued in Plesca et al. (2012), merging the differences in the conceptualization of the same processes that arise from heterogeneous parameterizations in a multi-model ensemble provides a better confidence interval sampled from more independent predictions of the same system (Abramowitz and Gupta 2008; Exbrayat et al. 2013). For example, Seiller et al. (2012) recently demonstrated that the average of an ensemble of multiple hydrological models was more transposable under contrasting climate conditions than its members. This increases the confidence we can put in the validity of the ensemble approach for climate change impact studies, especially in regions where observations are insufficient to confidently constrain a single impact model, like in the Ecuadorian Andes.

5 Conclusions

We presented projections of the impact of climate change on runoff in a remote mountainous headwater catchment in the Ecuadorian Andes. Results indicate a strong agreement between models toward an increase in runoff throughout the 21st century in response to a wetter rainy season. However, the magnitude of this increase is highly uncertain, with discrepancies between climate projections during the rainy season contributing the most to the variability in projected runoff. Nevertheless, the contribution of hydrological models to the spread in final projections, which is at least 31.3%, should not be disregarded either.

Although we did not address some sources of uncertainty like the choice of a downscaling method or the non-uniqueness of parameter sets, we highlighted some of the problems currently met in impact studies arising from the disagreement between GCM projections, but also impact models themselves. Meanwhile, we are concerned by the need to collect better in situ observational data to robustly constrain hydrological models and complex statistical downscaling procedures in data scarce environments like South American headwater catchments. We recommend that local impact studies should never rely on a single model, and integrate more models as the data availability decreases, especially if they are designed to advise decision-making that may affect large populations.