1 Introduction

Hydrologic responses to climate change and variability have been widely assessed to support watershed management, planning, and preparedness for water-related natural hazards such as floods and droughts. Dynamical downscaling (Takle et al. 1999; Mearns et al. 2009) and statistical downscaling (Wilby and Harris 2006; Maurer et al. 2007) are the two general methods for generating fine-resolution meteorological variables from global climate models (GCMs). The downscaled climate forcing can then be translated into hydrological states with a watershed model to support water resources management. However, uncertainties arising from climate and land surface models pose a significant challenge for such studies. Both downscaling methods introduce additional uncertainty on top of the uncertainty inherited from the GCMs (Wood et al. 2004; Castro et al. 2005; Fowler et al. 2007; Lo et al. 2008).

Multi-model simulations (ensembles) can reduce bias and uncertainty by incorporating more global and regional climate models (RCMs). By doing so, inter-model differences due to internal variability, parameterization, and land-atmosphere interactions can be accounted for, providing estimates that are superior to those from a single good simulation (Phillips and Gleckler 2006; Gleckler et al. 2008). For example, the Prediction of Regional Scenarios and Uncertainties for Defining European Climate Change Risks and Effects (PRUDENCE) program and its successor, Ensemble-Based Predictions of Climate Changes and Their Impacts (ENSEMBLES), combined uncertainty from global and regional models and systematically examined climate projections for Europe (Christensen et al. 2007, 2009). In North America, the North American Regional Climate Change Assessment Program (NARCCAP) is currently the most comprehensive regional climate-modeling project for climate change impact studies (Mearns et al. 2009, 2012). Although multi-model weighting approaches have been used to select climate models based on their historical simulation performance in many climatic and hydrologic studies (Giorgi and Mearns 2003; Gleckler et al. 2008; Christensen et al. 2010; Sobolowski and Pavelsky 2012; Zhang and Huang 2012), equal weighting is also advocated because weighting approaches could introduce further biases owing to the limitations of the evaluation metrics (Pierce et al. 2009; Knutti et al. 2010; Mearns et al. 2012).

Previous studies, as mentioned above, used either statistically or dynamically downscaled climate projections (e.g., Ghosh and Katkar 2012; Hanel et al. 2013; Fiseha et al. 2014). Wood et al. (2004), exceptionally, assessed both statistical and dynamical downscaling methods, but their assessment was based on a single RCM and GCM. In this study, multi-model projections from the newly available NARCCAP serve as the primary data source and are compared with statistically downscaled CMIP3 GCM ensembles in terms of both climate forcing and the hydrological response simulated with the Variable Infiltration Capacity (VIC) land surface model. The impact of possible future climates on the hydrological system is then assessed after the climate projections have been evaluated and compared. The study area is the Oologah Lake (Verdigris River) watershed, which covers 4,340 square miles (11,240 km²) and extends northwestward from Oologah Lake upstream along the Verdigris River into Kansas (Fig. 1). The watershed is an important water source for Tulsa, OK, and nearby regions. We also chose the Oologah Lake watershed because 1) there is an existing and relatively long hydrologic period of record, and 2) prior studies by the U.S. Army Corps of Engineers (USACE) suggested that large alterations of water quantity and quality could be associated with climate change and variability within the basin. Statistically downscaled projections from CMIP3 have shown a temperature increase (3 °C) and a slight increase in precipitation for the nearby region over the next 50 years (Liu et al. 2012a, b), but with significant uncertainty arising from the different GCMs. This climate change impact study takes into account both statistically and dynamically downscaled climate projections, as well as multiple river network upscaling methods to improve the hydrological simulation, and should therefore provide better information to assist watershed management and planning for the region.

Fig. 1

Study area with features of river, lakes, and political boundaries (left), elevation (middle), and land use and cover from the UMD (University of Maryland) 13-land-cover-type scheme (right). The abbreviations of land use and cover type in the legend are: Eve (evergreen), Ndl (needle leaf), Brd (broad leaf), Dcd (deciduous), Wd (wood), Clo (close), and Opn (open)

2 Data and Method

2.1 NARCCAP

NARCCAP is currently the most comprehensive regional climate-modeling project for North America (Mearns et al. 2009, 2012). Four GCMs were chosen in the project to provide boundary conditions for six RCMs running fine-resolution (50 km) regional climate simulations over the conterminous United States and Canada for the time periods of 1971–2000 and 2041–2070. NARCCAP uses the A2 emission scenario; hence, the global average CO2 concentration is projected to reach 850 ppm by 2100 (IPCC 2000). To assess the RCMs’ performance alone, the project also includes reanalysis data produced by the National Centers for Environmental Prediction (NCEP) and the U.S. Department of Energy (DOE) to drive the RCMs for the time period 1979–2004. Here we chose eight RCM/GCM combinations (all the currently available simulations that have complete historical and future projections) from the archive distributed by NCAR’s Earth System Grid data portal. The selected RCM/GCM combinations are described in Table 1. Meteorological variables from these pairs include 3-hourly precipitation and wind speed and daily maximum and minimum temperature. To be directly comparable to other data sources (such as statistically downscaled climate projections) and compatible with the VIC model’s spatial resolution, these NARCCAP variables are interpolated to a 1/8th-degree grid using a linear method.
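The text does not specify the interpolation scheme beyond "a linear method", so the following is only a minimal sketch of one plausible reading: bilinear interpolation of a coarse field onto a 1/8th-degree grid. It assumes a regular latitude-longitude source grid (the native NARCCAP grids are projected, so a real workflow would need an additional reprojection step), and the function name `bilinear_regrid`, the 0.5-degree source spacing, and the synthetic field are all hypothetical.

```python
import numpy as np

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Linearly interpolate a 2-D (lat, lon) field onto a finer regular grid.

    Interpolates along longitude first, then along latitude. Coordinates
    must be increasing; targets outside the source range take the edge value.
    """
    # Interpolate every source-latitude row onto the target longitudes.
    along_lon = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Interpolate every resulting column onto the target latitudes.
    return np.array([np.interp(dst_lat, src_lat, col) for col in along_lon.T]).T

# Hypothetical 0.5-degree source grid standing in for a ~50 km RCM grid.
src_lat = np.linspace(36.0, 39.0, 7)
src_lon = np.linspace(-97.5, -95.0, 6)
# Synthetic field that is linear in lat and lon (assumption, for checking only).
field = 2.0 * src_lat[:, None] + 0.5 * src_lon[None, :]

# 1/8th-degree target grid covering the same box.
dst_lat = np.linspace(36.0, 39.0, 25)
dst_lon = np.linspace(-97.5, -95.0, 21)
regridded = bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon)
print(regridded.shape)
```

Because the synthetic field is linear in both coordinates, the interpolated values can be checked against the analytic expression, which is a convenient sanity test before applying the same routine to real precipitation or temperature fields.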

Table 1 The NARCCAP RCM/GCM models used in this study

2.2 BCSD-CMIP3

For comparison of hydrological responses driven by different downscaled climate projections, the Bias-Corrected and Spatially Downscaled Coupled Model Intercomparison Project Phase 3 (BCSD-CMIP3) dataset was also incorporated (using the LLNL-Reclamation-SCU downscaled climate projections data derived from the World Climate Research Program’s CMIP3 multi-model dataset, which is stored and served at the LLNL Green Data Oasis). This multi-model dataset includes 112 World Climate Research Program (WCRP) CMIP3 members with the CO2 emission scenarios A1b, A2, and B1, and each climate projection was bias-corrected and spatially downscaled (Wood et al. 2002; Maurer et al. 2007). CO2 emissions increase from the B1 to the A1b to the A2 scenario. The A2 scenario represents a very high degree of climate change and provides more information from the impact and adaptation points of view. Since only the A2 emission scenario is used in NARCCAP, the A1b and B1 scenarios in BCSD-CMIP3 are not considered in the climate change impact analysis, which allows a direct comparison. As with NARCCAP, the statistically downscaled atmospheric forcing includes precipitation, wind speed, and maximum and minimum temperatures, provided at a daily time step with a spatial resolution of 1/8th degree.
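The bias-correction step is not detailed here. As a rough illustration of the general idea, and not of the actual BCSD processing chain, the sketch below applies empirical quantile mapping: each model value is replaced by the observed value that has the same non-exceedance probability in the historical period. The function name `quantile_map` and the synthetic monthly precipitation series are assumptions made only for this example.

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_values):
    """Empirical quantile mapping of model values onto the observed distribution."""
    model_sorted = np.sort(model_hist)
    obs_sorted = np.sort(obs_hist)
    # Non-exceedance probability of each value in the model's historical distribution.
    probs = np.interp(model_values, model_sorted,
                      np.linspace(0.0, 1.0, model_sorted.size))
    # Observed value found at the same probability.
    return np.interp(probs, np.linspace(0.0, 1.0, obs_sorted.size), obs_sorted)

rng = np.random.default_rng(0)
obs = rng.gamma(shape=2.0, scale=30.0, size=360)   # synthetic "observed" monthly precipitation (mm)
model = 0.7 * obs + 5.0                            # synthetic model with a dry, compressed bias
corrected = quantile_map(model, obs, model)
print(round(float(model.mean()), 1), round(float(corrected.mean()), 1), round(float(obs.mean()), 1))
```

In this synthetic case the model is a monotonic distortion of the observations, so the mapping recovers the observed series. This also illustrates why bias-corrected fields can match observed climatologies closely by construction, independently of how well the underlying model represents the driving processes.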

2.3 VIC Model

The VIC model of Liang et al. (1994, 1996, 1999) is implemented for the Oologah Lake watershed (Fig. 1). It is a semi-distributed, grid-based hydrological model that simulates land surface-atmosphere hydrometeorological processes by solving both the water and energy budgets. The VIC model has been widely applied to river basins of various scales over the continental U.S. and the globe (Abdulla et al. 1996; Bowling et al. 2000; Nijssen et al. 2001; Su et al. 2005; Andreadis and Lettenmaier 2006; Christensen and Lettenmaier 2006; Gao et al. 2007; te Linde et al. 2008). In our application, version 4.1.2.c is used with three soil layers defined according to the U.S. State Soil Geographic (STATSGO) dataset. Land use and cover data are taken from the LDAS (Land Data Assimilation System) project (http://ldas.gsfc.nasa.gov/nldas/NLDASnews.php) and were originally derived from the UMD (University of Maryland) 13-land-cover-type scheme (Fig. 1). Since our model is composed of 1/8th-degree cells, proper upscaling of the routing parameters (e.g., flow direction) becomes very important for the VIC runoff routing process (Lohmann et al. 1996). Wu et al. (2012) developed a new global river network database at multiple spatial scales from 1/16th to 2°. In our case, the upscaling is conducted from the HYDRO1k digital elevation model (DEM) using an algorithm provided by the VIC development group (http://www.hydro.washington.edu). Figure 2 compares the upscaled river networks obtained with these two methods. The network based on Wu et al. (2012) has more flow line distortion than our own calculated one for this region, especially in the northwest upstream portion, where it creates parallel flow paths rather than draining water through a converging system.

Fig. 2

Upscaled river networks with algorithms from (a) Wu et al. (2012) and (b) the VIC development group. The background is the flow accumulation derived from the 1-km DEM

The VIC model is first driven by atmospheric forcing from the University of Washington’s (UW’s) gridded dataset (Maurer et al. 2002). This dataset includes an elevation correction based on the Parameter-Elevation Regressions on Independent Slopes Model (PRISM) (Daly et al. 1994). Figure 3 shows the daily and monthly VIC-simulated hydrographs, which agree well with the observations at the Verdigris River near Lenapah (Fig. 1) for both the calibration (1990–1997) and validation (1968–1989) periods. This gauge is located close to the final outlet of the basin and measures the total water discharge into Oologah Lake; therefore, the proper reproduction of its long-term hydrograph validates the VIC model and demonstrates its applicability for assessing potential climate change impacts on water availability in the basin. The hydrologic model is calibrated with the SP-UCI (shuffled complexes with principal component analysis) algorithm (Chu et al. 2010, 2011). This algorithm can prevent population degradation and provide more effective parameter estimation than the original SCE-UA (shuffled complex evolution) method (Duan et al. 1993, 1994). Table 2 presents the statistics of VIC model performance in terms of root mean square error (RMSE) and Nash-Sutcliffe efficiency (NSCE), computed with Eqs. 1 and 2.

Fig. 3

Daily (a) and monthly (b) VIC-simulated and observed hydrographs at Verdigris River near Lenapah, OK. The gray bars show the precipitation amount over the Oologah Lake watershed

Table 2 Statistics of VIC streamflow simulations in the calibration and validation periods
$$ RMSE=\sqrt{\frac{{\displaystyle \sum_{i=1}^N{\left({S}_i-{M}_i\right)}^2}}{N}} $$
(1)
$$ NSCE=1-\frac{{\displaystyle \sum_{i=1}^N{\left({S}_i-{M}_i\right)}^2}}{{\displaystyle \sum_{i=1}^N{\left({M}_i-{M}_{avr}\right)}^2}} $$
(2)

where N is the number of observations, S_i and M_i are the VIC-simulated and gauge-observed values at time step i, respectively, and M_avr is the average of the gauge observations. After calibration, the model performs much better than the a priori simulation, with NSCE increasing from 0.53 to 0.62 (daily) and from 0.59 to 0.74 (monthly) during 1990–1997, and it reaches even higher values of 0.66 (daily) and 0.8 (monthly) for the validation period 1968–1997.
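As a quick illustration of Eqs. 1 and 2, the two statistics can be computed as follows. This is a minimal sketch: the function names `rmse` and `nsce` and the four-value streamflow series are made up for the example and are not data from this study.

```python
import numpy as np

def rmse(sim, obs):
    """Root mean square error between simulation S and observation M (Eq. 1)."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def nsce(sim, obs):
    """Nash-Sutcliffe efficiency of simulation S against observation M (Eq. 2)."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    residual = np.sum((sim - obs) ** 2)          # numerator of Eq. 2
    variance = np.sum((obs - obs.mean()) ** 2)   # denominator of Eq. 2
    return float(1.0 - residual / variance)

# Hypothetical streamflow values (arbitrary units), for illustration only.
obs = np.array([10.0, 20.0, 30.0, 40.0])
sim = np.array([12.0, 18.0, 33.0, 39.0])
print(rmse(sim, obs), nsce(sim, obs))
```

An NSCE of 1 indicates a perfect match, while a value of 0 means the simulation is no more skillful than using the observed mean as the prediction.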

3 Results

3.1 Historical Climate Simulations and Hydrological Performance

The annual precipitation and mean temperature (1968–1997 mean) from the NARCCAP models are shown in Fig. 4a and b, respectively. For comparison, the UW observations are also included in this figure. Generally, NARCCAP simulates the temperature field well, including the distinct south-to-north decrease that appears in every model and in the UW dataset, although two models (CRCM-CCSM and HRM3-HADCM3) overpredict temperature and RCM3-GFDL underpredicts it. For precipitation, larger differences and more spatial variation are apparent among the NARCCAP models relative to the observations. Both precipitation magnitude and pattern tend to be influenced more by the RCM than by the driving GCM; that is, simulations that share an RCM have very similar precipitation even though they are driven by different GCMs. In particular, the WRFG and CRCM models reproduce spatial patterns similar to the observed field, with the precipitation maximum in the southeast and the minimum in the northwest. Both RCMs underestimate the precipitation magnitude, however. The UK regional climate model HRM3 performs worst, producing an unrealistic high-precipitation center in the northwest, although its annual precipitation amount is reasonable.

Fig. 4

Annual precipitation (a) and mean temperature (b) comparison between each NARCCAP model and the observation for the time period 1968–1997

Figure 5 shows monthly precipitation and temperature (1968–1997 mean) from the NARCCAP (a and b) and BCSD-CMIP3 (c and d) models together with the corresponding observations. The NARCCAP multi-model simulations generally agree with the observed precipitation cycle, which has two maxima, in May and September. However, all ensemble members tend to underestimate rainfall in autumn. The annual temperature cycle is well simulated by NARCCAP except for some overestimation during the summer. Notably, HRM3 departs from the observed seasonal cycle in both temperature and precipitation, although its annual averages are closest to the observations. The lower panels of Fig. 5 show that the BCSD-CMIP3 simulations almost exactly replicate the observations for both temperature and precipitation because a bias-correction procedure is applied during the statistical downscaling of the coarse GCMs. The BCSD-CMIP3 ensemble, however, slightly underestimates the total spring precipitation.

Fig. 5

Average monthly precipitation and temperature from NARCCAP (a and b) and BCSD-CMIP3 models (c and d), together with the corresponding observations, for the time period 1968–1997

The average monthly streamflow simulated with VIC forced by the multiple climate models from NARCCAP and BCSD-CMIP3 is shown in Fig. 6a and b, respectively. The gauge observation at the Verdigris River near Lenapah, OK, is also included as a reference. Although substantial variability emerges because of differences in RCM/GCM performance, the six NARCCAP-driven streamflow simulations bracket the observations reasonably well throughout the seasonal cycle and provide the most promising ensemble average. In contrast, the streamflow from the BCSD-CMIP3 ensemble departs from the observations, with underestimation from February to June and overestimation in January and August. Taking Figs. 5 and 6 together, we note that the bias-corrected, statistically downscaled GCMs are less likely to lead to favorable hydrological simulations, even though their precipitation and temperature fields are closer to the observations at the multi-year mean scale.

Fig. 6

Average monthly streamflow from multiple NARCCAP models (a) and BCSD-CMIP3 (b) compared to the observation at Verdigris River near Lenapah, OK, for the time period 1968–1997

3.2 Future Climatic and Hydrological Change and Uncertainty

Figure 7 shows the future changes in ensemble-mean temperature and precipitation over the Oologah Lake watershed from the dynamically downscaled NARCCAP models and the BCSD-CMIP3 models. We used only the A2 emission scenario for the CMIP3 models in order to make a direct comparison with the NARCCAP models, which adopt only the A2 scenario. All the changes described here are absolute deviations relative to the 1968–1997 mean. Although the NARCCAP-projected temperature increase for 2040–2069 is slightly greater than that for BCSD-CMIP3, the two are very consistent with respect to seasonal variation, with a larger increase (3 °C) in summer and a smaller increase (2.5 °C) in winter. The temperature increase for 2010–2039 (1–1.5 °C) is projected only by the CMIP3 models; it is lower than that for 2040–2069 but has a similar monthly pattern. As for precipitation, there are more differences between the two types of projections, especially in the magnitude of change, which is much larger in the NARCCAP models. Generally, precipitation increases are projected for most months except June to August, and the annual average changes (an increase of 3 mm) are comparable between the two types of projections. However, NARCCAP-projected changes (either increases or decreases) are usually 2–3 times greater than those of BCSD-CMIP3.
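The ensemble-mean absolute change described above can be sketched as follows, assuming equally weighted members and monthly series that start in January. The function names `monthly_climatology` and `ensemble_mean_change`, the three synthetic members, and the prescribed warming signal are hypothetical and serve only to show the calculation.

```python
import numpy as np

def monthly_climatology(series):
    """Mean for each calendar month; last axis is time (n_years * 12, starting in January)."""
    n_years = series.shape[-1] // 12
    trimmed = series[..., : n_years * 12]
    return trimmed.reshape(*series.shape[:-1], n_years, 12).mean(axis=-2)

def ensemble_mean_change(hist, fut):
    """Absolute change of the equally weighted ensemble mean, by calendar month.

    hist, fut: arrays of shape (n_models, n_months).
    """
    return monthly_climatology(fut).mean(axis=0) - monthly_climatology(hist).mean(axis=0)

months = np.arange(12)
cycle = 15.0 + 12.0 * np.sin(2.0 * np.pi * (months - 3) / 12.0)  # synthetic seasonal cycle (°C)
offsets = np.array([-1.0, 0.0, 2.0])                             # synthetic per-model biases
hist = np.tile(cycle, 30)[None, :] + offsets[:, None]            # 3 members, 30 years
delta = np.where((months >= 5) & (months <= 7), 3.0, 2.5)        # prescribed warming by month
fut = hist + np.tile(delta, 30)[None, :]

change = ensemble_mean_change(hist, fut)
print(change)
```

Because each member's change is taken relative to its own historical climatology, constant model biases cancel in the difference, which is one reason change signals are often more comparable across models than the absolute fields.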

Fig. 7

Ensemble-mean changes in temperature (a) and precipitation (b) for the Oologah Lake watershed by NARCCAP dynamical downscaling models and BCSD-CMIP3 A2 emission models. All changes are absolute deviations from the 1968–1997 mean

The VIC-simulated changes in water fluxes (ET, baseflow, and runoff) for the Oologah Lake watershed are shown in Fig. 8. Fig. 8a shows the changes driven by the dynamically downscaled NARCCAP models for 2040–2069, and Fig. 8b and c show the changes driven by the BCSD-CMIP3 A2-emission models for 2040–2069 and 2010–2039, respectively. All the changes here are percentages relative to the 1968–1997 mean. Both NARCCAP and BCSD-CMIP3 suggest winter increases and summer decreases in water availability (precipitation, runoff, and baseflow) in the Oologah Lake watershed, with an overall annual increase of 3–4 % for 2040–2069. However, as with the precipitation change in Fig. 7b, the amplitude of the seasonal hydrologic changes differs between the CMIP3 and NARCCAP projections. In general, the NARCCAP forcing produces a greater increase (from October to May) or decrease (June to September) in streamflow (runoff and baseflow) (Fig. 8a), while the BCSD-CMIP3 forcing generates a smaller amplitude of change (Fig. 8b). ET increases significantly with the CMIP3 projections but decreases with NARCCAP, yielding more water in the form of runoff and baseflow under NARCCAP. Figure 8c displays the hydrologic changes for 2010–2039 with the CMIP3 forcing, which are similar in pattern to those for 2040–2069 (Fig. 8b) but smaller in magnitude.

Fig. 8

Ensemble-mean changes in water fluxes in the Oologah Lake watershed for 2040–2069 driven by NARCCAP dynamical downscaling models (a) and for 2040–2069 (b) and 2010–2039 (c) driven by BCSD-CMIP3 A2 emission models. All changes are relative to the 1968–1997 mean

Because BCSD-CMIP3 uses more GCMs than NARCCAP, its uncertainty range, represented by the difference between the maximum and minimum change, is wider for precipitation and for the water fluxes of ET, runoff, and baseflow. Annually, NARCCAP and BCSD-CMIP3 project precipitation changes ranging from 1.9 to −3.1 % and from 1.7 to 19.8 %, respectively. However, the temperature uncertainties are similar between the two, with increases ranging from 1.2 to 3.8 °C in the annual average.
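The uncertainty range used here (maximum change minus minimum change across ensemble members) can be computed along the following lines. This is a minimal sketch with invented numbers: the function names `percent_change` and `ensemble_range` and the three hypothetical members do not correspond to the values reported in this study.

```python
import numpy as np

def percent_change(hist_mean, fut_mean):
    """Percentage change of each member relative to its own historical mean."""
    hist_mean = np.asarray(hist_mean, dtype=float)
    fut_mean = np.asarray(fut_mean, dtype=float)
    return 100.0 * (fut_mean - hist_mean) / hist_mean

def ensemble_range(changes):
    """Return (minimum, maximum, spread) of the member changes."""
    changes = np.asarray(changes, dtype=float)
    return float(changes.min()), float(changes.max()), float(changes.max() - changes.min())

# Hypothetical annual precipitation (mm) for three ensemble members.
hist = np.array([900.0, 1000.0, 800.0])
fut = np.array([945.0, 980.0, 880.0])

changes = percent_change(hist, fut)
low, high, spread = ensemble_range(changes)
print(changes, low, high, spread)
```

A max-minus-min range tends to widen as members are added, so a larger ensemble will generally show a broader range even when the individual models are no less skillful.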

4 Discussion and Conclusions

Based on the results in Section 3, both statistical and dynamical downscaling can provide useful information on regional climate processes in terms of spatial and temporal (monthly) patterns. Owing to the bias correction built into the statistical downscaling, the BCSD-CMIP3 ensemble is closer to the historical precipitation and temperature observations, as shown in Fig. 5. However, when translated into hydrologic states, the statistically downscaled forcings perform worse; in particular, the ensemble streamflow is underestimated throughout the wet season (Fig. 6). NARCCAP models, in contrast, provide wide variability and a promising ensemble mean for monthly streamflow simulations. Mearns et al. (2012) evaluated precipitation and temperature generated by six NARCCAP RCMs over conterminous North America. They pointed out that differences in model performance are discernible, but that it is difficult to state that one model is absolutely better than another in reproducing climate dynamics. In our case, the HRM3 driven by HADCM3, consistent with Mearns et al. (2012), performs worst in seasonal variation, but its annual averages of precipitation and temperature are closest to the observations and its hydrological response is also acceptable compared with the other NARCCAP models. Therefore, we retain all the models and consider ensemble-mean changes in climate and hydrology over the study region. Some studies have used weighted model averages to improve climate realization (Gleckler et al. 2008; Christensen et al. 2010; Sobolowski and Pavelsky 2012); however, the choice of metrics could further bias climate change impact estimates because model performance varies with the evaluation criterion.

Previous studies have mostly focused on precipitation and temperature and seldom consider other variables such as wind, humidity, and solar radiation. This study suggests that hydrologic performance could be used as a potential metric, since the land surface and atmosphere interact more realistically when more factors are considered. We can see that the bias-corrected, statistically downscaled GCMs are less likely to yield a favorable hydrological simulation, even though their precipitation and temperature are closer to the observations than those of the NARCCAP simulations. Conversely, the hydrologic simulation with NARCCAP forcing demonstrates a tolerance to precipitation bias, as suggested by the fact that the overestimation of rainfall in fall (Fig. 5) does not propagate into the streamflow simulation (Fig. 6). Furthermore, because mesoscale convective systems tend to drive much of the region’s rainfall during the late spring and early summer months, the higher precipitation and the resulting larger streamflow amplitude in the NARCCAP simulations may result from the better representation of mesoscale dynamics in the regional climate models as compared to the statistically downscaled datasets. As indicated by Gutowski et al. (2010), the NARCCAP models are capable of generating extreme precipitation over the northern United States.

In conclusion, this study uses the VIC (Variable Infiltration Capacity) model to evaluate the hydrologic response of the Oologah Lake watershed to climate change by taking into account both statistically and dynamically downscaled climate projections from BCSD-CMIP3 and NARCCAP. Evaluation and comparison of the results show the following. (1) Future water availability (precipitation, runoff, and baseflow) in the Oologah Lake watershed would increase annually by 3–4 %, as suggested by both NARCCAP and BCSD-CMIP3. Temperature increases (2.5–3 °C) are more consistent between the two types of climate projections, both seasonally and annually. (2) However, significant differences arise in the projected changes of precipitation and hydrology between CMIP3 and NARCCAP for different months and seasons. NARCCAP-projected changes (either increases or decreases) are usually 2–3 times greater than those of BCSD-CMIP3. (3) From the hydrological point of view, the dynamically downscaled NARCCAP projections performed better, most likely because they capture a larger portion of mesoscale-driven convective rainfall than the statistically downscaled CMIP3 projections, so that the VIC model generated higher seasonal streamflow amplitudes that are closer to the observations. (4) This study suggests that hydrologic performance could be used as a potential metric to differentiate climate models, since the land surface and atmosphere are treated as an integrated system with more atmospheric factors included. (5) Current climate models still need to improve their physics and parameterization to represent convective precipitation adequately. The fifth phase of the Coupled Model Intercomparison Project (CMIP5) could provide new insights about climate change and variability (Taylor et al. 2011). Future work will repeat the analysis with these improved climate model inputs.