1 Introduction

According to the Fourth Assessment Report (AR4) of the Intergovernmental Panel on Climate Change (IPCC 2007), the Earth’s temperature increased by 0.74 °C between 1906 and 2005 because of rising anthropogenic emissions of greenhouse gases. The use of fossil fuels has raised the concentration of greenhouse gases, contributing to a gradual warming of the Earth’s atmosphere and oceans. Global warming will have a significant impact on local and regional precipitation and hydrological regimes, which in turn will affect ecological, social, and economic systems, including ecosystem health and fisheries management, industrial and agricultural water supply, domestic water supply, hydropower development, and other sectors.

The implications of climatic trends are driving modeling efforts to understand the future of the Earth’s climate. Recent studies have shown that general circulation models (GCMs) can adequately predict global temperature trends and changes in the spatial and temporal distribution of precipitation (Koukidis and Berg 2009). Global warming resulting from increased concentrations of atmospheric greenhouse gases is usually estimated with GCMs, and there have been remarkable advances in the development of these models over the past 20 years (Huntingford et al. 2006). However, raw output from GCMs is inadequate for assessing impacts of climate change on hydrological responses at regional scales. To resolve the coarse spatial scale of GCMs better, many techniques have been designed to downscale GCM output data to predict future regional climatic variables such as temperature and precipitation. Generally, two groups of techniques, regional climate models (RCMs) and statistical downscaling, have emerged within the literature as suitable approaches to relate global-scale predictor values to regional- to local-scale weather station data (Koukidis and Berg 2009).

Downscaling methods can be broadly divided into two classes: dynamical downscaling (DD) and statistical (empirical) downscaling (SD). In DD, the GCM outputs are used as boundary conditions to drive a regional climate model (RCM) or limited area model (LAM) and produce regional-scale information at resolutions of 5–50 km; this method responds in physically consistent ways to different external forcings. However, DD has a higher computational cost and depends strongly on the boundary conditions provided by GCMs. RCM grids may be as fine as tens of kilometers; impact assessment applications, however, often require point-specific climate projections in order to capture fine-scale climate variations, particularly in regions with complex topography, in coastal or island locations, and in areas of highly heterogeneous land cover. Therefore, a gap exists between what climate models can predict about future climate change and the information relevant for environmental studies. Statistical downscaling models are commonly used to fill this gap (Timbal et al. 2009).

SD produces local- or station-scale meteorological time series through statistical or empirical relationships with predictor variables; this method is cheap, readily transferable, and computationally undemanding, and it has been widely used in climate change risk and uncertainty assessments. A disadvantage of SD, however, is that building the statistical relationship requires a sufficiently long record of historical observations (Wilby et al. 2002). The statistical downscaling model (SDSM) is a hybrid of a regression method and a weather generator. Many comparative studies (Fowler et al. 2007; Wilby et al. 1998; Khan et al. 2006; Harpham and Wilby 2005; Dibike and Coulibaly 2005) have shown that this method is simple to handle and has, by and large, superior capability; it is therefore widely applied (Wilby and Harris 2006).

Statistical downscaling methods have not yet been documented in climate research on Nepal, so research based on SDSM there is at an early stage. The main objectives of this paper are to evaluate the application of SDSM over Nepal and to generate local-scale precipitation scenarios in three climatic regions of Nepal under future emission scenarios. This study may serve as a reference for researchers applying SDSM in Nepal and its vicinity.

2 Study area and data set

Dominated by summer monsoon precipitation, Nepal has a variety of climates, from the lowland Terai plains to the High Himalaya region. Based on the landscape, Nepal can be divided into five physiographic regions from south to north: Terai, Siwalik, Middle Mountain, High Mountain, and High Himalaya (Kansakar et al. 2004). In this study, we use three stations representing a cross section over central Nepal. These stations are located in three climatic zones (arid, humid, and subhumid) based on rainfall distribution and agro-ecological classification; this classification was also applied by Williams et al. (2004). Jomsom station falls in the arid region (region-1), with an annual rainfall below 300 mm. Pokhara station falls in the humid region (region-2), with an annual rainfall above 3000 mm, and Bhairahawa station lies in the subhumid region (region-3), with an annual rainfall of around 1500 mm. Details of these stations are given in Table 1 and Fig. 1.

Table 1 Description of the study area
Fig. 1 Study area within Nepal. Square boxes mark the stations, and the numbers indicate their respective regions. Black straight lines show the HadCM3 grid. The color bar shows elevation in meters

3 Data

The observed daily precipitation data of the Department of Hydrology and Meteorology, Government of Nepal, from January 1961 to December 2000 are used in this study. For two stations, Bhairahawa and Pokhara, data are available only from January 1969. Gaps of 1 or 2 days were filled with the average precipitation values of neighboring stations using the single best estimator method (Eischeid et al. 2000). The National Centers for Environmental Prediction (NCEP)/National Center for Atmospheric Research (NCAR) reanalysis data set was also used; it is a daily series for 1961–2000 at a spatial resolution of 2.5° × 2.5° and includes 26 atmospheric variables. The GCM output for scenarios A2 (high greenhouse gas emissions) and B2 (low greenhouse gas emissions), derived from the Hadley Centre’s coupled ocean/atmosphere climate model (HadCM3), has a resolution of 3.75° (longitude) × 2.5° (latitude) and includes the same atmospheric variables as the NCEP data. Because the two grids differ, the NCEP data have to be interpolated to the resolution of the HadCM3 A2 and B2 output. The transformed GCM data for 1961–2099 were downloaded directly from the internet (http://www.cics.uvic.ca/scenarios/sdsm/select.cgi). The selected HadCM3 grid boxes are shown in Fig. 1.

4 Methodology

The SDSM is an accepted SD technique for constructing climate scenarios for impact studies. The technique is based mainly on a multivariate regression method. It is designed to simulate sequences of daily climate data for present and future periods through a combination of regression and weather generators, extracting statistical parameters from observed data series. It combines a stochastic weather generator approach with a transfer function model and needs two types of daily data (Wilby et al. 2002). The first type is the local predictand of interest (e.g., temperature or precipitation), and the second is the large-scale predictor data (NCEP and GCM) for the grid box closest to the study area (Hashmi et al. 2010).

During downscaling with the SDSM, a multiple linear regression model is derived between a few selected large-scale predictor variables and local-scale predictands such as temperature and precipitation. Relevant large-scale predictors are selected from the results of correlation analysis, partial correlation analysis, and scatter plots, and the physical sensitivity between the selected predictors and the predictand should also be considered. SDSM provides two means of optimizing the model, dual simplex and ordinary least squares (Wilby and Dawson 2007); both give comparable results, but ordinary least squares is much faster. The model is structured as a monthly model for both daily precipitation and temperature downscaling, so 12 regression equations are derived, one per month, each with its own regression parameters.
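The monthly structure described above can be illustrated with a minimal Python sketch. It fits one ordinary least squares regression per calendar month and nothing more: SDSM's conditional treatment of precipitation occurrence and amounts and its stochastic component are not reproduced, and all function names (`solve_linear_system`, `fit_ols`, `fit_monthly_models`) are hypothetical rather than part of SDSM.

```python
from collections import defaultdict


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictor matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def fit_monthly_models(months, predictors, predictand):
    """Fit one regression per calendar month.

    Returns {month: [intercept, coef_1, ..., coef_k]}.
    """
    grouped = defaultdict(lambda: ([], []))
    for month, row, value in zip(months, predictors, predictand):
        grouped[month][0].append(row)
        grouped[month][1].append(value)
    return {month: fit_ols(rows, vals) for month, (rows, vals) in grouped.items()}
```

Here `months` holds the calendar month (1 to 12) of each day, `predictors` holds one tuple of predictor values per day, and `predictand` holds the (possibly transformed) daily station values. The normal equations are used only to keep the sketch dependency-free; a numerical library would be the more robust choice for real data.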

The output of SDSM is a daily series. Once the model is established, the daily NCEP and GCM data are used to construct current and future daily weather series. Generally, the application of SDSM contains five steps (Wilby et al. 2002; Wilby and Harris 2006): (1) selection of predictors, (2) model parameter calibration, (3) simulation, (4) model validation, and (5) generation of future series of the predictand.

5 Results and discussion

5.1 Screening of variables

The most relevant predictors are screened by a multiple correlation analysis between the gridded predictors and the predictand, here station precipitation. Daily data for 26 large-scale predictor variables derived from the NCEP reanalysis data set are used to investigate the percentage of variance explained by each predictand-predictor pair. In general, the correlation between the predictor variables and the predictand is low for daily precipitation (Huang et al. 2011). The final choice of predictors is made by considering whether the identified variables and relationships are physically sensible for the particular experiment or study site. In this study, the correlation matrix and the partial correlation coefficients between the daily observed precipitation and the individual NCEP predictors were computed for each station, and appropriate predictors were selected on the basis of the p-value and the partial correlation. The predictors are presented in Table 2. For region-1, the selected predictors were surface specific humidity, near-surface relative humidity, 500-hPa divergence, 500-hPa wind direction, and 500-hPa meridional velocity. For region-2, they were mean sea level pressure, 500-hPa airflow strength, 500-hPa geopotential height, 500-hPa wind direction, 850-hPa geopotential height, 850-hPa divergence, and surface specific humidity. For region-3, they were surface zonal velocity, 500-hPa airflow strength, 500-hPa zonal velocity, 500-hPa geopotential height, 500-hPa wind direction, and 850-hPa geopotential height.
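As an illustration of the partial correlation step, the sketch below computes the correlation between the predictand and each candidate predictor after removing the linear influence of the other candidates. It uses the standard recursive formula for partial correlation; the function names are hypothetical, the p-values used in the study are not computed, and this is not SDSM's own screening code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, controls):
    """Correlation of x and y given the series in `controls`.

    Recursive formula: remove one control variable at a time.
    The cost grows quickly with the number of controls, so this
    is only suitable for a handful of candidate predictors.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def screen_predictors(predictand, candidates):
    """Partial correlation of each candidate with the predictand,
    controlling for all other candidates. `candidates` maps a
    predictor name to its series."""
    scores = {}
    for name, series in candidates.items():
        others = [s for other, s in candidates.items() if other != name]
        scores[name] = partial_corr(predictand, series, others)
    return scores
```

A predictor whose simple correlation with precipitation is high but whose partial correlation is near zero adds little once the other candidates are included, which is the kind of redundancy this screening step is meant to expose.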

Table 2 List of 26 NCEP predictor variables and selected predictors for the calibration (in bold)

5.2 Calibration

The observed data series for 1961–2000 were split into two periods, 1961–1990 and 1991–2000, used for model calibration and validation, respectively. For region-2 and region-3, where observations begin in 1969, the calibration period is 1969–1990. Figure 2 shows the monthly precipitation of the three climatic regions during the calibration period. Following the SDSM 4.2 user manual for NCEP reanalysis predictors, the wet-day threshold was set to 0 mm, a fourth root transformation was applied to the original precipitation series to bring it closer to a normal distribution (Wilby et al. 2002), and ordinary least squares was used for optimization. SDSM provides several statistical indicators, such as the percentage of explained variance and the standard error (SE), to summarize the calibration results (Wilby et al. 2002). In this paper, the percentage of explained variance of the downscaling experiments ranged from 7 to 16.4 % across the regions, and the SE ranged from 0.36 to 0.57. For heterogeneous and random variables such as daily precipitation occurrence and amounts, the percentage of explained variance is likely to be less than 40 % (Wilby et al. 2002). The calibration is probably seriously biased by the large number of zero values entering the multiple regressions, and underlying surface factors are not considered in SDSM.
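The calibration settings and indicators mentioned above can be sketched in a few lines of Python. The fourth root transformation and its inverse follow directly from the text; the formulas for the percentage of explained variance and the standard error are the conventional regression definitions and are assumed here, since the paper does not state how SDSM computes them. All function names are hypothetical.

```python
import math


def to_fourth_root(amounts):
    """Fourth root transformation of precipitation amounts (mm)."""
    return [a ** 0.25 for a in amounts]


def from_fourth_root(values):
    """Back-transform to mm; negative model output is clipped to zero."""
    return [max(v, 0.0) ** 4 for v in values]


def explained_variance_pct(observed, fitted):
    """Percentage of variance explained, 100 * (1 - SS_res / SS_tot)."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return 100.0 * (1.0 - ss_res / ss_tot)


def standard_error(observed, fitted, n_predictors):
    """Residual standard error with n - k - 1 degrees of freedom
    (assumed definition; k is the number of predictors)."""
    dof = len(observed) - n_predictors - 1
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return math.sqrt(ss_res / dof)
```

Both indicators would be evaluated on the transformed series, since that is the scale on which the regression is fitted under these assumptions.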

Fig. 2 Observed mean monthly precipitation during calibration period in the three climatic regions

5.3 Bias correction

According to Salzmann et al. (2007), a bias correction approach can be used to remove biases from the daily time series of downscaled data. Bias correction is applied to the downscaled data obtained from the two SDSMs using HadCM3 predictors in order to obtain more realistic, less biased data on the future climate. Bias correction is performed using the following equation:

$$ P_{\mathrm{deb}} = P_{\mathrm{scen}} \times \left( \frac{P_{\mathrm{obs}}}{P_{\mathrm{cont}}} \right) $$

where $P_{\mathrm{deb}}$ is the de-biased (corrected) daily precipitation for the future period, $P_{\mathrm{scen}}$ is the SDSM-generated daily precipitation time series for the future period, $P_{\mathrm{cont}}$ is the long-term average of monthly precipitation for the control period simulated by SDSM, and $P_{\mathrm{obs}}$ is the long-term average of observed monthly precipitation.
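A minimal Python sketch of this multiplicative correction is given below. It assumes that the long-term monthly means are supplied as dictionaries keyed by calendar month and that each daily value is scaled by the ratio for its own month; the function and argument names are hypothetical, and leaving values unchanged when the control mean is zero is an assumption added only to avoid division by zero.

```python
def monthly_means(months, values):
    """Mean of `values` for each calendar month (1 to 12)."""
    totals, counts = {}, {}
    for m, v in zip(months, values):
        totals[m] = totals.get(m, 0.0) + v
        counts[m] = counts.get(m, 0) + 1
    return {m: totals[m] / counts[m] for m in totals}


def bias_correct(scen_months, scen_values, obs_mean, cont_mean):
    """Apply P_deb = P_scen * (P_obs / P_cont) day by day.

    scen_months: calendar month of each downscaled daily value
    scen_values: downscaled daily precipitation (P_scen)
    obs_mean:    {month: long-term observed mean}        (P_obs)
    cont_mean:   {month: long-term control-period mean}  (P_cont)
    """
    corrected = []
    for m, p in zip(scen_months, scen_values):
        # Assumption: no correction when the control mean is zero.
        factor = obs_mean[m] / cont_mean[m] if cont_mean[m] > 0 else 1.0
        corrected.append(p * factor)
    return corrected
```

Because the correction is a pure scaling, dry days stay dry and the number of wet days is unchanged; only the amounts are adjusted.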

Before being applied to the future downscaled data, the bias correction was validated for the period 1991–2000. The mean monthly biases were derived from the SDSM-downscaled data and the observed data of each region for 1981–1990, a period chosen because it has the same 10-year length as the validation period (Mahmood and Babel 2013). These biases were then used to adjust the daily SDSM output for 1991–2000. The corrected downscaled data were compared with the observed data using the statistical indicators reported in Table 3. After successful validation, bias correction was applied to the future downscaled data. Mean monthly values of the observed, simulated, and bias-corrected results are presented in Table 3.

Table 3 Comparison of observed and downscaled values of statistical indices during the validation period of 1991–2000

5.4 Validation

To validate the SDSM model, three sets of atmospheric data were used: NCEP and scenarios A2 and B2 from the HadCM3 model (denoted H3A2 and H3B2, respectively). The monthly mean (μ), coefficient of determination (R²), relative error (RE), and root mean square error (RMSE) were used to evaluate the simulated monthly precipitation series in each region. The validation period for precipitation was the 10 years from 1991 to 2000, and the results differed clearly between regions. Statistical parameters and validation results are presented in Table 3. Over region-1, the bias-adjusted monthly precipitation series simulated from NCEP, H3A2, and H3B2 had mean R² values higher than 0.6, and the mean RE between observed and downscaled data did not exceed 22 %. In the other regions, the values are comparable or better; averaged over the three regions, R² reached 0.84. This indicates that the model is suitable for downscaling future precipitation. Figure 3a–c shows the observed and downscaled precipitation after bias correction for 1991–2000. In some months, such as July, the model strongly overestimates precipitation at region-1 and region-3; for the remaining months, the difference between observed and downscaled values is smaller. On the whole, the precipitation series simulated by SDSM with the three sets of atmospheric data had a more or less acceptable linear relationship with the observations, but the amounts deviated noticeably. Table 3 shows that at region-1, simulated precipitation is slightly underestimated. In region-2, the NCEP simulation is underestimated, while those of H3A2 and H3B2 are slightly overestimated. In region-3, the H3A2 and H3B2 simulations are overestimated.

In general, bias correction remarkably reduced the bias relative to the observations. The simulations derived from NCEP were better than those from H3A2 and H3B2 because the SDSM was calibrated with NCEP data; the fitted parameters therefore carried biases when the model was driven by the H3A2 and H3B2 data, as also noted by Huang et al. (2011).
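The validation statistics named above can be computed as in the following Python sketch. The paper does not give the formulas, so conventional definitions are assumed: R² as the squared Pearson correlation, RE as the percentage difference between the simulated and observed means, and RMSE as the root of the mean squared difference. The function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(observed, simulated):
    """Squared Pearson correlation between the two series."""
    mo, ms = _mean(observed), _mean(simulated)
    cov = sum((o - mo) * (s - ms) for o, s in zip(observed, simulated))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_s = sum((s - ms) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


def relative_error_pct(observed, simulated):
    """Relative error of the simulated mean, in percent (assumed form)."""
    mo = _mean(observed)
    return 100.0 * (_mean(simulated) - mo) / mo


def rmse(observed, simulated):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)
```

Note that a series can have a high R² and still be biased, as in a simulation that is offset from the observations by a constant amount; this is why R² is reported together with RE and RMSE.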

Fig. 3 a Monthly values of observed and downscaled bias-corrected (NCEP, H3A2, and H3B2) precipitation during validation period of 1991–2000 at region-1. b Monthly values of observed and downscaled bias-corrected (NCEP, H3A2, and H3B2) precipitation during validation period of 1991–2000 at region-2. c Monthly values of observed and downscaled bias-corrected (NCEP, H3A2, and H3B2) precipitation during validation period of 1991–2000 at region-3

5.5 Future precipitation downscaling scenarios

In this study, 1961–1990 was taken as the base period, as in most impact studies worldwide, and the future was divided into the 2020s (2010–2039), 2050s (2040–2069), and 2080s (2070–2099). The patterns of change in future precipitation relative to the base period were then analyzed using only the H3A2 and H3B2 data. Taking into account the performance of SDSM in simulating precipitation for the current period (1991–2000), the changes in seasonal and annual mean precipitation in the three regions under scenarios H3A2 and H3B2 are discussed here for illustrative purposes.

The changes of seasonal and annual mean precipitation (compared to base period 1961–1990) at three regions of Nepal under scenarios H3A2 and H3B2 are shown in Figs. 4, 5, and 6. It is seen that under scenario H3A2, the changes of annual mean precipitation of future periods (2020s, 2050s, and 2080s) in region-1 (arid region) would be 19.27, 16.6, and 14.76 %, respectively (Fig. 4a); as to region-2 (humid region), the changes would be 4.3, 4.7, and 4.8 %, respectively (Fig. 5a), while for region-3 (subhumid region), the changes would be 16.6, 19.8, and 13.4 %, respectively (Fig. 6a).

Fig. 4 a Future precipitation scenario for H3A2 during 2020s, 2050s, and 2080s with respect to base period at region-1. b Future precipitation scenario for H3B2 during 2020s, 2050s, and 2080s with respect to base period at region-1

Fig. 5 a Future precipitation scenario for H3A2 during 2020s, 2050s, and 2080s with respect to base period at region-2. b Future precipitation scenario for H3B2 during 2020s, 2050s, and 2080s with respect to base period at region-2

Fig. 6 a Future precipitation scenario for H3A2 during 2020s, 2050s, and 2080s with respect to base period at region-3. b Future precipitation scenario for H3B2 during 2020s, 2050s, and 2080s with respect to base period at region-3

Under scenario H3B2, the changes of annual mean precipitation of future periods (2020s, 2050s, and 2080s) in region-1 would be 9.2, 14.2, and 13 %, respectively (Fig. 4b); as to region-2, the changes would be 5.4, 7, and 5.1 %, respectively (Fig. 5b), while for region-3, the changes would be 18.7, 13.6, and 21.7 %, respectively (Fig. 6b).

The changes in seasonal mean precipitation in the three regions under scenarios H3A2 and H3B2 differ clearly between seasons. Under scenario H3A2, the largest seasonal changes over the future periods (2020s, 2050s, and 2080s) in region-1 occur in winter (December–February; 42.34 %) and autumn (September–November; 13.41 %); in region-2, they occur in summer (June–August; 9.04 %) and spring (March–May; 5.28 %); and in region-3, they occur in autumn (30.63 %) and winter (22.1 %). The results under scenario H3B2 follow broadly similar patterns and differ only in magnitude.
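The percentage changes discussed in this section can be derived as in the following Python sketch, which compares the mean yearly precipitation total of each future period with that of the base period, either for all months or for the months of one season. The period and season definitions are taken from the text; the data layout (a dictionary keyed by year and month), the function names, and the treatment of December as belonging to its own calendar year are assumptions made for illustration.

```python
BASE_PERIOD = (1961, 1990)
FUTURE_PERIODS = {
    "2020s": (2010, 2039),
    "2050s": (2040, 2069),
    "2080s": (2070, 2099),
}
SEASON_MONTHS = {
    "winter": (12, 1, 2),
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
}
ALL_MONTHS = tuple(range(1, 13))


def mean_yearly_total(monthly, period, months=ALL_MONTHS):
    """Mean yearly precipitation total over `months` within `period`.

    monthly: {(year, month): monthly total in mm}; assumed complete
    for every year of the period.
    """
    start, end = period
    total = sum(
        value
        for (year, month), value in monthly.items()
        if start <= year <= end and month in months
    )
    return total / (end - start + 1)


def percent_changes(monthly, months=ALL_MONTHS):
    """Percentage change of each future period relative to the base period."""
    base = mean_yearly_total(monthly, BASE_PERIOD, months)
    return {
        label: 100.0 * (mean_yearly_total(monthly, period, months) - base) / base
        for label, period in FUTURE_PERIODS.items()
    }
```

Calling `percent_changes(monthly)` gives the annual changes, and `percent_changes(monthly, SEASON_MONTHS["winter"])` gives the winter changes for the same series.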

6 Conclusion

SDSM was applied to downscale precipitation in three climatic regions of Nepal as a case study under the H3A2 and H3B2 scenarios. SDSM is well known for its simplicity and is widely used as a decision support tool. Downscaling precipitation scenarios is important for understanding the impact of climate change and hydrological processes at the local scale. The validation results improved when bias correction was applied to the downscaled data. The NCEP simulation is better than those from H3A2 and H3B2 in the validation period, partly because SDSM is fitted using the NCEP data. The R² values between observed and SDSM-simulated monthly precipitation range from 0.61 to 0.97, which demonstrates the applicability of the downscaling.

The model results for future precipitation show that, compared with the base period, the annual mean precipitation of the future periods changes in different patterns under scenarios H3A2 and H3B2. An increase in mean annual precipitation is expected under both H3A2 and H3B2 for all three future periods (2020s, 2050s, and 2080s). On annual average, under the H3A2 scenario, the arid region (region-1) shows the largest change (18.53 %) in precipitation relative to the base period. Under the H3B2 scenario, the subhumid region (region-3) shows the most remarkable change (18.07 %).

Seasonally, on average, under H3A2, winter shows the most remarkable increase in precipitation, 42.34 %, at region-1, and autumn shows the smallest change, 1.64 %, at region-2. Under H3B2, winter shows the largest change in future precipitation, 39.17 %, at region-3, and autumn shows the smallest change, 1.55 %, at region-2. Averaged over the three climatic regions, mean annual precipitation would increase by 13.4 and 11.14 % under H3A2 and H3B2, respectively, in the 2020s. Similarly, the increase reaches 13.75 and 11.68 % under H3A2 and H3B2 in the 2050s. During the 2080s, there would be increases of 8.28 and 13.30 % under H3A2 and H3B2, respectively, compared to the base period.

From this study, we found that the model is better suited to predicting mean monthly precipitation than to monthly and seasonal calculations. Precipitation in the heaviest-precipitation month was overestimated in the arid and subhumid regions. Among the arid, humid, and subhumid regions, the model performed best in the humid region. This study was carried out as a case study to evaluate the performance of SDSM; further study would be useful to verify these results across Nepal. In addition, a more extensive study has been planned over the river basins of Nepal.