1 Introduction

Over the last century, human activities have increased greenhouse gas emissions to the atmosphere, which has caused global warming. Research in recent decades indicates that the increase in atmospheric greenhouse gas concentrations due to anthropogenic emissions has begun to alter the global climate (IPCC 2007, 2012). Such impacts are expected to be more intense in sub-Saharan Africa, where the social–ecological system is not resilient enough to absorb shocks (IPCC 2007; Davies et al. 2010). As ongoing climate change has been confirmed, assessing its impact on regionally important sectors has become a major concern, especially for policymakers who develop action plans to mitigate and adapt to the impacts of future climate change (IPCC 2001).

Climate change studies are conducted using data from general circulation models (GCMs); however, because GCMs have a coarse resolution, they are not suitable for regional climate change impact studies. Instead, regional climate models (RCMs) are used to dynamically downscale GCM output to scales more suitable for regional applications. GCM-driven RCM output may therefore provide valuable information for climate adaptation practices, risk assessment studies, and policy planning. Such efforts have enabled the application of RCM outputs to understand the impacts of climate change on local climates that are influenced by complex topographies and landscapes (Alley and Coauthors 2007; Giorgi et al. 2009). However, RCM outputs contain biases, which makes it difficult to use them directly in climate and hydrological impact studies. Several researchers have suggested that bias correction and the use of the ensemble mean are important for reducing errors in RCM outputs (Christensen et al. 2008; Teutschbein and Seibert 2010; Kim et al. 2014; Nikulin et al. 2012).

Bias correction methods aim to adjust the mean, variance, and/or quantiles of a model time series using a correction factor so that the corrected series closely matches the observed variable. Such methods have been developed in recent decades to reduce statistical biases in RCM outputs, mainly for temperature and rainfall (Teutschbein and Seibert 2012). In this study, the skill of the linear scaling and distribution mapping methods in reducing statistical biases of simulated rainfall and temperature was assessed.

The outputs of CORDEX have been assessed and used for climate change impact modeling in different parts of Africa and were found to be of a reasonable standard (Nikulin et al. 2012; Haile and Rientjes 2015). An evaluation for the Ethiopian highlands showed that simulations from ten CORDEX Africa models reproduced the shape of the monthly rainfall distribution and the annual rainfall anomaly but overestimated the mean monthly rainfall amount (Endris et al. 2013). A study in the upper Blue Nile basin (Fentaw et al. 2018) consistently predicts an increase in temperature.

The Baro–Akobo is a transboundary river basin that drains into the Nile River. The Nile River is the main water resource for the riparian countries and is already under immense pressure due to various competing uses as well as social, geopolitical, and legislative conditions. Moreover, the Nile basin region is vulnerable to climatic variability (Kim et al. 2008), which affects the water resources of the basin (Haile and Rientjes 2015). To cope with climatic variability and to improve food security and energy provision in the Baro–Akobo basin and the Nile basin region, large irrigation and hydropower schemes are planned and under implementation (McCartney et al. 2010). Since climate change may have a considerable impact on water resources (IPCC 2014), planning water resources in the basin requires a robust study that explores the long-term climate trend and variability to optimize outcomes from such investments.

Few studies have evaluated the performance of the CORDEX Africa RCMs in reproducing the observed climate in Ethiopia. This study, therefore, evaluated the performance of raw and bias-corrected RCM simulations in reproducing observed rainfall and temperature at several locations in the Baro–Akobo basin from 1975 to 2005. Such studies help to select reliable RCM outputs for future climate change impact studies in highly vulnerable areas such as the Baro–Akobo basin.

2 Materials and methods

2.1 Study area

The Baro–Akobo basin is one of the 12 major river basins of Ethiopia. It drains from the western highlands of Ethiopia to the Sudanese border to join the White Nile. The basin covers a large part of southwestern Ethiopia between 5° 31′ and 10° 54′ north latitude and 33° and 36° 17′ east longitude, with an area of about 76,000 km². It is bordered by South Sudan in the west and southwest, Sudan in the northwest, and the Abbay and Omo–Gibe basins in the east. The basin has a complex topography with an elevation range of 293 to 3266 m above sea level (masl), which results in different rainfall regimes. The complex terrain and land surface heterogeneity and their interactions with large-scale climate forcings contribute to the diverse spatial rainfall patterns over the basin. The mean annual rainfall in the Baro–Akobo basin, for stations at elevations between 1230 and 2200 m, is estimated at 1749.803 mm/year, and the mean daily maximum and minimum temperatures are 26.80 °C and 13.41 °C, respectively (Fig. 1 and Table 1).

Fig. 1

Study watershed of the Baro–Akobo basin in Ethiopia with the location of meteorological stations used in the study. The background image represents a digital elevation model

Table 1 Meteorological stations considered in the current study with their mean annual rainfall and temperatures from 1975 to 2005

2.2 Observed climate data

Observed daily rainfall and maximum (Tmax) and minimum (Tmin) temperature from 1975 to 2005 were obtained from the National Meteorology Agency (NMA). The rainfall pattern over the Baro–Akobo basin is predominantly unimodal with two distinct seasons: the main wet season, locally called “Kiremt”, which generally spans from February/March to October/November, and the dry season, called “Bega”, which generally spans from November/December to February. Rainfall also varies spatially across the region. Stations in the southern part of the basin, including Masha, Mizan Teferi, Chena, and Tepi, have a Bega season from December to February and a Kiremt season from March to November, whereas stations in the northern part of the basin, including Alem Teferi, Begi, Bure, Gore, Uka, Mettu, Dembi Dollo, Hurumu, Chora, and Yubdu, have a Bega season from November to February and a Kiremt season from March to October (NMA 1996).

Despite some missing records, most of the data obtained from NMA capture the climatology of the region (Diro et al. 2011). Data from fourteen rainfall and six temperature stations were obtained from NMA and used to bias correct and evaluate the performance of the RCM output (Fig. 1 and Table 1). Quality control of the observed climate data (e.g., checking for daily rainfall values below 0 and for daily maximum temperatures lower than the daily minimum temperature) was conducted using RClimDex 1.1 software (Zhang and Yang 2004). Complete datasets with no missing records were difficult to obtain in a remote area such as the Baro–Akobo basin of Ethiopia. To balance data quality and availability, this study therefore used stations with less than 20% missing data. The missing data were filled using the multivariate imputation by chained equations (MICE) algorithm, which is available in the R statistical software (Buuren et al. 2015). Turrado et al. (2014) showed that the MICE algorithm performed better than other methods, such as multiple linear regression (MLR) and inverse distance weighting (IDW), in estimating missing climate data, using solar radiation data under different atmospheric conditions in Galicia, Spain.
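The two quality-control checks and the 20% missing-data criterion described above can be illustrated with a minimal Python sketch. It does not reproduce RClimDex; the function names and the use of `None` to mark a missing record are assumptions made only for this illustration.

```python
def quality_flags(rain, tmax, tmin):
    """Return indices of days that fail the two checks named in the text.

    Missing records are assumed to be stored as None (illustrative choice).
    """
    # Check 1: daily rainfall must not be negative.
    negative_rain = [i for i, r in enumerate(rain)
                     if r is not None and r < 0]
    # Check 2: daily maximum temperature must not be below the minimum.
    tmax_below_tmin = [i for i, (hi, lo) in enumerate(zip(tmax, tmin))
                       if hi is not None and lo is not None and hi < lo]
    return negative_rain, tmax_below_tmin


def missing_fraction(series):
    """Share of records in the series that are missing (None)."""
    return sum(v is None for v in series) / len(series)


def station_is_usable(series, max_missing=0.20):
    """Keep a station only if less than 20% of its records are missing."""
    return missing_fraction(series) < max_missing
```

Days flagged by `quality_flags` would be inspected or set to missing before imputation; the imputation step itself (MICE in R) is not sketched here.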

2.3 Regional climate models data set

RCM-simulated daily rainfall and maximum and minimum temperature data were obtained from the CORDEX project (http://cordexesg.dmi.dk/esgf-web-fe/). The CORDEX RCM data have a spatial resolution of 0.44°, which corresponds to grid cells of approximately 50 km by 50 km. The driving GCMs used in this study are abbreviated as CNRM for CNRM-CERFACS-CNRM-CM5, ICHEC for ICHEC-EC-EARTH, and MPI for MPI-M-MPI-ESM-LR (Table 2). RCA4 and CCLM4 were each driven by three GCMs (CNRM, ICHEC, and MPI), and REMO2009 was driven by MPI. These RCMs were selected because the outputs of CCLM4, RCA4, and REMO have been evaluated and showed reasonable performance over East Africa (Nikulin et al. 2012; Endris et al. 2013; Worku et al. 2019).

Table 2 List of RCMs with their institute and driving GCMs used in the study

In areas with sparse station networks and complex terrain, the accuracy and precision of interpolating station data for comparison with gridded climate model output are questionable (Bhowmik and Costa 2015; Osborn and Hulme 1997). Therefore, given the sparse rain gauge network in the Baro–Akobo basin, a pixel-to-point approach was used to compare gridded RCM data against point rain gauge observations. Bhattacharya and Khare (2020) showed that a pixel-to-point comparison of observed data with gridded climate data resulted in good agreement in the Beas River basin of the northwestern Himalaya.
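As an illustration of the pixel-to-point idea, the sketch below selects the grid cell whose centre is nearest to a station and extracts its time series. It assumes a regular latitude-longitude grid described by one-dimensional coordinate vectors and a field stored as nested lists `field[t][row][col]`; the function names, grid values, and station coordinates are hypothetical and are not taken from the study.

```python
def nearest_cell(station_lat, station_lon, grid_lats, grid_lons):
    """Return (row, col) of the grid cell centre closest to the station.

    Assumes a regular grid, so latitude and longitude can be searched
    independently (an assumption of this sketch).
    """
    row = min(range(len(grid_lats)),
              key=lambda i: abs(grid_lats[i] - station_lat))
    col = min(range(len(grid_lons)),
              key=lambda j: abs(grid_lons[j] - station_lon))
    return row, col


def extract_series(field, row, col):
    """Extract the time series of one cell from field[t][row][col]."""
    return [step[row][col] for step in field]


# Hypothetical 0.44-degree grid and station location, for illustration only.
grid_lats = [7.26, 7.70, 8.14, 8.58]
grid_lons = [34.76, 35.20, 35.64, 36.08]
row, col = nearest_cell(8.10, 35.50, grid_lats, grid_lons)
```

The extracted series is then compared directly with the gauge record, without interpolating the station data onto the model grid.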

2.4 Bias correction methods

From the gridded RCM data, a representative value for each station was extracted and bias-corrected using the CMhyd tool. The tool has been tested with the CORDEX archive for different regions and provided satisfactory performance (Rathjens et al. 2016). It was used to bias correct rainfall data from seven RCM outputs and temperature data from five RCM outputs. This study used the distribution mapping (DM) and linear scaling (LS) methods to correct rainfall and the DM method to correct temperature, because these methods have been effective in removing RCM bias in several studies (Christensen et al. 2008; Teutschbein and Seibert 2012; Fang et al. 2015).
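The two correction ideas can be sketched in Python as follows. This is not the CMhyd implementation. Linear scaling is written in its usual form (a multiplicative monthly factor for rainfall, an additive monthly offset for temperature). For distribution mapping, which is commonly implemented by fitting theoretical distributions, the sketch swaps in a simpler empirical, rank-based quantile mapping. All function and argument names are assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean


def linear_scaling(obs, sim, months, multiplicative=True):
    """Correct `sim` so that its monthly means match those of `obs`.

    obs, sim, months are equally long; months holds the calendar month
    of each value. Use multiplicative=True for rainfall and False for
    temperature.
    """
    factors = {}
    for m in set(months):
        o = mean(v for v, mm in zip(obs, months) if mm == m)
        s = mean(v for v, mm in zip(sim, months) if mm == m)
        if multiplicative:
            # Leave a month uncorrected if the model mean is zero
            # (a choice of this sketch to avoid division by zero).
            factors[m] = o / s if s != 0 else 1.0
        else:
            factors[m] = o - s
    if multiplicative:
        return [v * factors[m] for v, m in zip(sim, months)]
    return [v + factors[m] for v, m in zip(sim, months)]


def empirical_quantile_mapping(obs, sim, values):
    """Map each value to the observed value of equal empirical rank.

    A rank-based stand-in for distribution mapping: the non-exceedance
    probability of a value in the simulated sample is looked up in the
    sorted observed sample.
    """
    obs_sorted, sim_sorted = sorted(obs), sorted(sim)
    n_obs, n_sim = len(obs_sorted), len(sim_sorted)
    mapped = []
    for v in values:
        p = bisect_right(sim_sorted, v) / n_sim
        k = min(n_obs - 1, max(0, round(p * n_obs) - 1))
        mapped.append(obs_sorted[k])
    return mapped
```

Linear scaling matches only the monthly means, whereas a mapping of the full distribution also adjusts variance and quantiles, which is why the two methods can rank differently depending on the evaluation statistic.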

2.5 Statistical evaluation of raw and bias-corrected RCMs

The raw and bias-corrected RCM daily rainfall and temperature data were evaluated against observed data using goodness-of-fit criteria. Initially, the performance of the RCMs was evaluated in terms of their skill in reproducing the mean monthly characteristics of the observed rainfall and temperature data. Moreover, the agreement between the observed data and the individual raw and bias-corrected RCMs, as well as the ensemble mean, was evaluated using widely used statistics: the correlation coefficient (R), percent bias (PBIAS), and root mean square error (RMSE). Monthly station values of rainfall and minimum and maximum temperature from 1975 to 2005 were used to evaluate the RCM-simulated counterparts.
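The two aggregations used in this evaluation, the mean monthly climatology and the multi-model ensemble mean, can be sketched as below. The sketch assumes that every series is a plain list and that ensemble members share the same time axis; the function names are illustrative.

```python
from statistics import mean


def monthly_climatology(values, months):
    """Mean value for each calendar month present in the record."""
    return {m: mean(v for v, mm in zip(values, months) if mm == m)
            for m in sorted(set(months))}


def ensemble_mean(members):
    """Element-wise mean across equally long member series."""
    return [mean(step) for step in zip(*members)]
```

Here `members` would hold one series per RCM and driving GCM combination, and the resulting ensemble series is evaluated in the same way as an individual model.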

The correlation coefficient in Eq. (1) is used to evaluate the linear relationship between the observed data and the RCM output. A value close to 1 indicates a very good fit between observed and modeled data.

$$ \mathrm{Correlation}\ \mathrm{coefficient}\ (R)=\frac{\sum \limits_{i=1}^n\left({S}_i-\overline{S}\right)\left({O}_i-\overline{O}\right)}{\sqrt{\sum \limits_{i=1}^n{\left({S}_i-\overline{S}\right)}^2\sum \limits_{i=1}^n{\left({O}_i-\overline{O}\right)}^2}} $$
(1)

Percent bias (PBIAS) in Eq. (2) measures the systematic bias between the observed and RCM output variables, expressed in percent. A PBIAS value of 0 indicates no systematic difference between simulated and observed amounts, whereas a large absolute PBIAS indicates that the RCM output diverges strongly from the observed one. A positive PBIAS indicates overestimation, whereas a negative PBIAS indicates underestimation of the observed variable.

$$ \mathrm{Percent}\ \mathrm{of}\ \mathrm{bias}\ \left(\mathrm{PBIAS}\right)=\frac{\sum \limits_{i=1}^n\left({S}_i-{O}_i\right)}{\sum \limits_{i=1}^n{O}_i}\times 100 $$
(2)

Root mean square error (RMSE) in Eq. (3) is used to show the differences between the observed data and the model outputs. An RMSE value close to 0 indicates very good agreement between the studied variables.

$$ \mathrm{RMSE}=\sqrt{\frac{\sum \limits_{i=1}^n{\left({S}_i-{O}_i\right)}^2}{n}} $$
(3)

The coefficient of variation (CV) in Eq. (4) and the third quantile of both the observed and RCM rainfall data were estimated to evaluate how well the RCMs reproduce the variability of the monthly observed rainfall at each station.

$$ \mathrm{CV}=100\times \frac{\sigma_R}{\overline{R}} $$
(4)

where Si and Oi are the ith simulated and observed values; an overbar denotes the mean over the analysis period (1975 to 2005); and n represents the analysis period, which was 30 years. In Eq. (4), R denotes the rainfall series, either RCM or observed, for which the statistic is estimated separately, and σ is the standard deviation of that series, so that CV is the ratio of the standard deviation to the mean, in percent.
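Equations (1) to (4) translate directly into a few lines of Python. The sketch below uses only the standard library. It assumes paired, equally long lists of simulated and observed values, and it uses the population standard deviation for the CV, which is an assumption because the text does not state which estimator was used.

```python
from math import sqrt
from statistics import mean, pstdev


def correlation(sim, obs):
    """Correlation coefficient R, Eq. (1)."""
    s_bar, o_bar = mean(sim), mean(obs)
    num = sum((s - s_bar) * (o - o_bar) for s, o in zip(sim, obs))
    den = sqrt(sum((s - s_bar) ** 2 for s in sim)
               * sum((o - o_bar) ** 2 for o in obs))
    return num / den


def pbias(sim, obs):
    """Percent bias, Eq. (2); positive values mean overestimation."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)


def rmse(sim, obs):
    """Root mean square error, Eq. (3), in the units of the variable."""
    return sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def cv(series):
    """Coefficient of variation in percent, Eq. (4).

    Population standard deviation is an assumption of this sketch.
    """
    return 100.0 * pstdev(series) / mean(series)
```

For a simulated series that is exactly twice the observed one, such as `sim = [2, 4, 6, 8]` against `obs = [1, 2, 3, 4]`, R equals 1 while PBIAS equals 100%, which shows why correlation alone cannot detect a systematic bias.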

The RCMs were further examined for their skill in simulating trends in rainfall and temperature. Trends and the magnitude of change in the observed annual rainfall and temperature time series, as well as in the bias-corrected RCM outputs, were evaluated using the Mann–Kendall (MK) trend test and Sen’s slope estimator, and the RCM results were compared against those for the observed data. These methods have been widely used to assess the significance of trends in climatologic and hydrologic time series (Tabari et al. 2015; Tekleab et al. 2013; Woldesenbetm et al. 2016). The MK trend test is based on two hypotheses: the null hypothesis, H0, assumes that there is no trend, and the alternative hypothesis, H1, assumes that there is a significant trend in the time series at a given significance level.
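A compact sketch of the MK test and Sen’s slope is given below for an annual series stored as a list. It uses the normal approximation for the test statistic and ignores the tie correction in the variance, which is a simplifying assumption; the function names are illustrative.

```python
from math import erf, sqrt
from statistics import median


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def mann_kendall(x):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall test.

    The variance of S omits the correction for tied values
    (an assumption of this sketch).
    """
    n = len(x)
    s = sum(sign(x[j] - x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return s, z, p


def sens_slope(x):
    """Sen's slope: median of all pairwise slopes, per time step."""
    n = len(x)
    return median((x[j] - x[i]) / (j - i)
                  for i in range(n - 1) for j in range(i + 1, n))
```

H0 is rejected at the 5% significance level when the returned p-value is below 0.05, and the sign of Sen’s slope gives the direction of the trend.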

Several methods have been developed to eliminate the effect of serial correlation on the Mann–Kendall trend test. These include pre-whitening (von Storch 1999), variance correction (Hamed and Rao 1998), and trend-free pre-whitening (TFPW) (Yue et al. 2002). This study used the TFPW method, which takes into account all serial correlations significant at the 95% confidence level (Kumar et al. 2009). The method provides a better assessment of the significance of trends in serially correlated data (Kumar et al. 2009; Zhang and Lu 2009). The TFPW procedure for removing serial correlation from the data is described in Tekleab et al. (2013) and Yue et al. (2002).
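The TFPW steps of Yue et al. (2002) can be sketched as: (1) remove the Sen’s slope trend, (2) remove the lag-1 autoregressive component from the detrended series, and (3) add the trend back before applying the MK test. The Python sketch below follows these steps under stated simplifications: the lag-1 coefficient is always removed (the significance check on it is omitted), a common estimator of the lag-1 autocorrelation is used, and all names are illustrative.

```python
from statistics import mean, median


def sens_slope(x):
    """Sen's slope: median of all pairwise slopes, per time step."""
    n = len(x)
    return median((x[j] - x[i]) / (j - i)
                  for i in range(n - 1) for j in range(i + 1, n))


def lag1_autocorrelation(x):
    """Lag-1 serial correlation coefficient (a common estimator)."""
    m = mean(x)
    den = sum((v - m) ** 2 for v in x)
    if den == 0:
        # A constant series carries no serial correlation information.
        return 0.0
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(len(x) - 1))
    return num / den


def tfpw(x):
    """Return (pre-whitened series, Sen's slope, lag-1 coefficient).

    The returned series has one value fewer than x because the first
    time step has no predecessor.
    """
    b = sens_slope(x)
    # Step 1: remove the linear trend.
    detrended = [v - b * t for t, v in enumerate(x)]
    # Step 2: remove the lag-1 autoregressive component.
    r1 = lag1_autocorrelation(detrended)
    residual = [detrended[t] - r1 * detrended[t - 1]
                for t in range(1, len(x))]
    # Step 3: blend the trend back into the residual series.
    blended = [res + b * t for t, res in enumerate(residual, start=1)]
    return blended, b, r1
```

The MK test is then applied to the returned series instead of the original one, so that the significance of the trend is not inflated by positive serial correlation.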

3 Results and discussion

3.1 Performance of RCM outputs in reproducing monthly mean rainfall

Figure 2 shows the magnitude and distribution of the mean monthly observed rainfall and the raw RCM rainfall. Most of the raw RCMs captured the distribution of the observed mean monthly rainfall at the majority of the stations, although the simulated series was shifted by 1 to 2 months, particularly during the peak period when rainfall is highest. Moreover, the raw RCMs either overestimated or underestimated the observed values at most stations. During most months of the year, CCLM4 (CNRM) and CCLM4 (ICHEC) overestimated the observed rainfall at most stations, whereas CCLM4 (MPI), the RCA4 group, and REMO2009 were characterized by underestimation. However, RCA4 (CNRM) and RCA4 (ICHEC) showed an exceptional overestimation at Mizan Teferi and Masha stations during most of the rainy season (May to September). At most stations, the ensemble underestimated the observed values during most months of the year; however, it provided a better representation of the mean monthly rainfall distribution. In general, the raw CCLM4 model reproduced the observed mean monthly rainfall better than RCA4 and REMO2009 at most of the studied stations. Similarly, Haile and Rientjes (2015) found for the upper Blue Nile that most RCMs underestimated the observed mean monthly rainfall and that the ensemble gave a better representation of the mean monthly rainfall distribution.

Fig. 2

Mean monthly rainfall from raw dynamically downscaled regional climate model (RCM) outputs and observed data at 12 stations in the Baro–Akobo basin from 1975 to 2005

The individual RCMs and their ensemble, bias-corrected with the DM (Fig. 3) and LS (Fig. 4) methods, reproduced well the magnitude and distribution of the mean monthly rainfall at all stations. The 1- to 2-month shift at the onset of the peak rainfall season was corrected by the bias correction. The shortcomings of most raw RCMs and of the ensemble mean, i.e., overestimation by CCLM4 and underestimation by RCA4 and REMO, were also substantially reduced. The LS method performed slightly better than the DM method in capturing the observed rainfall at most stations, although at Chora station it overestimated the observed rainfall during the rainy season when correcting CCLM4 (MPI). The DM method was slightly less accurate in capturing the mean monthly rainfall at most stations; for example, RCA4 (CNRM) and REMO (MPI) underestimated the observed rainfall most of the time at Masha, Mizan Teferi, Gore, and Dembi Dollo stations.

Fig. 3

The same as that of Fig. 2 but for distribution mapping (DM) bias correction

Fig. 4

The same as that of Fig. 2 but for linear scaling (LS) bias correction

3.2 Performance of RCM outputs in reproducing mean monthly maximum and minimum temperatures

Figure 5 shows the magnitude and distribution of the mean monthly observed and raw RCM maximum temperatures. The majority of the individual raw RCMs and their ensemble reproduced well the distribution of the mean monthly maximum temperature at most of the stations. All raw RCMs except RCA4 (CNRM) and RCA4 (ICHEC) underestimated the observed mean monthly maximum temperature at most stations. Overestimation, particularly by RCA4 (CNRM) and RCA4 (ICHEC), prevailed at Gore and Dembi Dollo stations. Among the raw RCMs, RCA4 (CNRM) performed best in reproducing the observed magnitude and distribution of the mean monthly maximum temperature at most of the stations.

Fig. 5

Mean monthly maximum temperature from raw dynamically downscaled regional climate model (RCM) outputs and observed data at 6 stations in the Baro–Akobo basin from 1975 to 2005

The representation of the observed maximum temperature by all individual RCMs and their ensemble improved substantially after bias correction (Fig. 6). The underestimation by most of the raw RCMs and the overestimation by some models of the observed mean monthly maximum temperature were adequately removed at all stations after applying the DM bias correction.

Fig. 6

The same as that of Fig. 5 but for distribution mapping (DM) bias correction

As with the maximum temperature, all the individual raw RCMs and their ensemble reproduced well the distribution of the mean monthly minimum temperature at most of the stations (Fig. 7). However, the raw RCMs and their ensemble mean overestimated the observed values at most of the stations, except RCA4 (ICHEC), which underestimated them at some stations. Among the raw RCMs, RCA4 (ICHEC) performed best in reproducing the observed magnitude and distribution of the minimum temperature at most of the stations.

Fig. 7

Mean monthly minimum temperature from raw dynamically downscaled regional climate model (RCM) outputs and observed data at 6 stations in the Baro–Akobo basin from 1975 to 2005

The overestimation or underestimation of the observed mean monthly minimum temperature was adequately removed at most of the stations after applying the DM bias correction (Fig. 8). However, CCLM4 (MPI) showed a minor but consistent underestimation of the observed minimum temperature at Mizan Teferi station and an overestimation at Mettu station during most months of the year.

Fig. 8

The same as that of Fig. 7 but for distribution mapping (DM) bias correction

In general, the raw RCM simulations were characterized by overestimation and underestimation of rainfall and temperature at most of the stations. This is likely because rainfall and temperature vary spatially with the elevation differences among the meteorological stations considered. Other studies (Teutschbein and Seibert 2010; Worku et al. 2019) have also reported that the simulation of rainfall and temperature by climate models is characterized by overestimation and underestimation in areas of different elevation.

The findings of this study indicate that distribution mapping and linear scaling adequately corrected the bias in the RCM-simulated mean monthly rainfall, while distribution mapping also adequately corrected the bias in the mean monthly maximum and minimum temperatures in the study region. The bias-corrected rainfall and temperatures could be used in future studies of climate change impacts on water resources. Other studies (Teutschbein and Seibert 2012; Worku et al. 2020; Fang et al. 2015) showed that these bias correction methods performed satisfactorily in correcting mean monthly and annual rainfall and temperature values.

3.3 Statistical performance evaluation of raw and bias-corrected RCMs against observed climate data

3.3.1 Rainfall

The correlation coefficient, PBIAS, and RMSE showed a substantial difference between the raw and bias-corrected RCM outputs when compared to the observed long-term monthly rainfall data (Fig. 9). The individual raw RCMs and the ensemble simulations were heavily biased relative to the observed data. Raw RCA4 (CNRM), RCA4 (ICHEC), and RCA4 (MPI) underestimated the observed rainfall at most stations. For example, the largest underestimation (PBIAS of −66.8%) and the highest RMSE (5.9 mm/day) were obtained at Hurumu and Mizan Teferi stations for RCA4 (CNRM) and RCA4 (ICHEC), respectively. On the other hand, CCLM4 (CNRM) and CCLM4 (ICHEC) overestimated the observed rainfall at most stations. Among the raw RCMs, CCLM4 (ICHEC) showed the highest PBIAS (65.4%), at Bure station. CCLM4 (CNRM) at Mizan Teferi had the lowest correlation coefficient between observed and simulated monthly rainfall amounts. The ensemble produced the lowest biases compared to the individual raw RCMs; it resulted in the highest correlation and the lowest PBIAS and RMSE at most stations.

Fig. 9

Statistical performance evaluation of the raw and bias-corrected RCM outputs against the monthly observed rainfall for 12 stations in the Baro–Akobo basin from 1975 to 2005. The R, PBIAS, and RMSE represent the correlation coefficient, percent bias, and root mean square error, respectively, of the comparison with raw data. RDM, PBIASDM, and RMSEDM represent the correlation coefficient, percent bias, and root mean square error, respectively, of the comparison with bias-corrected data using the distribution mapping (DM). RLS, PBIASLS, and RMSELS represent the correlation coefficient, percent bias, and root mean square error, respectively, of the comparison with bias-corrected data using the linear scaling (LS)

The biases in the raw rainfall data were substantially reduced after the DM and LS bias corrections were applied (Fig. 9). The LS method was slightly better than the DM method in correcting the bias of the individual raw RCMs and the ensemble at most of the stations. With both bias correction methods, the ensemble performed best in representing the observed rainfall at most of the stations, with an R value above 0.8, a PBIAS of 0, and a low RMSE. Other studies (Kim et al. 2014; Nikulin et al. 2012; Endris et al. 2013) also reported that the ensemble outperformed the individual CORDEX Africa model outputs in representing observed rainfall and temperature data. It is worth mentioning that future studies may use the ensemble RCM mean to understand the trends and impacts of climate change in the Baro–Akobo basin.

3.3.2 Maximum temperature

Figure 10 compares the raw and bias-corrected RCM monthly maximum temperatures with the corresponding station observations using the correlation coefficient (R), percent bias (PBIAS), and root mean square error (RMSE). The raw RCM outputs had substantial biases relative to the observed data in terms of R, PBIAS, and RMSE. For example, raw CCLM4 (ICHEC) produced a PBIAS of −27.7% at Yubdu station and an RMSE of 8.17 °C at Alem Teferi station. In addition, CCLM4 (ICHEC) at Alem Teferi had the lowest correlation coefficient between observed and simulated monthly maximum temperature.

Fig. 10

Statistical performance evaluation of the raw and bias-corrected RCM outputs using distribution mapping (DM) of monthly maximum temperature against the observed monthly maximum temperature in six stations in the Baro–Akobo basin from 1975 to 2005. The R, PBIAS, and RMSE represent the correlation coefficient, percent bias, and root mean square error, respectively, of the comparison with raw data. RDM, PBIASDM, and RMSEDM represent the correlation coefficient, percent bias, and root mean square error, respectively, of the comparison with bias-corrected data using the distribution mapping (DM)

Bias correction using the DM method reduced the biases substantially at most of the stations. For example, the PBIAS of −27.7% at Yubdu station and the RMSE of 8.17 °C at Alem Teferi station for CCLM4 (ICHEC) were removed after applying the bias correction. The DM method was effective in providing good goodness-of-fit values, with a PBIAS close to 0, an RMSE below 2.85 °C, and an R value above 0.8 for most RCMs and stations. Although bias correction of the individual RCMs removed much of the bias in the maximum temperature, the ensemble performed better, with PBIAS = 0, RMSE < 1.7 °C, and R > 0.86 at all stations except Alem Teferi (Fig. 10).

3.3.3 Minimum temperature

Figure 11 compares the raw and bias-corrected RCM monthly minimum temperatures with the corresponding station observations using the correlation coefficient (R), percent bias (PBIAS), and root mean square error (RMSE). As with the maximum temperature, the raw RCM outputs of minimum temperature had substantial biases relative to the observed data in terms of R, PBIAS, and RMSE. However, the RCMs performed worse for the minimum temperature than for the maximum temperature. For example, except for RCA4 (ICHEC) at Gore station, all other RCMs and the ensemble resulted in a weak correlation of less than 0.5.

Fig. 11

Statistical performance evaluation of the raw and bias-corrected RCM outputs using distribution mapping (DM) of monthly minimum temperature against the observed monthly minimum temperature in six stations in the Baro–Akobo basin from 1975 to 2005. The R, PBIAS and RMSE represent the correlation coefficient, percent bias and root mean square error, respectively, of the comparison with raw data. RDM, PBIASDM and RMSEDM represent the correlation coefficient, percent bias and root mean square error, respectively, of the comparison with bias-corrected data using the distribution mapping (DM)

The biases in the raw minimum temperature were substantially reduced after the DM bias correction was applied (Fig. 11). However, the improvement after bias correction, particularly in correlation, was not as good as for the maximum temperature. Moreover, the bias-corrected minimum temperature for all the RCMs at Alem Teferi station and for CCLM4 (MPI) at Mizan Teferi station showed a higher PBIAS than the corresponding evaluation for the maximum temperature. This result agrees with the findings of other studies. According to Kim et al. (2014), the skill of RCM simulations is lower for the minimum temperature than for the maximum temperature. The low performance of the RCMs in reproducing the observed minimum temperature may be due to the effect of undulating topography and frequent cloudiness (Themeßl et al. 2012).

Besides the goodness-of-fit evaluations, the CV and third quantile were used to compare the raw and bias-corrected RCM outputs with the observed historical monthly rainfall. The raw RCMs showed substantial bias, with an underestimation of the third quantile and an overestimation of the CV at most stations (Fig. 12 in the Supplementary information). CCLM4 (CNRM) and CCLM4 (ICHEC) overestimated the observed third quantile, whereas CCLM4 (MPI), the RCA4 group, REMO2009, and the ensemble underestimated it at most stations.

The agreement in CV and third quantile improved after applying the DM (Fig. 13 in the Supplementary information) and LS (Fig. 14 in the Supplementary information) bias corrections. As the figures show, after bias correction the difference between the CV and third quantile of the observed rainfall and those of all RCMs and the ensemble was minor. The LS method was slightly better than the DM method in reproducing the mean monthly rainfall, while the DM method was better in representing the CV and third quantile at most stations. According to Teutschbein and Seibert (2012), DM was better than LS in estimating frequency-based statistics such as the CV and third quantile, whereas for time series–based metrics such as PBIAS and RMSE, mean-based methods (e.g., linear scaling) performed slightly better than DM methods (Fang et al. 2015). Nevertheless, the two bias correction methods did not differ substantially in representing the CV and third quantile, which indicates that the corrected monthly rainfall time series are in good agreement with the observations.

3.4 Mann–Kendall’s trend test and Sen’s slope estimate for annual rainfall, maximum and minimum temperature

The trend analysis of the observed annual rainfall series indicated decreasing trends at eight stations and increasing trends at four stations (Table 3). The decreasing trends were statistically significant at Gore, Masha, Uka, and Mettu stations, while none of the increasing trends was significant. The magnitudes of the significant decreasing trends at Gore, Masha, Uka, and Mettu stations were −6.89, −8.9, −12, and −17 mm/year, respectively (Table 4). A statistical assessment of rainfall trends in the upper Blue Nile River basin by Tabari et al. (2015) also reported a decreasing annual rainfall trend at Gore station. Cheung et al. (2008) likewise reported a significant decreasing trend in the June to September rainfall in the southwestern and central parts of Ethiopia. Both the DM and LS bias-corrected individual RCMs, as well as the ensemble annual rainfall, showed trends similar to those of the observed rainfall at most stations.

Table 3 MK trend test results for observed and bias-corrected RCM annual rainfall time series. Values in bold indicate statistically significant trends at the 5% significance level
Table 4 Sen’s slope result for observed and bias-corrected RCM annual rainfall time series

The observed annual maximum temperature showed an increasing but not significant trend at three stations and a significant increasing trend at three stations (Table 5). The significant increasing trends were observed at Alem Teferi, Mizan Teferi, and Mettu stations, with slopes of 0.005, 0.003, and 0.008 °C/year, respectively (Table 6). All of the bias-corrected individual RCMs and the ensemble showed increasing trends similar to the observed ones at all stations. The increasing trend of maximum temperature in the study region agrees with Woldesenbetm et al. (2016) and Christy et al. (2009), who showed an increasing trend in maximum temperature in the upper Blue Nile River basin and the eastern part of Africa, respectively. Likewise, the Intergovernmental Panel on Climate Change reported an increasing trend in observed temperature over the African continent (IPCC 2014).

Table 5 MK test results for observed and bias-corrected RCM annual maximum and minimum temperature time series. Values in bold indicate statistically significant trends at the 5% significance level
Table 6 Sen’s slope results for observed and bias-corrected RCM annual maximum and minimum temperature time series

For the minimum temperature, decreasing trends that were not significant were observed at two stations and increasing trends at four stations (Table 5). A significant increasing trend was observed only at Yubdu station, with a slope of 0.004 °C/year. As with the maximum temperature, all of the bias-corrected RCMs and their ensemble mean showed increasing trends similar to the observed ones at all stations (Table 5). The predominantly increasing minimum temperature trends agree with Tekleab et al. (2013), who found increasing trends in minimum temperature at the majority of stations in the upper Blue Nile basin, Ethiopia.

In general, the findings show that, after bias correction, the individual RCMs as well as the ensemble adequately simulate the trends in annual rainfall and in maximum and minimum temperature. Such robust RCM output could be used for future climate change impact studies in the Baro–Akobo basin. Statistically downscaled outputs of the HadCM3 model can provide useful information to devise appropriate climate change adaptation and mitigation strategies, such as sustainable watershed management practices that help to cope with the challenges of climate change while building resilience (Dile et al. 2013).

Changes in rainfall and temperature due to climate change may exacerbate the socio-ecological risks in the study area by affecting the basic natural resources that rural households rely on. Most of the inhabitants of the Baro–Akobo basin are smallholder rural farmers whose livelihoods depend on agriculture and pastoralism. A decreasing trend in rainfall and an increasing trend in temperature may reduce the available surface water and groundwater resources and thereby lower the agricultural and ecological productivity on which smallholder farmers depend for their livelihoods. Therefore, to cope with the impacts of climate change in the study region, appropriate adaptation and mitigation strategies should be devised to reduce severe impacts on the social–ecological systems of the Baro–Akobo basin.

4 Conclusion

This study evaluated dynamically downscaled rainfall and temperature simulations driven by GCMs that were part of the Coupled Model Intercomparison Project Phase 5 (CMIP5). The study area is located in the Baro–Akobo basin, where many ongoing and proposed water resources projects, such as irrigation and hydropower, have been identified. Several statistical metrics, such as the coefficient of correlation, PBIAS, and root mean square error, were used to compare the RCMs with the observations.
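For readers who wish to reproduce this kind of comparison, the metrics named above and a simple ensemble mean can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, observed and simulated series are assumed to be equal-length lists on the same time steps, the ensemble is assumed to be an unweighted mean of the member RCMs, and PBIAS is written with the convention that positive values indicate model overestimation (some studies use the opposite sign).

```python
import math


def pearson_r(obs, sim):
    """Pearson coefficient of correlation between observed and simulated series."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def pbias(obs, sim):
    """Percent bias; positive means overestimation (assumed sign convention)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def rmse(obs, sim):
    """Root mean square error, in the units of the variable."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def ensemble_mean(members):
    """Unweighted mean across ensemble members at each time step.

    `members` is a list of equal-length series, one per RCM.
    """
    return [sum(values) / len(values) for values in zip(*members)]
```

A typical use would be to compute `ensemble_mean` over the bias-corrected RCM series at one station and then pass the result, together with the observed series, to `pearson_r`, `pbias`, and `rmse`.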

The findings of this study show that the raw RCM simulations both overestimated and underestimated mean monthly rainfall and temperature at all stations. Furthermore, for the other performance metrics, no single RCM performed best at all stations at either the monthly or the annual timescale. This is likely because rainfall and temperature vary spatially with the elevation differences among the meteorological stations considered. This implies that the best-performing RCM varies within the study basin and thus that site-specific evaluation of RCMs is indispensable. However, at most stations, no substantial difference was found between the observed and the RCM-simulated values after bias correction, and a relatively small range of variability was identified in the simulated mean monthly distributions and the statistical performance evaluations for rainfall and temperature during the period 1975 to 2005.

It is noted that, at most stations, the bias-corrected climate data represented temperature better than rainfall, indicating that greater uncertainty still exists in model-simulated rainfall than in temperature. At most stations, the ensemble showed improved performance compared to most individual RCMs when evaluated against historical values of the monthly correlation, PBIAS, RMSE, CV, and third quantile. This indicates that the DM method for temperature and the LS and DM methods for rainfall provide a higher level of certainty in reproducing monthly rainfall and temperature.

The Mann–Kendall trend test for the observed data, for most of the individual RCMs, and for their ensemble showed a decreasing rainfall trend, while both the maximum and minimum temperature showed an increasing trend at most of the stations. Improving the understanding of rainfall and temperature is of high importance for the Baro–Akobo basin, as its economy is primarily based on rainfed agriculture and is therefore vulnerable to climate variability and climate change. We therefore conclude that the ensemble mean of the RCMs after bias correction can be used to assess future climate projections and to develop action plans to mitigate and adapt to the impacts of future climate change over the region.