1 Introduction

The earth’s climate is changing due to the increased concentration of greenhouse gases in the atmosphere and the local disturbance of ecosystems (IPCC 2021). The observed climate changes have considerable impacts on several sectors. For instance, hydrology, sediment dynamics, crop production, and numerous ecosystem services have all been affected by climate change (Gebrechorkos et al. 2023; Getachew et al. 2021; Hirpa et al. 2019; Nazeer et al. 2022; Tan et al. 2021). The latest (sixth) assessment report of the Intergovernmental Panel on Climate Change (IPCC 2021) indicated that future climate change is inevitable and will have an immense impact on several sectors, particularly in developing countries, which have limited capacity to adapt to the changing climate. Gebrechorkos et al. (2023) also stated that temperature and evaporation are projected to increase in most parts of East Africa throughout the 21st century, increasing hydrological extremes (floods and droughts). Thus, projecting the patterns and magnitudes of future climate under different climate scenarios for specific locations is imperative for planning effective, context-specific adaptation strategies.

Plausible future climate changes are understood through the outputs of regional climate models (RCMs) (Worku et al. 2019) or global climate models (GCMs) (Almazroui et al. 2020; Carvalho et al. 2021; Lovino et al. 2021). Since the performance of climate models varies from region to region (Almazroui et al. 2021), evaluating the performance of climate models in simulating climate variables for a specific area or region is indispensable for better understanding future climate and its impacts, as well as for planning effective adaptation strategies (Adib et al. 2022). Nevertheless, previous performance evaluations in many areas of the world were undertaken mostly for rainfall (Agel and Barlow 2020; Akinsanola et al. 2021; Cui et al. 2021; Rivera and Arnould 2020; Tewari et al. 2022; Zamani et al. 2020). In addition, the majority of previous climate change impact studies were based on GCM products from the third or fifth phase of the Coupled Model Intercomparison Project (CMIP3 or CMIP5) (Hirpa et al. 2019). The GCM products of the sixth phase of CMIP (CMIP6) are now available for wider applications, and studies have indicated that CMIP6 GCMs show improved performance compared to CMIP5 GCMs (Bağçaci et al. 2021; Fan et al. 2020; Wang et al. 2021; Zamani et al. 2020). Furthermore, previous studies undertaken elsewhere mostly evaluated the models against reference datasets other than observed data (Abbasian et al. 2018; Ajibola et al. 2020; Belazreg et al. 2022). However, the relative performance of CMIP6 models varies with the reference dataset (Akinsanola et al. 2021).

In spite of the advantages of using the latest versions of climate models for understanding future climate and climate change impacts, studies on CMIP6 GCMs are very limited in Ethiopia. The studies undertaken on the performance evaluation of CMIP6 GCMs include Alaminie et al. (2021) in the Upper Blue Nile Basin and Sime and Dibaba (2023) in the Awash Basin. Comparatively, a higher number of studies on the latest CMIP6 GCM products are available elsewhere. Most of these studies concentrate on the performance evaluation of CMIP6 GCMs for simulating rainfall (Guo et al. 2021; Iqbal et al. 2020), the comparison of CMIP5 and CMIP6 GCMs (Fan et al. 2020; Gusain et al. 2020; Wang et al. 2021), and the evaluation of CMIP6 GCMs for simulating climate extremes (Akinsanola et al. 2021; John et al. 2022). In Ethiopia, preceding climate model evaluation studies emphasized the capability of the models for simulating rainfall (Worku et al. 2019; Sime and Dibaba 2023). However, recent studies that evaluated the performance of RCMs in the Guder sub-basin (Demessie et al. 2023) and GCMs in the Upper Blue Nile Basin (Alaminie et al. 2021) indicated that the best-performing model for rainfall did not perform well for temperature, which underscores the need to evaluate climate models for both rainfall and temperature. Moreover, most of the earlier studies in Ethiopia and elsewhere in the world did not evaluate the models across the daily to annual temporal scales (Bağçaci et al. 2021; Belazreg et al. 2022). In addition, previous climate change studies in Ethiopia focused on rainfall and temperature patterns only at the watershed, sub-basin or study area scale (Alaminie et al. 2021; Worku et al. 2019). However, the patterns of future climate change may differ among the agro-ecological zones (AEZs) of a study region.
Therefore, a systematic evaluation of CMIP6 GCMs for simulating observed rainfall and maximum and minimum temperatures across the full range of temporal scales (daily to annual), together with projections of future climate change using the best performing CMIP6 GCMs for the different AEZs, is necessary for assessing the impacts of climate change and for planning suitable adaptation strategies for each AEZ.

The Bale Eco-Region (BER), the study area, is one of the biodiversity hotspot areas of Ethiopia in particular and the Horn of Africa in general. The BER is labeled a water tower because of its several springs, which flow into the international Genale Dawa and Wabi Shebele Rivers. However, ongoing climate change, together with deforestation and agricultural land expansion, is threatening the BER. Although studies are required to identify the best performing climate models among the CMIP6 GCM products and to project rainfall and temperature using those models, efforts of this kind are very limited in the study region. Therefore, this study aims to evaluate the performance of CMIP6 GCMs in simulating rainfall and temperature from the daily to annual temporal scales, and to project these climate variables for the different AEZs as well as at the BER scale using the best performing GCM products under three climate change scenarios.

2 Materials and methods

2.1 Description of the study area

The BER covers approximately 38,036.18 km² and lies in the Genale Dawa (83.8%) and Wabi Shebele (16.2%) River Basins (Fig. 1). The study area contains numerous springs, which flow into the transnational Genale Dawa and Wabi Shebele Rivers. Geographically, the BER is located between 5°28′–7°44′N and 38°32′–41°26′E, and its elevation varies from 241 to 4373 m a.s.l. (Fig. 1). According to the traditional AEZ classification of Ethiopia, which is based on elevation and climate, the BER contains five AEZs: alpine, temperate, sub-tropical, tropical and desert (Fig. 1). Agriculture, which comprises crop cultivation and livestock production, is the main livelihood in the sub-tropical, temperate and some parts of the alpine AEZs. In the tropical AEZ, however, animal rearing is most common, while the desert AEZ has no human settlement of any kind.

Fig. 1 Location of the Bale Eco-Region in Ethiopia and the meteorological stations used in this study

The climate of the study area is highly influenced by the low-level easterly winds from the Indian Ocean as well as the movements of the Intertropical Convergence Zone (ITCZ). The BER is characterized by two distinct seasons: a wet season from March to October and a dry season from November to February (Hailemariam Sisay et al. 2015; Muhammed and Elias 2021). The wet season in the BER comprises the small rains from March to June and the heavy rains from July to October (Muhammed and Elias 2021). Both the small and heavy rains are vital for the livelihood of the population in the study area. Based on the studied meteorological stations (Table 2), the long-term mean (1995–2014) annual rainfall of the BER varies from 447 to 1409 mm. The long-term mean (1995–2014) annual maximum and minimum temperatures range from 17.1 to 31.7 °C and from 4.3 to 19 °C, respectively.

Table 1 The list of CMIP6 GCMs employed in this study and their characteristics

2.2 CMIP6 GCMs dataset and observations

This study evaluated the historical (1995–2014) simulations of ten CMIP6 GCMs (Table 1) for rainfall, maximum temperature (Tmax) and minimum temperature (Tmin). The daily data of these GCMs were obtained from the Earth System Grid Federation (ESGF) database at https://esgf-node.llnl.gov/projects/cmip6/. Detailed descriptions of the chosen GCMs are given in Table 1. The GCMs were selected based on the availability of daily data and their good performance in different parts of Ethiopia (Alaminie et al. 2021), East Africa (Gebrechorkos et al. 2023) and elsewhere in the world (Iqbal et al. 2021; Ngoma et al. 2021). The models were also selected based on the wide application of their earlier versions (CMIP5) in different parts of Ethiopia (Worku et al. 2019). For example, BCC-CSM2-MR and MRI-ESM2-0 were included because they performed best for simulating rainfall and temperature, respectively, in the Upper Blue Nile Basin of Ethiopia (Alaminie et al. 2021). MRI-ESM2-0, EC-Earth3 and EC-Earth3-Veg were selected because they demonstrated the best performance in mainland Southeast Asia (Iqbal et al. 2021). This study used the r1i1p1f2 and r1i1p1f3 variant labels for the CNRM-CM6-1 and HadGEM3-GC31-LL GCMs, respectively, and the r1i1p1f1 variant label for the remaining models (Almazroui et al. 2020, 2021) (Table 1).

The study area contains eleven rainfall and ten temperature meteorological stations, but only eight rainfall and six temperature stations have less than 20% missing data for the study period. These observed climate data were obtained from the Ethiopian Meteorology Institute (EMI). For stations with less than 20% missing data, the missing values were filled using the Multivariate Imputation by Chained Equations (MICE) package in R software (Buuren et al. 2022). The quality of the data was checked using the RClimDex package in R software (Zhang and Yang 2004), and erroneous values such as negative rainfall and Tmin exceeding Tmax were replaced by values from the nearby station. Outliers were treated following the procedures outlined by the World Meteorological Organization (WMO 2009). For stations with more than 20% missing data, the missing values were filled using the best fitting reanalysis rainfall and temperature products for each AEZ, following the findings of a previous study in the BER (Gashaw et al. 2023). Accordingly, the missing values of rainfall stations were filled with the Multi-Source Weighted-Ensemble Precipitation, version 2.8 (MSWEP v2.8). The study used the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA v2) to fill the missing Tmax and Tmin values of stations in the temperate and tropical AEZs, and ERA5 for stations in the sub-tropical AEZ. In addition, MSWEP v2.8 was used as the rainfall data for two additional stations, Station 11 and Station 12, which represent the tropical and alpine AEZs, respectively (Table 2). Similarly, MERRA v2 and ERA5 were used as the temperature data for Station 11 and Station 12. Therefore, the study used 12 observed and reanalysis rainfall and temperature series for evaluating the CMIP6 GCMs in the BER. Detailed descriptions of MSWEP v2.8, MERRA v2 and ERA5 are given in Gashaw et al. (2023).

Table 2 The meteorological stations used in this study and their corresponding AEZs

2.3 Evaluation methods

The performance of the CMIP6 models was evaluated after extracting the CMIP6 GCM data at the 12 meteorological stations of the study area (Table 2). Thereafter, the station (point) data were converted into areal average rainfall, Tmax and Tmin values using the Thiessen polygon method in ArcGIS 10.7. The performance of the GCMs was evaluated from the daily to annual temporal scales by comparing the areal average rainfall, Tmax and Tmin of the studied CMIP6 GCMs against the observed data for the 1995–2014 period. This approach has also been implemented in other climate studies (Demessie et al. 2023; Tadese et al. 2019; Worku et al. 2018). The GCMs were evaluated using the correlation coefficient (R), mean absolute error (MAE), root mean square error (RMSE) and percent bias (PBIAS) (Yazdandoost et al. 2021), all computed in R software. Comparable to this study, previous studies undertaken in Ethiopia (Demessie et al. 2023; Dibaba et al. 2019; Worku et al. 2018) and elsewhere in the world (Guo et al. 2021) have also used R, RMSE and PBIAS for evaluating the performance of climate models. Rivera and Arnould (2020) also employed the mean bias error, RMSE and PBIAS in southwestern South America. Full descriptions of the applied performance statistics are given in Belazreg et al. (2022), Jose and Dwarakish (2021) and Yazdandoost et al. (2021). In this study, the long-term mean (1995–2014) annual and monthly rainfall, Tmax and Tmin, as well as the annual values of these climate variables over the study period, are also presented for visual comparison of the CMIP6 GCMs against the observed data.
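As an illustration only, the four statistics named above can be sketched in Python (the study itself computed them in R software). The function name `evaluation_metrics` is hypothetical, and the sign convention for PBIAS (positive values indicating overestimation by the model) is an assumption; the cited references should be consulted for the exact formulations used in the study.

```python
import math


def evaluation_metrics(sim, obs):
    """Return R, MAE, RMSE and PBIAS for paired simulated and observed series.

    Assumption: PBIAS = 100 * sum(sim - obs) / sum(obs), so a positive value
    means the model overestimates the observations.
    """
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have the same length (at least 2)")
    n = len(obs)
    mean_sim = sum(sim) / n
    mean_obs = sum(obs) / n
    # Pearson correlation coefficient from the covariance and the variances
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs))
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    if var_sim == 0 or var_obs == 0:
        raise ValueError("R is undefined for a constant series")
    r = cov / math.sqrt(var_sim * var_obs)
    mae = sum(abs(s - o) for s, o in zip(sim, obs)) / n
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / n)
    pbias = 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)
    return {"R": r, "MAE": mae, "RMSE": rmse, "PBIAS": pbias}
```

The same function can be applied to the daily, monthly, seasonal or annual series once the data have been aggregated to the temporal scale of interest.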

The overall ranking of the ten GCMs considering all statistical metrics (i.e., R, MAE, RMSE and PBIAS) was undertaken following the Comprehensive Rating Index (CRI) method, which has also been implemented in previous studies (Dong and Dong 2021; Guo et al. 2021; Jiang et al. 2020; Rivera and Arnould 2020; Zhang et al. 2018a, b). The CRI is computed using Eq. 1.

$$CRI=1-\frac{1}{nm}\sum _{i=1}^{n}{Rank}_{i}$$
(1)

where n is the number of statistical performance measures used in this study for evaluating the models (5), m is the number of CMIP6 GCMs evaluated in the study (10), and Rank_i is the rank of a CMIP6 GCM for the i-th performance measure, which ranges from 1 for the best performing model to 10 for the poorest performing model. The closer the CRI is to 1, the better the performance of the model (Jiang et al. 2020; Zhang et al. 2018a, b).
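A minimal sketch of Eq. 1 in Python is given below for illustration. The helper names `rank_models` and `comprehensive_rating_index` are hypothetical, and the treatment of ties (tied models share the better rank) is an assumption, since the handling of ties is not described here. For PBIAS, the absolute value would be passed so that a smaller magnitude ranks better.

```python
def rank_models(scores, higher_is_better=False):
    """Rank models for one metric (1 = best).

    Assumption: tied models share the better rank.
    """
    ranks = {}
    for model, value in scores.items():
        if higher_is_better:
            n_better = sum(1 for other in scores.values() if other > value)
        else:
            n_better = sum(1 for other in scores.values() if other < value)
        ranks[model] = 1 + n_better
    return ranks


def comprehensive_rating_index(ranks_by_metric):
    """Apply Eq. 1: CRI = 1 - (1 / (n * m)) * sum of the ranks over n metrics."""
    metrics = list(ranks_by_metric)
    n = len(metrics)                          # number of performance measures
    models = list(ranks_by_metric[metrics[0]])
    m = len(models)                           # number of evaluated models
    return {
        model: 1.0 - sum(ranks_by_metric[k][model] for k in metrics) / (n * m)
        for model in models
    }
```

With this formulation, a model ranked first for every metric obtains CRI = 1 − 1/m, which is 0.9 for ten models.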

2.4 Projections of future climate

Distribution Mapping (DM), which is available in the Climate Model data for hydrologic modeling (CMhyd) tool (Rathjens et al. 2016), was used for bias correction of the projections of the best performing GCMs under the different climate change scenarios. DM was selected for bias correction of GCM rainfall and temperature based on the suggestion of Worku et al. (2019), who compared several bias-correction techniques available in CMhyd in the Jemma sub-basin, Upper Blue Nile Basin of Ethiopia, and recommended DM for bias correction of rainfall and temperature products. Another study in the Awash Basin of Ethiopia also found that DM is suitable for bias correction of climate model temperature (Tadese et al. 2019). Owing to its better performance, DM has also been employed in studies undertaken elsewhere in the world (de Carvalho et al. 2021; Yeboah et al. 2022; Zhang et al. 2018a, b). This study bias corrected the best performing CMIP6 GCMs after extracting the data for each station; bias correction was performed by providing the daily observed historical (1995–2014), raw GCM historical (1995–2014) and raw GCM future (2015–2100) data of each station as text files.
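The idea behind distribution mapping can be illustrated with a short Python sketch for temperature. This is not the CMhyd implementation: it assumes that a Gaussian distribution is fitted to the observed and to the raw GCM historical series, and that each future GCM value is mapped through the GCM cumulative distribution onto the observed distribution. The function name `gaussian_distribution_mapping` is hypothetical, and a single distribution is fitted to the whole series for brevity, whereas such corrections are often fitted separately for each calendar month. Rainfall would typically require a different distribution (for example a gamma distribution), which is not shown.

```python
from statistics import NormalDist, mean, stdev


def gaussian_distribution_mapping(obs_hist, gcm_hist, gcm_future):
    """Bias correct temperature by mapping GCM quantiles onto the observed
    Gaussian distribution (illustrative sketch, not the CMhyd code)."""
    obs_dist = NormalDist(mean(obs_hist), stdev(obs_hist))
    gcm_dist = NormalDist(mean(gcm_hist), stdev(gcm_hist))
    corrected = []
    for value in gcm_future:
        # Non-exceedance probability of the raw value in the GCM climate
        p = gcm_dist.cdf(value)
        # inv_cdf requires 0 < p < 1, so extreme values are clipped slightly
        p = min(max(p, 1e-6), 1.0 - 1e-6)
        # Value with the same probability in the observed climate
        corrected.append(obs_dist.inv_cdf(p))
    return corrected
```

For example, if the raw GCM historical series is 3 °C warmer than the observations but has the same spread, the corrected future values are shifted down by 3 °C.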

In this study, future climate was analyzed for the near future (2021–2040), mid-century (2041–2060) and late-century (2081–2100) periods (Belazreg et al. 2022; Jiang et al. 2020; Lovino et al. 2021) under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 climate change scenarios across the AEZs and at the BER level. The SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios represent the medium-forcing, medium to high forcing and high-end-forcing pathways, respectively. These scenarios were chosen because all three are possible climate change trajectories. On the other hand, the scenario not considered in this study (SSP1-2.6, representing the low climate forcing pathway) is unlikely to happen considering the current global actions being undertaken to mitigate climate change under the Paris Agreement. For this reason, future climate change analyses as well as bias corrections of global datasets have been performed mainly for the SSP2-4.5 and SSP5-8.5 scenarios (Xu et al. 2021; Gebrechorkos et al. 2023). For instance, the bias-corrected global dataset of Xu et al. (2021), which is based on 18 CMIP6 GCMs, was produced for the SSP2-4.5 and SSP5-8.5 scenarios. In this study, we included SSP3-7.0 to examine the projected climate changes between the SSP2-4.5 and SSP5-8.5 scenarios. Future climate change for each AEZ was analyzed by taking the average values of the stations found in that AEZ. At the BER level, future climate change was analyzed using the Thiessen polygon weights of the 12 stations, which give the areal average values of the studied climate variables.
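The two aggregation steps described above can be sketched in Python as follows. The function names, station identifiers, polygon areas and values used with them are hypothetical and serve only to show the arithmetic: an area-weighted (Thiessen) mean for the BER level and a simple arithmetic mean of the stations within each AEZ.

```python
def thiessen_areal_average(station_values, polygon_areas):
    """Area-weighted mean of station values using Thiessen polygon areas."""
    total_area = sum(polygon_areas[s] for s in station_values)
    weighted_sum = sum(station_values[s] * polygon_areas[s] for s in station_values)
    return weighted_sum / total_area


def aez_means(station_values, station_aez):
    """Arithmetic mean of the station values within each AEZ."""
    groups = {}
    for station, value in station_values.items():
        groups.setdefault(station_aez[station], []).append(value)
    return {aez: sum(values) / len(values) for aez, values in groups.items()}
```

A station with a large polygon therefore contributes more to the BER-level value than to the mean of its own AEZ, which is why the two levels of analysis can give different results.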

3 Results and discussion

3.1 Performance of CMIP6 GCMs for simulating rainfall

The spatial distributions of the long-term mean (1995–2014) annual rainfall of the studied CMIP6 GCMs over the BER are presented in Fig. 2, and the results show a clear difference between the models. Of the studied GCMs, ACCESS-ESM1-5, BCC-CSM2-MR, CNRM-CM6-1, MRI-ESM2-0 and GFDL-ESM4 more or less represented the long-term mean (1995–2014) annual rainfall over the BER. In contrast, the spatial rainfall representations of the remaining CMIP6 GCMs are relatively poor. The findings indicate that the models contain varying degrees of estimation bias in different regions of the BER. Among the best performing models, BCC-CSM2-MR overestimated rainfall in low elevation areas and underestimated it in high elevation areas. In contrast, CNRM-CM6-1 underestimated rainfall in low elevation areas and overestimated it in high elevation areas. MRI-ESM2-0 and ACCESS-ESM1-5 also overestimated rainfall in high elevation areas, and GFDL-ESM4 overestimated it in low elevation areas. Among the poorly performing models, MPI-ESM1-2-LR and HadGEM3-GC31-LL exhibited a higher underestimation bias in the majority of the study area, whereas INM-CM4-8 overestimated rainfall in the low elevation areas (Fig. 2). EC-Earth3-Veg and EC-Earth3 also revealed a higher overestimation bias in many high elevation areas.

Fig. 2 The long-term mean (1995–2014) annual rainfall representations of the studied CMIP6 GCMs and observed data over the BER

Analogously, the climate models that are relatively good at representing the long-term mean (1995–2014) monthly rainfall are BCC-CSM2-MR, CNRM-CM6-1, MRI-ESM2-0, MPI-ESM1-2-LR and EC-Earth3 (Fig. 3A). In contrast, the representations of the remaining five models are relatively weak. Among the poorly performing models, HadGEM3-GC31-LL and INM-CM4-8 did not maintain the pattern of the observed rainfall for most months. In addition, although GFDL-ESM4 is one of the best performing models for simulating the long-term mean annual rainfall in the BER, its performance is relatively weak for representing the mean monthly rainfall. The results also revealed that the studied GCMs contain biases of varying direction. Of the best performing models, BCC-CSM2-MR, CNRM-CM6-1, MRI-ESM2-0 and EC-Earth3 underestimated rainfall in some months and overestimated it in the remaining months. MPI-ESM1-2-LR, which is one of the poorly performing models, showed an underestimation bias in all months (Fig. 3A).

Fig. 3 The long-term mean (1995–2014) monthly rainfall (A), Tmax (B) and Tmin (C) of the CMIP6 GCMs and observed data in the BER

The findings of this study indicate that even the best performing models contain large biases, as clearly seen in the long-term mean (1995–2014) annual and monthly rainfall figures (Figs. 2 and 3A). This is mainly attributed to the coarse resolution of the climate models and to the fact that their biases have not been removed through statistical or dynamical bias correction. Consistent with the findings of this study, previous studies reported similar results (Worku et al. 2019; Xu et al. 2021). For example, Xu et al. (2021), who bias corrected a global dataset based on 18 CMIP6 GCMs using ERA5, indicated that the raw GCMs contain large biases. Additionally, the graphical comparisons of bias corrected and uncorrected RCMs by Worku et al. (2019) in the Jemma sub-basin (Ethiopia) revealed that the uncorrected RCMs contain large biases. The large bias exhibited even by the best performing climate models indicates the necessity of bias correcting GCMs or RCMs before using them for further applications such as future climate change analysis or climate change impact and adaptation studies.

The performance of the ten CMIP6 GCMs in simulating the daily and monthly rainfall climatology of the BER is shown in Table 3. According to the CRI, CNRM-CM6-1 is the best performing model for simulating daily rainfall, followed by MPI-ESM1-2-LR. MRI-ESM2-0 and EC-Earth3 are jointly ranked third. BCC-CSM2-MR ranked fifth for simulating daily rainfall in the BER. In contrast, GFDL-ESM4 and INM-CM4-8 are the poorest performing climate models. At the monthly temporal scale, BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0 are the first, second and third ranked models, respectively (Table 3). MPI-ESM1-2-LR and EC-Earth3 are the fourth and fifth ranked models. The poorest performing climate model for simulating monthly rainfall is INM-CM4-8, followed by GFDL-ESM4 and HadGEM3-GC31-LL (Table 3).

Table 3 Performance of CMIP6 GCMs for simulating rainfall in the BER at the daily and monthly temporal scales

Table 4 presents the performance of the climate models for simulating wet season and annual rainfall over the BER. The findings indicate that BCC-CSM2-MR is the best performing model during the wet season, followed by MRI-ESM2-0 and CNRM-CM6-1 (Table 4). ACCESS-ESM1-5 and EC-Earth3 are the fourth and fifth ranked models during the wet season. With reference to the capability of the studied climate models for simulating annual rainfall, ACCESS-ESM1-5 is the best performing model, although it ranked fourth at the wet season temporal scale. BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0 occupy the second to fourth ranks. In contrast, MPI-ESM1-2-LR is the poorest performing model at both the wet season and annual temporal scales. HadGEM3-GC31-LL and INM-CM4-8 are the other poorly performing models at the wet season and annual temporal scales (Table 4).

Table 4 Performance of CMIP6 GCMs for simulating wet season and annual rainfall in the BER

The results of this study revealed that MPI-ESM1-2-LR is the second and fourth ranked model for simulating the daily and monthly rainfall over the BER, respectively (Table 3). However, MPI-ESM1-2-LR is the poorest performing climate model for simulating wet season and annual rainfall (Table 4). In addition, although EC-Earth3 is the third ranked model for simulating daily rainfall, it ranked fifth, fifth and sixth for representing monthly, wet season and annual rainfall, respectively (Tables 3 and 4). In the same manner, ACCESS-ESM1-5 showed the best performance for simulating annual rainfall, but it ranked sixth, sixth and fourth at the daily, monthly and wet season temporal scales, respectively. On the other hand, some models such as EC-Earth3-Veg showed neither the best nor the poorest performance at any of the studied temporal scales, ranking seventh to eighth throughout. The difference in the performance of the studied climate models across temporal scales suggests the necessity of evaluating climate models over the full range of temporal scales (daily to annual) in order to support better decision making.

In general, the three models that demonstrated relatively better performance for simulating rainfall over the BER across the studied temporal scales are BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0. Consistent with the findings of this study, BCC-CSM2-MR was also the best performing model for simulating monthly rainfall in the Upper Blue Nile Basin of Ethiopia (Alaminie et al. 2021). Elsewhere, however, other models performed better: Guo et al. (2021) reported that EC-Earth3 and EC-Earth3-Veg are better than MRI-ESM2-0 for simulating annual rainfall in arid Central Asia. EC-Earth is also amongst the best performing models for rainfall in the sub-Himalayan region of Pakistan (Iqbal et al. 2020). GFDL-ESM4 also performed better than BCC-CSM2-MR for simulating monthly rainfall over Algeria (Belazreg et al. 2022).

The findings of this study also indicate that although EC-Earth3 and EC-Earth3-Veg have a finer spatial resolution than the remaining eight models, they are not among the three models that consistently showed better performance for simulating rainfall in the BER from the daily to annual temporal scales. In contrast, BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0, which have a coarser spatial resolution than EC-Earth3 and EC-Earth3-Veg, performed better for simulating rainfall over the BER at the different temporal scales. The findings clearly show that spatial resolution is not the only factor that determines how well a climate model performs. Therefore, the better performance of BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0 for simulating rainfall in the BER, compared to the climate models with finer resolution, could be attributed to the parameterization of the models.

With reference to the estimation bias of the climate models, HadGEM3-GC31-LL and MPI-ESM1-2-LR showed an underestimation bias at the daily, monthly, wet season and annual temporal scales. However, the remaining eight models displayed an overestimation bias from the daily to annual temporal scales (Tables 3 and 4). MPI-ESM1-2-LR showed the highest bias at the four considered temporal scales, ranging from 44.9 to 47.9% (Tables 3 and 4). On the other hand, ACCESS-ESM1-5, MRI-ESM2-0 and BCC-CSM2-MR demonstrated the lowest PBIAS from the daily to annual temporal scales (Tables 3 and 4).

3.2 Performance of CMIP6 GCMs for simulating maximum temperature

Figure 4 displays the long-term mean (1995–2014) annual Tmax distributions of the studied GCMs and observed data over the BER. The results revealed that EC-Earth3-Veg, EC-Earth3 and MPI-ESM1-2-LR are relatively better at simulating the long-term mean (1995–2014) annual Tmax distribution over the BER. The other relatively better performing models for representing the spatial distribution of the long-term mean annual Tmax are MRI-ESM2-0 and BCC-CSM2-MR, although BCC-CSM2-MR overestimated Tmax in many parts of the study area. On the other hand, HadGEM3-GC31-LL, ACCESS-ESM1-5 and GFDL-ESM4 are among the poorly performing models for simulating the mean annual Tmax. The findings also indicate that EC-Earth3-Veg and EC-Earth3 overestimated Tmax in some of the high elevation areas, whereas MPI-ESM1-2-LR overestimated and underestimated Tmax in some of the high and low elevation areas, respectively. The poorly performing model HadGEM3-GC31-LL exhibited an overestimation bias in most of the BER. CNRM-CM6-1 also showed an underestimation bias in many parts of the low elevation areas. Moreover, some models such as ACCESS-ESM1-5 did not maintain the spatial pattern of the observed Tmax in the study area; ACCESS-ESM1-5 generally exhibited an overestimation bias in considerable areas of the BER.

Fig. 4 The spatial distributions of the long-term mean (1995–2014) annual Tmax of the considered CMIP6 GCMs and observed data over the BER

The graphical comparison of the studied GCMs for simulating the long-term mean (1995–2014) monthly Tmax over the BER is illustrated in Fig. 3B. The results indicate that the models that are relatively better at simulating the mean monthly Tmax are EC-Earth3-Veg, EC-Earth3 and MPI-ESM1-2-LR (Fig. 3B). MRI-ESM2-0 and INM-CM4-8 are the other well performing models for simulating mean monthly Tmax. In contrast, HadGEM3-GC31-LL, ACCESS-ESM1-5, BCC-CSM2-MR, GFDL-ESM4 and CNRM-CM6-1 are the poorly performing models for simulating the long-term mean monthly Tmax over the BER. CNRM-CM6-1 and GFDL-ESM4 underestimated the observed Tmax in all 12 months, whereas HadGEM3-GC31-LL and ACCESS-ESM1-5 (except for September) showed an overestimation bias (Fig. 3B).

The performance of the studied CMIP6 GCMs for estimating the daily and monthly Tmax over the BER is shown in Table 5. At the daily temporal scale, EC-Earth3-Veg and EC-Earth3 are the first and second ranked models, respectively. MPI-ESM1-2-LR is the third best performing model, followed by INM-CM4-8. MRI-ESM2-0 ranked fifth for estimating daily Tmax in the study area. At the monthly temporal scale, EC-Earth3-Veg, EC-Earth3 and MPI-ESM1-2-LR are equally the best performing models (Table 5). MRI-ESM2-0 and INM-CM4-8 are the fourth and fifth ranked models. Of the studied GCMs, HadGEM3-GC31-LL is the poorest performing model at both the daily and monthly temporal scales. Next to HadGEM3-GC31-LL, BCC-CSM2-MR at the daily temporal scale, and BCC-CSM2-MR and ACCESS-ESM1-5 at the monthly temporal scale, were the other poorly performing models. Although BCC-CSM2-MR and CNRM-CM6-1 are among the best performing models for simulating daily and monthly rainfall, they showed poor performance for estimating Tmax.

Table 5 Performance of CMIP6 GCMs for simulating daily and monthly Tmax in the study area

Similar to this finding, Abbasian et al. (2018) in Iran and Demessie et al. (2023) in the Guder sub-basin, Ethiopia, also indicated that the best performing climate model for rainfall did not perform best for simulating temperature. The study of Alaminie et al. (2021) in the Upper Blue Nile Basin also reported that BCC-CSM2-MR and MRI-ESM2-0 are the best performing GCMs for rainfall and temperature, respectively, indicating that the same model did not perform well for both variables. The study of Belazreg et al. (2022) in Algeria also demonstrated that IPSL-CM6A-LR is the best performing CMIP6 model for simulating monthly rainfall, whereas GFDL-ESM4 showed superior performance for simulating Tmax. The findings of our study and of the above studies clearly indicate the need to evaluate climate models for each climate variable in order to support decision making through better research outputs.

Table 6 presents the performance of the ten CMIP6 GCMs for estimating the wet season, dry season and annual Tmax in the BER. The findings indicate that MPI-ESM1-2-LR and INM-CM4-8 are equally the best performing models for estimating Tmax during the wet season. EC-Earth3-Veg and EC-Earth3 ranked as the third and fourth best performing models, respectively, during the wet season. In contrast, HadGEM3-GC31-LL followed by ACCESS-ESM1-5 are the poorest climate models for estimating wet season Tmax. With reference to the ability of the studied GCMs to estimate Tmax during the dry season, GFDL-ESM4 showed the best performance, although it ranked seventh during the wet season. As in the wet season, EC-Earth3-Veg and EC-Earth3 are among the best performing climate models for estimating Tmax during the dry season. On the contrary, HadGEM3-GC31-LL followed by BCC-CSM2-MR are the poorest performing climate models for estimating dry season Tmax. At the annual temporal scale, EC-Earth3-Veg was among the best performing models. In addition, EC-Earth3 and MPI-ESM1-2-LR were equally the second ranked models for simulating annual Tmax. HadGEM3-GC31-LL displayed the poorest performance from the daily to annual temporal scales. Some models showed the best performance in one season only: INM-CM4-8 only during the wet season and GFDL-ESM4 only during the dry season. In general, EC-Earth3-Veg, EC-Earth3 and MPI-ESM1-2-LR are the three best models, which consistently showed better performance for simulating Tmax from the daily to annual temporal scales.

Table 6 Performance of CMIP6 GCMs for simulating seasonal (wet and dry seasons) and annual Tmax in the BER

Regarding the estimation bias of the considered CMIP6 GCMs, CNRM-CM6-1, EC-Earth3 and GFDL-ESM4 showed underestimation bias at the daily, monthly and annual temporal scales. EC-Earth3-Veg displayed a PBIAS of zero at the daily, monthly and annual scales. In contrast, the remaining six models overestimated the observed Tmax at these temporal scales (Tables 5 and 6). During the wet season, CNRM-CM6-1 and GFDL-ESM4 underestimated Tmax, while the other eight models overestimated it. During the dry season, on the other hand, five models (CNRM-CM6-1, EC-Earth3, EC-Earth3-Veg, GFDL-ESM4 and MRI-ESM2-0) showed underestimation bias, and the remaining five exhibited overestimation bias (Table 6). The result indicates that the direction of bias in some climate models changes across temporal scales. The result also shows that HadGEM3-GC31-LL and ACCESS-ESM1-5 exhibited the highest estimation bias from the daily to annual temporal scales. In contrast, EC-Earth3-Veg and MRI-ESM2-0 exhibited the lowest estimation bias at the daily, monthly and annual temporal scales (Tables 5 and 6). Additionally, INM-CM4-8 and MPI-ESM1-2-LR during the wet season, and MPI-ESM1-2-LR and GFDL-ESM4 during the dry season, displayed the lowest bias (Table 6).
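
To make the sign convention of the bias discussion concrete, the following minimal Python sketch computes a percent bias (PBIAS). It assumes the common definition PBIAS = 100 × Σ(simulated − observed) / Σ(observed), under which positive values indicate overestimation; the exact formula used in this study may differ. The function name `pbias` and all temperature values are hypothetical and serve only as an illustration.

```python
import numpy as np

def pbias(simulated, observed):
    """Percent bias; positive means the model overestimates on average.

    Assumed definition: 100 * sum(sim - obs) / sum(obs).
    """
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return 100.0 * np.sum(simulated - observed) / np.sum(observed)

# Hypothetical Tmax values (deg C) for four months, for illustration only.
observed = np.array([24.0, 25.0, 26.0, 25.0])
warm_model = observed + 1.0                              # overestimates every month
cool_model = observed - 0.5                              # underestimates every month
mixed_model = observed + np.array([1.0, 1.0, -1.0, -1.0])  # errors cancel over the year

print(round(pbias(warm_model, observed), 2))   # 4.0
print(round(pbias(cool_model, observed), 2))   # -2.0
print(round(pbias(mixed_model, observed), 2))  # 0.0

# The same model can have opposite bias signs in different sub-periods.
first_half = pbias(mixed_model[:2], observed[:2])   # positive
second_half = pbias(mixed_model[2:], observed[2:])  # negative
```

The `mixed_model` case illustrates why a PBIAS near zero at the annual scale does not rule out seasonal biases of opposite sign, which is one way the direction of bias can change between temporal scales.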

3.3 Performance of CMIP6 GCMs for simulating minimum temperature

The performance of the ten studied GCMs in simulating the long-term mean (1995–2014) annual Tmin in the BER is shown in Fig. 5. The result illustrates that EC-Earth3-Veg, CNRM-CM6-1 and EC-Earth3 are relatively better at simulating the mean annual Tmin. Of these models, CNRM-CM6-1 overestimated Tmin in some high elevation areas and underestimated it in many parts of the low elevation areas (Fig. 5). In contrast, EC-Earth3-Veg and EC-Earth3 did not capture the lowest Tmin values, which occur in the high elevation areas, and thus overestimated Tmin there. However, both EC-Earth3-Veg and EC-Earth3 were capable of simulating the highest Tmin values, which occur in the low elevation areas. Similarly, MPI-ESM1-2-LR, ACCESS-ESM1-5 and INM-CM4-8, which are classified as underperforming models, tended to overestimate the long-term mean annual Tmin over the BER. The other models, namely BCC-CSM2-MR, GFDL-ESM4, HadGEM3-GC31-LL and MRI-ESM2-0, showed a similar overestimation bias in both high and low elevation areas (Fig. 5).

Fig. 5 The long-term mean (1995–2014) annual Tmin representations of the studied CMIP6 GCMs and observed data over the study area

The long-term mean (1995–2014) monthly Tmin values of the studied GCMs over the BER are shown in Fig. 4C. The result indicates that CNRM-CM6-1 exhibited relatively better performance for simulating the mean monthly Tmin over the BER. Among the models considered, EC-Earth3-Veg and EC-Earth3 also stand out as comparatively better. In addition, HadGEM3-GC31-LL and MRI-ESM2-0 showed relatively better performance than the remaining five GCMs. In contrast, MPI-ESM1-2-LR, INM-CM4-8 and ACCESS-ESM1-5 were the poorest performing GCMs. With reference to estimation bias, CNRM-CM6-1 overestimated mainly in June, whereas MPI-ESM1-2-LR, INM-CM4-8, ACCESS-ESM1-5 and MRI-ESM2-0 overestimated in all 12 months. On the other hand, some GCMs, such as EC-Earth3-Veg, EC-Earth3, GFDL-ESM4 and HadGEM3-GC31-LL, showed varying directions of bias in different months, although they overestimated in the majority of months (Fig. 4C).

The performance of the CMIP6 GCMs for simulating Tmin in the BER at daily and monthly time scales is presented in Table 7. CNRM-CM6-1, EC-Earth3-Veg and EC-Earth3 rank first, second and third, respectively, at both the daily and monthly temporal scales, and MRI-ESM2-0 and GFDL-ESM4 rank fourth and fifth at these time scales. On the other hand, although MPI-ESM1-2-LR is one of the best performing models for simulating Tmax over the BER, it displayed the poorest performance for simulating Tmin at the daily and monthly temporal scales. BCC-CSM2-MR and INM-CM4-8 are also among the poorly performing climate models for estimating Tmin over the study area.

Table 7 Performance of CMIP6 GCMs for simulating Tmin in the BER at daily and monthly time scales

Although MRI-ESM2-0 is not among the best performing climate models for simulating either maximum or minimum temperature in the BER, a study undertaken in the Upper Blue Nile Basin of Ethiopia (Alaminie et al. 2021) indicated that this model showed superior performance for simulating average temperature. The evaluation by Demessie et al. (2023) of five RCMs downscaled using EC-Earth also indicated that HIRHAM5 and RACMO22T are relatively better at simulating maximum and minimum temperature, respectively, over the Guder sub-basin, Upper Blue Nile Basin. The findings of this study and of previous studies therefore underscore the need to evaluate climate models for Tmax and Tmin separately, since a single model may not represent both well.

Table 8 presents the performance of the studied CMIP6 GCMs for simulating Tmin at the wet season, dry season and annual temporal scales. In the wet season, CNRM-CM6-1, EC-Earth3 and HadGEM3-GC31-LL rank first, second and third, respectively, for estimating Tmin, and EC-Earth3-Veg and MRI-ESM2-0 rank fourth and fifth. During the dry season, EC-Earth3-Veg and CNRM-CM6-1 are the first and second best performing models, respectively, while EC-Earth3 and GFDL-ESM4 are jointly ranked third. At the annual temporal scale, EC-Earth3-Veg followed by CNRM-CM6-1 and EC-Earth3 are the best performing models, and HadGEM3-GC31-LL and MRI-ESM2-0 rank fourth and fifth. However, although MPI-ESM1-2-LR is one of the best performing models for estimating Tmax, it showed the poorest performance for estimating Tmin at the wet season, dry season and annual temporal scales. After MPI-ESM1-2-LR, INM-CM4-8 in both the wet and dry seasons and ACCESS-ESM1-5 at the annual scale are the other poorly performing models (Table 8). In general, CNRM-CM6-1, EC-Earth3 and EC-Earth3-Veg have consistently shown better performance for simulating Tmin at the studied temporal scales.

Table 8 Performance of CMIP6 GCMs for simulating Tmin at seasonal (wet and dry seasons) and annual temporal scales in the BER

Regarding the estimation bias of the climate models in the BER, CNRM-CM6-1 underestimated Tmin from the daily to annual temporal scales, and the remaining nine models overestimated the observed Tmin (Tables 7 and 8). In terms of magnitude, MPI-ESM1-2-LR followed by INM-CM4-8 exhibited the highest estimation bias from the daily to annual temporal scales, ranging from 36.2 to 51.6% and from 30.1 to 42.5%, respectively (Tables 7 and 8). In contrast, CNRM-CM6-1 followed by EC-Earth3-Veg exhibited the lowest estimation bias at the daily, monthly, dry season and annual temporal scales. In the wet season, however, CNRM-CM6-1 followed by HadGEM3-GC31-LL showed the lowest estimation bias.

The findings of this study indicate that CNRM-CM6-1, EC-Earth3 and EC-Earth3-Veg are the three best performing models for simulating Tmin over the BER. Although EC-Earth3 and EC-Earth3-Veg have a finer spatial resolution than the remaining eight studied models, the spatial resolution of CNRM-CM6-1 is coarser than that of BCC-CSM2-MR, GFDL-ESM4 and MRI-ESM2-0. In addition, the three best performing models for simulating Tmax over the study area are EC-Earth3, EC-Earth3-Veg and MPI-ESM1-2-LR. Of these models, EC-Earth3 and EC-Earth3-Veg have a finer spatial resolution, but MPI-ESM1-2-LR has a coarser one. On the other hand, the poorest performing model for Tmax (HadGEM3-GC31-LL) has a coarser resolution, yet it showed relatively better performance for Tmin at some of the studied temporal scales. Thus, further studies are required to determine whether spatial resolution affects the performance of climate models in simulating Tmax and Tmin.

3.4 Selection of best performing models for future climate change analysis

As mentioned earlier, BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0 have shown consistently better performance for simulating rainfall from the daily to annual temporal scales. EC-Earth3, EC-Earth3-Veg and MPI-ESM1-2-LR for simulating Tmax, and CNRM-CM6-1, EC-Earth3 and EC-Earth3-Veg for estimating Tmin, have also exhibited consistently better performance over the BER at the daily, monthly, wet season, dry season and annual temporal scales. Thus, this study evaluated the performance of the individual best performing models as well as the ensemble mean of the three best performing models chosen for rainfall, Tmax and Tmin (Tables S1–S3). In addition to the ensemble mean of the three models identified for Tmin, this study also evaluated the ensemble mean of two models (CNRM-CM6-1 and EC-Earth3-Veg). This two-model ensemble was considered because CNRM-CM6-1 followed by EC-Earth3-Veg showed superior performance at the considered temporal scales, so it was necessary to evaluate both the three-model ensemble and the ensemble of the two best models for Tmin.

The result revealed that the ensemble mean of the three best performing rainfall models (i.e., BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0) outperformed each of the individual models from the daily to annual temporal scales (Table S1). The ensemble mean of the three Tmax models (EC-Earth3, EC-Earth3-Veg and MPI-ESM1-2-LR) also exhibited improved performance at the daily, monthly, wet season and annual temporal scales compared with the individual models. In the dry season, however, MPI-ESM1-2-LR performed better for simulating Tmax than the ensemble mean of the three best performing models (Table S2). On this basis, the ensemble mean of the three Tmax models could be used for future climate change analysis. However, the ensemble mean of these models shows a negative correlation during the dry season, as do EC-Earth3-Veg and MPI-ESM1-2-LR individually. As a result, this study used EC-Earth3 for simulating future Tmax in the BER. With regard to Tmin, the ensemble mean of the two models (CNRM-CM6-1 and EC-Earth3-Veg) performed better than the individual models as well as the ensemble mean of the three models at all the studied temporal scales (Table S3). Therefore, this study used the ensemble mean of the three best performing models for rainfall and of two models for Tmin, whereas EC-Earth3 alone was used for Tmax in the future climate change analysis.
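
The comparison between ensembles of different sizes can be sketched in Python as follows. This is a minimal illustration, assuming an unweighted ensemble mean and using the Pearson correlation coefficient as the only skill score; the study itself relied on several metrics reported in Tables S1–S3. The helper names `ensemble_mean` and `pearson_r`, the placeholder model names and all data values are hypothetical.

```python
import numpy as np

def ensemble_mean(members):
    """Unweighted mean across models; `members` has shape (n_models, n_times)."""
    return np.mean(np.asarray(members, dtype=float), axis=0)

def pearson_r(simulated, observed):
    """Pearson correlation coefficient between two series."""
    return float(np.corrcoef(simulated, observed)[0, 1])

# Hypothetical Tmin series (deg C), for illustration only.
observed = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
model_a = observed + 0.5                           # follows observations, warm offset
model_b = observed - 0.5                           # follows observations, cool offset
model_c = np.array([14.0, 10.0, 13.0, 9.0, 11.0])  # varies opposite to observations

two_model = ensemble_mean([model_a, model_b])
three_model = ensemble_mean([model_a, model_b, model_c])

r_two = pearson_r(two_model, observed)
r_three = pearson_r(three_model, observed)
r_c = pearson_r(model_c, observed)

print(f"two-model ensemble r   = {r_two:.2f}")
print(f"three-model ensemble r = {r_three:.2f}")
print(f"model_c alone r        = {r_c:.2f}")
```

In this constructed case the two-model ensemble correlates better with the observations than the three-model ensemble, because the third member is negatively correlated with the observations. Adding members therefore does not guarantee a better ensemble, which is why different combinations are worth testing.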

In this study, the ensemble mean of the three best performing models is used for future rainfall analysis over the BER. Likewise, Abbasian et al. (2018) used the ensemble mean of the best performing climate models for simulating rainfall over Iran. Worku et al. (2019) also employed the ensemble mean of the best performing models for analysing future climate change and climate extremes in the Jemma sub-basin of the Upper Blue Nile Basin. Ajibola et al. (2020) also reported that the ensemble mean outperformed the individual models for simulating annual rainfall over West Africa. In this study, however, the ensemble mean of the two models performs better than the ensemble mean of the three models for simulating Tmin at the studied temporal scales. Therefore, this study highlights the necessity of evaluating the ensemble means of different combinations of the best performing models to obtain better simulations.

3.5 Projections of future rainfall and temperature

The projected areal average rainfall of the BER for the three future climate horizons under the three scenarios, compared to the reference period, is given in Table 9. The result revealed a reduction of rainfall in the near future under the three SSPs, where the highest reduction (−5%) is expected under SSP2-4.5 and the lowest (−2%) under SSP5-8.5. During the mid-century, rainfall over the BER is projected to decrease only under SSP2-4.5, whereas it is projected to increase under SSP3-7.0 and SSP5-8.5 by 1% and 8%, respectively. On the other hand, rainfall over the BER is likely to increase in the late-century period by 14%, 19% and 34% under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios, respectively. In general, according to the modeling result, the BER will experience a reduction of rainfall in the near future under the three studied scenarios, but the study region will gain more rainfall in the late-century period compared to the reference period (Table 9).
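
The percentage changes quoted above are presumably relative changes with respect to the reference period mean. A minimal Python sketch of that calculation is given below; the function name `percent_change` and the rainfall totals are hypothetical and chosen only so that the results resemble the magnitudes discussed in the text. They are not values from Table 9.

```python
def percent_change(future, reference):
    """Relative change of a future mean with respect to the reference mean, in percent."""
    return 100.0 * (future - reference) / reference

# Hypothetical mean annual rainfall totals (mm), for illustration only.
reference_mm = 1000.0     # reference period mean
near_future_mm = 950.0    # a drier near-future mean
late_century_mm = 1340.0  # a wetter late-century mean

print(percent_change(near_future_mm, reference_mm))   # -5.0
print(percent_change(late_century_mm, reference_mm))  # 34.0
```

A negative result indicates a projected reduction and a positive result a projected increase relative to the reference period.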

Table 9 The mean annual rainfall (mm) in the near future (2021–2040), mid-century (2041–2060) and late-century (2081–2100) periods under different climate change scenarios at alpine, temperate, sub-tropical and tropical AEZs and BER scale compared to observed rainfall (1995–2014)

With reference to AEZs, annual rainfall in the near future is expected to decrease in the alpine, temperate and sub-tropical AEZs under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios, but with different magnitudes (Table 9). The highest near-future reduction of rainfall under the three scenarios is likely to occur in the temperate AEZ, ranging from −11 to −14% (Table 9). Conversely, the tropical AEZ is projected to receive more rainfall (3–11%) under the three SSPs in the near future. During the mid-century period, rainfall is expected to decrease in the alpine, temperate and sub-tropical AEZs and to increase in the tropical AEZ under the SSP2-4.5 scenario. Under SSP3-7.0, rainfall is likely to continue to decrease in the temperate AEZ, but it is expected to increase in the remaining AEZs. In the late-century, however, rainfall is projected to increase under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios in all four AEZs. The largest late-century increase is projected in the tropical AEZ, ranging from 25 to 45%. In the late-century period, the projected increase of annual rainfall in the four AEZs as well as in the areal average of the BER is higher under SSP5-8.5 than under SSP2-4.5 and SSP3-7.0 (Table 9). The projected annual rainfall series from 1995 to 2100 for the alpine, temperate, sub-tropical and tropical AEZs and for the areal average of the BER under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios also confirm these results (Fig. 6).

Fig. 6 The projected annual rainfall (2015–2100) under SSP2-4.5, SSP3-7.0 and SSP5-8.5 climate change scenarios along with the observed rainfall (1995–2014) at the studied AEZs as well as the BER level

The findings of this study revealed a likely reduction of annual rainfall (2–5%) at the areal average of the BER in the near future (2021–2040) under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 climate change scenarios. In the late-century (2081–2100) period, however, the areal average BER rainfall is projected to increase (14–34%) under the three scenarios, so the direction of change differs between climate periods. In contrast to our finding, a study in the Jemma sub-basin of the Upper Blue Nile Basin (Worku et al. 2019) using an ensemble mean of six CMIP5 RCMs indicated that both near-term (2021–2050) and long-term (2071–2100) annual rainfall are projected to decrease under the RCP2.6, RCP4.5 and RCP8.5 emission scenarios compared to the baseline period (1981–2005). On the other hand, a recent study conducted in the Upper Blue Nile Basin using a CMIP6 GCM (BCC-CSM2-MR) reported that annual rainfall is projected to increase in both the 2031–2060 and 2071–2100 periods under the SSP1-2.6, SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios compared to the 1981–2010 climate period (Alaminie et al. 2021). Conversely, the study of Tadese et al. (2019) in the Awash Basin indicated a likely increase in rainfall during half of the months in the 2050s and 2070s under the RCP4.5 emission scenario, whereas the RCP8.5 simulation indicated an expected decrease in rainfall. The difference in projected rainfall between this study and previous studies suggests the need for site-specific climate change projections for planning better adaptation strategies. At the BER level, the largest expected near-future reduction of rainfall occurs under SSP2-4.5, followed by SSP3-7.0 and SSP5-8.5.

The findings also indicated varying directions of future rainfall change in the different AEZs of the BER. The expected increase of rainfall in the tropical AEZ in the near future, mid-century and late-century periods under the three scenarios may help to reduce the recurrent droughts observed in many areas of the tropical AEZ. Conversely, annual rainfall is projected to decrease in the near future in the alpine, temperate and sub-tropical AEZs. The reduction in the temperate and sub-tropical AEZs may put some stress on crop cultivation, while the lessening of rainfall in the alpine AEZ may put strain on the tourist-attracting endemic wild animals found in the BER. The study indicated a projected increase of annual rainfall in the late-century period in the alpine, temperate, sub-tropical and tropical AEZs under the three studied scenarios. This will be an opportunity for crop cultivation as well as for those endemic animals.

Table 10 displays the projected mean annual Tmax and Tmin over the BER and the different AEZs of the study area during the near future, mid-century and late-century periods. At the areal average of the BER, the projected increases in Tmax are 0.75–1.05 °C, 1.41–2.33 °C and 2.66–5.01 °C in the near future, mid-century and late-century periods, respectively. The projected increases in areal average BER Tmin under the three scenarios are 1.67–1.79 °C, 2.69–3.71 °C and 4.64–9.04 °C in the near future, mid-century and late-century periods, respectively (Table 10). The findings also revealed that the increase in Tmin is higher than the increase in Tmax under the studied SSPs over the BER. For example, in the late-century period, areal average Tmax is projected to increase by 2.66, 3.81 and 5.01 °C under SSP2-4.5, SSP3-7.0 and SSP5-8.5, respectively, whereas Tmin is projected to increase by 4.64 °C under SSP2-4.5, 7.46 °C under SSP3-7.0 and 9.04 °C under SSP5-8.5 (Table 10). In addition, the highest increase in Tmin in the near future, mid-century and late-century periods is found under SSP5-8.5, followed by SSP3-7.0. Similarly, the highest projected increase in Tmax in the mid-century and late-century periods is under SSP5-8.5, followed by SSP3-7.0. In the near future, however, the highest increase in Tmax is projected under SSP3-7.0, followed by SSP5-8.5. In general, the smallest temperature (Tmax and Tmin) changes in all three future climate periods are projected under SSP2-4.5, while the highest temperature change is expected under SSP5-8.5 (except for the change in Tmax in the near future).
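
The difference between the projected Tmin and Tmax increases can be checked with simple arithmetic. The short Python sketch below uses the areal average increases quoted in the text for the three scenarios (values that match the late-century ranges); the variable names are illustrative only.

```python
# Projected areal average increases (deg C) quoted in the text for the BER.
tmax_increase = {"SSP2-4.5": 2.66, "SSP3-7.0": 3.81, "SSP5-8.5": 5.01}
tmin_increase = {"SSP2-4.5": 4.64, "SSP3-7.0": 7.46, "SSP5-8.5": 9.04}

# Difference between the Tmin and Tmax increases for each scenario.
gap = {
    scenario: round(tmin_increase[scenario] - tmax_increase[scenario], 2)
    for scenario in tmax_increase
}

for scenario, value in gap.items():
    print(f"{scenario}: Tmin increase exceeds Tmax increase by {value} deg C")
# SSP2-4.5: 1.98, SSP3-7.0: 3.65, SSP5-8.5: 4.03
```

Under these quoted values, the excess of the Tmin increase over the Tmax increase grows from SSP2-4.5 to SSP5-8.5.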

Table 10 The mean annual Tmax and Tmin (°C) in the near future (2021–2040), mid-century (2041–2060) and late-century (2081–2100) periods at SSP2-4.5, SSP3-7.0 and SSP5-8.5 climate scenarios under the studied spatial scales compared to the observed Tmax and Tmin (1995–2014)

In terms of AEZs, the result revealed that future Tmax and Tmin are projected to increase in the near future, mid-century and late-century periods, with different magnitudes in the four AEZs (Table 10). The highest projected Tmax and Tmin increases in the three future climate periods under the three climate change scenarios are found in the temperate AEZ. Conversely, the smallest Tmin changes in the near future, mid-century and late-century periods under the three scenarios are found in the alpine AEZ. Likewise, the lowest projected Tmax change in the late-century period is also found in the alpine AEZ, but in the near future and mid-century periods the lowest projected Tmax change occurs in either the alpine or the tropical AEZ, depending on the scenario (Table 10). Similar to the results obtained for the areal average of the BER, the change in Tmin is likely to be higher than the change in Tmax in all four AEZs under all three scenarios. In addition, the highest increase in Tmin for the three future climate periods is projected under SSP5-8.5, followed by SSP3-7.0. Likewise, the highest Tmax changes in the mid-century and late-century periods are under SSP5-8.5, followed by SSP3-7.0, but in the near future the highest change in Tmax is projected under SSP3-7.0. The annual time series (1995–2100) of Tmax and Tmin (Figs. 7 and 8) also confirm the increase of Tmax and Tmin in the four AEZs as well as at the areal average of the BER under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios.

Fig. 7 The projected annual Tmax (2015–2100) under different climate scenarios together with its observed value (1995–2014) at alpine, temperate, sub-tropical and tropical AEZs as well as the BER scale

Fig. 8 The projected annual Tmin (2015–2100) at SSP2-4.5, SSP3-7.0 and SSP5-8.5 climate change scenarios together with the observed Tmin (1995–2014) at the different AEZs as well as the BER scale

The projected increase in Tmax and Tmin in this study is consistent with many other studies in Ethiopia (Alaminie et al. 2021; Worku et al. 2019) and elsewhere in the world (Fan et al. 2022; IPCC 2021). For example, Worku et al. (2019), in the Jemma sub-basin of the Upper Blue Nile Basin, reported an expected increase in both Tmax and Tmin in the near-term (2021–2050) and long-term (2071–2100) periods under the RCP2.6, RCP4.5 and RCP8.5 climate change scenarios. Similarly, Alaminie et al. (2021) reported a projected increase of Tmax and Tmin in the Upper Blue Nile Basin of Ethiopia during the 2031–2060 and 2071–2100 periods under the four SSPs. A study in the Awash Basin also indicated that the expected increase of Tmax in the 2050s and 2070s is from 0.48 to 2.6 °C and from 3.4 to 4.1 °C under RCP4.5 and RCP8.5, respectively (Tadese et al. 2019). Almazroui et al. (2021) also indicated that the mean annual temperature over Africa is projected to increase in the near-term period (2030–2059), with the highest increase expected in the Sahara region. The latest IPCC report also clearly indicated a projected increase in temperature at the global scale, but with different magnitudes of change (IPCC 2021). Fan et al. (2022) also reported an expected increase of temperature in the Pan-Third Pole region.

4 Conclusions

This study evaluated the performance of ten CMIP6 GCMs for simulating rainfall, Tmax and Tmin over the BER from the daily to annual temporal scales, and projected these climate variables at the different AEZs and at the BER scale using the best performing climate model(s) for the near future (2021–2040), mid-century (2041–2060) and late-century (2081–2100) periods. The result generally revealed that BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0 have shown consistently better performance for simulating rainfall from the daily to annual temporal scales over the BER. EC-Earth3, EC-Earth3-Veg and MPI-ESM1-2-LR for Tmax, and CNRM-CM6-1, EC-Earth3 and EC-Earth3-Veg for Tmin, have also shown superior performance at the studied temporal scales. The finding revealed that the best performing models for rainfall (e.g., BCC-CSM2-MR and MRI-ESM2-0) did not perform correspondingly well for simulating Tmax and Tmin. In addition, although MPI-ESM1-2-LR and CNRM-CM6-1 are among the best performing models for Tmax and Tmin, respectively, MPI-ESM1-2-LR did not perform well for Tmin, nor CNRM-CM6-1 for Tmax. Therefore, the findings from the BER suggest the necessity of evaluating climate models for each climate variable (even for Tmax and Tmin separately) for better use of research outputs in decision-making processes.

Accordingly, the ensemble mean of BCC-CSM2-MR, CNRM-CM6-1 and MRI-ESM2-0 was used for rainfall change analysis under the different scenarios. The study used EC-Earth3 for simulating future Tmax, since the ensemble mean of the three best performing climate models and the remaining two individual models showed a negative correlation coefficient during the dry season. For Tmin, the ensemble mean of CNRM-CM6-1 and EC-Earth3-Veg was used because it outperformed both the ensemble mean of the three best performing models and the individual climate models. Climate change analysis for the near future, mid-century and late-century periods under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios was undertaken after bias correction of the selected climate model outputs for rainfall, Tmax and Tmin.

The findings revealed an expected reduction of annual rainfall in the near future in the alpine (2–5%), temperate (11–14%) and sub-tropical (7–9%) AEZs as well as at the BER scale (2–5%) under the SSP2-4.5, SSP3-7.0 and SSP5-8.5 scenarios, but rainfall is likely to increase in the late-century period. On the other hand, rainfall is projected to increase in the tropical AEZ in both the near future (3–11%) and late-century (25–45%) periods. In the mid-century period, rainfall is expected to increase in the tropical AEZ under all three scenarios, but it exhibits different directions of change in the remaining AEZs and at the BER scale under different scenarios. The likely increase of rainfall in the near future in the tropical AEZ may signal a lessening of the frequently occurring droughts in the lowland parts of the BER. On the other hand, the expected reductions of rainfall in the near future in the alpine, temperate and sub-tropical AEZs may introduce challenges for agriculture, water resources, agro/pastoralists and endemic wild animals. The result indicates that both Tmax and Tmin are likely to increase in the near future, mid-century and late-century periods under all three studied scenarios, with the larger increase expected in the temperate AEZ. Therefore, this study suggests implementing the best climate adaptation strategies to reduce the impacts of climate change on livelihoods. Besides, implementation of climate mitigation measures in the BER in particular, and in the wider global community in general, is indispensable to limit the speed of temperature increase. This study suggests further research on identifying the best climate adaptation strategies for various sectors/communities in the BER, particularly for agriculture, water resources, the agro/pastoralist community and endemic wild animals.