1 Introduction

Both observed trends and future climate projections suggest that the broader region of the Middle East and North Africa (MENA) is expected to be greatly affected by global warming (Lelieveld et al. 2016). In this already environmentally stressed part of the world, anthropogenic climate change is likely to induce further warming, combined with prominent changes in the hydrological cycle. Further challenges such as rapid population increase, strong urbanization trends, political and religious conflicts, and economic instability will likely enhance the climate change-related impacts and reduce the resilience and adaptive capacity of both human communities and natural ecosystems.

Particularly for precipitation in the region of interest, most of the studies on observed climatic changes focus on the Mediterranean countries. Xoplaki et al. (2000) and Maheras et al. (2004) identified significant negative trends in winter and autumn precipitation amounts over parts of Greece. Zhang et al. (2005) analyzed a large number of Middle East meteorological stations for a long period covering mainly the second half of the twentieth century. They identified strong interannual variability of precipitation but an absence of significant trends. Partal and Kahya (2006) used data from 96 weather stations in Turkey for the period 1929–1993. They suggest a downward precipitation trend, predominantly in the western and southern parts of the country. Similarly, for the period 1967–2006, Shaban (2009) identified a declining trend of precipitation and snow cover in Lebanon. Philandras et al. (2011), based on station observations and gridded datasets, suggest that annual rainfall was subject to statistically significant decreasing trends for most of the Mediterranean Basin during the period 1901–2009. These trends were more pronounced during the period 1951–2009 and within the rainy season (October–March). Tanarhte et al. (2012) analyzed six observational and reanalysis datasets. They identified a decreasing trend in rainfall over the Mediterranean during the period 1961–2000, whereas an opposite trend is observed in the Gulf region, where, however, precipitation amounts are much lower. Hoerling et al. (2012), after analyzing several gridded observational datasets, suggest that droughts in the land area surrounding the Mediterranean Sea have shown an increasing trend, particularly over the last 20 years of the period 1902–2010. Almazroui et al. (2012) identified significant declining precipitation trends for Saudi Arabia over the period 1978–2009. Tramblay et al. (2013) analyzed data from 22 stations of the Maghreb region for the period 1950–2009. 
They identified strong tendencies toward a decrease of precipitation amounts and number of wet days together with an increase in the duration of dry periods, mainly for Morocco and western Algeria. Similarly, Donat et al. (2014) analyzed 61 stations of the broader North Africa and Arab regions. They found less significant trends in precipitation compared to those of temperature. However, in the western part of the MENA region, they identified a consistent tendency toward wetter conditions during the past 30 years, whereas in the eastern part, there are some consistent drying trends. In the same context, Ziv et al. (2014) identified a negative precipitation trend of − 15%/decade for Israel over the period 1975–2010, connected with a shortening of the rainy season by 4 days/decade. Notably, most of the available literature does not include more recent data that can contribute significantly to the statistical analysis of the observed trends in rainfall. In addition, these studies either include data from a single source or cover relatively small regions (e.g., at country level or smaller), which is often not useful, particularly since many of the surface and groundwater resources of the MENA region are transboundary. Moreover, most of the studies for the region focus exclusively on mean precipitation and do not discuss changes and trends of extreme rainfall.

In order to better monitor recent past and present climate conditions and to evaluate future changes, reliable and time-consistent meteorological observations are essential. This information is more or less available for some parts of the world (e.g., central Europe, USA, parts of India, southeast Australia). However, in extended areas, including the region of interest, either the coverage of meteorological stations (Fig. 1) or the quality of observations or both are not sufficient. Therefore, gridded daily or monthly meteorological observations are often used in assessing climate trends and validating global or regional climate models. Such products are usually available throughout the globe; they cover relatively long periods and are most of the time open access. In addition, model grid box output in general represents area averages; therefore, differences compared to station observations result not only from model errors but also from the scale gap between grid box and point scale (Maraun et al. 2015). The latter discrepancy, which is not a model error, is known as the representativeness problem (Klein Tank et al. 2009). Considering this, gridded observations are usually more appropriate for assessing the skill of weather/climate models at horizontal resolutions coarser than 10 km. Apart from studying the recent past climate and validating models, such datasets are often employed to correct the biases of global or regional climate models (Teutschbein and Seibert 2012; Paxian et al. 2016).

Fig. 1 Maximum number of stations per grid point considered for the compilation of the 1° version of the GPCC gridded precipitation dataset

Despite their usefulness, gridded observational datasets have several limitations, and a number of uncertainties are involved in their compilation. These uncertainties can be related to the interpolation methods used, the density of the meteorological network, or the quality of the observations themselves (Wagner et al. 2012; Contractor et al. 2015; Beguería et al. 2016; Prein and Gobiet 2017). For example, the sparser the observational network, the higher the uncertainty related to the sample error and gridding procedure (Rudolf et al. 1994). The latter is particularly important for the MENA region, where extended areas (e.g., parts of the Sahara Desert and the Arabian Peninsula) are not monitored at all. An indicative estimation of the global spatial coverage by meteorological stations is presented in Fig. 1. This map depicts the station density per grid point that was considered during the compilation of the Global Precipitation Climatology Center 1° dataset. Evidently, most of MENA is covered by one or no meteorological stations per grid point (≈ 10,000 km²) in this particular dataset. Furthermore, measuring precipitation is a challenging task due to the very local nature of rainfall and the short time scales that are often involved. Thus, observational uncertainty related to precipitation is expected to be more evident compared to other variables such as temperature (Tanarhte et al. 2012). A number of recent studies highlight substantial differences in gridded precipitation products in various environments and locations of the world. Contractor et al. (2015) compared gridded precipitation datasets over Australia and found that in general they agree well for low to moderate daily precipitation amounts but diverge at upper quantiles, indicating that substantial uncertainty exists in gridded precipitation extremes over their study region. Isotta et al. 
(2015) compared a number of gridded observations over the Alpine region and identified quite different representations of annual mean rainfall. Gómez-Navarro et al. (2012) found that uncertainties in the observations are comparable to the uncertainties within state-of-the-art regional climate models (RCMs), even in areas covered by dense monitoring networks such as Spain. For a broader European domain, Prein and Gobiet (2017) demonstrated that differences between various datasets can locally have the same magnitude as precipitation errors found in RCMs. Similarly, Kotlarski et al. (2017) highlight that for individual precipitation metrics and sub-regions of Europe, observational uncertainty can dominate over RCM uncertainty. Zittis et al. (2017), studying some exceptional extreme rainfall events for the eastern Mediterranean, identified significant discrepancies between several observational and reanalysis products.

This study serves several objectives, taking into account the importance of accurate rainfall data in assessing the impacts of global warming and in validating climate models over the MENA environment. Considering datasets that provide long-range records, the first objective is to assess the centennial rainfall trends and at the same time better understand precipitation interannual variability throughout the twentieth century for the greater MENA region. Furthermore, since additional information is available for recent years, these trends are also calculated and discussed for the most recent decades by utilizing supplementary gauge and satellite-based datasets. For large parts of the region, this study updates the existing literature on observed precipitation trends. Besides mean precipitation, several natural processes and socio-economic sectors are impacted by severe weather events. Therefore, a second objective is to investigate whether there is a climate change signal in several aspects of extreme precipitation, including duration and intensity, by employing six indices of drought and extreme rainfall. A final objective is to explore to what extent the tested datasets agree or disagree over the region of interest. This is achieved by applying straightforward comparison (both spatial and temporal) and by utilizing more sophisticated statistical methods appropriate for climate analysis, such as the Kruskal–Wallis and Dunn’s nonparametric tests. Observational uncertainty is quantified, while the statistical significance of the differences between the observational sources is assessed. Discrepancies between datasets are also discussed for the indices of extreme rainfall. While the impact of climate change is already evident in the region, such an uncertainty analysis, particularly over the African and Middle East parts of the study region, does not exist. 
Highlighting potential outlier datasets and identifying regions of high observational uncertainty will benefit future climatological studies. Despite the many challenges across the region, a further goal of this study is to motivate stakeholders to share meteorological data without restriction for scientific applications such as climate research.

The datasets and methods used are presented in detail in Section 2, while Section 3 includes all results of this analysis. Finally, the general conclusions and some emerging points for discussion are presented in the last section.

2 Data and methods

2.1 Description of data

Five gridded observational datasets are employed in this study. These are the University of East Anglia Climate Research Unit (CRU), University of Delaware (UDEL), Global Precipitation Climatology Center (GPCC), Climate Prediction Center (CPC), and Climate Hazards Group InfraRed Precipitation with Station (CHIRPS) datasets. The first four are rain gauge-based, while the last is a blended product that also incorporates satellite information. Their temporal and spatial resolutions vary. For the longer-range centennial trends, only monthly CRU, UDEL, and GPCC data were available. For the rest of the analysis regarding mean monthly rainfall, the common period of all five datasets (1988–2014) is used. Daily time series, which are essential for the analysis of extremes, were available only for CHIRPS, GPCC, and CPC, and their common period is 1988–2016. The selected datasets cover a range of horizontal resolutions (0.25–1°) in order to investigate whether any systematic discrepancies are related to their grid spacing. All five data sources are open access. More information and references are provided in Table 1. Other available high-quality datasets such as E-OBS and APHRODITE were not used, since their spatial coverage does not include the whole extent of the study region. The same applies to high-resolution national-level gridded datasets that might exist for some MENA countries (e.g., Camera et al. 2014).

Table 1 Description of used gridded observational datasets of precipitation

2.2 Selection of sub-regions

The study region coincides with the MENA-CORDEX domain (Footnote 1). It covers a very wide range of geographical features and climatic regimes, including deserts, high-elevation mountain ranges, forests, and typical Mediterranean climate conditions. In order to present the analysis in a concise way, most results are averaged over nine sub-regions of special interest that more or less encompass areas of similar climatic regimes. These sub-domains are presented in Fig. 2. This subjective selection is briefly described and justified in this paragraph. Located in the northwest part of the domain of interest, the Iberian Peninsula is not a part of the MENA region. Nevertheless, it is important for this study to include an adjacent, well-monitored region in order to facilitate the observational uncertainty discussion. The Greater Maghreb region covers most of northwest Africa. Extended regions are characterized by a desert climate, while a Mediterranean-type climate characterizes the northernmost part of Maghreb. The third sub-domain includes the extensive semi-arid Sahel region. It is a transitional zone between the wet tropics and the hyper-arid Sahara Desert. Sahel experienced a strong, high-impact drying during the second half of the twentieth century (see Dong and Sutton 2015 and references therein). The Ethiopian Highlands are particularly important for the water resources of the Blue Nile and all riverside ecosystems and agricultural activities developed in the Nile Basin. The Balkan Peninsula is not traditionally included in the MENA region countries, but it was added in this study for two important reasons. First, as in the case of Iberia, it is interesting to explore the observational uncertainty in a part of southeast Europe covered by a denser network of meteorological stations. Moreover, this region is expected to be a hot-spot of climate change, mainly related to land-atmosphere interactions due to the rainfall-driven changes of the hydrological cycle (Zittis et al. 2014). 
The adjacent region of Anatolia mainly includes Turkey. Besides the impact on local populations and ecosystems, precipitation changes over this region are particularly important, since it includes the sources of the Tigris and Euphrates in the Taurus Mountains. These two rivers play a major role in agriculture and biodiversity conservation in the downstream Mesopotamia region. The Levant sub-domain includes the semi-arid easternmost part of the Mediterranean. Parts of this region have suffered from severe and prolonged drought events over the last decade (Bergaoui et al. 2015; Kelley et al. 2015), and Cook et al. (2016) highlighted that for some Levantine countries, there is an 89% likelihood that this drought was drier than any comparable period of the last 900 years and a 98% likelihood that it was drier than the last 500 years. A large part of the Arabian Peninsula is also included in the sub-region analysis in order to discuss precipitation trends and observational uncertainty in another challenging hyper-arid environment. Finally, the greater Iran region was also included as a highly populated part of the world that is also facing water scarcity issues (Madani et al. 2016). The sub-region analysis was performed on the native grid of each dataset, and results were then averaged over the geographical boundaries presented in Fig. 2.
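The averaging of a gridded field over a sub-region can be sketched as follows. This is only an illustration under stated assumptions: the function name `box_mean`, the rectangular box limits, and the cosine-of-latitude area weighting are hypothetical choices and are not described in the text, which only states that results were averaged over the boundaries of Fig. 2.

```python
import numpy as np


def box_mean(field, lat, lon, lat_bounds, lon_bounds):
    """Area-weighted mean of a 2-D field (lat x lon) inside a lat/lon box.

    Assumption: grid cells are weighted by cos(latitude); missing values
    (NaN, e.g. sea points) are ignored.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    in_lat = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    in_lon = (lon >= lon_bounds[0]) & (lon <= lon_bounds[1])

    sub = np.asarray(field, dtype=float)[np.ix_(in_lat, in_lon)]
    weights = np.cos(np.deg2rad(lat[in_lat]))[:, None] * np.ones(in_lon.sum())

    valid = ~np.isnan(sub)
    return np.sum(sub[valid] * weights[valid]) / np.sum(weights[valid])


# Synthetic 1-degree field for illustration only (not data from any dataset).
lat = np.arange(10.5, 45.5, 1.0)
lon = np.arange(-9.5, 60.5, 1.0)
field = np.full((lat.size, lon.size), 250.0)
field[0, 0] = np.nan  # a missing grid point

# Hypothetical box limits, not the sub-region boundaries of Fig. 2.
regional_mean = box_mean(field, lat, lon, (30.0, 38.0), (-9.5, 10.0))
print(f"regional mean = {regional_mean:.1f} mm/year")
```

Because each dataset keeps its native grid in this step, the same function would be called once per dataset with that dataset's own coordinate vectors.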

Fig. 2 Definition of sub-regions for analysis

2.3 Linear trends and statistical tests

As mentioned in Section 1, the first objective of this study is to identify the long-term annual precipitation trends by utilizing up-to-date observational data for the MENA region. A simple linear regression model is used to calculate these trends. It is based on Wilkinson and Rogers (1973), while the significance of the trends is assessed by testing the p value of the F-statistic at the 95% confidence level (p < 0.05). More details about the model’s formula can be found in Chambers (1992).

Linear rainfall trends, based on the longer records of the CRU, GPCC, and UDEL datasets, were computed for their common time span (1901–2014). Additionally, this trend analysis is performed for four sub-periods (P1: 1901–1930; P2: 1931–1960; P3: 1961–1990; P4: 1991–2014). Notably, these sub-periods were selected in order to be consistent with the World Meteorological Organization (WMO) climatological standard normals (Footnote 2). The trend analysis is also repeated for the most recent years (1988–2014), for which data were available from all five tested datasets. Besides annual rainfall, linear trends were similarly calculated for the annual values of six drought and extreme precipitation indices that are presented in the following section in order to fulfill the second objective of this study.
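A minimal sketch of such a trend calculation is given below. It fits an ordinary least-squares line to annual totals and tests the regression F-statistic, as described above. The function name `decadal_trend` and the synthetic series are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np
from scipy import stats


def decadal_trend(years, annual_precip, alpha=0.05):
    """Fit precip = a + b * year by ordinary least squares.

    Returns the slope in mm/decade, the regression F-statistic, its
    p value, and whether the trend is significant at level alpha.
    """
    years = np.asarray(years, dtype=float)
    precip = np.asarray(annual_precip, dtype=float)
    n = years.size

    slope, intercept = np.polyfit(years, precip, 1)
    fitted = intercept + slope * years

    ss_reg = np.sum((fitted - precip.mean()) ** 2)  # explained sum of squares
    ss_res = np.sum((precip - fitted) ** 2)         # residual sum of squares
    f_stat = ss_reg / (ss_res / (n - 2))            # F with (1, n - 2) d.o.f.
    p_value = stats.f.sf(f_stat, 1, n - 2)

    return slope * 10.0, f_stat, p_value, p_value < alpha


# Synthetic series for illustration only (not data from any of the datasets).
rng = np.random.default_rng(0)
years = np.arange(1901, 2015)
precip = 300.0 - 0.35 * (years - 1901) + rng.normal(0.0, 15.0, years.size)

trend, f_stat, p_value, significant = decadal_trend(years, precip)
print(f"trend = {trend:.2f} mm/decade, F = {f_stat:.1f}, p = {p_value:.3g}")
```

For a regression with a single predictor, the F-test is equivalent to the two-sided t-test on the slope, so the p value matches the one reported by `scipy.stats.linregress`. The same function can be applied to each sub-period by slicing the input arrays.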

The third and main objective is to investigate to what extent the five selected datasets of precipitation are in agreement in terms of precipitation amounts, interannual trends, intra-annual variability, and spatial distribution. The first approach is to apply a straightforward comparison (distance from the ensemble mean, time-series analysis over sub-regions, precipitation range between datasets). The intra-annual distribution of rainfall is also discussed considering the mean conditions for each month of the year. The second approach is the application of more sophisticated statistical methods in order to assess the significance of potential discrepancies. These tests should consider the non-normal nature of precipitation and also be appropriate for comparing more than two samples. Such a statistical test is the Kruskal–Wallis H test (Kruskal and Wallis 1952). It is a nonparametric statistical procedure for comparing more than two samples that are independent, or not related (Corder and Foreman 2009). It extends the Mann–Whitney U test to cases with more than two samples. Since it is a nonparametric method, the Kruskal–Wallis test does not assume a normal distribution of the tested samples, which suits precipitation data. The parametric equivalent to this test is the one-way analysis of variance (ANOVA). A detailed description of the calculation of the H statistic of the Kruskal–Wallis test is given in Yu et al. (2006) and Corder and Foreman (2009). For this study, the null hypothesis is that the distributions of the monthly values of all observational datasets are equal. When the Kruskal–Wallis H test leads to statistically significant results, then at least one of the datasets is different from the rest. The confidence level is set to 95% (α = 0.05). Notably, the test does not identify where or how many differences occur. 
To identify the particular differences between sample pairs, post hoc tests are used to analyze the specific sample pairs for significant difference(s).
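As an illustration, the Kruskal–Wallis H test at a single grid point could be sketched with `scipy.stats.kruskal`. The five samples below are synthetic, gamma-distributed monthly totals with hypothetical labels; they do not represent any of the datasets used in this study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_months = 27 * 12  # monthly values for 1988-2014

# Hypothetical monthly totals (mm) at one grid point for five products.
# Gamma-distributed values mimic the skewed, non-normal nature of rainfall;
# the last product is deliberately drier than the others.
samples = {
    "product_1": rng.gamma(shape=2.0, scale=20.0, size=n_months),
    "product_2": rng.gamma(shape=2.0, scale=20.0, size=n_months),
    "product_3": rng.gamma(shape=2.0, scale=20.0, size=n_months),
    "product_4": rng.gamma(shape=2.0, scale=20.0, size=n_months),
    "product_5": rng.gamma(shape=2.0, scale=10.0, size=n_months),
}

# Null hypothesis: all samples come from the same distribution.
h_stat, p_value = stats.kruskal(*samples.values())
reject = p_value < 0.05  # 95% confidence level

print(f"H = {h_stat:.1f}, p = {p_value:.3g}, reject H0: {reject}")
```

A rejection here only indicates that at least one product differs; it does not say which one, which is why a post hoc test is needed.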

One such post hoc test that reports the results of multiple pairwise comparisons after a Kruskal–Wallis test is Dunn’s test (Dunn 1964). Its interpretation requires the assumption that the cumulative distribution function (CDF) of one group does not cross the CDF of the other, which is more or less true for the gridded datasets of monthly rainfall. The null hypothesis for each pairwise comparison is that the probability of observing a randomly selected value from the first group that is larger than a randomly selected value from the second group equals one half (Dino 2017).
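The pairwise comparison can be sketched as below, using the z statistic of Dunn (1964) computed from pooled ranks with a tie correction. The function name `dunn_test` is hypothetical, and the Bonferroni adjustment of the p values is an assumption; the text does not state which multiple-comparison adjustment, if any, was applied.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def dunn_test(groups):
    """Pairwise Dunn z statistics and Bonferroni-adjusted p values.

    groups: dict mapping a label to a 1-D sample.
    """
    names = list(groups)
    data = [np.asarray(groups[k], dtype=float) for k in names]
    pooled = np.concatenate(data)
    n_total = pooled.size
    ranks = stats.rankdata(pooled)  # average ranks are assigned to ties

    # Mean rank of each group within the pooled sample.
    bounds = np.cumsum([0] + [d.size for d in data])
    mean_rank = [ranks[a:b].mean() for a, b in zip(bounds[:-1], bounds[1:])]

    # Variance term, reduced by the correction for tied values.
    _, counts = np.unique(pooled, return_counts=True)
    tie_term = np.sum(counts ** 3 - counts) / (12.0 * (n_total - 1))
    base_var = n_total * (n_total + 1) / 12.0 - tie_term

    pairs = list(combinations(range(len(names)), 2))
    results = {}
    for i, j in pairs:
        se = np.sqrt(base_var * (1.0 / data[i].size + 1.0 / data[j].size))
        z = (mean_rank[i] - mean_rank[j]) / se
        p_adj = min(1.0, 2.0 * stats.norm.sf(abs(z)) * len(pairs))
        results[(names[i], names[j])] = (z, p_adj)
    return results


# Synthetic samples for illustration only.
rng = np.random.default_rng(7)
groups = {
    "product_a": rng.gamma(2.0, 20.0, 324),
    "product_b": rng.gamma(2.0, 20.0, 324),
    "product_dry": rng.gamma(2.0, 10.0, 324),
}
results = dunn_test(groups)
for pair, (z, p_adj) in results.items():
    print(pair, f"z = {z:.2f}", f"adjusted p = {p_adj:.3g}")
```

A positive z means the first product of the pair tends to have larger values than the second.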

The Kruskal–Wallis and Dunn’s tests were applied to the monthly time series for the common period of the five datasets (1988–2014) at the grid-point level. Applications of the Kruskal–Wallis test in climatology and the atmospheric sciences exist but are limited in number (Chew and McMahon 1996; Yu et al. 2006; Nogués-Bravo et al. 2008; Georgoulias et al. 2015; Wakelin et al. 2015; Javari 2016).

For the applications where direct point-to-point comparison between the different datasets was necessary (maps of ensemble mean and ranges, Kruskal–Wallis and Dunn’s tests), all datasets were re-gridded to the native grid of the CRU dataset (≈ 50 × 50 km) by applying a bilinear interpolation method.
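The re-gridding step and the subsequent ensemble statistics can be sketched as follows. The sketch uses `scipy.interpolate.RegularGridInterpolator` with linear interpolation, which is bilinear on a two-dimensional grid. The function name `regrid_bilinear`, the grid coordinates, and the synthetic fields are assumptions for illustration; the tool actually used by the authors is not specified.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator


def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a (lat x lon) field onto a target grid."""
    interp = RegularGridInterpolator(
        (src_lat, src_lon), field, method="linear",
        bounds_error=False, fill_value=np.nan,
    )
    lat2d, lon2d = np.meshgrid(dst_lat, dst_lon, indexing="ij")
    points = np.column_stack([lat2d.ravel(), lon2d.ravel()])
    return interp(points).reshape(lat2d.shape)


# Source: a synthetic 1-degree field; target: a 0.5-degree grid.
src_lat = np.arange(10.5, 45.5, 1.0)
src_lon = np.arange(-9.5, 60.5, 1.0)
dst_lat = np.arange(10.75, 44.5, 0.5)
dst_lon = np.arange(-9.25, 59.5, 0.5)

field = 2.0 * src_lat[:, None] + 0.5 * src_lon[None, :]
regridded = regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon)

# Once all products share a grid, point-to-point statistics are possible.
stack = np.stack([regridded, regridded + 5.0])  # two hypothetical products
ens_mean = stack.mean(axis=0)
ens_range = stack.max(axis=0) - stack.min(axis=0)
print(regridded.shape, float(ens_range.max()))
```

Target points outside the source grid are set to NaN rather than extrapolated, which is a design choice of this sketch.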

2.4 Extreme precipitation indices

Besides mean monthly and annual rainfall, extreme precipitation is also considered in the analysis. Six indices, among the ones suggested by WMO’s Expert Team on Climate Change Detection and Indices (ETCCDI; Footnote 3), were calculated. However, this was possible only for the three datasets that provide daily precipitation values (GPCC, CHIRPS, CPC). Linear trends of the indices, interannual variability, and differences between the three daily datasets for their common period of coverage are discussed. Consistent with the treatment of mean rainfall, calculations of the indices and trends were performed at the grid-point level and were then spatially averaged over the grid points of each sub-region. A short definition for each of the selected indices follows.

  • Consecutive dry days (CDD): The greatest number of consecutive days with precipitation lower than 1 mm, within a year.

  • Consecutive wet days (CWD): The greatest number of consecutive days with precipitation higher than or equal to 1 mm, within a year.

  • Annual count of rainy days (RR1): The annual count of days with observed rainfall greater than 1 mm.

  • Annual count of days with precipitation larger than 20 mm (R20).

  • Highest 5-day precipitation amount for each year (R5D).

  • Simple precipitation intensity index (SDII): Annual sum of precipitation during wet days (precipitation > 1 mm) divided by the annual count of wet days.

In this study, the threshold for a rainy day is set to 1 mm. This was mainly decided in order to follow the WMO guidelines for the calculation of the indices. Furthermore, the selection of a smaller threshold of 0.1 mm that is sometimes used would likely affect the results, particularly for the calculation of wet and dry spells (i.e., CDD and CWD indices). The occurrence of a 0.1-mm rainy day could either break a dry spell or result in a larger number of consecutive wet days, while the actual precipitation of that particular day is negligible and without any practical significance. Moreover, such an overestimation of the number of wet days per year would also affect the RR1 and SDII indices.
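The six indices and the effect of the wet-day threshold can be sketched for a single daily series as follows. The function names are hypothetical and the 12-day series is invented for illustration (a real calculation would use one year of daily values). The index definitions above mix "greater than" and "higher or equal"; this sketch assumes that a wet day has precipitation of at least the threshold and that R20 counts days with at least 20 mm, which is how the ETCCDI indices are commonly defined.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def extreme_indices(daily_mm, wet_threshold=1.0):
    """Compute CDD, CWD, RR1, R20, R5D and SDII for one series of daily totals.

    Assumption: wet day means precipitation >= wet_threshold (mm).
    """
    wet = [p >= wet_threshold for p in daily_mm]
    wet_amounts = [p for p in daily_mm if p >= wet_threshold]
    n_wet = len(wet_amounts)
    return {
        "CDD": longest_run(not w for w in wet),
        "CWD": longest_run(wet),
        "RR1": n_wet,
        "R20": sum(p >= 20.0 for p in daily_mm),
        "R5D": max(sum(daily_mm[i:i + 5]) for i in range(len(daily_mm) - 4)),
        "SDII": sum(wet_amounts) / n_wet if n_wet else float("nan"),
    }


# Invented daily totals (mm), for illustration only.
daily = [0.0, 0.1, 0.0, 5.0, 22.0, 1.0, 0.1, 0.0, 0.0, 0.0, 30.0, 0.5]

standard = extreme_indices(daily, wet_threshold=1.0)
low = extreme_indices(daily, wet_threshold=0.1)
print("1.0 mm threshold:", standard)
print("0.1 mm threshold:", low)
```

With the 0.1 mm threshold, the trace amounts shorten the longest dry spell and lengthen the longest wet spell in this series, and the larger wet-day count lowers SDII, which is the sensitivity described in the paragraph above.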

3 Results

3.1 Centennial precipitation trends

The century-long precipitation time series averaged over nine sub-domains of the MENA region are presented in Fig. 3. These are based on observations from three datasets (CRU, GPCC, and UDEL). The rainfall means, the interannual standard deviation (% of mean annual rainfall), and the linear trends are summarized in Table 2. Similar results for four sub-periods are presented in Table 3.
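The summary statistics mentioned here can be illustrated with a short sketch: the interannual standard deviation expressed as a percentage of the mean, and a 5-point moving average of the kind drawn in the time-series figures. The function names, the use of the sample standard deviation, and the centered window are assumptions; the text does not specify these details.

```python
import statistics


def variability_percent(annual_totals):
    """Interannual standard deviation as a percentage of the mean.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    mean = statistics.fmean(annual_totals)
    return 100.0 * statistics.stdev(annual_totals) / mean


def moving_average(values, window=5):
    """Centered moving average; the first and last window // 2 points are dropped."""
    half = window // 2
    return [
        statistics.fmean(values[i - half:i + half + 1])
        for i in range(half, len(values) - half)
    ]


# Invented annual totals (mm/year), for illustration only.
annual = [90.0, 100.0, 110.0, 100.0, 100.0]
print(f"variability = {variability_percent(annual):.1f}% of the mean")
print(moving_average([1, 2, 3, 4, 5, 6, 7]))
```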

Fig. 3 Mean annual precipitation for the period 1901–“Today,” averaged over nine sub-regions and based on CRU (black), GPCC (light blue), and UDEL (pink) gridded observations. Lines indicate the linear trends while curves are the 5-point moving averages.

Table 2 Mean annual rainfall (mm/year), standard deviation (% of mean annual rainfall), and precipitation trends (mm/decade) for the period 1901–2014 averaged over nine sub-regions and based on CRU, GPCC, and UDEL gridded observations. Statistically significant trends (95% confidence) are highlighted in italics
Table 3 Same as Table 2 for four sub-periods defined as P1: 1901–1930; P2: 1931–1960; P3: 1961–1990; P4: 1991–2014. Statistically significant trends (95% confidence) are highlighted in italics

Over the last century, no significant long-term trends are observed for the Iberian Peninsula (Fig. 3a). In general, the three datasets are in good agreement, with the whole period’s annual precipitation ranging between 655 and 695 mm/year and the CRU dataset being the driest. The interannual variation is about 13–14% of these rainfall amounts. Interestingly, the first and third sub-periods are characterized by negative precipitation trends, while the second and fourth sub-periods are characterized by a precipitation increase. However, the negative trends of the third period are the most pronounced (from − 26 to − 38 mm/decade), and they are statistically significant for the UDEL dataset. Another interesting finding is the low interannual variability observed over Iberia in the beginning of the twentieth century (Table 3). Precipitation over the Iberian Peninsula is known to be controlled by large-scale teleconnection patterns, mainly the North Atlantic Oscillation (NAO). During the driest sub-periods, positive NAO phases prevail (not shown). This strong link is due to the control exerted by NAO on the branch of the storm track affecting winter precipitation in the Mediterranean, mainly in its western part (Lionello et al. 2006).

Precipitation over the drier Maghreb region shows a general decreasing trend over the past 115 years (Fig. 3b). This trend is more pronounced (≈ − 3.5 mm/decade) and statistically significant for the UDEL and GPCC datasets. Interestingly, the latter dataset is found to be much wetter than CRU and UDEL (about 40 mm/year more). Noteworthy, CRU interannual variability is found to be much smaller. As in the case of Iberia, the sub-period analysis (Table 3) indicates declining rainfall trends over Maghreb for the first and third sub-periods, while in general positive trends are observed in periods 2 and 4.

An overall drying is also observed over the region of Sahel (Fig. 3c). As seen in Table 2, this trend is statistically significant for CRU and GPCC datasets. On average, this zonally extended region receives about 250–280 mm of rainfall per year. In agreement with the previous sub-regions, CRU is again found to be drier. Particularly for the first sub-period, CRU annual rainfall is about 50–60 mm less, while the linear trend has an opposite sign from the GPCC and UDEL. Sahel experienced the strongest drying (≈ − 35 mm/decade) over the period 1961–1990, with this negative trend being significant for all three datasets. During that period, interannual rainfall variability is found to be higher. For the most recent sub-period, the three datasets suggest positive trends and a precipitation recovery. This shift from the dry conditions during the 1970s and 1980s to wetter conditions after the 1990s is in agreement with other studies and is likely related to the observed differential heating between northern and southern hemisphere Atlantic sea surface temperatures that apparently control the regional cyclonic circulation (see Munemoto and Tachibana 2012 and references therein).

The Ethiopian Highlands (Fig. 3d) is the sub-region that receives the most rainfall (≈ 1000 mm/year). A general decline of precipitation is observed during the last century, with statistically significant trends reported only by the GPCC dataset (Table 2). On the contrary, UDEL suggests a less significant trend. As Table 3 indicates, these trends are stronger during the third sub-period (1961–1990). Interestingly, the interannual variability of rainfall is also reduced from 10–12% of the mean in the beginning of the century to 6–8% during the most recent sub-period.

For the well-monitored Balkan region (Fig. 3e), all three datasets are in good agreement. According to Table 2, the annual mean precipitation is about 710 to 740 mm per year, while the interannual variability is ± 12% of this amount. An overall decreasing trend is observed over the last century, and this trend is mainly driven by a strong rainfall reduction of more than 40 mm/decade during the period 1961–1990. This reduction is found to be statistically significant according to the CRU and UDEL datasets. Interestingly, precipitation amounts in the Balkan sub-domain appear to be recovering over the last 30 years. Rainfall interannual variability is also found to have increased during the more recent decades. These recent trends are discussed in more detail in the following section.

During the last 115 years, precipitation over Anatolia (Fig. 3f) has shown a decline that is pronounced only over the second and third sub-periods. GPCC is found to be much wetter than the other two datasets, while it also shows the most pronounced negative trend during the overall dry 30-year period of 1961–1990. Regarding the sign of the trends, the three datasets are more or less in agreement, with the exception of the fourth sub-period, where UDEL suggests a drying while the other two datasets suggest an increasing precipitation trend. As in the case of the Iberian Peninsula, the century-long interannual variability over the Balkans and Anatolia is mainly controlled by fluctuations of NAO but also by other teleconnection patterns such as the North Sea - Caspian Pattern (Kutiel et al. 2002; Lionello et al. 2006; Türkeş et al. 2009).

The Levant region is under a much drier rainfall regime (≈ 250 mm/year). Precipitation shows a negative trend which is statistically significant for all three datasets but is much stronger for GPCC and UDEL. Interestingly, this is the sub-region with the highest interannual variability in terms of percentage of mean precipitation, while it is one of the few regions where clear negative trends prevail in all sub-periods. For large parts of the Levant, these observed long-term declining trends in precipitation are likely a consequence of human interference with the climate system (Kelley et al. 2015).

Similar negative trends, more pronounced over the recent sub-period, are found for the Arabian Peninsula. On average for the twentieth century, annual rainfall is slightly higher than 100 mm. The CRU dataset is found to be somewhat drier, while it also shows the smallest interannual variability and linear trend.

For most of the last century, rainfall over Iran did not show significant trends. This is consistent with Kousari and Asadi Zarch (2011), who, however, used 55 years of station records. CRU data are found to be much drier compared to UDEL and GPCC. As seen in Table 3, discrepancies between the three datasets are most pronounced when the analysis is performed for the sub-periods. Indicatively, over the most recent one (1991–2014), CRU and GPCC indicate a strong declining trend for rainfall (more than − 25 mm/decade), which is not so evident in the UDEL data.

3.2 Recent precipitation trends

The five tested datasets are found to have substantial differences in the mean totals per year for their common period of coverage (1988–2014). These differences range from about 40 mm/year for the dry Maghreb region to 340 mm/year for the wettest sub-region, the Ethiopian Highlands. According to Fig. 4 and Table 4, CPC is generally found to be the driest dataset for most sub-regions. Particularly for the Ethiopian Highlands and the period 2000–2012, CPC is unrealistically dry compared to all the other observations. CPC is also a prominent outlier for Iran, particularly for the period 1997–2001. Thus, for these two sub-regions, the use of this dataset is not recommended. This feature might be indicative of deficiencies in the construction of this dataset, likely related to long periods of missing raw meteorological data in combination with the complex orography over Iran and the Ethiopian Highlands. On the other hand, GPCC has a strong wet tendency in several sub-regions, including Maghreb, Sahel, and Levant.

Fig. 4

Mean annual precipitation for the period 1988–“Today,” averaged over nine sub-regions and based on CRU (black), CHIRPS (purple), CPC (gray), GPCC (light blue), and UDEL (pink) gridded observations. Lines indicate the linear trends while curves are the 5-point moving averages

Table 4 Rainfall annual means, interannual variability (standard deviation: STD), and linear trends for the period 1988–2014. Statistically significant trends (95% confidence) are highlighted in italics

Interestingly, the interannual variability of rainfall (expressed as a percentage of mean rainfall) is higher for the dry regions of Middle East and Iran, while in general discrepancies between the tested datasets are also evident in terms of their standard deviation (Table 4).

The recent linear precipitation trends for each sub-region are summarized in the bottom section of Table 4. Robust conclusions can be drawn only for a few sub-regions, considering the profound discrepancies between the means in the observational datasets. Most prominent and evident in all datasets is a positive trend for the Balkans region (70–90 mm/decade). This trend is statistically significant according to all gridded datasets and could be an indication of a recovery of rainfall levels to normal values after a very strong drying in the period 1975–1991. Two of the three daily datasets agree on similar positive trends for the adjacent region of Anatolia. An agreement between observations is also found regarding the drying over the Arabian Peninsula. These negative trends of precipitation (about −3 to −15 mm/decade) are statistically significant for the UDEL and CHIRPS datasets. An absence of significant trends during the period 1988–2014 is also consistently found for the sub-regions of the Iberian Peninsula and Maghreb. Unfortunately, for regions such as Sahel, Ethiopian Highlands, Levant, and Iran, the direction of the trends is not clear, since some of the datasets disagree not only in the magnitude but also in the sign. Particularly for Sahel, a number of studies suggest positive precipitation trends after the 1990s (see Munemoto and Tachibana 2012 and references therein); therefore, the CPC and GPCC datasets are likely outliers for this region, since they indicate slightly negative trends.
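The trend figures quoted above are slopes expressed per decade, and the curves in Fig. 4 are 5-point moving averages. The following is a minimal sketch of both calculations, assuming an ordinary least-squares fit against the calendar year and a centred averaging window; the fitting method, the function names, and the synthetic series are illustrative assumptions, not taken from the study. Significance testing is deliberately left out.

```python
def linear_trend_per_decade(years, values):
    """Ordinary least-squares slope of values against years, in units per decade."""
    n = len(years)
    if n != len(values) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return 10.0 * sxy / sxx  # slope per year, scaled to a decade


def moving_average(values, window=5):
    """Centred moving average (odd window); None where the window is incomplete."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            smoothed.append(None)
        else:
            smoothed.append(sum(values[i - half:i + half + 1]) / window)
    return smoothed


# Synthetic regional series (hypothetical): 600 mm in 1988, rising by 8 mm each year.
years = list(range(1988, 2015))
rain = [600.0 + 8.0 * (year - 1988) for year in years]

trend = linear_trend_per_decade(years, rain)  # mm/decade
smooth = moving_average(rain, window=5)
```

Applying the same function to each dataset's regional mean series gives one slope per dataset, which makes disagreements in sign or magnitude directly comparable.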

3.3 Extreme precipitation indices

Six drought and extreme precipitation indices that were calculated for the three datasets that provide daily values (CHIRPS, CPC, GPCC) are presented in Table 5, in addition to their standard deviation and linear trend values. The common period for these daily datasets is 1988–2016. Overall, this analysis suggests some discrepancies between these three observational datasets, while the identification of robust signals in the trends of the selected indices was not possible.

Table 5 Extreme precipitation indices, interannual variability (standard deviation: STD), and linear trends (index units per decade) for the period 1988–2014. Significant trends (95% confidence) are highlighted in italics

As expected, the maximum number of CDD per year is much larger (more than 200 days per year) over the arid sub-regions of Maghreb, Levant, and Arabian Peninsula. For six out of the nine sub-regions, CHIRPS dataset is found to be more conservative than GPCC and CPC, suggesting a shorter length of extreme dry spells. Moreover, CHIRPS in general exhibits weaker interannual variability in the value of this index. Negative trends of CDD are found over the Balkans, Anatolia, and Iran. Nevertheless, the trend is significant only for the latter location and CHIRPS dataset. On the other hand, positive, but non-significant, CDD trends are found over Levant, Sahel, and Ethiopian Highlands. For Ethiopia in particular, results are in agreement with other studies such as Viste et al. (2013) who suggest a more frequent occurrence of droughts over the last decades. An increasing length of dry spells over the Levant is also suggested in Donat et al. (2014).

The annual maximum number of CWD is found to be larger over the high-elevation region of the Ethiopian Highlands (up to 10–20 days/year). For the rest of the study region, this number ranges between 2 and 9 days per year. GPCC consistently indicates larger values for this particular index. A prominent example is the case of the Ethiopian Highlands, where GPCC suggests a maximum wet spell length per year that is twice that of the other two daily datasets. For most cases, the interannual variability of CWD is small (less than 1 day per year). Consistent significant positive trends are found only over Iran and the Balkans and Anatolia region for the CHIRPS and CPC datasets, while the significant negative trend identified over the Ethiopian Highlands for GPCC is likely an outlier.
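Both spell-length indices reduce to finding the longest run of days on one side of a wet-day threshold. The sketch below assumes the common convention of a 1 mm/day threshold (dry day below it, wet day at or above it); the threshold, the function names, and the sample values are assumptions for illustration. Spells that straddle the boundary between two years are not handled here.

```python
WET_DAY_MM = 1.0  # assumed wet-day threshold in mm/day


def longest_run(daily_mm, is_member):
    """Length of the longest run of consecutive days for which is_member(value) is true."""
    best = current = 0
    for value in daily_mm:
        current = current + 1 if is_member(value) else 0
        best = max(best, current)
    return best


def cdd(daily_mm):
    """Maximum number of consecutive dry days (below the wet-day threshold)."""
    return longest_run(daily_mm, lambda v: v < WET_DAY_MM)


def cwd(daily_mm):
    """Maximum number of consecutive wet days (at or above the wet-day threshold)."""
    return longest_run(daily_mm, lambda v: v >= WET_DAY_MM)


# Hypothetical daily totals in mm for a short record.
sample = [0.0, 0.2, 3.1, 5.0, 0.0, 0.0, 0.0, 0.5, 12.4, 1.0, 2.2, 0.4]
longest_dry = cdd(sample)  # days
longest_wet = cwd(sample)  # days
```

Because both indices depend on a hard threshold, datasets that differ in how they represent very light rainfall can differ substantially in CDD and CWD even when their annual totals are similar.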

The RR1 per year ranges from less than ten for the hyper-arid Arabian Peninsula to more than a hundred for the wetter Ethiopian Highlands. Interannual variability of RR1 is higher over the wetter sub-regions. As was the case for CWD, the GPCC precipitation dataset consistently suggests a higher number of individual rainy days per year compared to CHIRPS and CPC. This pattern is more evident for the wetter sub-regions such as Ethiopian Highlands, Iberian Peninsula, Balkans, and Anatolia. In general, the linear trends for this index are not consistent between the three datasets, with the exception of the significant increasing trends over the Balkans and Anatolia. An example of such inconsistency is the region of Iran, where CHIRPS and CPC exhibit significant positive trends while statistically significant negative trends in the annual count of rainy days are manifested in GPCC rainfall data.

The R20 is found larger for the wetter sub-regions of Iberia, Balkans, and Ethiopian Highlands. This index is generally higher for CHIRPS observations, while this particular dataset shows higher interannual variability, at least for the aforementioned wet regions. Trends of R20 are either very small or not consistent with the exception of the Balkans, where up to one additional heavy precipitation day per decade is suggested from CPC and GPCC observations. A number of observation-based studies for the Balkan region also suggest an increasing trend in the frequency of extreme precipitation days after the 1990s (Bocheva et al. 2009; Nastos and Zerefos 2008).

Some differences between the gridded observations are also evident regarding the annual maximum 5-day rainfall amounts (R5D). These discrepancies range between 10 and 15 mm for the wetter sub-regions. For most sub-regions, CHIRPS is the dataset that shows the highest values. Trends for R5D during the period 1988–2016 are found to be very noisy. For example, over Iran CHIRPS data suggest a statistically significant decrease of this index, while CPC suggests a significant increase. Similar inconsistencies are found over other sub-regions with the exception of the Balkans where two datasets (CPC, GPCC) suggest a significant increase of this extreme precipitation index.

The SDII ranges from 4 to 10 mm per rainy day. For the wetter regions, CHIRPS data tend to overestimate this index compared to the other two daily datasets. For the drier parts of the study region, the differences between the observational sources are less pronounced. The interannual variability of SDII is generally small, and the three datasets are found in general agreement, which is not the case for the linear trends. The most indicative examples are the statistically significant trends of opposite sign between CHIRPS and CPC SDII values over the Balkans and Iran.
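The four amount-based indices discussed above can be computed from one year of daily totals as sketched below. The thresholds (1 mm for a rainy day, 20 mm for a heavy precipitation day) follow widely used conventions and are assumptions here, as are the function names and the sample values.

```python
def rr1(daily_mm, wet=1.0):
    """Number of rainy days (daily total at or above the wet-day threshold)."""
    return sum(1 for v in daily_mm if v >= wet)


def r20(daily_mm, heavy=20.0):
    """Number of heavy precipitation days (daily total at or above 20 mm)."""
    return sum(1 for v in daily_mm if v >= heavy)


def r5d(daily_mm):
    """Maximum rainfall accumulated over any 5 consecutive days."""
    if len(daily_mm) < 5:
        raise ValueError("need at least 5 days")
    window = sum(daily_mm[:5])
    best = window
    for i in range(5, len(daily_mm)):
        window += daily_mm[i] - daily_mm[i - 5]  # slide the window by one day
        best = max(best, window)
    return best


def sdii(daily_mm, wet=1.0):
    """Simple daily intensity: mean rainfall on rainy days (mm per rainy day)."""
    wet_days = [v for v in daily_mm if v >= wet]
    return sum(wet_days) / len(wet_days) if wet_days else 0.0


# Hypothetical daily totals in mm.
sample = [0.0, 2.0, 25.0, 0.5, 0.0, 30.0, 4.0, 0.0, 0.0, 1.0]
indices = {
    "RR1": rr1(sample),
    "R20": r20(sample),
    "R5D": r5d(sample),
    "SDII": sdii(sample),
}
```

Note that SDII is a ratio of two quantities that both depend on the wet-day threshold, so a dataset with many light-rain days (a high RR1) will tend to show a lower SDII for the same annual total.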

3.4 Observational uncertainty

3.4.1 Range of multi-year climatology

In this section, the range of the selected precipitation datasets and the distance from their ensemble mean are discussed. Their ensemble mean does not necessarily represent reality throughout the MENA domain; however, it is used as a representative point of reference. Results are presented in terms of multi-year annual climatologies calculated for a common period. This type of multi-year average annual climatology is often used as the reference for climate model validation. Figure 5a depicts the ensemble mean annual rainfall based on all five datasets and for the period 1988–2014. The rest of the panels in the same figure present the differences of each precipitation dataset from the ensemble mean in order to identify potential outliers or “problematic” regions. The same discrepancies presented in percentages are depicted in Fig. 6, since MENA encompasses large regions that receive very limited precipitation, a fact that makes comparison with absolute values hard. CRU is in general agreement with the ensemble mean.
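The comparison against the ensemble mean can be sketched as follows, assuming all climatologies have already been brought onto a common grid and flattened to lists. The function names and the three grid-point values are hypothetical and chosen only to show why absolute and percentage differences tell different stories in wet and dry areas.

```python
def ensemble_mean(fields):
    """Grid-point mean across datasets; fields maps dataset name -> list of values."""
    columns = zip(*fields.values())
    return [sum(column) / len(fields) for column in columns]


def differences(fields, reference):
    """Absolute (mm/year) and relative (%) departures of each dataset from the reference."""
    absolute, percent = {}, {}
    for name, values in fields.items():
        absolute[name] = [v - r for v, r in zip(values, reference)]
        # Relative differences are undefined where the reference is zero.
        percent[name] = [100.0 * (v - r) / r if r > 0 else None
                         for v, r in zip(values, reference)]
    return absolute, percent


# Hypothetical climatologies (mm/year) at a wet, a semi-arid, and a hyper-arid grid point.
clim = {
    "CRU": [1000.0, 300.0, 20.0],
    "CHIRPS": [1100.0, 320.0, 30.0],
    "CPC": [700.0, 250.0, 10.0],
    "GPCC": [1150.0, 330.0, 25.0],
    "UDEL": [1050.0, 300.0, 15.0],
}

mean = ensemble_mean(clim)
absolute, percent = differences(clim, mean)
```

In this toy case the driest dataset departs by 300 mm/year (30%) at the wet point but by only 10 mm/year (50%) at the hyper-arid point, which is the reason both views are needed.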

Fig. 5

Ensemble observed mean annual precipitation for the period 1988–2014 (a) and differences of each particular dataset from the ensemble mean (b–f)

Fig. 6

Same as Fig. 5 with the differences expressed in percentages

The most pronounced differences for CRU are found in west and central tropical Africa and in the Ethiopian Highlands (Figs. 5b and 6b). Parts of these regions are found to be either drier or wetter than the ensemble mean. Furthermore, this particular dataset is found to be relatively drier than the ensemble mean in the western parts of the Sahara Desert and relatively wetter in the eastern parts of the latter and in the south-west Arabian Peninsula. Nevertheless, the annual precipitation in these regions ranges from near zero to less than a hundred millimeters per year. CRU is also found to be wetter than the ensemble mean in the north-east part of the MENA domain and the westernmost part of the Himalayan range. Similarly, CHIRPS precipitation (Figs. 5c and 6c) over North Africa and the Arabian Peninsula deviates from the ensemble mean with discrepancies high in percentages but low in precipitation amounts. CHIRPS is also found to be wetter in most of tropical Africa but relatively drier over the Ethiopian Highlands. Large mean annual rainfall differences are found between the ensemble mean and the CPC gridded precipitation (Figs. 5d and 6d). For most of the MENA domain, this dataset is much drier than the ensemble mean. The discrepancies are most significant over the tropical part of the domain, while they are also evident in southern Europe and western Asia. This distance from the ensemble mean is an indication of problematic representation of precipitation over the aforementioned regions that was also evident from the time series analysis presented earlier. The coarser resolution GPCC dataset is identified to be wetter than the ensemble mean, particularly over the tropics (Figs. 5e and 6e). Exceptions are the eastern Sahara and the southern Arabian Peninsula. Finally, the University of Delaware precipitation climatology (Figs. 5f and 6f) is found to be in general agreement with the ensemble mean with some discrepancies observed mainly in the tropical part of the MENA region, while this particular dataset is found to be wetter than the ensemble mean for most of the dry parts of the study domain including North Africa and the Arabian Peninsula. The fact that all datasets, besides CPC, are found to be wetter than their ensemble mean over the tropical part of the study region is explained by the fact that this ensemble also includes the very dry CPC dataset.

The range of the 1988–2014 climatologies derived from the five gridded observations is shown in the two panels of Fig. 7. On the left it is presented in terms of precipitation amounts, while on the right panel as a percentage of the annual total. Although the averaging over long periods should have smoothed any differences, discrepancies between the observational datasets are evident throughout the extent of the Mediterranean/MENA region. Hot spots of differences are mainly located near the tropics, where the higher rainfall amounts are observed. Over this particular part of the study domain, the range of the annual totals between the five datasets is found to be greater than 700–800 mm/year. Locally, this range reaches 2000 mm per year over Guinea, Sierra Leone, and Liberia in Western Africa. Interestingly, as the range in terms of percentages suggests, in the drier Sahara and Arabian Peninsula, the range of observational uncertainty is almost equal to the annual precipitation totals (Fig. 7b). Despite the better monitoring network over southern Europe, the maximum distance between the tested datasets is found to be at least 200 mm per year on average, which is about 20% of the annual rainfall.
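The two panels of Fig. 7 correspond to a simple per-grid-point calculation, sketched here under two assumptions: the range is the wettest minus the driest dataset, and the percentage is taken relative to the ensemble mean annual total. The function name and the values (one grid point resembling southern Europe, one resembling a desert) are hypothetical.

```python
def climatology_range(fields):
    """Per grid point: max minus min across datasets, in mm/year and as % of the ensemble mean."""
    columns = list(zip(*fields.values()))
    spread = [max(column) - min(column) for column in columns]
    mean = [sum(column) / len(column) for column in columns]
    percent = [100.0 * s / m if m > 0 else None for s, m in zip(spread, mean)]
    return spread, percent


# Hypothetical climatologies (mm/year): first point moderately wet, second point arid.
clim = {
    "CRU": [1000.0, 8.0],
    "CHIRPS": [1080.0, 15.0],
    "CPC": [900.0, 5.0],
    "GPCC": [1100.0, 12.0],
    "UDEL": [920.0, 10.0],
}

spread_mm, spread_pct = climatology_range(clim)
```

With these numbers the wetter point has a 200 mm/year range (20% of the mean), while the arid point has only a 10 mm/year range that nevertheless equals 100% of its mean, mirroring the contrast described in the text.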

Fig. 7

Mean annual precipitation range between five gridded precipitation datasets (CRU, CHIRPS, CPC, GPCC, and UDEL) expressed in absolute rainfall amounts (left panel) and percentages of mean annual precipitation (right panel). Results are based on the period 1988–2014

3.4.2 Annual cycle of precipitation

The range of the monthly climatologies between the five observational datasets, averaged over the sub-regions of special interest and the common period 1988–2014, is presented in Fig. 8. For Iberia, precipitation on average shows a bimodal distribution with one maximum in boreal autumn and a second maximum in the late spring season (Fig. 8a). The driest month is July with an average precipitation of about 20 mm per month. Regarding outliers, CPC is found to be one, since it is consistently drier than the rest. This is more evident during the first half of the year. The rest of the datasets seem to be in closer agreement with the exception of spring and autumn months, where the two maxima are observed. The Maghreb region shows a typical Mediterranean climate rainfall distribution with a very dry summer season and a relatively wet winter; however, precipitation does not exceed 10–15 mm/month on average (Fig. 8b). The GPCC dataset is wetter throughout the year, while the range for the rainy winter months is about 5 mm/month or about 30% of the monthly total. The Sahel region (Fig. 8c) receives most of its rainfall between June and September, a fact connected with the north-south displacement of the Intertropical Convergence Zone (ITCZ) throughout the year. During this wet period, the five datasets are found to have substantial differences. For example, for this sub-region and the months of June, July, August, and September, the range between the wettest (GPCC) and the driest dataset (CHIRPS) is more than 70 mm, which is roughly one fourth of Sahel’s annual precipitation. On the other hand, all datasets seem to agree for the extended dry period of the year. Precipitation in the Ethiopian Highlands (Fig. 8d) shows a bimodal evolution with an absolute maximum in July–August (≈ 160 mm/month) and a secondary peak in May (≈ 100 mm/month).
Most datasets are in agreement throughout the year with some deviations of about 10–20 mm/month observed during the secondary peak of spring months. The exception is the CPC gridded dataset, which is consistently found to be much drier. On average, the Balkan Peninsula (Fig. 8e) nicely follows a Mediterranean-type rainfall distribution with dry summers and a maximum in the late boreal autumn and early winter seasons (November–January). In particular, during the December rainfall peak, the average monthly precipitation sum is around 100 mm. CPC is again found to be a dry outlier, while the overall spread ranges from more than 20 mm in December to about 10 mm in September. Very similar is the intra-annual rainfall distribution for the adjacent sub-regions of Anatolia (Fig. 8f) and Levant (Fig. 8g). The latter is however much drier, and the 10 mm/month divergence, observed in the winter months, is a more significant portion of the annual total. For both sub-regions, December and January are typically the wettest months, while during summer months, precipitation is practically near-zero over the Levant. For these two regions, the GPCC dataset is found to be wetter during most of the year, including the rainy season, while CPC is again found to be on the dry end. The Arabian Peninsula (Fig. 8h) is by far drier than the rest of the investigated sub-regions, and rainfall is controlled by the displacement of the ITCZ as indicated by the two peaks. The primary maximum usually occurs during the northern-hemisphere spring months, while the secondary peak is observed in late summer. A third peak during the November–December months is probably due to Active Red Sea Trough events (de Vries et al. 2016). Precipitation amount during these periods is about 10 mm/month. The spread between the observational datasets is found to be high throughout the year; nevertheless, the very low absolute amounts should also be considered. For all months, the CPC dataset is found to be the dry outlier.
Finally, Iran shows a precipitation maximum in boreal winter and early spring months (Fig. 8i). An extended very dry period typically occurs between June and September. Besides the dry CPC, all datasets are in good agreement for the spring and summer months, while deviations of about 10 mm/month are observed for the rest of the year. A wet outlier for the region of Iran and particularly for the second half of the year is the University of Delaware dataset.
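The annual-cycle comparison rests on two steps: averaging each dataset's monthly series into 12 calendar-month means, and then taking the spread between datasets for each month. A minimal sketch follows, assuming series that start in January and cover whole years; the function names and the two-year synthetic series are illustrative assumptions.

```python
def monthly_climatology(monthly_series):
    """Average monthly totals (January first, whole years) into 12 calendar-month means."""
    if not monthly_series or len(monthly_series) % 12 != 0:
        raise ValueError("series must cover a whole number of years")
    n_years = len(monthly_series) // 12
    return [sum(monthly_series[month::12]) / n_years for month in range(12)]


def monthly_spread(climatologies):
    """Per calendar month: wettest minus driest dataset (mm/month)."""
    return [max(values) - min(values) for values in zip(*climatologies.values())]


# Synthetic two-year series (mm/month) with a Mediterranean-type cycle (hypothetical values).
dataset_a = [100.0, 80.0, 60.0, 40.0, 20.0, 5.0, 0.0, 0.0, 10.0, 50.0, 80.0, 100.0,
             120.0, 60.0, 60.0, 40.0, 20.0, 5.0, 0.0, 0.0, 10.0, 50.0, 80.0, 120.0]
dataset_b = [0.5 * value for value in dataset_a]  # a uniformly drier second dataset

climatologies = {
    "A": monthly_climatology(dataset_a),
    "B": monthly_climatology(dataset_b),
}
spread = monthly_spread(climatologies)
```

The spread is largest in the wet months and vanishes in the dry season here, which is the same qualitative pattern reported for several sub-regions.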

Fig. 8

Monthly precipitation distribution averaged for the period 1988–2014 over nine sub-domains

3.4.3 Probability density functions

The probability density functions (PDFs) of monthly rainfall are presented in Fig. 9. In calculating these PDFs, all observed data points within each sub-region’s extent were used for each dataset. Rather than assessing particular extreme events, and since only monthly data were available for all the datasets, this type of plot is useful for identifying similarities and discrepancies during the very extreme wet months or over the rainiest grid points of each sub-region. Furthermore, it can also be used for inter-comparing the distributions of the “near-normal” months of high probability of occurrence. For most cases, the monthly precipitation is described by a positively skewed lognormal type of distribution. Since for most of the MENA region months of low rainfall are commonplace, kurtosis of the distribution is higher over near-zero values (Fig. 9). In order to facilitate the inter-comparison between the datasets over these near-zero rainfall values, the PDFs of Fig. 9 are zoomed in for the dry end of the monthly rainfall distributions and are presented in Fig. 10.
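How the PDFs were estimated is not specified in this section, so the sketch below uses a plain normalized histogram as a stand-in: all monthly values from every grid point of a sub-region are pooled and binned. The function name, the bin settings, and the sample values are assumptions for illustration only.

```python
def histogram_density(values, bin_width, n_bins):
    """Normalized histogram of non-negative values.

    Bins are [0, w), [w, 2w), ...; the last bin also collects everything above it,
    so its value is only a nominal density. Densities times bin_width sum to 1.
    """
    if not values:
        raise ValueError("no data")
    counts = [0] * n_bins
    for value in values:
        if value < 0:
            raise ValueError("precipitation cannot be negative")
        index = min(int(value // bin_width), n_bins - 1)
        counts[index] += 1
    return [count / (len(values) * bin_width) for count in counts]


# Hypothetical pooled monthly totals (mm/month) from a dry sub-region.
pooled = [0.0, 0.0, 1.0, 2.0, 4.0, 12.0, 15.0, 33.0, 70.0, 240.0]
density = histogram_density(pooled, bin_width=10.0, n_bins=5)
```

Even in this small sample the mass concentrates in the first bin while a single very wet month stretches the tail, which is the positively skewed shape described above.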

Fig. 9

Probability density functions of monthly precipitation for the period 1988–2014 over nine sub-domains

Fig. 10

Probability density functions of monthly precipitation for the period 1988–2014 over nine sub-domains, zoomed in for the dry end of the distributions

Some general conclusions regarding monthly rainfall and the differences/similarities between the tested datasets can be drawn. Regardless of their significant horizontal resolution difference (0.25 vs. 1°), the CHIRPS and GPCC datasets (purple and light blue curves, respectively) are found to have a tendency for more extreme wet values compared to the other datasets. This is the case for the vast majority of sub-regions. On the other hand, CPC has a tendency to overestimate the probabilities of the very dry months. Another useful observation is the good agreement in the distribution of monthly rainfall between the CRU (dashed black curve) and UDEL (pink curve) datasets for most sub-regions.

As expected, for the most arid sub-regions such as Maghreb and Arabian Peninsula, near-zero precipitation months have a very high probability of occurrence. Particularly for Maghreb (Fig. 10b), while all datasets suggest higher probabilities for near-zero monthly rainfall amounts, the CHIRPS distribution is shifted toward wetter conditions. This is also the case for the wet end of the distribution, where rainfall during the wettest months is near 1400 mm for CHIRPS, almost double the amount indicated by GPCC and about three times more than the other datasets (Fig. 9b). These few grid points of extreme monthly precipitation reported from CHIRPS are located over the Canary Islands in a very limited number of cases and could be potential outliers. The rest of the datasets are not fine enough to resolve this group of islands. For the Arabian Peninsula, there are some differences in the dry part of the distributions (Fig. 10h); however, the rainfall amounts can be considered negligible. For the rare extreme wet months and grid points, the gridded observations suggest an absolute maximum ranging from 300 to 450 mm/month, with the CRU dataset being the wettest. The PDF of Sahel is very similar; nevertheless, the near-zero precipitation months have a lower probability of occurrence, while the datasets are found to be in agreement (Fig. 9c and Fig. 10c). This is also the case for the semi-arid regions of Levant and Iran (Fig. 9g, i and Fig. 10g, i). For the sub-regions of Iberian Peninsula, Balkans, and Anatolia, wetter conditions (50–200 mm/month) have a higher probability of occurrence compared to the previously discussed cases (Fig. 10a–f). Some discrepancies between the datasets can be observed; CPC probabilities are higher toward drier conditions, while CHIRPS precipitation consistently shows higher absolute precipitation maxima. Ethiopian Highlands is the sub-region that receives the highest amounts of rainfall. As the PDFs of Fig. 9d and Fig. 10d indicate, CPC is again found to overestimate the occurrence probability for dry months. On the other hand, CRU is found to overestimate extreme wet conditions with an absolute maximum of 2000 mm/month, much higher than the rest of the datasets. Nevertheless, this outlier is the case for only one particular month and a very limited number of grid points.

3.4.4 Application of statistical tests

Results of the application of Kruskal–Wallis and Dunn’s tests are presented in this section. They were both applied on the monthly precipitation time series on a grid point basis. In more detail, the p values of the Kruskal–Wallis test applied on the monthly precipitation time series of five gridded datasets are presented in Fig. 11a. The very low p values (< 0.05), highlighted in dark colors, indicate regions where monthly precipitation distribution of at least one, out of the five tested datasets, differs from the rest. This is the case for about 85% of the land grid points covering the broader MENA region, while this difference exceeds the 95% significance level. The remaining 15% of grid points in general agreement is mostly located over the Sahel region and in a latitudinal zone between 10 and 20° north.
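For a single grid point, the Kruskal–Wallis test ranks the pooled monthly values of all datasets and asks whether the rank sums differ more than chance would allow. The sketch below is a textbook implementation with tie correction and a chi-square approximation for the p value, written with the standard library only; it is not the code used in the study, and the function names and sample groups are assumptions.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, tied values sharing their mean rank; also returns tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total


def kruskal_wallis(groups):
    """Return (H, p) for a list of samples, with tie correction."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks, ties = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in ties) / (n ** 3 - n)
    if correction == 0:  # every value identical: no evidence of a difference
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


# Three clearly separated synthetic samples (hypothetical monthly totals, mm).
h_stat, p_value = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Repeating the call for every land grid point and counting the share of p values below 0.05 yields the kind of percentage reported in the text.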

Fig. 11

p values of the Kruskal–Wallis test for all datasets (a) and for all datasets excluding CPC (b). Dark, hatched areas where p value is less than 0.05 indicate statistically significant differences in the 95% confidence level

Since the Kruskal–Wallis test does not provide information on which dataset or datasets are the ones that differ, post hoc Dunn’s test is applied for a pairwise comparison of the distribution of monthly rainfall between all datasets. Results of this post hoc test, for all possible combinations of pairwise comparison, are summarized in Table 6 and are also depicted in the maps of Fig. 12.
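Dunn's test reuses the pooled ranks and compares the mean rank of each pair of datasets through a z statistic. The following is a sketch of the standard formulation with tie correction and two-sided, unadjusted p values; whether a multiple-comparison adjustment was applied in the study is not stated here, so none is included. Function names, group labels, and sample values are hypothetical.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Ranks starting at 1, tied values sharing their mean rank; also returns tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def dunn_test(groups):
    """groups maps name -> sample. Returns {(a, b): (z, p)} with unadjusted two-sided p."""
    names = list(groups)
    pooled = [value for name in names for value in groups[name]]
    n = len(pooled)
    ranks, ties = average_ranks(pooled)
    mean_rank, start = {}, 0
    for name in names:
        size = len(groups[name])
        mean_rank[name] = sum(ranks[start:start + size]) / size
        start += size
    # Variance of the ranks, reduced when ties are present.
    base_var = n * (n + 1) / 12.0 - sum(t ** 3 - t for t in ties) / (12.0 * (n - 1))
    results = {}
    for a, b in combinations(names, 2):
        se = math.sqrt(base_var * (1.0 / len(groups[a]) + 1.0 / len(groups[b])))
        if se == 0:  # all values tied
            results[(a, b)] = (0.0, 1.0)
            continue
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
        results[(a, b)] = (z, p)
    return results


# Synthetic samples at one grid point (hypothetical monthly totals, mm).
pairs = dunn_test({"dry": [1, 2, 3], "mid": [4, 5, 6], "wet": [7, 8, 9]})
```

In this toy case only the pair of extremes differs significantly at the 0.05 level, which shows how the pairwise step identifies which datasets drive a significant Kruskal–Wallis result.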

Table 6 Percentage of grid points that have statistically significant differences (95% confidence level) in the distribution of monthly precipitation according to the pairwise Dunn’s test

Fig. 12

p values of the pairwise comparison Dunn’s test. Dark, hatched areas where p value is less than 0.05 indicate statistically significant differences in the 95% confidence level

In general, differences among the five datasets are strongest over the dry parts of the region. As suggested from most of the analysis of the previous sections, the CPC dataset is an outlier for the region. This is also confirmed by the results of the Dunn’s test. The pairwise comparison of the monthly precipitation time series indicates significant differences between CPC and each of the CRU, CHIRPS, and UDEL datasets for about 70% of the land grid points that had available rainfall data. Extended regions of consistent and statistically significant differences from CPC are found in the wet tropics, Central Africa, the arid Arabian Peninsula and parts of North Africa (Fig. 12b–i). On the other hand, despite their horizontal resolution difference, CHIRPS and GPCC show the strongest agreement (Fig. 12e). This is evident over most of the MENA region with the exception of the drier North Africa and Arabian Peninsula regions. The widely used CRU data are in agreement with CHIRPS and GPCC for about half of the total grid points (Fig. 12a, d). Nevertheless, significant discrepancies in the precipitation distribution between the aforementioned datasets are also found. The latter are mainly located over parts of North Africa and the Arabian Peninsula.

Interestingly, when the outlier CPC dataset is excluded from the application of the Kruskal–Wallis test, the spatial extent of the disagreement is reduced from 85% to less than 70% of all grid points of the MENA region (Fig. 11b).

4 Conclusions and discussion

Century-long and more recent precipitation trends were calculated taking into account a number of available state-of-the-art gridded datasets. The long-range trends indicate an overall drying of the MENA region since the beginning of the twentieth century, with significant trends mainly over the Maghreb, Sahel, Levant, and Arabian Peninsula sub-domains. Particularly important is the strong and significant drying over the period 1961–1990 that most of the region of interest has experienced. The impact of global warming on precipitation over the investigated parts of MENA is apparently not as straightforward as it is for temperature. Modes of internal climate variability, such as NAO, are likely still the dominant drivers. Further investigation into the impact of anthropogenic climate change on rainfall trends over the broader Mediterranean and MENA regions is suggested.

The recent precipitation trends are less pronounced and less robust, with the exception of a significant precipitation increase in the regions of Balkans and Anatolia, which were found to be recovering from the strong drying of the previous decades. The trends of six indices related to extreme rainfall/drought are not pronounced. Exceptions are again the positive trends of heavy precipitation indices for the Balkans and Anatolia sub-regions. The latter were found statistically significant for most datasets and for most indices related to precipitation amounts and number of rainy days.

These results regarding long-range and more recent observed rainfall trends are in general agreement with the available literature. However, in this study, both the length of the time series and the region of interest are extended.

At each stage of the analysis and for all time scales, discrepancies between the tested datasets were observed throughout the greater Mediterranean/MENA region. These are evident not only for mean climate conditions but also for the indices of extremes. Profound differences are also found regarding rainfall interannual variability and climate trends. This is the case also for the sub-regions with a sufficient coverage of meteorological stations (e.g., Iberia, Anatolia, Balkans). This could be related to the different grid spacing, the various interpolation methods used or different treatment of periods of missing information in the raw meteorological records.

Interestingly, for about 85% of the tested land areas, the monthly distribution of rainfall has significant differences at least for one out of the five tested datasets. The analysis revealed the CPC dataset as an outlier for the region, as it is found to be much drier than the rest. In terms of rainfall quantities, this was more evident for the wetter regions toward the southern parts of the domain. For a domain covering mostly the Mediterranean Basin, Tanarhte et al. (2012) also point out that CPC annual precipitation over particular regions deviates from the ensemble mean of several datasets. On the other hand, CHIRPS was found to give higher values for the extreme precipitation indices and higher probabilities of very wet months. The averaging over relatively large sub-domains is probably smoothing the inconsistencies between the different datasets; therefore, discrepancies over smaller areas are expected to be much higher. Interestingly, the spatial resolution of the gridded datasets was found not to be responsible for any systematic behavior, since for different periods or locations datasets with finer or coarser resolution were found to be either wetter or drier. The current results can be considered valid only over the discussed region of MENA. In other parts of the world, the inconsistencies between the tested datasets might be smaller or even larger, while the outliers identified in this study might perform differently.

In this study, the observational uncertainty is mainly discussed on the monthly and multi-annual means of precipitation. However, during particular extreme events of the recent past, this observational uncertainty might be even larger (e.g., Zittis et al. 2017).

Finally, when it comes to model validation and specifically for precipitation or other water cycle-related variables, the issue of observational uncertainty should be seriously considered. Particularly for the MENA region, it is highly recommended to use multiple data sources, while obvious outlier datasets should be avoided.