1 Introduction

Climate change is a major concern facing the world today. The issue may be one of the biggest threats confronting human society and the natural environment. Climate change will be manifested through climatic and weather events such as temperature extremes, droughts, sea level rise, tornadoes, and hurricanes (Joyce et al. 2016). According to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC 2007), “Extreme weather events are likely to become more frequent, more widespread and more intense during the 21st century.”

Extreme temperature is one of the most serious of these events. For instance, the rising temperatures expected with climate change and the potential for more extreme temperature events will likely affect many aspects of human life, such as health, energy supply, hydrology, and agriculture (e.g., Kunkel et al. 1999; Karl and Easterling 1999; Patz et al. 2005; Lee and Chiu 2011; Emmanuel 2014; Chang et al. 2014; Kaushal et al. 2016).

Given these serious impacts, it is important to investigate how extreme temperature events respond to current and future climate change and to assess changes in extreme temperature accurately using rigorous statistical methods. In the literature, both non-parametric and parametric approaches have been proposed for modeling and analyzing extreme events in climate data.

The non-parametric approach is based on the definition of indices for extreme events. This approach was made possible through the efforts of the climate science community, including the Expert Team on Climate Change Detection Monitoring (ETCCDM), which has developed a set of 27 indices for temperature and precipitation extremes, all derived from daily temperature and precipitation observations (e.g., Peterson et al. 2001). These indices, recognized by the World Meteorological Organization (WMO), are designed to monitor changes in “moderate” extremes and to facilitate further climate research. The defined indices exhibit attractive properties: they are statistically robust, cover a wide range of climates, and have a high signal-to-noise ratio (Zhang et al. 2011).

In recent years, there has been wide interest in analyzing climate with climate extreme indices. Numerous studies conducted in various geographical regions or countries have calculated the indices in nearly the same way and examined trends and variability at selected meteorological stations. Overall, the existing literature provides strong evidence that global warming is closely related to significant changes in temperature extremes (e.g., Easterling et al. 1997; Manton et al. 2001; Peterson et al. 2002; Klein Tank and Konnen 2003; Aguilar et al. 2005, 2009; Griffiths et al. 2005; Moberg and Jones 2005; Haylock et al. 2006; Skansi et al. 2013; Vincent et al. 2005; New et al. 2006; Athar 2014; Krishna 2014; Alghamdi and Moore 2014; Guan et al. 2015; Keggenhoff et al. 2015). This approach has its own advantages. In particular, it has made it possible to analyze moderate extreme events quantitatively. However, it has been criticized for not allowing results to be extrapolated to even more extreme, out-of-sample events (Frei 2003).

The second approach, initially developed by Fisher and Tippett (1928), is based on extreme value theory (EVT). The theory provides a solid framework for modeling rare events that occur with low frequency. By focusing directly on the tails of the sample distribution, EVT can potentially provide better predictions of highly unusual extreme changes than standard approaches that model the entire distribution (Marimoutou et al. 2009). The major advantage of EVT is that it allows the probability of events outside the observed data range to be estimated and analyzed. EVT has been successfully applied in many fields, including hydrology, environmental science, and climatology. In climatological or hydrological applications, extreme climate events are often assumed to be stationary (e.g., Katz et al. 2002). However, empirical evidence has rejected this assumption: climate time series exhibit clear signs of non-stationarity, with long-term trends and seasonal variations owing to climate change generally observed. Consequently, it is important to take non-stationarity into account when modeling climate data.

In the literature, there are broadly two common alternatives in accounting for the impact of the dependence (Chavez-Demoulin and Davison 2012): one strategy is to use the full dataset to detect and estimate non-stationarities and then to apply methods for stationary extreme-value modeling to the resulting residuals. Alternatively, we can fit a non-stationary extremal model to the original data.

The first strategy (McNeil and Frey 2000; Marimoutou et al. 2009) consists of estimating the non-stationarities induced by volatility clustering using a GARCH model fitted to the full dataset and then applying the standard stationary extreme-value approach to the resulting residuals. The main advantage of this approach, initially applied in finance, is that it reflects two stylized empirical facts frequently observed in financial time series: stochastic volatility and the heavy-tailedness of conditional return distributions over short time horizons (McNeil and Frey 2000).

The second strategy (Davison and Smith 1990) for modeling the extremes of a non-stationary process explicitly models the dependence structure by introducing covariates into the parameters of the model. Following this pioneering work, numerous methodologies have been proposed for estimating extreme value distributions with time covariates. For example, Hall and Tajvidi (2000) proposed a non-parametric approach to estimating temporal trends when parametric models are applied to extreme values from a weakly dependent time series. Chavez-Demoulin and Davison (2005) developed generalized additive modeling of sample extremes, in which spline smoothers are introduced into models for exceedances over high thresholds. El Adlouni et al. (2007) used the generalized maximum likelihood (GML) estimation method, in which covariates were modeled by linear or log-linear parametric models. Cooley (2009) compared the trend in maximum temperature obtained from EVT with that obtained indirectly by fitting a trend to mean annual temperature. Westra and Sisson (2011) applied a non-stationary max-stable process to detect non-stationarity and simulate both spatial and temporal variability in precipitation extremes in Australia. More recently, some papers have focused on the suitability of non-stationary models and their possible use for risk management and engineering design (e.g., Katz 2013; Cheng et al. 2014; Salas and Obeysekera 2014; Hounkpè et al. 2015; Gao et al. 2016). Overall, most studies on EVT have shown the importance of accounting for the non-stationarity frequently detected in climate data and have reported the superiority of non-stationary EVT models over stationary ones. Additionally, they have justified the usefulness of EVT as a complement to the climate extreme indices.

The climate in Saudi Arabia is characterized by extreme aridity and heat (Krishna 2014). Saudi Arabia is one of the hottest countries in the world, and temperatures during summer can easily exceed 50 °C. Desert covers more than half of its total area, including the Rub Al-Khali, one of the world’s largest sand deserts (Library of Congress 2006). The climate is influenced by the country’s geography, which may be considered one of the factors behind the temperature variations. Additionally, the area may be more susceptible to extreme climate events owing to factors such as industrial development, the increasing number of motor vehicles, and the related increase in air pollution (Krishna 2014).

Previous studies have analyzed and compared trends in extreme temperature indices over Saudi Arabia (e.g., Rahman and Al Hadhrami 2012; Almazroui et al. 2012, 2013; Krishna 2014; Alghamdi and Moore 2014; Athar 2014). However, to the best of our knowledge, no study has applied recent statistical techniques based on EVT models to such climate data. In a very recent paper, Pal and Eltahir (2016), using an ensemble of high-resolution regional climate model simulations, addressed the Arab Gulf region and parts of southwest Asia and showed that this region specifically could become uninhabitable before the turn of the century, as temperature records are projected to rise to intolerable levels. Consequently, modeling and analyzing the statistical characteristics of extreme climatic events such as extreme temperatures is a topic of great importance: understanding changes in the climate is crucial for evaluating vulnerability and for developing plans to reduce the impact of heat on people and on the environment.

This study aims to provide a more comprehensive analysis of observed maximum temperature data from several stations over Saudi Arabia by applying two approaches: (1) the behavior of extreme temperature events is analyzed and their changes are quantified using the ETCCDM indices and (2) the series of extremes are modeled using stationary and non-stationary GEV distributions. In particular, the GEV approach could improve our knowledge of climate changes over the 30-year period under consideration and potentially deepen the perspective on climate change over the study area.

The paper is organized into five sections: Section 1 reviews the literature. Section 2 presents the study area and its data. Section 3 provides the methodology. Results and discussion are presented in Section 4. Section 5 concludes the paper.

2 Study area and data

2.1 Study area

Saudi Arabia, the largest country in the Arabian Peninsula, occupies the southwest part of the Asian continent and covers an area of approximately 2,150,000 km², more than half of which is desert. In particular, the country includes the Rub Al-Khali, one of the world’s largest sand deserts (Library of Congress 2006). Saudi Arabia is located in the tropics between latitudes 16° N and 32° N and longitudes 37° E and 52° E. It is bordered by the Red Sea on the west; by Yemen and Oman on the south; by Qatar, the United Arab Emirates, and the Arabian Gulf on the east; and by Kuwait, Iraq, and Jordan on the north. The country has a coastline of almost 2320 km, including about 1760 km along the Red Sea and roughly 560 km along the Arabian Gulf.

The climate of Saudi Arabia is mainly characterized by extreme aridity and heat (Krishna 2014) but differs from one region to another. Between the coast and the interior, the country shows considerable variation in climate as well as topography. Along the coast, moderate temperatures are accompanied by high humidity, whereas extreme temperatures and aridity are prevalent in the interior (Library of Congress 2006).

The climate is influenced by the country’s geography, which may be considered one of the factors behind the temperature variations. Additionally, the area may be more susceptible to extreme climate events owing to factors such as industrial development, the increasing number of motor vehicles, and the related increase in air pollution (Krishna 2014).

2.2 Data

The data used in this study are time series of daily maximum air temperature (TX) provided by the Presidency of Meteorology and Environment (PME) in Saudi Arabia. The data were collected at 15 weather stations with continuous records. The stations are spread across Saudi Arabia, although they are not evenly distributed in some parts of the country. We considered the years 1986 to 2014 because this is the longest period available from the PME Services.

In addition to manual checks for data completeness, data quality control (QC) and homogeneity assessment were carried out using ClimPACT2, a downloadable R software package. The main purpose of this QC procedure was to identify errors in data processing, such as manual entry errors, and to flag outliers in daily maximum temperature (values deviating by more than four standard deviations (σ) were identified as potential errors).
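
As a rough illustration of the four-standard-deviation screening described above, the following Python sketch flags suspicious values in a daily TX series. It is not ClimPACT2's routine: the function name `flag_outliers` is hypothetical, and measuring deviations from the mean of the whole series (rather than, for example, from a calendar-day climatology) is an assumption made for simplicity.

```python
from statistics import mean, pstdev


def flag_outliers(values, n_sigma=4.0):
    """Return positions of values more than n_sigma standard deviations from the mean.

    Assumption: mean and standard deviation are taken over the whole series.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return []  # a constant series has no outliers
    return [i for i, v in enumerate(values) if abs(v - mu) > n_sigma * sd]


# Hypothetical series: 99 plausible days and one likely keying error (80.0 °C).
daily_tx = [30.0] * 99 + [80.0]
print(flag_outliers(daily_tx))
```

Flagged values are candidates for manual inspection rather than automatic deletion.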

The spatial distribution of the selected stations for air temperature is shown in Fig. 1. We also present the geographical locations and characteristics of meteorological stations considered in this study in Table 1.

Fig. 1 The spatial distribution of the selected stations for maximum air temperatures in Saudi Arabia

Table 1 Geographical locations and characteristics of meteorological stations used in the study

As shown in Table 1, the surface elevation varies from 4 m at Gizan station, which lies in the extreme southwest corner of Saudi Arabia, to 2100 m at Abha in the southwestern region. High-elevation areas are confined to the southwest region, with the terrain sloping gradually downward towards the east. The same table shows that differences in mean and maximum TX among these locations are considerable. In particular, the annual mean (maximum) TX varied from a minimum of 25.8 °C (35.1 °C) at Abha (Abha) to a maximum of 38.3 °C (52 °C) at Makkah (Jeddah). Additionally, the dispersion of maximum temperature, as measured by the coefficient of variation, differs among stations.

3 Methodology

3.1 Statistical tests

3.1.1 Trend analysis of Mann–Kendall trend test

The Mann–Kendall (MK) test is a non-parametric test that is widely used to detect temporal trends in time series of indices. In contrast to ordinary linear regression, the MK approach does not require the underlying data to follow any particular statistical distribution. In addition, the test is robust to outliers and skewed distributions (e.g., Sen 1968; Caesar et al. 2011; Chen et al. 2015).

The test statistic (Z) is given as follows:

$$ Z=\left\{\begin{array}{ll}\dfrac{S-1}{\sqrt{\mathrm{Var}(S)}} & \mathrm{if}\ S>0\\ 0 & \mathrm{if}\ S=0\\ \dfrac{S+1}{\sqrt{\mathrm{Var}(S)}} & \mathrm{if}\ S<0\end{array}\right. $$
(1)

where

$$ S={\sum}_{i=1}^{n-1}\sum_{k= i+1}^n\operatorname{sgn}\left({X}_k-{X}_i\right) $$
$$ \mathrm{Var}(S)=\frac{n\left(n-1\right)\left(2n+5\right)-\sum_{i=1}^m{t}_i\left({t}_i-1\right)\left(2{t}_i+5\right)}{18} $$
(2)

and where \( X_k \) and \( X_i \) are the sequential data values, m is the number of tied groups, n is the length of the data set, \( t_i \) is the number of data points in the ith tied group, and sgn(q) is equal to 1, 0, or −1 if q is greater than, equal to, or less than zero, respectively.

Positive (negative) values of Z indicate increasing (decreasing) trends. Trends are tested at a fixed significance level α. The null hypothesis H0 of no trend is rejected, and a significant trend is taken to exist in the time series, when \( \left| Z\right|>{Z}_{1-\frac{\alpha}{2}} \), where \( {Z}_{1-\frac{\alpha}{2}} \) is the \( \left(1-\frac{\alpha}{2}\right) \) quantile of the standard normal distribution. A significance level of α = 0.05 was chosen. When a trend was significant, its magnitude was estimated with Sen’s slope estimator (Garza et al. 2012; Rulfová and Kyselý 2014).
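
The test in Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code used in this study; the function name `mann_kendall` and its return layout are hypothetical, and ties are detected by exact equality of values.

```python
import math
from collections import Counter
from statistics import NormalDist


def mann_kendall(x, alpha=0.05):
    """Return (S, Var(S), Z, significant) for the two-sided Mann-Kendall test."""
    n = len(x)
    # S: sum of sgn(X_k - X_i) over all pairs with k > i.
    s = 0
    for i in range(n - 1):
        for k in range(i + 1, n):
            diff = x[k] - x[i]
            s += (diff > 0) - (diff < 0)
    # Tie correction: t_i is the size of each group of equal values.
    ties = [t for t in Counter(x).values() if t > 1]
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18
    # Continuity-corrected standardized statistic Z.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return s, var_s, z, abs(z) > z_crit


# Hypothetical strictly increasing series of ten annual values.
print(mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

For a strictly increasing series of length 10, every pair contributes +1, so S = 45 and Var(S) = 125, and the null hypothesis of no trend is rejected at α = 0.05.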

3.1.2 Sen’s slope estimator

Sen’s slope estimator is a non-parametric method developed by Sen (1968) to estimate the true slope of a trend from a sample of N pairs of data. This estimator has been widely used for meteorological time series (Gocic and Trajkovic 2013; Chen et al. 2015). It is applied after the Mann–Kendall test to time series in which the trend is assumed to be linear. The slope of each pair of data is computed as

$$ {Q}_i=\frac{X_j-{X}_k}{j-k}\quad \mathrm{for}\ i=1,\dots,N $$
(3)

where \( X_j \) and \( X_k \) are the data values at times j and k (j > k), respectively. If there is only one datum in each time period, then \( N=\frac{n\left( n-1\right)}{2} \), where n is the number of time periods. If there are multiple observations in one or more time periods, then \( N<\frac{n\left( n-1\right)}{2} \), where n is the total number of observations.

The Sen’s slope estimator (median of slope) is computed as follows:

$$ {Q}_{\mathrm{med}}=\left\{\begin{array}{ll}{Q}_{\left[\left(N+1\right)/2\right]}, & \mathrm{if}\ N\ \mathrm{is}\ \mathrm{odd}\\ \dfrac{Q_{\left[N/2\right]}+{Q}_{\left[\left(N+2\right)/2\right]}}{2}, & \mathrm{if}\ N\ \mathrm{is}\ \mathrm{even}\end{array}\right. $$
(4)

The sign of \( Q_{\mathrm{med}} \) reflects the direction of the trend, while its value indicates the steepness of the trend. To determine whether the median slope is significantly different from zero, one can compute the confidence interval of \( Q_{\mathrm{med}} \) at a specific probability.

The confidence interval about the slope (Gilbert 1987) can be computed as follows:

$$ {C}_{\alpha}={Z}_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(S)} $$
(5)

where Var(S) is defined in Eq. (2) and \( {Z}_{1-\frac{\alpha}{2}} \) is the \( \left(1-\frac{\alpha}{2}\right) \) quantile of the standard normal distribution.

Then, \( L=\frac{N-{C}_{\alpha}}{2} \) and \( U=\frac{N+{C}_{\alpha}}{2} \) are calculated. The lower and upper limits of the confidence interval, \( Q_{\min} \) and \( Q_{\max} \), are the Lth largest and the (U + 1)th largest of the N ordered slope estimates (Gilbert 1987). The slope \( Q_{\mathrm{med}} \) is statistically different from zero if the two limits (\( Q_{\min} \) and \( Q_{\max} \)) have the same sign.
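
The procedure above can be sketched as follows for a series with one value per time period. The sketch follows Gilbert (1987) in computing \( C_{\alpha} \) from the square root of Var(S). The function name `sens_slope` is hypothetical, Var(S) is passed in by the caller, and rounding the generally non-integer ranks L and U + 1 to the nearest position is an assumption (interpolating between neighboring slopes would be an equally reasonable choice).

```python
import math
from statistics import NormalDist, median


def sens_slope(x, var_s, alpha=0.05):
    """Return (q_min, q_med, q_max) for a series with one value per time step."""
    n = len(x)
    # All pairwise slopes Q_i = (X_j - X_k) / (j - k) with j > k, sorted ascending.
    slopes = sorted(
        (x[j] - x[k]) / (j - k) for k in range(n - 1) for j in range(k + 1, n)
    )
    n_slopes = len(slopes)
    q_med = median(slopes)

    # C_alpha = Z_{1-alpha/2} * sqrt(Var(S)), then the ranks L and U.
    c_alpha = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(var_s)
    lower_rank = (n_slopes - c_alpha) / 2
    upper_rank = (n_slopes + c_alpha) / 2

    # Assumption: ranks are rounded to the nearest 1-based position and
    # clamped so that they always point at an existing slope.
    lo_idx = min(max(round(lower_rank) - 1, 0), n_slopes - 1)
    hi_idx = min(max(round(upper_rank + 1) - 1, 0), n_slopes - 1)
    return slopes[lo_idx], q_med, slopes[hi_idx]


# Hypothetical series rising by exactly 2 units per step (n = 10, Var(S) = 125).
print(sens_slope([2 * i for i in range(10)], var_s=125.0))
```

In this artificial example every pairwise slope equals 2, so the median and both confidence limits coincide and the trend is clearly different from zero.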

3.2 Analysis with indices of extremes

On the basis of recommendations given by the ETCCDMI, 14 climate indices derived from daily maximum temperature data were calculated (Table 2). Among these temperature-related indices, some are based on station-related thresholds, while others are derived from fixed thresholds, absolute peak values, or the duration of specific climate events (Kioutsioukis et al. 2010).

Table 2 Definition of extreme maximum temperature indices

The selected indices were calculated on an annual time scale for individual stations. However, as stated by Homewood (2016), “since people tend to adapt to their local climate, a threshold considered extreme in one part could be considered quite normal in another”. We therefore modified some of the extreme temperature indices to take account of the main climatic characteristics of the study area.

The retained indices can be classified into five groups (Alexander et al. 2006): (1) percentile-based indices, which include TX10P, TX90P, and TX50P; (2) absolute indices, which represent the maximum or minimum value of daily maximum air temperature within a period (TXX and TXN); (3) threshold indices, defined as the number of days on which the temperature falls above or below a fixed threshold (TXLT15, TXLT20, SU30, SU35, and SU40); (4) duration indices, including WSDI4 and WSDI6; and (5) other indices that do not fall into any of the above categories but may still be of interest (HWN).

All selected indices were computed with ClimPACT2, a downloadable R-software package that calculates a wide range of the Expert Team on Sector-specific Climate Indices (ET-SCI) as well as additional climate extremes indices.
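
To make the absolute and threshold indices concrete, the sketch below computes a few of them from one year of daily TX values. It is a simplified illustration, not ClimPACT2's implementation: the function name `annual_tx_indices` is hypothetical, the strict inequalities are an assumption based on the verbal definitions in this paper, and the percentile-based and duration indices are omitted because they require a reference period.

```python
def annual_tx_indices(tx):
    """Compute simple annual indices from daily maximum temperatures in °C."""
    return {
        "TXM": sum(tx) / len(tx),           # annual mean of TX
        "TXX": max(tx),                     # warmest day
        "TXN": min(tx),                     # coldest day
        "SU30": sum(t > 30 for t in tx),    # days with TX above 30 °C
        "SU35": sum(t > 35 for t in tx),    # days with TX above 35 °C
        "SU40": sum(t > 40 for t in tx),    # days with TX above 40 °C
        "TXLT15": sum(t < 15 for t in tx),  # days with TX below 15 °C
        "TXLT20": sum(t < 20 for t in tx),  # days with TX below 20 °C
    }


# Hypothetical five-day sample standing in for a full year of observations.
print(annual_tx_indices([14.0, 18.0, 31.0, 36.0, 41.0]))
```

Applying such a function year by year yields one annual series per index and station, which is the input to the trend tests of Section 3.1.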

3.3 Analysis with the generalized extreme value (GEV) distribution

3.3.1 GEV models

In this section, we present a brief review of the generalized extreme value (GEV) distribution to provide a basis for our modeling of extreme temperature events in climate model data. For further details on the GEV distribution, we refer to Embrechts et al. (1997).

Extreme value theory (EVT) is a branch of statistics dealing with extreme deviations from the mean of probability distributions. EVT is concerned with the asymptotic distribution of extreme events, which are rare in frequency and large in magnitude relative to the majority of observations. The theory is of interest for assessing the risk of climate changes associated with highly unusual events such as very high or very low temperatures or heavy precipitation.

Consider a sample \( X_1,\dots,X_n \) of n independent and identically distributed (i.i.d.) random variables with unknown cumulative distribution function (cdf) F(x) = P(X ≤ x). We define the ordered sample by \( X_{1,n}\le X_{2,n}\le \dots \le X_{n,n}=M_n \), and we are interested in the asymptotic distribution of the maximum \( M_n \) as n → ∞. By the Fisher–Tippett theorem, the suitably normalized maximum, if it has a non-degenerate limit, converges in distribution to the GEV distribution, whose cumulative distribution function is as follows:

$$ G\left(x,\mu,\sigma,\xi\right)=\left\{\begin{array}{ll}\exp\left\{-{\left(1+\xi\left(\frac{x-\mu}{\sigma}\right)\right)}^{-\frac{1}{\xi}}\right\}, & \xi\ne 0,\ 1+\xi\left(\frac{x-\mu}{\sigma}\right)>0\\ \exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}, & \xi=0\end{array}\right. $$
(6)

where μ ∈ ℝ, σ > 0, and ξ ∈ ℝ are the location, scale, and shape parameters, respectively. Depending on ξ, the asymptotic distribution of the maxima is a Fréchet, Weibull, or Gumbel distribution, independently of the original distribution of the observed data. The shape parameter ξ describes the tail behavior of the distribution of the maximum. For ξ = 0, the GEV reduces to the Gumbel distribution. For ξ > 0 (Fréchet type), the tail of the GEV is “heavier” than the tail of the Gumbel distribution, while for ξ < 0 (Weibull type), it is “lighter” than that of the Gumbel distribution.

Having modeled the upper tail of a distribution by fitting a GEV distribution, it remains to use the model for inference. One of the main applications of extreme value analysis is the estimation of N-year return levels. An event exceeding such a level is expected to occur on average once every N years; equivalently, the level is exceeded in any given year with probability 1/N. The N-year return level based on the GEV distribution, \( z_N \), is given by

$$ {z}_N=\left\{\begin{array}{c}\widehat{\mu}-\frac{\widehat{\sigma}}{\widehat{\xi}}\left\{1-{\left(- \log \left(1-\frac{1}{N}\right)\right)}^{-\widehat{\xi}}\right\},\mathrm{for}\ \widehat{\xi}\ne 0\ \\ {}\widehat{\mu}-\widehat{\sigma} \log \left\{- \log \left(1-\frac{1}{N}\right)\right\},\mathrm{for}\ \widehat{\xi}=0\end{array}\right. $$
(7)
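
Equations (6) and (7) can be checked against each other numerically: evaluating the GEV cdf at the N-year return level should give 1 − 1/N. The following standard-library sketch does this; the function names `gev_cdf` and `return_level` and the parameter values are hypothetical, chosen only for illustration.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function, Eq. (6)."""
    z = (x - mu) / sigma
    if xi == 0:
        return math.exp(-math.exp(-z))
    t = 1 + xi * z
    if t <= 0:
        # Outside the support: below the lower endpoint when xi > 0,
        # above the upper endpoint when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1 / xi))


def return_level(n_years, mu, sigma, xi):
    """N-year return level z_N, Eq. (7); requires n_years > 1."""
    y = -math.log(1 - 1 / n_years)
    if xi == 0:
        return mu - sigma * math.log(y)
    return mu - sigma / xi * (1 - y ** (-xi))


# Hypothetical parameters for annual maximum temperature (°C).
mu, sigma, xi = 45.0, 1.5, -0.2
z20 = return_level(20, mu, sigma, xi)
print(z20, gev_cdf(z20, mu, sigma, xi))
```

With a negative shape parameter, as in this example, the distribution has a finite upper endpoint at μ − σ/ξ, so return levels increase with N but never exceed that bound.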

The classical G(x, μ, σ, ξ) distribution assumes that the three parameters of the model (location, scale, and shape) are independent of time (Coles 2001). This model is often called “the stationary approach.” However, if trends are detected in the data sample, a non-stationary model is appropriate, in which the parameters are no longer constants but are expressed as functions of covariates (e.g., time).

The non-stationary GEV distribution can be denoted as G (μ(t), σ(t), ξ(t)) with the distribution function

$$ G\left(x,\mu(t),\sigma(t),\xi(t)\right)=\left\{\begin{array}{ll}\exp\left\{-{\left(1+\xi(t)\left(\frac{x-\mu(t)}{\sigma(t)}\right)\right)}^{-\frac{1}{\xi(t)}}\right\}, & \xi(t)\ne 0,\ 1+\xi(t)\left(\frac{x-\mu(t)}{\sigma(t)}\right)>0\\ \exp\left\{-\exp\left(-\frac{x-\mu(t)}{\sigma(t)}\right)\right\}, & \xi(t)=0\end{array}\right. $$
(8)

Following El Adlouni et al. (2007), Cannon (2010), and Chen and Chu (2014), we allow for possible non-stationary behavior in the location μ and scale σ parameters but keep the shape parameter ξ constant. This choice seems defensible insofar as the location and scale parameters play fundamental roles in shaping the trend in extremes (e.g., Garcia et al. 2007; Chen and Chu 2014) and the variability of the shape parameter is small (Hosking et al. 1985). Additionally, letting the shape parameter vary may lead to numerical problems (e.g., Chen and Chu 2014).

More specifically, nine models of varying complexity were defined (three choices for each of the polynomial orders k and l introduced below), allowing no, linear, or quadratic dependence on time for both the location μ and scale σ parameters, the latter through a log link. The parameters are described as follows:

$$ \begin{array}{l}\mu(t)={\mu}_0+{\mu}_1 t+{\mu}_2{t}^2\\ \sigma(t)=\exp\left({\sigma}_0+{\sigma}_1 t+{\sigma}_2{t}^2\right)\end{array} $$
(9)

We denote by GEVkl the model with time dependence of order k in the location parameter and order l in the scale parameter (k, l ∈ {0, 1, 2}). For instance, GEV00 denotes the stationary GEV distribution with both the location and scale parameters independent of time (\( \mu_1=\mu_2=0 \) and \( \sigma_1=\sigma_2=0 \)), while the GEV21 non-stationary model assumes a quadratic trend in location and a log-linear trend in scale (\( \sigma_2=0 \)).
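
The parameterization in Eq. (9) and the corresponding likelihood can be sketched as follows. This is an illustrative standard-library implementation of the negative log-likelihood that a maximum likelihood routine would minimize, not the extRemes code used in this study; the names `location`, `scale`, and `neg_log_likelihood` are hypothetical, and omitted coefficients default to zero so that shorter tuples describe lower-order models.

```python
import math


def location(t, mu0, mu1=0.0, mu2=0.0):
    """Location mu(t) = mu0 + mu1*t + mu2*t^2, Eq. (9)."""
    return mu0 + mu1 * t + mu2 * t ** 2


def scale(t, s0, s1=0.0, s2=0.0):
    """Scale sigma(t) = exp(s0 + s1*t + s2*t^2), Eq. (9); always positive."""
    return math.exp(s0 + s1 * t + s2 * t ** 2)


def neg_log_likelihood(x, t, mu_coef, sigma_coef, xi):
    """Negative log-likelihood of a GEVkl model with constant shape xi.

    mu_coef and sigma_coef are tuples of 1 to 3 coefficients; their lengths
    minus one give the orders k and l.
    """
    nll = 0.0
    for x_obs, t_obs in zip(x, t):
        mu = location(t_obs, *mu_coef)
        sg = scale(t_obs, *sigma_coef)
        z = (x_obs - mu) / sg
        if xi == 0:
            nll += math.log(sg) + z + math.exp(-z)
        else:
            w = 1 + xi * z
            if w <= 0:
                return math.inf  # observation outside the support
            nll += math.log(sg) + (1 + 1 / xi) * math.log(w) + w ** (-1 / xi)
    return nll


# Hypothetical annual maxima (°C) and time index; GEV10: linear trend in location.
x = [44.1, 44.8, 45.0, 45.9, 46.3]
t = [0, 1, 2, 3, 4]
print(neg_log_likelihood(x, t, (44.0, 0.5), (0.0,), -0.1))
```

Passing `(mu0,)` and `(s0,)` recovers the stationary GEV00 model, while three-element tuples give the quadratic GEV22 model.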

3.3.2 Estimation and model selection

The parameters of the GEV model can be estimated in a variety of ways. Possible methods include maximum likelihood (ML) techniques, the L-moment approach of Hosking (1990), and Bayesian methods (e.g., Coles and Powell 1996). The most common methods for parameter estimation in climate research are ML estimation (e.g., Katz et al. 2002, 2005) and the method of moments. The ML method, although problematic when applied to very small samples, is the preferable method owing to its universal applicability and its desirable asymptotic properties. Moreover, the method allows covariates such as time to be introduced into the model (Katz et al. 2005). Since the ML approach seems to be the most common in the literature, we adopt this method here.

The analysis of extremes with EVT was performed using the free software R and the extRemes package, which is designed for the statistical analysis of extreme weather and climate events. For a brief introduction to the capabilities of extRemes, we refer to Gilleland and Katz (2011).

In the literature, there are various methods for identifying the best model from a set of carefully selected candidate models. One approach involves the use of information criteria, such as the Akaike information criterion (AIC, e.g., Burnham and Anderson 2002), the Bayesian information criterion (BIC, e.g., Katz et al. 2005), and the Hannan–Quinn information criterion (HIC, e.g., Grasa 1989).

In this study, the AIC and BIC statistics were used to determine which of the candidate models is most appropriate. The best model is the one that minimizes these two statistics; it is also expected to pass the individual Student’s t tests on its parameters.
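
As a minimal sketch of this selection step, the code below computes AIC = 2p − 2 log L and BIC = p log n − 2 log L from a model's negative log-likelihood, where p is the number of free parameters and n the number of observations. The function names are hypothetical, and the negative log-likelihood values are made up purely for illustration; they are not results from this study.

```python
import math


def n_params(k, l):
    """Free parameters in GEVkl: (k + 1) for location, (l + 1) for scale, 1 for shape."""
    return (k + 1) + (l + 1) + 1


def aic(nll, n_par):
    """Akaike information criterion from a negative log-likelihood."""
    return 2 * n_par + 2 * nll


def bic(nll, n_par, n_obs):
    """Bayesian information criterion from a negative log-likelihood."""
    return n_par * math.log(n_obs) + 2 * nll


# Made-up negative log-likelihoods keyed by (k, l); illustration only.
fits = {(0, 0): 105.0, (1, 0): 100.0, (2, 2): 99.5}
n_obs = 30  # hypothetical number of annual maxima

best_aic = min(fits, key=lambda kl: aic(fits[kl], n_params(*kl)))
best_bic = min(fits, key=lambda kl: bic(fits[kl], n_params(*kl), n_obs))
print("AIC selects GEV%d%d, BIC selects GEV%d%d" % (best_aic + best_bic))
```

In this invented example both criteria prefer the model with a linear trend in location: the extra parameters of the quadratic model improve the likelihood too little to offset their penalty, and BIC penalizes them more heavily than AIC whenever log n > 2.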

4 Results and discussion

In this section, the results of the non-parametric approach involving the analysis with climate extreme indices and of the parametric approach based on GEV distribution are presented and analyzed. For both approaches, the individual study of each station, the field global significance assessment, and the statistical assessment of changes in extreme maximum temperature over Saudi Arabia are discussed.

4.1 Analysis with indices of extremes

Trends in indices for air temperature extremes were investigated for the stations indicated in Fig. 1. Results of applying the Mann–Kendall trend test for climate indices over the period 1985–2014 are presented in Fig. 2. Each index was assessed based on consistency in direction of a trend as well as proportion of significant trends across all stations (Insaf et al. 2013).

Fig. 2 Percentage of stations in Saudi Arabia showing positive and negative trends in extreme indices from 1985 to 2014; image concept borrowed from Insaf et al. (2013). The percentage of stations with positive and negative trends is shown by the length of the band to the left or right of the zero line, respectively. The significance level is fixed at 5%

As shown, the majority of stations displayed an upward trend in warm weather indicators, including mean TX (TXM), warmest day (TXX), coldest day (TXN), summer days (above 30, 35, and 40 °C), warm days (TX90P), above-average days (TX50P), warm spell duration 4 (WSDI4), warm spell duration 6 (WSDI6), and heat wave number (HWN). However, all of the stations showed a downward trend in the annual count of days with maximum temperature < 15 °C (TXLT15), the annual count of days with maximum temperature < 20 °C (TXLT20), and the percentage of days when TX < 10th percentile (TX10P) indices, with most of those trends being significant.

  • Annual mean values of TX (TXM) increased over time at almost all stations during 1985–2014; about 94% of stations show positive trends in the TXM index, with 80% (all) being significant at the 5% (10%) level. Similarly, 94% of stations showed increases in TXX, and the trends are significant at 60% of stations. Roughly 87% of stations showed increases in TXN, but the trends are statistically significant at only 27% of stations during 1985–2014. The number of days with maximum temperature < 15 °C (< 20 °C) demonstrated a decreasing trend at all stations, with 53.33% (60%) being significant at the 5% level.

  • For the summer days indicators (SU30, SU35, and SU40), increasing trends are dominant across the study area. Approximately 90, 65, and 65% of stations show increasing trends for the SU30, SU35, and SU40 indices, respectively. However, we note significant decreasing trends at 5% of stations for the very hot days index (SU40).

  • The TX10P index shows clear decreasing trends over the study area, and the trends are significant at most stations. For TX90P, stations with significantly increasing trends account for 60% of the total. The TX50P trend is positive at most stations; however, it is significant at only 60% of the stations.

  • The warm spell duration WSDI4 and WSDI6 demonstrate an increase at the majority of the stations, being significant at approximately 54 and 20% of stations, respectively. The heat wave indicator HWN has increased significantly at almost 54% of the stations as shown in Fig. 2.

In order to complete the analysis, we plotted the spatial distributions of the values estimated by the Sen’s method and evaluated their statistical significance by the two-sided Mann–Kendall test (Figs. 3 to 8 and Table 3).

Fig. 3 Spatial patterns of trends (Sen’s slopes) per decade during 1985–2014 over Saudi Arabia for the TXM (left), TXX (middle), and TXN (right) indices. Up (red) and down (blue) triangles indicate positive and negative trends, respectively. The triangles are scaled according to the magnitude of the trend. The filled triangles indicate significance at the 95% confidence level

Table 3 Trends in °C, days, %, or number of events per decade for maximum temperature indices using Sen’s slope estimator in 15 stations over Saudi Arabia during 1985–2014

In general, the results reveal that warming tendencies are dominant: positive trends are recorded in the TXM, TXX, TXN, SU30, SU35, SU40, TX90P, TX50P, WSDI4, WSDI6, and HWN indices, reflecting an increase in maximum air temperatures across the whole area in the period 1985–2014. Negative trends are essentially seen in the TXLT15, TXLT20, and TX10P indices. However, the distribution of warming trends was not uniform across the study region; it differs from one station to another.

  • The results obtained for the TXM, TXX, and TXN indices (mean TX, warmest day, and coldest day indices, respectively) are shown in Fig. 3. The trends in annual mean daily maximum temperature (TXM) are dominated by significant increases, observed at most stations of the study area, with a general trend of >0.34 °C per decade. The stations with trends >0.34 °C per decade were mainly in the southwestern, central, and northeastern parts of the country. Among the 15 stations, Jeddah (western part) recorded the smallest increase in TXM, with a value of 0.26 °C/decade, and Najran (southwest) showed the largest increase, 1.03 °C/decade. However, an insignificant negative trend was detected at Gizan station.

  • Warming trends are also found in the warmest day (TXX) index, which increases at most stations across Saudi Arabia. Stations in northeastern and southwestern Saudi Arabia have larger trend magnitudes. The strongest warming (>1 °C) occurred in Arar, whereas the weakest (<0.5 °C) was recorded in Al-Baha. Other regions of Saudi Arabia typically experienced a decadal warming of 0.2 to 1.08 °C. Figure 4 displays how the annual maximum daily temperature varied from 1985 to 2014 at two of the fifteen locations (Abha and Gassim).

Fig. 4

Annual maximum daily temperatures with the TXX trend line in two of the fifteen locations (Abha and Gassim cities)

  • For the coldest day temperature of the year (TXN), increasing trends are the most common, although most of them are not significant. In addition, a non-significant decrease was found at one station in the extreme northwest (Tabuk). About 87% of stations showed an increase in TXN, with 27% of those trends being statistically significant during 1985–2014. Warming trends are particularly strong (up to 1 °C per decade) over the southwestern part of Saudi Arabia (Abha and Taif).

  • The spatial distributions of TXLT15 and TXLT20 (Fig. 5) show significant decreasing trends over most stations. Downward trends are most prominent in the eastern region. The highest trend was found in Arar, while the lowest was found at Al-Ahsa station. Additionally, most of the stations with no trend are in the western part. It therefore seems that, for the annual count of days with maximum temperature < 15 °C, the western region is less affected by the warming trend than the eastern one. This may be explained by local factors such as topography or other geographic features: the eastern region, which lies in the interior, is characterized by low elevation, whereas the western region, running along the Red Sea coast, lies at relatively high elevation. This finding suggests that TXLT15 trends are amplified at lower elevation; in other words, the decreasing trend in the annual count of days with maximum temperature < 15 °C is more apparent in the low-elevation region.

Fig. 5

Spatial patterns of trends (Sen’s slopes) per decade during 1985–2014 over Saudi Arabia for the TXLT15 (left) and TXLT20 (right) indices. Down (blue) triangles indicate negative trends. The triangles are scaled according to the magnitude of the trend. The filled triangles indicate significance at the 95% confidence level

  • Figure 6 shows the spatial patterns of trends for the summer days indices during 1985–2014 over Saudi Arabia. The annual count of days when daily maximum temperature exceeds 30 °C (SU30) showed upward trends, all of them significant except at Gizan station, evidencing an increase in the annual number of days with maximum air temperature above 30 °C. The significant trends range from 1.67 d/decade in Medina and Gizan to 20 d/decade in Najran. SU30 increased significantly at the highest rate in the western regions of Saudi Arabia. Similarly, the same figure shows region-wide increases in the number of summer days above 35 and 40 °C, with many trends significant. In addition, for the SU40 index, 5% of the stations exhibited decreasing trends. Comparing the thresholds reveals that, as the threshold is raised, some stations in the western part no longer have significant increasing trends, and some trends even become decreasing. This behavior is more pronounced for SU40 than for SU30 and SU35.

  • The spatial distribution of the trends of the percentile indices (TX10P, TX90P, and TX50P) is shown in Fig. 7. The TX10P trend during 1985–2014 is generally negative and significant over the whole of Saudi Arabia. The significant decreases occurred predominantly in southwestern and northern Saudi Arabia, and the strongest (∼6%/decade) was found in Najran, in the southwest.

  • Conversely, the warm days index (TX90P) shows significant increases over most stations of the study area. The same figure also shows a significant decreasing trend in the southwest, at Gizan station. Broadly speaking, the southwestern and northeastern parts of Saudi Arabia have stronger increases in TX90P than the other parts. The strongest increases, up to about 5% per decade, are found in the southwest (Abha and Najran). The significant increasing trends vary between 1.5%/decade at Jeddah station and 5.48%/decade at Abha station. For the TX50P index, upward trends were detected over most of the area during 1985–2014.

Fig. 6

Spatial patterns of trends per decade during 1985–2014 over Saudi Arabia for SU30 (left), SU35 (middle), and SU40 (right) indices. Up (red) and down (blue) triangles indicate positive and negative trends, respectively. The triangles are scaled according to the magnitude of the trend. The filled triangle indicates significance at the 95% confidence level

Fig. 7

Spatial patterns of trends per decade during 1985–2014 over Saudi Arabia for TX10P (left), TX90P (middle), and TX50P (right) indices. Up (red) and down (blue) triangles indicate positive and negative trends, respectively. The triangles are scaled according to the magnitude of the trend. The filled triangles indicate significance at the 95% confidence level

  • The spatial distribution of the trends of the WSDI4, WSDI6, and HWN indices is shown in Fig. 8. As can be seen on the maps, WSDI4 trends are increasing, although only half of them are significant. The significant trends range from 1.9 d/decade in Makkah to 5.7 d/decade in Abha. This result is more pronounced for WSDI6. HWN showed a significant upward trend at more than 50% of the stations.

Fig. 8

Spatial patterns of trends per decade during 1985–2014 over Saudi Arabia for WSDI4 (left), WSDI6 (middle), and HWN (right) indices. Up (red) and down (blue) triangles indicate positive and negative trends, respectively. The triangles are scaled according to the magnitude of the trend. The filled triangles indicate significance at the 95% confidence level

4.2 Analysis with GEV distribution

  • We used the GEV distribution to model the annual maximum of daily maximum temperature (TXX) at each of the 15 weather stations across Saudi Arabia. We modeled these series with the three-parameter GEV distribution using both stationary and non-stationary models for 1985–2014. Including non-stationarity is plausible because many stations clearly show it. As an example, Fig. 4 plots the annual maximum daily temperature from 1985 to 2014 at two of the fifteen locations (Abha and Gassim), with the trend line from the Sen’s slope estimate. The graph shows a clear trend in the annual maxima. This visual impression and the MK trend test results justify the use of non-stationary GEV models with time as a covariate.

  • The time series of annual maximum temperature are used to estimate the parameters of the GEV distribution with and without trend and, later on, the return values. Models were fitted to the annual maximum temperature at each of the fifteen locations by the method of maximum likelihood (ML). The extreme value analysis was performed using the free software R and the extRemes package.

  • The basic model fitted was (6) with μ, σ, and ξ constant. This GEV model was compared with its Gumbel special case (ξ = 0) using a likelihood ratio test at the 5% significance level. If the GEV model was significantly better, the Gumbel model was not kept. Otherwise, the Gumbel model was retained and used in further analyses instead of the GEV distribution, since it is simpler, with two parameters instead of three.

  • We then tried to improve the models by allowing the location μ and/or the scale σ to depend on time. Two model selection criteria, the AIC and the BIC, were used to select the best model among the collection of candidate models. The best-fitting GEV models for annual maximum temperature, selected using likelihood ratio tests and the AIC and BIC criteria for each station separately, are summarized in Table 4 and shown in Fig. 9.

  • The Gumbel distribution is generally the most likely model (Table 4 and Fig. 9). It describes the patterns of extreme temperature better than the GEV distribution at 10 stations, or almost 67% of the stations. Of these, half follow the stationary Gumbel model GUM00, 40% follow GUM10, and the rest follow GUM11. The Gumbel distribution can therefore be used to fit the annual temperature maxima at these stations. Conversely, the GEV distribution with ξ < 0 was retained for the remaining 33% of the stations, indicating that the distribution of the extreme values is of Weibull form. Hence, the upper tail is bounded at a finite point; in other words, extreme temperatures at these stations have an upper limit that cannot be exceeded.
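The model comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log-likelihood values are hypothetical rather than taken from Table 4, the function names are illustrative, and the likelihood ratio test assumes the two nested models differ by exactly one parameter (for example Gumbel versus GEV, or a stationary model versus one with a linear trend in μ), so the deviance is referred to a chi-square distribution with one degree of freedom.

```python
import math


def likelihood_ratio_test(loglik_null, loglik_alt, alpha=0.05):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (deviance, p_value, reject_null).
    """
    deviance = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    return deviance, p_value, p_value < alpha


def aic(loglik, n_params):
    """Akaike information criterion (smaller is better)."""
    return 2.0 * n_params - 2.0 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * loglik


# Hypothetical maximized log-likelihoods for one station with 30 annual maxima.
loglik_gumbel = -45.2   # 2 parameters (mu, sigma)
loglik_gev = -44.9      # 3 parameters (mu, sigma, xi)

deviance, p_value, gev_is_better = likelihood_ratio_test(loglik_gumbel, loglik_gev)
chosen = "GEV" if gev_is_better else "Gumbel"
print(f"deviance={deviance:.2f}, p={p_value:.3f}, retained model: {chosen}")
print(f"AIC Gumbel={aic(loglik_gumbel, 2):.1f}, AIC GEV={aic(loglik_gev, 3):.1f}")
print(f"BIC Gumbel={bic(loglik_gumbel, 2, 30):.1f}, BIC GEV={bic(loglik_gev, 3, 30):.1f}")
```

With these made-up numbers the extra shape parameter does not improve the fit significantly, so the simpler Gumbel model would be retained, which mirrors the selection rule stated in the text.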

Table 4 Best-fitting GEV models applied to annual maximum temperature at the different stations

Fig. 9

Spatial distribution of the best fitting models

From these results, we obtain the following:

  • Non-stationary models are predominantly the best models. These models, including GUM10, GUM11, and GEV10, successfully describe the extreme temperature data at about 67% of the stations. In particular, models with a trend only in the location, GUM10 and GEV10, are the most often selected, generally at stations in the southwest and northeast of Saudi Arabia. In contrast, GUM11 is the best model at only one station. Additionally, all stations fitted by a non-stationary model exhibit an upward linear trend in extreme temperature: the estimated slope μ1, corresponding to the annual rate of change in annual maximum temperature, is positive and significant (Table 4) for all selected non-stationary models. A significant positive trend in the location of the annual maxima indicates that extreme values of maximum temperature will become more severe. This result is consistent with the significant increasing trend in the TXX index shown in Fig. 3.

  • Figure 10 describes the spatial pattern of trends in the location parameter μ and the scale parameter σ for maximum temperature according to the selected non-stationary GEV models (Fig. 9 and Table 4). The strongest trends occurred in the north and south of Saudi Arabia (Arar and Najran), with the lowest trends over the central and southwestern regions. It is important to note that most of the stations with significant trends in location also show significant trends with the non-parametric approach (see the trend in the TXX index, Fig. 3). In addition, the spatial patterns are very similar to those obtained with Sen’s method applied to the climate indices. This resemblance supports the results obtained from both approaches used in this work (the EVT and the climate indices). Figure 11 shows the time series at these two stations with the fitted estimates of μ superimposed. Also shown, for comparison, is the fitted estimate of μ under the stationary model.

Fig. 10

Spatial pattern of trends for (a) location parameter μ (left) and (b) scale parameter σ (right) for maximum temperature according to the selected non-stationary GEV models. Triangles denote the locations of the individual stations. Up triangles indicate positive direction of change, and their sizes are scaled according to the magnitude of the trend. Filled triangles indicate trends significant at the 5% level

Fig. 11

Time series plot of annual maximum temperature observed at (a) Arar (left) and (b) Najran (right) stations, with fitted estimates for μ based on the stationary GEV model (mu0) and the model which allows for a linear trend in time (mut). The trend line of the location parameter is drawn using the non-parametric Sen’s slope estimator

The goodness of fit of these models is examined with quantile–quantile (Q–Q) plots. We take the Taif and Tabuk stations as examples: the first is in the south (with the non-stationary GUM11 being the best model), while the second is in the north of Saudi Arabia (with the stationary GUM00 being the best). The Q–Q plots for the best-fitting models are shown in Fig. 12. Both plots indicate that the GUM11 and GUM00 fits are reasonable: each set of points is approximately linear, which supports the validity of the fitted models.

Fig. 12

Quantile–quantile plots resulting from adjusting the series of maximum temperature to the best fitted models. (a) The non-stationary GUM11 for Taif station (left) and (b) the stationary GUM00 for Tabuk station (right)

Once the best models have been selected, the interest is in deriving the return levels of extreme maximum temperatures. Estimates and confidence intervals of the 10-, 50-, and 100-year return levels for the selected models are presented in Table 5: panel A for the stationary GEV models and panel B for the non-stationary ones.

  • Panel A shows that the return levels of maximum temperature increase with the return period (10, 50, and 100 years) at all stations. The confidence intervals also widen as the return period increases.

  • For example, one would expect the maximum temperature at Gizan to exceed about 43.58 °C on average once every 10 years, about 45.19 °C once every 50 years, and about 45.87 °C once every 100 years. The corresponding 95% confidence intervals (°C) are 42.75–44.41, 43.92–46.46, and 44.41–47.33. Among the locations considered, Jeddah in the west appears to be associated with the highest return levels, and Tabuk in the northwest has the lowest.

  • Based on the 95% confidence intervals, we can expect, for example, that a maximum temperature event could recur at the Medina and Riyadh stations within the next 10 years, and it is almost certain that the annual maximum will exceed the current maximum within 50 years at all five stations.

  • The return levels estimated above assume stationarity, meaning that the return level for a given return period is the same in every year. This implies that the parameters θ = (μ, σ, ξ) are constant. In the non-stationary case, however, the parameters vary with time, and so does the return level of extreme temperature. In particular, it becomes possible to calculate return levels for any year within the period of interest.
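For readers who want to reproduce stationary return levels such as those in panel A, the Python sketch below evaluates the standard GEV quantile formula, z_T = μ − (σ/ξ)[1 − y^(−ξ)] with y = −log(1 − 1/T), which reduces to μ − σ log(y) in the Gumbel case ξ = 0. The parameterization is assumed to match the GEV form referred to as (6) in the paper, and the parameter values are hypothetical, not estimates from Table 4 or Table 5.

```python
import math


def return_level(mu, sigma, xi, period):
    """T-year return level of a stationary GEV(mu, sigma, xi).

    xi = 0 corresponds to the Gumbel distribution.
    """
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters: a Gumbel fit and a bounded (xi < 0) GEV fit.
gumbel_params = (44.0, 1.0, 0.0)
weibull_type_params = (44.0, 1.0, -0.2)

for period in (10, 50, 100):
    z_gumbel = return_level(*gumbel_params, period)
    z_bounded = return_level(*weibull_type_params, period)
    print(f"T={period:>3}: Gumbel {z_gumbel:.2f} °C, GEV (xi<0) {z_bounded:.2f} °C")

# For xi < 0 the return levels approach the finite upper end point mu - sigma / xi.
upper_end_point = weibull_type_params[0] - weibull_type_params[1] / weibull_type_params[2]
```

The return level grows with the return period in both cases, but with ξ < 0 it stays below the upper end point, which is the bounded-tail behavior discussed for the Weibull-type stations.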

Table 5 Stationary and non-stationary return levels for all stations

  • Following Feng et al. (2007), for the purpose of comparison, the return levels for stations with a significant trend were taken to be the averages over the analyzed period (Table 5, panel B). The same table reports the minimum and maximum return levels during the period. For example, at Abha station the average return levels for 10, 50, and 100 years are 34.18, 34.47, and 34.54 °C, respectively. The quantiles corresponding to the 10-, 50-, and 100-year return periods varied from 33.42 to 34.94 °C, 33.70 to 35.23 °C, and 33.78 to 35.30 °C, respectively.

  • Similarly, the return levels of maximum temperature increase with the return period at all stations. Comparing the return level estimates across the stations (see Table 5), their values differ markedly from one station to another. Among the stations considered, Al-Ahsa in the east appears to be associated with the highest return levels. The second highest return levels appear at Makkah, whereas Abha has the lowest.

  • To complete the analysis, we computed the trends of the three return levels. As an example, Fig. 13 shows the spatial pattern of the trends in the 10-year return levels of maximum temperature. The trends of the return levels were also determined using the MKS method. The most relevant feature of the return levels is the predominance of significant positive trends, driven by the location parameter, over the considered stations. As a result, the return levels increase over time, and the T-year return-level thresholds change from year to year. Further, the distribution of trends was not uniform across the study area. As an example of return level estimates, Fig. 14 plots the time series of return levels for maximum temperature at Abha according to the best-fitting non-stationary GEV model.

  • Comparing the spatial patterns of the return levels (Fig. 13) with those of the location parameter (Fig. 10), it is clear that the trend of the location parameter, μ1, reflects the major features of the trend of the return levels. A positive trend in the location parameter μ (i.e., μ1 > 0) inherently induces a positive trend in the return level (Chen and Chu 2014).

  • Using these estimates, return levels can be calculated for any year within the period of interest. Figure 15 shows the difference between the 10-year return levels for 1985 and for 2014. As expected, these differences behave similarly to μ1. All stations show an increase in the return levels. The average difference over all stations is 2.09 °C, meaning that, based on the fitted models, we estimate an average increase of 2.09 °C in the 10-year return level of TXX in the region of interest from 1985 to 2014. This analysis shows the strong need to account for the change over time in the estimation of return levels and therefore justifies the use of the non-stationary GEV model to describe the data.
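The year-by-year return levels discussed above can be sketched as follows in Python, for a model with a linear trend in the location only, μ(t) = μ0 + μ1 t (the GUM10 or GEV10 type). The parameter values are hypothetical, the coding of the time covariate as t = year − 1985 is an assumption, and the function name is illustrative; the paper's own estimates come from the extRemes fits.

```python
import math


def ns_return_level(mu0, mu1, sigma, xi, t, period):
    """T-year return level in year index t when the GEV location is mu0 + mu1 * t."""
    mu_t = mu0 + mu1 * t
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:          # Gumbel case
        return mu_t - sigma * math.log(y)
    return mu_t - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters for one station (not values from Table 4).
mu0, mu1, sigma, xi = 44.0, 0.05, 1.0, -0.2
first_year, last_year = 1985, 2014

# 10-year return level for every year of the study period.
levels = {year: ns_return_level(mu0, mu1, sigma, xi, year - first_year, 10)
          for year in range(first_year, last_year + 1)}

change = levels[last_year] - levels[first_year]
print(f"Change in 10-year return level, {first_year} to {last_year}: {change:.2f} °C")
```

With a trend in the location only, every return level shifts by the same amount as μ(t), so the change between the first and last year equals μ1 times the number of elapsed years. This is why the differences mapped in Fig. 15 resemble the pattern of μ1.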

Fig. 13

Spatial pattern of trends for non-stationary GEV return level for maximum temperature at 10-yr return period. Triangles denote the locations of individual stations. Up (red) triangles indicate positive trends, and their sizes are scaled according to the magnitude of the trend. All the trends are significant at 5% level

Fig. 14

Time series of return levels for maximum temperature at Abha according to non-stationary GEV. The solid, dashed, and dash-dot lines represent the 10-year, 50-year, and 100-year return levels, respectively. The circles stand for observation data

Fig. 15

Spatial pattern of the difference between the 10-year return levels for the year 1985 and the 10-year return levels for the year 2014. Up (red) triangles indicate positive differences, and their sizes are scaled according to the magnitude of the difference

5 Conclusion

In this study, we have modeled the daily maximum temperatures recorded at 15 meteorological stations over Saudi Arabia for the period 1985–2014 using two statistical approaches: the first is based on trend analysis of the calculated climate indices of extremes, and the second consists of fitting stationary and non-stationary GEV distributions to assess potential changes in extreme maximum temperature. The main conclusions of this study are the following:

  • Using the first approach, we find that positive trends were mostly recorded in the TXM, TXX, TXN, SU30, SU35, SU40, TX90P, TX50P, WSDI4, WSDI6, and HWN indices, reflecting an increase in maximum air temperatures across the whole area during 1985–2014, while negative trends are essentially seen in the TX10P, TXLT15, and TXLT20 indices. All the calculated temperature indices show a general warming trend over most stations across Saudi Arabia during the period of analysis. However, the distribution of warming trends was not uniform across the country.

  • The results of the second approach reveal that non-stationary models are recommended to describe the extreme temperature series at most stations. This agrees with the significant increasing trends found in the climate indices analysis, especially in the TXX index. The T-year return levels (T = 10, 50, 100) increase over time, and maximum temperature events could recur within these return periods at most stations. The most relevant feature of the three return levels is the predominance of significant positive trends at the stations considered. This analysis shows the strong need to account for the change over time in the estimation of return levels and therefore justifies the use of the non-stationary GEV model to describe the data.

The results of both approaches confirm the existence of a warming trend in maximum temperatures at most stations across Saudi Arabia. However, the EVT approach provides further insight into possible changes in maximum temperature records through the estimation of return levels. Indeed, from a climate change perspective, forecasts of such return values for a future climate are of major importance for risk management and adaptation purposes (Vanem 2015). Decision-makers and researchers will eventually benefit from knowledge about the behavior of extreme temperatures, as appropriate policies and plans can be drawn up to prepare the general public for changes due to extreme temperatures. They could also look for ways to reduce the impact of heat on people and on the environment.

In this paper, we have shown how climate indices and extreme value theory serve as complementary tools for describing and analyzing extreme temperature records over Saudi Arabia. The study could therefore improve the understanding of recent changes in the variability, intensity, frequency, and duration of extreme maximum temperatures and extreme events over Saudi Arabia. However, the results we have presented call for further investigation and can be extended in several ways.

In this study, we have considered only simple forms of non-stationarity, relying on models that are linear in the time covariate. However, climate change can be better understood by integrating other model structures, including quadratic models, cyclic covariates, and higher-order polynomials. Furthermore, there is still significant work to be done to improve our understanding of climate change in Saudi Arabia. This should include, for example, a denser grid of stations, metadata for the climate data (temperature, precipitation), and a larger number of extreme climate indices. Finally, researchers could extend our framework to the impacts of the increasing temperature trend on water resources or on human health in Saudi Arabia.