Introduction

Spatial modelling of rainfall is important for evaluating and predicting spatial patterns and amounts of rainfall for various hydrological studies and water resource management (Abdullah and Al-Mazroui 1998; Gouvas et al. 2009; Mallick 2016; Kumari et al. 2017). The seasonal and geographical patterns of rainfall are crucial factors in the water budget equation (Alyamani and Sen 1993). Rainfall is a complex geographical phenomenon, and the topography of a region plays an important role in controlling its amount and spatial distribution (Smith 1979; Richardson 1981). Many researchers have evaluated the relationship between spatiotemporal rainfall distributions and topographic variables such as elevation (Basist et al. 1994; Alijani 2008; Mair and Fares 2011), geographic location (Oettli and Camberlin 2005; Buytaert et al. 2006), slope and aspect (Buytaert et al. 2006; Konrad 1996), wind information (Johansson and Chen 2003) and proximity to the sea or large water bodies (Johnson and Hanson 1995; Buytaert et al. 2006). These topographical variables have been used to explore the relationships between topography and the spatial distribution of annual (Razmi et al. 2017), seasonal (Kumari et al. 2017) and monthly (Hession and Moore 2011) rainfall using bivariate and multiple regression models and geostatistical methods (Al-Ahmadi and Al-Ahmadi 2013).

Monitoring rainfall is important in flood- and drought-prone regions (Hong et al. 2007). Accurate rainfall data are therefore required for the management of hydrological, natural and ecological resources in rugged terrain (Mallick et al. 2014; Tote et al. 2015). In the Aseer region, Saudi Arabia, conventional rain gauges are the primary source of rainfall data (Vincent 2008). However, the rain-gauge network is inadequate for reliable rainfall assessment because of its non-uniform spatial coverage and missing data within a very large area (Vila et al. 2009; Rozante et al. 2010; Al-Ahmadi and Al-Ahmadi 2013).

Satellite-based remote sensing rainfall measurements can provide ample data with high spatiotemporal resolution over large extents where conventional rain gauge data are unavailable or scarce (Li et al. 2010; Funk et al. 2014). However, they have several limitations, mainly related to significant uncertainty, because no satellite sensor detects rainfall directly; rather, the estimates rely on one or more indirect variables (Tote et al. 2015; Trejo et al. 2016).

The Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) dataset, based on multiple data sources, is a relatively new rainfall data product with high spatiotemporal resolution. The product was developed by the United States Geological Survey (USGS) Earth Resources Observation and Science (EROS) centre in collaboration with the Climate Hazards Group of the University of California, Santa Barbara (Funk et al. 2014). CHIRPS is a quasi-global rainfall dataset spanning more than 30 years. Covering 50°S–50°N at all longitudes from 1981 to the present, it combines 0.05° resolution satellite imagery with in situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring. Details of the operation and production of CHIRPS can be found in Tote et al. (2015). Calibrating satellite-based rainfall measurements against rain gauge data can enhance the accuracy of the rainfall estimates (Ebert et al. 2007). Validation studies have been performed in the African Sahel (Laurent et al. 1998; Nicholson et al. 2003; Funk et al. 2014), South-eastern Africa (Tote et al. 2015), Southwestern North America (Funk et al. 2014), Brazil (Negri et al. 2002; Franchito et al. 2009) and Colombia (Dinku et al. 2010; Funk et al. 2014).

Rainfall is a non-stationary phenomenon: regional and local landscape conditions (such as land use, vegetation and topography) cause a non-homogeneous rainfall distribution that varies over space and time (Sierra et al. 2015). Many studies have investigated this non-homogeneous distribution using regression techniques, namely global ordinary least squares (OLS) and local geographically weighted regression (GWR), which are widely used for modelling the rainfall–elevation relationship (Lloyd 2005; Al-Ahmadi and Al-Ahmadi 2013). Singh et al. (2016) carried out a non-stationary frequency analysis of Indian Summer Monsoon Rainfall (ISMR) extremes and found evidence of significant non-stationarity in ISMR extremes in growing urban areas (transitioning from rural to urban) compared with completely urbanised or rural areas. Still, their analysis was carried out at 1° spatial resolution using gridded daily precipitation data acquired from the Indian Meteorological Department (IMD). Yilmaz et al. (2017) found stationary models superior to non-stationary models.

Examining local, spatially non-stationary relationships between variables allows an exploratory test of the stationarity assumption of a global multiple linear regression model. In the OLS regression method (a global multiple linear regression model), a single set of parameter estimates is fitted to all data and applied uniformly over the whole geographic area of interest. The OLS model thus assumes that the relationship, and hence its coefficients, is geographically constant. In GWR, by contrast, the parameters are estimated by weighting the contribution of each sample according to its spatial proximity to the location under consideration. Hence, the weighting of observations is not constant in the calibration but differs between locations.

The present study aims to examine the rainfall–topography relationship in the Aseer region, Saudi Arabia, using non-stationary modelling techniques. Cumulative rainfall for the rainy months (March to June) of 4 years was estimated from CHIRPS data for the Aseer region. These years were selected because of the large variation in cumulative rainfall (March–June) received in the region; 1988 and 2000 were low rainfall years, whereas 1998 and 2016 were high rainfall years. The CHIRPS product was used because of its free availability and high spatiotemporal resolution. The rainfall data were then evaluated using a global ordinary least squares (OLS) model and a local geographically weighted regression (GWR) model. The results of the GWR and OLS models were compared, focusing mainly on the spatially non-stationary and scale-dependent relationships between rainfall and topographical factors. The outputs of the proposed method are valuable for descriptive purposes, and the method is a useful exploratory technique for complex geographical phenomena (viz. rainfall–topography) and for other disciplines that use spatial data in applications where spatial non-stationarity may be present.

Study area

The study was conducted in the Aseer region, located in the southwestern part of Saudi Arabia. The region has an area of 84,240 km² and shares a short border with Yemen. It lies between latitudes 17°21′56.506″N and 21°2′11.176″N and longitudes 41°17′54.353″E and 44°31′22.668″E, and consists of mountains, plains and valleys of the Arabian highlands. The elevation varies from 0 to 2990 m, with a mean of 1313 m and a standard deviation of 544.3 m. Geologically, it comprises sedimentary rocks (limestone, sandstone, shale) of Jurassic-Cretaceous age and a Precambrian granitic igneous basement (Vincent 2008; Mallick et al. 2015). The climate of the region differs considerably depending on topography, slope, aspect and season. The highlands collect variable rainfall caused by the southwestern monsoon, which carries wet oceanic winds (Vincent 2008). The region has the highest average rainfall in Saudi Arabia, distributed over 2–4 months during the spring and summer growing seasons (March–June), while rainfall during the rest of the year is negligible (Wheater et al. 1989).

The region’s rugged terrain has helped to preserve its biodiversity. The region contains one of the last natural habitats of the Arabian leopard, which is listed on the IUCN Red List as critically endangered (Baillie and Groombridge 1996). The mountainous part of Aseer also corresponds to the Afromontane phytogeographic region (Davis et al. 1995). In the Aseer Mountains, juniper woodlands and Acacia woodlands are the representative woody vegetation (some areas are shrublands and open forest). Juniper woodlands are distinguishable from Acacia woodlands and shrublands by differences in associated species (Konig 1987). The altitudinal range of the juniper woodlands differs between areas. The juniper woodlands at Tanumah and Billasmar, in the northern part of Aseer, lie between 2400 and 2800 m, lower than those at Jabal Sudah (central west). On the east slope of Jabal Sudah, the highest peak in the Kingdom of Saudi Arabia (approx. 2990 m), juniper woodlands occur between 2600 and 2990 m.

Data and material used

On the basis of topographical homogeneity, the region was divided into two sub-regions, namely lowland (elevation below 1500 m) and upland (elevation above 1500 m). The topography was obtained from Shuttle Radar Topography Mission (SRTM) data provided by the USGS at a spatial resolution of 90 m. This division follows the stratification of the Himalayan region by Bisht (2008). Lowland accounts for 66% of the Aseer region and upland for 34%. Figure 1 shows rainfall over the last 50 years (from 61 rain-gauge stations) in the Aseer region.

Fig. 1

Rainfall for the last 50 years (from 61 rain gauge stations) in Aseer region

CHIRPS v2.0 provides rainfall datasets at a 0.05° spatial resolution. These data are created by combining geostationary infrared satellite rainfall estimates with rain gauge observations, which are interpolated to create gridded precipitation datasets (Funk et al. 2014). Monthly grids (March, April, May and June) for the study years 1988, 1998, 2000 and 2016 were downloaded (ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRPS-2.0). These years were selected because of the significant difference in cumulative rainfall (March–June) in the region: 1988 and 2000 were low rainfall years, and 1998 and 2016 were high rainfall years. Table 1 gives descriptive statistics of the cumulative rainy-season (March–June) rainfall for these years.

Table 1 Statistical description of cumulative rainfall data (March–June) in mm for the selected 4 years (as per the rainfall intensity) from CHIRPS dataset product

Methodology

Rainfall data assessment based on spatial-stratified-clustering

In atmospheric studies, climatologists and research scientists frequently need to classify meteorological data (viz. rainfall and related variables) into homogeneous classes and identify spatial patterns (Burrough 2001; DeGaetano 2001). In mountainous regions the topography is complex, so it may not be possible to evaluate the relationship between rainfall and topographical parameters as though the data were homogeneously distributed across the region (Osborne and Suárez-Seoane 2002). The rationale of rainfall data assessment based on spatial-stratified-clustering is to investigate complex topographical variables and their relationship with rainfall. In this technique, the complex mountainous area is divided into reasonably homogeneous sub-areas (Kumari et al. 2017). In the present study, the elevation data (DEM) were classified into two sub-regions using the natural breaks classification technique (Jenks 1967) (Fig. 2). The natural breaks were adjusted to the boundary values 1500 and 2990 m (Bisht 2008). Lowland accounts for 66% of the Aseer region and upland for 34%.
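
The two-class stratification can be illustrated with a minimal Python sketch. The 1500 m boundary comes from the text; the function names, the sample elevations and the choice to assign a cell at exactly 1500 m to the upland class are assumptions made only for illustration, and the study itself performed this step in GIS software.

```python
LOWLAND_MAX_M = 1500.0  # boundary value stated in the text


def stratify(elevations_m, threshold_m=LOWLAND_MAX_M):
    """Label each cell 'lowland' (< threshold) or 'upland' (>= threshold).

    Assigning a cell at exactly the threshold to 'upland' is an assumption.
    """
    return ["lowland" if z < threshold_m else "upland" for z in elevations_m]


def class_shares(labels):
    """Return the percentage of cells that fall in each class."""
    n = len(labels)
    return {name: 100.0 * labels.count(name) / n for name in ("lowland", "upland")}


# Hypothetical elevation samples (m), not values from the study
sample = [120.0, 850.0, 1499.0, 1500.0, 2100.0, 2990.0]
labels = stratify(sample)
shares = class_shares(labels)
print(labels)
print(shares)
```

Applied to every DEM cell rather than six sample values, the same share calculation would yield the lowland and upland percentages of the region.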

Fig. 2

Spatial distribution of elevation (DEM) variability and cumulative rainfall pattern obtained from CHIRPS of 1988, 1998, 2000 and 2016 of Aseer region

Generation of topographical variables from DEM

The highlands of the Aseer mountainous region collect variable rainfall caused by the southwestern monsoon, which carries wet oceanic winds (Vincent 2008). Orographic rainfall is produced when moist air is lifted as it moves over a mountain range (Dingman 2002). Elevation has a considerably stronger correlation with rainfall than other variables derived from the DEM (Kumari et al. 2017).

The SRTM DEM with a spatial resolution of 90 m was used to derive elevation as the topographical variable. Wu et al. (2007) investigated the effect of DEM spatial resolution on the parameters derived from it. The ideal grid size (pixel) depends on the purpose and scale of the study and on the topographical characteristics to be derived from the DEM. For large-scale geographic phenomena such as orographic rainfall, the mean elevation derived at a coarse cell size should be used (Schermerhorn 1967). Topographical elements extracted at scales of 1–10 km are more strongly correlated with rainfall than those at other spatial resolutions (Daly et al. 1994; Kumari et al. 2016). Hence, in this study, a spatial resolution of 1 km was used to derive elevation.
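
Deriving mean elevation at a coarser cell size can be sketched as a block average. The aggregation method, the function name and the toy grid below are assumptions; the text does not state how the 1 km elevation was produced. For a 90 m grid, a factor of 11 would give cells of roughly 1 km (990 m).

```python
def block_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D grid.

    Trailing rows and columns that do not fill a whole block are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for br in range(n_rows):
        row = []
        for bc in range(n_cols):
            cells = [
                grid[br * factor + i][bc * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(cells) / len(cells))
        coarse.append(row)
    return coarse


# Hypothetical 4 x 4 fine-resolution elevations (m), aggregated by a factor of 2
fine = [
    [100.0, 110.0, 200.0, 210.0],
    [120.0, 130.0, 220.0, 230.0],
    [300.0, 310.0, 400.0, 410.0],
    [320.0, 330.0, 420.0, 430.0],
]
coarse = block_mean(fine, 2)
print(coarse)
```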

Modelling methods

The spatial evaluation relates rainfall to elevation using ordinary least squares (OLS) and geographically weighted regression (GWR) models. Both models were applied to the complete dataset and to each sub-region (lowland and upland) using ArcGIS software.

OLS model

Ordinary least squares is a method for estimating the unknown parameters in a linear regression model. The linear model of the OLS method is shown in Eq. 1:

$$ \gamma ={c}_0+\sum \limits_{i=1}^p{c}_i\left({x}_i\right)+\varepsilon $$
(1)

where γ = dependent variable, \( c_0 \) = intercept, \( c_i \) = estimated coefficient of the ith independent variable, \( x_i \) = ith independent variable (e.g. elevation, slope), ε = error and p = number of independent variables. The OLS regression method is stationary: a single model is fitted to the entire region and applied uniformly over the whole geographic area. It assumes that the relationship, and therefore its coefficients, is geographically constant. In a complex orographic environment, however, a single predefined linear relationship may not hold over the entire region under study.
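
For the single-predictor case (p = 1, with elevation as the only independent variable), Eq. 1 has a closed-form solution, sketched below in Python. The function name and the sample values are hypothetical, and the sketch does not reproduce the ArcGIS implementation used in the study.

```python
def ols_fit(x, y):
    """Closed-form OLS estimates (c0, c1) for the model y = c0 + c1 * x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


# Hypothetical elevation (m) and cumulative rainfall (mm) pairs
elev = [200.0, 800.0, 1400.0, 2000.0, 2600.0]
rain = [40.0, 62.0, 75.0, 104.0, 119.0]
c0, c1 = ols_fit(elev, rain)
residuals = [yi - (c0 + c1 * xi) for xi, yi in zip(elev, rain)]
print(c0, c1)
```

A single pair (c0, c1) describes the whole dataset, which is the stationarity assumption discussed above.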

GWR model

The GWR model is an extension of ordinary regression that deals with the spatial non-stationarity of empirical relationships. It is a local regression technique appropriate for spatial data with some degree of spatial dependence (Kalogirou 2003). The technique uses locally linked information and allows the regression parameters to vary in the spatial domain. GWR addresses spatial non-stationarity by examining the relationship between rainfall and the explanatory variable (elevation) at every observed point (Brunsdon et al. 1996). Fotheringham et al. (2003) give a complete presentation of the GWR method.

GWR allows local rather than global parameters to be estimated, as in Eq. 2 (Pratt and Chang 2012):

$$ {\gamma}_j={c}_0\left({u}_j,{v}_j\right)+\sum \limits_{i=1}^p{c}_i\left({u}_j,{v}_j\right){x}_{ij}+{\varepsilon}_j $$
(2)

where j indexes the observation locations, \( (u_j, v_j) \) are the longitude and latitude coordinates of location j, and \( c_0(u_j, v_j) \) and \( c_i(u_j, v_j) \) are the local parameters to be estimated at that location. To achieve this, a sub-model is fitted around each observation location using a subset of the original observations. The neighbourhood of each sub-model is defined by a weighting scheme in which nearby observations receive non-zero weights. Usually, the number of sub-models equals the number of observations. By computing a local parameter estimate for each observation in the study area, it is possible to examine the potential variability of the relationship between the dependent and independent variables. More details of the algorithm can be found in Georganos et al. (2017).

In GWR, the parameters are estimated by weighting the contribution of each sample according to its spatial proximity to the location being fitted. Hence, the weighting of observations is not constant in the calibration but varies between locations. Data from observations near the location are weighted more heavily than data from observations farther away (Wang et al. 2005). Both OLS and GWR models were run using the spatial statistics tools in ArcGIS version 10.3.
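
The local fitting described above can be sketched as a distance-weighted least squares fit at each observation location. The sketch assumes a single predictor, a Gaussian kernel with a fixed bandwidth and planar coordinates; the text does not state the kernel used by the ArcGIS tool, and all names and data below are hypothetical.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian distance-decay kernel (an assumed kernel choice)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gwr_fit_at(u0, v0, coords, x, y, bandwidth):
    """Weighted least squares for the sub-model centred on (u0, v0)."""
    w = [gaussian_weight(math.hypot(u - u0, v - v0), bandwidth) for u, v in coords]
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1  # local intercept and local slope


def gwr_fit(coords, x, y, bandwidth):
    """Fit one sub-model per observation, as in Eq. 2 with p = 1."""
    return [gwr_fit_at(u, v, coords, x, y, bandwidth) for u, v in coords]


# Two hypothetical clusters, 100 distance units apart, with different local slopes
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 0), (101, 0), (100, 1), (101, 1)]
elev = [100.0, 200.0, 300.0, 400.0] * 2
rain = [0.10 * z + 5.0 for z in elev[:4]] + [0.02 * z + 10.0 for z in elev[4:]]

local_narrow = gwr_fit(coords, elev, rain, bandwidth=2.0)    # local behaviour
local_wide = gwr_fit(coords, elev, rain, bandwidth=1.0e6)    # approaches global OLS
print(local_narrow[0], local_narrow[-1], local_wide[0])
```

With the narrow bandwidth each cluster recovers its own slope, whereas the very wide bandwidth gives every location nearly the same slope as a single global fit.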

The evaluation of model performance

The coefficient of determination, R2, is calculated to assess the goodness of fit (Caruso and Quarta 1998) of the GWR and OLS models. R2 ranges from 0 (0%) to 1 (100%) and represents the proportion of the variation in the dependent variable that the model explains; a higher value indicates a better fit to the observations. However, interpretation based on R2 alone may be biased, because a small bandwidth yields a high R2. Therefore, the Akaike information criterion (AIC) was used together with R2 to check model performance. Two performance indices, root mean square error (RMSE) and mean absolute error (MAE), were used to verify the prediction accuracy of OLS and GWR:

$$ RMSE=\sqrt{\frac{\sum \limits_{i=1}^N{\left[{z}_i^{\ast }-{z}_i\right]}^2}{N}} $$
(3)
$$ MAE=\frac{1}{N}\sum \limits_{i=1}^N\left|{z}_i^{\ast }-{z}_i\right| $$
(4)

where \( {z}_i^{\ast } \) = predicted value, \( {z}_i \) = observed value and N = number of observations.
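
The accuracy measures can be computed as in the following sketch, which takes the MAE over absolute differences and also includes R2 in its usual form of one minus the ratio of residual to total sum of squares. The function names and sample values are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean square error (Eq. 3)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def mae(predicted, observed):
    """Mean absolute error (Eq. 4), using absolute differences."""
    n = len(observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n


def r_squared(predicted, observed):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted cumulative rainfall (mm)
obs = [50.0, 80.0, 110.0, 140.0]
pred = [53.0, 76.0, 110.0, 145.0]
print(rmse(pred, obs), mae(pred, obs), r_squared(pred, obs))
```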

Measuring spatial autocorrelations

Spatial autocorrelation analysis is performed to assess the characteristics of the geostatistical models based on the observed and predicted data. A substantial literature describes procedures for autocorrelation analysis. In the present study, global and local Moran’s I statistics were used to evaluate the spatial autocorrelation of the rainfall data. Global spatial autocorrelation analysis evaluates a variable over the entire dataset and summarises the mean spatial association between all spatial cells and their surrounding cells (Dai et al. 2010). Moran’s I ranges from approximately −1 to 1, and its significance is assessed with a normalised z-score. At a specified significance level, a Moran’s I significantly greater than zero indicates positive autocorrelation with a clustered pattern, whereas a value significantly less than zero indicates negative autocorrelation with a dispersed pattern.

The global Moran’s I statistic was calculated as given in Eq. 5 (Xu et al. 2015):

$$ I=\frac{N\sum \limits_{i=1}^N\sum \limits_{j=1}^N{w}_{ij}\left({x}_i-\overline{x}\right)\left({x}_j-\overline{x}\right)}{\left(\sum \limits_{i=1}^N\sum \limits_{j=1}^N{w}_{ij}\right)\sum \limits_{i=1}^N{\left({x}_i-\overline{x}\right)}^2} $$
(5)

where N = number of observations, \( x_i \) = observed value of cell i, \( x_j \) = observed value of cell j, \( \overline{x} \) = mean value of \( x_i \), and \( w_{ij} \) = weighting value between cells i and j. The global Moran’s I depicts only the overall clustering pattern; it cannot detect spatial association patterns at individual locations or show where the clusters are. Local spatial autocorrelation analysis, by contrast, assesses the significance of a local statistic at each location and identifies the location of spatial clusters. Hence, the local Moran’s I is used to evaluate the local spatial association and difference between each cell and its surrounding cells (Dai et al. 2010). The local Moran’s I is calculated using Eq. 6 (Anselin 1995):

$$ {I}_i={x}_i\sum \limits_{j=1,j\ne i}^N{w}_{ij}{x}_j $$
(6)

where N = number of observations, \( x_i \) = observed value of cell i, \( x_j \) = observed value of cell j, and \( w_{ij} \) = weighting value between cells i and j, with \( \sum_j w_{ij} = 1 \). The significance of the local Moran’s I, like that of the global Moran’s I, may be assessed using a z-score. In the present study, Moran’s I statistics were calculated using ArcGIS tools.
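
Both statistics can be sketched in a few lines of Python. The local version assumes that the values in Eq. 6 are expressed as standardised deviations from the mean and that the weights are row-standardised so that each row sums to 1; the chain-shaped neighbourhood and the sample values are hypothetical and serve only to show the sign of the statistic.

```python
import math


def chain_weights(n):
    """Binary contiguity weights for n cells arranged in a line (hypothetical layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


def global_morans_i(x, w):
    """Global Moran's I as in Eq. 5."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return n * cross / (s0 * sum(d * d for d in dev))


def local_morans_i(x, w):
    """Local Moran's I per cell, with standardised values and row-standardised weights."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)
    z = [(xi - mean) / sd for xi in x]
    local = []
    for i in range(n):
        row_sum = sum(w[i][j] for j in range(n) if j != i)
        lag = sum(w[i][j] * z[j] for j in range(n) if j != i) / row_sum
        local.append(z[i] * lag)
    return local


w = chain_weights(6)
smooth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]       # similar neighbours: clustered
alternating = [1.0, 6.0, 1.0, 6.0, 1.0, 6.0]  # dissimilar neighbours: dispersed
print(global_morans_i(smooth, w), global_morans_i(alternating, w))
print(local_morans_i(smooth, w))
```

The smoothly varying series gives a positive global value and the alternating series a negative one, matching the clustered and dispersed interpretations described above.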

Results and discussion

Analysis of rainfall data

The Aseer region has the highest average rainfall in Saudi Arabia, distributed over 4 months (March, April, May and June). The cumulative rainfall distribution for the rainy months (March–June) of 1988, 1998, 2000 and 2016 is shown in Fig. 3. For the complete dataset, mean cumulative rainfall was 75.5 and 73.4 mm in the low rainfall years 1988 and 2000, respectively, and 111.4 and 147.7 mm in the high rainfall years 1998 and 2016, respectively.

Fig. 3

Cumulative rainfall (March–June) distribution for 1988, 1998, 2000 and 2016 in the geographical Aseer region

The coefficient of skewness (CSK) is positive in all 4 years (CSK = 0.20 to 1.76), so the complete dataset is right skewed in both low and high rainfall years. In the lowland rainfall datasets, CSK ranges from 0.77 to 1.76, whereas in the upland datasets it ranges from 0.20 to 1.24. The kurtosis is < 3 for 1998, 2000 and 2016, which suggests a platykurtic distribution, whereas it is > 3 for 1988, suggesting a leptokurtic distribution.
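
The shape statistics discussed here can be computed from central moments, as in the sketch below. It uses population moments and non-excess kurtosis, for which a normal distribution has a value of 3, consistent with the threshold used in the text; the exact estimator used in the study is not stated, and the sample values are hypothetical.

```python
def skewness_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical samples: one symmetric, one with a long right tail
sym_skew, sym_kurt = skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
right_skew, right_kurt = skewness_kurtosis([1.0, 1.0, 1.0, 2.0, 10.0])
print(sym_skew, sym_kurt)      # kurtosis below 3: platykurtic
print(right_skew, right_kurt)  # positive skewness: right skewed
```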

Bandwidth scale dependency

The relationship between topography and rainfall in the Aseer region during the 4 years is scale dependent. As the bandwidth broadened and incorporated information from more distant locations, the pattern became more uniform: the regression coefficients were smoothed and approached those of a global regression model. In contrast, narrow bandwidths produced very detailed patterns with larger standard errors.

Figure 4 shows the Stationarity Index (SI) for 1988, 1998, 2000 and 2016 and illustrates the scale dependency of non-stationarity, which became evident when the scale of investigation was changed. In all years, the SI was higher for small bandwidths, while for broader bandwidths the SI flattened (stabilised). The index did not indicate stationarity at any of the investigated spatial scales, which shows that the data are highly non-stationary. The SI decreased abruptly with increasing bandwidth and stabilised at around 12 km; this suggests that 12 km is the appropriate scale of the rainfall–elevation relationship, viz. the smallest geographical area over which a reliable relationship can be established for the whole study area. This scale represents the landscape unit that reflects the natural arrangement of the geographic phenomena, at which differences in non-stationarity can be integrated while removing unnecessary error from the model.

Fig. 4

Stationarity index (SI) for 1988, 1998, 2000 and 2016 at various bandwidths. The SI is calculated as the ratio of the interquartile range (IQR) of the standard errors (SEs) for the geographically weighted regression coefficients to double the SE of a constant
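
A stationarity index of this kind can be sketched as follows. The sketch assumes the form commonly used in the GWR literature, namely the IQR of the local GWR coefficients divided by twice the standard error of the corresponding global OLS coefficient; whether this matches the exact quantities intended in the caption is an assumption, and the coefficient values are hypothetical.

```python
import statistics


def stationarity_index(local_coefficients, global_se):
    """IQR of local coefficients divided by twice the global standard error.

    Under the assumed definition, values above 1 suggest that the local
    coefficients vary more than a stationary relationship would imply.
    """
    q1, _, q3 = statistics.quantiles(local_coefficients, n=4, method="inclusive")
    return (q3 - q1) / (2.0 * global_se)


# Hypothetical local slopes at one bandwidth and a hypothetical global SE
local_slopes = [0.01, 0.02, 0.03, 0.04, 0.05]
si = stationarity_index(local_slopes, global_se=0.005)
print(si)
```

Repeating the calculation for GWR fits at a range of bandwidths would trace a curve like the one plotted in Fig. 4.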

The GWR and OLS model assessment

Model assessments compared the GWR and OLS outputs for 1988, 1998, 2000 and 2016. The coefficient of determination (R2) was much higher for the GWR model than for the OLS model in all 4 years (Fig. 5). The GWR estimates deviate less, while the OLS estimates are uneven. R2 ranges from 0.86 to 0.94 for the GWR models and from 0.22 to 0.35 for the OLS models.

Fig. 5

R2 for GWR and OLS models during 1988, 1998, 2000 and 2016 for complete, upland and lowland geographical region

The F-test based on ANOVA suggested that the GWR model gives a substantial improvement over the OLS model in all 4 years (p < 0.01). The GWR models explained more of the variance and had lower AIC values (year 2016 = 26,913; year 2000 = 23,971; year 1998 = 28,682; year 1988 = 26,914) than the OLS models (year 2016 = 93,740; year 2000 = 84,458; year 1998 = 87,363; year 1988 = 82,022), a comparison that accounts for differences in degrees of freedom and model complexity.

Geographical pattern of the rainfall–elevation relationship

In the present study, elevation was computed at the optimal DEM pixel size (1 km) identified in earlier correlation studies (Daly et al. 1994; Kumari et al. 2016). The rainfall–topography correlation varies in space and time throughout the Aseer region. The GWR models produced maps of slope parameters (β coefficients), local R2 and standardised residuals (StdResid) that reveal how the relationship varies geographically at the effective bandwidth. The local coefficient of determination (local R2) ranges from 0 to 1 and indicates how well the local regression model fits the observed values; local models with high values fit well, whereas those with low values perform poorly. Mapping the local R2 values shows where GWR predicts well and where it predicts poorly, which may indicate that significant variables are missing from the regression model. Residuals are the differences between the observed and predicted y values. Standardised residuals have a mean of 0 and a standard deviation of 1. In the low rainfall years (1988 and 2000), the local fits (Fig. 6) are high (R2 > 0.4) in the northwestern (Al Namas, Sabat Alalayah and Bisha), the western and northeastern (Tathleeth and Morighan) and the southwestern (Abha, Alsooda, Muhayil) parts of the Aseer region. This signifies that elevation is a very useful determinant of rainfall in these areas, whereas the lower local fits in the north and north-central parts of the region signify that land use and other ecological factors have strong influences there. Figure 7 shows that in the high rainfall years (1998 and 2016), the local fits are high in the northwestern and northeastern parts of the Aseer region, whereas the north and north-central parts show lower local fits.

Fig. 6

Spatial variation of regression outputs for the low rainfall years (1988 and 2000) from the GWR model at the effective bandwidth. a Slope parameter (β coefficient). b Local R-squared (R2). c Standardised residuals

Fig. 7

Spatial variation of regression outputs for the high rainfall years (1998 and 2016) from the GWR model at the effective bandwidth. a Slope parameter (β coefficient). b Local R-squared (R2). c Standardised residuals

The local fits are better in the high rainfall year 2016 than in the relatively low rainfall year 1988, although the general clustering pattern is similar in all 4 years (Figs. 6 and 7). The vast majority of the rainfall coefficients are positive, implying that an increase in elevation relates to an increase in rainfall. However, the rate of variation increases markedly in parts of the north and north-central Aseer region (Figs. 6 and 7).

The strength of the associations is higher in 1988 and 2000 than in 1998 and 2016. Although the rainfall coefficients were lower over most of the region in the high rainfall years 1998 and 2016 than in the low rainfall years 1988 and 2000, a distinct cluster extending over the central and western parts had remarkably high values that exceeded all the coefficient values of 1988 and 2000. The findings suggest that rainfall was sensitive to elevation in the Aseer region in all the years studied. Rather than forming a continuous transitional zone, the semi-arid Aseer region forms clusters of the β coefficient (Fig. 8) that mark the transition between semi-arid and arid environments, and the size of these clusters depends on the quantity of rain.

Fig. 8

Spatial clustering maps for the slope coefficient in the Aseer region in 1988, 1998, 2000 and 2016, respectively. Dark brown dots suggest significant clustering of similarly high values while light blue dots are significant clustering of low values

Figure 9 shows the linear spatiotemporal trends of the rainfall coefficients between 1988 and 2000 and between 1998 and 2016. Between 1988 and 2000, the influence of elevation as a local predictor of rainfall shows a negative trend in most of the eastern and central-northwestern parts of the Aseer region, whereas positive trends occur in the southwestern, northeastern and northwestern parts. Between 1998 and 2016, the negative trend over most of the northern region extended east-west, whereas positive trends were found in the western and central parts of the region.

Fig. 9

Temporal trends of the rainfall coefficient between 1988 and 2000 and between 1998 and 2016 over the rainy months (cumulative rainfall March–June). The trends describe temporal changes in the significance of elevation as a local predictor of rainfall in the Aseer region

Spatial variability in predicted patterns

The performance of the global (OLS) and local (GWR) elevation-based models for the cumulative rainfall (March–June) datasets of the 4 years was assessed using RMSE and MAE (Fig. 10). The GWR local model performs better than the OLS global model in all the years considered. Figure 10 shows scatterplots of the observed and predicted values from the OLS and GWR models for 1988, 1998, 2000 and 2016. The OLS global models were not consistent, misestimating both high and low rainfall values, because a single regression equation with a constant set of parameters was used to represent the relationship over an extensive area with high spatial heterogeneity. The GWR local modelling approach produced more accurate estimates by taking local characteristics into account. Figure 11 shows the observed and predicted spatial patterns of rainfall for the low rainfall years (1988 and 2000) and the high rainfall years (1998 and 2016). The OLS model created more generalised patterns that disregarded local geographical variability in rainfall across the Aseer region (RMSE 19.35 and MAE 15.01 in 1988; RMSE 25.65 and MAE 19.12 in 1998; RMSE 21.97 and MAE 18.17 in 2000; RMSE 36.05 and MAE 27.93 in 2016). The GWR local model (Table 2), however, had higher accuracy because it considered the local geographical variability in the rainfall–elevation relationship and other ecological factors (RMSE 8.24 and MAE 4.96 in 1988; RMSE 10.42 and MAE 6.03 in 1998; RMSE 5.91 and MAE 3.62 in 2000; RMSE 8.24 and MAE 4.96 in 2016).

Fig. 10 Scatterplots of observed and estimated rainfall for the rainy months (March–June). OLS model in 1988, 1998, 2000 and 2016; GWR model in 1988, 1998, 2000 and 2016. RMSE and MAE refer to the root mean square error and mean absolute error, respectively

Fig. 11 Observed and predicted spatial patterns of rainfall for 1988–2000 and 1998–2016: observed pattern, global OLS model and local GWR model

Table 2 Performance of the elevation-based global (OLS) and local (GWR) models for the cumulative rainfall (March–June) datasets of the 4 years

The usefulness of GWR lies in its local approach to analysing the relationship between spatial variables. It exploits the non-stationarity in the relationship for better prediction, reveals spatial patterns in the model residuals and reduces the spatial autocorrelation of the residuals. Regression analysis with the conventional global OLS model shows that there is a significant relationship between elevation and rainfall. The GWR model allows the regression parameters to vary in space and time and has a smaller residual sum of squares than OLS. Applying GWR to the spatial relationship significantly decreases both the degree of autocorrelation and the absolute values of the regression residuals. The results suggest that GWR provides a better solution to the problem of spatially autocorrelated error terms in spatial modelling than global OLS regression.
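One common way to quantify the spatial autocorrelation of regression residuals is global Moran's I. The sketch below is illustrative only: this section does not state which autocorrelation statistic or weighting scheme was used, so the binary distance-band weights, the function name and the sample residuals are all assumptions.

```python
import math


def morans_i(coords, values, threshold):
    """Global Moran's I with binary distance-band weights.

    Two distinct locations are treated as neighbours (weight 1) when
    their distance is at most `threshold`; otherwise the weight is 0.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= threshold:
                cross_products += deviations[i] * deviations[j]
                weight_sum += 1.0
    variance_sum = sum(d * d for d in deviations)
    if weight_sum == 0.0 or variance_sum == 0.0:
        raise ValueError("no neighbours within threshold, or constant values")
    return (n / weight_sum) * (cross_products / variance_sum)


# Hypothetical stations along a transect (km) with two residual patterns.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # similar values adjacent
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # dissimilar values adjacent

print("clustered residuals:", morans_i(coords, clustered, threshold=1.5))
print("alternating residuals:", morans_i(coords, alternating, threshold=1.5))
```

Positive values indicate that similar residuals cluster in space, which is the pattern a well-specified local model is expected to reduce; values near zero indicate no spatial structure in the residuals.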

The results demonstrate that the elevation–rainfall relationship is not stationary over the semi-arid Aseer region during the rainy months (March–June) in 1988–2000 and 1998–2016. The findings suggest that elevation is a powerful predictor of rainfall when spatial non-stationarity is integrated into the regression model. The coefficient was positive across most of the study region, although some patches show very weak or negative relationships. The strength of the association also varied across the study area. GWR detects continuous variation in spatial relationships by incorporating local information, and it performs considerably better than the global OLS models.

Kumari et al. (2017) suggested that global OLS techniques perform poorly compared with the local geographically weighted regression (GWR) technique in explaining the variability in the rainfall–topography relationship over complex terrain such as the Himalayas. Their study found very high geographical variability in the relationship between rainfall (acquired from 80 rain-gauge stations) and topography using a GWR model, which performed effectively, with a 48% improvement over OLS. The relationship between elevation and climatic variables such as rainfall has been shown to be scale dependent (Zhao et al. 2015; Kumari et al. 2017), and identifying an appropriate bandwidth is an essential input to the GWR model (Gao and Li 2011). The bandwidth choice is a trade-off between the variance of the local estimates and model bias. In general, this choice is based on AICc, but with large sample sizes AICc can favour models with extremely small bandwidths, overestimated R2 values and large standard errors (Propastin et al. 2008). This hinders substantive interpretation and introduces considerable noise into the results. In the present study, AICc was appropriate for comparing GWR and OLS once an appropriate bandwidth had been assigned on the basis of the stationarity index (SI). However, it is essential to consider that bivariate models with large samples (n > 10,000) can be expected to encounter very high collinearity problems (the coefficient estimates are unstable and difficult to interpret) in the local exploratory relationships. Estimating SI at increasing spatial scales provides information about the bandwidth dependency of the elevation–rainfall relationship in the Aseer rainfall season. The 4-year dataset, selected on the basis of low and high rainfall occurrence for detailed examination, showed that SI stabilised at 12 km (Fig. 4), and the regressions fitted at that bandwidth were more stable and consistent, because the condition variables and local correlation coefficients did not indicate any biased local correlation.
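The bandwidth screening described above can be sketched as follows. The formula for SI is not given in this section, so the sketch assumes a definition commonly used in the GWR literature: the interquartile range of the local GWR coefficients divided by twice the standard error of the corresponding global OLS coefficient. The function names and all numerical values are hypothetical.

```python
import math
import statistics


def ols_slope_se(x, y):
    """Standard error of the slope in a global bivariate OLS fit y = a + b*x."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired observations")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("predictor has no variation")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(rss / (n - 2) / sxx)


def stationarity_index(local_slopes, global_slope_se):
    """Assumed definition: IQR of local slopes / (2 * SE of global slope)."""
    q1, _, q3 = statistics.quantiles(local_slopes, n=4, method="inclusive")
    return (q3 - q1) / (2.0 * global_slope_se)


# Hypothetical local slopes obtained at two bandwidths (km), for illustration.
local_slopes_by_bandwidth_km = {
    4: [-0.08, -0.02, 0.03, 0.09, 0.15],
    12: [0.01, 0.03, 0.04, 0.05, 0.07],
}
global_se = 0.01  # hypothetical standard error of the global OLS slope

for bandwidth, slopes in local_slopes_by_bandwidth_km.items():
    print(bandwidth, "km:", stationarity_index(slopes, global_se))
```

Under this assumed definition, SI values well above 1 are usually read as evidence of non-stationarity, and plotting SI against bandwidth shows the scale at which the index levels off.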

Conclusions

The main objective of the study was to examine the spatial variation in the rainfall–topography relationship using a non-stationary modelling technique in the semi-arid Aseer region, Saudi Arabia. GWR was applied to topo-climatic data (elevation and rainfall) to analyse the cumulative rainfall for the rainy months (March to June) of the 4 years, estimated from the CHIRPS product. In GWR, the weight that a sample contributes to the analysis is based on its spatial proximity to the location under consideration. Hence, the weighting of an observation is not constant in the calibration but differs from location to location. The GWR models produced maps of slope parameters (β coefficients), local R2 and standardised residuals (StdResid) to identify the geographical variability of the relationships between effective bandwidth and related factors. The coefficient of determination (R2) for GWR was much higher than that of the OLS model for all 4 years. The results support the hypothesis that GWR local modelling is a viable alternative to OLS global modelling in heterogeneous areas that are sensitive to ecological changes.
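The location-dependent weighting described above can be illustrated with a minimal bivariate GWR sketch. The kernel used in the study is not stated in this section, so the fixed Gaussian kernel is an assumption, as are the function names and the synthetic station data; the 12 km bandwidth simply reuses the scale at which SI stabilised.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian distance-decay kernel (an assumed kernel choice)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted least squares fit of y = a + b*x at one target location.

    Observations close to `target` receive larger weights, so the
    returned (intercept, slope) pair is specific to that location.
    """
    w = [gaussian_weight(math.dist(c, target), bandwidth) for c in coords]
    w_sum = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean)
              for wi, xi, yi in zip(w, x, y))
    if sxx == 0.0:
        raise ValueError("no weighted variation in the predictor")
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# Synthetic stations (coordinates in km): rainfall rises with elevation in
# the western cluster and falls with elevation in the eastern cluster.
coords = [(0, 0), (5, 0), (10, 0), (100, 0), (105, 0), (110, 0)]
elevation = [500.0, 1000.0, 1500.0, 500.0, 1000.0, 1500.0]  # m
rainfall = [35.0, 60.0, 85.0, 50.0, 40.0, 30.0]             # mm

_, west_slope = gwr_local_fit(coords, elevation, rainfall, (5, 0), 12.0)
_, east_slope = gwr_local_fit(coords, elevation, rainfall, (105, 0), 12.0)
# A very large bandwidth weights all stations almost equally, which
# approximates the single global OLS equation.
_, global_slope = gwr_local_fit(coords, elevation, rainfall, (55, 0), 1e9)

print("west:", west_slope, "east:", east_slope, "global:", global_slope)
```

In this synthetic case the global slope averages two opposing local relationships and describes neither cluster well, which is the kind of non-stationarity that a single OLS equation cannot represent.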

The GWR models explained more of the variance and yielded lower AICc values than the OLS models, which justifies the additional degrees of freedom and model complexity. The findings suggest that rainfall is sensitive to elevation over the Aseer region in all the rainfall years. Rather than forming a continuous transitional zone, the semi-arid Aseer region forms clusters according to the β coefficient that act as the transitional division between the semi-arid and arid environments, and the size of these clusters depends on the quantity of rain. The findings suggest that elevation is a powerful predictor of rainfall when spatial non-stationarity is integrated into the regression model. Additionally, in these areas, rainfall appears to be the dominant determinant in understanding the topographical variation. In the study region, the coefficient was positive across most of the region. However, very weak or negative relationships were found in some patches. Moreover, the strength of the association also varied across the study area. The results show that parts of the study area were particularly sensitive to variability in rainfall and formed large clusters that connected the semi-arid and arid climatic zones. In these areas, elevation appears to be the dominant factor in understanding the variability of rainfall. Moreover, regions located mainly around the lower-altitude areas and valleys showed weak relationships with rainfall, indicating the need to incorporate additional variables to explain variations in rainfall. The GWR approach produced better predictions and lower autocorrelation in the residuals and highlighted interesting local variations. The findings show that the orographic effect increases rainfall with elevation in the mountainous Aseer region, specifically in upland areas (> 1500 m). Terrain plays a key role in extreme rainfall variation in the study area. The rainfall–elevation relationship for long-duration rainfall events can help to improve rainfall pattern estimation for a landslide warning system.

As such, GWR is strongly recommended as both an explanatory and an exploratory method in spatiotemporal analysis and water resource modelling, in applications where the spatial constancy of relations between variables is itself a subject of further research. Additionally, the spatially variable relationship between NDVI, rainfall and elevation has not yet been explored in depth in semi-arid mountainous regions. A further extension of this study is therefore to model the complex relations between NDVI and rainfall, with a focus on the vegetative growing season, using a GWR local non-parametric regression method.