1 Introduction

In recent years, studies on predicting and analyzing precipitation have gained momentum due to their importance in managing water resources, agriculture, and environmental issues. As a vital element of climate and climate change, precipitation has become the focus of many research projects. Spatial and temporal variations in precipitation concentration refer to changes in how precipitation is distributed in a particular region over a certain period of time. These variations may profoundly affect different environmental elements such as drought (Parajka et al. 2010; Chang et al. 2017; Sarricolea et al. 2019; Darand and Khandou 2020; Darand and Pajzhoh 2022), flood (Shi et al. 2015; Huang et al. 2019; Singh et al. 2020; Hekmatzadeh et al. 2020), soil erosion, and change in vegetation cover (Zhao et al. 2012; Vyshkvarkova et al. 2018). They can also alter the spatiotemporal distribution of precipitation in unusual ways, boosting the frequency, intensity, and duration of heavy precipitation events (Zhai et al. 2005).

An important aspect of climate change requiring close inspection is variation in the temporal distribution of precipitation (Adegun 2012). Unbalanced distribution of precipitation leads to drought, declining soil moisture, and loss of vegetation cover. In such a situation, even precipitation below the regional average may cause dangerous floods (Fu et al. 2023). Variations in precipitation also influence water resources, including groundwater, surface water, and snow reserves. It is therefore essential to use indices to study changes in precipitation. Indices of precipitation concentration are used as warning instruments in hydrological, water resource, and environmental management programs with the aim of increasing the capability to deal with floods, erosion, and related hazards (Adegun 2012).

The indices that have recently become popular are the concentration index (CI), precipitation concentration index (PCI), precipitation concentration period (PCP), and precipitation concentration degree (PCD) (Li et al. 2011; Zhang et al. 2019; Zhao et al. 2019; Sarricolea et al. 2019; Kaboli et al. 2021). Owing to its scientific and practical advantages, PCI has been extensively used to analyze spatiotemporal patterns in different regions, including Spain (Martin-Vide 2004; Máyer et al. 2017), New Zealand (Caloiero 2014), Peru (Zubieta et al. 2017), India (Chatterjee et al. 2016; Yin et al. 2016), Algeria (Benhamrouche et al. 2015), Chile (Sarricolea et al. 2019; Sarricolea et al. 2019), and the United Arab Emirates (Royé & Martin-Vide 2017). These studies explored precipitation intensity and concentration. Precipitation concentration indices have also been applied in a number of regional studies. Alijani et al. (2008), for example, explored precipitation intensity at 90 synoptic stations in Iran. The results demonstrated an irregular precipitation distribution in Iran, with the coastlines of the Caspian Sea, the areas around the Zagros mountain range, and the northeast of Iran recording the heaviest precipitation. Cortesi et al. (2012) studied daily precipitation concentration across Europe, discovering that the highest concentration of daily precipitation was located on the coasts of Spain and France. They reported that latitude and distance from the sea were the main predictors of the distribution of daily precipitation concentration in the studied region. Valli et al. (2013) examined the annual and seasonal precipitation pattern in the state of Andhra Pradesh, India, using PCI. The findings showed an irregular precipitation distribution in the region, with PCI values varying from 16 to 35. Mondol et al. (2018) carried out a similar study in Bangladesh and observed a significant rise in the temporal concentration of precipitation after 2000, which was consistent with the increasing spatiotemporal irregularity of precipitation intensity and duration. Benhamrouche et al. (2022), who examined daily precipitation concentration in Central Coast Vietnam, reported high values of CI, ranging from 0.62 to 0.72. Wang et al. (2022) studied the spatiotemporal variability of CI in Shaanxi Province, China. The results indicated that the values of the precipitation concentration index and CI were greater in the south of Shaanxi and smaller in its north. The results also showed a falling trend in the annual rainfall of Shaanxi over recent decades.

Spatiotemporal variability of precipitation is a key factor in managing water resources and agricultural products, reducing drought-related hazards, controlling floods, and gaining a better understanding of the impact of climate change (Zamani et al. 2018). Consequently, the ability to forecast spatiotemporal variations in precipitation is critical in managing the water and soil resources of a region. Precipitation is of particular importance in a water-deficient country like Iran, where water resources depend on rainfall and the population is on the rise. The high spatiotemporal variability of precipitation in Iran introduces numerous uncertainties into rainfall projections, and the critical condition of the country's water resources makes a comprehensive analysis essential for water resource management. To this end, the current study attempts to forecast spatiotemporal variations in precipitation concentration using CMIP6 data, the model generation associated with the IPCC Sixth Assessment Report.

2 The study area

The study area comprised Iran, which is located in southwest Asia between latitudes 25° and 40° N and longitudes 44° and 64° E, and covers an area of 1,648,195 km² (Fig. 1). Iran borders Armenia and Azerbaijan in the northwest, the Caspian Sea in the north, Turkmenistan in the northeast, Afghanistan and Pakistan in the east, the Persian Gulf and the Oman Sea in the south, and Iraq and Turkey in the west. In general, Iran is located in a mountainous and semi-arid region, and its average elevation is over 1200 m above sea level. Since Iran encompasses a vast area, hosts numerous geographical factors, and is located at the transition point of different atmospheric circulation systems, it is home to a wide range of climates and ecosystems.

Fig. 1 Distribution of the studied ground stations

In terms of temperature, Iran is divided into a cold mountainous region and a hot low-altitude region. The average temperature of the country is around 18 °C. The presence of various synoptic systems like the Ganges low pressure and Azores high pressure, as well as the moisture content of the atmosphere, plays a key role in the formation of different temperature zones in Iran (Masoodian and Kavyani 2008). Temperatures across the country are more homogeneous in summer than in winter. The mean annual precipitation in Iran is 250 mm, meaning that the country is categorized as arid. During cold seasons, the dominance of westerly winds and proximity to the moisture source of the Mediterranean Sea cause heavy rainfall. During hot seasons, however, the presence of the Azores high pressure significantly reduces the amount of precipitation. Moreover, the spatiotemporal distribution of precipitation in Iran has an irregular pattern. The highest and lowest precipitation amounts respectively belong to the southern coastlines of the Caspian Sea and the central deserts, including the Lut Desert and the Great Salt Desert.

3 Data

3.1 Synoptic data

In the current study, data from 95 synoptic stations covering the period from 1985 through 2014 were used as the observational basis. Stations were selected so that they were located in various climatic zones, had little missing data, and met minimum standards measured through quantitative tests.

3.2 Coupled model intercomparison project (CMIP) and shared socioeconomic pathways scenarios (SSPs)

CMIP is a key activity in climate forecasting research. This project is a powerful source for advancing model development and gaining scientific understanding of the earth system through systematic comparison of the output of climate models generated in various climate modeling centers. Compared to the previous versions, CMIP6 has not only improved the algorithms and physical processes, but also included new variables in the ocean, ocean biogeochemistry, and sea ice sections (Zhu and Yang 2020). With the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report in 2021, new climate change scenarios, known as SSPs, were introduced. SSPs are climate change scenarios of projected socioeconomic global changes up to 2100. They are used to derive greenhouse gas emission scenarios with different climate policies. In the present study, the two scenarios of SSP1-2.6 (very low greenhouse gas emission and adaptability) and SSP5-8.5 (very high greenhouse gas emission and high adaptability) were used for the near future (2021–2040).

Table 1 presents the details of the five general circulation models (GCMs) used in the current study. Attempts were made to select models that were available, were widely used by researchers, and took climate sensitivity into account. As such, the Earth System Models (ESMs), which explicitly model carbon movement in the earth system, were exploited in this research. These models try to simulate all aspects of the earth system, including physical, chemical, and biological processes. They are thus more sophisticated in estimating climate than previous models (i.e., the global climate model/GCM), which only simulate atmospheric and oceanic processes. In addition to the models' availability and popularity (for example, GFDL-ESM4 was utilized by Sentman et al. (2018), IPSL-CM6A-LR was exploited by Boucher et al. (2020), MPI-ESM1-2-HR was used by Müller et al. (2018), and UKESM1-0-LL was adopted by Sellar et al. (2020)), their climate sensitivity was an inclusion criterion. Climate sensitivity is typically defined as the global temperature rise following a doubling of CO2 concentration in the atmosphere compared to pre-industrial levels. Before industrialization, CO2 was about 260 parts per million (ppm), so a doubling would be at around 520 ppm.

Table 1 Models used for ensemble

3.3 Data extraction, regridding, and bias correction

First, the model data were extracted for the areas close to the synoptic stations. Then, in order to make the pixel dimensions of the model and ground data comparable, the extracted data were regridded using kriging. Examination of the data indicated that a spatial resolution of 19 km was appropriate, so regridding yielded 19 × 19 km pixels. In the next stage, the resulting pixels were downscaled through the delta change factor (DCF) approach using the temperature data of the utilized models. More precisely, the DCF approach was adopted to apply bias correction (skew correction) to the decadal prediction models. DCF was calculated using Eq. (1):

$$ P_{\text{frc}}^{\text{BC}}(t) = P_{\text{obs}}(t) \cdot \left[ \frac{\mu_{m} P_{\text{frc}}(t)}{\mu_{m} P_{\text{contr}}(t)} \right] $$
(1)

where P is the target variable, the subscript contr refers to the CMIP6-DCPP model simulation during the control period, obs refers to the period of observation, frc is the projected future time series whose bias must be corrected, the superscript BC marks the projected future time series whose bias has been corrected, t is the time step, and \({\mu }_{m}\) is the long-term monthly average (Mendez 2020). After downscaling, the models' errors were assessed using several metrics, namely RMSE, MAE, MBE, and R². All the abovementioned steps were coded in MATLAB, and the data are retrievable from https://github.com/poyan2021/ensemble2.git.
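To make the correction step concrete, the following is a minimal Python sketch of a multiplicative delta change correction in the spirit of Eq. (1), together with the four error metrics named above. It is not the authors' MATLAB code. The function names, the toy numbers, the fallback factor of 1 when the control-period mean is zero, and the reading of R² as the squared Pearson correlation are all assumptions made for illustration.

```python
import math


def monthly_means(values, months):
    """Long-term mean of `values` for each calendar month label."""
    sums, counts = {}, {}
    for value, month in zip(values, months):
        sums[month] = sums.get(month, 0.0) + value
        counts[month] = counts.get(month, 0) + 1
    return {month: sums[month] / counts[month] for month in sums}


def delta_change(obs, obs_months, control, control_months, future, future_months):
    """Scale each observation by the future/control ratio of monthly means."""
    mu_frc = monthly_means(future, future_months)
    mu_contr = monthly_means(control, control_months)
    corrected = []
    for value, month in zip(obs, obs_months):
        # Assumption: keep the observation unchanged if the control mean is zero.
        factor = mu_frc[month] / mu_contr[month] if mu_contr[month] > 0 else 1.0
        corrected.append(value * factor)
    return corrected


def error_metrics(simulated, reference):
    """RMSE, MAE, MBE and R2 of a simulated series against a reference series."""
    n = len(reference)
    diffs = [s - r for s, r in zip(simulated, reference)]
    mean_s = sum(simulated) / n
    mean_r = sum(reference) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(simulated, reference))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return {
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
        "MAE": sum(abs(d) for d in diffs) / n,
        "MBE": sum(diffs) / n,
        # Assumption: R2 is the squared Pearson correlation (fails for constant series).
        "R2": cov * cov / (var_s * var_r),
    }


# Toy data: two Januaries (1) and two Februaries (2); values are invented.
obs = [10.0, 20.0, 30.0, 40.0]
months = [1, 2, 1, 2]
control = [8.0, 16.0, 12.0, 24.0]
future = [12.0, 30.0, 18.0, 30.0]
corrected = delta_change(obs, months, control, months, future, months)
metrics = error_metrics([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
```

In this toy case the future-to-control ratio is 1.5 in both months, so every observed value is scaled by the same factor; with real data the factor differs by calendar month.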

3.4 Ensemble method

To minimize the uncertainty of the used models, a multi-model ensemble based on the correlation-weighted average was used for forecasting. Equation 2 was used in this ensemble method.

$$ w^{T} x^{i} = \sum\limits_{k = 1}^{K} w_{k} x_{k}^{i} $$
(2)

where \(w_{k}\) is the weight of the data in each model and \(x_{k}\) is the data estimated based on the model (Bai et al. 2020). Pearson correlation was used to estimate the weight of the data obtained from each model. Pearson correlation coefficient was calculated using Eq. 3:

$$ r_{xy} = \frac{\sum\limits_{i = 1}^{N} (X_{i} - \overline{X})(Y_{i} - \overline{Y})}{\sqrt{\sum\limits_{i = 1}^{N} (X_{i} - \overline{X})^{2} \sum\limits_{i = 1}^{N} (Y_{i} - \overline{Y})^{2}}} $$
(3)

Finally, the models whose data had stronger correlations with the real data received higher weights.
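The weighting described by Eqs. 2 and 3 can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the text does not say how correlations are turned into weights, so clipping negative correlations to zero and normalizing the weights to sum to one are assumptions, as are the function names and the toy series.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (Eq. 3); fails if either series is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return num / den


def correlation_weights(models, observed):
    """One weight per model, proportional to its correlation with the observations."""
    # Assumption: negative correlations get zero weight; weights sum to one.
    raw = [max(pearson(series, observed), 0.0) for series in models]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no model is positively correlated with the observations")
    return [value / total for value in raw]


def weighted_ensemble(models, weights):
    """Weighted sum over models at each time step (Eq. 2)."""
    steps = len(models[0])
    return [sum(w * series[i] for w, series in zip(weights, models)) for i in range(steps)]


# Toy data: model_a tracks the observations, model_b is anti-correlated.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [2.0, 4.0, 6.0, 8.0]
model_b = [4.0, 3.0, 2.0, 1.0]
weights = correlation_weights([model_a, model_b], observed)
ensemble = weighted_ensemble([model_a, model_b], weights)
```

Here the anti-correlated model receives zero weight, so the ensemble reduces to `model_a`. Note that correlation-based weights reward agreement in pattern, not in magnitude, which is one reason a bias correction step is applied beforehand.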

3.5 Validation of the ensemble system output and system member models

The Taylor diagram was used to validate the direct output of the models. It is a suitable instrument to validate the output of a set of climate models and is growing in popularity in climatology studies (Wehner 2013). The Taylor diagram is based on the geometric relationship between correlation coefficient, standard deviation, and root mean square deviation (RMSD). This diagram is displayed in the form of a semicircle showing negative and positive correlations, or a quarter circle indicating only positive correlations. In both forms, the correlation coefficients are read from the angular position along the arc, standard deviations are presented as concentric circles around the center of the circle (the origin), and RMSDs are shown as concentric circles around the reference point. The hollow circle on the horizontal axis marks the reference point, that is, the position of the ground station in light of the standard deviation of its time series. Accordingly, the models that are located closer to the reference point enjoy higher accuracy (Azizi et al. 2016) (Fig. 2).
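The geometric relationship behind the Taylor diagram is the law of cosines, E'² = σ_m² + σ_r² − 2 σ_m σ_r R, where σ_m and σ_r are the model and reference standard deviations, R is their correlation, and E' is the centered RMSD (the RMSD after removing each series' mean). The short Python sketch below computes the three statistics and checks that identity numerically. The function name and the toy series are illustrative assumptions, and the use of population (divide-by-N) standard deviations is a choice made here so that the identity holds exactly.

```python
import math


def taylor_statistics(model, reference):
    """Return (sd_model, sd_reference, correlation, centered_rmsd)."""
    n = len(reference)
    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    # Population standard deviations (divide by n).
    sd_m = math.sqrt(sum((m - mean_m) ** 2 for m in model) / n)
    sd_r = math.sqrt(sum((r - mean_r) ** 2 for r in reference) / n)
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(model, reference)) / n
    corr = cov / (sd_m * sd_r)
    # Centered RMSD: means are removed before differencing.
    crmsd = math.sqrt(
        sum(((m - mean_m) - (r - mean_r)) ** 2 for m, r in zip(model, reference)) / n
    )
    return sd_m, sd_r, corr, crmsd


# Toy series standing in for a model and a ground station.
sd_m, sd_r, corr, crmsd = taylor_statistics([1.0, 3.0, 2.0, 5.0], [1.0, 2.0, 3.0, 4.0])
law_of_cosines = sd_m ** 2 + sd_r ** 2 - 2.0 * sd_m * sd_r * corr
```

Because the three quantities are tied together in this way, a single point on the diagram encodes all of them: its distance from the origin is the model's standard deviation, its angle gives the correlation, and its distance from the reference point is the centered RMSD.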

Fig. 2 Standard deviation error of the used ensemble model

Given the large number of synoptic stations used in this study, 10 stations, each representing one climatic zone of the country, were selected for the validation process. The model data and the ensemble output were validated at these stations. Figure 3 displays the precipitation Taylor diagram for the selected stations. The diagram shows that, at all stations, the ensemble data had the highest similarity with the observed data, as reflected in low error and standard deviation values and high correlation coefficients. It is therefore argued that the ensemble system was able to significantly reduce the error of the estimated data.

Fig. 3 Validation of model and ensemble data based on Taylor diagram

3.6 PCI

PCI indicates the concentration and distribution of rainfall. The seasonal scale of this index is calculated using Eq. 4:

$$ \text{PCI}_{\text{seasonal}} = \frac{\sum_{i=1}^{3} P_{i}^{2}}{\left( \sum_{i=1}^{3} P_{i} \right)^{2}} \times 25 $$
(4)

where \(P_{i}\) is the amount of rainfall in the ith month. According to the suggested formula, the minimum value for PCI is 8.3, which indicates complete regularity in precipitation distribution. In other words, this value implies that the same amount of precipitation has been recorded in each month. A PCI equal to 16.7 indicates that the whole precipitation has been concentrated in 1/2 of the time interval. Based on this index, Oliver (1980) suggests that PCI values smaller than 10 indicate regular precipitation distribution, values between 11 and 15 show moderate precipitation distribution, values ranging from 16 to 20 display irregular distribution, and values greater than 20 demonstrate high irregularity in precipitation distribution (Table 2) (Luis et al. 2011).
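As a worked illustration of Eq. 4, the Python sketch below computes the seasonal PCI for three monthly totals and maps the result to the classes described above. The function names and example values are illustrative assumptions. The text leaves small gaps between class boundaries (for example between 10 and 11), so treating each upper bound as inclusive is also an assumption.

```python
def seasonal_pci(monthly_precip):
    """Seasonal PCI (Eq. 4) from the three monthly totals of a season."""
    total = sum(monthly_precip)
    if total == 0:
        raise ValueError("PCI is undefined for a season with no precipitation")
    return 25.0 * sum(p * p for p in monthly_precip) / total ** 2


def classify_pci(pci):
    """Map a PCI value to the classes attributed to Oliver (1980)."""
    # Assumption: upper bounds are inclusive, closing the gaps in the stated ranges.
    if pci <= 10:
        return "regular"
    if pci <= 15:
        return "moderate"
    if pci <= 20:
        return "irregular"
    return "highly irregular"


uniform = seasonal_pci([50.0, 50.0, 50.0])   # equal rainfall in all three months
single = seasonal_pci([90.0, 0.0, 0.0])      # all rainfall in one month
```

With equal rainfall in all three months the index reaches its minimum of 25/3 ≈ 8.3, and with all rainfall in a single month it reaches its seasonal maximum of 25.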

Table 2 Classification of the precipitation concentration index (PCI) (Oliver 1980)

3.7 PCD and PCP

PCD indicates the degree to which annual precipitation in a particular region is concentrated in time, whereas PCP indicates the period of the year in which that precipitation is concentrated. Both indices are obtained by treating the year as a 360-degree circle and each monthly rainfall amount as a vector whose direction is given by its position in the year. The vectors are then added up. PCD is estimated from the magnitude of the resulting vector relative to total annual precipitation, and PCP is estimated from its direction (angle). Using this procedure, annual precipitation concentration and its timing can be estimated for a particular region. The procedure for calculating PCD and PCP was proposed by Zhang and Qian (2003). PCD has recently become an important index to gauge the regularity of regional precipitation distribution, while PCP sheds light on the timing of precipitation over the course of the year using quantitative measures. They are calculated through Eqs. 5 and 6:

where n is the overall number of days in the ith year, j is the ordinal number of a particular day in the ith year, \(r_{ij}\) is the precipitation amount at the station on the jth day of the ith year, \(R_{i}\) is the total precipitation amount at the station in the ith year, and \(\theta_{j}\) is the angle of the jth day, obtained by dividing the interval [−π, π] equally according to the number of days in the ith year. PCD ranges from 0 to 1. PCD values that are closer to 1 indicate more concentrated annual precipitation. Conversely, PCD values that are closer to 0 represent more regular annual precipitation distribution (Table 3).
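The vector procedure can be sketched in Python as follows, assuming the commonly used form of the Zhang and Qian (2003) indices: each precipitation amount is resolved into components R_x = Σ r_j sin θ_j and R_y = Σ r_j cos θ_j, PCD is √(R_x² + R_y²) divided by the annual total, and PCP is the angle arctan(R_x / R_y). This form is assumed here rather than taken from the text. The function name is hypothetical, and the angle convention (0 to 360 degrees, starting at the first time step) is an assumption chosen to match the degree values reported later; the text itself mentions the interval [−π, π].

```python
import math


def pcd_pcp(precip):
    """PCD (0 to 1) and PCP (degrees) for one year of equally spaced totals.

    `precip` may hold daily or monthly amounts; only its length matters.
    """
    n = len(precip)
    total = sum(precip)
    if total == 0:
        raise ValueError("indices are undefined for a year with no precipitation")
    rx = 0.0
    ry = 0.0
    for j, amount in enumerate(precip):
        # Assumption: the first time step sits at 0 degrees, the year spans 360.
        theta = 2.0 * math.pi * j / n
        rx += amount * math.sin(theta)
        ry += amount * math.cos(theta)
    pcd = math.hypot(rx, ry) / total
    # atan2 resolves the quadrant; PCP is meaningless when PCD is close to zero.
    pcp = math.degrees(math.atan2(rx, ry)) % 360.0
    return pcd, pcp


# Toy monthly data: rain spread evenly, and rain falling only in the fourth month.
even_pcd, _ = pcd_pcp([10.0] * 12)
peak_pcd, peak_pcp = pcd_pcp([0.0, 0.0, 0.0, 120.0] + [0.0] * 8)
```

Evenly spread rainfall gives a PCD close to 0 because the vectors cancel, while rainfall confined to one time step gives a PCD of 1 and a PCP equal to that step's angle (90 degrees for the fourth of twelve months under the convention assumed here).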

Table 3 Classification of PCD values

4 Results and discussion

4.1 Analyzing PCI

Figure 4 shows PCI for the near future (2021–2040). The results indicate irregular precipitation distribution along the coastlines of the Gulf of Oman and the Persian Gulf. The maximum value of PCI (over 50%) was recorded for areas around the Gulf of Oman. The lowest PCI (less than 20%) was observed on the coastlines of the Caspian Sea. These coastlines also registered the highest number of rainy days across the country, which were regularly distributed across seasons. Thus, PCI is inversely related to both the number of rainy days and latitude. The largest PCI values were recorded in the southeastern parts of Iran. In other words, the southern part of the country experienced more precipitation irregularity. In contrast, northern, northwestern, and northeastern parts of Iran had relatively regular precipitation distribution. It appears that air masses and the presence of highlands play a significant role in the regularity of precipitation. Examining the association between longitude and PCI showed that the largest PCI values were observed in arid regions, which have little precipitation and low altitude. Our findings are similar to the ones reported by Darand and Pajouh (2022), who studied PCI using field data. Estimating PCI indicated that precipitation irregularity will increase during the period from 2021 through 2040. Maity and Maity (2022) also demonstrated that, based on the CMIP6 data, precipitation intensity will increase during the coming century. Zarrin and Dadashi-Roudbari (2022) further showed that, according to the CMIP6 data, future years will witness more intense precipitation across the country. Heightened precipitation intensity will lead to more flood-producing rains and consequently more floods. These previous studies are consistent with the results obtained in the current research.

Fig. 4 PCI index in near future (2021–2040)

4.2 Analyzing PCD and PCP

Figure 5 displays the CMIP6 data for PCD. Accordingly, the highest PCD coefficients (over 0.72) were recorded on the coastlines of the Persian Gulf, indicating that precipitation days occur within a short period of the year (mainly winter) and that rainfall is lacking in the rest of the year. The smallest PCD coefficients (less than 0.22) were recorded in the northern parts of Iran, including the coastlines of the Caspian Sea and northwestern Iran. This indicates that precipitation days are widely distributed throughout the year in these areas. Overall, the PCD coefficients recorded for the southern parts of Iran were greater than 0.59. Another large PCD coefficient was observed in the east of Iran around Hamun Lake. Compared to Darand and Pajouh's (2022) findings, PCD estimation in the current study showed that, during 2021–2040, precipitation irregularity will cover larger areas in the southern parts of Iran, with PCD coefficients exceeding 0.59 for most regions in this area.

Fig. 5 PCD index in near future (2021–2040)

Figure 6 provides information about PCP in the near future (2021–2040). Accordingly, temporal irregularity in precipitation is observed during this period. The features of the spatial distribution of PCP are totally different from those of PCD. Based on Fig. 6, the largest PCP values were registered in two areas of Iran. The maximum amount (over 2020 degrees) was observed on the coastlines of the Caspian Sea, and the value recorded for the coastlines of the Persian Gulf (200 degrees) came next. This indicates that the highest annual PCP values in these regions correspond to the period from June through September. The minimum PCP values were observed in northeastern and northwestern Iran. In other words, most precipitation in these regions will occur during winter. The lowest PCP coefficient (less than 60) was recorded for northeastern Iran. The PCP values recorded for the coastlines of the Caspian Sea and the Persian Gulf indicate the role of topography, distance from the sea, and differences in synoptic systems. Highlands have a significant role in PCP for the period from 2021 to 2040. Regular distribution of precipitation means that the likelihood of floods in mountainous areas declines because annual precipitation is appropriately distributed. This in turn mitigates the challenge of using water resources and thus has positive impacts on human activities. On the other hand, if the amount of precipitation declines considerably in places where major water resources are located, it will result in more economic damage. Moreover, irregular precipitation distribution will cause anomalies in the temporal distribution of precipitation and concentrate precipitation in a limited number of days, a phenomenon that leads to more severe floods. Compared to Darand and Pajouh's (2022) findings, estimating PCP in the present study showed that this index will rise on the coastlines of both the Caspian Sea and the Persian Gulf.

Fig. 6 PCP in near future (2021–2040)

5 Conclusion

PCI shows that, in the future, the coastlines of the Gulf of Oman and the Persian Gulf will experience irregular precipitation distributions. PCI estimation indicates precipitation irregularity in the southern parts of Iran, which is likely to rise in the future. The results of PCD also showed that precipitation irregularity will probably increase in the southern parts of Iran during the period from 2021 to 2040. In fact, most of the areas located in the southern part of the country recorded PCD coefficients greater than 0.59. Analyzing PCP also indicated higher values on the coastlines of the Caspian Sea from June to September in the near future. The smallest PCP coefficients were recorded in the northeast and northwest, suggesting that these areas will receive most of their precipitation in winter. Moreover, the regularity of the temporal distribution of precipitation is likely to rise in the northern, northwestern, and northeastern parts of Iran. This will increase sustainable water resources in these regions. Nonetheless, the overall evaluation of the obtained results suggests that Iran will experience irregular precipitation distribution in the near future (2021–2040), with this irregularity being more notable in the southern part of the country. This is attributed to higher precipitation concentration in the southern regions of Iran. Precipitation distribution is relatively more regular in the northern part of Iran, which is attributed to reduced precipitation concentration in this area. PCD will considerably increase in the northern part of Iran, which is due to more precipitation during summer in this area. These variations in precipitation distribution will have critical consequences for the country. For example, the risk of severe floods and the challenges of water management are likely to increase in the southern parts of Iran.

As for further research, it is suggested that hidden patterns in precipitation data be identified using reinforcement learning (RL). Because of its ability to learn flexible strategies in complicated environments, RL can be a powerful instrument for further studies on variations in complex time series like precipitation, and it may help estimate the consequences of such variations more accurately.