Introduction

Precipitation is one of the most critical parameters in the hydrological cycle. It is considered crucial for monitoring and predicting hydrometeorological extreme events such as drought (Golian et al. 2019) and flood (Maggioni and Massari 2018). The spatio-temporal variability of precipitation and its influence on the natural and human environment have attracted tremendous attention in climatic, meteorological, hydrological, and environmental studies (Dezfooli et al. 2018; Rivera et al. 2018). Precipitation has gained relevance because of its direct impact on freshwater availability and because of the consequences of its intensity, evidenced in phenomena such as storms, blizzards, floods, droughts, and landslides (Futrell et al. 2005; Qin and Lu 2014). It is therefore difficult to improve water cycle simulations without accurate precipitation data (Xue et al. 2013).

Accurate precipitation measurement or estimation is critical in various fields, including water resources management, weather forecasting, and hazard control (Hou et al. 2014; Wu et al. 2014). Large amounts of missing data at meteorological stations, the lack of up-to-date records of climate variables such as temperature and precipitation, and the poor spatial density of stations are among the limitations that researchers face in different parts of the world, especially in less developed countries and in mountainous and desert areas (Miri et al. 2016).

Rapid scientific advances have increased the need for new technologies in forecasting and in determining accurate patterns, especially in water resources control and management under global warming. In areas where precipitation is difficult to measure, alternatives to ground-based observations should therefore be considered (Prakash et al. 2010). Currently, the only practical approach to achieving a comprehensive estimate of global precipitation is the use of satellites, which provide precipitation data with high spatial and temporal resolution for many parts of the world (Hong et al. 2012). Precipitation products based on remote sensing techniques enable access to data with different temporal spans and spatial coverage in a fast and cost-efficient manner (Baseri et al. 2023).

In recent decades, several satellite-based precipitation products (SPPs) have been developed to provide continuous global precipitation data at fine temporal and spatial resolution (Ferraro 1997; Susskind et al. 1997). Generally, SPP algorithms retrieve precipitation information from microwave (MW) (Ferraro 1997) or infrared (IR) sensors (Palomino-Ángel et al. 2019; Susskind et al. 1997). Some of the more recent algorithms for SPPs, such as those used in Integrated Multi-Satellite Retrievals for Global Precipitation Measurement (IMERG) (Hou et al. 2014), Multi-Source Weighted-Ensemble Precipitation (MSWEP) (Beck et al. 2017), Tropical Rainfall Measuring Mission (TRMM) (Huffman et al. 2007), and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR) (Ashouri et al. 2015), can combine information from both MW and IR sensors to provide precipitation data (Anjum et al. 2018).

MSWEP is an innovative product that integrates gauge, satellite, and reanalysis data to generate a 3-hourly, 0.1° global gridded precipitation dataset spanning from 1979 to the present (Beck et al. 2017). This dataset's long-term mean relies on the Climate Hazards Group Precipitation Climatology (CHPclim), a global precipitation climatology that uses both gauge observations and satellite data (Funk et al. 2015). Subsequently, the CHPclim data were replaced with more accurate and accessible regional datasets. The temporal variability of MSWEP is calculated through a weighted average of precipitation anomalies from seven datasets. Two of these are based on gauge observation interpolation (CPC Unified and GPCC), three on satellite remote sensing (CMORPH, GSMaP-MVK, and 3B42RT), and two on atmospheric model reanalysis (ERA-Interim and JRA-55).

The TRMM Multi-satellite Precipitation Analysis (TMPA) is an algorithm for analyzing precipitation based on data from multiple satellites, which generates two main products known as 'TRMM-3B42' and 'TRMM-3B43' (Huffman et al. 2007). The TRMM satellite was launched on November 27, 1997, and its measurement system includes a precipitation radar, a multi-frequency microwave radiometer, and an infrared radiometer (Kummerow et al. 1998). The products of this algorithm cover the latitudes 50° S to 50° N, with a spatial resolution of 0.25° × 0.25°. One of the algorithms used in this study is the seventh version of the 3B43 algorithm, which utilizes input data from two types of satellite sensors, microwave and infrared (IR). This algorithm aims to minimize estimation errors from the TRMM satellite and improve the monthly precipitation time series (Huffman et al. 2010).

The Climate Data Record (CDR) program of NOAA's National Climatic Data Center has developed a satellite-based precipitation dataset known as PERSIANN-CDR, designed for long-term research (Ashouri et al. 2015). PERSIANN-CDR is a high-resolution precipitation product derived from multiple satellites, providing daily estimates at a spatial resolution of 0.25° × 0.25° from 1983 to the present, covering the globe from 60° N to 60° S. The retrieval algorithm utilizes infrared data from global geosynchronous satellites as the primary source of precipitation information. To calibrate PERSIANN, the model is initially trained using the National Centers for Environmental Prediction stage IV hourly precipitation data. The model's parameters are then held constant, and the model is applied to the complete historical record of GridSat-B1 IR data (Knapp 2008). To mitigate biases in the estimated precipitation, the estimates are adjusted using the GPCP monthly 2.5° precipitation product (Ashouri et al. 2015).

The Global Precipitation Measurement (GPM) mission, a project led by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency, has begun to provide the next generation of rainfall products. These products have a temporal resolution of 30 min and a spatial resolution of 0.1° (Hou et al. 2014). The GPM consists of one Core Observatory satellite and a constellation of approximately 10 satellites. The GPM Core Observatory is equipped with a dual-frequency rainfall radar (the Ku band at 13.6 GHz and the Ka band at 35.5 GHz) and a multichannel GPM microwave imager. These sensors provide more comprehensive data than the TRMM instruments. Since March 2014, the GPM mission has been providing Level-3 IMERG Final Run (research) products, which include more sensors than TRMM, such as the Advanced Microwave Scanning Radiometer 2. The IMERG datasets are expected to be extended to a longer period with improved GPCC Reanalysis data by incorporating other sources (including China) (Huffman et al. 2015a, 2015b).

In the present era, the utilization of satellites has become essential, particularly in scientific investigations in fields such as meteorology, climatology, and hydrology. The advanced capabilities of SPPs have allowed numerous researchers to assess and incorporate satellite-based precipitation data into hydro-climate studies in diverse global regions (Chen et al. 2020; Eini et al. 2022; Guo et al. 2022; Kofidou et al. 2023; Kumar et al. 2021; Le et al. 2020; Li et al. 2023, 2020; Muñoz de la Torre et al. 2024; Rahman and Shang 2020; Yang et al. 2024; Zhang et al. 2023). Undoubtedly, it is crucial to acknowledge the potential errors in satellite rainfall data, as their performance can vary depending on the region or season (Hosseini-Moghari and Tang 2020). Therefore, it is imperative to thoroughly evaluate the accuracy of these products in comparison with observational data before their utilization.

Wu and Zhao (2022) found that PERSIANN-CDR-CSS, IMERG, MSWEP, GSMAP, CHIRPS, and ERA5-Land displayed superior performance during the summer and were more accurate in the eastern region of Mainland China compared to other regions and seasons. Meanwhile, Salih et al. (2023) observed the weakest performance over the Tensift basin for PERSIANN-CDR, IMERG, MSWEP, and ERA5 during the summer. Additionally, Aryal et al. (2023) noted that IMERG outperformed MSWEP in Myagdi Khola in the Kali Gandaki Basin, Nepal. However, MSWEP showed the highest accuracy in the Qaidam Basin, Northwestern China (Qi and Lv 2021). Yang et al. (2020) also reported that TRMM 3B42V7 and TMPA-RT did not perform as well as IMERG-V3 in the Shuaishui River Basin in East-Central China. GSMAP-V7 had higher accuracy than IMERG-V5 in Myanmar, as reported by Yuan et al. (2019). In contrast, Nepal et al. (2021) found that IMERG-V6 outperformed GSMaP-V7, with a higher correlation coefficient and smaller bias, in the mountainous country of Nepal. In a study conducted in Northwestern China, Anjum et al. (2019) found no discernible difference in estimating precipitation between IMERG-V6 and IMERG-V5.

Multiple studies have previously assessed SPPs in Iran (Alijanian et al. 2022; Araghi et al. 2021; Darand et al. 2017; Dehaghani et al. 2023; Dezfooli et al. 2018; Goodarzi et al. 2022; Hosseini-Moghari et al. 2018; Hosseini-Moghari and Tang 2020; Keikhosravi-Kiany et al. 2023; Khalili and Rahimi 2014; Moazami et al. 2016). Shirmohammadi-Aliakbarkhani and Akbari (2020) conducted a study to evaluate the accuracy of TRMM-3B42V7 and GPM daily, monthly, and seasonal precipitation products over Khorasan Razavi, Iran, from 2008 to 2012. The findings indicated that all SPPs exhibited strong and moderate correlations with measurements from rain-gauge data on the monthly and daily timescale, respectively. Moreover, it was observed that all SPPs correlated well with measurements from gauges in spring, summer, and winter, but TRMM outperformed GPM in the fall. GPM slightly overestimated the data in all seasons except for fall, whereas TRMM tended to underestimate the values in all seasons except spring. In a recent study, the performance of IMERG, TRMM-3B43, CHIRPS, and ERA5 was compared to identify meteorological droughts in Iran from 2001 to 2019 based on Root Mean Square Error (RMSE) and correlation coefficient (CC) (Keikhosravi-Kiany et al. 2022). The results showed that IMERG and TRMM-3B43 performed better in detecting drought events at both short and long timescales, with higher CC and lower RMSE, while CHIRPS demonstrated the least accuracy. Spatially, all products performed best in identifying drought events over western and southwestern regions. Additionally, Mahdavi (2023) noted that TRMM-3B42V7 was underestimated and IMERG-V6 was overestimated in the east of Lake Urmia, Iran. However, TRMM-3B42V7 outperformed IMERG-V6 in the study area in terms of statistical indices. More recently, Shabankareh et al. (2024) evaluated the IMERG in Fars province, Iran, using daily rain gauges as reference data. 
Their study aimed to identify relationships between the uncertainties in the product and various factors such as elevation, temperature, and rainfall intensity. The study found that IMERG tended to overestimate light rainfall and underestimate heavy rainfall. The product was less biased in areas with milder temperatures and showed a higher correlation in mid-elevation areas. Additionally, the study observed a positive bias in low-elevation areas and a negative bias in high-elevation areas.

Many researchers have investigated the accuracy of SPPs based on changes in ground station altitude, as precipitation value is affected by elevation changes. At the highest elevations, there was a better relationship between IMERG-FR and the rain gauge network due to orographic rain. However, the bias and error increased with elevation, indicating a slight overestimation at higher altitudes (Muñoz de la Torre et al. 2024). An analysis of the performance of PERSIANN, SM2Rain, TRMM, and IMERG over the Hindu Kush Mountains of Pakistan, South Asia, showed that the RMSE values of all SPPs decreased with increasing elevation. With the increase in elevation, the values of CC and rBIAS of SM2Rain and IMERG-V6 decreased, whereas the values of PERSIANN and TRMM increased with an increase in elevation (Hamza et al. 2020). Mondal et al. (2018) also assessed the effect of elevation on the performance of SPPs, with their results illustrating that TMPA and PERSIANN underestimated (> 20%) the precipitation in basins located in relatively high elevations in northern India.

Previous studies have rarely evaluated the MSWEP dataset in Iran. Alijanian et al. (2019) utilized the SPI drought index to assess the performance of the PERSIANN-CDR and MSWEP datasets. Their findings showed that both datasets closely matched the temporal and spatial patterns observed in the SPI data. Despite the widespread use of TRMM, the 3B43 product has not been extensively studied. Additionally, most research has focused on only a small number of SPPs at a local scale and for a short duration.

In this study, we will investigate the validity of SPPs over Iran, the potential improvements in the accuracy of the IMERG compared to the TRMM-3B43V7, and the performance of SPPs in various conditions across Iran. To achieve this, we will evaluate MSWEP, IMERG-FRV6, TRMM-3B43V7, and PERSIANN-CDR over the period 2008–2019 in Iran, considering a range of precipitation products and conducting analysis on monthly, seasonal, and annual scales, as well as at different elevations. The results of this study may provide valuable insights for decision-makers involved in hydrological issues and for drought monitoring, and demonstrate the potential usefulness of satellite-based measurements for these purposes.

Methodology

Study area

With an area of 1,648,195 square kilometers, Iran is located in southwest Asia between latitudes 25° and 40° N and longitudes 43.5° and 63.5° E. Figure 1 illustrates the topographic features of the Iranian plateau.

Fig. 1 Study area and location of synoptic stations

The seas to the north and south, which are far from the central areas, and the mountains surrounding the country give Iran a wide variety of climates. The presence of the Alborz Mountains in the north and the Zagros Mountains in the west has caused the inner regions of Iran to have an arid and semi-arid climate with widespread deserts. The annual average rainfall in Iran is approximately 250 mm, ranging from 50 mm in desert regions to 1600 mm along the coastal strip of the Caspian Sea. The uneven distribution of rainfall over the country is evident: 60% of the area receives less than 250 mm of rainfall, while only 4% of the country receives more than 600 mm.

Datasets

The research utilized a dataset gathered from 81 synoptic stations spanning from 2008 to 2019. These data served as the basis for evaluating various precipitation estimation products for monthly precipitation across different regions in Iran. Each station's data included geographic coordinates (latitude and longitude), elevation, and precipitation metrics. The careful selection of synoptic stations allowed for a comprehensive assessment of satellite performance across all of Iran's latitudes (Fig. 1). Stations with complete data records from 2008 to 2019 were chosen from Iran's available synoptic stations. Furthermore, the spatial distribution of stations across the country's provinces was considered to ensure comprehensive coverage. This study selected four satellite-based precipitation products (SPPs), namely IMERG-FRV6, MSWEP, TRMM-3B43V7, and PERSIANN-CDR, to evaluate their performance against in situ gauge data. The specifications of the four SPPs selected for this research are presented in Table 1.

Table 1 Selected satellite-based precipitation product specifications

Data Preparation

To ensure the consistency of Satellite-based Precipitation Products (SPPs) with ground-based precipitation measurements, the gridded precipitation data from the SPPs was interpolated to the station scale for point-to-point comparison. In this procedure, the coordinates of the center of each pixel in the satellite product are first obtained. Subsequently, the four surrounding pixels for each observation station are identified, and the precipitation values reported at the centers of these four pixels are extracted. The next step involves determining the equations of the lines on the sides of the square formed by the four centers. The slope of all the lines is obtained from the difference in the amount of precipitation between the centers of the pixels located at the beginning and end of the lines. Precipitation is found to be a function of longitude in lines 1 and 3, while it is a function of latitude in lines 2 and 4. By inputting the coordinates of the observation station into these equations, the precipitation values for the station are calculated. Averaging these four calculated values yields the corresponding precipitation value of the satellite product for the designated observation station (Fig. 2).

Fig. 2 Schematic of the precipitation interpolation method of satellite products in the position of synoptic stations
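The interpolation procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function name, the argument names, and the assignment of lines 1 and 3 to the southern and northern sides of the square (and lines 2 and 4 to the western and eastern sides) are assumptions made for the example.

```python
def edge_average_interpolation(lon, lat, lon_w, lon_e, lat_s, lat_n,
                               p_sw, p_se, p_nw, p_ne):
    """Estimate precipitation at a station from the four surrounding pixel centres.

    (lon, lat) is the station position; lon_w/lon_e and lat_s/lat_n are the
    coordinates of the pixel centres; p_sw, p_se, p_nw, p_ne are the
    precipitation values at the four centres (hypothetical names).
    """
    # Fractional position of the station inside the square (0..1 on each axis)
    tx = (lon - lon_w) / (lon_e - lon_w)
    ty = (lat - lat_s) / (lat_n - lat_s)

    # Lines 1 and 3 (assumed south and north sides): precipitation vs. longitude
    p_south = p_sw + (p_se - p_sw) * tx
    p_north = p_nw + (p_ne - p_nw) * tx

    # Lines 2 and 4 (assumed west and east sides): precipitation vs. latitude
    p_west = p_sw + (p_nw - p_sw) * ty
    p_east = p_se + (p_ne - p_se) * ty

    # The station value is the mean of the four edge estimates
    return (p_south + p_north + p_west + p_east) / 4.0


# Example: a station at the centre of a 0.25-degree cell
value = edge_average_interpolation(50.125, 30.125, 50.0, 50.25, 30.0, 30.25,
                                   10.0, 20.0, 30.0, 40.0)
```

Note that this scheme averages four one-dimensional interpolations and is therefore not identical to standard bilinear interpolation, although the two coincide at the centre of the cell.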

Temporal Evaluation

The temporal assessment involved analyzing the performance of IMERG-FRV6, MSWEP, TRMM-3B43V7, and PERSIANN-CDR at monthly, seasonal, and annual intervals. Monthly data were utilized to generate seasonal and annual precipitation data for evaluating the efficacy of precipitation products. The average annual rainfall was determined by summing the monthly rainfall for each year and then calculating the long-term annual rainfall for each station by averaging the annual rainfall over the statistical period. Subsequently, annual precipitation zoning maps were generated for both the ground station and satellite products using the Kriging interpolation method.

To investigate the seasonal sensitivity and different performance of satellite-based precipitation products (SPPs) in different seasons, rainfall data were also evaluated seasonally. In order to calculate the average seasonal precipitation, the total monthly precipitation for each season in every year was determined, and then the average seasonal precipitation was calculated for each season during the entire study period (consisting of four seasonal values at each station from June 2008 to December 2019). To assess the satellite-based precipitation at a seasonal level, Box-and-Whisker plots depicting seasonal precipitation for both gauge and SPPs were generated. Additionally, spider charts were utilized to demonstrate the accuracy of satellite products based on average statistical metrics.
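As a rough illustration of the aggregation steps described above, the following Python sketch derives long-term annual and seasonal means from a monthly series for a single station. The month-to-season mapping and the function names are assumptions for the example; in particular, December is grouped with January and February of the same calendar year for simplicity, which may differ from the season definition used in the study.

```python
from collections import defaultdict

# Assumed month-to-season mapping (not specified in the text)
SEASONS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}


def long_term_annual_mean(monthly):
    """monthly: dict {(year, month): precipitation in mm} for one station."""
    totals = defaultdict(float)
    for (year, _month), value in monthly.items():
        totals[year] += value          # annual total = sum of monthly values
    return sum(totals.values()) / len(totals)   # mean over all years


def long_term_seasonal_means(monthly):
    """Return {season: mean seasonal total over all years}."""
    totals = {name: defaultdict(float) for name in SEASONS}
    for (year, month), value in monthly.items():
        for name, months in SEASONS.items():
            if month in months:
                totals[name][year] += value     # seasonal total per year
    return {name: sum(by_year.values()) / len(by_year)
            for name, by_year in totals.items() if by_year}


# Example with two synthetic years of constant monthly rainfall
series = {(2008, m): 10.0 for m in range(1, 13)}
series.update({(2009, m): 20.0 for m in range(1, 13)})
annual = long_term_annual_mean(series)
seasonal = long_term_seasonal_means(series)
```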

A Taylor diagram was utilized to compare the agreement between SPPs and the data from selected synoptic stations in the study area (Iran) in terms of CC, standard deviation (SD), and RMSE. Furthermore, a violin plot was used to illustrate the distribution and probability density of multiple groups of data, incorporating the features of both a box plot and a density plot. Additionally, spatial distributions of the synoptic stations and a confusion matrix were employed to better comprehend the results of the statistical indices and the accuracy of the SPPs in various ranges of monthly precipitation observations. The confusion matrix summarizes the capability of the SPPs to reproduce a given observed rainfall range. The x-axis (columns) represents the observed rainfall, and the y-axis (rows) indicates the estimated values. The first column gives information about the distribution of the SPP estimates when observed precipitation values were recorded between 0 and 1 mm/month. An ideal fit in this column (the highest detection capacity of the SPPs) would be achieved if the lowest cell showed a value of 1. In addition, if all counter-diagonal elements of the matrix were equal to one, the match would be considered perfect. Scatter plots were also utilized to demonstrate the impact of elevation on the performance of satellite products.
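The confusion matrix described above can be sketched with NumPy as follows. This is an illustrative reconstruction, not the authors' code: the 0–1, 5–10, 20–50, and 50–100 mm/month classes are named in the text, whereas the remaining class edges (1–5, 10–20, and above 100 mm/month) and all identifiers are assumptions.

```python
import numpy as np

# Class edges in mm/month; edges not named in the text are assumed
EDGES = np.array([0.0, 1.0, 5.0, 10.0, 20.0, 50.0, 100.0, np.inf])


def rainfall_confusion_matrix(observed, estimated, edges=EDGES):
    """Column-normalised matrix: rows = estimated class, columns = observed class."""
    # Interior edges split the values into len(edges) - 1 classes (0-based index)
    obs_class = np.digitize(observed, edges[1:-1])
    est_class = np.digitize(estimated, edges[1:-1])

    n = len(edges) - 1
    counts = np.zeros((n, n))
    np.add.at(counts, (est_class, obs_class), 1)   # count each (estimate, observation) pair

    # Divide each column by its total so that every populated column sums to 1
    col_totals = counts.sum(axis=0, keepdims=True)
    return np.divide(counts, col_totals,
                     out=np.zeros_like(counts), where=col_totals > 0)


# Example with four synthetic station-months
matrix = rainfall_confusion_matrix([0.5, 3.0, 7.0, 150.0], [0.2, 8.0, 6.0, 120.0])
```

In this array layout, a perfect match appears on the main diagonal; when the matrix is drawn with the lowest class at the bottom of the y-axis, the same cells form the counter-diagonal referred to in the text.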

Calculation of Statistical Indices

The study assessed the performance of SPPs in capturing the spatial variability of precipitation across Iran through the calculation of various statistical metrics including Correlation Coefficient (CC), Kling-Gupta Efficiency (KGE), Root Mean Square Error (RMSE), and Bias (Table 2). CC was employed to determine the linear relationship between SPPs and observed precipitation. KGE, on the other hand, is an objective measure that evaluates error in terms of correlation, variability, and Bias (G. Nascimento et al. 2021). RMSE was used to quantify the average error magnitude between SPPs and gauge-based precipitation. Lastly, Bias, defined as the average difference between satellite estimates and rain gauge observations, was utilized to identify overestimation and underestimation of rainfall amount (Moazami et al. 2016).

Table 2 Statistical indices used to assess the performance of SPPs
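For reference, the four indices can be computed as in the short Python sketch below. The function and variable names are hypothetical, and the KGE is written in its commonly used original form (correlation, variability ratio, and mean ratio), which is assumed here to match the definition in Table 2.

```python
import numpy as np


def evaluate(sim, obs):
    """Return CC, KGE, RMSE, and Bias of satellite estimates (sim) against gauges (obs)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)

    cc = np.corrcoef(sim, obs)[0, 1]              # linear correlation coefficient
    rmse = np.sqrt(np.mean((sim - obs) ** 2))     # average error magnitude
    bias = np.mean(sim - obs)                     # mean difference, same unit as the data

    # KGE components (assumed formulation): variability ratio and mean ratio
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    kge = 1.0 - np.sqrt((cc - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

    return {"CC": cc, "KGE": kge, "RMSE": rmse, "Bias": bias}


# Example: a product that overestimates every month by 5 mm
scores = evaluate([15.0, 25.0, 35.0, 45.0], [10.0, 20.0, 30.0, 40.0])
```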

A satellite product can be considered a reliable substitute for gauge-based data if the estimated values of CC, KGE, RMSE, and Bias equal 1, 1, 0, and 0, respectively. According to the criteria established by Brown (2006), Condom et al. (2011), and Anjum et al. (2018) for satisfactory performance of satellite-based precipitation products, the correlation coefficient (CC) should exceed 0.7, and Bias should fall within the range of −5 to +5 mm/month. Figure 3 presents a schematic diagram of the data and the overall evaluation process of this study.

Fig. 3 General framework for data preparation and performance evaluation used in this study

Results and Discussion

Spatiotemporal Distribution of SPPs

The spatial distribution of the mean annual precipitation based on observed (synoptic stations) and satellite precipitation products is shown in Fig. 4. In most cases, there is a noticeable large spatial variation in the amount of precipitation, which is likely due to the complex topography (Hamza et al. 2020). Additionally, there is a gradual decrease in precipitation from north to south with a similar spatial pattern. Generally, it is evident that the central, eastern, and southeastern regions of Iran receive the least amount of precipitation compared to other parts. Moreover, the highest precipitation occurs on the southern shores of the Caspian Sea, followed by the western regions of the country (Fig. 4a). The IMERG-FRV6 product significantly overestimates precipitation at the country level (Fig. 4b). Visual comparison indicates that the best agreement in precipitation pattern with synoptic station measurements is associated with the MSWEP product (Fig. 4c). TRMM-3B43V7 performs better in areas with low to moderate precipitation (Fig. 4e). Conversely, PERSIANN-CDR appears to be less sensitive to precipitation fluctuations in different regions (Fig. 4d).

Fig. 4 Average annual rainfall observed during the study period for gauges and all SPPs. a Synoptic, b IMERG-FRV6, c MSWEP, d PERSIANN-CDR, and e TRMM-3B43V7

Performance of SPPs at Seasonal Scale

Several meteorological and climatic factors contribute to the observed seasonal sensitivity. For instance, different types of precipitation, such as convective storms in spring and more stratiform precipitation in winter, can impact the performance of SPPs. Additionally, seasonal variations in atmospheric moisture content, the transition from snowfall to rainfall, topographical effects, and changes in vegetation and land surface conditions can also influence the accuracy of SPPs. Fig. 5 shows the box-and-whisker plots of the SPPs and observed seasonal precipitation.

Fig. 5 Box-and-Whisker plots of seasonal precipitation related to the gauge and SPPs. a Spring, b Summer, c Autumn, and d Winter

In all seasons, the minimum precipitation in MSWEP is the most similar to the observed data. Additionally, in spring (Fig. 5a) and autumn (Fig. 5c), the maximum rainfall in MSWEP is the closest to the observational data, whereas in summer (Fig. 5b) and winter (Fig. 5d), the maximum rainfall in IMERG corresponds best with the observed data. In summer, autumn, and winter, the maximum precipitation in PERSIANN-CDR deviates most from the ground stations, and the confidence interval of MSWEP agrees best with the observed data, while in spring, summer, and autumn, the length of the PERSIANN-CDR confidence interval conforms least to the gauge data. For IMERG, the highest conformity with ground data is seen in spring and the lowest in winter. In all seasons, TRMM ranks second in terms of confidence interval agreement.

The spider chart illustrates the average performance indices of the different SPPs at a seasonal scale (Fig. 6). In all seasons, MSWEP has the highest and PERSIANN-CDR the lowest correlation with the observed data (Fig. 6a). It is noteworthy that the lowest correlation values for MSWEP, TRMM-3B43V7, and IMERG-FRV6 occur in spring. The average correlation of IMERG-FRV6 is the most sensitive to seasonal variation. For PERSIANN-CDR, the average KGE is significantly lower than that of the other products in all seasons, and it is also highly sensitive to seasonal changes. In contrast, MSWEP has a more desirable and steadier efficiency in all seasons. Moreover, MSWEP, TRMM-3B43V7, and IMERG-FRV6 have their highest KGE (Fig. 6b) in spring. In all seasons, MSWEP and PERSIANN-CDR have the lowest and the highest error (Fig. 6c) compared to the observational data, respectively. It should be noted that the error of the SPPs increased substantially in autumn, when the most rainfall occurs. The error of the products is thus highly sensitive to seasonal variations. Furthermore, the smallest Bias (Fig. 6d) for MSWEP, TRMM-3B43V7, and PERSIANN-CDR occurs in summer. Overall, TRMM-3B43V7, MSWEP, and PERSIANN-CDR underestimate the precipitation in summer and autumn. In contrast, the maximum Bias of IMERG-FRV6, TRMM-3B43V7, and PERSIANN-CDR occurs in winter. In general, MSWEP is the least biased product among all SPPs.

Fig. 6 Spider charts of the performance metrics for SPPs in different seasons. a CC, b KGE, c RMSE, and d Bias

Performance of SPPs at Monthly Scale

Figure 7 (Taylor diagram) shows the performance of the four satellite products (IMERG-FRV6, MSWEP, PERSIANN-CDR, and TRMM-3B43V7) on a monthly scale. The correlations between the monthly observed precipitation and the estimates of MSWEP and TRMM-3B43V7 are greater than 0.75, whereas the CC values for IMERG-FRV6 and PERSIANN-CDR are less than 0.6. Likewise, the RMSE values for MSWEP and TRMM-3B43V7 are lower than or equal to 30 mm/month, whereas those for PERSIANN-CDR and IMERG-FRV6 are higher than or equal to 40 mm/month. The standard deviations of IMERG-FRV6, MSWEP, and TRMM-3B43V7 differ from that of the observations by less than 25%. Among them, IMERG-FRV6 has the distribution closest to the observed data, with a standard deviation difference of around 8%. Overall, MSWEP is the SPP closest to the observed point, i.e., it outperforms the other SPPs at the monthly time scale.

Fig. 7 Taylor diagram displaying the performance of monthly precipitation estimates from MSWEP, TRMM-3B43V7, IMERG-FRV6, and PERSIANN-CDR. Root mean square difference (RMSD) values are denoted by the semi-circular green lines. The values of CC are represented by the straight (blue) lines
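The three quantities shown on a Taylor diagram are linked geometrically: the centred RMSD, the two standard deviations, and the correlation satisfy the law of cosines, which is why a single point can encode all of them. The Python sketch below computes these statistics for one product; it is an illustrative example with hypothetical names, and it assumes the diagram uses the centred (mean-removed) RMSD, as is standard for Taylor diagrams.

```python
import numpy as np


def taylor_statistics(sim, obs):
    """Return (SD of sim, SD of obs, CC, centred RMSD) for one product."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)

    sd_sim = np.std(sim)                 # population standard deviation
    sd_obs = np.std(obs)
    cc = np.corrcoef(sim, obs)[0, 1]

    # Centred RMSD: remove each series' mean before differencing
    crmsd = np.sqrt(np.mean(((sim - sim.mean()) - (obs - obs.mean())) ** 2))
    return sd_sim, sd_obs, cc, crmsd


# Example with synthetic monthly values
sd_s, sd_o, r, e = taylor_statistics([12.0, 18.0, 40.0, 33.0, 5.0],
                                     [10.0, 20.0, 35.0, 30.0, 8.0])
# Law of cosines behind the diagram: e**2 == sd_s**2 + sd_o**2 - 2*sd_s*sd_o*r
```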

Figure 8 shows scatterplots and statistical metrics for the SPPs versus observations. Each point represents the value of monthly rainfall during the statistical period in Iran. In all SPPs, agreements are more accurate at low rainfall ranges. However, there are quite large variations in the form of underestimation at high rainfall ranges, especially more than 200 mm. This underestimation is more evident in PERSIANN-CDR (Fig. 8c). Additionally, IMERG-FRV6 has overestimated the amount of precipitation in a large number of records (Fig. 8a). By comparing the trendline with y = x line, the MSWEP performance is significantly better than other SPPs (Fig. 8b).

Fig. 8 Scatter plots of monthly precipitation for all SPPs. The trendline and the y = x line are indicated by the purple and dotted lines, respectively. a IMERG-FRV6, b MSWEP, c PERSIANN-CDR, and d TRMM-3B43V7

Figure 9 shows the effect of elevation on the performance of all the satellite products used. The RMSE values of all SPPs clearly decrease, while the Bias values rise, with increasing elevation. The values of CC for IMERG-FRV6, MSWEP, and TRMM-3B43V7 rise with increasing elevation, whereas PERSIANN-CDR shows the inverse trend.

Fig. 9 Scatter plots for the statistical indicators on monthly time scales versus elevation. The purple lines and the gray lines represent the linear regression fitting lines and the threshold lines of the indices, respectively
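The regression lines in such a plot can be obtained with an ordinary least-squares fit of each index against station elevation. The following Python sketch, with hypothetical names and synthetic values, shows one way to do this; the sign of the fitted slope indicates whether the index rises or falls with elevation.

```python
import numpy as np


def elevation_trend(elevation_m, index_values):
    """Least-squares line index = slope * elevation + intercept."""
    slope, intercept = np.polyfit(np.asarray(elevation_m, dtype=float),
                                  np.asarray(index_values, dtype=float), 1)
    return slope, intercept


# Synthetic example: station RMSE (mm/month) decreasing with elevation (m)
slope, intercept = elevation_trend([0.0, 1000.0, 2000.0], [30.0, 20.0, 10.0])
```

A negative slope, as in this synthetic case, corresponds to an index that decreases with elevation.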

The violin plot in Fig. 10 shows the variation in the values of CC, KGE, RMSE, and Bias for the selected SPPs. The percentage of stations in each category of statistical indices is shown in Online Resource 1 (Table S1). The distribution of the correlation between the products and the observational data (Fig. 10a) has a negative skewness; thus, most stations have a correlation higher than the average (CC ≥ 0.8). In addition, the lowest correlation is seen in PERSIANN-CDR. MSWEP has the smallest confidence interval and the most stations (87.66%) with CC over 0.8. The average Bias values of all SPPs are positive (Fig. 10d). The distribution of Bias in IMERG-FRV6 has a more negative skewness than that of the other products; consequently, it overestimates precipitation at a large number of stations. The highest underestimation and overestimation are seen in PERSIANN-CDR and IMERG-FRV6, respectively.

Fig. 10 Violin plot of statistical indicators for all SPPs at the monthly scale. a CC, b KGE, c RMSE, and d Bias

MSWEP has the smallest confidence interval, with the most stations (85%) in the desired Bias range. From this point of view, PERSIANN-CDR clearly has the poorest performance. The lowest efficiency (KGE) can be seen in TRMM-3B43V7 (Fig. 10b). While the KGE distribution of MSWEP conforms closely to the normal distribution, those of the other products are skewed to the left. In all SPPs, most stations fall in the optimal efficiency range. The only product that shows acceptable efficiency at over 90% of the stations is MSWEP. In all products, the RMSE distribution has a positive skewness, which indicates that a greater portion of stations have a small error (Fig. 10c). The highest error is observed in PERSIANN-CDR, while MSWEP has the narrowest RMSE confidence interval, concentrated near zero. In other words, MSWEP has the highest number of stations (more than 80%) in the acceptable RMSE range.

The spatial distribution of performance indices of satellite-based precipitation products on a monthly scale, which is shown in Fig. 11, has implications for water supply planning, agricultural management, flood risk assessment, urban water management, and ecosystem conservation. Inaccuracies in different regions can impact decision-making and infrastructure planning for these sectors (Dube et al. 2023; Kikon and Deka 2022; Omonge et al. 2022; Sheffield et al. 2018). The results indicate a high correlation between the MSWEP and observed data in most regions across Iran. Moreover, the best correlation for all SPPs can be seen in the northwest, northeast, and Zagros highlands. Also, the correlation between PERSIANN-CDR and observation data is clearly lower than other products in the whole area of Iran. However, the value of RMSE in the southern shores of the Caspian Sea and the Zagros Mountains was higher than in the rest of the regions.

Fig. 11

Spatial distributions of statistical indicators in evaluating SPPs on a monthly scale

The RMSE of MSWEP is less than 20 mm/month at more than 80% of the stations, while more than 69%, 65%, and 58% of the stations fall in this error range (less than 20 mm/month) for TRMM-3B43V7, IMERG-FRV6, and PERSIANN-CDR, respectively. PERSIANN-CDR has the lowest KGE, which indicates the weakest efficiency, especially in the northern areas and the Zagros highlands. Overall, about 88%, 72%, 58%, and 38% of the stations have a KGE greater than 0.5 for MSWEP, IMERG-FRV6, TRMM-3B43V7, and PERSIANN-CDR, respectively. For all SPPs, more stations show overestimation than underestimation of precipitation; however, the magnitude of the overestimation is significantly smaller than that of the underestimation. The Bias of the SPPs decreases from the west and northwest of the country towards the east and southeast. Therefore, it can be concluded that the performance of these products in high rainfall areas is always accompanied by higher levels of Bias. The high correlation and lower RMSE of MSWEP in most regions of Iran suggest that it is reliable for water resource management decisions. However, the weaker performance of PERSIANN-CDR and the higher Bias in high rainfall areas may affect water allocation, reservoir management, agricultural planning, flood risk assessment, and urban water management in those regions.
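Percentages of this kind (stations with RMSE below 20 mm/month or KGE above 0.5) can be obtained by counting the stations that satisfy a threshold. The sketch below illustrates the arithmetic only; the function name `share_of_stations` is hypothetical and the station values are made-up numbers, not results from this study.

```python
import numpy as np

def share_of_stations(values, threshold, above=True):
    """Percentage of stations whose indicator is above (or below) a threshold."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # ignore stations without a value
    hits = values > threshold if above else values < threshold
    return 100.0 * hits.sum() / hits.size

# Made-up per-station indicators for one product (illustration only).
kge_by_station = [0.72, 0.41, 0.55, 0.63, 0.18]
rmse_by_station = [12.0, 25.5, 18.3, 9.7, 19.9]  # mm/month

kge_share = share_of_stations(kge_by_station, 0.5, above=True)
rmse_share = share_of_stations(rmse_by_station, 20.0, above=False)
```

With these illustrative inputs, three of the five stations exceed a KGE of 0.5 and four of the five have an RMSE below 20 mm/month.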

The confusion matrices are shown in Fig. 12. The detection capacity of the products clearly improves as precipitation increases beyond 20 mm/month. With the exception of PERSIANN-CDR (Fig. 12c), whose detection capacity decreases significantly for precipitation above 100 mm/month, this trend is especially evident in the fifth (20–50 mm/month) and sixth (50–100 mm/month) classes. MSWEP shows a significantly higher detection capacity than the other products for the first rainfall class (0–1 mm/month). The highest detection capacity of PERSIANN-CDR (48%) is in the sixth class. The highest and lowest detection capacities of IMERG-FRV6 (Fig. 12a) are in the fifth (20–50 mm/month, 53%) and third (5–10 mm/month, 19%) classes, respectively. It should be noted that the worst detection capacity of MSWEP (Fig. 12b), TRMM-3B43V7 (Fig. 12d), and IMERG-FRV6 is for precipitation amounts between 5 and 10 mm/month. The total percentage of detection up to the maximum amount of precipitation in each class can indicate underestimation or overestimation of precipitation by each product. Accordingly, MSWEP underestimates in all classes, while IMERG-FRV6 and PERSIANN-CDR overestimate rainfall below 20 mm/month. In addition, TRMM-3B43V7 underestimates very low rainfall (0–1 mm/month) as well as rainfall greater than 20 mm/month. Considering the detection capacity across all precipitation classes, MSWEP performs best, followed by IMERG-FRV6, PERSIANN-CDR, and TRMM-3B43V7, in that order.
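A class-based confusion matrix of this kind can be sketched as follows. The class limits of 0–1, 5–10, 20–50, 50–100, and above 100 mm/month are taken from the text; the 1–5 and 10–20 mm/month classes are inferred to fill the gaps, and the row-wise normalization to percent is an assumption, so the construction used for Fig. 12 may differ. The function name `class_confusion_matrix` is hypothetical.

```python
import numpy as np

# Class edges in mm/month. The 1-5 and 10-20 classes are inferred, not stated.
EDGES = [1, 5, 10, 20, 50, 100]

def class_confusion_matrix(obs, sim, edges=EDGES):
    """Row-normalized (percent) matrix of observed class versus estimated class."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    n_classes = len(edges) + 1

    # Class index 0 is below the first edge; the last index is at or above the last edge.
    obs_cls = np.digitize(obs, edges)
    sim_cls = np.digitize(sim, edges)

    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (obs_cls, sim_cls), 1)  # count each (observed, estimated) pair

    # Convert each row to percent; rows without observations remain zero.
    row_totals = counts.sum(axis=1, keepdims=True)
    return 100.0 * np.divide(counts, row_totals,
                             out=np.zeros_like(counts), where=row_totals > 0)
```

In such a matrix, the diagonal entry of a row is the share of months in that observed class that the product assigns to the same class, which corresponds to the detection capacity discussed above. Entries to the right of the diagonal indicate months placed in a higher class (overestimation), and entries to the left indicate underestimation.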

Fig. 12

Confusion matrix of the performance of all SPPs at a monthly scale. a IMERG-FRV6, b MSWEP, c PERSIANN-CDR, and d TRMM-3B43V7

Conclusion

Precipitation is the key factor in the hydrological cycle, but it varies widely throughout the year and between regions, particularly in arid and semi-arid areas with low rainfall. The lack of rainfall data and the uneven distribution of rain gauge stations, particularly in mountainous and remote areas, pose challenges for analyzing rainfall and making hydrological forecasts for water resource management. Therefore, alternative sources such as meteorological radars and satellite sensors are necessary. This study evaluated four precipitation products (MSWEP, IMERG-FRV6, TRMM-3B43V7, and PERSIANN-CDR) over Iran from 2008 to 2019. The products were analyzed at monthly, seasonal, and annual scales, as well as at different elevations.

All SPPs accurately reproduce the precipitation pattern that results from Iran's topographical and climatic characteristics, namely a progressive decrease in precipitation from north to south. The precipitation pattern of MSWEP agrees most closely with the observed data. Thus, it may be the best product for climate and drought studies, as well as for calculating per capita water resources to determine the allocation to the agricultural, industrial, and drinking water sectors in areas without stations. IMERG-FRV6 shows a significant overestimation at the country scale. It was also shown that PERSIANN-CDR is less sensitive to precipitation fluctuations over different regions, which leads to larger biases for this SPP.

Analysis of the satellite-based precipitation products on a seasonal scale shows that the products reach their highest efficiency, together with a favorable average correlation with observed precipitation, in spring. Moreover, the performance of MSWEP is promising, with the lowest error in all seasons. In contrast, PERSIANN-CDR performs unfavorably, with high error, and is also sensitive to seasonal variations.

Based on the analysis of the SPPs on a monthly scale, MSWEP and TRMM-3B43V7 have the highest correlation and the lowest error, respectively, and they outperform IMERG-FRV6 and PERSIANN-CDR. Furthermore, the underestimation by all products, especially PERSIANN-CDR, increases towards areas with high precipitation. TRMM-3B43V7 performs better in areas with low to moderate precipitation. In general, the error of the SPPs decreases and their agreement with the observed rainfall improves with increasing elevation. Additionally, in all products a notable number of stations perform better than the station average in terms of correlation, error, and efficiency. The highest underestimation and overestimation are seen in PERSIANN-CDR and IMERG-FRV6, respectively.

According to the spatial distribution of the performance indices on a monthly scale, the highest and the lowest correlations with observed rainfall in most regions of Iran are seen in MSWEP and PERSIANN-CDR, respectively. The best correlation for all SPPs is found in the northwest, the northeast, and the Zagros highlands. Notably, MSWEP has the lowest RMSE across the country. In all products, at least about 40% of the stations have an efficiency higher than 0.5, over 50% of the stations show overestimation, and at least around 60% of the stations have an error of less than 20 mm/month. The results show that the correlation, Bias, and error of the products increase towards areas with high average rainfall. Finally, all products have a good ability to identify high rainfall amounts, while MSWEP has the best performance and the highest detection capacity for the lowest precipitation values.

The superior performance of MSWEP can be attributed to its integration of multi-source data, high spatial and temporal resolution, consideration of topographic and land cover information, advanced bias correction and data assimilation techniques, and rigorous validation and calibration processes. These factors collectively contribute to the accurate capture of precipitation under various conditions. However, the accuracy of satellite-based precipitation products varies across regions due to factors such as precipitation intensity, sensors, and geography. Therefore, it is recommended to assess the accuracy of these products in each region, taking into account topographical and climatic conditions. In addition, increasing the density of ground rainfall stations is essential for improving the statistical evaluation of satellite precipitation products. Furthermore, multi-criteria methods that assign different weights to the statistical indices are suggested for improved outcomes.