1 Introduction

Precipitation is a major component of the global water and energy cycle, in which it helps regulate the climate (Kidd and Levizzani 2010), and provides key input data for ecological and hydrological model studies (Daly et al. 1994; Running et al. 1987). Therefore, strengthening observations of the distribution, volume, timing, and duration of precipitation is essential for fully understanding Earth’s hydrological processes. A ground-based rain gauge is an effective means of measuring precipitation at a meteorological or hydrological site, and observations from dense rain gauge networks can provide accurate precipitation data for a specific region. However, alpine areas such as the Tibetan Plateau have very few meteorological stations, only a few of which are above 4000 m a.s.l., and generally lack enough rain gauges to represent the spatial features of complex topography and heavy precipitation (Ciach 2003). Therefore, spatial interpolation based only on ground-based measurements of precipitation is problematic (Zawadzki 1975).

The development of satellite remote sensing has broadened the scope of observation and provides technical support for characterizing the temporal and spatial distribution of precipitation. A variety of satellite precipitation products with different temporal and spatial resolutions are available, including Tropical Rainfall Measuring Mission (TRMM) 3B42 and the Climate Prediction Center (CPC) MORPHing technique (CMORPH). There are also many reanalysis precipitation products that assimilate ground observations together with satellite observations, such as the European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis (ERA-Interim), China Meteorological Forcing Dataset (CMFD), Japanese 55-year Reanalysis of the Japan Meteorological Agency (JRA-55), National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR), National Aeronautics and Space Administration (NASA) Modern-Era Retrospective analysis for Research and Applications (MERRA), Global Precipitation Climatology Project (GPCP), and Climate Prediction Center Merged Analysis of Precipitation (CMAP).

Since the performance of precipitation products varies between regions (Ebert et al. 2010; Hirpa et al. 2010; Jiang et al. 2012), it is necessary to evaluate products for the local region before applying them in hydrological modeling, even if they have already been evaluated at large scales (Wang et al. 2017b). The distribution of precipitation in alpine regions with cold climates is notoriously difficult to estimate. For example, Yin et al. (2008) found that TMPA 3B42 consistently overestimated the monthly precipitation on the Tibetan Plateau. Ma et al. (2009) determined that CMAP-1 and GPCP-2 generally corresponded better with adjusted observational precipitation than ERA-40, NCEP-1, NCEP-2, and CMAP-2 in China. Zhao (2013) reported that although TRMM 3B42 efficiently describes the spatial patterns and seasonal changes in the rainy season in most parts of China, it underestimates the precipitation in most of western China, especially the western part of the Tibetan Plateau. Tayeb et al. (2011) revealed that the Global Precipitation Climatology Centre (GPCC) dataset is a useful tool for drought monitoring on the Iranian Plateau. Chen et al. (2014) found that JRA-55, ERA-Interim, NCEP CFSR, and NASA MERRA can proficiently represent the interannual variation in the daily cycle of precipitation in East Asia, while JRA-55 captures the early and late rainfall over the Tibetan Plateau. Wang et al. (2017a) suggested that CMFD performs better than CMORPH, GPCP-2, TRMM 3B43, and GPCC in the Qinling-Daba mountain areas on the east edge of the Tibetan Plateau. Wu et al. (2019) reported that CMFD performs better than GPCC, ERA-Interim, GPCP V2.3, and JRA-55 in both the upper reach and the middle and lower reaches of the Shule River Basin, and that these datasets perform better in the upper reach than in the middle and lower reaches.
These results indicate that the performance of different products varies greatly on different spatial–temporal scales; thus, it is critical to evaluate different precipitation products on a regional scale before applying them to studies on climate change or in hydrological simulations.

Located on the northeastern edge of the Qinghai-Tibet Plateau, the Upper Reach of the Shule River Basin (URSRB) is the main water source of the whole basin and has important impacts on the regional climate and hydrology. However, no national meteorological station exists in the URSRB, and only some automatic weather stations and total rain gauges were set up at different altitudes after 2010. Because resources for data collection are limited, it is very important to obtain the spatial–temporal characteristics of precipitation from different precipitation products. The High Asia Refined Analysis (HAR) dataset (Maussion et al. 2013), a high-resolution reanalysis dataset that focuses on the alpine regions of the Tibetan Plateau, and the China Alpine Precipitation Dataset (CAPD) (Chen et al. 2015), which integrates more short-term ground-based precipitation measurements in alpine regions, were developed and released in 2013 and 2015, respectively. As more precipitation products for alpine regions are developed, more accurate spatial and temporal characteristics of precipitation in the mountainous URSRB can be obtained.

Altitude is the main variable that controls the spatial distribution of precipitation in mountainous areas (Wang et al. 2017b). As precipitation increases with altitude, a precipitation gradient is formed, which is one of the key parameters for simulating hydrological processes in mountainous areas. However, precipitation gradients are very difficult to obtain in alpine environments, and different estimates of the precipitation gradient exist for the Qilian Mountains, which include the URSRB. Tang (1985) reported that precipitation has an “S”-shaped distribution with altitude, while Zhang et al. (2008) pointed out that precipitation increases with altitude in the Qilian Mountains. In this area, maximum precipitation is observed on the windward slope at 2910 m a.s.l. Rainfall observed in the field at altitudes of 2600–3800 m in the Heihe Basin suggests that the annual precipitation increases by 4.55% and that the maximum precipitation altitude is about 3650 m a.s.l. (Chang et al. 2002). Kang et al. (1999) reported that the precipitation gradient from March to October is 12.4 mm/100 m in the Heihe Basin. Lan and Zhang (2017) divided the whole Tibetan Plateau (TP) into five sub-regions according to their precipitation gradients, where the precipitation gradient of the URSRB was estimated at 19 mm/100 m in the wet season (May–September). Wang et al. (2009) found that the maximum precipitation altitude on the northern slope of the middle section of the Qilian Mountains is 4670 m a.s.l. and fluctuates seasonally, with a summer precipitation gradient of 11.5 mm/100 m. That study also suggested that a statistical relationship between the maximum precipitation height and the amount of precipitation should be established in different climatic regions at this altitude. The above-mentioned studies indicate that the precipitation gradient varies greatly between observation stations, altitudes, and seasons.

The updated precipitation products for alpine regions provide a good opportunity to investigate the spatial pattern and gradient of precipitation in the URSRB, particularly because precipitation gradients have rarely been reported in alpine areas. In this study, nine precipitation products, including CMFD, CMORPH, ERA-Interim, GPCC, GPCP V2.3, JRA-55, TRMM 3B43, HAR, and CAPD, and their precipitation gradients were evaluated against ground-based measurements in the URSRB and surrounding areas during 2001–2013. This paper is organized as follows: the study area, nine precipitation products, and evaluation indicators are introduced in Section 2; evaluation results of the nine products and their precipitation gradients on annual, seasonal, and monthly scales are shown in Section 3; reasons for the different performance of the nine precipitation products and uncertainties in the evaluations are discussed in Section 4; and conclusions about the findings are presented in the final section.

2 Study area and methods

2.1 Study area

The URSRB lies at the northeastern margin of the Tibetan Plateau, between 96.7° E–99.0° E and 38.3° N–39.9° N. It refers to the upstream mountainous area above Changmabao, at the outlet of the mountain areas, and has an area of 1.14 × 10⁴ km² (Gao et al. 2013) (Fig. 1). There is no national meteorological station in the URSRB, so two national meteorological stations near the URSRB were included in the study. The URSRB is mainly composed of the Shulenan Mountain, Tuolenan Mountain, and the valley of the Shule River. The terrain is high and steep with obvious vertical landform zones (Sheng et al. 2010; Xie et al. 2010). The altitude of the URSRB is between 2081 and 5764 m a.s.l., and the alpine area above 4500 m a.s.l. is permanently covered by glaciers. The URSRB lies in the westerly climate zone, where precipitation is mainly affected by Atlantic vapor carried by the westerly circulation. The URSRB is characterized by a dry and cold climate with little rain, much wind, and an obvious vertical and horizontal distribution of precipitation (Gao et al. 2013). The average annual rainfall is about 378.4 mm, of which 80% occurs during the growing season (May–September).

Fig. 1 Location and distribution of rain gauge stations in the Upper Reach of Shule River Basin and surrounding areas

2.2 Data

The temporal resolution, spatial resolution, periods, spatial coverage, and data sources of CMFD, CMORPH, ERA-Interim, GPCC, GPCP V2.3, JRA-55, TRMM 3B43, HAR, and CAPD are summarized in Table 1.

Table 1 Information of nine precipitation products in the Upper Reach of Shule River Basin and surrounding areas

CMFD is a set of reanalyzed datasets of near-surface meteorological and environmental elements, which is developed by the Institute of Tibetan Plateau Research, Chinese Academy of Sciences. The dataset is based on the existing Princeton reanalysis data, Global Land Data Assimilation System (GLDAS) data, Global Energy and Water Exchanges Surface Radiation Budget (GEWEX-SRB) data, and TRMM precipitation data and is made by combining the weather observation data of the China Meteorological Administration (Chen et al. 2011; Jin et al. 2003).

CMORPH includes precipitation estimates retrieved from low-orbit satellites (Defense Meteorological Satellite Program (DMSP) F13 to F15, NOAA-15 to 18, Aqua, and TRMM) and infrared (IR) data provided by geostationary satellites, which are combined by the Lagrangian interpolation method (Joyce et al. 2004).

ERA-Interim is the latest global atmospheric reanalysis dataset of ECMWF that replaced ERA-40 (MRP and Arkin 2009). ERA-Interim uses data assimilation technology to improve the radiative transfer model (Lindsay et al. 2014).

GPCC was established in 1989 to analyze global monthly precipitation measured by rain gauges in order to assess global energy and water cycles (Prakash et al. 2015). The database includes precipitation data received through the Global Telecommunication System (GTS), Synoptic Weather Reports (SYNOP), and Monthly Climate Reports (CLIMAT) and is generated within 2 months after the end of the observation month. The dataset incorporates more meteorological stations than other datasets during processing and also provides systematic error estimates and other data information (Wan et al. 2013).

GPCP was established by the World Climate Research Program (WCRP) in 1986. GPCP V2.3 integrates ground observations from rain gauges with retrievals from satellite remote sensing data. One of the main GPCP data sources is infrared data from geostationary and low Earth orbit (LEO) satellites, including GOES, GMS, Meteosat, and NOAA AVHRR, while the microwave data come from SSM/I on the DMSP satellites. The station observation data mainly come from the GPCC, and the monthly precipitation analysis is a basic product.

JRA-55 uses data from ERA-40 (Uppala et al. 2005) and JMA (Kobayashi et al. 2015). It is based on the JMA global operational prediction model, which has good spatial resolution (Yokoi 2015), provides long-period data, and implements a four-dimensional variational method for assimilation.

TRMM 3B43 combines precipitation estimates from multiple satellites with gauge analyses where possible (Huffman et al. 2010). TRMM carries five instruments: the Precipitation Radar (PR), TRMM Microwave Imager (TMI), Visible and Infrared Scanner (VIRS), Clouds and the Earth’s Radiant Energy System (CERES), and Lightning Imaging Sensor (LIS). The final data product reflects monthly multi-satellite estimates scaled to rain gauge data (Karaseva et al. 2012) (https://climatedataguide.ucar.edu/climate-data/trmm-tropical-rainfall-measuring-mission). TRMM 3B43 is derived from the average TRMM 3B42 V6 precipitation product (Wang et al. 2017a).

HAR is generated by dynamically downscaling global analysis data with the Weather Research and Forecasting (WRF) model V3.3.1 to produce a 10-km grid for the Qinghai-Tibet Plateau and surrounding areas (Maussion et al. 2013). The simulation uses runs with 36 h of time integration, where the last 24 h of model output provide the data for 1 day of the final dataset.

CAPD includes monthly precipitation datasets for the Qilian Mountains (1960–2013). The product incorporates observed precipitation from national meteorological stations, hydrological stations, and observation stations above 4000 m a.s.l. operated by the Chinese Academy of Sciences. CAPD establishes a relationship between precipitation and altitude, longitude, and latitude (Chen et al. 2015) and generates monthly precipitation datasets by kriging interpolation at a 1-km resolution.

2.2.1 The gauge data

The precipitation data of two national meteorological stations in the area surrounding the URSRB, Tuole and Yeniugou, for 2001 to 2013 were obtained from the National Meteorological Information Center of the China Meteorological Administration (CMA, http://data.cma.cn/site/index.html). Other data were acquired from four Automatic Weather System stations in the URSRB from 2008 to 2013. Information on each station is listed in Table 2.

Table 2 The information of rain gauge stations in the Upper Reach of Shule River Basin and surrounding areas

Total bucket gauges were set up in the URSRB for short-term observations, but their observations were limited to the rainy period because of the difficulty of observing during winter. We define the rainy period in the URSRB as May to October, which accounts for more than 80% of the annual precipitation. The precipitation gradient in the rainy period was obtained by integrating short-term observations from 40 total bucket gauges during 2013 to 2016. The observed precipitation gradient in the rainy period was determined to be 12 mm/100 m, and no maximum precipitation altitude was observed.

2.2.2 DEM

The Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) with 30 m resolution (http://gdex.cr.usgs.gov/gdex/) was used to calculate the precipitation gradients in the URSRB.

2.3 Evaluation method

2.3.1 Data preprocessing

Since seven of the nine precipitation products have a monthly resolution, the evaluation was carried out on a monthly time scale. The seasonal and annual precipitation were calculated from monthly precipitation. Spring, summer, autumn, and winter are defined as March–May, June–August, September–November, and December–February of the next year, respectively.
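The seasonal aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used in this study: the function name `seasonal_totals` and the input layout (a dictionary keyed by year and month) are hypothetical, winter is labeled by the year of its December, and seasons with fewer than three months of data are dropped.

```python
from collections import defaultdict

# Month -> season, following the season definitions in the text.
SEASON_OF_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def seasonal_totals(monthly):
    """Sum monthly precipitation (mm) into seasonal totals.

    `monthly` maps (year, month) -> precipitation in mm (assumed layout).
    Winter is labeled by the year of its December, so January and
    February are counted with the preceding year's winter.
    Seasons with fewer than three months of data are dropped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (year, month), value in monthly.items():
        season = SEASON_OF_MONTH[month]
        season_year = year - 1 if month in (1, 2) else year
        sums[(season_year, season)] += value
        counts[(season_year, season)] += 1
    return {key: sums[key] for key in sums if counts[key] == 3}
```

With this labeling, the winter of 2001 is the sum of December 2001, January 2002, and February 2002.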

Considering the area of the URSRB, the spatial resolution of the evaluation was taken as 0.25° × 0.25°. Bilinear interpolation, which is commonly used for estimating precipitation, was employed as the interpolation method. The monthly data of each precipitation product were compared with the precipitation observed by the rain gauges located in the corresponding grid cell. Since HAR data are stored as point features, the meshless method based on moving kriging interpolation (Ge and Cheng 2014) was used to convert the point features into a raster when preprocessing HAR data. The GPCC data were corrected with the correction factor provided in the metadata before evaluation.
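For readers unfamiliar with bilinear interpolation, the sketch below shows how a value at a target location is estimated from the four surrounding grid nodes. It is a generic illustration, not the resampling code used in this study; the function name `bilinear` and its argument order are assumptions.

```python
def bilinear(x, y, x0, x1, y0, y1, q00, q10, q01, q11):
    """Bilinearly interpolate a value at (x, y).

    (x0, y0) and (x1, y1) are opposite corners of the enclosing grid cell;
    q00, q10, q01, q11 are the values at (x0, y0), (x1, y0), (x0, y1),
    and (x1, y1), respectively.
    """
    if x1 == x0 or y1 == y0:
        raise ValueError("grid cell must have non-zero width and height")
    # Fractional position of the target inside the cell (0..1 on each axis).
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    # Interpolate along x on the lower and upper edges, then along y.
    bottom = q00 * (1 - tx) + q10 * tx
    top = q01 * (1 - tx) + q11 * tx
    return bottom * (1 - ty) + top * ty
```

At the center of a cell the result is the mean of the four corner values, and at a corner it equals that corner's value.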

The precipitation gradient G (mm/100 m) was calculated as:

$$ G=\frac{P_2-{P}_1}{H_2-{H}_1}\times 100 $$
(1)

where P1 and P2 are the precipitation amounts (mm) at two different stations or grid cells, and H1 and H2 are the corresponding altitudes (m a.s.l.).
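Eq. (1) can be written as a short function. This is a minimal sketch; the function name `precipitation_gradient` is hypothetical, and altitudes are assumed to be in meters so that the result is in mm/100 m.

```python
def precipitation_gradient(p1, h1, p2, h2):
    """Return the precipitation gradient G of Eq. (1) in mm per 100 m.

    p1, p2: precipitation (mm) at two stations or grid cells.
    h1, h2: their altitudes (m a.s.l., assumed unit).
    """
    if h2 == h1:
        raise ValueError("altitudes must differ to define a gradient")
    return (p2 - p1) / (h2 - h1) * 100.0
```

As an illustration with made-up values, two sites 1000 m apart in altitude that receive 200 mm and 320 mm give G = 12 mm/100 m.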

2.3.2 Evaluation indicator

Four indicators were used to quantitatively evaluate nine precipitation products, which include the Pearson correlation coefficient (R), relative BIAS (BIAS), root mean square error (RMSE), and mean absolute error (MAE):

$$ R=\frac{\sum \limits_{i=1}^n\left({X}_i-\overline{X}\right)\left({Y}_i-\overline{Y}\right)}{\sqrt{\sum \limits_{i=1}^n{\left({X}_i-\overline{X}\right)}^2\sum \limits_{i=1}^n{\left({Y}_i-\overline{Y}\right)}^2}} $$
(2)
$$ \mathrm{BIAS}=\frac{\sum \limits_{i=1}^n\left({Y}_i-{X}_i\right)}{\sum \limits_{i=1}^n{X}_i}\times 100\% $$
(3)
$$ \mathrm{RMSE}=\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({X}_i-{Y}_i\right)}^2} $$
(4)
$$ \mathrm{MAE}=\frac{1}{n}\sum \limits_{i=1}^n\mid {X}_i-{Y}_i\mid $$
(5)

where n is the number of paired samples of measured and product precipitation; \( {X}_i \) is the measured precipitation on the ground; \( {Y}_i \) is the precipitation from the precipitation product in the corresponding time interval; and \( \overline{X} \) and \( \overline{Y} \) are the means of X and Y over the n samples, respectively.
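The four indicators in Eqs. (2)–(5) can be computed as in the sketch below. It is an illustrative implementation only; the function name `evaluate` and the use of plain Python lists are assumptions, not the tooling used in this study.

```python
import math


def evaluate(observed, estimated):
    """Return (R, BIAS in %, RMSE, MAE) following Eqs. (2)-(5).

    observed:  gauge precipitation X_i (mm)
    estimated: product precipitation Y_i (mm) for the same intervals
    """
    n = len(observed)
    if n == 0 or n != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(observed, estimated))
    x_mean = sum(observed) / n
    y_mean = sum(estimated) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in pairs)
    var_x = sum((x - x_mean) ** 2 for x in observed)
    var_y = sum((y - y_mean) ** 2 for y in estimated)
    # R is undefined when either series is constant.
    r = cov / math.sqrt(var_x * var_y) if var_x > 0 and var_y > 0 else float("nan")
    # BIAS is undefined when the observed total is zero (division by zero).
    bias = sum(y - x for x, y in pairs) / sum(observed) * 100.0
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in pairs) / n)
    mae = sum(abs(x - y) for x, y in pairs) / n
    return r, bias, rmse, mae
```

For the made-up series X = (10, 20, 30) and Y = (12, 18, 36), the errors are +2, −2, and +6 mm, so BIAS = 6/60 × 100% = 10% while MAE = 10/3 ≈ 3.33 mm.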

Following previous literature (MRP and Arkin 2009), values of R are classified as follows: 0.8 < R ≤ 1.0, extremely strong correlation; 0.6 < R ≤ 0.8, strong correlation; 0.4 < R ≤ 0.6, moderate correlation; 0.2 < R ≤ 0.4, weak correlation; and 0.0 ≤ R ≤ 0.2, extremely weak correlation or uncorrelated (Xia et al. 2017). The statistical significance of the correlation coefficient was assessed with Student’s t test. BIAS reflects the degree of systematic over- or underestimation; the closer BIAS is to zero, the more accurate the precipitation product. However, positive and negative errors can offset each other, leading to a relatively small BIAS. RMSE reflects the precision of the precipitation product and is very sensitive to unusually large or small errors in a group of data. MAE, which averages the absolute errors, is not affected by such cancellation and can better reflect the actual magnitude of error in a precipitation product.
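The classification of R and the significance test can be sketched as follows. The class boundaries are those listed above; the test statistic t = R·sqrt((n − 2)/(1 − R²)) with n − 2 degrees of freedom is the standard form for testing a Pearson correlation against zero and is assumed here, since the text does not spell out the exact procedure. The function names are hypothetical, and the comparison with a critical t value is left out.

```python
import math


def correlation_class(r):
    """Map R to the strength classes listed in the text (0 <= R <= 1)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("classes are defined only for 0 <= R <= 1")
    if r > 0.8:
        return "extremely strong"
    if r > 0.6:
        return "strong"
    if r > 0.4:
        return "moderate"
    if r > 0.2:
        return "weak"
    return "extremely weak or uncorrelated"


def t_statistic(r, n):
    """t statistic for testing R against zero, with n - 2 degrees of freedom."""
    if n <= 2 or abs(r) >= 1.0:
        raise ValueError("need n > 2 and |R| < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

Because the upper bound of each class is inclusive, R = 0.8 falls in the strong class rather than the extremely strong class.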

3 Results

The precipitation products and precipitation gradients were evaluated on annual, seasonal, and monthly scales, and then the performance of products at each station was compared.

3.1 Annual scale

The R values of GPCC, CMFD, GPCP V2.3, and TRMM 3B43 were all higher than 0.8, indicating that these four precipitation products have extremely strong correlation with the precipitation observations on an annual scale (Fig. 2). GPCC had the highest R (0.89). The R values of ERA-Interim, JRA-55, HAR, and CAPD were slightly lower but greater than 0.7, suggesting that these four precipitation products have strong correlation with precipitation observations on an annual scale. The R value of CMORPH was less than 0, indicating that CMORPH has poor correlation with observations on an annual scale.

Fig. 2 Scatter plots of annual precipitation between precipitation products including a CMFD, b CMORPH, c ERA-Interim, d GPCC, e GPCP V2.3, f JRA-55, g TRMM 3B43, h HAR, i CAPD, and corresponding rain gauge data during 2001–2013 in the Upper Reach of Shule River Basin and surrounding areas. Double asterisk denotes that this product has passed the significance test at the 99% confidence level on an annual scale

CMFD, GPCC, and TRMM 3B43 were closer to the 1:1 line, suggesting that the annual precipitation of these products was closer to observations (Fig. 2). HAR, CAPD, JRA-55, and ERA-Interim overestimated precipitation compared to observations, while GPCP V2.3 and CMORPH clearly underestimated precipitation.

The BIAS values of CMFD and TRMM 3B43 were the smallest, while the BIAS values of CMORPH and GPCP V2.3 were −18.58% and −43.38%, respectively, suggesting that they underestimated precipitation. The BIAS of HAR was the largest at 82%, indicating that HAR largely overestimated annual precipitation (Table 3).

Table 3 BIAS, RMSE, and MAE of annual precipitation between nine precipitation products and corresponding rain gauge observations during 2001–2013 in the Upper Reach of Shule River Basin and surrounding areas

CMFD had the smallest RMSE, 60.19 mm, which suggests that it is most similar to observed annual precipitation. GPCC and TRMM 3B43 followed with slightly larger RMSE values of 64.29 and 70.41 mm, respectively. The RMSE values of the other six precipitation products were larger than 100 mm, among which those of CMORPH, JRA-55, and HAR were greater than 200 mm (Table 3). CMFD had the lowest MAE of 35.12 mm, while the MAE values of GPCC and TRMM 3B43 were slightly larger. HAR had the largest MAE of 274.82 mm (Table 3).

It is interesting to note that although the R between CMFD and the observed annual precipitation was slightly lower than that of GPCC, CMFD performed better than GPCC, with smaller BIAS and MAE. GPCC and TRMM 3B43 had high R values against observed annual precipitation and slightly larger RMSE and MAE values than CMFD. Taking the four evaluation indicators together, CMFD, GPCC, and TRMM 3B43 accurately estimated precipitation on an annual scale, while CMORPH provided the worst estimation of annual precipitation in the URSRB.

3.2 Seasonal scale

In spring, the R values of all precipitation products except CMORPH were greater than 0.4, indicating at least a moderate correlation between these products and ground observations. CMFD had the highest R of 0.86, suggesting its excellent performance in spring, followed by GPCC with R of 0.70 (Fig. 3a). GPCC and CMFD had the smallest BIAS values of 1.15% and −2.61%, respectively, while GPCP V2.3 had the largest BIAS, greater than 200% (Fig. 3b). CMFD had the smallest RMSE and MAE values of 12.63 and 7.34 mm, followed by GPCC with values of 15.4 and 12.7 mm, respectively. Unsurprisingly, the MAE and RMSE of CMORPH were the largest (Fig. 3c, d).

Fig. 3 Evaluation indicators including a R, b BIAS, c RMSE, d MAE between seasonal precipitation from precipitation products and corresponding rain gauge data in various seasons during 2001–2013 in the Upper Reach of Shule River Basin and surrounding areas

In summer, the R values of the nine precipitation products were all greater than 0.4, suggesting that all products perform well during June–August. GPCC had the largest R of 0.86, followed by TRMM 3B43 with R of 0.79. It is pertinent to note that the R values of all products except CMORPH and CAPD were larger than 0.6 in summer, which is better than in other seasons, indicating that most of the products can efficiently capture precipitation in summer (Fig. 3a). The BIAS values of most precipitation products were smaller than 100%. ERA-Interim, JRA-55, HAR, and CAPD overestimated summer precipitation, while the other products underestimated it. The BIAS of CMFD was the smallest at only 2.79%, and the BIAS values of ERA-Interim, GPCC, JRA-55, and TRMM 3B43 were less than 16% (Fig. 3b). GPCC had the smallest RMSE of 53.22 mm, followed by TRMM 3B43 with RMSE of 53.99 mm. CMFD had the smallest MAE of 35.68 mm, followed by TRMM 3B43 with MAE of 41.69 mm in summer.

GPCC, ERA-Interim, CMFD, and TRMM 3B43 performed well with higher R values in autumn, while GPCP and CAPD had very low R values, which did not pass the 95% confidence test (Fig. 3a). GPCC had the smallest BIAS of −3.2%, followed by TRMM 3B43 and CMFD, which had BIAS values of 15.99% and 22.51%, respectively (Fig. 3b). The RMSE and MAE of GPCC were the smallest (17.31 and 13.64 mm, respectively) in autumn, followed by TRMM 3B43 and CMFD with RMSE of 21.27 and 30.58 mm and MAE of 17.18 and 14.08 mm, respectively. The RMSE and MAE of ERA-Interim, GPCP V2.3, and HAR were less than 40 mm (Fig. 3c, d).

All precipitation products performed poorly during the winter, among which CMORPH failed to pass the 95% confidence test. HAR had the best performance with R of 0.27 (Fig. 3a). All precipitation products overestimated precipitation in winter, and their BIAS values all exceeded 100%. TRMM 3B43 had the smallest BIAS of 132.28% and the lowest RMSE and MAE values of 5.68 and 4.9 mm, respectively, in winter (Fig. 3c, d).

In general, R values were best in summer and worst in winter for all precipitation products. Similar to R, the BIAS values were smallest in summer and largest in winter. The RMSE and the MAE of the precipitation products showed a similar pattern (Fig. 3c, d). The performance of the nine precipitation products varied greatly between seasons. GPCC, TRMM 3B43, and CMFD performed well in all seasons. CMFD performed best in spring, CMFD and GPCC performed well in summer, GPCC performed best in autumn, and HAR performed best in winter. CMORPH performed relatively poorly in all seasons, indicating that CMORPH cannot accurately estimate precipitation on a seasonal scale.

3.3 Monthly scale

All precipitation products have a monthly distribution pattern similar to that of the ground-based measurements (Fig. 4), with greater precipitation from May to September and a maximum in July. The distribution patterns of CMFD, GPCC, and TRMM 3B43 were closer to the observed distribution, suggesting that these products are superior in estimating the monthly precipitation distribution. HAR significantly overestimated the precipitation in each month, while CMORPH had the largest difference from the measured precipitation, and JRA-55 clearly overestimated the precipitation in May (Fig. 4).

Fig. 4 Monthly average precipitation in the Upper Reach of Shule River Basin and surrounding areas

All precipitation products failed to pass the 95% confidence test in December, and most of the precipitation products failed to pass the 95% confidence test in February and November (Table 4). The monthly average R values of the nine precipitation products were lower than the annual values. The R of CMFD was the highest in March, April, and May, which is consistent with its best performance in spring. GPCC had the highest R in June, July, and August, suggesting its best performance in summer. CMFD and CMORPH had the maximum and minimum average R values of 0.65 and −0.04, respectively. CMFD, ERA-Interim, JRA-55, HAR, and CAPD overestimated the measured monthly precipitation average, while the other four precipitation products underestimated the average.

Table 4 Correlation coefficients (R) between nine precipitation products and corresponding rain gauge observation within each month in the Upper Reach of Shule River Basin and surrounding areas

TRMM 3B43 had the smallest absolute BIAS, only −0.27% (Table 5). The BIAS values of CMFD and GPCC were also relatively small, 4.3% and −9.1%, respectively, suggesting the better performance of these three precipitation products. The most negative BIAS was that of GPCP V2.3 (−43.28%), and the most positive BIAS was that of JRA-55 (63.7%) (Table 5). The RMSE and MAE revealed a similar pattern to BIAS. GPCC had the smallest RMSE of 14.71 mm, and CMFD and TRMM 3B43 both had a slightly larger RMSE value of 16.2 mm. CMFD had the smallest MAE of 6.5 mm, while HAR had the largest RMSE and MAE.

Table 5 BIAS, RMSE, and MAE of monthly precipitation between nine precipitation products and corresponding rain gauge observation during 2001–2013 in the Upper Reach of Shule River Basin and surrounding areas

In summary of the performance on annual, seasonal, and monthly scales, CMFD had a high R annually and the best R monthly, together with the minimum MAE, a low BIAS, and a slightly higher RMSE than GPCC on a monthly scale. Thus, CMFD performed best on a monthly scale. GPCC estimated slightly better than TRMM 3B43 when comparing their R and BIAS values. The performance of CMFD, GPCC, and TRMM 3B43 indicated that the uncertainty in estimated alpine precipitation in the URSRB has largely decreased in these products. CMORPH had the worst indicators, suggesting its poor performance for estimating precipitation in the URSRB, which is consistent with its performance in the Qinling-Daba mountains at the east edge of the Tibetan Plateau (Wang et al. 2017a). HAR reproduced the monthly distribution well in terms of R but overall overestimated the monthly, seasonal, and annual precipitation.

3.4 Different performance at each station

The performance of the nine precipitation products varied greatly between stations but showed similarities with the monthly distribution of measured precipitation. The patterns of CMFD, GPCC, and TRMM 3B43 were closer to the observations, and CMFD was the most consistent with the observations. Both GPCP V2.3 and CMORPH had an obvious mismatch with the measured values at each station, where precipitation was clearly underestimated in the months with heavy precipitation. At the S4 site, the precipitation estimated by CMORPH in spring and autumn was abnormally large. Although the distribution of monthly precipitation of HAR was close to the observations, it is evident that HAR overestimated the precipitation by a factor of approximately two in each month at each station (Fig. 5). JRA-55 significantly overestimated the precipitation in May at the Tuole, Yeniugou, Suli, Gahe, and S3 stations.

Fig. 5 Monthly average precipitation at six sites including a Tuole, b Yeniugou, c Suli, d Gahe, e S3, and f S4 of the Upper Reach of Shule River Basin and surrounding areas

All precipitation products except CMORPH passed the significance test at the 95% confidence level at all stations. The correlation between HAR and measured data was the best among the nine precipitation products, especially at the higher altitude stations S1, S2, and S3, with R of 0.93, 0.95, and 0.96, respectively (Fig. 6a). CAPD had the highest R values of 0.95 and 0.98 at the Tuole and Yeniugou stations, respectively, which may be related to CAPD’s integration of observation data from national stations during data processing, followed by GPCC with R of 0.93 and 0.95. At the lowest altitude station, S4, GPCP V2.3 had the highest R of 0.87. In general, all precipitation products had their largest R values at the Yeniugou station, while R was smaller and more variable at the S4 station (Fig. 6a).

Fig. 6 Indicators including a R, b BIAS, c RMSE, and d MAE of monthly average precipitation at each site of the Upper Reach of Shule River Basin and surrounding areas

All precipitation products largely overestimated precipitation at the S4 station. Both CMORPH and GPCP V2.3 had negative BIAS at the other five stations except S4 (Fig. 6b). ERA-Interim, JRA-55, HAR, and CAPD overestimated precipitation at all sites except S4. CMFD had the smallest BIAS of − 3.09%, 0.355%, − 2.38%, and − 4.97% at Tuole, Yeniugou, Suli, and S3 stations, respectively, followed by GPCC and TRMM 3B43. The BIAS of − 0.11% of CMORPH at S2 was the smallest, followed by GPCC and TRMM 3B43. The minimum BIAS for GPCP V2.3 at S4 was 10%, followed by GPCC and TRMM 3B43.

The RMSE values of GPCC, CMFD, and TRMM 3B43 were less than 20 mm at each of the six stations (Fig. 6c), and MAE values were less than 12 mm (Fig. 6d), while RMSE and MAE of HAR and CMORPH at each station were larger. The RMSE of CAPD at the two national stations were the smallest, 12.6 and 15 mm, respectively, followed by GPCC, CMFD, and TRMM 3B43. CMFD had the smallest RMSE and MAE at Suli and S3 stations and the smallest RMSE at Tuole and Yeniugou. At the Gahe station, GPCC had the smallest RMSE and MAE (11.53 and 7.48 mm), and GPCP V2.3 had the smallest RMSE and MAE of 6.55 and 4.7 mm, respectively.

Considering the four evaluation indicators, GPCC, CMFD, and TRMM 3B43 can estimate precipitation accurately at each station. The monthly distribution pattern of HAR is consistent with the observed data, especially at high altitude stations, but HAR shows an apparent overestimation of precipitation. CMORPH cannot accurately estimate precipitation at most stations. Although CAPD performed exceptionally well at the national meteorological stations, it did not perform well at the four other stations, indicating that the interpolation from neighboring stations has some limitations. GPCP V2.3 performed well at the low altitude site S4. The accuracy of all precipitation products is lower at the S4 site than at the other sites, while the precipitation products performed best at the Yeniugou station.
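The four indicators compared above can be computed from paired monthly series as in the following minimal sketch. It assumes that BIAS is the relative bias in percent (total error divided by total observed precipitation), which is a common convention but may differ from the exact definition used in this study; the function name `evaluate` and the sample values are hypothetical and are not data from the paper.

```python
import numpy as np

def evaluate(estimated, observed):
    """Return R, relative BIAS (%), RMSE and MAE for paired monthly series (mm)."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    diff = est - obs
    return {
        # Pearson correlation coefficient between product and gauge
        "R": float(np.corrcoef(est, obs)[0, 1]),
        # Assumed definition: total error relative to total observation, in percent
        "BIAS": float(100.0 * diff.sum() / obs.sum()),
        # Root mean square error (mm)
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        # Mean absolute error (mm)
        "MAE": float(np.mean(np.abs(diff))),
    }

# Hypothetical monthly totals (mm); NOT values from this study
observed = [2.0, 5.0, 20.0, 60.0, 90.0, 40.0]
estimated = [3.0, 6.0, 25.0, 70.0, 100.0, 45.0]
scores = evaluate(estimated, observed)
print(scores)
```

A product can score a high R while still showing a large positive BIAS, which is the pattern described for HAR above: the monthly distribution is reproduced but the amounts are overestimated.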

3.5 Precipitation gradients in rainy period

Precipitation from all products except CMORPH increased with altitude, indicating clear precipitation gradients (Fig. 7). The precipitation gradient of GPCP V2.3 in all years was significantly smaller than that of the other precipitation products, while the gradient of HAR was significantly greater than that of the other precipitation products.

Fig. 7 Relationship between average precipitation in rainy period and altitude of nine precipitation products including (a) CMFD, (b) CMORPH, (c) ERA-Interim, (d) GPCC, (e) GPCP V2.3, (f) JRA-55, (g) TRMM 3B43, (h) HAR, and (i) CAPD

The average gradients of GPCC, JRA-55, and CAPD (10.9, 13.05, and 10.13 mm/100 m, respectively) were closest to the measured absolute gradient of 12.4 mm/100 m (Table 6). The precipitation gradient of GPCP V2.3 is far smaller than observed, whereas that of HAR is approximately twice the observed gradient. However, the relative precipitation gradient (%/100 m) of HAR is very close to the observation. It should also be noted that the precipitation gradients of CMFD and CAPD varied from 2001 to 2013: the CMFD gradient was −0.88 mm/100 m in 2013, and the CAPD gradient was −0.44 mm/100 m in 2012.

Table 6 The precipitation gradients in rainy periods in the Upper Reach of Shule River Basin and surrounding areas

In general, GPCC performed best on the precipitation gradient, JRA-55 and CAPD also performed well, followed by CMFD and ERA-Interim. CMORPH cannot correctly estimate the precipitation gradient.
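The distinction between the absolute gradient (mm/100 m) and the relative gradient (%/100 m) can be illustrated with a short sketch. It assumes the absolute gradient is the slope of a least-squares line of rainy-period precipitation against station altitude, and the relative gradient is that slope divided by the mean precipitation of the same series; both definitions are assumptions for illustration, and the altitudes and precipitation totals are hypothetical.

```python
import numpy as np

def precipitation_gradient(altitude_m, precipitation_mm):
    """Return (absolute gradient in mm/100 m, relative gradient in %/100 m)."""
    alt = np.asarray(altitude_m, dtype=float)
    pre = np.asarray(precipitation_mm, dtype=float)
    # Least-squares line: precipitation = slope * altitude + intercept
    slope, _intercept = np.polyfit(alt, pre, 1)
    absolute = float(slope * 100.0)                 # mm per 100 m
    relative = float(100.0 * absolute / pre.mean())  # percent of mean per 100 m
    return absolute, relative

# Hypothetical stations and rainy-period totals; NOT values from this study
altitude = [3000.0, 3500.0, 4000.0, 4500.0]
observed = [200.0, 260.0, 320.0, 380.0]
product = [2.0 * p for p in observed]  # a product that doubles every amount

obs_abs, obs_rel = precipitation_gradient(altitude, observed)
prd_abs, prd_rel = precipitation_gradient(altitude, product)
print(obs_abs, obs_rel)
print(prd_abs, prd_rel)
```

In this constructed case the product has twice the absolute gradient of the observations but the same relative gradient, which mirrors the behaviour reported for HAR: a uniform overestimation scales the absolute gradient without changing the relative one.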

4 Discussion

4.1 The different performance of nine precipitation products

CMFD seems to have the best performance on yearly, seasonal, and monthly scales because it is primarily based on a merged precipitation product from surface measurements and satellite remote sensing; GPCC and TRMM 3B43 follow. ERA-Interim, JRA-55, and CAPD performed slightly worse than TRMM 3B43 and GPCP V2.3. CMORPH shows the worst overall performance because it is derived only from remote sensing. This differs from the results of Gao and Liu (2013) over the TP, who reported that TRMM 3B42V6 and CMORPH performed better than Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) and TRMM RT. The different performance of the same precipitation product over the URSRB and the whole TP shows that a product should be evaluated before application in a given area, even if it has already been assessed over a larger extent. HAR performed well in estimating the monthly distribution of precipitation but overestimated precipitation by nearly double in each month. Wang and Zeng (2012) suggested that no reanalysis product is superior to the others for both daily and monthly precipitation after evaluating six reanalysis products, including MERRA, NCEP/NCAR 1, CFSR, ERA40, ERA-Interim, and GLDAS, over the whole TP. Our study indicates that CMFD is superior to the other precipitation products in the URSRB.

The main factors behind the different performances of the precipitation products include the number of ground-based observation stations involved, the satellite retrieval algorithm, the assimilation algorithm used in the reanalysis data, and the spatial resolution of the product.

Since there are no national meteorological stations in the URSRB and only two national meteorological stations in the neighboring areas, the performance of each precipitation product varied among stations in the URSRB. For instance, CAPD performs exceptionally well at the two meteorological stations but performs poorly at the four other stations. The number of ground stations involved in different precipitation products varies greatly across the whole Tibetan Plateau. The rain gauge data collected by the China Meteorological Administration are the key to CMFD’s best performance and also contribute to the relatively good performance of GPCC (Wan et al. 2013). TRMM 3B43 incorporates multiple satellite and radar precipitation observations and undergoes multiple measurement and scale corrections (Karaseva et al. 2012), contributing to its good performance in the assessment. CMORPH is based only on microwave data and is not merged with ground measurement data, which contributes to its rather poor performance compared to the other datasets. Although CAPD seems to involve more local observation stations in the Heihe Basin of the Qilian Mountains, it did not show superiority to the other products, indicating that precipitation derived from satellite remote sensing data may help to obtain the spatial distribution of precipitation.

There are large differences in the precipitation products obtained from different physical process parameterization schemes (Zhu et al. 2015). The satellite retrieval algorithms and the assimilation algorithms vary greatly among the nine precipitation products, making it difficult to assess the impact of different algorithms, since they also vary on different spatial–temporal scales. For example, CMFD incorporates Princeton reanalysis data, GLDAS, GEWEX-SRB, TRMM 3B42, and routine observation data from the China Meteorological Administration (Huffman et al. 2010; Karaseva et al. 2012). TRMM 3B43 retrieves precipitation from the microwave sensors of multiple satellites (including SSMI, SSMIS, MHS, AMSU-B, and AMSR-E) and calibrates measurements with TCI. The estimation methods of GPCP V2.3 include the SSM/I emission algorithm, the SSM/I scattering algorithm, the TOVS-based algorithm, and the GPI and OPI precipitation index methods.

GPCP V2.3 performed relatively poorly, possibly due to the coarse spatial resolution of the original precipitation product, which distorted many details of the precipitation. In addition, the study area is at the edge of the HAR and CAPD domains, meaning the location may have some impact on the accuracy of these two precipitation products.

4.2 Poor performance in winter

All precipitation products performed poorly in winter, which may be attributable to two reasons. One reason is that precipitation in winter is much lower than in other seasons, with an average of only 3.16 mm in the URSRB, and the main type of precipitation is snow. The small amount of precipitation in winter leads to relatively large BIAS and MAE, even though the absolute error is not large.
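A short worked calculation shows why a small winter total inflates the relative error. It assumes BIAS is the relative bias in percent, uses the winter average of 3.16 mm stated above as the observed value O, and takes a hypothetical product estimate P of 6.00 mm; the summer values are likewise hypothetical and chosen to have the same absolute error.

```latex
\begin{align}
\mathrm{BIAS} &= \frac{P - O}{O} \times 100\% \\
\text{winter (hypothetical } P\text{):}\quad
\mathrm{BIAS} &= \frac{6.00 - 3.16}{3.16} \times 100\% \approx 89.9\% \\
\text{summer (hypothetical } P, O\text{):}\quad
\mathrm{BIAS} &= \frac{102.84 - 100.00}{100.00} \times 100\% \approx 2.8\%
\end{align}
```

The absolute error is 2.84 mm in both cases, but the relative bias differs by more than an order of magnitude because of the much smaller denominator in winter.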

The second reason is the large uncertainty in observed winter precipitation, which was probably substantially underestimated. It is well known that large biases exist in gauge precipitation data due to wind, wetting, and evaporation losses, especially for snow, and corrections for these biases are necessary to generate reliable regional and global precipitation datasets and climatologies (Groisman and Legates 1994; Groisman et al. 1975; Yang et al. 2009). The average correction factor is about 1.3 for snow and 1.1 for rain in the Heihe Basin in the Qilian Mountains (Kang et al. 1999). Moreover, snow is also redistributed by wind, which further affects snow observations at the field scale.
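The effect of such correction factors can be sketched as follows. The factors 1.3 (snow) and 1.1 (rain) are the Heihe Basin averages cited above; applying them to URSRB gauges, and treating the 3.16 mm winter average as an uncorrected all-snow total, are assumptions made only for illustration. The function name `correct_gauge` is hypothetical.

```python
# Average correction factors cited for the Heihe Basin (Kang et al. 1999)
SNOW_FACTOR = 1.3
RAIN_FACTOR = 1.1

def correct_gauge(measured_mm, is_snow):
    """Scale a gauge measurement (mm) by the average factor for its precipitation type."""
    factor = SNOW_FACTOR if is_snow else RAIN_FACTOR
    return measured_mm * factor

# Assumption: the 3.16 mm winter average is uncorrected and falls entirely as snow
winter_measured = 3.16
winter_corrected = correct_gauge(winter_measured, is_snow=True)
undercatch = winter_corrected - winter_measured
print(round(winter_corrected, 3), round(undercatch, 3))
```

Under these assumptions the corrected winter total is about 30% higher than the gauge value, so part of the apparent winter overestimation by the products could stem from gauge undercatch rather than from the products themselves.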

4.3 Uncertainty of evaluation

Uncertainty in the evaluation can be attributed to several factors, such as the effects of latitude and longitude on the precipitation gradient of the precipitation products, the different resolutions of the original precipitation products, and the interpolation method used in preprocessing.

One previous study (Qi et al. 2013) suggests that the precipitation at different stations is likely affected by longitude and latitude due to the large spatial range of site locations. To examine the effect of longitude, we fitted a linear regression of measured annual precipitation against longitude (Fig. 8). The R between annual precipitation and longitude is 0.98, indicating that annual precipitation decreases from east to west, which is consistent with the trend of precipitation in China. The relationships between the annual precipitation estimates of the nine precipitation products and longitude (Fig. 9) imply that nearly all products agree well with observations, except CMORPH, which has a negative R. The ranges of latitude and longitude are only 2.3° and 1.6°, respectively, suggesting that the effect of longitude on precipitation is limited. The results indicate that the relationship between precipitation and longitude has little effect on the performances of the different products.

Fig. 8 The relationship between observed annual precipitation and longitude of the Upper Reach of Shule River Basin and surrounding areas

Fig. 9 The relationship between annual precipitation of nine precipitation products including (a) CMFD, (b) CMORPH, (c) ERA-Interim, (d) GPCC, (e) GPCP V2.3, (f) JRA-55, (g) TRMM 3B43, (h) HAR, (i) CAPD, and longitude of the Upper Reach of Shule River Basin and surrounding areas

Altitude errors in the DEM most likely caused some deviation in the estimated precipitation gradient. The elevation of the resampled 0.25° DEM grid can differ considerably from the real altitude of the stations, especially at peaks and in valleys in alpine regions. The six stations in this study were mainly located on a plateau plane; thus, the impact on the evaluation was relatively small.

The original spatial resolution of the nine precipitation products varied greatly, from 1 km to 2.5°, and 2.5° is about 300 km in the URSRB. The spatial resolution of the evaluation is 0.25°, which probably lost some detail of the precipitation products with high original resolution, such as CAPD. CAPD, which has an original resolution of 1 km, performed only moderately well on monthly, seasonal, and annual scales but reproduced the precipitation gradient very well. The resolution difference may also explain the relatively poor performance of GPCP V2.3, which has a coarse original resolution of 2.5°. In addition, the interpolation method used to bring all products to the same spatial resolution of 0.25° may have introduced large uncertainty into the evaluations.
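The resolution effect can be illustrated with a minimal regridding sketch. The interpolation method actually used in preprocessing is not specified here, so the two helpers below (nearest-neighbour replication for coarse products and block averaging for fine products) are assumptions chosen for simplicity, and the grid values are hypothetical.

```python
import numpy as np

def replicate_to_finer(coarse, factor):
    """Nearest-neighbour refinement: copy each coarse cell into factor x factor fine cells."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

def block_mean(fine, factor):
    """Aggregate a fine grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = fine.shape
    # Drop edge rows/columns that do not fill a complete block
    trimmed = fine[: rows - rows % factor, : cols - cols % factor]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Two hypothetical 2.5-degree cells (mm); 2.5 / 0.25 = 10 fine cells per side
coarse = np.array([[100.0, 200.0]])
fine = replicate_to_finer(coarse, 10)

print(fine.shape)                 # grid dimensions after refinement
print(np.unique(fine[:, :10]))    # distinct values inside the first coarse cell
print(block_mean(fine, 10))       # averaging back recovers the coarse values
```

Refining a 2.5° product to 0.25° in this way adds no information: all 100 fine cells within one coarse cell share a single value, so stations at very different altitudes inside that cell receive the same estimate. Conversely, averaging a 1 km product up to 0.25° smooths out the local detail that such a product originally resolved.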

5 Conclusions

In this study, the CMFD, CMORPH, ERA-Interim, GPCC, GPCP V2.3, JRA-55, TRMM 3B43, HAR, and CAPD precipitation products were evaluated against ground-based measurement data from 2001 to 2013 in the URSRB. The following conclusions can be made:

  1. CMFD has the best performance when compared with the ground-based precipitation measurements on annual, seasonal, and monthly scales, followed by GPCC and TRMM 3B43. CMORPH showed the worst performance, while ERA-Interim, GPCP V2.3, JRA-55, HAR, and CAPD have average performance in reproducing precipitation. The performance of the precipitation products is better on the annual scale than on the monthly scale. CMFD performed best in spring, while GPCC performed best in autumn, and TRMM 3B43 performed best in winter. All products simulated summer rainfall better than rainfall in other seasons.

  2. All precipitation products, except CMORPH, performed well for the monthly distribution at each site. HAR had the best monthly distribution pattern at each site, especially those at higher altitudes. However, HAR overestimated precipitation in each month.

  3. All precipitation products, except CMORPH, have a clear precipitation gradient during the rainy period. The precipitation gradients of GPCC, JRA-55, and CAPD were similar to observations. Although the gradient of HAR is about twice the observed absolute precipitation gradient (mm/100 m), its relative precipitation gradient (%/100 m) was very close to the observation, indicating that it has some advantages for application on the Tibetan Plateau.

Some uncertainties remain in the evaluations due to limited precipitation observations in alpine regions. This study provides some insight into the spatial–temporal characteristics of different precipitation products in the URSRB, which helps to better understand climate change and hydrological processes in the Tibetan Plateau. Because precipitation products offer a simple and quick way to obtain the spatial distribution of precipitation, their further assessment is meaningful, ongoing work for studying climate change and its impact on hydrological processes in the Tibetan Plateau, and it is also needed for other areas with limited resources and stations.