1 Introduction

Gridded precipitation products, developed using spatial interpolation methods with data from multiple sources such as rain gauges, weather radars and Earth-observation satellites, are vital for distributed hydrologic modelling studies (Hossain and Huffman 2008; Wu et al. 2014; Shah and Mishra 2016; Sun et al. 2018). Evaluation and bias correction of gridded precipitation data are two critical tasks that precede the use of these datasets in distributed hydrologic modelling applications. Several studies were conducted in the last decade to validate satellite rainfall estimates (SREs) across the globe through various approaches, such as grid to grid comparison (Dinku et al. 2007; AghaKouchak et al. 2012; Gosset et al. 2013; Prakash et al. 2015a, b), grid to point comparison (Scheel et al. 2011) and comparison at watershed or regional scales (Bitew and Gebremichael 2011; Thiemig et al. 2012; Beria et al. 2017). Dinku et al. (2007) investigated the challenges posed in SREs over two climatic regions, the mountainous and arid regions of East Africa, and noticed that the satellite products underestimated rainfall in the mountainous regions and overestimated rainfall in the arid region. Maggioni et al. (2016) presented a synthesis of findings of validation studies of contemporary SREs from different multi-satellite precipitation products during the TRMM era (1998–2015) across various regions in the world, and reported that TMPA-3B42, due to gauge adjustment over the land, is generally superior to other concurrent SREs in terms of continuous and categorical error metrics over the tropics.

Prakash et al. (2015b) performed an error characterization of the near-real-time (3B42RT) and research (3B42V7) products of the TMPA precipitation estimates over India using gauge-based gridded precipitation data from the India Meteorological Department (IMD) for a period of 13 years (2001–2013) and found that both datasets represent the mean seasonal rainfall characteristics reasonably well. However, both products overestimate rainfall over most parts of the country except over the orographic regions, with 3B42V7 having rather less error. Although the primary source of gauge-based gridded precipitation data in India is IMD, a few studies have also assessed the secondary sources of precipitation estimates from satellites (e.g. TRMM and Global Precipitation Measurement (GPM)) in complex terrain due to their easy accessibility and uniform spatial coverage (Bharti and Singh 2015; Bhardwaj et al. 2017). There are primarily two approaches used in the assessment of satellite rainfall products: (i) evaluation using a reference dataset through a wide range of statistical scores and (ii) evaluation through their use in a specific application. The second approach has also been investigated in India for applications in crop modelling, namely the simulation of biophysical parameters such as leaf area index (LAI) and biomass and the estimation of crop yield in comparison with gauge data (Sreelash et al. 2013), and for vertical soil moisture profile simulation using HYDRUS 1D (Gupta et al. 2014), and was found to be promising. Beria et al. (2017) evaluated TMPA-3B42V7 and the Global Precipitation Measurement (GPM)-based multi-satellite precipitation product (i.e. IMERG) at daily scale over 86 basins in India and found that IMERG performed better than 3B42V7 over most of the basins, with overestimation in semi-arid regions. However, the improvement of IMERG over 3B42V7 did not translate into improved runoff simulation. A detailed review of the various evaluation studies of multi-satellite precipitation products over India was recently provided by Prakash et al. (2018).

Recently, several studies (Nastos et al. 2013; Lockhoff et al. 2014; Mehran and AghaKouchak 2014) have focused on assessing the capability of SREs in identifying extreme precipitation. However, studies over India that evaluate extreme precipitation estimates using the corresponding extreme precipitation indices are rarely reported in the literature (e.g. Prakash et al. 2016). Such studies are critical to understand the applicability of SREs as inputs to simulation models dealing with flash floods, streamflow and groundwater. Furthermore, most of the studies have recommended two approaches to alleviate the problems in SREs before their use in hydrological applications. The first approach is local calibration using the local meteorological and physical conditions (topography) of the region, and the second is a suitable region-specific and season-dependent bias correction. However, local calibration of satellite precipitation products is quite cumbersome, which leaves bias correction of the satellite rainfall as the practical alternative (Dinku et al. 2007; AghaKouchak et al. 2012). Therefore, many studies (Bitew et al. 2012; Yong et al. 2012; Habib et al. 2014; Yang et al. 2016; Yuan et al. 2017) have adopted bias correction of SREs to make them suitable for hydrological applications. In this paper, the term “rainfall” is used interchangeably with the term “precipitation.”

SREs have been improved by developing an empirical relationship between the SREs and rain gauges, and further incorporating the geographic location and topographic variables (Yin et al. 2008; Cheema and Bastiaanssen 2012). Abera et al. (2016) evaluated and bias-corrected five daily satellite rainfall products using quantile matching in the Upper Blue Nile basin, Ethiopia, and found that the bias decreased. Vernimmen et al. (2012) used a power-law based bias correction of TMPA-3B42RT for its application in drought monitoring in Indonesia. Woldemeskel et al. (2013) developed a combined rain gauge and SRE product over Australia using a linearised weighting procedure considering the error variances of each dataset. Mitra et al. (2009) merged TMPA-3B42V7 with daily rain gauge data to generate a 1° × 1° gridded rainfall product over the Indian region for verification of large-scale monsoon precipitation features in numerical weather forecast models. However, the approach only corrects the mean bias in the SREs. Several studies reported that distribution-based bias correction techniques, which have gained prominence, are consistently better than mean-based methods for climate projections and hydrological simulations (Chen et al. 2013; Teutschbein and Seibert 2013). Bias adjustment of satellite rainfall has been carried out considering both frequency and intensity bias in the rainfall. Tobin and Bennett (2010) developed a methodology to adjust the false alarms and missed rainfall in the SREs and found encouraging results in the simulation of streamflow at daily time scale.

The Kabini River (KR) basin is characterized by highly heterogeneous topography and land use/land cover coupled with large-scale hydro-climatological variability that essentially depends on precipitation as a source of hydrological input. It is an ideal testbed for diverse research studies related to agro-hydrological, remote sensing and hydrological investigations (Kumar et al. 2009). Several studies in the above themes have been carried out in the KR basin and sub-basins of KR (Soumya et al. 2013). The basin is prone to high flood risk due to depleting forest cover and also medium to high groundwater risk due to excessive pumping in the downstream of the basin. The large-scale spatial and temporal variation of precipitation over the KR basin is often not well represented by ground-based measurements of hydro-climatological variables, particularly because of the lack of dense monitoring networks and thick forest cover. Remote sensing of rainfall has immense potential to improve the modelling of the spatio-temporal variability of hydrological variables in the KR basin.

The IMD gridded rainfall data (Rajeevan et al. 2006; Pai et al. 2014), developed using rain gauge observations from a countrywide network of gauges, are used as a standard benchmark for evaluation of satellite-based precipitation products in India. However, the distribution of gauges is not uniform across the country. In addition to the IMD gauges, there is a rather dense network of independent gauges maintained by local agencies in and around the KR basin. It is to be noted that these gauge observations were used neither in the IMD gauge-only gridded precipitation product nor in TMPA-3B42. Hence, the TMPA-3B42 and IMD gauge-only gridded precipitation products were validated against the local rain gauges across the KR basin in this study. Although several studies have been carried out to evaluate precipitation products over India using the gridded IMD rainfall as a reference, this study is a more comprehensive evaluation of both IMD and TRMM gridded precipitation products at basin scale, with a suitable bias correction procedure using an independent basin-specific gauge network.

The objective of the present study is a comprehensive evaluation of the IMD gauge-only and TMPA-3B42 precipitation products over the environmentally sensitive KR basin, with a focus on the computation of extreme precipitation indices using a predominantly independent gauge network. To alleviate the overestimation of dry days in the TMPA-3B42 estimates, a dry-day correction procedure is applied using gauge observations. Furthermore, a quantile-based correction is also applied to both IMD and TMPA gridded precipitation products. The contents of the paper are organized as follows. The description of the study area and details of the precipitation datasets are provided in Section 2. The evaluation metrics and methods for bias correction are presented in Section 3. Results are presented and discussed in Section 4, followed by conclusions in Section 5.

2 Study area and precipitation datasets

2.1 Kabini River basin

The Kabini River (KR) basin is located on the leeward side of the Western Ghats in southern Peninsular India and lies between 11° 30′ 9″ N and 12° 21′ 22.68″ N latitude and 75° 47′ 25.44″ E and 76° 54′ 37.44″ E longitude, with a distinct climatic and geomorphologic gradient (Fig. 1). The west-east geomorphologic gradient is due to the climatic gradient induced by the Western Ghats, which run parallel to the west coast and act as a barrier to the southwest monsoon winds. Due to its unique characteristics, the KR basin has been designated as a critical zone observatory (CZO), which facilitates multidisciplinary studies related to hydrology, geochemistry, soil science, agronomy, remote sensing and ecology (Sekhar et al. 2016). The area of the basin is approximately 7000 km², and the elevation ranges from 500 to 2000 m above mean sea level. There is a high gradient in the annual precipitation, varying from 800 mm in the east to 5000 mm in the west of the KR basin. The temporal variability in precipitation is also quite large over the KR basin. The basin falls into two different climatic zones, tropical monsoon and tropical savannah, according to the Köppen-Geiger classification (Kottek et al. 2006), interspersed with forest cover (refer to Fig. 1a). The major crops in the region are plantations such as tea, coffee, pepper and cardamom in the low hills, while the valleys are predominantly paddy fields. High rainfall in the humid zone reduces the dependency on groundwater, and hence there is almost no pumping. Conjunctive use of surface water and groundwater in the sub-humid zone eventually demands pumping of groundwater. Hence, the areas downstream of the two major dams (Kabini dam and Krishna Raja Sagara dam) are prone to more pumping than the rest of the sub-humid zone. Groundwater pumping is mainly used for irrigation of crops in the southern parts of both the sub-humid and semi-arid zones. Traditionally, crops are grown during the Kharif season (i.e. southwest monsoon season) in most of these zones. However, these crops have been irrigated using groundwater during the last decade. Parts of these zones cultivate either a second crop in the non-monsoon season (i.e. Rabi season) or year-long crops such as sugarcane and turmeric. Paddy is grown in the command areas of tanks and canals.

Fig. 1

a Kabini River basin with the land use/land cover map overlaid with the TRMM grids and the spatial assignment of grid numbers. b Elevation map with Köppen-Geiger climate classes and superimposed rain gauge network

2.2 Rain gauge data

Precipitation data for the study are based on telemetric rain gauge (TRG) data available at Hobli level (a Hobli is a cluster of villages covering approximately 250 km²) over Karnataka state from the Karnataka State Natural Disaster Monitoring Centre (KSNDMC), an autonomous body affiliated to the Department of Science and Technology (DST) of the Government of Karnataka. The TRGs have been operational since 2009. The rainfall data are collected every 15 min. There are 57 gauges in and around the KR basin, with 53 TRGs from KSNDMC in Karnataka state and four manual rain gauges operated by the IMD in Kerala state (refer to Fig. 1b). The gauge identification numbers of the TRGs shown in Fig. 1b are issued by the KSNDMC.

2.3 Gridded precipitation data

Two gridded precipitation datasets, one based on gauge-only observations and one satellite-based, are used and evaluated in this study. The gridded precipitation dataset based on gauge-only observations is from the IMD. This gauge-only daily gridded precipitation product, available at 0.25° spatial resolution, has been generated over the Indian land area from 1901 to 2016 using a varying rain gauge network of 6995 stations by applying Shepard’s interpolation method (Pai et al. 2014). This dataset has been widely used as reference precipitation data for the evaluation of satellite-derived and numerical model precipitation, and in various hydro-meteorological applications in India. Although the rain gauge network is quite dense over southern peninsular India, there seems to be a drastic decrease in the number of rain gauges from 2009 onwards (Pai et al. 2014; Beria et al. 2017). The IMD reports daily rainfall as the total rainfall accumulated over the preceding 24 h ending at 08:30 a.m. Indian Standard Time (IST) (i.e. 03:00 UTC) on the recording date of the measurement.

The other gridded precipitation product used and evaluated in this study is the TMPA-3B42 version 7 research product, which is a gauge-adjusted multi-satellite precipitation estimate (Huffman et al. 2007, 2010). This three-hourly multi-satellite precipitation product provides quasi-global quantitative precipitation estimates at 0.25° spatial resolution from 50° N to 50° S. The TMPA-3B42 precipitation estimates are produced in four stages (Huffman et al. 2010): (i) the microwave precipitation estimates are calibrated and combined, (ii) the infrared precipitation estimates are created using the calibrated microwave precipitation, (iii) the microwave and infrared estimates are combined and (iv) rescaling to monthly data is applied. The rescaling or bias correction is carried out with the Global Precipitation Climatology Centre (GPCC) gauge analyses to enhance calibration. Over India, the number of gauges used in the bias adjustment of the TMPA is about 260 (Prakash et al. 2015a). After the degradation of the TRMM precipitation radar in October 2014, a climatological calibration procedure has been used. TMPA-3B42 estimates have been found to be superior to other contemporary multi-satellite precipitation estimates across India as well as over the globe (Maggioni et al. 2016; Prakash et al. 2018). We used three-hourly precipitation estimates from TMPA-3B42 for 2009 to 2016 in this study. Even though the GPM-based multi-satellite precipitation product (i.e. IMERG) is available from March 2014 at a better spatial resolution of 0.1°, we have used the TMPA product in this study due to its longer temporal record and its availability from the time the TRGs became operational in the KR basin.

3 Methodology

3.1 Metrics for evaluation

As TRGs and IMD accumulate daily rainfall ending at 0300 UTC, the same convention for the computation of daily accumulation from three-hourly TMPA product is utilized for the assessment. The grid centres of TRMM and IMD do not match. There is a shift of 0.125° between the grid centres. To maintain the spatial homogeneity between the datasets, the IMD gridded rainfall is re-sampled to match TRMM grid centres using the linear interpolation technique. The grid to point comparison is done with the co-located rain gauges in the grid, and when more than one gauge is available a mean value of all available observed rainfall values is used. The evaluation is done from 2009 to 2016 due to the availability of TRGs since 2009. Gebremichael (2010) and Teegavarapu et al. (2017) have put forth a standard framework for the assessment of radar-based precipitation and SREs, respectively, using ground-truth data (e.g. rain gauge data) as a reference. The framework utilizes three types of analyses, and they are (i) categorical verification statistics, (ii) continuous verification statistics and (iii) extreme indices. Apart from these three major analyses, visual verification methods can also help in the validation exercise of precipitation products.
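The daily accumulation convention described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each three-hourly value is already an accumulated depth in mm labelled by the end time (UTC) of its interval; the conversion from the native TMPA rain rates and the function name `daily_totals_ending_0300` are not taken from the study.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_totals_ending_0300(records):
    """Sum three-hourly amounts into 24 h totals ending at 03:00 UTC.

    records: iterable of (end_time_utc, amount_mm) pairs, where end_time_utc
    is the end of the three-hour interval (an assumption of this sketch).
    Returns a dict mapping the recording date to the total in mm.
    """
    totals = defaultdict(float)
    for end_time, amount in records:
        # An interval ending in (03:00 of day D-1, 03:00 of day D] is
        # assigned to recording date D, matching the 08:30 IST convention.
        shifted = end_time - timedelta(hours=3, microseconds=1)
        recording_date = shifted.date() + timedelta(days=1)
        totals[recording_date] += amount
    return dict(totals)


# Eight intervals ending 06:00 on 1 June ... 03:00 on 2 June form one day.
sample = [(datetime(2009, 6, 1, 6) + timedelta(hours=3 * k), 1.0)
          for k in range(8)]
print(daily_totals_ending_0300(sample))
```

Assigning each interval by its end time keeps the satellite totals aligned with the gauge recording date, so that grid and gauge values for the same date cover the same 24 h window.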

(i) Categorical verification statistics

The categorical metrics are used in this study to measure the correspondence between the observed and the gridded (i.e. satellite-based or gauge-based) precipitation. The volumetric indices, which are an extension to the categorical indices proposed by AghaKouchak and Mehran (2013), were also computed. IMD uses a threshold of 2.5 mm to define a rainy day, and this threshold was adopted in this study. The verification measures are described in Table 1.

(ii) Continuous verification statistics

The continuous verification statistics measure the accuracy of gridded precipitation products vis-à-vis precipitation amount or intensity. Standard measures such as Pearson’s correlation coefficient (r), root mean squared error (RMSE), bias (β) and mean absolute error (MAE), defined in Table 2, are used to quantify the errors and performance of the gridded rainfall estimates relative to the rain gauge observations.

(iii) Extreme precipitation indices

The categorical and continuous verification statistics do not evaluate how well the SREs resolve extreme event-related precipitation magnitudes. To appraise how well the TMPA and IMD gridded products estimate the extreme values observed by the gauges, 12 standard extreme indices introduced by the World Climate Research Program (WCRP) on Climate and Ocean: Variability, Predictability and Change (CLIVAR) are used (Karl et al. 1999) in this study. Table 3 provides details of the twelve extreme precipitation indices used in this study.

Table 1 Categorical verification statistics
Table 2 Continuous verification statistics with their respective equations and units
Table 3 Definition of different precipitation indices used in this study
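As an illustration of the categorical and continuous statistics, the following Python sketch uses the textbook contingency-table definitions of POD, FAR, MR and CSI together with RMSE, MAE and Pearson's r. Tables 1 and 2 are not reproduced here, so the exact formulas, the use of "greater than or equal to" for the 2.5 mm threshold, and the function names are assumptions; the bias (β) is omitted because its definition is not given in the text.

```python
import math


def categorical_scores(obs, est, threshold=2.5):
    """Contingency-table scores for rainy-day detection (assumed definitions)."""
    hits = misses = false_alarms = 0
    for o, e in zip(obs, est):
        obs_rain, est_rain = o >= threshold, e >= threshold
        if obs_rain and est_rain:
            hits += 1
        elif obs_rain:
            misses += 1
        elif est_rain:
            false_alarms += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "POD": ratio(hits, hits + misses),                 # probability of detection
        "FAR": ratio(false_alarms, hits + false_alarms),   # false alarm ratio
        "MR": ratio(misses, hits + misses),                # miss ratio
        "CSI": ratio(hits, hits + misses + false_alarms),  # critical success index
    }


def continuous_scores(obs, est):
    """RMSE, MAE and Pearson's r between paired observed and estimated values."""
    n = len(obs)
    errors = [e - o for o, e in zip(obs, est)]
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mae = sum(abs(d) for d in errors) / n
    mean_o, mean_e = sum(obs) / n, sum(est) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    r = cov / math.sqrt(var_o * var_e)
    return {"RMSE": rmse, "MAE": mae, "r": r}


gauge = [0.0, 3.0, 5.0, 0.0, 10.0, 1.0]
grid = [0.0, 4.0, 0.0, 6.0, 12.0, 3.0]
print(categorical_scores(gauge, grid))
print(continuous_scores(gauge, grid))
```

Days on which both series fall below the threshold contribute to none of the four categorical scores, which is why these scores are insensitive to the large number of correctly identified dry days.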

3.2 Bias corrections

Following the evaluation, the bias in the TMPA-based product is corrected in two steps: (i) dry-day correction and (ii) frequency and magnitude correction using the quantile-based approach. The details of these approaches are described in the following sub-sections.

3.2.1 Method for dry-day correction

Teegavarapu et al. (2009) proposed a correction technique for the lower end extremes of precipitation data by using the nearest gauge as the best estimator (referred to as single best estimator) to set it as a dry day in spatial interpolation estimate. In this study, a variant was used to correct the dry days wrongly constructed in the gridded data using the rainfall recorded in the grid or multiple grids around the grid by one or more rain gauges. The hypothesis is that if all the stations report zero rainfall, then observation in the grid in contention can be equated to zero rain in any given time interval (i.e. day). Dry-day corrections are implemented whenever a specific gridded product is known to be underestimating the number of dry days. The threshold used for dry is 0 mm of precipitation on any given day. Dry-day correction for any data value from a specific grid (referred to as base grid) is implemented using information from a nearby rain gauge or a set of rain gauges from the centre of the grid or information of rainfall magnitudes from one or more grids surrounding the base grid. Precipitation information for the surrounding grids is obtained using a separate gridded precipitation product available for these grids. The revised value for the base grid magnitude in any given time interval t using dry-day correction is given by Eq. 1. The variable θm, t is the revised value of the base grid (m) and θn, t is the magnitude of precipitation at the gauge nearest to the centre of the base grid.

$$ {\theta}_{m,t}={\theta}_{n,t}\kern1em \mathrm{if}\kern0.5em {\theta}_{n,t}=0\kern2.75em \forall m,n,t\kern1em \wedge \kern0.5em m\in n $$
(1)

Dry-day correction based on surrounding grids, rook neighbourhood (Lloyd 2010) is given by Eq. 2 and Eq. 3. The variable Sm, t is the accumulated value of rainfall based on the nearby ith grid values θi, t.

$$ {S}_{m,t}={\sum}_{i=1}^{N_G}{\theta}_{i,t}\kern1em \forall i,\forall t $$
(2)

where NG is the number of surrounding grids and θm, t is the revised rainfall estimate for the base grid.

$$ {\theta}_{m,t}={S}_{m,t}\kern1em \mathrm{if}\kern0.5em {S}_{m,t}=0\kern4.25em \forall i,\forall t $$
(3)

A schematic illustrating the different dry-day corrections is shown in Fig. 2. However, in the variant proposed, as the interpolated rainfall is based on the neighbourhood, this might lead to overcorrection of the dry days in a few cases. Therefore, another variant is also proposed in this study using the gridded precipitation data. Figure 2b shows the schematic for dry-day correction at the boundary (or corner) of the study region, where surrounding grids might not exist in all directions. Figure 2c demonstrates the correction for the grids in the centre of the study region, which have eight surrounding grids that can aid the dry-day correction.

Fig. 2

Schematic of dry-day correction for TRMM data using the neighbourhood. a Single grid-based correction. b Multiple grid-based corrections (case 1) with the base grid at the corner of the study area. c Multiple grid-based corrections (case 2) with the base grid at the centre of the study area
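The two dry-day corrections of Eqs. 1 to 3 can be sketched in Python as follows. The sketch reads the equations as "set the base grid to zero when the condition holds, otherwise leave it unchanged"; the function names and the list-based data layout are illustrative assumptions, and the choice of which surrounding grids to pass in is left to the caller.

```python
def dry_day_correct_nearest(grid_series, gauge_series):
    """Eq. 1: zero the base grid on days when the nearest gauge reports 0 mm."""
    return [0.0 if gauge == 0 else value
            for value, gauge in zip(grid_series, gauge_series)]


def dry_day_correct_neighbours(base_series, neighbour_series):
    """Eqs. 2 and 3: zero the base grid on days when the surrounding grids sum to 0 mm.

    neighbour_series: list of daily series, one per surrounding grid, taken
    from a separate gridded product.
    """
    corrected = []
    for t, value in enumerate(base_series):
        s_mt = sum(series[t] for series in neighbour_series)  # Eq. 2
        corrected.append(0.0 if s_mt == 0 else value)         # Eq. 3
    return corrected


trmm = [1.2, 0.0, 3.4, 0.5]
gauge = [0.0, 0.0, 2.0, 0.0]
print(dry_day_correct_nearest(trmm, gauge))

neighbours = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0]]
print(dry_day_correct_neighbours([0.3, 0.4, 0.5, 0.0], neighbours))
```

For a base grid at the corner or boundary of the study region, the same function applies with a shorter list of surrounding grids.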

3.2.2 Bias corrections using quantile matching

The quantile-based mapping (QM) method used for correcting the biases in downscaled precipitation datasets obtained from the general circulation model simulations is used in this study to correct biases in the gridded precipitation data (Panofsky and Brier 1968; Maurer and Hidalgo 2008). The QM method adjusts all the moments of the estimated data. The availability of stationary precipitation time series from rain gauges is required for bias corrections. The method uses the observed cumulative distribution function (CDF) of data from rain gauges to correct gridded precipitation data with an assumption that the distribution characteristics of data do not change in the period of consideration. The bias correction is given by Eq. 4.

$$ {\beta}_i^c={F}_{ob}^{-1}\left({F}_{gd}\left({\beta}_i^{gd}\right)\right)\forall i $$
(4)

where Fob is the CDF of the observed precipitation data derived from the rain gauge and Fgd is the CDF derived from the gridded precipitation data. The variable \( {\beta}_i^c \) is the bias-corrected gridded precipitation value for any time interval i, obtained in two steps: (a) the gridded precipitation values are used to develop a CDF, and the non-exceedance probability \( {F}_{gd}\left({\beta}_i^{gd}\right) \) is obtained for each value of \( {\beta}_i^{gd} \), and (b) the corrected estimate (\( {\beta}_i^c \)) is obtained using the inverse of the observed CDF for the non-exceedance probability obtained in the first step. To correct the gridded precipitation, data available from the collocated rain gauge or the rain gauge nearest to the grid centre are used. The accuracy of the gridded precipitation corrections will depend on the existence of serially complete chronological data from rain gauges and evidence of stationarity of the precipitation time series over the period under consideration. The QM method can be used for bias correction of both daily and monthly gridded precipitation datasets.
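The two steps of Eq. 4 can be sketched in Python with empirical (step-function) CDFs. Whether the study used empirical or fitted distributions is not stated, so the empirical choice, the function name `quantile_map` and the handling of ties are assumptions of this sketch.

```python
from bisect import bisect_right


def quantile_map(gridded, observed):
    """Map each gridded value to the observed quantile with the same
    non-exceedance probability, using empirical CDFs."""
    gd_sorted = sorted(gridded)
    ob_sorted = sorted(observed)
    n_gd, n_ob = len(gd_sorted), len(ob_sorted)
    corrected = []
    for value in gridded:
        # Step (a): F_gd(value) = rank / n_gd, with rank = count of values <= value.
        rank = bisect_right(gd_sorted, value)
        # Step (b): inverse observed CDF, i.e. index ceil(F_gd * n_ob) - 1,
        # computed in integer arithmetic to avoid floating-point rounding.
        idx = -(-rank * n_ob // n_gd) - 1
        corrected.append(ob_sorted[idx])
    return corrected


gridded_monthly = [10.0, 0.0, 30.0, 20.0]
gauge_monthly = [5.0, 1.0, 12.0, 40.0]
print(quantile_map(gridded_monthly, gauge_monthly))
```

With equal sample sizes and no ties, the corrected series reproduces the observed distribution exactly while preserving the temporal ordering of the gridded series. Tied gridded values (for example, many zero-rainfall days) are all mapped to the upper end of their tie in this sketch.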

3.3 Evaluation of stationarity of precipitation time series

The stationarity of the rain gauge data is confirmed by the absence of any statistically significant trends and changes in the first two statistical moments. Two nonparametric trend tests (viz. Spearman’s Rho and Mann-Kendall tests) are used to evaluate the monthly time series for any statistically significant trends in this study. As seasonality was evident from visual evaluation of the monthly time series, a seasonal Mann-Kendall test that accounts for seasonality and autocorrelation was used. In addition to the trend tests, the augmented Dickey-Fuller (ADF) test (Dickey and Fuller 1979) was also used to assess the stationarity of the precipitation time series. All the hypothesis tests were conducted at the 5% significance level.
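For orientation, a basic Mann-Kendall test can be sketched in Python as below. This is the plain, non-seasonal form with a tie correction and a normal approximation; it does not include the seasonal grouping or the autocorrelation adjustment of the variant used in this study, and the function name is illustrative.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the basic Mann-Kendall trend test."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference

    # Variance of S with the standard correction for tied values.
    counts = {}
    for value in series:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(c * (c - 1) * (2 * c + 5) for c in counts.values() if c > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


s, z, p = mann_kendall([12.0, 15.0, 11.0, 18.0, 20.0, 19.0, 25.0, 24.0, 30.0, 28.0])
print(s, round(z, 3), p < 0.05)
```

A seasonal version would compute S and its variance separately for each calendar month and sum them before forming Z, so that the annual cycle itself is not mistaken for a trend.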

4 Results and discussions

An exhaustive evaluation of two gridded precipitation products (i.e. IMD and TRMM) against independent gauge observations is carried out at two temporal scales (viz., daily and monthly); the daily results are discussed first, followed by the monthly analyses. Twenty grids in and around the KR basin are chosen for evaluation in the current study and are numbered from 1 to 20 (Fig. 1a), with 18 grids having one or more rain gauges. Rain gauge 2, which lies exactly on the edge between grids 11 and 12, is used as a common reference for both grids. Grids 16 and 20 have no gauges and are therefore not included in the evaluation. The IMD and TRMM products are evaluated with the gauge rainfall as a reference. The TMPA was also evaluated against the IMD gridded rainfall data as a reference.

4.1 Evaluation of daily datasets

The four statistical moments (viz., mean, variance, skewness and kurtosis) of daily rain gauge (RG) rainfall were compared against those of IMD and 3B42V7 rainfall at 18 grids. Table 4 shows the first and second moments and their absolute differences for each grid. The mean and variance of the 3B42V7 rainfall are high, with IMD rainfall closely matching the median. However, the median values of skewness and kurtosis of 3B42V7 closely match the RG values. The third and fourth moments are not shown in the table.

4.1.1 Rainy days

IMD uses a threshold of 2.5 mm of precipitation to classify a day as a rainy day. This threshold was used to compute the number of rainy days in a month across each year for 8 years. The number of rainy days is overestimated by TRMM in the semi-arid region in the monsoon season (JJAS) and is underestimated in the humid region of the KR basin. The month of July has the maximum number of rainy days in a year. In this month, the number of rainy days is underestimated in grids close to the Western Ghats (see Fig. 1b), and as we move eastwards, the number of rainy days is overestimated by TRMM. Figure 3 shows the number of rainy days in different months for two grids 6 and 15. There is clear disagreement among the three datasets in the number of rainy days in the monsoon season for the grids in the semi-arid region. This overestimation of dry-day rainfall by TRMM in June, July, August and September warrants a dry-day correction of TRMM.

Fig. 3

Number of rainy days at two locations

4.1.2 Categorical verification statistics

Figure 4 shows the categorical statistics with both RG and IMD as a reference. The average POD is around 0.63 with RG as a reference and 0.69 with IMD data as a reference. The POD values are smaller in grids 12 and 18, around 0.46 to 0.54. The FAR and MR measures are in the range of 0.46 to 0.55. The CSI is found to be lower in grids 12, 14, 18 and 19 using both gauges and IMD as a reference. Grids 12 and 18 are in the forested region. However, grids 14 and 19, which are cropped and lie in the semi-arid region, also have a low CSI. The performance is lower for grids 11 to 20. Figure 5 shows the volumetric metrics; the VHI values are high, ranging from 0.48 to 0.89. The VMI and VFAR are high in grid 18, at 0.51 and 0.42, respectively. Similar behaviour was also observed when the RG is used as a reference for evaluating IMD rainfall.

Fig. 4

Spatial distribution of categorical verification statistics POD, FAR, MR and CSI for daily time scale, with threshold of 2.5 mm for TMPA with station rainfall as reference (left column), TMPA with IMD as reference (centre column) and IMD with station rainfall as reference (right column)

Fig. 5

Spatial distribution of volumetric indices VHI, VFAR, VMI and VCSI for daily time scale, with a threshold of 2.5 mm for TMPA with station rainfall as a reference (left column), TMPA with IMD as a reference (centre column) and IMD with station rainfall as a reference (right column)

4.1.3 Extreme indices at daily scale

The extreme precipitation indices for all 18 grids combined are shown in Fig. 6. The comparison is carried out between the rain gauge, IMD and TRMM daily rainfall. IMD is able to preserve the CDD and CWD that were observed in the rain gauge data. In particular, the interquartile range (IQR) in TRMM is very small compared to the RG and IMD rainfall.

Fig. 6

Extreme precipitation indices based on two different rainfall products and gauge

4.2 Evaluation of monthly datasets

4.2.1 Autocorrelation

The monthly temporal autocorrelation was computed for lags ranging from 1 to 20 for RG, IMD and TRMM to check the persistence among the datasets. It was found that in most of the grids, the temporal autocorrelation of the rain gauge values is well captured by both IMD and TRMM. The autocorrelation value is close to 0.5 for lag 1 at almost all of the grids. Figure 7 shows the autocorrelation at six grid locations. The general behaviour differs by climate zone. In the grids closer to the Western Ghats, which are in the humid zone, there is very good agreement in the temporal autocorrelation, as seen in grids 1, 6 and 11 (Fig. 7). Farther from the Ghats, where the climate is semi-arid, a clear disagreement in the monthly autocorrelation values is observed, most visibly in grid 15 (Fig. 7f).

Fig. 7

Monthly autocorrelation for rain gauge, IMD and TRMM at six grid locations
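The lag-k autocorrelation underlying Fig. 7 can be sketched in Python as below. The estimator shown is the common sample autocorrelation normalised by the total sum of squares; the exact estimator used in the study is not stated, so this form and the function name are assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


# Hypothetical monthly totals (mm) with a repeating annual cycle.
monthly = [5, 3, 10, 40, 120, 300, 450, 380, 200, 150, 60, 15] * 8
acf = [autocorrelation(monthly, k) for k in range(1, 21)]
print([round(v, 2) for v in acf])
```

Applying the same function to the RG, IMD and TRMM monthly series of a grid gives three curves that can be compared lag by lag.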

4.2.2 Comparison of cumulative distribution functions

The cumulative distribution function (CDF) comparison is carried out using the rain gauge rainfall as a reference. Figure 8 shows the CDFs based on IMD, the satellite estimates and the rain gauges. The CDFs of IMD and TRMM agree well with that of the gauge in grids 4 and 6. However, the CDFs of IMD and TRMM deviate from the CDF of the rain gauge in grids 15 and 18, with the larger discrepancy observed for TRMM. The TRMM rainfall is highly overestimated in these grids.

Fig. 8

Comparison of non-exceedance probability of rain gauge, IMD and TRMM rainfall at four grids in the KR basin

4.3 Bias corrections

The results from the application of two approaches of correction of the gridded products using dry-day correction and QM are presented in this section.

Table 4 Comparison of daily rainfall moments of RG, IMD and TRMM

4.3.1 Dry-day corrections

The hypothesis is that a satellite has a wide swath and a synoptic view of the region, and that the SREs are averages over a large area and should therefore have more rainy days. However, it is observed that the satellite-based estimates in a few grids in the humid region of the KR basin have more non-rainy days than the RG data. Likely reasons are the low sampling frequency and the missing of low-intensity rainfall events. Therefore, in the present study, a dry-day correction is proposed using the RG or IMD data for the correction of TRMM. On the other hand, the IMD gridded data, developed by spatial interpolation, share the general limitation that any spatial interpolation method overestimates lower-end values and underestimates higher-end extreme values. As it is evident from the comparison of the number of rainy days for each grid that TRMM is overestimating the dry days, they are corrected using the gauges. The IMD rainfall is dry-day corrected using the RG data as a reference. However, the TRMM data can be dry-day corrected using either the rain gauges or the IMD gridded rainfall as a reference. Table 5 shows the number of dry days observed in TRMM and IMD.

Table 5 Comparison of performance measures before and after dry-day correction
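The dry-day diagnosis that motivates this correction can be sketched as follows. This is only an illustration of how dry days and missed rain events might be counted; it does not reproduce the correction itself. The wet-day threshold of 1 mm, the function names and the sample values are assumptions made for the example.

```python
def count_dry_days(daily_mm, wet_threshold_mm=1.0):
    """Number of days with rainfall below the (assumed) wet-day threshold."""
    return sum(1 for value in daily_mm if value < wet_threshold_mm)


def dry_day_excess(product_mm, reference_mm, wet_threshold_mm=1.0):
    """Dry days in the product minus dry days in the reference.

    A positive value means the product has too many dry days.
    """
    return (count_dry_days(product_mm, wet_threshold_mm)
            - count_dry_days(reference_mm, wet_threshold_mm))


def missed_wet_days(product_mm, reference_mm, wet_threshold_mm=1.0):
    """Indices of days that are wet in the reference but dry in the product."""
    return [
        day
        for day, (prod, ref) in enumerate(zip(product_mm, reference_mm))
        if ref >= wet_threshold_mm and prod < wet_threshold_mm
    ]


# Made-up daily series (mm); NOT data from the study.
gauge_mm = [0.0, 2.5, 0.4, 6.0, 1.5]
trmm_mm = [0.0, 0.0, 0.0, 7.2, 0.3]

print(dry_day_excess(trmm_mm, gauge_mm))   # surplus of dry days in the product
print(missed_wet_days(trmm_mm, gauge_mm))  # days a correction would target
```

The days returned by `missed_wet_days` are the candidates for a dry-day correction against the chosen reference (RG or IMD).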

4.3.2 Quantile matching-based corrections

The results of QM correction using the complete and partial RG datasets are presented in experiments 1, 2 and 3.

Experiment # 1: complete RG data as reference

Bias corrections at a monthly scale are carried out with the QM approach using data from the rain gauge collocated with a particular grid. The rain gauge closest to the grid centre, whether inside or outside the grid of interest, is selected for bias correction. Both IMD and TRMM datasets are corrected for 20 grids, and the rain gauge and gridded datasets are compared for distributional similarity before and after bias correction using a two-sample Kolmogorov-Smirnov (KS) test at the 5% significance level. The merged gauge time series, consisting of data collected from the Department of Economics and Statistics (DES) for the period 1998–2008 and the TRG data available from 2009 to 2016, is used for bias correction. Any year with missing daily data is not used for the calculation of monthly totals. The rain gauges have missing data in the years 2000, 2002, 2005 and 2009–2010, and therefore monthly data from these years are not used for bias correction. Results indicate that 12 and 16 sites out of 20 fail the KS test before correction for the IMD and TRMM datasets, respectively. After the bias correction, 3 and 12 sites fail the KS test, respectively. The improvement in matching the quantiles is therefore substantial for the IMD data (12 to 3 failing sites) and more modest for TRMM (16 to 12).
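The idea behind QM can be illustrated with a minimal empirical sketch: each value of the gridded product is replaced by the reference (gauge) quantile that has the same empirical probability. The rank-based plotting position, the linear interpolation, the function names and the sample values are assumptions of this example and may differ from the QM variant applied in the study.

```python
from bisect import bisect_right


def quantile(sorted_values, p):
    """Linearly interpolated quantile of pre-sorted data for 0 <= p <= 1."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def quantile_map(product, reference):
    """Map each product value to the reference quantile of equal probability."""
    prod_sorted = sorted(product)
    ref_sorted = sorted(reference)
    n = len(prod_sorted)
    corrected = []
    for value in product:
        # Rank of the value in the product sample, rescaled to [0, 1].
        rank = bisect_right(prod_sorted, value) - 1
        p = rank / (n - 1) if n > 1 else 0.0
        corrected.append(quantile(ref_sorted, p))
    return corrected


# Made-up monthly totals (mm); NOT data from the study.
product_mm = [10.0, 30.0, 20.0, 40.0]
gauge_mm = [5.0, 15.0, 25.0, 35.0]

print(quantile_map(product_mm, gauge_mm))
```

After mapping, the corrected series keeps the temporal ordering of the product but takes on the distribution of the reference, which is the property the KS test then checks.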

Stationarity check

Monthly data from 57 rain gauges are evaluated for statistically significant trends using the Mann-Kendall test that accounts for seasonality. Data from 15 and 3 sites indicate decreasing trends at the 5% and 1% significance levels, respectively. The augmented Dickey-Fuller (ADF) test results at all the sites favoured the alternative hypothesis at the 5% significance level, that is, the null hypothesis of a unit root was rejected, suggesting stationarity of the monthly time series. Figure 9 shows the results for two stations: TRG 227, where a negative trend is observed (Fig. 9a), and RG 3, where there is no trend (Fig. 9b).

Fig. 9 Results from Mann-Kendall tests and time series data for RG (a) 227 and (b) 3
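A seasonal Mann-Kendall test of the kind used here can be sketched as below. The sketch computes the Mann-Kendall statistic separately for each calendar month, sums the statistics and their tie-corrected variances, and applies the usual continuity correction. It assumes a complete monthly series that starts in the same calendar month every year and independence between months; the function names and the synthetic series are hypothetical.

```python
import math
from collections import Counter


def mk_s_and_variance(values):
    """Mann-Kendall S statistic and its tie-corrected variance."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    for t in Counter(values).values():  # t = size of each group of tied values
        variance -= t * (t - 1) * (2 * t + 5) / 18.0
    return s, variance


def seasonal_mann_kendall(monthly, period=12):
    """Return (S, Z, two-sided p-value) summed over the seasons."""
    s_total, var_total = 0, 0.0
    for season in range(period):
        s, variance = mk_s_and_variance(monthly[season::period])
        s_total += s
        var_total += variance
    if var_total == 0.0 or s_total == 0:
        return s_total, 0.0, 1.0
    # Continuity correction: shrink |S| by one before standardising.
    z = (s_total - 1) / math.sqrt(var_total) if s_total > 0 else (s_total + 1) / math.sqrt(var_total)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s_total, z, p_value


# Synthetic 10-year monthly series with a steady decline; NOT study data.
declining = [100.0 + month - 2.0 * year for year in range(10) for month in range(12)]
print(seasonal_mann_kendall(declining))
```

A negative Z with a p-value below the chosen significance level indicates a decreasing trend, as reported for TRG 227.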

Experiment # 2: partial RG data as reference

In this experiment, data from all the rain gauges located in a specific grid are used for bias correction. If a grid does not have a collocated rain gauge, the gauge nearest to the grid centre is used. In addition, only rain gauge data from the period 2009–2016 are used to correct the gridded datasets, in order to evaluate the improvement achieved when the reference is restricted to one temporal window within the entire period. Since the monthly time series at almost all sites are found to be stationary, using only part of the data for bias correction is justified. Results indicate that 12 and 16 sites fail the KS test before correction for the IMD and TRMM datasets, respectively. After bias correction, 8 and 12 sites (grids) fail the KS test for the IMD and TRMM datasets, respectively. Using the complete or the partial rain gauge data for bias correction therefore changes the results. For TRMM, the number of failing grids is the same (12) in both cases, but for IMD it differs markedly (3 with the complete data versus 8 with the partial data). Figure 10 shows the density plots at daily scale before and after bias correction at three grid locations. A substantial reduction in bias is observed in grid 15, where the bias-corrected density matches the rain gauge density. At the other two grids, 12 and 3, only a marginal improvement is seen after correction.

Fig. 10 KDEs of rain gauge, IMD and bias-corrected IMD (BIMD) at three grids in the KR basin: (a) Grid 3, (b) Grid 12 and (c) Grid 15
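Density curves such as those in Fig. 10 can be obtained with a kernel density estimate (KDE). The following is a minimal sketch of a Gaussian KDE; the normal-reference bandwidth rule, the function name and the sample values are assumptions of this example, since the kernel and bandwidth used for the figure are not stated here.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function that evaluates a Gaussian KDE of `sample`."""
    n = len(sample)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption of this sketch).
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-0.2)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


# Made-up daily rainfall (mm); NOT data from the study.
rain_mm = [0.5, 1.0, 2.0, 3.0, 4.0, 8.0]
density = gaussian_kde(rain_mm)
print(density(2.0), density(20.0))
```

Because rainfall is non-negative, a Gaussian kernel places some probability mass below zero; this affects the appearance of the curve near the origin but not the comparison between datasets evaluated in the same way.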

Experiment # 3: complete RG data as reference

This experiment is the same as experiment 1, except that the rain gauge data from the entire period 1998–2016 are used for bias correction. Results indicate that 12 and 16 sites fail the KS test before correction for the IMD and TRMM datasets, respectively. After bias correction, 8 and 14 sites fail the KS test for the IMD and TRMM datasets, respectively.
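The pass or fail counts reported for the three experiments rest on a two-sample KS test at the 5% significance level. A minimal sketch of such a test is given below. It uses the large-sample critical value rather than an exact small-sample distribution, which is an assumption of this example; the function names and the synthetic samples are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest vertical distance between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def ks_two_sample(a, b, alpha=0.05):
    """Return (rejected, D, critical value) using the asymptotic threshold."""
    d = ks_statistic(a, b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.358 for alpha = 0.05
    critical = c_alpha * math.sqrt((len(a) + len(b)) / (len(a) * len(b)))
    return d > critical, d, critical


# Synthetic samples; NOT data from the study.
reference = [float(x) for x in range(100)]
shifted = [x + 50.0 for x in reference]

print(ks_two_sample(reference, shifted))    # distributions differ
print(ks_two_sample(reference, reference))  # identical samples
```

A site "fails" the test when the first returned element is true, that is, when the gridded and gauge samples are judged to come from different distributions.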

5 Conclusions

A comprehensive study aimed at the evaluation and correction of biases in two gridded precipitation datasets for an environmentally sensitive river basin, Kabini, in South India is reported in this paper. Daily and monthly observations from the rain gauges in the basin are used to evaluate the two gridded products. Results indicate spatially varying biases in both gridded products; the product from the India Meteorological Department (IMD) was found to be better than the spatially comparable satellite-based product. Biases were more prevalent at the daily scale than at the monthly scale. Corrections using two different methods, dry-day correction based on the nearest-neighbour concept and the quantile matching approach, have helped to correct the biases. Further, the TRMM, IMD and bias-corrected products can be used as inputs to surface and ground water models to test the sensitivity of these models in replicating observed surface and sub-surface flow characteristics. Based on the analyses for the KR basin, we recommend the IMD gridded rainfall data for surface water modelling, where day-to-day variations in rainfall are important. At monthly time scales, both IMD and TRMM data are suitable for ground water modelling or water budgeting. However, at both daily and monthly time scales, it is recommended that the products be bias corrected before their use in any hydrologic modelling exercise. The current work has two limitations: first, the results could be biased because there is a dearth of gauges in the humid region; second, there is a scale mismatch, because the gridded precipitation has been compared with station rainfall. To address the scale mismatch, gridded precipitation data at 0.25° could be generated from the available rain gauge observations using Shepard’s method with modified neighbourhood selection (Yeggina et al. 2019).
The density of gauges could be improved in the region, where the installation of weather radar sites is difficult due to the topography of the KR basin; alternatively, new approaches that use commercial cellular communication networks (Leijnse et al. 2007) to evaluate the satellite rainfall could be tested in future research.