1 Introduction

The climate of South Asia is dominated by two monsoons. During the summer monsoon season (June–September, JJAS), also called the south-west monsoon (Kripalani and Kulkarni 1997; Prasanna and Yasunari 2009), most parts of India receive the major proportion (75–90 %) of their annual rainfall (Parthasarathy et al. 1987; Pant and Rupakumar 1997); the other is the north-east monsoon (Oct–Dec, OND) (Dhar and Rakhecha 1983; Prasanna and Yasunari 2008). Both monsoons are the lifeline for the billion people who inhabit the vast land mass of the subcontinent. Rainfall over the country exhibits large spatial as well as temporal variations. Each year the summer monsoon (JJAS) brings large amounts of rain to most parts of the country, while other parts may experience severe drought. The spatial asymmetry of the rainfall distribution depends on geographical conditions, with heavy rainfall activity along the orography of the Western Ghats and the mountains over the north-eastern part of the country. The spatial distribution of rainfall also varies considerably from day to day. Over most parts of the country rain occurs in spells under the influence of favourable circulation conditions.

Heavy rainfall events are generally associated with cyclonic disturbances such as depressions or cyclonic storms moving across the country. These disturbances form over the warm waters surrounding the country and usually have a lifetime of about 3–4 days. The frequent occurrence of heavy rainfall events can therefore affect the strength of the monsoon, which is very important for the economic development, disaster management and hydrological planning of the country. The country’s agriculture, power generation and industrial production depend substantially on the monsoon rains. Deficient and excess years of monsoon rainfall are associated with low and high food grain production years, respectively (Parthasarathy et al. 1992, 1993; Guharthakurta and Rajeevan 2008; Preethi and Revedekar 2012; Prasanna 2014).

Fluctuations of the Indian summer monsoon rainfall (ISMR) on different spatio-temporal scales (synoptic, intraseasonal, monthly and seasonal) have been studied extensively in India over the past several decades using observed rain gauge data from land stations (Mooley and Parthasarathy 1983, 1984; Hartmann and Michelsen 1989; Krishnamurti et al. 2009). After the introduction of geostationary satellites, rainfall estimates based on IR channel data have also been studied widely, and rainfall series on monthly and seasonal scales have been created for India following Arkin (1979). Several other studies have used Tropical Rainfall Measuring Mission (TRMM) 3B42 V6 data and several merged satellite datasets to evaluate the ISMR features in comparison with rain gauge observations on the daily scale (Mitra et al. 2009; Rahman et al. 2009; Prasanna et al. 2014). These studies show that the data products compare favourably with the observed daily rainfall on a real-time basis; however, their rainfall estimates are lower than the observed gauge rainfall over most parts of the country during the summer monsoon season. The hot spots of heavy monsoon rainfall over India have been identified to lie over the Western Ghats, Eastern Himalayas, Western Himalayas and on the southern flank of the monsoon trough (Goswami and Ramesh 2006).

Over the Indian region, many studies have analysed extensively the variation in summer monsoon rainfall (SMR) with the main emphasis on understanding the changes in the mean monsoon rainfall for the country as a whole as well as for different meteorological subdivisions of India (Mooley and Parthasarathy 1983, 1984; Parthasarathy 1984; Parthasarathy et al. 1993; Preethi et al. 2011; Prasanna et al. 2014). A study of extreme rainfall events over India (Goswami et al. 2006) revealed that there is an increasing trend in heavy rainfall activities during the monsoon season, while the frequency of moderate events shows a decreasing trend and therefore makes the seasonal rainfall trendless. In this paper 90 percentile (heavy rainfall), 50 percentile (moderate rainfall) and 20 percentile (light rainfall) categories are studied in detail.

The seasonal monsoon strength may depend upon the frequency and intensity of heavy rainfall events that occur during the monsoon season. Rainfall is an important parameter that varies continuously in both space and time and has not been well monitored in many areas. Traditionally, it is monitored using a network of rain gauges at point locations, but as rainfall is highly variable, the sampling approach may not always be suitable. In addition, obtaining recent data for a dense network is difficult in practice. However, progress in the use of satellites for rainfall monitoring provides an opportunity to use up-to-date information on rainfall at a high spatial resolution.

The present study is therefore focused on understanding the spatio-temporal distribution of heavy, moderate and light rainfall events during 1998–2007 using datasets based on rain gauges, satellites, merged products and reanalysis products. The different rainfall datasets are compared with the India Meteorological Department (IMD) gridded rainfall data at 1° latitude × 1° longitude resolution (Rajeevan et al. 2005). The other datasets used in the present study are Aphrodite (0.25° × 0.25°), TRMM-3B42 V6 and V7 (0.25° × 0.25°), CPC (0.1° × 0.1°), GPCP-1DD (1° × 1°) and the reanalyses ERA-INT (1.5° × 1.5°) and MERRA (1° × 1°).

Apart from the characteristics of intensity and frequency of rainfall, the active and break conditions revealed by these datasets are also examined and their discrepancies are discussed in detail for the data overlapping period of 2001–2007.

2 Data and method of analysis

2.1 Data

2.1.1 IMD gridded dataset

The daily gridded rainfall data with a resolution of 1° × 1° from the IMD (Rajeevan et al. 2005) are used in this study. IMD uses the Shepard (1968) interpolation technique for gridding data from individual stations; the interpolation weights are calculated from the distance between the station and the grid point and also account for directional effects. The IMD product uses gauge data from 1803 stations that have at least 90 % data availability (Rajeevan et al. 2006), interpolated to a 1° latitude × 1° longitude grid, to estimate the accumulated rainfall in the 24 h ending 08:30 IST (03:00 UTC). This dataset for the Indian region is available for the period 1951–2007 and is used extensively for research studies (Goswami et al. 2006; Krishnamurthy and Shukla 2008).
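The distance-weighting idea behind such gridding can be illustrated with a minimal sketch. This is not the IMD implementation: it omits the directional correction of the Shepard scheme, and the function name, the search radius and the power of 2 are assumptions made only for illustration.

```python
import math

def idw_estimate(grid_lat, grid_lon, stations, radius_deg=1.5, power=2.0):
    """Inverse-distance-weighted rainfall at one grid point.

    stations: list of (lat, lon, rain_mm) tuples.
    radius_deg and power are illustrative assumptions, not IMD settings.
    Returns None when no station lies within the search radius.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, rain in stations:
        # Planar distance in degrees; adequate for a sketch only.
        dist = math.hypot(lat - grid_lat, lon - grid_lon)
        if dist > radius_deg:
            continue
        if dist == 0.0:
            return rain  # station sits exactly on the grid point
        weight = 1.0 / dist ** power
        weighted_sum += weight * rain
        weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total
```

Two stations at equal distance from the grid point contribute equally, so the estimate reduces to their mean; nearer stations dominate as the distance ratio grows.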

2.1.2 Aphrodite precipitation dataset

The Asian Precipitation Highly Resolved Observational Data Integration Towards the Evaluation of Water Resources (Aphrodite; Kamiguchi et al. 2010), a rain gauge-based daily precipitation dataset, is also used for comparison in this study. Aphrodite uses a new weighted mean method based on Spheremap (Willmott et al. 1985) for gridding the station observations (Yatagai et al. 2009).

2.1.3 High-resolution CPC rainfall estimates

The CPC daily rainfall estimates based on satellite and rain gauge data for the South Asian region are used in the study. Data for South Asia are available for the region bounded by 70°E–100°E and 5°N–35°N. The dataset is available from 1 May 2001 at a high resolution of 0.1° latitude × 0.1° longitude. Inputs to this rainfall estimate include: (1) an analysis derived from Global Telecommunications System (GTS) gauge observations of daily precipitation; (2) the Geostationary Operational Environmental Satellite (GOES) Precipitation Index (GPI; Arkin and Meisner 1987), derived from full-resolution infrared (IR) data from geostationary satellites (Janowiak et al. 2001); (3) estimates derived from microwave observations of the Special Sensor Microwave Imager (SSM/I; Ferraro and Marks 1995); and (4) those from the Advanced Microwave Sounding Unit (AMSU)-B (Zhao et al. 2001). Details of the merging algorithm can be found in Xie et al. (2002).

2.1.4 GPCP-1DD dataset

The GPCP precipitation product is a widely used product for climate research studies (Adler et al. 2000, 2003; Huffman et al. 1997, 2001). The GPCP Combined Product version 2 is a near real-time product including monthly precipitation data from global telecommunication system (GTS) stations, i.e. synoptic weather stations and climate stations. The basic difference between GPCP and gauge data is that in GPCP, IR and microwave data are used for rainfall estimation with the help of rain gauge calibration.

2.1.5 Tropical Rainfall Measuring Mission (TRMM) precipitation product (ver. 6 and ver. 7)

TRMM is a joint space mission between NASA (USA) and the Japan Aerospace Exploration Agency (JAXA), designed to monitor and study tropical rainfall. TRMM is part of NASA’s Mission to Planet Earth, a long-term, coordinated research effort to study the Earth as a global system. The algorithm used in the TRMM-3B42 V6 and V7 precipitation products combines TRMM’s merged passive microwave (HQ) and microwave-calibrated IR (VAR) estimates. The current scheme simply replaces each grid box with the merged passive microwave (HQ) value if available; otherwise, the calibrated IR (VAR) value is used. Thereafter, in each grid box all of the multi-satellite (SSM/I, AMSR, AMSU) values in the month are summed and combined with monthly gauge data to produce the version 6 and version 7 TRMM data products. Finally, for each grid box, all the 3-hourly values are scaled to approximately sum to the 3B43 monthly value. This system was developed to produce TRMM and other data estimates by applying new concepts in merging quasi-global precipitation estimates and taking advantage of the increasing availability of fine-scale input datasets. The overall system is referred to as the version 6 and version 7 TRMM Multi-Satellite Precipitation Analysis (TMPA) by Huffman et al. (2007).
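Two of the steps described above, the HQ/VAR replacement and the rescaling of 3-hourly values to a monthly total, can be sketched for a single grid box as follows. This is a simplified illustration rather than the operational TMPA code; the function names and the use of `None` for a missing microwave value are assumptions.

```python
def merge_hq_var(hq, var):
    """Use the microwave (HQ) value where available, else calibrated IR (VAR).

    A missing HQ value is represented by None (an assumption of this sketch).
    """
    return [h if h is not None else v for h, v in zip(hq, var)]


def scale_to_monthly(three_hourly_mm, monthly_target_mm):
    """Rescale 3-hourly estimates so that they sum to a monthly target value."""
    total = sum(three_hourly_mm)
    if total == 0.0:
        return list(three_hourly_mm)  # nothing to rescale in a dry month
    factor = monthly_target_mm / total
    return [value * factor for value in three_hourly_mm]
```

For example, merging `[1.0, None, 3.0]` with `[9.0, 2.0, 9.0]` keeps the two microwave values and fills the gap with the IR value; rescaling the result to a monthly target of 12.0 doubles each entry.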

2.1.6 ERA-INT reanalysis data

The ERA-INTERIM reanalysis forecast precipitation dataset from 1998 to 2007 is used in this study (Berrisford et al. 2009). Briefly, this reanalysis system uses cycle 31r2 of the ECMWF Integrated Forecast System, which was configured for the following resolution: 60 levels in the vertical with the top level at 0.1 hPa, T255 spherical harmonic representation for the basic dynamical fields and a reduced Gaussian grid with approximately uniform 79-km spacing for surface and other grid point fields. The product covers the data-rich period 1989–2007 and is considered to represent tropical precipitation and the hydrological cycle well because of a new humidity analysis and improved model physics (Simmons et al. 2006). The precipitation product from the ERA-INT reanalysis is a forecast quantity.

2.1.7 MERRA reanalysis data

Retrospective analyses (or reanalyses) integrate a variety of observing systems with numerical models to produce a temporally and spatially consistent synthesis of observations and analyses of variables not easily observed. The breadth of variables, as well as the observational influence, makes reanalyses ideal for investigating climate variability. The Modern Era-Retrospective Analysis for Research and Applications (MERRA) supports NASA’s Earth science objectives by applying the state-of-the-art GEOS-5 data assimilation system, which includes many modern observing systems (such as the Earth Observing System, EOS), in a climate framework. The Goddard Earth Sciences Data and Information Services Center (GES DISC) provides utilities for users to access and subset the MERRA data products (Lucchesi 2012). The precipitation product from the MERRA reanalysis is a forecast quantity.

Based on the eight available daily precipitation datasets, namely IMD dataset (1° × 1°), Aphrodite (0.25° × 0.25°), TRMM-3B42 V6 and V7 (0.25° × 0.25°), GPCP-1DD (1° × 1°), CPC (0.1° × 0.1°) and reanalysis ERA-INT (1.5° × 1.5°) and MERRA (1° × 1°), the focus of this study is to compare the heavy rainfall data over the Central Indian land and different subdivisions within India.

2.2 Method of analysis

To understand the spatial and temporal variations in rainfall characteristics over India, the rainfall is classified into three categories, namely heavy (90 percentile), moderate (50 percentile) and light (20 percentile) rainfall. For each category, the number of rainy days, the amount of rainfall and its contribution to the seasonal total precipitation are studied over Central India (17°N–27°N and 74°E–82°E) and over each meteorological subdivision of the Indian land region.

The heavy precipitation is classified as rainfall above 90 percentile threshold, the moderate precipitation is classified as rainfall above 50 percentile threshold and below 90 percentile, the light precipitation is classified as rainfall above 20 percentile threshold and below 50 percentile, and the very light precipitation is classified as rainfall below 20 percentile threshold.
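The classification above can be sketched in a few lines. The text does not state which percentile definition is used or whether dry days enter the calculation, so the linear-interpolation percentile, the treatment of values exactly equal to a threshold and all function names below are assumptions of this sketch.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no data supplied")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def rain_thresholds(all_days_mm):
    """20, 50 and 90 percentile thresholds from the full daily record."""
    return tuple(percentile(all_days_mm, q) for q in (20, 50, 90))


def summarise_season(season_mm, thresholds):
    """Count rainy days and accumulate rainfall amount per category."""
    p20, p50, p90 = thresholds
    names = ("very_light", "light", "moderate", "heavy")
    summary = {name: {"days": 0, "amount": 0.0} for name in names}
    for rain in season_mm:
        if rain > p90:
            name = "heavy"
        elif rain > p50:
            name = "moderate"
        elif rain > p20:
            name = "light"
        else:
            name = "very_light"  # at or below the 20 percentile threshold
        summary[name]["days"] += 1
        summary[name]["amount"] += rain
    return summary
```

In practice the thresholds would be computed once from all JJAS days of the study period and `summarise_season` applied to each year's 122 daily values in turn.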

The wet year composite anomalies are constructed using the anomalies calculated from the years 2003, 2005 and 2006, and the dry year composite anomalies are constructed using the anomalies calculated from the years 2001, 2002 and 2004.
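A minimal sketch of such a composite at a single grid point is given below. It assumes that the anomalies are expressed as percentage departures from the 2001–2007 climatology and that the composite is the simple mean over the selected years; the function name and the input layout (a dictionary of seasonal totals keyed by year) are hypothetical.

```python
WET_YEARS = (2003, 2005, 2006)
DRY_YEARS = (2001, 2002, 2004)


def composite_percent_anomaly(seasonal_totals, composite_years):
    """Mean percentage departure of the composite years from climatology.

    seasonal_totals: dict mapping year -> seasonal (JJAS) rainfall at one
    grid point. The climatology is the mean over all years supplied.
    """
    climatology = sum(seasonal_totals.values()) / len(seasonal_totals)
    if climatology == 0.0:
        return 0.0  # avoid dividing by zero over a completely dry point
    anomalies = [
        100.0 * (seasonal_totals[year] - climatology) / climatology
        for year in composite_years
    ]
    return sum(anomalies) / len(anomalies)
```

Calling the function once with `WET_YEARS` and once with `DRY_YEARS` for every grid point would give the two composite fields.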

The number of rainy days and amount of precipitation in each category are calculated for every year. The following sections discuss in detail the spatial and temporal variability of precipitation.

3 Spatial distribution of precipitation intensity over India

The rainfall climatology of the summer monsoon season over South Asia for the 10-year period (1998–2007) is shown in Fig. 1 for the different datasets, except CPC, for which a 7-year period (2001–2007) is used. Heavy rainfall along the west coast of India and over north-east India is seen in all the datasets. In contrast, the rainfall over north-west India and the south-east peninsula is very low in all datasets.

Fig. 1

Climatological mean seasonal precipitation (JJAS) from different precipitation products (unit: precipitation in mm/day)

The IMD data show more precipitation over the eastern Gangetic plain (Fig. 1a), which is not captured well in the Aphrodite data (Fig. 1b). The high-resolution CPC data show heavy rainfall over the Western Ghats and the Indo-Gangetic plains (Fig. 1c), whereas the GPCP-1DD fails to capture the heavy precipitation over the Western Ghats (Fig. 1d). The TRMM-3B42 V6 and V7 capture the heavy precipitation over the Western Ghats and the north-east Indian regions quite well; 3B42 V7 (Fig. 1f) is an improvement on the previous version, 3B42 V6 (Fig. 1e). The reanalysis products in general show a better climatology, and the MERRA reanalysis (Fig. 1h) shows the heavy precipitation over the Western Ghats and the Indo-Gangetic plains better than the ERA-INTERIM reanalysis product (Fig. 1g).

In this section the precipitation spatial distribution over India is shown for three categories, namely heavy, moderate and light rain. Figure 2 shows the 90 percentile precipitation distribution (heavy rain), Fig. 3 shows the 50 percentile precipitation distribution (moderate rain), and Fig. 4 shows the 20 percentile precipitation distribution (light rain).

Fig. 2

Climatological mean seasonal (90 percentile category) precipitation (JJAS) from different precipitation products (unit: precipitation in mm/day)

Fig. 3

Climatological mean seasonal (50 percentile category) precipitation (JJAS) from different precipitation products (unit: precipitation in mm/day)

Fig. 4

Climatological mean seasonal (20 percentile category) precipitation (JJAS) from different precipitation products (unit: precipitation in mm/day)

3.1 Heavy rainfall

The heavy rainfall spatial distribution in the IMD data shows rainfall above 50–100 mm/day over the Western Ghats, the central and Indo-Gangetic plains and the north-east Indian regions (Fig. 2a), whereas heavy rainfall is less widespread in the Aphrodite data (Fig. 2b). The high-resolution CPC data show heavy rainfall of more than 50 mm/day at more grid points over the Western Ghats and the Indo-Gangetic plains (Fig. 2c). The GPCP-1DD fails to capture the heavy precipitation pattern over the Western Ghats (Fig. 2d): its heavy precipitation is aligned east–west rather than north–south along the Western Ghats orography. The TRMM-3B42 V6 and V7 capture a pattern similar to the IMD heavy precipitation over the Western Ghats and the north-east Indian regions quite well and also closely resemble the high-resolution CPC data (Fig. 2e, f). The reanalysis products also show discrepancies in the heavy precipitation spatial pattern; the MERRA reanalysis (Fig. 2h) represents heavy precipitation (above 90 percentile) over the Western Ghats and the Indo-Gangetic plains better than the ERA-INTERIM reanalysis product (Fig. 2g), which fails completely over the Western Ghats and the Indo-Gangetic plains.

3.2 Moderate rainfall

The moderate rainfall spatial distribution in the IMD data shows rainfall of around 25 mm/day over the Western Ghats, the central and Indo-Gangetic plains and the north-east Indian regions (Fig. 3a), whereas moderate rainfall is less widespread in the Aphrodite data (Fig. 3b); compared to heavy rainfall, however, the moderate rainfall is better captured in many datasets. The high-resolution CPC data show moderate rainfall of more than 12 mm/day at more grid points over the Western Ghats and the Indo-Gangetic plains (Fig. 3c). The GPCP-1DD also captures the moderate precipitation pattern over the Western Ghats (Fig. 3d), although the pattern there is smoothed. The TRMM-3B42 V6 and V7 capture a pattern similar to the IMD moderate precipitation intensity over the Western Ghats and the north-east Indian regions quite well and also closely resemble the high-resolution CPC data (Fig. 3e, f). The reanalysis products capture the spatial pattern of moderate precipitation better than that of heavy precipitation, and the MERRA reanalysis (Fig. 3h) shows moderate precipitation (above 50 percentile) over the Western Ghats and the Indo-Gangetic plains better than the ERA-INTERIM reanalysis product (Fig. 3g).

3.3 Light rainfall

The light rainfall (20 percentile) range varies between 1 and 8 mm/day. The IMD data show around 1 mm/day throughout the Indian land region (Fig. 4a); only the Western Ghats show higher intensity. The Aphrodite data show a wider spatial distribution in this category over the Western Ghats and north-east India (Fig. 4b). The CPC high-resolution data show pockets of higher intensity over the Western Ghats and the Indo-Gangetic plains (Fig. 4c), whereas the GPCP-1DD shows uniform light rainfall throughout the Indian land region (Fig. 4d). The TRMM-3B42 V6 and V7 capture a pattern similar to the CPC light precipitation over the Western Ghats and the north-east Indian regions (Fig. 4e, f). The reanalysis products show a light precipitation pattern that is stronger than the observed one, and the MERRA reanalysis (Fig. 4h) shows higher light precipitation (above 4 mm/day) over the Western Ghats and the Indo-Gangetic plains than the ERA-INTERIM reanalysis product (Fig. 4g).

4 Probability distribution function and percentile threshold for precipitation over Central India

In this section, the probability distribution function (PDF) from each dataset and the thresholds for the 20, 50 and 90 percentile values, area averaged over Central India (bounded by latitude 17°N–27°N and longitude 74°E–82°E), are discussed in detail. Figure 5 shows the PDF of precipitation; the threshold values for light, moderate and heavy rain are shown by three vertical dashed lines in each graph. The precipitation probability density is calculated in 5-mm/day bins from 0 to 50 mm/day.

Fig. 5

Probability distribution function (PDF) calculated from daily rainfall data averaged over Central India for all rainfall datasets (from 122 days in JJAS season for the period 1998–2007) with 5-mm/day interval bins (dashed vertical lines show threshold of 20 percentile, 50 percentile and 90 percentile, respectively, for each dataset) (unit: precipitation in mm/day)

The precipitation threshold values vary from one dataset to another. The IMD dataset shows the light rain threshold at 2.8 mm/day, moderate rain at 6.4 mm/day and heavy rain at 15.4 mm/day. Most of the datasets agree quite well on the light rain threshold except three, namely CPC, TRMM-3B42 V7 and MERRA, whose thresholds for this rain class are higher at 3.1, 3.1 and 5.3 mm/day, respectively. In the moderate rain class, three datasets, namely CPC, GPCP-1DD and MERRA, show higher thresholds of 7.8, 7.1 and 10.6 mm/day, respectively, while in the heavy rainfall category four datasets, namely CPC, GPCP-1DD, TRMM-3B42 V7 and MERRA, show higher thresholds of 18.1, 19.2, 16.4 and 19.2 mm/day, respectively. The Aphrodite data show lower values in all three rainfall classes: 2.4, 5.4 and 12.4 mm/day for the 20, 50 and 90 percentiles, respectively.
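The binning used for the PDF in this section (5-mm/day bins from 0 to 50 mm/day) can be sketched as follows for an area-averaged daily series. How days at or above 50 mm/day are treated is not stated in the text; this sketch leaves them out of the bins but keeps them in the normalisation, which is an assumption, as is the function name.

```python
def rainfall_pdf(daily_mm, bin_width=5.0, upper=50.0):
    """Probability density per bin for daily rainfall in [0, upper) mm/day.

    Returns one density value per bin, in units of 1/(mm/day). Days at or
    above `upper` are counted in the total but fall outside the bins.
    """
    if not daily_mm:
        raise ValueError("no data supplied")
    n_bins = int(round(upper / bin_width))
    counts = [0] * n_bins
    for rain in daily_mm:
        if 0.0 <= rain < upper:
            counts[int(rain // bin_width)] += 1
    # Density = fraction of days in the bin divided by the bin width.
    return [count / (len(daily_mm) * bin_width) for count in counts]
```

When every day falls below 50 mm/day, the densities multiplied by the bin width sum to one.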

5 Temporal distribution of precipitation intensity over Central India

In this section the temporal distribution of precipitation intensities for 3 categories (90 percentile, 50 percentile and 20 percentile) is shown for the area-averaged Central Indian domain. Figure 6 shows (1) number of rainy days and (2) amount of precipitation for each class of precipitation (heavy, moderate and light rain) area averaged over the Central India bound by latitude 17°N–27°N and longitude 74°E–82°E.

Fig. 6

Temporal evolution of mean precipitation amount averaged over Central India (seasonal mean, above 90, 50 and 20 percentile category) and rainy day (above 90, 50 and 20 percentile category) for (JJAS) from different precipitation products (unit: precipitation in mm/season and rainy days in days)

The top panel shows the seasonal mean precipitation from the eight datasets over Central India. Among the eight datasets, the MERRA reanalysis consistently shows higher rainfall in all years, while Aphrodite consistently shows lower rainfall. While most datasets show a similar interannual variability, the ERA-INTERIM reanalysis captures the interannual variability poorly. Reanalysis precipitation is not observed data but a forecast product and is therefore model dependent; nevertheless, the reanalysis precipitation products show remarkable similarity with the gauge, satellite and merged precipitation products.

5.1 Number of rainy days and amount of precipitation for heavy rainfall category (>90 percentile)

The number of heavy rainy days from each dataset is shown in Fig. 6. It varies from 5 to 15 days out of the 122 days of each year for the period 1998–2007. In the reanalysis products (ERA-INT, MERRA) the number of heavy rainy days increases from 2002 onwards, and the convergence among the reanalysis products is better before 2002, whereas the observed gauge, satellite and merged products converge well among themselves over the entire period. The amount of precipitation varies between 150 and 600 mm/season and behaves similarly: it increases from 2002 onwards in the reanalysis products, whose convergence is better before 2002, while the observed gauge, satellite and merged products converge well over the entire period.

Although the reanalysis data show a considerable bias in the number of rainy days (>90 percentile) after 2002, the amount of heavy rainfall (>90 percentile) largely agrees with the other datasets, which suggests that the reanalysis datasets are still not capable of capturing the intense heavy rainfall events during the monsoon season.

5.2 Number of rainy days and amount of precipitation for moderate rainfall category (50–90 percentiles)

The moderate rainy days, as well as the corresponding amount of precipitation, are better captured by all datasets than the heavy rainy days. Moderate rainy days are days on which rainfall falls above the 50 percentile and below the 90 percentile. The number of moderate rainy days varies from 20 to 60 days out of the 122 days of each year for the period 1998–2007. The reanalyses (ERA-INT, MERRA) show a higher number of moderate rainfall days than the observed rain gauge, satellite and merged datasets. The amount of precipitation varies between 300 and 900 mm/season. All datasets converge well except the MERRA reanalysis product, which shows higher values, but the observed gauge, satellite and merged products and the reanalyses show a better convergence in terms of interannual variability.

5.3 Number of rainy days and amount of precipitation for light rainfall category (20–50 percentiles)

Light rainy days are days on which rainfall falls above the 20 percentile and below the 50 percentile. The number of light rainy days varies from 10 to 35 days out of the 122 days of each year for the period 1998–2007. The reanalyses (ERA-INT, MERRA) show a higher number of light rainfall days than the observed rain gauge, satellite and merged datasets. The amount of precipitation varies between 50 and 250 mm/season. All datasets converge well except the MERRA reanalysis product, which shows higher values. The observed gauge, satellite and merged products show a better convergence in terms of interannual variability.

5.4 Number of rainy days and amount of precipitation for very light rainfall category (below 20 percentile)

Very light rainy days are days on which rainfall falls below the 20 percentile. The number of very light rainy days varies from 20 to 90 days in each year for the period 1998–2007. The reanalyses (ERA-INT, MERRA) show only a few very light rainfall days compared to the observed rain gauge, satellite and merged datasets. The amount of precipitation varies between 20 and 80 mm/season. All datasets converge well except the ERA-INT and MERRA reanalysis products, which show higher values, but the observed gauge, satellite and merged products show a better convergence in terms of interannual variability.

6 Spatial summer monsoon precipitation anomalies over India

The spatial precipitation anomalies vary from year to year, and the spatial anomaly features are compared among the different datasets in this section. There are years with heavy flooding in one part of India and severe drought in another part simultaneously; therefore, the representation of spatial anomalies in the different datasets is vital to understanding their performance. The daily anomalies are part of the seasonal anomalies. It is therefore imperative to study both the daily variability and its impact on the seasonal totals during the monsoon season. Composite analysis is performed to understand the wet and dry cycles over the Indian monsoon region. The wet year composite anomalies are constructed using the anomalies calculated from the years 2003, 2005 and 2006, and the dry year composite anomalies are constructed using the anomalies calculated from the years 2001, 2002 and 2004. The changes in the amount of rainfall above the 90 percentile (heavy rainfall) and in the number of 90 percentile (heavy rain) rainy days during wet and dry years are also studied in this section.

6.1 Wet year composite rainfall anomalies

6.1.1 Composite analysis of mean precipitation anomaly distribution

Figure 7a shows the spatial seasonal precipitation anomalies for the wet years 2003, 2005 and 2006 relative to the climatology (2001–2007) for each dataset. The IMD data clearly bring out positive precipitation anomalies over most parts of the Indian subcontinent, particularly over western and peninsular India, and negative anomalies over the eastern Indian region. The Aphrodite data show positive precipitation anomalies over western India and negative anomalies over peninsular India and eastern India. The CPC shows positive anomalies over the entire central and western Indian region and negative anomalies over eastern and peninsular India. The GPCP-1DD, the TRMM-3B42 V6 and V7 and both reanalysis datasets (ERA-INT and MERRA) show positive anomalies over the western Indian region, extending diagonally up to the east coast, and negative anomalies over eastern and peninsular India. The precipitation anomalies above the 75, 80 and 85 % significance levels (one-tailed Student’s t test) are shaded, and the anomaly contours are shown from −100 to +100 % with a contour interval of 10 %.

Fig. 7

a Spatial precipitation anomalies (seasonal mean precipitation) for JJAS season from different precipitation products for wet years (unit: anomalies are shown by contour intervals from −100 to +100 % with 10 % increment, and shaded areas are significant at 75, 80 and 85 percentage level using Student’s t test). b Spatial precipitation anomalies (90 percentile precipitation category) for JJAS from different precipitation products for wet years (unit: anomalies are shown by contour intervals from −100 to +100 % with 20 % increment, and shaded areas are significant at 75, 80 and 85 percentage level using Student’s t test). c Spatial rainy day departures (90 percentile precipitation category) for JJAS from different precipitation products for wet years (unit: anomalies are shown by contour intervals from −10 to +10 days with 1-day increment, and shaded areas are significant at 75, 80 and 85 percentage level using Student’s t test)

6.1.2 Composite analysis of 90 percentile precipitation intensity distribution

Figure 7b shows the spatial 90 percentile precipitation anomalies for the wet years 2003, 2005 and 2006 relative to the climatology (2001–2007) for each dataset. The IMD data clearly bring out positive precipitation anomalies over western and peninsular India and negative anomalies over the eastern Indian region. The Aphrodite data show positive precipitation anomalies only over western India and negative anomalies over peninsular India and eastern India. The CPC shows positive anomalies over the western Indian region and Central India and negative anomalies over eastern and peninsular India. The GPCP-1DD, the TRMM-3B42 V6 and V7 and both reanalysis datasets (ERA-INT and MERRA) show positive anomalies over the western Indian region, extending diagonally up to the east coast, and negative anomalies over eastern and peninsular India. The precipitation anomalies above the 75, 80 and 85 % significance levels (one-tailed Student’s t test) are shaded, and the anomaly contours are shown from −100 to +100 % with a contour interval of 20 %.

6.1.3 Composite analysis of 90 percentile rainy day distribution

Figure 7c shows the spatial anomalies of 90 percentile rainy days for the wet years 2003, 2005 and 2006, relative to the climatology (2001–2007), for each dataset. The IMD data clearly bring out positive rainy day anomalies over western and peninsular India and negative anomalies over the eastern Indian region. The Aphrodite data show the positive rainy day anomaly pattern only over western India, with negative anomalies over peninsular and eastern India. The CPC data show positive rainy day anomalies over western and Central India and negative rainy day anomalies over eastern and peninsular India. In GPCP-1DD, in both TRMM-3B42 versions (V6 and V7) and in both reanalysis datasets (ERA-INT and MERRA), the positive rainy day anomalies over the western Indian region extend diagonally up to the east coast, with negative rainy day anomalies over eastern and peninsular India. Anomalies significant at the 75, 80 and 85 % levels (one-tailed Student’s t test) are shaded, and the anomaly contours run from −10 to +10 days at 1-day intervals.

6.2 Dry year composite rainfall anomalies

6.2.1 Composite analysis of mean precipitation anomaly distribution

Figure 8a shows the spatial seasonal precipitation anomalies for the dry years 2001, 2002 and 2004, relative to the climatology (2001–2007), for each dataset. The IMD data clearly bring out negative precipitation anomalies over most of the Indian subcontinent, particularly strong over western and peninsular India, and small positive anomalies over the eastern Indian region. The Aphrodite data show negative anomalies over western and Central India and a very small area of positive anomalies over peninsular and eastern India. The CPC data show negative anomalies over the entire central and western Indian region and a very small area of positive anomalies over eastern and peninsular India. In GPCP-1DD, the negative anomalies over the western Indian region extend diagonally up to peninsular India, with a small area of positive anomalies over eastern India. In TRMM-3B42 V6 and V7, the negative anomalies over the western Indian region extend diagonally up to the east coast, with a small area of positive anomalies over eastern and peninsular India. Both reanalysis datasets (ERA-INT and MERRA) show negative anomalies that extend diagonally from the western Indian region up to the east coast and positive anomalies over eastern India. Anomalies significant at the 75, 80 and 85 % levels (one-tailed Student’s t test) are shaded, and the anomaly contours run from −100 to +100 % at 10 % intervals.

Fig. 8

a Spatial precipitation anomalies (seasonal mean precipitation) for JJAS season from different precipitation products for dry years (unit: anomalies are shown by contour intervals from −100 to +100 % with 10 % increment, and shaded areas are significant at 75, 80 and 85 percentage level using Student’s t test). b Spatial precipitation anomalies (90 percentile precipitation category) for JJAS from different precipitation products for dry years (unit: anomalies are shown by contour intervals from −100 to +100 % with 20 % increment, and shaded areas are significant at 75, 80 and 85 percentage level using Student’s t test). c Spatial rainy day departures (90 percentile precipitation category) for JJAS from different precipitation products for dry years (unit: anomalies are shown by contour intervals from −10 to +10 days with 1-day increment, and shaded areas are significant at 75, 80 and 85 percentage level using Student’s t test)

6.2.2 Composite analysis of 90 percentile precipitation intensity distribution

Figure 8b shows the spatial anomalies of 90 percentile precipitation for the dry years 2001, 2002 and 2004, relative to the climatology (2001–2007), for each dataset. The IMD data clearly bring out negative precipitation anomalies over most of the Indian subcontinent, particularly strong over western and peninsular India, and small positive anomalies over the eastern Indian region. The Aphrodite data show negative anomalies over western India and a very small area of positive anomalies over peninsular and eastern India. The CPC data show negative anomalies over the entire central and western Indian region and a very small area of positive anomalies over eastern and peninsular India. In GPCP-1DD, the negative anomalies over the western Indian region extend diagonally up to peninsular India, with a small area of positive anomalies over eastern India and a few patches over Central India. In TRMM-3B42 V6 and V7, the negative anomalies over the western Indian region extend diagonally up to the east coast, with a small area of positive anomalies over eastern and peninsular India. Both reanalysis datasets (ERA-INT and MERRA) show negative anomalies that extend diagonally from the western Indian region up to the east coast and positive anomalies over eastern India. Anomalies significant at the 75, 80 and 85 % levels (one-tailed Student’s t test) are shaded, and the anomaly contours run from −100 to +100 % at 20 % intervals.

6.2.3 Composite analysis of 90 percentile rainy day distribution

Figure 8c shows the spatial anomalies of 90 percentile rainy days for the dry years 2001, 2002 and 2004, relative to the climatology (2001–2007), for each dataset. The IMD data clearly bring out negative rainy day anomalies over western and peninsular India and positive anomalies over the eastern Indian region. The Aphrodite data show the negative rainy day anomaly pattern over western and peninsular India and positive anomalies in a few patches over eastern India. The CPC data show negative rainy day anomalies over western and Central India and also over eastern and south-east peninsular India. In GPCP-1DD, the negative rainy day anomalies over the western Indian region extend diagonally up to the east coast, with positive rainy day anomalies over eastern India and a few patches over east Central India. In TRMM-3B42 V6 and V7, the negative rainy day anomalies over the western Indian region extend diagonally up to the east coast, with positive rainy day anomalies over eastern and peninsular India. Both reanalysis datasets (ERA-INT and MERRA) show negative rainy day anomalies that extend diagonally from the western Indian region up to peninsular India and positive rainy day anomalies over eastern India and a few patches over eastern Central India. Anomalies significant at the 75, 80 and 85 % levels (one-tailed Student’s t test) are shaded, and the anomaly contours run from −10 to +10 days at 1-day intervals. Overall, the wet-year and dry-year composites of seasonal mean rainfall, 90 percentile rainfall intensity and rainy days agree among the different datasets on a broad spatial pattern.

7 Temporal distribution of daily precipitation anomalies

Figure 9 shows the daily precipitation anomalies averaged over Central India for the JJAS season of 1998–2007 for each dataset (2001–2007 for the CPC high-resolution dataset).

Fig. 9

Daily temporal precipitation anomalies averaged over Central India for the season JJAS from different precipitation products (unit: precipitation in mm/day)

The daily variability shows days of above-normal and below-normal rainfall within the 122-day season (JJAS). This daily variability plays a key role in determining the water availability within the season, and extreme precipitation within the season dictates how vulnerable the crops grown during the season are to the vagaries of the monsoon.
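A minimal sketch of how daily anomalies of this kind can be formed is given here: each calendar day's area-averaged rainfall is compared with the mean for that calendar day over all available years (the daily long-term mean referred to in Sect. 8). The data layout and the toy values are hypothetical; a real JJAS series would have 122 values per year.

```python
from statistics import mean

def daily_anomalies(rain_by_year):
    """Anomalies from the daily long-term mean.

    rain_by_year: dict year -> list of daily area-averaged rainfall
    (mm/day); all lists must have the same length.
    Returns a dict year -> list of daily anomalies (mm/day).
    """
    years = sorted(rain_by_year)
    n_days = len(rain_by_year[years[0]])
    # Long-term mean for each day of the season, taken across years.
    daily_clim = [mean([rain_by_year[y][d] for y in years])
                  for d in range(n_days)]
    return {y: [rain_by_year[y][d] - daily_clim[d] for d in range(n_days)]
            for y in years}

# Hypothetical four-day "season" for three years (mm/day).
rain = {1998: [2.0, 4.0, 6.0, 8.0],
        1999: [4.0, 4.0, 0.0, 8.0],
        2000: [6.0, 4.0, 3.0, 8.0]}
anoms = daily_anomalies(rain)
```

By construction the anomalies for any given calendar day sum to zero across the years, which is a quick consistency check on the climatology.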

The departures of the 90 percentile rainy days and precipitation amount for each year of the period 1998–2007 are shown in Fig. 10. The datasets disagree on the year-to-year variability of 90 percentile precipitation days (heavy rainfall days), expressed as seasonal anomalies averaged over Central India: IMD, CPC, GPCP-1DD, TRMM V6 and V7 and MERRA have similar anomalies, whereas Aphrodite and ERA-INT show a different year-to-year anomaly pattern. The same disagreement is seen in the seasonal anomalies of 90 percentile precipitation (heavy rainfall) averaged over Central India: IMD, CPC, GPCP-1DD, TRMM V6 and V7 and MERRA have similar anomalies, whereas Aphrodite and ERA-INT show a different year-to-year anomaly pattern.
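The two quantities compared here, the number of heavy rainfall days and the heavy rainfall amount per season, can be sketched as follows. Several choices in this sketch are assumptions for illustration only: the 90 percentile threshold is taken over all rainy days pooled across years, a rainy day is one with at least 1 mm (`WET_DAY_MM`), the percentile uses the nearest-rank rule, and days at or above the threshold count as heavy. The rainfall values are hypothetical.

```python
from statistics import mean

WET_DAY_MM = 1.0  # assumed rainy-day threshold (mm/day)

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct*n/100
    return ordered[rank - 1]

def heavy_rain_stats(season, threshold):
    """Number of heavy-rain days and their rainfall total for one season."""
    heavy = [r for r in season if r >= threshold]
    return len(heavy), sum(heavy)

def departures(per_year):
    """Departure of each year's value from the multi-year mean."""
    clim = mean(list(per_year.values()))
    return {year: value - clim for year, value in per_year.items()}

# Hypothetical ten-day "seasons" of area-averaged rainfall (mm/day).
rain = {
    2001: [0.0, 2.0, 5.0, 0.0, 12.0, 3.0, 0.0, 8.0, 30.0, 1.0],
    2002: [0.0, 0.0, 4.0, 1.0, 0.0, 6.0, 2.0, 0.0, 9.0, 3.0],
    2003: [5.0, 0.0, 25.0, 40.0, 2.0, 0.0, 35.0, 7.0, 10.0, 1.0],
}

rainy = [r for season in rain.values() for r in season if r >= WET_DAY_MM]
p90 = nearest_rank_percentile(rainy, 90)
stats = {year: heavy_rain_stats(season, p90) for year, season in rain.items()}
day_departures = departures({y: s[0] for y, s in stats.items()})
amount_departures = departures({y: s[1] for y, s in stats.items()})
```

In this toy case the driest year (2002) has no heavy-rain day at all, so both of its departures are negative, while the wettest year (2003) has positive departures in both days and amount.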

Fig. 10

Seasonal mean precipitation anomalies and rainy day departures (90 percentile category) averaged over Central India for the season JJAS from different precipitation products (unit: precipitation in mm/season and rainy days in days)

Extreme heavy rainfall (90 percentile category) and low rainfall (20 percentile category) are both part of the active–break cycle that persists during the monsoon season (JJAS), and prolonged active or break conditions (Prasanna and Annamalai 2012) during the season determine the water availability for crops grown during that season (Prasanna 2014).

The intraseasonal anomalies during the rainy season contribute to the seasonal total anomalies, and the intraseasonal variability is in turn part of the larger seasonal anomalous conditions. The intraseasonal variability is influenced partly by the internal variability of the monsoon system and partly by the external boundary conditions (Prasanna and Annamalai 2012). The following section therefore discusses in detail the active–break cycle in the different datasets.

8 Active–break events over Central India from different datasets

The active–break cycle is an inherent part of the Indian monsoon variability within the season (Yasunari 1979). It brings heavy rainfall (active phase) or light rainfall (break phase), depending on the phase in which it is operating. In this section, we discuss both the active and break events over Central India and the spatial structure of the active–break cycle in the different rainfall products. Figure 11 shows the active and break events computed from an area-averaged index over Central India for each dataset, following the method of Prasanna and Annamalai (2012).

Fig. 11

Climatological histograms of active–break events over Central India for the season JJAS from different precipitation products (unit: events in days)

To explain the method in brief, daily precipitation anomalies are calculated for each dataset as departures from the daily long-term means (1998–2007). A break event is identified when the rainfall anomalies averaged over Central India (17°N–27°N and 74°E–82°E) stay below −1 standard deviation for at least three consecutive days; an active event is identified in the same way when the anomalies stay above +1 standard deviation. The composites in this study are created for break and active conditions lasting 5 days or longer, to include the effects of both short and long breaks and of both short and prolonged active conditions.
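The event definition can be sketched as a search for runs of consecutive days on which the area-averaged anomaly lies beyond a standard-deviation threshold. The sketch below assumes thresholds of ±1 standard deviation (the values quoted for the composite indices in Sect. 9) and takes the standard deviation of the supplied anomaly series itself; the function names and the toy series are hypothetical.

```python
from statistics import stdev

def find_runs(flags, min_length):
    """Return (start_index, length) for every run of True >= min_length."""
    runs, start = [], None
    for i, flag in enumerate(list(flags) + [False]):  # sentinel ends last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_length:
                runs.append((start, i - start))
            start = None
    return runs

def active_break_events(anomalies, min_length=3, n_sd=1.0):
    """Active and break events in a daily area-averaged anomaly series.

    Active: anomaly above +n_sd standard deviations on at least
    min_length consecutive days; break: below -n_sd standard deviations.
    """
    sd = stdev(anomalies)
    active = find_runs([a > n_sd * sd for a in anomalies], min_length)
    breaks = find_runs([a < -n_sd * sd for a in anomalies], min_length)
    return active, breaks

# Hypothetical 16-day anomaly series (mm/day).
series = [0.0, 3.0, 3.0, 3.0, 0.0, -3.0, -3.0, -3.0, -3.0, 0.0,
          3.0, 3.0, 0.0, 0.0, -3.0, 0.0]
active, breaks = active_break_events(series)            # 3-day criterion
long_active, long_breaks = active_break_events(series, min_length=5)
```

With the 3-day criterion the toy series contains one active event (3 days, starting at index 1) and one break event (4 days, starting at index 5); raising `min_length` to 5, as for the composites, leaves no events in this short series.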

8.1 Active–break events over the Central India

The duration of the active or break events (in days) is shown on the x-axis, and the number of events on the y-axis. All the datasets reveal that short active spells, typically lasting around 3 days, are the most frequent during the monsoon season (JJAS), although active conditions lasting longer than 7 days also occur (Fig. 11).

Most datasets, with the exception of IMD and GPCP, likewise reveal that short break spells of around 3 days are the most frequent, and break conditions lasting longer than 7 days also occur during the monsoon season (JJAS). There is a general consensus among the datasets that long breaks do occur during the monsoon season, and these have a greater impact on agricultural production over the Indian subcontinent (Prasanna 2014).
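A histogram of the kind described, with event duration on the x-axis and the number of events on the y-axis, can be tallied from a list of detected events. The sketch below assumes each event is stored as a hypothetical `(start_index, length)` pair; the event list is made up for illustration.

```python
from collections import Counter

def duration_histogram(events):
    """Map event duration (days) -> number of events of that duration."""
    counts = Counter(length for _, length in events)
    return dict(sorted(counts.items()))

# Hypothetical events as (start_index, length_in_days).
events = [(3, 3), (20, 3), (41, 4), (60, 7), (90, 3)]
hist = duration_histogram(events)
n_long = sum(n for days, n in hist.items() if days >= 7)
```

Here three of the five events last 3 days, matching the kind of short-spell peak described in the text, and one event lasts 7 days or longer.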

9 Active–break composites over India from different datasets

9.1 Lead–lag composite for active conditions over India

Figure 12 shows the composite anomalies for active spells lasting 5 days or longer, computed using the area-averaged index over Central India, at lags from −10 to +10 days for each dataset. Each dataset occupies one row, with columns running from −10 to +10 days at 5-day intervals. Day 0 represents the days belonging to active spells of 5 days or longer during the 2001–2007 period. The days selected for the active composite from each dataset are listed in Table 1.

Fig. 12

Composite precipitation anomalies for active condition using five consecutive active days over Central India for the season JJAS from different precipitation products (unit: precipitation in mm/day)

Table 1 Selection of days for creating composite anomalies of active–break cycle in this study for the period 1998–2007 (except for CPC; 2001–2007)

Figure 12 brings out the spatial pattern of anomalies associated with active conditions of 5 days or longer over the Indian region. A precipitation index is created from the days on which the daily anomaly exceeds +1 standard deviation for five or more consecutive days, and composite anomaly maps are built from it to understand the spatial precipitation pattern in each dataset.

The composite anomaly maps in Fig. 12 combine the Central India 5-day active index with the spatial precipitation data at lead–lag times from −10 to +10 days, so that the spatial anomalies in each observational and reanalysis dataset, and the differences among them, can be compared. Day 0 corresponds to the peak of the active composite, when the precipitation anomaly over the Central Indian region is positive (above +1 standard deviation of the daily anomaly, relative to the daily climatology of the 10 years 1998–2007, for five or more consecutive days).
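The lead–lag compositing can be illustrated for a single grid point: for every day-0 date, the anomaly at a fixed lag is collected, and the values are averaged per lag. This is a minimal sketch under stated assumptions: one season of data, day-0 dates supplied as indices of all days inside the selected spells, and dates whose lagged index falls outside the season simply skipped. The anomaly series is synthetic.

```python
from statistics import mean

def lag_composite(field, day0_indices, lags=(-10, -5, 0, 5, 10)):
    """Mean anomaly at each lag relative to the day-0 dates.

    field: list of daily anomalies at one grid point for one season.
    day0_indices: indices of the days belonging to the selected spells.
    Returns a dict lag -> composite anomaly (None if no valid dates).
    """
    composite = {}
    for lag in lags:
        values = [field[d + lag] for d in day0_indices
                  if 0 <= d + lag < len(field)]
        composite[lag] = mean(values) if values else None
    return composite

# Synthetic 30-day anomaly series: a dry spell followed by a wet spell.
field = [0.0] * 30
field[2:7] = [-2.0] * 5    # dry spell on days 2-6
field[12:17] = [5.0] * 5   # wet (active) spell on days 12-16

comp = lag_composite(field, day0_indices=range(12, 17))
```

Applying the same function to every grid point of a dataset would give one anomaly map per lag, which corresponds to one row of panels in a figure laid out like Fig. 12.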

A careful look at the day 0 composite (third column) shows that when active precipitation occurs over Central India, the Western Ghats and the north and north-west regions, north-east India goes into a dry phase. This contrasting pattern between Central India and north-east India is very clear in the IMD and Aphrodite data. The CPC dataset exhibits weaker anomalies over Central India and much weaker anomalies over western India and the Western Ghats, whereas GPCP-1DD captures this feature better. TRMM V6 and V7 capture the contrasting wet and dry phases between Central India and north-east India but miss the north-west Indian anomalies, and a similar pattern emerges in the ERA-INT and MERRA reanalysis datasets.

The propagation of the active cycle from −10 to +10 days looks broadly similar in each dataset, although with considerable differences among them (Fig. 12).

9.2 Lead–lag composite for break conditions over India

Similarly, Fig. 13 shows the composite anomalies for break spells lasting 5 days or longer, computed using the area-averaged index over Central India, at lags from −10 to +10 days for each dataset. Each dataset occupies one row, with columns running from −10 to +10 days at 5-day intervals. Day 0 represents the days belonging to break spells of 5 days or longer during the 2001–2007 period. The days selected for the break composite from each dataset are listed in Table 1.

Fig. 13

Composite precipitation anomalies for break condition using five consecutive break days over Central India for the season JJAS from different precipitation products (unit: precipitation in mm/day)

Figure 13 brings out the spatial pattern of anomalies associated with break conditions of 5 days or longer over the Indian region. A precipitation index is created from the days on which the daily anomaly falls below −1 standard deviation for five or more consecutive days, and composite anomaly maps are built from it to understand the spatial precipitation pattern in each dataset.

The composite anomaly maps in Fig. 13 combine the Central India break index (5 days or longer) with all the precipitation data at lead–lag times from −10 to +10 days, so that the spatial anomalies in each dataset, and the differences among them, can be compared. Day 0 corresponds to the peak of the break composite, when the precipitation anomaly over the Central Indian region is negative (below −1 standard deviation of the daily anomaly, relative to the daily climatology of the 7 years 2001–2007, for five or more consecutive days).

A careful look at the day 0 composite (third column) shows that when dry conditions occur over Central India, the Western Ghats and the north and north-west regions, north-east India goes into a wet phase, which is the reverse of the pattern seen in the active phase.

While this pattern is very clear in the IMD and Aphrodite data, the positive anomalies over eastern India are not as strong in CPC. GPCP-1DD captures this feature better. TRMM V6 and V7 capture the contrasting dry and wet phases between Central India and north-east India, and a similar pattern emerges in the ERA-INT and MERRA reanalysis datasets.

The propagation of the break cycle from −10 to +10 days looks broadly similar in each dataset, although with considerable differences among them (Fig. 13).

10 Heavy precipitation characteristics over each meteorological subdivision of India

This section examines the 90 percentile precipitation characteristics over each meteorological subdivision of India. The six homogeneous meteorological subdivisions are shown in Fig. 14.

Fig. 14

Different meteorological subdivisions over India

Figure 15 shows (1) the number of heavy rainfall days and (2) the amount of 90 percentile precipitation (heavy rain), area-averaged over each meteorological subdivision of India.

Fig. 15

Temporal evolution of mean precipitation amount over different meteorological subdivisions of India (above 90 percentile category) and rainy days over different meteorological subdivisions of India (above 90 percentile category) for JJAS from different precipitation products (unit: precipitation in mm/season and rainy days in days)
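Area averaging over an irregular subdivision can be sketched with a boolean mask on the latitude–longitude grid. The sketch below weights each grid cell by the cosine of its latitude, which is a common convention on regular grids but an assumption here, since the weighting used for Fig. 15 is not stated; the grid, latitudes and mask are hypothetical.

```python
import math

def area_average(grid, lats, mask):
    """Cosine-latitude weighted mean over the masked cells.

    grid[i][j]: value at latitude lats[i] (degrees) and longitude index j.
    mask[i][j]: True where the cell belongs to the subdivision.
    Raises ZeroDivisionError if the mask selects no cell.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for i, lat in enumerate(lats):
        weight = math.cos(math.radians(lat))
        for j, inside in enumerate(mask[i]):
            if inside:
                weighted_sum += weight * grid[i][j]
                weight_total += weight
    return weighted_sum / weight_total

# Hypothetical heavy-rain day counts on a 3 x 2 grid.
lats = [10.0, 20.0, 30.0]
heavy_days = [[6.0, 8.0],
              [9.0, 11.0],
              [12.0, 14.0]]
subdivision = [[True, False],
               [True, True],
               [False, True]]
avg_days = area_average(heavy_days, lats, subdivision)
```

Over the narrow latitude range of a single subdivision the weights differ only slightly, so the result stays close to the simple mean of the selected cells.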

10.1 Hilly regions

The number of heavy rainfall days in the hilly regions decreases in the MERRA and ERA-INT data, but this decrease is not supported by the observed datasets. The amount of heavy precipitation varies from 200 to 500 mm/season with significant interannual variability. The IMD data show an increase in heavy rainfall days and a larger heavy rainfall amount over the hilly regions.

10.2 North-east India

The number of heavy rainfall days in north-east India varies from 5 to 20 days with significant interannual variability. The amount of heavy precipitation varies from 200 to 1000 mm/season with significant interannual variability. All the datasets show good convergence over north-east India except the reanalysis datasets.

10.3 Central north-east India

The number of heavy rainfall days in Central north-east India varies from 5 to 20 days, and all the datasets converge except the reanalysis datasets, which show higher values. The amount of heavy precipitation varies from 200 to 600 mm/season with significant interannual variability, and all the datasets show good convergence in this amount over Central north-east India.

10.4 North-west India

The number of heavy rainfall days in north-west India varies from 5 to 15 days with considerable interannual variability, and all the datasets converge except the reanalysis datasets, which show higher values. The amount of heavy precipitation varies from 100 to 400 mm/season with significant interannual variability, and all the datasets show good convergence in this amount over north-west India. Both the number of heavy rainfall days and the amount of rainfall from heavy rainfall (90 percentile rain category) show an increasing trend over this region.

10.5 West–Central India

The number of heavy rainfall days in west–Central India varies from 4 to 16 days with considerable interannual variability, and all the datasets converge except the reanalysis datasets, which show higher values. The amount of heavy precipitation varies from 200 to 600 mm/season with interannual variability, and all the datasets show good convergence in this amount over west–Central India. Both the number of heavy rainfall days and the amount of rainfall from heavy rainfall (90 percentile rain category) show a slight increasing trend over this region.

10.6 Peninsular India

The number of heavy rainfall days in peninsular India varies from 4 to 16 days with interannual variability. The amount of heavy precipitation varies from 100 to 400 mm/season with interannual variability. All the datasets show good convergence over peninsular India.

Large discrepancies between the reanalysis datasets and the other datasets appear only in the number of heavy rainfall days; the heavy rainfall amount in the reanalysis datasets is close to that of the other datasets over most of the meteorological subdivisions. Over the hilly region, large discrepancies appear both in heavy rainfall days and in the amount of heavy precipitation. These could be attributed to sparse rain gauge observations and to the difficulty of representing observed rainfall along the orography in the reanalysis.

11 Summary and discussion

Daily precipitation datasets of varied resolution, derived from different platforms (satellite, rain gauge, merged and reanalysis data), are evaluated for accuracy for the period 1998–2007 over the Indian subcontinent. The datasets are analysed to understand the spatio-temporal variability of heavy precipitation during the summer monsoon season of the Indian subcontinent. The rain-gauge-only data (IMD and Aphrodite) and the satellite precipitation data (TRMM V6 and V7) are evaluated solely to determine the ability of the merged data products (GPCP-1DD and CPC) and of the rainfall forecast by the reanalyses (ERA-INT and MERRA) to capture heavy rainfall activity during dry and wet monsoon years over India.

Day-to-day variations of the monsoon rainfall are compared using daily rainfall data over the Central Indian region, as it is a better representation of large-scale Indian monsoon variability. Although the datasets differ quantitatively, all of them capture the gross features of the mean and heavy rainfall over the Indian region. The heavy rainfall contribution to the mean precipitation during the summer monsoon season (JJAS) is significant.

The IMD data are of coarse resolution: when the station data are interpolated to 1° latitude × 1° longitude resolution, localized rainfall events can become smoothed. Thus, the analysis indicates that high-resolution data are useful for understanding the spatio-temporal variation of rainfall over the Indian subcontinent. A high-resolution dataset based on IMD gauge measurements would be ideal for analyses of extreme precipitation.

The study also brings out the role of remote sensing (satellite data: TRMM V6 and V7) in monitoring climate variability, by providing continuous high-resolution datasets, and in studying the climate of data-sparse regions. The analysis of heavy precipitation using TRMM-3B42 V6 and V7 and the merged CPC and GPCP-1DD rainfall estimates for all the available years (2001–2007), carried out to determine the spatial and temporal variation of summer monsoon rainfall, demonstrates the importance of heavy rainfall to the seasonal mean and to the availability of water in the season.

The spatial patterns of the rainy days and heavy precipitation indices show a strong resemblance with the spatial pattern of the seasonal rainfall. The spatial distribution of the indices shows large interannual variability over India.

To quantify rainfall activities during the summer monsoon season, indices for heavy precipitation were computed for the dry and wet years from 2001 to 2007. All the heavy precipitation indices showed a negative anomaly during the dry years (2001, 2002 and 2004). This indicates that a monsoon drought is associated with a drastic reduction in heavy rainfall activities compared to the normal and wet monsoon years (2003, 2005 and 2006). The study indicates that the satellite precipitation products (TRMM V6 and V7) and merged precipitation data estimated using GTS, the GOES Precipitation Index, SSM/I and AMSU-B enhance our ability to understand the features of monsoon precipitation and its spatial and temporal behaviour in more detail.

The study also establishes the ability of the reanalysis products to capture heavy precipitation (90 percentile rain category). The MERRA reanalysis product performs better than ERA-INTERIM; nevertheless, both datasets clearly capture the interannual variability and the spatio-temporal variability of heavy rainfall activity, which is a promising signal. The reanalysis products assimilate only observed winds, moisture and other surface conditions, and precipitation is a predicted quantity. In spite of model uncertainties, the reanalyses retain the basic ingredients of the precipitation characteristics (light, moderate and heavy rain) over the Indian land region. We observe a considerable bias in the number of rainy days (>90 percentile) in the reanalysis data after 2002, whereas the amount of heavy rainfall (>90 percentile) agrees with the other datasets. Because a larger number of heavy rainfall days yields a similar seasonal amount, this suggests that the reanalysis datasets are still not able to capture the intense heavy rainfall events during the monsoon season.
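The reasoning behind the last point can be made concrete with a small worked example: if two datasets report a similar seasonal heavy rainfall amount but one spreads it over more heavy rainfall days, its mean intensity per heavy rainfall day must be lower. The numbers below are hypothetical and chosen only to lie within the ranges quoted in Sect. 10; they are not values from any of the datasets.

```python
def mean_heavy_intensity(amount_mm, heavy_days):
    """Mean rainfall per heavy-rain day (mm/day) for one season."""
    return amount_mm / heavy_days

# Hypothetical season: same heavy-rain amount, different day counts.
observed_intensity = mean_heavy_intensity(400.0, 10)    # gauge-like product
reanalysis_intensity = mean_heavy_intensity(400.0, 16)  # reanalysis-like product
ratio = reanalysis_intensity / observed_intensity
```

In this example the reanalysis-like product delivers 25 mm per heavy rainfall day against 40 mm for the gauge-like product, so its heavy events are on average weaker even though the seasonal amounts match.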