1 Introduction

The Indian summer monsoon (ISM), characterized by the seasonally reversing winds during June to September, contributes more than 80% of the total annual rainfall over the Indian subcontinent (Rajeevan et al. 2013; Turner and Annamalai 2012). Having such a dominant share in the precipitation across the region, it affects not only the climatic regime but also drives and affects the agrarian economy, water resources, food security, ecosystem, and also the gross domestic product (GDP) of the country (Gadgil and Gadgil 2006). In addition to a distinct atmospheric circulation pattern, it is also characterized by variability at different spatial as well as temporal scales over India. It is known that each year ISM occurs as a result of the interactions and coupling between different systems such as atmosphere–land–ocean–cryosphere. Such interactions consisting of components functioning at different spatial and temporal scales introduce higher order of complexities while investigating the processes involved in the formation, onset, progression, and dissipation of the monsoon. Considering such complexities as well as the recent debates on climate change, it is important to develop a process-based understanding of ISM in order to plan for better adaptation and mitigation strategies for the sustenance of large mass of population. The overall understanding of the ISM and underlying processes is constrained by the limited availability of observation datasets in terms of limited number of variables and their spatial as well as temporal coverages. In this regard, numerical models have been found instrumental while investigating different processes associated with the ISM (Bhaskaran et al. 1996; Ji and Vernekar 1997; Goswami 1998; Wang et al. 2005; Dash et al. 2006; Bollasina et al. 2011). It was demonstrated that most of the global climate models (GCMs) working at relatively coarser resolution are able to capture the large-scale features of ISM (Krishnamurti et al. 
2002; Wang et al. 2005; Saeed et al. 2011; Sharmila et al. 2015). Although large-scale features and dynamics are realistically simulated in GCMs, they do not resolve the effects of topography, land use, regional climate forcing, and the associated feedbacks on account of their coarser resolution. Simulations of the ISM using GCMs suggest that most of them poorly simulate the mean monsoon rainfall distribution over the west coast, north Bay of Bengal, and northeastern part of India, with large systematic biases (Sperber and Palmer 1996; Kripalani et al. 2007; Rajeevan and Nanjundiah 2009; Sperber et al. 2013). However, high-resolution simulations using GCMs have shown better performance in representing ISM rainfall compared to coarser resolution simulations (Rajendran and Kitoh 2008). Owing to their higher computational requirements, such simulations are not feasible, especially for longer periods. Considering such shortcomings in the GCMs, there has been considerable progress in the development of limited area models in recent times. Such models dynamically downscale the outputs of GCMs to provide regional climate information at higher resolution with representation of topography, land use, and parameterization of the physical processes which cannot be represented on a larger scale (Sun et al. 2006). It has been shown that these models, also known as regional climate models (RCMs), have a wide range of applications starting from process-based studies of climate to future projections for impact and adaptation studies (Jacob and Podzun 1997; Huntingford et al. 2003; Jha et al. 2004; Bhaskaran et al. 2012; Kumar et al. 2006, 2011; Nunez et al. 2009). Although RCMs have been found to potentially add value to the simulations compared to their driving counterparts (Feser et al. 2011; Lucas-Picher et al. 2012; Lee and Hong 2014; Di Luca et al. 2012), there also exist associated uncertainties, which limit the confidence in their performance. 
These uncertainties include the dependence of RCMs on the parent forcing for initial and lateral boundary conditions (Laprise et al. 2008), domain size (Lucas-Picher et al. 2011), parameterization schemes (Déqué et al. 2007; Nobre et al. 2001), uncertainty due to the use of different greenhouse gas scenarios (Déqué et al. 2007), and the internal variability of the RCM (Laprise et al. 2008). Despite their capability for high-resolution simulation, these models have been found incapable of resolving the scale interactions between regional and global levels. Additionally, RCMs are not able to match the scale and nature of the data at point locations as collected at meteorological stations, observatories, ocean buoys, etc. (Rummukainen 2010). It is also reported that the RCMs have a tendency to overestimate the precipitation along orographic features (Kumar et al. 2013; Mathison et al. 2013) in addition to the cold bias in temperature over these regions (Giorgi et al. 2004; Dimri et al. 2018a, b). The higher computational requirement in pursuance of the higher resolution is also one of the disadvantages of RCMs. Despite such demerits, there has been a marked increase in the use of RCMs to study different climatic processes across the globe (Solomon 2007). A number of efforts have been carried out to study different climatic processes including ISM over the Indian region at individual levels using RCMs (Bhaskaran et al. 1996; Ji and Vernekar 1997; Dash et al. 2006; Kumar et al. 2006, 2011, 2013, 2015; Lucas-Picher et al. 2011; Mathison et al. 2013; Dimri et al. 2013; Maharana and Dimri 2014, 2016; Ghimire et al. 2015; Umakanth et al. 2015; Nengker et al. 2017; Choudhary and Dimri 2017; Choudhary et al. 2018; Kumar and Dimri 2018).

It has been suggested that, for summer monsoon systems, convective latent heat release drives the large-scale circulation by providing a vast amount of energy (Krishnamurti and Ramanathan 1982), and thus, the convective parameterization has a significant impact on the simulation of the monsoon (Slingo et al. 1988). Besides the release of latent heat, the cumulus convection also affects the large-scale circulation through the vertical transport of heat, moisture, and momentum. In turn, the large-scale circulation controls the organization and development of cumulus convection and apparently the clouds (Wu et al. 2007a, b). It was shown that the cumulus parameterization schemes in the numerical models improve the simulation of the low-level southwesterly jet associated with ISM (Zhang 1994). In addition, poor representation of convective processes in numerical models might further contribute to errors in other climatic components such as cloud radiative properties, water cycle, and the variability in the climatic processes at scales ranging from diurnal to interannual (Stevens and Bony 2013). Therefore, it is important to accurately represent these processes, which occur at subgrid scales (typically less than 1 km) and are in most cases parameterized with tunable parameters, in the RCM numerical formulation (Pal et al. 2007). For representing cumulus convection in a model environment, different convective parameterization schemes (CPs) are available such as Kuo (Anthes et al. 1987), Grell (1993), Emanuel (1991), and Tiedtke (1989). However, it has been demonstrated that no specific scheme tends to work better than the others (Hong and Choi 2006; Kang and Hong 2008) and their performance varies depending upon individual cases, periods of simulation, region, and their interaction with other physical processes inside the modeling framework. 
The choice of a suitable convective parameterization scheme is important especially in case of ISM, as most of the cloud formation during the ISM season occurs as convective clouds, which play an important role in redistributing the heat and moisture (Mohanty et al. 2005). Adopting a common approach to test the suitability of different CPs, several studies have suggested that the numerical simulations of ISM are sensitive to the choice of CPs (Dash et al. 2006; Mukhopadhyay et al. 2010; Taraphdar et al. 2010; Srinivas et al. 2013; Bhatla et al. 2016) under different modeling frameworks.

In reference to ISM, it is difficult to deduce the most suitable convection scheme for a downscaling experiment from the available literature, because different studies find different CPs to perform better even within similar modeling frameworks. Das et al. (1988) compared different versions of Kuo-type schemes in simulating different phases (pre-onset, onset, and break) of ISM and reported that the modified Kuo-type scheme compared better with the observations. Alapaty et al. (1994) used two different CPs in a nested regional model and concluded that the Kuo scheme performed better in representing different dynamical features of ISM. Dash et al. (2006) using RegCM3 concluded that the Grell scheme outperforms the Kuo-type convection scheme in simulating different characteristics of ISM including the total seasonal mean rainfall. Ratnam and Cox (2006) found that, although the large-scale features of monsoon depressions were realistically simulated in the Grell and Kain–Fritsch schemes, the location could not be captured in either scheme. Despite biases in the simulation of different categories of rain rates, the Grell–Devenyi scheme was found to reproduce seasonal mean monsoon precipitation close to observation using the nested WRF model (Mukhopadhyay et al. 2010). Singh et al. (2011) using the MM5 model suggested that both the Grell and Kain–Fritsch schemes were able to reproduce the large-scale features of monsoon depression; however, their location could not be captured in either of the schemes. Adopting an objective approach, Giorgi et al. (2012) suggested that a combination of schemes, i.e., Grell over land and Emanuel over ocean, might be suitable for climate simulation over multiple Coordinated Regional climate Downscaling Experiment (CORDEX) domains across the world. Srinivas et al. 
(2013) conducted sensitivity experiments for ISM using the WRF model and concluded that with least bias and higher correlations, the Betts–Miller–Janjic scheme performs better in capturing low, moderate, and high rainfall events within the season. Interestingly, a GCM-driven sensitivity experiment using the RegCM model at two different resolutions and multiple convection schemes was carried out by Sinha et al. (2013).

In addition to the sensitivity of the seasonal mean precipitation to the choice of convective physics, it was demonstrated that the representation of intraseasonal oscillations is also sensitive to CPs (Umakanth et al. 2015). Another attempt for evaluating the performance of convection scheme over the CORDEX-South Asia domain by Raju et al. (2015) concludes that the mixed-type scheme (Emanuel over land and Grell over ocean) realistically represents the precipitation and temperature, large-scale circulation feature, annual cycle of precipitation and temperature, and northward propagation of the monsoon intraseasonal oscillations. With further deliberation, Bhatla and Ghosh (2015) suggested that for the simulation of break phases of monsoon, the Grell scheme displays higher utility as compared to others. Moreover, sensitivity of simulation of monsoon onset for different CPs has also been shown with higher suitability of the Tiedtke-type scheme in an RCM experiment (Bhatla et al. 2016). Maity et al. (2017a, b) using a sensitivity experiment concluded that in terms of overall performance, the MIT-Emanuel convection scheme can be used for the simulation of seasonal and monthly features of ISM, although there is underestimation of seasonal as well as monthly rainfall in all the convection schemes tested in their study. Another study by Nayak et al. (2017) suggests greater suitability of the MIT-Emanuel and Grell schemes in simulation of precipitation and temperature over the Indian region. A sensitivity study for testing different CPs for simulating wintertime precipitation over western Himalaya suggests better performance of Grell schemes in representing large-scale, seasonal mean patterns and interannual variability of precipitation for two contrasting seasons (Sinha et al. 2015).

It was proposed that besides the climate noise, a significant portion of the interannual variability of monsoon is contributed by slowly varying boundary conditions like sea surface temperature (SST), snow cover, soil moisture, sea ice, etc. (Goswami and Xavier 2005; Krishnan et al. 2009). Deducing further, it was proposed that half of such variability could arise due to land–atmosphere interaction and the influences of soil moisture and ground hydrology. Further quantification by Saha et al. (2011) indicated that 30–35% of the year-to-year variability arises due to pre-onset rainfall activities during May and its associated feedbacks from the land–atmosphere interaction. In a study using GCMs, it was found that the surface warming driven by negative biases of sensible heat fluxes over the land regions and the tropospheric warming through latent heat flux over the northern Asian region result in the delayed development of the meridional differential heating gradient. This, in turn, results in a delayed setup of an active convection zone across the area (Ashfaq et al. 2017). The feedback of soil moisture anomaly during active (break) phases and their subsequent role in modulating the favorable/unfavorable condition for the following active (break) phase was investigated by Saha et al. (2012). A study based on satellite-derived soil moisture data from the Tropical Rainfall Measuring Mission (TRMM) during 1998–2008 suggests a significant decreasing trend in soil moisture along with an increasing trend of evapotranspiration over many parts of the world including India (Jung et al. 2010). Therefore, representation of surface fluxes, hydrology, and land surface feedbacks is essential for simulating the ISM (Halder et al. 2015).

Previous studies suggest that the RCM experiment using the Community Land Surface Model (CLM) version 3 (Oleson et al. 2004; Steiner et al. 2009) tends to produce less precipitation than the one using the Biosphere–Atmosphere Transfer Scheme (BATS) as a land surface model (Dickinson et al. 1993). In addition, smaller precipitation bias over Africa (Steiner et al. 2009), the western Himalaya using RegCM-CLM3.5 (Tiwari et al. 2015), over the central Indian region (Maurya et al. 2017), Chinese region (Gao et al. 2016), and Tibetan region (Wang et al. 2015) is reported while using CLM as a land surface model in RCM. Although lesser mean bias was found for CLM, it was shown that in terms of mean and interannual variability of precipitation, the BATS scheme performs better than CLM (Halder et al. 2015). Interestingly for wintertime precipitation, CLM configuration was found to dominate the BATS scheme in a downscaling experiment using RCM over western Himalaya (Tiwari et al. 2015). On the other hand, a realistic simulation of precipitation and temperature during wintertime over western Himalaya has been reported by Dimri (2009) using a unique mosaic-type subgrid parameterization scheme. The subgrid scale land use scheme has also been implemented over an alpine region, which portrayed better representation of surface air temperature and surface hydrology (Giorgi et al. 2003). Different responses of CPs using different land surface schemes have been demonstrated for BATS and CLM3 (Kang et al. 2014; Li et al. 2015). Such differential response makes it imperative to ascertain appropriate CPs for different land surface schemes in numerical experiments. Sensitivity experiments for CPs using different land surface models (hereafter LSMs) have been carried out in several studies. Over the Indian region, Nayak et al. 
(2017) concluded that surface temperature and precipitation simulation by the model was sensitive to convection as well as the choice of land surface parameterization as precipitation simulation was better in BATS. Maity et al. (2017a, b) in a comparative study of contrasting monsoon years concluded that, in a coupled RegCM-CLM3.5 framework, the MIT scheme was found to be more skillful in simulating ISM.

The current study therefore aims at selecting the more appropriate convection as well as land surface scheme for the simulation of ISM using the RegCM4 suite. The current study is unique in the sense that subgrid scale land surface parameterization has not been tested for the simulation of ISM in particular. Moreover, there is a knowledge gap regarding the appropriate CPs to be used with the CLM4.5 land surface model coupled to RegCM4 for further experiments. This study has been undertaken to fill this gap and to highlight the systematic errors in the simulations. This might further provide avenues for testing the capabilities of such simulations as part of the ongoing CORDEX program over the South Asia domain.

The paper is organized as follows: Section 2 discusses the data and methods, Section 3 provides the results and discussion, and then a summary of the work is given in Section 4.

2 Data and methodology

2.1 The regional climate model

The regional climate model RegCM4 (v4.4.5.5, Giorgi et al. 2012), developed at the Abdus Salam International Centre for Theoretical Physics, is used in the study. It is an evolved version of RegCM3 with improved physics and features, which enhance the model performance over tropical and subtropical regions as compared to previous versions. It is a compressible, hydrostatic core model with terrain following vertical σ-coordinates capable of using different combinations of CPs over land as well as oceans, referred to as the mixed-type convective parameterization approach. Giorgi et al. (2012) suggested that such mixed-type schemes might be better in simulation of climate across different CORDEX domains. RegCM4 has been used for various studies ranging from seasonal to climate change simulations. RegCM4 uses Arakawa-B grid where horizontal components of velocity (U and V) are prescribed at dot points and temperature, pressure, and humidity fields are represented at cross points, respectively (Arakawa and Schubert 1974). In addition to the CPs in the model, it uses the radiative transfer scheme similar to the NCAR global model CCM3 (Kiehl et al. 1996) for the parameterization of radiation. The planetary boundary layer (PBL) scheme of Holtslag et al. (1990) and a new PBL scheme developed by the University of Washington (Bretherton et al. 2004), called UW-PBL, are implemented in the recent versions. Besides the cumulus parameterization, large-scale resolvable precipitation has been represented by subgrid explicit moisture (SUBEX) scheme, which accounts for a prognostic equation of cloud water (Pal et al. 2000). The model further includes options for the ocean flux parameterization scheme, interactive aerosol, microphysics, lake models, etc.
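As an illustration of the terrain-following σ-coordinate mentioned above, the following minimal sketch converts σ levels to pressure using the standard definition σ = (p − p_top)/(p_surface − p_top). The function name, the evenly spaced levels, and the numerical values of surface and model-top pressure are hypothetical and chosen only for illustration; they are not taken from the model configuration used in this study.

```python
import numpy as np


def sigma_to_pressure(sigma, p_surface, p_top):
    """Return pressure on sigma levels: p = sigma * (p_surface - p_top) + p_top."""
    sigma = np.asarray(sigma, dtype=float)
    return sigma * (p_surface - p_top) + p_top


# Hypothetical values for illustration only (not the study's configuration).
# 19 evenly spaced interfaces bound 18 layers; real models use uneven spacing.
sigma_levels = np.linspace(0.0, 1.0, 19)
pressure_hpa = sigma_to_pressure(sigma_levels, p_surface=1000.0, p_top=50.0)
```

Because the surface pressure varies with terrain, a given σ level follows the topography: σ = 1 always lies at the surface and σ = 0 at the model top.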

2.1.1 Land surface models

The RegCM4 could be coupled to any of the three different land surface models as of now. The default land surface calculations are carried out using BATS (Dickinson et al. 1993) within the framework of an atmospheric model. BATS is an updated land surface scheme incorporating the interplay of vegetation fraction and soil moisture in modifying the surface exchange of fluxes of energy, momentum, and water vapor through the land–atmosphere interaction. The model consists of three different soil layers as a surface layer (~ 10 cm thick), a root zone layer (~ 1–2 m thick), and a third deep soil layer (~ 3 m thick) along with a snow layer and vegetation layer with 20 different vegetation types. A generalized force-restore method of Deardorff (1978) is used to solve the prognostic equations for soil temperature. A diagnostic energy balance approach is applied for the calculation of canopy and its foliage temperature, which essentially includes sensible, radiative, and latent heat fluxes. In addition, BATS uses 17 different soil texture classes ranging from coarse (sand) to intermediate (loam) and fine (clay) and different soil colors for soil albedo calculations following FAO specifications. The recent update to the BATS scheme introduces the urban and suburban classes to the land use categories, thereby providing the avenues for detailed representation of impervious surfaces and associated parameters.

With certain modifications to BATS, an account of the subgrid scale variability of topography and land use has been incorporated to the RegCM package (Giorgi et al. 2003). For this, a mosaic-type approach is adopted to disaggregate each coarser model grid cell into regular fine-scale surface grid. The meteorological variables are disaggregated from the parent coarse grid cell depending upon the elevation difference among the grids, and then calculation of surface fluxes is performed using BATS at these subgrid cells separately and reaggregated to the coarser grid later on by simple averaging. This is performed as a two-way interaction between the atmospheric model and BATS. This is based on the input of solar and infrared downward radiative fluxes, precipitation and near surface air temperature, water vapor, wind speed, pressure, and density from the atmospheric model to BATS. Further, after calculation, the output in the form of albedo, upward infrared flux, momentum flux (wind stress), and sensible and latent heat flux (or evaporation) is returned to the atmospheric model. As no subgrid disaggregation of precipitation is carried out in this approach and such subgrid scale variability does not affect the formation of precipitation, it has low sensitivity toward precipitation formation especially during winter due to dominance of dynamical processes (Giorgi et al. 2003). However, it is suggested that higher sensitivity of summer time precipitation can be expected because of simple disaggregation of convective precipitation and due to the dominant forcing of the surface fluxes, which are apparently affected during the subgrid surface calculations.
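The disaggregation and reaggregation steps described above can be sketched as follows. This is a simplified illustration, not the RegCM implementation: it assumes a constant lapse rate (6.5 K per km, an assumed value), a hypothetical 2 × 2 subgrid, and treats only near surface air temperature; the function names are likewise hypothetical.

```python
import numpy as np

# Assumed constant lapse rate in K per m; not taken from the paper.
LAPSE_RATE = 6.5e-3


def disaggregate_temperature(t_coarse, z_coarse, z_sub):
    """Adjust the coarse-cell air temperature to each subgrid cell
    according to its elevation difference from the coarse cell."""
    z_sub = np.asarray(z_sub, dtype=float)
    return t_coarse - LAPSE_RATE * (z_sub - z_coarse)


def reaggregate(field_sub):
    """Return a subgrid field to the coarse cell by simple (unweighted) averaging."""
    return float(np.mean(field_sub))


# Hypothetical 2 x 2 subgrid elevations (m) inside one coarse cell.
z_sub = np.array([[800.0, 1200.0], [1000.0, 1400.0]])
z_coarse = z_sub.mean()

t_sub = disaggregate_temperature(285.0, z_coarse, z_sub)
t_back = reaggregate(t_sub)
```

In this sketch the lowest subgrid cell receives the warmest temperature, and averaging the subgrid temperatures recovers the coarse-cell value because the elevation adjustment is linear. In the scheme described in the text, it is the surface fluxes computed by BATS at each subgrid cell, rather than the temperature itself, that are averaged back to the coarse grid.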

For improving the representation of land surface processes, CLM3.5 was introduced to the RegCM framework. CLM3.5 consists of 10 different soil layers up to the depth of 2.864 m (Lawrence et al. 2008). Different land use classes are represented in a grid cell of the CLM3.5 model as multiple columns. Under these columns, vegetation cover is represented using a maximum of 4 different static plant functional types from a set of 17. Although these functional types do not change with time, their leaf area index and stem area index vary seasonally. The land surface calculations are carried out for these functional types and columns of different land use categories and simply averaged for reaggregation at the coarser grid. Unlike BATS, CLM3.5 uses soil texture information from a global high-resolution dataset from the International Geosphere Biosphere Programme (Bonan et al. 2002) which has varying contents of sand and clay in each layer. Therefore, in case of CLM3.5, a more detailed description of the soil texture is provided and soil properties also vary with depth unlike BATS. Steiner et al. (2009) have shown that due to better representation of the land surface exchanges of moisture, energy, and associated feedbacks, CLM3.5 outperforms BATS.

A recent update to the existing versions of CLM3.5 that culminated into CLM4.5 includes updates in canopy radiation scheme, canopy scaling of leaf processes, and improvement in the representation of the photosynthesis processes (Bonan et al. 2011, 2012). Among other updates, in CLM4.5, wetland units are replaced by surface water stores allowing for prognostic wetland distribution modeling. For different categories of the land cover such as snow-covered, water-covered, and snow/water-free portions of vegetated and other cropland units as well as snow-covered and snow-free parts of glacier units, separate calculation of surface energy fluxes is provisioned (Swenson and Lawrence 2012). An improved and vertically resolved soil biogeochemistry scheme is also included which accounts for vertical mixing of soil carbon and nitrogen due to different processes (Koven et al. 2013).

2.2 Experimental design

The domain for simulation is shown in Fig. 1. This domain encompasses the region between 22° S–50° N and 10–130° E. The region is adequately large to capture the ISM circulation and the cross-equatorial flow. The initial and lateral boundary information for the simulation has been obtained from the ERA-Interim reanalysis having 1.5° horizontal, 6-hourly temporal resolution with 37 vertical levels (Dee et al. 2011). Weekly sea surface temperature (SST) values for the simulation were prescribed from the optimally interpolated SST dataset (OI_SST, Reynolds et al. 2002) from the National Oceanic and Atmospheric Administration (NOAA) at 1° horizontal resolution. The input for topography was obtained from the GTOPO digital elevation model from the United States Geological Survey (USGS), and the land surface classes were prescribed from the Global Land Cover Characterization (GLCC) dataset. For the CLM4.5 model, the plant functional types are prescribed from the National Centre for Atmospheric Research (NCAR) datasets. Three different cumulus convection schemes, namely, the MIT-Emanuel scheme (Emanuel 1991), Grell scheme (Grell 1993), and the mixed-type scheme (Giorgi et al. 2012), have been used to simulate the monsoon years of 1987, 1988, and 1990 (deficit, excess, and normal, respectively; Tyagi et al. 2012) in conjunction with three different land surface models. The land surface models include the control run (BATS is used in default mode), SUB-BATS run (BATS is used with two subgrid disaggregation), and CLM4.5. The model simulations were carried out at a horizontal resolution of 50 km with 18 vertical levels. For each individual year, the model has been integrated from 01 April to 31 October. The first 2 months and the last month have been excluded from further analyses, leaving the period of June–September (JJAS) for each year; the first 2 months thus serve as the spin-up period for model stabilization. 
The combinations of CPs and land surface models for different years resulted in 27 different sets of simulations. The detailed model configuration is presented in Table 1. For the sake of discussion, acronyms are used in the further sections for cumulus parameterization schemes, namely, Grell with Fritsch–Chappell closure (GFC), MIT-Emanuel (MIT), and mixed type (MIX). The name of the land surface treatment, i.e., control for BATS, SUB-BATS for subgrid disaggregation of BATS, and CLM4.5 for Community Land Model 4.5, has been suffixed to the CPs in order to refer to a particular experiment in the subsequent sections. The names of different combinations of experiments are listed in Table 2.
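The experiment matrix described above can be enumerated with a short sketch. The acronyms, years, and naming convention (convection scheme followed by the land surface treatment) follow the text; the variable names, the dictionary layout, and the helper function are hypothetical and serve only to illustrate how the 27 simulations and the JJAS analysis window arise.

```python
from itertools import product

# Acronyms as defined in the text.
CONVECTION = ["GFC", "MIT", "MIX"]
LAND_SURFACE = ["Control", "SUB-BATS", "CLM4.5"]
YEARS = {1987: "deficit", 1988: "excess", 1990: "normal"}

# One entry per (convection scheme, land surface treatment, year).
experiments = [
    {"name": f"{cp}-{lsm}", "year": year, "monsoon": kind}
    for cp, lsm in product(CONVECTION, LAND_SURFACE)
    for year, kind in YEARS.items()
]


def is_analysis_month(month):
    """True for June-September (JJAS). April-May are treated as spin-up
    and October is dropped from the 01 April to 31 October integration."""
    return 6 <= month <= 9


analysis_months = [m for m in range(4, 11) if is_analysis_month(m)]
```

Nine scheme combinations run for three years each give the 27 simulations, and filtering the April to October integration leaves the four JJAS months for analysis.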

Fig. 1

Model simulated surface elevation (m) over the study area (20° S–50° N and 10–130° E, CORDEX-South Asia domain). The square box denotes the monsoon core zone (18–28° N and 73–82° E) used in the study

Table 1 Model configuration used in the study
Table 2 List of combinations of land surface parameterization scheme and convection parameterization schemes for each year of simulation (i.e., excess, deficit, and normal monsoon)

For the validation of the model simulations for precipitation, daily observed gridded precipitation dataset for different monsoon years has been taken from the India Meteorological Department (IMD) gridded product at 0.5° horizontal resolution (Rajeevan and Bhate 2009). The IMD dataset has been utilized for model validation over Indian landmass only. For the comparison of simulated near surface temperature (hereafter Tmean) fields, the observed dataset from the Climatic Research Unit TS4.00 (Harris et al. 2014) at 0.5° horizontal resolution has been used. For the comparison of large-scale monsoonal circulation, interpolated wind fields at 850 hPa and 0.5° horizontal resolution have been used from the ERA-Interim reanalysis (Dee et al. 2011) datasets for the mentioned period. For the comparison of surface fluxes, the FLUXNET reanalysis based on upscaled observations using the model tree ensemble technique (Jung et al. 2011) has been used.

A number of basic statistical measures, such as the seasonal mean, mean bias, pattern correlation, and Taylor diagram (Taylor 2001), have been used to evaluate the model performance with respect to the corresponding observations. Further, the performance of the model has also been investigated using different thermodynamic variables such as the vertical profile of the gradient of equivalent potential temperature (dθe/dp) and the bias of specific humidity.
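Two of the measures listed above, mean bias and pattern correlation, can be sketched as follows. This is a minimal illustration that assumes the model and observed seasonal mean fields are already on a common grid, uses missing values (NaN) to mask points without observations, and applies no area weighting; the function names and the small example arrays are hypothetical.

```python
import numpy as np


def mean_bias(model, obs):
    """Mean of (model - obs) over grid points where both fields are available."""
    return float(np.nanmean(model - obs))


def pattern_correlation(model, obs):
    """Centered, unweighted spatial correlation over commonly valid grid points."""
    mask = np.isfinite(model) & np.isfinite(obs)
    m = model[mask] - model[mask].mean()
    o = obs[mask] - obs[mask].mean()
    return float(np.sum(m * o) / np.sqrt(np.sum(m ** 2) * np.sum(o ** 2)))


# Hypothetical seasonal mean fields; NaN marks a point with no observation.
obs = np.array([[1.0, 2.0], [3.0, np.nan]])
model = obs + 0.5

bias = mean_bias(model, obs)
pcc = pattern_correlation(model, obs)
```

A uniform offset, as in this example, yields a non-zero mean bias but a pattern correlation of one, which is why the two measures are usually reported together. On a regular latitude–longitude grid, weighting by the cosine of latitude would be a natural refinement.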

3 Results and discussion

The results from the analysis of the different experiments for the respective years are discussed in the following paragraphs.

3.1 Near surface air temperature (Tmean)

The seasonal mean of Tmean for normal year of monsoon (1990) from model simulations and CRU observation are presented in Fig. 2. The model simulations tend to capture the spatial patterns of Tmean reasonably well. The comparison with observations (Fig. 2b) suggests that the model represents the colder (warmer) temperature regimes across the study area similar to the observations. The colder temperature regimes over the Himalayan mountains and Tibetan Plateau and the warmer regimes of northwestern, central, and peninsular India as well as of the Arabian Peninsula are by and large reproduced in all the experiments; however, their magnitudes are differently represented for different experiments. In general, a warmer mean temperature with varying magnitudes over northwestern India is simulated in all the experiments using the MIT scheme irrespective of different land surface schemes. Similar warmer temperatures while using MIT schemes have also been reported by Raju et al. (2015). The spatial patterns are more closely resembled in experiments using the GFC and MIX-type schemes; however, the cold (warm) biases remained. All the CLM4.5 experiments tend to produce colder seasonal mean at higher elevation sites of the Himalayas and Tibetan Plateau compared to SUB-BATS and control experiments. Similar to the spatial features of Tmean during normal monsoon year, all the experiments display reasonably accurate seasonal mean over the study region for deficit and excess year of simulation (Figs. S2 and S3 in the Supplementary information). Again, the magnitude is reproduced differently, while MIT schemes simulate colder (warmer) seasonal mean over various parts as compared to other experiments and observations. On the other hand, GFC schemes (including the mixed type) simulate less temperature magnitudes over land as compared to the MIT scheme. 
An improvement in the representation of surface temperature is seen especially over the northwestern and central parts of India while comparing the control and SUB-BATS land surface schemes using GFC cumulus parameterization. Such improvements in the spatial pattern of Tmean are also visible for deficit and excess year of simulation (Figs. S1 and S2). Further, in order to validate the performance of different experiments and to highlight the systematic errors in the simulation, mean bias in the Tmean with respect to the CRU observation dataset has been calculated. Figure 3 shows the spatial distribution of bias in Tmean with respect to observation for the normal year of monsoon. All the experiments show cold bias over the western Himalayan region with varying magnitudes. In all the LSMs using the MIT scheme, a prominent cold bias over western Himalaya and a warm bias (~ 2–5 °C) across northwestern and central India and the Indo-Gangetic plains are seen. The subgrid disaggregation technique tends to improve the bias in control simulation as the magnitude of bias is reduced in SUB-BATS simulation using the MIT scheme. The CPs based on the GFC and MIX-type approach simulate a widespread cold bias (~ 2–5 °C) over the Indian landmass except for a few improvements in SUB-BATS experiments over the central Indian region. Interestingly, CLM4.5 experiments using different CPs improve the Tmean simulation as compared to control and SUB-BATS. This could be attributed to comparatively greater sensible heating in the CLM4.5 set of experiments than others (Figs. S17–19). The experiments with the MIT scheme represent the spatial pattern of bias similar to others, but their magnitudes are comparatively higher. Similar bias magnitudes are reported in previous studies (Zou et al. 2014; Tiwari et al. 2015). 
Although the amplitude of cold bias over the western Himalaya is higher in this set of experiments, the biases seem to minimize over the northern plains and peninsular regions of the Indian landmass suggesting better simulation of Tmean using the GFC and MIX type of CPs corroborating the findings of Maity et al. (2017a, b). Such reduced cold biases have also been illustrated by Maurya et al. (2017) as they describe better representation of surface hydrology in CLM4.5 leading to such traits in Tmean simulation. Again, corresponding to the warmer Tmean over most of the parts in case of the MIT scheme, warm bias is found irrespective of different LSMs (Fig. 3a–c). Warm bias was also reported by Nayak et al. (2017) using the MIT scheme with CLM3.5 coupled in RegCM4. Such behaviors of the MIT scheme were caused due to higher sensible heating in CLM3.5 as compared to the BATS scheme. The reason for such behavior has also been explained in terms of the simulation of fractional cloud cover by Maity et al. (2017a, b). Their study suggests that, in case of the MIT scheme, lesser value of fractional cloud cover is simulated, which allows the larger amount of solar radiation to reach the surface thereby increasing the sensible heat fluxes in these set of simulations. Interestingly, in the case of MIT-CLM4.5 experiments, the magnitudes of warm biases are less as compared to the MIT-Control and MIT-SUB-BATS experiments mostly due to lesser sensible heat fluxes compared to the latter (Fig. S17 in the Supplementary information). Again, the cold bias in CLM4.5 simulations over higher elevations can be attributed to comparatively higher soil moisture flux (data not presented) inhibiting the sensible heating near the surface in the simulations. Similarly, for deficit and excess year of simulation, the pattern of Tmean bias portrays almost a similar kind of signatures in all the experiments except a few, where magnitudes are slightly different (Figs. S4 and S4).
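
The mean bias used throughout this section can be illustrated with a minimal sketch. It assumes that the model output and the observation have already been regridded to a common grid and stored as arrays of shape (time, lat, lon); the function name and the NaN convention for missing observations are illustrative assumptions, not details taken from the study.

```python
import numpy as np


def seasonal_mean_bias(model, obs):
    """Model-minus-observation bias of the seasonal (time) mean.

    model, obs: arrays of shape (time, lat, lon) on a common grid.
    NaN marks cells without observations (assumed convention, e.g.
    ocean points in a land-only dataset); NaN propagates to the bias.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    if model.shape[1:] != obs.shape[1:]:
        raise ValueError("model and observation grids differ")
    # Average each dataset over the JJAS time axis, then difference the maps.
    return model.mean(axis=0) - obs.mean(axis=0)
```

Positive values then correspond to a warm bias and negative values to a cold bias at each grid cell.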

Fig. 2

Seasonal mean near surface air temperature (°C) for JJAS season during normal monsoon year (1990) from different experiments (a (a–i)) and the Climatic Research Unit (CRU) observation dataset (b)

Fig. 3

Mean near surface air temperature bias (°C) with respect to the Climatic Research Unit (CRU) observation dataset from different experiments (a–i) for normal monsoon year (1990)

3.2 Mean sea level pressure

The spatial distribution of mean sea level pressure (hereafter MSLP) is an important factor for the onset and progression of ISM circulation. The JJAS seasonal mean of MSLP from different experiments as well as the ERA-Interim dataset over CORDEX-SA domain is presented in Fig. 4. Corresponding to the general notion of the land–sea heating contrast, an asymmetric distribution of MSLP over land and ocean is also reasonably simulated in all the model experiments except a few using CLM4.5 as LSM. The widespread low-pressure belt extending over northwestern India to the upper eastern coast of India is well reproduced in all the experiments using all the CPs while BATS is used. In particular, the simulations with MIT schemes tend to produce a stronger land–sea contrast as compared to other experiments. The experiments such as the MIT-Control, MIT-SUB-BATS, GFC-BATS, GFC-SUB-BATS, etc. simulate the spatial pattern of MSLP more precisely than others when compared to the corresponding ERA-Interim reanalysis for the normal year of monsoon. The strong low-pressure cells in the MIT scheme might be related to higher Tmean in association with higher convective activity over this region. Such behavior in the low pressure is also explained by stronger sensible heating near the surface (Fig. S17a&b) and corresponding rising motion in the latitudinal belt of 25–30° N (Fig. 14a, b). A deepening of the low pressure over the Tibet region is simulated in most of these experiments using BATS in all CPs.

Fig. 4

Spatial distribution of mean sea level pressure for JJAS season during normal monsoon year (1990) from different experiments (a (a–i)) and ERA-Interim dataset (b)

In the control and SUB-BATS experiments, lesser MSLP over western Himalaya and Tibet region is simulated as compared to CLM4.5 set of experiments. For CLM4.5 experiments, higher values of MSLP are simulated for all the CPs. This indicates higher stability in the vertical atmosphere in association with inhibited convective activities over the Indian landmass. In particular, the simulations with GFC-CLM4.5 and MIX-CLM4.5 combination simulate weaker land–sea contrast of MSLP leading to a possible low-level divergence associated with high-pressure values over most of the land region. Although the distribution of MSLP over the Indian landmass is comparable to observations in SUB-BATS simulations under different CPs, an unusually low-pressure system is incorrectly represented over Tibetan highlands. For deficit year of monsoon, the spatial pattern of MSLP is similar to that of the normal year (Fig. S6). However, some specific signatures in the spatial patterns are visible in different experiments for the excess year of simulations. This includes the incorrect simulation of land–sea pressure gradient in case of the experiments with the GFC and MIX type CPs (Fig. S8). Further, a marked improvement in the representation of trough over Tibetan highlands is apparently seen in this year (refer to Fig. S8). Moreover, CLM4.5 experiments show even higher MSLP over the Indian landmass especially in the GFC-CLM4.5 scheme, indicating improper simulation of monsoon characteristic in terms of land–sea contrast and pressure differences. Further deliberation on mean bias for different years under consideration reveals that all the experiments simulate higher magnitude of negative biases over Tibetan highlands and western Himalayas. These regions are characterized by higher topographic features, indicating the possible role of topography in the simulation of lower magnitudes of MSLP. 
The underestimation of MSLP over land has also been linked to the warm bias in the simulations (Lucas-Picher et al. 2011). The warm biases in the simulations lead to heat low over northwest India and Pakistan, which in synergy with the underestimated MSLP affect the differential heating over land and ocean and thus affect the large-scale low-level circulation. The biases in the MSLP from different experiments calculated against ERA-Interim reanalysis are presented in Fig. 5. Except for the CLM4.5 experiments, all the other experiments show mixed pattern of bias over the Indian landmass region. The magnitude of bias over the land regions is comparatively less in the experiments other than those using CLM4.5. Interestingly, subgrid disaggregation of BATS in the case of the MIT scheme does not seem to improve the MSLP simulation when compared with the control simulation. This is evident from similar spatial patterns of bias of MSLP over the Indian landmass. For the GFC and mixed-type schemes, slight change in the spatial pattern of bias is noticed between the control and SUB-BATS LSMs for each year of simulation (Figs. S7 and S9). A dominant pattern of high MSLP bias is seen in the experiments using CLM4.5 in each year of simulation. Most of the landmass in such simulation portrays higher magnitudes of positive bias especially while using the GFC and MIX-type schemes. This might have further implications over the simulation of onset and progression of monsoon under these experiments, which will be discussed in subsequent sections.

Fig. 5

Mean sea level pressure bias (hPa) with respect to ERA-Interim dataset from different experiments (a–i) for normal monsoon year (1990)

3.3 Low-level ISM mean circulation

The large-scale circulation comprised of the cross-equatorial flow, Somali jet, and southwesterly flow in the lower troposphere is an important feature of ISM as it is responsible for moisture incursion over the land. The mean JJAS wind at 850 hPa from different experiments has been compared with ERA-Interim reanalysis for different years of simulation. Figure 6 depicts the spatial patterns of seasonal mean of low-level wind for different combinations of CPs and LSMs for the normal monsoon year. For the simulation of normal monsoon circulation characteristics, all the experiments display coherent features to those of observation. Besides the overestimation (underestimation) of the speed in different experiments, the structure and location of the jet are well captured in the simulations. Evidently, simulations with the MIT scheme produce stronger winds penetrating through the Indian landmass owing to stronger heat low and negative biases in the MSLP. The low-level convergence associated with such low pressure drives the ISM circulation toward the central Indian region, northern Indian plains, and eastern states of India. The strength of such winds ranges up to 16 m/s off the east coast of Somalia and slows down while reaching over the land. Besides the stronger wind magnitudes over the Arabian Sea and Indian landmass, MIT schemes also simulate stronger westerlies over the Bay of Bengal. This feature is more prominent with the control and SUB-BATS experiments and can be attributed to weaker easterlies in these simulations. Although the cross-equatorial flow has comparable magnitudes between 20° S and the equator, it intensifies off the coast of Somalia and ahead. Stronger low-level jet over the west coast of India in GCM simulation has been reported with the MIT scheme by Deb et al. (2007). 
In terms of magnitude, the MIT-CLM4.5 experiment realistically produces the magnitude as well as the direction of circulation over the equatorial Indian Ocean, Bay of Bengal (BoB), and the Indian landmass. The experiments with GFC schemes in the control and SUB-BATS produce similar wind circulation. The location of the jet is more realistically simulated than that in the MIT scheme; however, there are differences in its magnitude. Again, the stronger westerly winds traverse the Indian landmass to intensify the circulation over southern BoB, similar to the MIT scheme. The stronger wind speed over BoB and adjoining regions is a prominent feature in the observations as well. Under different experiments under this study, such feature occurs due to the simulation of an extended low-pressure belt across BoB and some parts of East Asia (Fig. 4a (a, b, d, e, g, h) and b). The low-pressure system drags the southwesterly winds toward the Far East, while it crosses the central and peninsular India. The stronger magnitude of the winds in such simulations further helps them move toward BoB and Far East Asian regions. This is consistent to that reported in previous studies of Raju et al. (2010) and Mohanty et al. (2005). The magnitude of wind speed over landmass is fairly simulated in the GFC-Control and GFC-SUB-BATS schemes, especially over the central Indian region. A general notion of overestimation of wind speed for normal monsoon can also be inferred from the simulations using the MIX-type schemes. Figure 6a (g and h) represents the wind pattern from the MIX-Control and MIX-SUB-BATS schemes, respectively. Stronger winds over the Bay of Bengal are again represented in these two experiments. For the simulations using CLM4.5, different CPs simulate different kinds of behavior unlike the control and SUB-BATS LSMs. Except for the combination of MIT-CLM4.5, other experiments for the normal monsoon year display weaker circulation features especially over the Indian landmass. 
The weaker circulation in terms of the spatial coverage as well as the intensity of the jet can be a manifestation of the weaker land–sea contrast of MSLP in these two experiments. As discussed in the previous section, higher MSLP over the land offsets the gradient driving the Somali jet. For deficit and excess years of monsoon, distinct characteristics in the ISM circulation are seen. During the simulation of deficit year of monsoon, weaker circulation is captured in all the experiments. This consists of less wind speed over the equatorial Indian Ocean and Indian landmass (Fig. S10). This further culminates into lesser moisture incursion from ocean to land in the model simulations thereby affecting the precipitation simulation. For excess year of monsoon, mixed-type schemes produce comparatively weaker circulation irrespective of different LSMs, while MIX-CLM4.5 produces the weakest jet among all experiments in terms of location and intensity (Fig. S11). The model tends to produce comparable wind magnitudes for the deficit and excess year of simulations, although there are considerable differences in the observed mean wind field in the ERA-Interim dataset. The underestimation (overestimation) of the magnitude of wind speed is apparently visible in the mean bias of wind at 850 hPa, presented in Fig. 7. The experiments using the MIT scheme over ocean, i.e., MIT-Control, MIT-SUB-BATS, MIX-Control, and MIX-SUB-BATS, simulate overestimated wind speed over south of the equator as well as the equatorial Indian Ocean. In association to this, the wind speed is underestimated over the Indian landmass region in such experiments. Moreover, a general pattern of underestimation of wind magnitude over the Arabian Sea and Indian landmass is indicated in the experiments with CLM4.5 models. The experiments with GFC CPs produce lesser bias in wind magnitude over the Indian landmass, Arabian Sea, and equatorial Indian Ocean. 
A careful analysis of the spatial patterns of bias in wind magnitude for the deficit and excess years suggests that, except for the MIT experiments, the remaining experiments show smaller biases in wind magnitude with respect to the ERA-Interim reanalysis (Figs. S11 and S12).
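
As an illustration of the wind-magnitude bias discussed above, the following sketch computes the difference in wind speed between a simulation and a reference dataset at a single pressure level. It assumes that the inputs are JJAS mean zonal (u) and meridional (v) components at 850 hPa on a common grid; the function name and this choice of inputs are assumptions made for the example, and the study may instead have averaged the wind speed itself.

```python
import numpy as np


def wind_speed_bias(u_model, v_model, u_ref, v_ref):
    """Bias of wind speed (model minus reference) on a common grid.

    All inputs are arrays of the same shape holding seasonal-mean wind
    components in m/s (assumed to be on the 850 hPa level).
    """
    # Wind speed is the magnitude of the horizontal wind vector.
    speed_model = np.hypot(np.asarray(u_model, dtype=float),
                           np.asarray(v_model, dtype=float))
    speed_ref = np.hypot(np.asarray(u_ref, dtype=float),
                         np.asarray(v_ref, dtype=float))
    return speed_model - speed_ref
```

Positive values indicate overestimated wind speed and negative values indicate underestimation relative to the reference.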

Fig. 6

JJAS mean wind climatology at 850 hPa from different experiments (a (a–i)) and from ERA-Interim reanalysis dataset (b) for normal monsoon year (1990)

Fig. 7

JJAS wind bias (m/s) at 850 hPa from different experiments (a–i) against ERA-Interim reanalysis dataset for normal monsoon year (1990)

3.4 Precipitation

In order to see the ability of different combinations of CPs and LSMs in simulating the spatial features of precipitation across the study area, the seasonal daily mean precipitation (mm/day) for normal monsoon is presented in Fig. 8. Comparison with the IMD observation data (Fig. 8b) indicates that although some of the experiments are able to capture the spatial patterns in some parts of the Indian landmass, discrepancies are associated with the simulations. It is found that all the experiments represent the daily mean precipitation for JJAS differently as most of them underestimate the precipitation over certain areas, while the rest simulate an overestimated magnitude. In terms of the spatial distribution of precipitation for normal year, individual experiments produce higher precipitation over the ocean as compared to the land part. For the landmass, most of the experiments simulate a daily mean rainfall of 6–8 mm/day, and the spatial maxima of precipitation over northeast India and Western Ghats are well captured by some of them. Owing to the stronger westerlies over the Arabian Sea, Bay of Bengal, and adjacent areas, higher precipitation distribution can be seen in experiments with MIT CPs. This peculiar feature can be explained in terms of the representation of sea surface temperature and the convection in the model. It was shown that the overestimated precipitation over oceans can occur due to simulation of stronger wind magnitude (Halder et al. 2015) which is further linked to the absence of ocean–atmosphere coupling (Ratnam et al. 2009). Moreover, warmer SST could also lead to enhanced precipitation over the oceans and the adjoining regions (Singh and Oh 2007). Following this, an eastward shift in the precipitation distribution is apparently seen in most of these experiments. The experiments with the MIT scheme do not appear to capture the orographic precipitation over the Western Ghats. 
Moreover, the MIT schemes tend to produce less precipitation over the central Indian region and reasonably capture the spatial patterns over northwestern India. On the other hand, the experiments using GFC schemes tend to capture the precipitation maxima over the Western Ghats and northeast India, and the SUB-BATS scheme appears to improve the magnitude of simulated precipitation over such topographically complex regions. This trait is also visible in the precipitation patterns from the experiments using the MIX-type schemes. Despite this, the precipitation pattern over the central and eastern Indian regions is not well represented. In addition, the rain shadow zone on the leeward side of the Western Ghats is not captured, as higher precipitation over these regions is simulated in most of the model experiments. Overestimation of precipitation over these peninsular regions has also been described in Choudhary et al. (2018) while using the RegCM4 model driven by different GCMs. Among the control experiments, the MIT-Control experiment presents a lower mean precipitation intensity for normal monsoon, while the GFC-Control and MIX-Control produce a comparatively better precipitation distribution over the Indian landmass. The experiments with the SUB-BATS land surface scheme provide detailed information on regional features of precipitation, especially in conjunction with the GFC and MIX type of CPs. The CLM4.5 experiments show the least daily mean precipitation intensity among all the experiments owing to weaker ISM circulation in conjunction with the weaker land–sea MSLP contrast. Although the magnitude of the precipitation is not well represented in all the experiments using CLM4.5, the MIT-CLM4.5 experiment shows an improvement in the spatial coverage over different precipitation regimes. The problem of higher precipitation over the equatorial Indian Ocean, Arabian Sea, and Bay of Bengal is prominently seen in these simulations. 
This is also accompanied by lesser precipitation across countryside areas especially in GFC-CLM4.5 and MIX-CLM4.5 schemes. In addition to the normal monsoon year, the daily mean precipitation for deficit and excess year of simulation shows similar spatial features, however their magnitudes do vary in different experiments (Figs. S13 and S14). The simulations do not show much difference in the simulated precipitation during deficit and excess years. In general, the precipitation is underestimated in all the model experiments especially over the central and northwest India irrespective of the convective and land surface parameterization schemes for normal monsoon year. This is similar to the features reported in Maity et al. (2017a, b). Further investigation with the mean bias of daily mean precipitation suggests that the model experiments have dry (wet) bias across the Indian region. Since the computation of bias has been carried out with respect to the IMD dataset, it is presented over the Indian landmass only as shown in Fig. 9. For normal monsoon, all the experiments underestimate the precipitation over the central Indian region; however, this underestimation varies in magnitude. The simulations using the MIT scheme portray dry bias over the Western Ghats and wet bias over the peninsular Indian region irrespective of different land surface treatments. The underestimation of precipitation magnitude over central India and adjoining regions has also been reported in different studies using different versions of RegCM at climate-scale simulations (Choudhary et al. 2018; Mishra et al. 2014; Pattnayak et al. 2013; Dash et al. 2013). Based on precipitation and the outgoing longwave radiation for the simulations, Maharana and Dimri (2014) explained the reason for such a behavior in the model. 
It was found that excessive precipitation over the Western Ghats region results in an excessive loss of moisture, which, in association with positive biases in temperature over the Bay of Bengal, results in less moisture content in the atmosphere. Under such conditions, the cyclonic disturbances originating from BoB are devoid of moisture, which contributes to less precipitation over the central Indian region. Wet bias over the western Himalayan regions is also simulated in these experiments. However, ISM circulation does not contribute much to the total annual precipitation over these high-altitude regions, which are fed mostly by wintertime precipitation. For normal monsoon, a mixed pattern of positive and negative bias is apparent for the simulation of precipitation in these experiments. In such cases, the underestimation of precipitation could also have arisen from the weaker large-scale ISM circulation over land, which inhibits the moisture supply for the development and progression of monsoonal precipitation. The MIT-CLM4.5 experiment, however, improves the precipitation simulation by minimizing the dry (wet) bias over some scattered patches of central (peninsular) India. For the GFC schemes, despite the mixed pattern of positive (negative) biases in the precipitation simulation, such biases over central India are minimized in the case of the GFC-Control and GFC-SUB-BATS experiments. The GFC-SUB-BATS experiment shows distinct features in the precipitation bias at regional scales, such as a reduction of dry bias over the northeast Indian region. Further, the experiments with the MIX-type scheme (except that using CLM4.5) simulate comparatively less precipitation over the central Indian region, but the wet bias over peninsular and southern India intensifies in these experiments. Similar to the GFC-SUB-BATS scheme, the distinct bias pattern over northeast India is simulated in the MIX-SUB-BATS experiment as well. 
As discussed previously, all the CLM4.5 experiments show strong dry bias in the precipitation in all the CPs. In addition, there is an improvement in these simulations over peninsular India and western Himalayan region as the bias is minimized in these experiments while using CLM4.5 LSM. This implies that the convective parameterization schemes in association with the CLM4.5 are able to capture the low-intensity precipitation rates over these areas in a more realistic manner. For the deficit year of monsoon, the magnitude of bias in the model simulations is considerably reduced for all the experiments unlike the normal year. The mean bias of daily mean precipitation for the deficit year of monsoon is shown in Fig. 10. The argument that the simulations had better captured the low-intensity precipitation events are well supported by the trait of lesser bias in mean precipitation simulation. Again, the experiments with the MIT convective scheme simulate similar mixed spatial pattern of positive/negative bias; however, their magnitudes are less in comparison to the normal year of simulation. For the control experiments, although the magnitude of bias is reduced over certain parts of central and northeast India, the wet bias over the peninsular Indian region is still carried in the simulated daily mean precipitation. Among the SUB-BATS set of experiments, such biases are again reduced in conjunction with the GFC and MIX-type cumulus schemes. For deficit monsoon simulation, CLM4.5 experiments simulate less bias in the precipitation than normal monsoon over most of the land parts. However, dry bias in the northeastern parts of India does not show any improvement when compared to that of the normal monsoon. Similarly, for excess year of monsoon, dry bias dominates prominently across the landmass in almost all the experiments as shown in Fig. 11. 
With less magnitude of bias, the MIT scheme represents mixed pattern of bias, with dry (wet) bias pattern over central (peninsular) India. For other experiments while using the GFC and MIX-type schemes, dry bias is simulated with respect to the IMD observation for higher precipitation year. Except for the MIT-CLM-4.5 experiment, other experiments in the set of CLM4.5 experiments produce higher magnitude of dry bias over most of the land parts of India, owing to weaker circulation and land–sea contrast.

Fig. 8

Daily mean precipitation (mm/day) climatology for JJAS season during normal monsoon year (1990) from different experiments (a (a–i)) and observed precipitation dataset from IMD (b) over Indian landmass region

Fig. 9

Daily mean precipitation bias with respect to IMD observation dataset from different experiments (a–i) for normal monsoon year (1990)

Fig. 10

Same as Fig. 9 but for deficit monsoon year (1987)

Fig. 11

Same as Fig. 10 but for excess monsoon year (1988)

3.5 Statistical validation

3.5.1 Spatial correlation

For further assessment of model performance for the simulation of precipitation and temperature, spatial correlation between the model and observation has been calculated. Table 3 consists of the spatial correlation values for different experiments under different years of simulation. A careful analysis of the pattern correlation values for precipitation over the Indian landmass suggests that the model experiments exhibit a wide range of resemblance to the observations under different years. Deliberating further with individual experiments, it is found that, due to overestimation/underestimation of the precipitation over the Indian landmass, the MIT set of experiments does not perform satisfactorily in simulating the normal and excess monsoon. For all the years under consideration, the GFC-SUB-BATS scheme performs consistently well in representing the spatial pattern of precipitation over the Indian landmass with the correlation values ≥ 0.5. Interestingly, the closeness of the model-simulated precipitation toward observation is highest in the case of deficit monsoon simulation. This might be attributed to the fact that with lower precipitation magnitudes, there is less variability in space as well as time in the model simulations. In addition, simulations are able to capture the low-intensity precipitation in a more realistic way. This is also supported by the fact that the RCMs have greater agreement with observations in representing moderate precipitation, while they have greater biases for representing heavier precipitation intensities (Boberg et al. 2009; Kjellström et al. 2010).

Table 3 Spatial correlation for different experiments for precipitation and near surface air temperature with respect to IMD and CRU observations, respectively

For the deficit monsoon year, all the experiments using the MIT scheme perform better than in other years, and the MIT-SUB-BATS experiment performs the best with the highest spatial correlation values. On the other hand, experiments with the GFC cumulus schemes perform better than the MIT schemes across all the years of simulation with the exception of CLM4.5 experiments. The SUB-BATS LSM tends to improve the simulation of spatial patterns of precipitation in association with the GFC cumulus scheme. Contrary to what has been reported in previous studies on ISM, the MIX scheme does not consistently outperform the other experiments in the single-year simulations of contrasting monsoons. For CLM4.5 experiments, the MIT cumulus scheme is more suited for the simulation of spatial patterns of rainfall in all the monsoon years despite the inherent biases in the simulations.

For the simulation of temperature, all the experiments match the CRU observations with a high degree of similarity (correlation > 0.95). Across different years of simulation, all the experiments perform better during the excess year, with a higher correlation to the observed seasonal Tmean. The Tmean simulation tends to improve while using CLM4.5 as a land surface model across all the years. This is attributed to a slight negative bias in the representation of sensible heat flux, unlike the other experiments, which overestimate the sensible heat flux over the Indian landmass (Fig. S17).
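
The spatial (pattern) correlation reported in Table 3 can be sketched as a centered Pearson correlation computed over grid cells where both fields are valid. The optional weights argument (for example, the cosine of latitude) is an assumption added for illustration; the text does not state whether area weighting was applied.

```python
import numpy as np


def pattern_correlation(model_map, obs_map, weights=None):
    """Centered spatial correlation between two 2-D fields.

    Cells where either field is NaN are excluded. `weights` must have
    the same shape as the maps; uniform weights are used if omitted.
    """
    m = np.asarray(model_map, dtype=float).ravel()
    o = np.asarray(obs_map, dtype=float).ravel()
    if weights is None:
        w = np.ones_like(m)
    else:
        w = np.asarray(weights, dtype=float).ravel()
    valid = np.isfinite(m) & np.isfinite(o) & np.isfinite(w)
    m, o, w = m[valid], o[valid], w[valid]
    w = w / w.sum()
    # Remove the (weighted) spatial mean before correlating the patterns.
    m_anom = m - np.sum(w * m)
    o_anom = o - np.sum(w * o)
    covariance = np.sum(w * m_anom * o_anom)
    return covariance / np.sqrt(np.sum(w * m_anom**2) * np.sum(w * o_anom**2))
```

Because the spatial means are removed, a uniform offset between model and observation does not lower this score, which is one reason why a high correlation for Tmean can coexist with the cold or warm biases described in Sect. 3.1.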

3.5.2 Taylor diagram

In order to better quantify the performance of model experiments, the framework suggested by Taylor (2001) has been used. For assessing the performance of the experiments in simulating daily mean precipitation, the Taylor diagram is presented in Fig. 12 for each year of simulation. The metrics in the Taylor diagram are calculated based on the daily mean precipitation over the Indian landmass only and compared with the IMD observation data. For different years, the comparison of experiments based on correlation, standard deviation, and the root mean square error (RMSE) suggests that different experiments show a range of behavior during contrasting years of simulation. During all the years of simulation, the control experiments tend to simulate the precipitation with lesser resemblance to the observed data with higher magnitude of standard deviation and RMSE. These simulations show improvement for the deficit year of monsoon but still lag behind other experiments especially those using CLM4.5. In terms of the representation of the spatial patterns, the MIX-SUB-BATS experiment tends to perform consistently in all the years of simulation; however, these have higher standard deviation and RMSE in particular during the simulation of normal monsoon. This suggests better performance of this particular combination in the simulation of rather extreme features of monsoon in comparison to the normal monsoon year. Comparable to this, the GFC-SUB-BATS scheme produces competitive performance in simulating the daily mean precipitation with higher spatial correlation and lesser RMSE and standard deviation in particular during the normal year. This suggests toward better simulation of daily mean precipitation in the GFC set of simulations. Again, with reference to the previous section, as discussed, the CLM4.5 set of experiments underestimates the precipitation and simulates prominent dry bias across the region. 
This trait is reflected in the spatial correlation of these experiments, along with a homogeneous distribution of precipitation across the Indian landmass with lower standard deviation in space. In addition, among all the CLM4.5 experiments, only the MIT-CLM4.5 experiment appears to simulate different precipitation regimes in a better way, with higher standard deviation values for all the years of simulation. The differences between the other two CPs in conjunction with CLM4.5 are marginal in terms of the Taylor metrics, as these display similar values across all the years.
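
The three quantities summarized in a Taylor diagram can be computed as in the sketch below. It assumes unweighted statistics over valid land cells and uses the centered root mean square difference of Taylor (2001), which may differ from the RMSE definition used for the figure; the function name and the returned dictionary keys are illustrative.

```python
import numpy as np


def taylor_statistics(model_map, obs_map):
    """Correlation, standard deviations, and centered RMS difference.

    Inputs are fields of identical shape; cells where either field is
    NaN are ignored. Statistics are unweighted (an assumption).
    """
    m = np.asarray(model_map, dtype=float).ravel()
    o = np.asarray(obs_map, dtype=float).ravel()
    valid = np.isfinite(m) & np.isfinite(o)
    m, o = m[valid], o[valid]
    m_anom = m - m.mean()
    o_anom = o - o.mean()
    std_model = m_anom.std()
    std_obs = o_anom.std()
    correlation = np.mean(m_anom * o_anom) / (std_model * std_obs)
    # Centered RMS difference: the mean bias is removed before differencing.
    crmsd = np.sqrt(np.mean((m_anom - o_anom) ** 2))
    return {
        "correlation": correlation,
        "std_model": std_model,
        "std_obs": std_obs,
        "crmsd": crmsd,
    }
```

These statistics satisfy crmsd² = std_model² + std_obs² − 2 · std_model · std_obs · correlation, which is the geometric relation that allows all three to be shown as a single point on the diagram.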

Fig. 12

Taylor diagram based on JJAS daily mean precipitation (over Indian landmass only) from different experiments for a normal, b deficit, and c excess monsoon years. Triangles represent the control experiments, squares represent the SUB-BATS experiments, and solid circles correspond to the CLM4.5 experiments

Similar to the precipitation, the Taylor metrics have also been computed for assessing the performance of model experiments in the simulation of Tmean and are presented in Fig. S15. All the CLM4.5 experiments outperform the others in terms of the spatial resemblance of Tmean, with higher correlation values. The SUB-BATS schemes display behavior comparable to CLM4.5, but with lower correlation, higher RMSE, and comparable values of standard deviation.

3.6 Physical mechanisms

Furthermore, in order to investigate the prominent dry bias over the central Indian region, the vertical profile of gradient of equivalent potential temperature (dθe/dp; EQP hereafter) and the bias of column specific humidity are analyzed and presented in Fig. 13. EQP is a useful measure of the static stability of the unsaturated atmosphere. The profile of bias is averaged over the monsoon core region following Mandke et al. (2007). In general, all the experiments show dry bias in the vertical distribution of specific humidity with respect to the ERA-Interim reanalysis dataset. With the exception of a few experiments, most of the experiments using CLM4.5 as a land surface model simulate consistent dry atmospheric column across all the years of simulation using different CPs. Vertically increasing EQP indicates stability and suppressed vertical motion in the atmospheric column, while the opposite marks an unstable atmosphere. Across different years of simulation, it is found that most of the CLM4.5 experiments show subdued rate of decrease of EQP vertically in comparison to experiments with other LSMs. This implies that a highly stable atmosphere is simulated in the CLM4.5 experiments in association with different CPs, which resists the vertical motion thereby inhibiting the convection in the atmosphere. On the other hand, the control and SUB-BATS schemes simulate comparatively greater rate of decrease of EQP vertically, hence providing a scope for vertical motion and thereby improved convection. For excess year of monsoon, most of the experiments tend to simulate drier atmosphere at lower levels, which may be attributed to weaker low-level circulation in the model, which limits the moisture incursion toward land. 
In association with the contrast of Tmean and hence higher MSLP, weaker low-level ISM circulation, and a drier atmospheric column due to a stable atmosphere, precipitation does not occur over the core monsoon region in most of the experiments for the excess year. Overall, a synergistic interplay of a stable vertical atmosphere, weaker large-scale circulation, and inhibition of convective processes seems to dominate in the GFC and MIX-type experiments using CLM4.5, which leads to a higher amplitude of precipitation biases over the core monsoon region. As pointed out previously, all the experiments seem to perform satisfactorily in simulating precipitation during the deficit monsoon. This is supported by the fact that a comparatively warm atmosphere, together with a smaller bias in column moisture, is simulated in this case.

Fig. 13

a–i Vertical profile of rate of change (dθe/dp) of equivalent potential temperature (K) [solid lines] and bias of specific humidity (g/kg) [dashed lines], averaged over the core monsoon region (Mandke et al. 2007) for different experiments and years. The bias has been computed against the ERA-Interim reanalysis dataset. Different colors correspond to profiles from different land surface models
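As a rough illustration of the quantity plotted in Fig. 13, the sketch below computes θe on pressure levels and its finite-difference gradient dθe/dp. It rests on several assumptions that are not taken from the study: the simplified approximation θe ≈ θ exp(Lv q / (cp T)), constant values of Lv and cp, specific humidity in kg/kg, and the hypothetical function names `theta_e` and `dtheta_e_dp`. The exact formulation used for the figure is not stated in the text.

```python
import math

RD_OVER_CP = 0.2854  # R_d / c_p for dry air (dimensionless)
LV = 2.5e6           # latent heat of vaporization, J/kg (assumed constant)
CP = 1004.0          # specific heat of dry air, J/(kg K)


def theta_e(p_hpa, t_k, q_kgkg):
    """Approximate equivalent potential temperature (K)."""
    theta = t_k * (1000.0 / p_hpa) ** RD_OVER_CP
    return theta * math.exp(LV * q_kgkg / (CP * t_k))


def dtheta_e_dp(levels):
    """Finite-difference d(theta_e)/dp between adjacent levels.

    levels: list of (pressure hPa, temperature K, specific humidity kg/kg),
    ordered from the surface upward. Returns a list of
    (mid-layer pressure hPa, gradient K/hPa).
    """
    th = [theta_e(p, t, q) for p, t, q in levels]
    out = []
    for k in range(len(levels) - 1):
        p_low, p_up = levels[k][0], levels[k + 1][0]
        out.append((0.5 * (p_low + p_up), (th[k + 1] - th[k]) / (p_up - p_low)))
    return out
```

Because pressure decreases upward, a layer in which θe decreases with height has a positive dθe/dp and is potentially (convectively) unstable, whereas a layer in which θe increases with height has a negative dθe/dp and is stable.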

To further explain the restrained vertical motion in the atmosphere, the pressure–latitude cross-section of pressure velocity (ω) has been computed. The mean ω averaged over 60–100° E longitude is presented for the normal monsoon season in Fig. 14. By convention, negative ω signifies rising motion, while positive values represent sinking motion in the atmosphere. In general, an upward vertical motion starting from the lower troposphere and extending deep into the atmosphere near the equator characterizes the rising limb of the Hadley cell (y–p plane). Further, such motions are stronger in the latitude range of 10–20° N and extend up to the height of 300 hPa (Krishnan et al. 2003; Hazra et al. 2017). A careful analysis of the simulated patterns of ω suggests upward motion in the latitude range of 5–10° N in most of the experiments, a band that consists mostly of ocean areas. This feature is more prominent in the simulations with the GFC parameterization scheme and manifests as extended vertical motion indicating deep convection over the oceanic region (Fig. 14d–f). For other CPs, this feature is less prominent with limited spatial and vertical extent as seen in Fig. 14a–i. Moreover, a second peak of rising motion is observed around 25–30° N (part of the core monsoon region). In this case, a well-resolved deep convection is simulated with the MIT convective scheme for the control and SUB-BATS experiments. For the GFC and MIX-type schemes, such convection extends only up to the mid-tropospheric levels. Supporting the notion of subsidence in the CLM suite of experiments, anomalous positive values of ω are observed over most of the land parts, which suppresses the convection over the region. This possibly results in lesser precipitation in such experiments due to a synergistic effect of weaker large-scale circulation and suppressed convection. Similar patterns of ω were also observed for the deficit and excess monsoon years (Figs. S20 and S21 in the Supplementary information).

Fig. 14

a–i Pressure–latitude cross-section of seasonal mean vertical pressure velocity (hPa/s) averaged over 60–100° E longitude for JJAS during the year 1990 from different experiments
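The sector averaging behind a pressure–latitude cross-section such as Fig. 14 can be sketched as follows. This is a minimal illustration, assuming a regular longitude grid (so an unweighted mean along each latitude circle is adequate) and a field stored as nested lists indexed `[level][lat][lon]`; the names `sector_mean` and `motion` are hypothetical and do not come from the study.

```python
def sector_mean(field, lons, lon_min=60.0, lon_max=100.0):
    """Average field[level][lat][lon] over lon_min..lon_max (degrees E).

    Returns a nested list indexed [level][lat].
    """
    idx = [i for i, lon in enumerate(lons) if lon_min <= lon <= lon_max]
    if not idx:
        raise ValueError("no grid points inside the longitude sector")
    return [
        [sum(row[i] for i in idx) / len(idx) for row in level]
        for level in field
    ]


def motion(omega):
    """Sign convention for pressure velocity: negative means ascent."""
    if omega < 0:
        return "rising"
    if omega > 0:
        return "sinking"
    return "neutral"
```

Applying `motion` to each element of the averaged section gives a quick way to locate bands of ascent and subsidence before plotting.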

3.7 Spin-up of experiments

For any RCM simulation, it is very important to allow adequate spin-up time in order to achieve equilibrium of the land surface states, in particular the soil moisture and the evapotranspiration. Opinions differ on the length of spin-up time needed to achieve equilibrium of the soil moisture states and the evapotranspiration under different cases. Previous studies have suggested that the spin-up for a regional climate model varies between 10 days and 1 month (Wang et al. 2003; Rao et al. 2004; Ratnam and Kumar 2005; Martínez-Castro et al. 2006; Zhong 2006; Kang et al. 2014). In the case of experiments using CLM as a land surface model, a 1-month spin-up period has been allowed in many studies (Kang et al. 2014; Tiwari et al. 2015, 2017; Gao et al. 2016; Maurya et al. 2017, 2018; Maity et al. 2017a, b). It was shown that over dry land areas, the spin-up of the land surface state takes approximately 2–3 years, while over monsoon regions, such stabilization is achieved in approximately 3 months if the integration is started just before the onset of monsoon (Lim et al. 2012). Based on the above literature, it is believed that a 2-month spin-up is sufficient for seasonal-scale simulations in order to achieve the dynamical equilibrium of the internal physics of the model. Moreover, in order to verify whether an appropriate spin-up period has been allowed for the different experiments, the time series of two different layers of soil moisture (0.1 and 1 m, respectively) averaged over the core monsoon region have been studied for each season. The core monsoon zone is chosen in order to account for the homogeneous precipitation region where soil moisture would possibly not vary much in space. The time series for different seasons are provided in Figs. S22 and S23 for the 0.1- and 1-m depth layers, respectively.
After a careful analysis of the time series of the top layer, no break point in the 0.1-m soil moisture could be identified as a threshold to account for the spin-up period. The temporal variability of soil moisture may result from the local evaporation/saturation processes associated with the active/break periods of the monsoon. Moreover, a distinct characteristic in terms of an extremely saturated 1-m soil moisture level is found in simulations using CLM4.5 as the land surface model. The time series of the 1-m soil moisture also does not have any break point to account for the spin-up threshold. These findings suggest that the spin-up period selected in the current study is appropriate.
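The text does not state how the soil moisture series were inspected for a break point. One simple and purely hypothetical way to make such a check objective is a single change-point scan: try every admissible split of the series and keep the one that most reduces the within-segment sum of squared deviations. The function name `best_split` and the `min_len` parameter below are illustrative assumptions, not part of the study.

```python
def best_split(series, min_len=5):
    """Scan for a single mean-shift break point in a time series.

    Returns (index, score): index is the first element of the second
    segment and score is the fractional reduction in the sum of squared
    deviations (0 means no improvement, 1 means a perfect step).
    Returns (None, 0.0) for a series with no variance.
    """
    n = len(series)
    if n < 2 * min_len:
        raise ValueError("series too short for the requested min_len")

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    total = sse(series)
    if total == 0:
        return None, 0.0
    best_k, best_sse = None, total
    for k in range(min_len, n - min_len + 1):
        split_sse = sse(series[:k]) + sse(series[k:])
        if split_sse < best_sse:
            best_k, best_sse = k, split_sse
    return best_k, 1.0 - best_sse / total
```

A score close to zero for every split would be consistent with the absence of a spin-up threshold, although in practice the score would need to be judged against the variability expected from active and break spells of the monsoon.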

4 Summary and conclusions

An assessment of the performance of different cumulus parameterization schemes and land surface models under the framework of RegCM-4.4.5.5 has been carried out for contrasting years of the Indian summer monsoon. Twenty-seven different experiments have been performed using three different cumulus parameterization schemes, namely, the MIT-Emanuel scheme (MIT), the Grell scheme with Fritsch–Chappell closure (GFC), and the mixed-type scheme, i.e., Grell over land and Emanuel over ocean (MIX). Moreover, three different land surface models, namely, default BATS (Control), subgrid disaggregation using BATS (SUB-BATS), and Community Land Model 4.5 (CLM4.5), have also been tested for their suitability and sensitivity to cumulus parameterization in the simulation of different monsoon years. The simulations using these model physics have been performed at a spatial resolution of 50 km for the normal (1990), deficit (1987), and excess (1988) monsoon years over the CORDEX-South Asia domain.

The analysis of individual experiments in terms of the seasonal mean and the bias for near surface air temperature, mean sea level pressure, large-scale 850 hPa wind circulation, and precipitation suggests that simulations of ISM are sensitive to the choice of cumulus schemes as well as the land surface models inside the framework of the RegCM atmospheric model. Although the model simulations exhibit different behaviors in simulating Tmean over different regions across the years of simulation, they capture the spatial patterns reasonably. Despite inherent biases in the simulation of Tmean, the CLM4.5 experiments perform exceptionally well, especially when used with the GFC or MIX type of schemes. The cold bias in the CLM4.5 simulations is related to a wetter seasonal mean soil moisture, which consequently inhibits the sensible heating. Interestingly, although CLM4.5 outperforms the SUB-BATS experiments in the representation of the spatial patterns of Tmean, the latter substantially helps in mitigating the higher magnitude of biases across different regions. Most of the model experiments are able to reproduce the land–sea contrast of mean sea level pressure except for the set of CLM4.5 experiments, which affects the simulation of the mean large-scale ISM circulation in such experiments. The higher MSLP over the land inhibits the convective activities and the moisture incursion through southwesterly winds in these experiments, thereby leading to a prominent dry bias in the precipitation simulations. Importantly, a deepened trough of MSLP over Tibet in the control simulations affects the representation of the Tibetan high. Apart from some exceptions during the deficit and excess years with the CLM4.5 experiments, the rest of the simulations display reasonable performance in representing the spatial extent of the cross-equatorial flow, Somali jet, and southwesterly flow at 850 hPa. Some of the experiments display stronger westerlies over the Bay of Bengal owing to weaker easterlies in the model simulations.
For the excess monsoon year, the MIX-CLM4.5 experiment could not reproduce the strength of the cross-equatorial flow and Somali jet as in the observation. This suggests that the CLM4.5 experiments are not able to represent the mean monsoonal flow when simulated in association with the MIX and GFC schemes. Further, the simulation of precipitation has been found to be very sensitive to the choice of the land surface model as well as the cumulus parameterization schemes. Among the different years, the deficit year is best represented in all the experiments, with lesser bias and closer resemblance to observations in terms of the spatial patterns. On the other hand, excess and normal monsoons are represented well over the Indian region in the model simulations, albeit with a comparatively higher magnitude of bias for daily mean precipitation. This indicates that higher daily mean precipitation magnitudes are not captured in all the model simulations, especially over the land areas. Most of the experiments simulate higher precipitation over the ocean with a slight eastward progression of the precipitation regimes in the Bay of Bengal. Among all the experiments, the SUB-BATS schemes show considerable improvement in the simulation of precipitation across different years in comparison to the other land surface models. In this set of experiments, the spatial patterns are closely simulated and the bias is reduced in the normal and deficit years, with the exception of the excess year.

For the simulation of precipitation and other monsoon features using CLM4.5, the MIT cumulus scheme is better suited than the others, although these simulations also have inherent errors. In terms of consistent performance across different years, the GFC-SUB-BATS experiment performs better than the other combinations of CP and LSM. Although the CLM4.5 experiments produce a drier seasonal mean of precipitation for all the years and CPs, they also show certain improvements over different regions by minimizing the bias in the precipitation simulation. Further investigation of the stronger dry biases in the simulation over the core monsoon region suggests the suppression of convective activity in most of the experiments. The reason for such suppression can be explained in terms of the synergistic interplay of higher MSLP, weaker large-scale circulation, greater rate of increase in EQP vertically, and stronger subsidence over the major land part. This leads to higher atmospheric stability over the region, thereby restraining the vertical motion and the convective activity in the model simulations. Consequently, a drier atmosphere dominates, resulting in a drier monsoon simulation.

In the current study, no effort has been undertaken for the parameter tuning of the cumulus and land surface parameterization schemes as the focus has been kept on the selection of the best performing simulation framework for future longer period simulations of ISM. A shorter period of simulations has been used with appropriate spin-up times from the available literature.

However, it is believed here that the simulations with CLM4.5 might require a longer spin-up period for the accumulation of soil moisture and further stabilization of the model simulations, especially in the case of climate-scale simulations, subject to further verification. The study presents first-hand information on the comparison and capability of SUB-BATS and CLM4.5 and their suitability with different cumulus schemes in representing ISM. Further endeavors aim to highlight a process-based description of the shortcomings in the model experiments and the tuning of different parameters, especially in the case of the CLM4.5 experiments, in order to achieve a skillful framework for the simulation of ISM. This study also provides further avenues for testing the sensitivity and suitability of other physical parameterization schemes such as planetary boundary layer, cloud microphysics, etc. in future endeavors with special consideration of ISM.