1 Introduction

Convective cloud systems become electrified and thereby produce lightning, a process regulated by cloud microphysics and dynamics (Rutledge and MacGorman 1988; Williams et al. 1991; Lang et al. 2004). Electrical discharge in the atmosphere is a robust phenomenon, often accompanied by severe weather such as heavy rain, hail and storms (Verma et al. 2021). Ice crystals and supercooled liquid water are required for charge separation in thunderclouds, which explains why most lightning takes place in mixed-phase regions with temperatures between 0 and –40°C (Takahashi 1978; Jayaratne and Sunder 1983; Williams et al. 1991; Saunders 1993).

Lightning flashes (LF), i.e., flashes of light caused by the discharge of electricity in the atmosphere, have been associated with convective rain (Dai et al. 2009; Zheng et al. 2020), total column water vapour (TCWV) and relative humidity (RH) over both oceans and land (Price and Federmesser 2006; Siingh et al. 2011; Shi et al. 2018). Higher humidity levels increase hydrometeor concentrations and lifting speeds, both of which can result in severe lightning strikes (Chakraborty et al. 2021). A high degree of instability, as indicated by the convective available potential energy (CAPE), favours the formation of ice particles and electrical charge separation, which in turn increases the probability of lightning occurrence (Galanaki et al. 2015; Saha et al. 2017; Dewan et al. 2018). These climatic parameters are therefore used in this paper. CAPE is a measure of the amount of energy available for convection and is directly related to the maximum possible vertical speed of an updraft; higher values thus indicate a greater potential for extreme weather. The other parameters are TCWV (the total amount of water vapour present in a vertical column of air), RH (a measure of how much water vapour is in a water–air mixture compared to the maximum amount possible), surface temperature (ST, at 2 m height) and outgoing longwave radiation (OLR, the energy emitted from Earth as thermal radiation that passes through the atmosphere and into space).

The magnitude and vertical distribution of CAPE play a crucial role in determining the updraft speed and vertical distribution of hydrometeors (any water or ice particles that have formed in the atmosphere or at the Earth's surface as a result of condensation or sublimation), which participate in the generation of charge inside thunderclouds (Williams and Zhang 1996). CAPE is therefore expected to correlate well with lightning activity. For example, Liou and Kar (2010) showed that the rain yield per flash differed between inland and coastal stations and varied seasonally over Taiwan. There is also a positive correlation between LF and CAPE. Louf et al. (2019) and Siingh et al. (2015) studied the correlation between lightning and the surface temperature of the Arabian Sea and the Bay of Bengal and found that lightning has a very high positive correlation with surface temperature and CAPE. Siingh et al. (2014) studied lightning along with ST and OLR in a different part of India and found that ST exhibits a good correlation with lightning, whereas OLR has less impact on lightning flashes. A study by Pawar et al. (2012) over central India found that CAPE played a decisive role in lightning generation. Yadava et al. (2020) found that LF over India has high seasonal variability: in the pre-monsoon (March–May), maximum LF was seen over the northeast region, whereas in the post-monsoon (September–November), maximum LF occurred in the northern part of India.

Over the northeast, most lightning occurs during the Nor’wester, which is locally called ‘Kalbaisakhi’. The Nor’wester is a severe thunderstorm that usually originates in northern and northeastern India during the pre-monsoon (March–April–May, MAM) period. It occurs due to the confluence of warm and dry air (Choudhury et al. 2020). Strong thunder and lightning, hail, sandstorms and gusts accompany the Nor’wester. Convection in Bihar, Jharkhand and the neighbouring states of India is caused by intense warming of the land masses in the region, where the heated air mixes with, and is intensified by, warm and humid low-altitude air from the Bay of Bengal (Mukhopadhyay et al. 2009; Tyagi et al. 2012). LF, CAPE and local dynamics all play a role in controlling cloud microphysics via relative humidity and other mechanisms. Clouds cover nearly two-thirds of the Earth's surface. They reflect incoming solar radiation, cooling the Earth-atmosphere system, while reducing outgoing infrared radiation, warming it. Clouds also regulate the atmospheric hydrological cycle by transporting water away from the Earth's surface, redistributing it through the atmosphere, and returning it to the surface through precipitation. Latent heating and cooling in clouds influence atmospheric circulation, and thunderclouds generate lightning (Baker 1997). Lightning is a very important electrical parameter, and its geographical distribution over the region is therefore of interest.

Lightning incidents also pose a serious hazard to humans and wildlife. A lightning prediction system is therefore required for lightning safety and mitigation over India. Further, a spatial and temporal depiction of lightning activity is needed to fill a significant gap in the existing studies. In this context, the present study presents a lightning climatology over India using 16 years (1998–2013) of data from the Lightning Imaging Sensor (LIS) and Optical Transient Detector (OTD), and then assesses its relationship with various meteorological parameters.

The temporal and spatial variability of thunderstorm activity in India and the related factors influencing thunderstorm activity are analysed in detail. The relations between lightning activity and TCWV, RH, CAPE, ST and OLR over the regions of interest are analysed. The study period is selected based on the availability of reliable lightning data over the Indian region. High-resolution lightning datasets are used over the Indian region, and an attempt is made to identify the most dominant factor affecting the spatio-temporal variability of lightning in a comprehensive manner. The paper is organised into six sections. The first and second sections deal with the introduction and study area, respectively. The third section describes the data and observations, and the fourth section presents the methodology. Section five gives details of the results, and the study concludes in section six.

2 Study area

Figure 1 shows a map of India, located at 8°4′–37°6′N and 68°7′–97°25′E in the Northern Hemisphere. India covers about 2.4% of the world's total geographic area (approximately 3.28 million km2) and is the 7th largest country in the world. India has a diverse and complex society, and the effects of climate change on India are likewise highly variable (Gupta et al. 2020). It is a highly ethnically diverse country with a population of over 138 crore, making it the second most populous country in the world (Yadava et al. 2020). India is a peninsular country surrounded by water on three sides: the Indian Ocean to the south, the Arabian Sea to the southwest and the Bay of Bengal to the southeast. The northern part of India is bounded largely by the Himalayan Range. India shares its boundaries with China, Nepal and Bhutan on the northern frontier. Pakistan, which extends into the Karakoram Range, shares its boundary with western India. The Punjab plain, the Rann of Kutch (Gujarat) and the Great Indian Thar Desert lie in the western part of India. India and Myanmar are separated by mountains such as the Chin Hills and Kachin Hills, which are covered by dense forest. The Khasi Hills and Mizo Hills lie in the eastern part of India, where India shares its boundaries with Bangladesh. The Indo-Gangetic watershed also lies in the eastern plain of India. The longest river originating in India is the Ganga. The northern, eastern and central parts of India are occupied by the Ganga–Brahmaputra basin, whereas the southern part of India is covered by the Deccan Plateau.

Figure 1

Study area map of seven homogeneous regions of India.

In the present study, the Indian domain is subdivided into seven regions according to the geographical division of India (Dash and Mamgain 2011), as shown in figure 1, for detailed evaluation and analysis. The seven regions are: North East (NE), East Coast (EC), West Coast (WC), North West (NW), Western Himalaya (WH), North Central (NC) and Interior Peninsula (IP). The division is based on homogeneous climatic conditions. The two coastal regions (EC and WC) are hereinafter referred to as the ‘coast’. The IP is a landlocked region between these two coasts with moderate LF activity. Compared with the WC, the EC responds very differently in terms of lightning because of the higher temperatures in the EC, and the foothills of the Himalayas are treated separately in order to observe the effect of orographic convection on electrical properties. The NE shows the maximum LF of all regions.

3 Satellite data and observations

Satellite observations of LF over the entire Indian domain have been used to study lightning activity. The datasets used for the analysis are described in detail below.

3.1 LIS-TRMM

The Tropical Rainfall Measuring Mission (TRMM) is a joint space mission of NASA and the Japan Aerospace Exploration Agency (JAXA) to study rainfall for weather and climate research (Albrecht et al. 2011). The LIS instrument was carried on the TRMM satellite, which operated from 28 November 1997 to 1 April 2015. The TRMM satellite had an orbital inclination of 35°, giving coverage between 35°N and 35°S, at an altitude of 350 km (Boccippio et al. 2000; Christian et al. 2003). LIS, an instrument of NASA's Earth Observing System (EOS), was flown on the TRMM satellite to detect LF occurrence in the tropics (Christian et al. 1999; Bond et al. 2002). Monthly LF data available on the LIS website were used in the study; these data are useful for studying monthly and seasonal changes. The VHRMC (Very High-Resolution Monthly Climatology) merges lightning data detected by the Lightning Imaging Sensor (LIS) aboard the TRMM satellite and the Optical Transient Detector (OTD) (Christian et al. 2003).

3.1.1 Lightning flashes

In the present study, lightning flashes are acquired from the LIS-TRMM satellite for a 16-year period (1998–2013). The analysis domain is the Indian region shown in figure 1. The gridded lightning climatology dataset from 1998 to 2013 is retrieved from the very high-resolution monthly climatology (VHRMC) flash rate at a 0.1° × 0.1° spatial resolution. LF is generally associated with cumulonimbus clouds (thunderclouds). The TRMM satellite is also capable of detecting cloud-to-cloud and cloud-to-ground lightning under both day and night conditions (Yadava et al. 2020). Several studies have analysed LF from LIS-TRMM datasets, e.g., Qie et al. (2021), Murugavel et al. (2014) and Siingh et al. (2015).

3.2 ERA-Interim

ERA-Interim is a global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). ERA-Interim data in GRIB or NetCDF format can be downloaded from the ECMWF data server (https://apps.ecmwf.int/datasets/data/interim-full-moda/levtype=sfc/). Global fields are available at full resolution, both vertically (on model levels) and horizontally. ERA-Interim covers the period beginning on 1 January 1979 and is extended forward in near real-time. ERA-Interim was designed at ECMWF in part to prepare for a future reanalysis project spanning the 20th century, and has a spatial resolution of ∼79 km (Dee et al. 2011).

CAPE is a thermodynamic parameter. The larger the CAPE, the more unstable the atmosphere and the stronger the vertical air motion it can support. In the present study, CAPE is extracted from the ERA-Interim monthly climatology with a spatial resolution of 1° × 1° from 1998 to 2013.
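The link between CAPE and updraft strength can be illustrated with simple parcel theory, in which the theoretical upper bound on updraft speed is w_max = sqrt(2 × CAPE). The sketch below is only an illustration under that idealised assumption (entrainment, water loading and pressure perturbations are ignored); the function name `max_updraft_speed` is hypothetical and the calculation is not part of the processing used in this study.

```python
import math


def max_updraft_speed(cape_j_per_kg: float) -> float:
    """Parcel-theory upper bound on updraft speed (m/s): w_max = sqrt(2 * CAPE)."""
    if cape_j_per_kg < 0:
        raise ValueError("CAPE must be non-negative")
    return math.sqrt(2.0 * cape_j_per_kg)


# Illustrative CAPE values (J/kg) within the range discussed in this paper.
for cape in (500.0, 1250.0, 2250.0):
    w_max = max_updraft_speed(cape)
    print(f"CAPE = {cape:6.0f} J/kg -> w_max ~ {w_max:.1f} m/s")
```

Real updrafts are considerably weaker than this bound, so the numbers are best read as a relative measure of convective potential.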

Humidity is an important component of the atmosphere and plays an important role in Earth's energy and water cycles, acting as the source of clouds as air moves across the planet. RH is extracted from the ERA-Interim monthly climatology with a spatial resolution of 1° × 1° from 1998 to 2013. TCWV data were also taken from ERA-Interim for the period 1998 to 2013 at 1° × 1° resolution. TCWV is the water vapour in a column extending from the surface of the Earth to the top of the atmosphere (Dee et al. 2011). Surface temperature (ST) at 2 m has also been used.

3.3 Outgoing longwave radiation (OLR)

Outgoing longwave radiation (OLR) is a measure of the amount of energy emitted to space by Earth's surface, oceans and atmosphere. It is a critical component of the Earth's radiation budget. OLR values are often used as a proxy for convection in tropical and subtropical regions, since cloud-top temperature indicates cloud height (colder tops are higher). Daily averages of OLR at a spatial grid resolution of 2.5° were obtained from the National Oceanic and Atmospheric Administration (NOAA). The monthly OLR data analysed in this study cover 1998 to 2013 on a 2.5° grid over the Indian region and are taken from the Climate Forecast System Reanalysis (CFSR) developed by NOAA's National Centers for Environmental Prediction (NCEP) (Penki and Kamra 2013).

4 Methodology

To study the relationship of LF with CAPE, TCWV, RH, ST and OLR, the statistical techniques described below are applied over the study regions shown in figure 1.

4.1 Correlation

The Karl Pearson correlation coefficient is used in the present analysis for quantitative estimation of the strength of the linear relationship between two variables.

The Karl Pearson coefficient of correlation is represented as ‘r’ and the formula is expressed as:

$$r= \frac{\sum (X-\overline{X })(Y-\overline{Y })}{\sqrt{\sum {(X-\overline{X })}^{2}}\sqrt{\sum {(Y-\overline{Y })}^{2}}}$$
(1)

where X and Y are the variables, and \(\overline{X }\) and \(\overline{Y }\) are the mean of X and Y variables.

To estimate the average seasonal variability and the correlation of LF with TCWV, CAPE, RH, ST and OLR using equation (1), the X variable is taken as the average seasonal value of LF and the Y variable as the corresponding value of one of the other parameters, i.e., TCWV, CAPE, RH, ST or OLR.
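As an illustration of how the Pearson coefficient referred to in equation (1) can be evaluated, the following minimal sketch computes r as the sum of products of deviations divided by the product of the root sums of squared deviations. It assumes NumPy is available; the function name `pearson_r` and the sample values are hypothetical and are not data from this study.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    dx = x - x.mean()  # deviations of X from its mean
    dy = y - y.mean()  # deviations of Y from its mean
    denom = np.sqrt(np.sum(dx ** 2)) * np.sqrt(np.sum(dy ** 2))
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return float(np.sum(dx * dy) / denom)


# Made-up seasonal values, for illustration only (not results from the paper).
lf_demo = np.array([0.05, 0.30, 0.25, 0.08])          # flash/km2/day
cape_demo = np.array([300.0, 1500.0, 1200.0, 500.0])  # J/kg
r_demo = pearson_r(lf_demo, cape_demo)
print(f"r = {r_demo:.3f}")
```

The result can be cross-checked against `np.corrcoef(lf_demo, cape_demo)[0, 1]`, which evaluates the same quantity.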

4.2 Principal component analysis (PCA)

PCA is a statistical technique that reduces the dimension of a large dataset, summarising it in a smaller number of components for better visualisation and analysis (Morales et al. 2015). The aim of PCA in the present work is to find the relationship among all the variables. Eigenvalues and eigenvectors are used to calculate the PCA in this study. First, the covariance of the datasets is estimated using equation (2). Covariance measures how two variables deviate from their respective means together.

$${\mathrm {cov}}\left(x,y\right)=\frac{\sum_{i=1}^{n}({x}_{i}-{x}^{\prime})({y}_{i}-{y}^{\prime})}{({n}-1)}$$
(2)

where x and y are the two variables (dimensions), n is the number of samples, \({x}^{\prime}\) = mean value of x, \({y}^{\prime}\) = mean value of y, \({x}_{i}\) = value of x at point i and \({y}_{i}\) = value of y at point i. Using the covariances of all the variables, a covariance matrix is calculated. This covariance matrix is used to calculate the eigenvalues and eigenvectors.

$$\det\left(\mathrm{cov}[{\mathrm{dim}}_{x},{\mathrm{dim}}_{y}]-\left[\begin{array}{cc}i& 0\\ 0& i\end{array}\right]\right)=0$$
(3)

By solving the above equation (3), the value of i, i.e., Eigenvalue for the dimension x and y can be determined.
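The covariance and eigenvalue steps referred to in equations (2) and (3) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, uses synthetic random numbers in place of the six variables, and standardises each variable before forming the covariance matrix (an assumption made here because the variables have different units; the source does not state whether this was done). The function name `pca_eigen` is hypothetical.

```python
import numpy as np


def pca_eigen(data, standardise=True):
    """Eigen-decomposition PCA.

    data: array of shape (n_samples, n_variables).
    Returns eigenvalues (descending), eigenvectors (as columns) and PC scores.
    """
    X = np.asarray(data, dtype=float)
    X = X - X.mean(axis=0)                   # remove the mean of each variable
    if standardise:
        X = X / X.std(axis=0, ddof=1)        # unit variance (assumption, see lead-in)
    cov = (X.T @ X) / (X.shape[0] - 1)       # sample covariance for every pair of variables
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric solver; eigenvalues come out ascending
    order = np.argsort(eigvals)[::-1]        # reorder so that PC1 has the largest variance
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    scores = X @ eigvecs                     # project the samples onto the PCs
    return eigvals, eigvecs, scores


# Synthetic stand-in for six variables (e.g., LF, CAPE, TCWV, RH, ST, OLR).
rng = np.random.default_rng(0)
demo = rng.normal(size=(48, 6))
eigvals, eigvecs, scores = pca_eigen(demo)
print("eigenvalues:", np.round(eigvals, 3))
```

With standardised inputs the eigenvalues sum to the number of variables, and the sample variance of each column of `scores` equals the corresponding eigenvalue.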

5 Results and discussions

Analysis has been carried out on the retrieved LF for seasonal and annual variability using LIS TRMM satellite data (1998–2013) over Indian domain as shown in figure 1 (lat. 8°4′–37°6′N; lon. 68°7′–97°25′E).

5.1 Seasonal distribution: Lightning flashes and climatic parameters

Spatio-temporal, seasonal and inter-annual variability in CAPE, RH, TCWV, ST and OLR dynamics are analysed in detail to understand the behaviour of these variables with respect to LF over India.
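For readers who wish to reproduce the seasonal grouping used in this section, the sketch below averages a 12-month climatology into the four seasons referred to in the text (winter: December–February; pre-monsoon: March–May; monsoon: June–August; post-monsoon: September–November). It assumes NumPy and an array whose leading axis runs from January to December; the function name `seasonal_means`, the unweighted monthly average and the demo values are illustrative assumptions rather than the procedure documented by the source.

```python
import numpy as np

# Season definitions as used in the text (month numbers, 1 = January).
SEASONS = {
    "winter": (12, 1, 2),
    "pre-monsoon": (3, 4, 5),
    "monsoon": (6, 7, 8),
    "post-monsoon": (9, 10, 11),
}


def seasonal_means(monthly):
    """Average a monthly climatology (leading axis of length 12) into four seasons."""
    monthly = np.asarray(monthly, dtype=float)
    if monthly.shape[0] != 12:
        raise ValueError("leading axis must hold 12 monthly values (Jan..Dec)")
    return {
        name: monthly[[m - 1 for m in months]].mean(axis=0)  # simple unweighted mean
        for name, months in SEASONS.items()
    }


# Dummy climatology: the value for each month is simply its month number.
demo_monthly = np.arange(1.0, 13.0)
demo_means = seasonal_means(demo_monthly)
for season, value in demo_means.items():
    print(f"{season:12s} {value:.1f}")
```

The same function works on gridded fields of shape (12, nlat, nlon), since the average is taken along the leading axis only.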

5.1.1 Lightning flashes

Figure 2 shows the spatial variability in LF (flash/km2/day) for different seasons over India. LF intensity is clearly highly variable, both regionally and seasonally. The highest LF occurs in the pre-monsoon (MAM) season (0.40–0.45 flash/km2/day) over the northeast part of India, whereas during the monsoon season high LF (0.35–0.40 flash/km2/day) is found over the northern part of India. During winter, the maximum value of 0.1–0.15 flash/km2/day is observed over Punjab, Himachal Pradesh, Jammu and Kashmir, some parts of Madhya Pradesh, and the northeastern states of Meghalaya and Assam. High LF of around 0.1–0.15 flash/km2/day during the winter season is also observed in parts of Kerala and Madhya Pradesh. During the monsoon season, most LF is encountered over Jammu and Kashmir, Himachal Pradesh, Punjab and the Indo-Gangetic Plain (IGP), with values from 0.35 to 0.45 flash/km2/day. West Bengal, Tamil Nadu and some parts of Jharkhand are more prone to LF in September–November (post-monsoon), with values ranging from 0.05 to 0.1 flash/km2/day (Yadava et al. 2020).

Figure 2

Seasonal variation of LF/lightning flashes (flash/km2/day) over India.

5.1.2 CAPE

Figure 3 shows the seasonal distribution of CAPE (J/kg) over India for the 16-year period 1998–2013. Higher CAPE is observed over the coastal regions, as depicted in figure 3, with values reaching as high as 2250 J/kg owing to the large moisture content there. High CAPE is also found over the foothills of the Himalaya due to orographic features. Over the WH region, the orographic features give rise to higher lightning. Orographic lifting of air parcels in mountain areas is one of the mechanisms that can provide the finite vertical displacement needed to initiate conditional instability and feed the CAPE required for the formation of lightning (Pawar et al. 2015; Yadava et al. 2020). During the pre-monsoon season, CAPE in South and East India shows a stronger upward trend than CAPE in Central and West India. The upward trend of CAPE is more pronounced during the monsoon season. An upward trend denotes an increase in CAPE per year, and a downward trend a decrease in CAPE per year. During the monsoon season, most parts of India show increasing trends, except the western and northern regions.

Figure 3

Seasonal variation of convective available potential energy (CAPE J/kg) over India.

5.1.3 TCWV

The formation of lightning requires sufficient water vapour, which plays an important role in the creation of hydrometeor particles. Thus, higher Total Column Water Vapour (TCWV) will lead to stronger convection and lightning activity (Shi et al. 2018). The spatial and seasonal variability in TCWV over the Indian subcontinent is shown in figure 4. Total moisture content (water vapour) is found to be high over coastal and adjoining sea areas in most seasons. A much-reduced moisture supply is recorded over landlocked regions, which is the primary cause of lower lightning in these areas. The lowest TCWV is found over the northern part of India, as shown in figure 4.

Figure 4

Seasonal variation of TCWV over India.

5.1.4 RH

The seasonal relative humidity for 1998 to 2013 over India is assessed using the ERA-Interim dataset. Figure 5 shows the maximum relative humidity (60–80%) over the northeastern part of India. Over northern India, RH values of 40–50% are found, while in the central part of India RH is below 20% during the winter season. In the pre-monsoon, 50–60% RH is observed over north India and 40% RH over the eastern coast of India. Over Arunachal Pradesh, RH values of 50–60% are observed. The lightning process is strongly affected by RH, and RH and lightning show a positive correlation. The LF maximum is observed in the pre-monsoon (northeast region) and monsoon (north region) seasons over India. Over the NE region during the pre-monsoon season, lightning occurs due to the nor’westers (local meteorological conditions in which two different air masses, cold–moist and dry–hot, arriving from different directions converge and are forced to rise, forming thunderclouds; Lal et al. 2018), whereas in the monsoon season lightning is associated with high RH. RH also shows maxima over these regions of India. Table 1 shows the seasonal variation of the climatic parameters.

Figure 5

Seasonal variation of Relative Humidity (RH %) over India.

Table 1 The seasonal variation of climatic parameters.

5.1.5 ST

Figure 6 shows the spatial distribution of surface temperature (ST2m) over India. Maximum ST is observed during April and May (above 300 K), after which it starts decreasing towards the monsoon season. During the post-monsoon season (September–November), temperature follows a different monsoon cycle, the northeast (retreating) monsoon, which brings dry, cool and dense air masses to large parts of India. This period marks the transition from wet to dry seasonal conditions in most parts of India. Monthly average surface temperatures in the Bay of Bengal region are slightly higher than in the Arabian Sea region. In the southern and western parts, maximum temperatures are approximately 300 K, as shown in the figure.

Figure 6

Seasonal variation of Surface Temperature (ST) over India.

5.1.6 OLR

Figure 7 depicts the spatial distribution of OLR during different seasons in India. OLR indicates the intensity of outgoing radiation and hence whether there are cloud clusters or bright sunshine (Pushpanjali et al. 2020). Figure 7 shows a sharply increasing pattern of OLR over the Arabian Sea and the western side of India during summer, and a gradual decrease over the Bay of Bengal side during the winter and pre-monsoon seasons. In general, OLR is high in the month of May. As cloudiness increases in June and July, OLR decreases.

Figure 7

Seasonal variation of Outgoing Longwave Radiation (OLR W/m2) over India.

5.2 Variability analysis of lightning with climatic parameters

In the present section, LF variability has been determined with the important climatic parameters over all distinguished regions of India as marked in figure 1.

5.2.1 LF vs. CAPE

The region-wise climatology of CAPE, TCWV, RH, ST and OLR is shown in figure 8, including the west coast and east coast regions. Maximum LF is observed over the coastal regions and decreases gradually away from them. This demonstrates the importance of the land–sea thermal contrast, which leads to strong advection of moisture and sea breezes along the land–sea interface (Pielke 1974), resulting in more hydrometeors and, therefore, the largest LF. North-central and northwestern India receive less moisture (owing to their inland location), so they rely more on dry convection, making CAPE a major determinant of lightning characteristics.

Figure 8

Region-wise monthly variability in lightning, CAPE, TCWV, RH, ST and OLR over seven regions.

Over the EC region, CAPE increases sharply from February, peaks in May and then decreases during the monsoon months (June, July and August). In all other regions, CAPE begins to rise in January, peaks in August and then decreases. The East Coast region shows the maximum CAPE, followed by the West Coast, with the minimum in the Western Himalaya. The maximum correlation between LF and CAPE is observed over the IP, EC and WH regions (r ≥ 0.90), owing to the higher temperature and humidity over the coastal regions, which lead to convection and to more graupel (ice and hail particle concentrations above the freezing layer) (Rakov and Uman 2003; Verma et al. 2021). Over the Western Himalaya, LF and CAPE show a good correlation (r = 0.94) because of orographic convection processes. The Himalayas comprise high-altitude areas that support widespread lightning activity caused by topographic uplift. Although the CAPE over NC and NW is lower than over EC, LF still occurs there because of the lower land-to-sea ratio. In coastal areas, lightning activity accompanies higher CAPE values, while the reverse is true for the plains.

5.2.2 LF vs. RH

Figure 8 also shows the variability in RH. The NE (74%), EC (70%) and NC (68.5%) show the maximum RH, as these regions are located near the Bay of Bengal. Cyclonic conditions appear over the Bay of Bengal owing to high temperature and coastal downwelling (wind-driven sinking of surface water towards the ocean floor). A low RH appears in the NW region (52%) because of the Great Thar Desert, while RH in the remaining regions is approximately 60–68%. In the NC region, LF and RH show a good correlation (r = 0.70). There, RH favours lightning because higher humidity can lead to larger hydrometeor concentrations as well as stronger updraft speeds, both of which contribute to intense lightning.

5.2.3 LF vs. TCWV

Figure 8 also depicts the region-wise variation of TCWV. A high TCWV (57 kg/m2) is observed over the NC and EC owing to high moisture content, and the lowest TCWV (14 kg/m2) is observed over the WH because of its high altitude. Over the rest of the regions, TCWV varies from 50 to 55 kg/m2. After the NE, the maximum LF occurs in the EC region because of high TCWV. TCWV and LF show a good correlation (r = 0.89) over the WH, where TCWV is the main factor in the generation of lightning. The WC receives a much-reduced moisture supply (TCWV 52 kg/m2) because of coastal upwelling (a process in which deep, cold water rises toward the surface). This is the main reason for lightning strikes in these areas.

5.2.4 LF vs. ST

Figure 8 shows the correlation between LF and ST. Most of the regions indicate a high correlation between ST and LF (approx. 0.9). The correlation is highest over the EC region (0.94) and lowest over the NE region (0.61). The high correlation between ST and LF over the EC reflects the downwelling over the Bay of Bengal (BoB), which causes the maximum temperature to appear on the BoB side.

5.2.5 LF vs. OLR

Figure 8 shows the simultaneous correlation between lightning and outgoing longwave radiation for the period 1998–2013. Over most Indian regions, such as EC, IP, NC and NE, OLR shows a prominent negative correlation with LF, implying that outgoing radiation and lightning in tropical regions oscillate in opposite phase. Over the WH region, however, lightning and OLR show a positive correlation (0.82), owing to higher insolation leading to significant OLR during lightning. Previous studies on OLR (Kumar and Kamra 2012; Siingh et al. 2014) also show that OLR has less impact on lightning flashes.

5.3 Principal component analysis (PCA)

Using the eigenvalue decomposition (equations 2 and 3), six principal components (PC1, PC2, PC3, PC4, PC5 and PC6) are calculated from the six variables TCWV, CAPE, LF, RH, ST and OLR. Figure 9 shows the PC analysis as a biplot, representing the distribution of the variables and of the data points for PC1 and PC2. The loadings of PC1 and PC2 are also shown on the biplot for all the variables. These loadings are calculated using equation (4) and help to establish the relationship, or correlation, between the principal components and the original datasets. The loadings of the PCs are shown in table 2.

Figure 9

Biplot shows PC1 and PC2 scores bottom x-axis and left y-axis, respectively.

Table 2 Loading of variable lightning, RH, TCWV, CAPE, ST and OLR.
$${\mathrm{Loading \; of \; PC}}={\mathrm{Eigen vector}}\times \sqrt{\mathrm{Eigen value}}$$
(4)
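The loading definition referred to in equation (4) can be illustrated with a short numerical sketch in which each eigenvector is scaled by the square root of its eigenvalue. The example assumes NumPy and uses a toy 2 × 2 correlation matrix rather than the study data; the function name `pc_loadings` is hypothetical.

```python
import numpy as np


def pc_loadings(eigvecs, eigvals):
    """Loadings = eigenvector * sqrt(eigenvalue), applied column by column."""
    eigvecs = np.asarray(eigvecs, dtype=float)
    eigvals = np.asarray(eigvals, dtype=float)
    # Broadcasting scales column k of eigvecs by sqrt(eigvals[k]).
    return eigvecs * np.sqrt(eigvals)


# Toy correlation matrix of two variables with correlation 0.6 (illustrative only).
corr = np.array([[1.0, 0.6],
                 [0.6, 1.0]])
eigvals, eigvecs = np.linalg.eigh(corr)             # eigenvalues returned in ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # put PC1 (largest eigenvalue) first
loadings = pc_loadings(eigvecs, eigvals)
print(np.round(loadings, 3))
```

For standardised variables, each loading is the correlation between a variable and a principal component, and the product of the loading matrix with its transpose recovers the original correlation matrix.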

The biplot in figure 9 shows PC1 and PC2 scores on the bottom x-axis and left y-axis, respectively. The top x-axis and right y-axis represent loadings obtained by multiplying the standard deviation of the principal components by the square root of the number of observations. Among the directional vectors, TCWV represents total column water vapour, CAPE represents convective available potential energy, LIS represents LF, RH represents surface relative humidity, ST represents surface temperature and OLR represents outgoing longwave radiation.

Each arrow in the plot represents the direction of a variable according to its contribution to the PCA. TCWV has a positive loading on PC1. CAPE has positive loadings on both PC1 and PC2, and the angle between LF and CAPE is the smallest, indicating that CAPE is highly correlated with LF. RH has a negative loading on PC1 and a positive loading on PC2. The angles that RH and TCWV make with lightning are approximately equal; thus, both parameters correlate equally with LF. RH and TCWV are inversely related, as the lines representing them in figure 9 point in opposite directions. OLR has negative loadings on both PC1 and PC2.

The proportion of variance and standard deviation of all the PCs are calculated in order to see how widely the data are spread about the mean and how much each PC contributes to the total variance. The proportion of variance, standard deviation and cumulative proportion of variance are given in table 3 and shown graphically in figure 10. From the table, PC1 accounts for around 39% of the variance, the largest share, whereas PC2 accounts for 30%.
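The quantities summarised for each PC (standard deviation, proportion of variance and cumulative proportion) follow directly from the eigenvalues, as the following sketch shows. It assumes NumPy; the function name `variance_summary` is hypothetical, and the eigenvalues are invented for illustration: only the first two are chosen so as to reproduce the 39% and 30% quoted above, and the remaining four are arbitrary values that make the total equal to six.

```python
import numpy as np


def variance_summary(eigvals):
    """Return (standard deviation, proportion of variance, cumulative proportion) per PC."""
    eigvals = np.asarray(eigvals, dtype=float)
    std_dev = np.sqrt(eigvals)             # standard deviation of each PC
    proportion = eigvals / eigvals.sum()   # share of total variance per PC
    cumulative = np.cumsum(proportion)     # running total of the shares
    return std_dev, proportion, cumulative


# Hypothetical eigenvalues for six standardised variables (they sum to 6).
demo_eigvals = np.array([2.34, 1.80, 0.90, 0.54, 0.30, 0.12])
std_dev, proportion, cumulative = variance_summary(demo_eigvals)
for k, (s, p, c) in enumerate(zip(std_dev, proportion, cumulative), start=1):
    print(f"PC{k}: sd = {s:.3f}, proportion = {p:.2f}, cumulative = {c:.2f}")
```

With these illustrative numbers, the first two components together would account for 69% of the variance.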

Table 3 Standard deviation, proportion of variance and cumulative variance of principal component (PC1, PC2, PC3, PC4, PC5 and PC6) of PCA.
Figure 10

Graphical representation of table 3.

5.4 Taylor diagram for lightning and meteorological parameters

A Taylor plot is a graphical representation that helps to determine the statistical correlation, standard deviation and standard error between two different sets of data. In this section, correlations and standard deviations for all seven regions (EC, IP, NW, WC, WH, NC and NE) are presented. Figure 11 depicts the correlation of LF with CAPE, TCWV, RH, ST and OLR over all regions. The maximum correlation between LF and CAPE is 0.96 over IP, followed by 0.94 over WH and 0.90 over both the WC and EC regions; in the remaining regions it ranges from 0.80 to 0.85. The correlation between LF and TCWV is highest in WH (0.89) and lowest in the WC region (0.23). The correlation between LF and RH is highest in the NC region (0.70) and lowest in the WC region (0.13). The correlation between LF and ST is highest over the EC region (0.94) and lowest over the NE region (0.61). The correlation between LF and OLR is highest over the WH region (0.82) and lowest, and negative (–0.52), over the EC region. The Taylor diagrams thus show a good correlation between LF and CAPE over the IP, WH, WC and EC regions, implying that CAPE is the main governing factor of LF over these regions.
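The statistics that place a point on a Taylor diagram can be computed as in the sketch below, which returns the two standard deviations, the correlation coefficient and the centred root-mean-square difference of a pair of series. It assumes NumPy; the function name `taylor_statistics` and the monthly values are invented for illustration. Because LF and CAPE have different units, the example standardises both series first, which is an assumption of this sketch and not a documented step of the study.

```python
import numpy as np


def taylor_statistics(reference, test):
    """Return (sigma_ref, sigma_test, correlation, centred RMS difference)."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    if ref.ndim != 1 or ref.shape != tst.shape:
        raise ValueError("reference and test must be 1-D arrays of equal length")
    ref_anom = ref - ref.mean()
    tst_anom = tst - tst.mean()
    sigma_ref = np.sqrt(np.mean(ref_anom ** 2))
    sigma_tst = np.sqrt(np.mean(tst_anom ** 2))
    corr = np.mean(ref_anom * tst_anom) / (sigma_ref * sigma_tst)
    crmsd = np.sqrt(np.mean((tst_anom - ref_anom) ** 2))
    return sigma_ref, sigma_tst, corr, crmsd


def standardise(series):
    """Convert a series to zero mean and unit (population) standard deviation."""
    series = np.asarray(series, dtype=float)
    return (series - series.mean()) / series.std()


# Made-up monthly values, for illustration only (not results from the paper).
lf_demo = np.array([0.02, 0.03, 0.10, 0.25, 0.30, 0.28,
                    0.20, 0.18, 0.12, 0.06, 0.03, 0.02])
cape_demo = np.array([100.0, 150.0, 400.0, 900.0, 1400.0, 1300.0,
                      1100.0, 1000.0, 800.0, 400.0, 200.0, 100.0])
s_ref, s_tst, r, e = taylor_statistics(standardise(lf_demo), standardise(cape_demo))
print(f"sigma_ref = {s_ref:.2f}, sigma_test = {s_tst:.2f}, r = {r:.2f}, centred RMSD = {e:.2f}")
```

The three quantities are tied together by the law of cosines, e² = σ_ref² + σ_test² − 2 σ_ref σ_test r, which is what allows them to be shown on a single diagram.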

Figure 11

Taylor plot presents statistical correlation between lightning and different climatic parameters over all regions.

5.5 Vertical wind shear

Vertical wind shear (VWS) is the change in wind speed and/or direction with height in the atmosphere. More generally, wind shear occurs when there is a considerable difference in wind speed or direction between different elevations or between closely spaced horizontal locations. It can occur both horizontally and vertically and is regularly observed at the Earth's surface as well as at higher altitudes. In this study, VWS refers to the deep-tropospheric shear between the 200- and 850-hPa levels (Chen et al. 2006).

VWS is calculated using the following equation:

$${\mathrm{VWS} }= \sqrt {\left( {\Delta}u \right)^{2} \; + \;\left( {\Delta}v \right)^{2} }$$
(5)

where VWS represents the magnitude of wind shear (m/s), Δu is the difference in the u component of wind velocity between two (200- and 850-hPa) levels, and Δv is the difference in the v component of wind velocity between two (200- and 850-hPa) levels.
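
Equation (5) can be sketched in a few lines of Python. The function name `vertical_wind_shear` and the input values below are hypothetical and serve only to illustrate the arithmetic; they are not taken from the reanalysis dataset.

```python
import math


def vertical_wind_shear(u200, v200, u850, v850):
    """Magnitude of the 200-850 hPa wind shear (m/s), following equation (5).

    u200, v200: zonal and meridional wind components at 200 hPa (m/s)
    u850, v850: zonal and meridional wind components at 850 hPa (m/s)
    """
    delta_u = u200 - u850  # difference in the u component between the levels
    delta_v = v200 - v850  # difference in the v component between the levels
    return math.hypot(delta_u, delta_v)  # sqrt(delta_u**2 + delta_v**2)


# Illustrative values only: delta_u = 3 m/s and delta_v = 4 m/s.
print(vertical_wind_shear(10.0, 4.0, 7.0, 0.0))  # 5.0
```

Because the differences are squared, the result does not depend on which level is subtracted from the other. For gridded data, the same expression would be applied at every grid point.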

The VWS (m/s) has been calculated using equation (5) from the ERA-interim reanalysis dataset for 1998 to 2013 (Hersbach et al. 2020) and plotted spatially over India in figure 12. The strongest wind shear was observed during winter (December–February), varying from 8 to 55 m/s, followed by summer (March–May), varying from 7 to 37 m/s, and post-monsoon (September–November), varying from 15 to 34 m/s. During monsoon (June–August), a high easterly wind shear is observed at lower latitudes, varying from 3 to 28 m/s. Moving north towards the IGP, the shear decreases (3–11 m/s) and wind conditions are calm. In the pre-monsoon season, the updraft of warm, moist air from the Bay of Bengal moves towards the northeast and meets dry air. The strong vertical wind shear there helps maintain a separation between the dry and moist regions, allowing thunderstorms to persist and potentially intensifying lightning over the northeast region. In the monsoon, however, regional variations and factors such as topography and local atmospheric dynamics influence the relationship between wind shear and lightning.

Figure 12

Vertical wind shear (m/s) between 850 and 200 hPa during 1998–2013.

It is important to note that while wind shear is a contributing factor to the development of thunderstorms and lightning, it is not the sole determinant.

6 Conclusions

In this work, we have analysed long-term LIS data from 1998 to 2013 and determined the relationship of LF with CAPE, RH, TCWV, ST and OLR. We identify regional-scale meteorological factors associated with lightning outbreaks. The results demonstrate that lightning preferentially occurs during March, April and May (MAM) at lower elevations over the northeast regions of India, while a larger proportion of lightning occurs in higher-elevation regions during the monsoon season, i.e., June, July and August (JJA), representing a reversal of the relationship between lightning and meteorology during the transition from summer to monsoon. The statistical analysis shows that the lightning flash rate follows CAPE in the pre-monsoon season. Compared with the winter and post-monsoon seasons, CAPE is high over the eastern coastal region (1750–2250 J/kg) in the pre-monsoon season and over the northern region (1250–1500 J/kg) in the monsoon season. The spatio-temporal distribution shows that most lightning flashes took place in MAM (pre-monsoon) (0.40–0.45 flashes/km2/day) in the northeast region, covering states such as Assam, Meghalaya and Tripura, whereas in JJA (monsoon) the LF shifts from the northeast to the northwest part of India, covering Jammu and Kashmir, Himachal Pradesh and some parts of north Punjab. During post-monsoon, the entire Eastern Ghats and Kerala (0.10–0.15 flashes/km2/day) experience high LF. In winter, the northwestern region received the highest lightning flashes (0.05–0.15 flashes/km2/day). TCWV is found to be higher in the eastern coastal region of India in all seasons, mainly in the monsoon (JJA). During pre-monsoon, high TCWV values (50–70 kg/m2) are found over the southeastern coastal zone of India. Surface relative humidity is unevenly distributed over India, with higher values in the monsoon season (60–80%) and the lowest in the winter season. Statistical analysis is performed to examine the seasonal correlation between lightning flashes and various climatic parameters.

Further analysis has been performed over seven different regions of the Indian subcontinent. We found the maximum LF in the NE region (0.049 flash/km2/day), followed by the EC region (0.041 flash/km2/day), and the lowest in the WC region (0.027 flash/km2/day) of India. The CE region marked a significant increase in CAPE, followed by the NE region. The increase in RH (74%) can lead to higher values of CAPE in the region. The RH shows significant values over all seven regions except the WC and NW regions, where it is approximately 53%; over the rest of the regions, it ranges from 55 to 74%. TCWV was found to be highest in the EC region of India, i.e., 57.71 kg/m2. This could be the probable reason for the increased lightning activity over these regions. The PCA result shows that CAPE, TCWV, RH and ST are significantly well correlated with LF. However, lightning is not significantly well correlated with OLR.