1 Introduction

With the increasing population, urbanization and economic development, the occurrence of large earthquakes has a long lasting impression. The knowledge of temporal properties of these damaging earthquakes in a geographic region is a crucial factor in earthquake hazard quantification and associated disaster mitigation (Scholz 2019; Verma et al. 2024; Pasari and Sharma 2020). Traditionally, it has been assumed that the number of earthquakes in a fixed time interval follows a Poisson distribution, suggesting that the interevent times must follow the time-independent exponential distribution (Cornell 1968; Utsu 1984). However, due to the “memoryless” nature of the exponential model, recent studies have considered a variety of time-dependent models to examine the recurrence statistics of large earthquakes (Utsu 1984; Yadav et al. 2012; Pasari 2018). In fact, as earthquake occurrence is influenced by dynamic factors such as aftershocks, foreshocks and seismicity clusters that evolve over time, employing time-dependent models would be crucial to accurately capture and predict seismic activity patterns (Kagan and Jackson 1991; Mangira et al. 2019). In other words, the time-dependent models would be more appropriate to provide valuable insights for earthquake risk assessment and disaster preparedness (Kagan and Jackson 1991; Papazachos et al. 1997). These studies essentially assume that the occurrence of large events (main shocks) in a spatial region is consistent with a random process. As a consequence, the dependent events, such as foreshocks and aftershocks, are commonly removed from the earthquake records to produce a sequence of random sample (Pasari 2015). The interevent time statistics and associated long-term (10–30 years) large earthquakes’ occurrence probability values are then derived through the observed sample data. In the present study, we focus on the interoccurrence time between successive M ≥ 6 events in the seismically active Nepal Himalaya and its adjacent regions with population of more than 50 million people. Using time-independent, time-dependent, heavy-tailed and exponentiated models, we not only estimate the cumulative probability of M ≥ 6 events, but also generate a series of hazard function curves for various combination of elapsed times (time elapsed since the last large earthquake) and residual times (time to a future large earthquake).

Lying in the central part of the ~ 2900-km-long Himalayan megathrust system, Nepal and its adjacent regions are reported to have intense seismicity rates due to the ongoing tectonic collision between the India plate and the Eurasia plate (Yin 2006). On April 25, 2015, the country has experienced a devastating Mw 7.8 (Gorkha) earthquake that has led to a death toll of ~ 9,000 people, injuring 22,000 more. Besides, the country has witnessed a series of large earthquakes in the past. For example, an earthquake of magnitude Mw 7.6–7.9 occurred near the Kathmandu valley in 1833, causing around 500 human causality; in 1934, the Bihar–Nepal megathrust earthquake of magnitude Mw 8.0 struck near the Mount Everest, killing around 12,000 people and causing extensive damage in Nepal and in northern part of Bihar (India); in 1988, another strong earthquake with magnitude Mw 6.9 occurred in the eastern part of Nepal near the Indian border, killing at least 700 people from Bihar and Nepal (Sharma 2021). About a year ago, a moderate event (Mw 5.7) hit the district of Doti (Nepal) on November 9, 2022, and killed six people. The event, although is not a severe one, has reignited the panic of large earthquakes among the citizens of Nepal and its neighboring regions. Therefore, it is timely to re-estimate the recurrence statistics of large earthquakes in the Nepal Himalaya and estimate the occurrence probability to infer future earthquake hazards in the region.

As noted above, in addition to time-independent exponential distribution, several previous studies have used time-dependent models, such as lognormal, gamma and Weibull models (Utsu 1984). The efficacy of other models, such as the bell-shaped normal distribution (Papazachos et al. 1987), asymmetric inverse Gaussian or Brownian Passage Time (BPT) distribution (Matthews et al. 2002; Pasari 2018), triple exponential model (Kijko and Sellevoll 1981), generalized gamma model (Bak et al. 2002), heavy-tailed (heavier tail than the exponential distribution) distributions, such as Pareto, Levy, lognormal and inverse Weibull (Frechet) (Pasari 2015), and the exponentiated group of distributions, such as exponentiated exponential, exponentiated Rayleigh and exponentiated Weibull distributions (Pasari and Dikshit 2015a, b; Pasari 2015, 2018), has also been explored in deriving recurrence time statistics and associated earthquake forecasts in a seismic renewal process. Many of these competitive distributions share exciting physical properties. For example, the inverse Gaussian (BPT) distribution offers a physical connection between observable event times and a formal state variable that not only follows a Brownian relaxation oscillator but also reflects the macro-mechanics of stress and strain accumulation (Matthews et al. 2002). Both inverse Gaussian and lognormal exhibit unimodal density curve with non-monotone hazard function (ratio between the density and the survival function); for both distributions, the hazard function starts from zero, increases to reach to the maximum value and then gradually decreases to a constant asymptotic value (for inverse Gaussian) or to zero (for lognormal); both distributions exhibit reproductive (hereditary) property, in additive manner (inverse Gaussian) or in multiplicative manner (lognormal) (Pasari 2019b). Likewise, the exponentiated exponential, exponentiated Rayleigh and the exponentiated Weibull not only reveal similar characteristics to the familiar gamma and Weibull distributions, but also enable researchers more flexibility due to the additional shape parameter (Gupta et al. 1998; Pasari and Dikshit 2018); for instance, the density function of exponentiated exponential distribution is monotonically decreasing when β ≤ 1, and for β > 1, it is unimodal and right-skewed similar to gamma and Weibull densities (Gupta and Kundu 1999); the hazard function of exponentiated exponential assumes increasing (β > 1), decreasing (β < 1) and constant (β = 1) shape like the hazard functions of gamma and Weibull distributions (Pasari 2018). Together, various shapes of hazard (failure) function play key role in earthquake reliability and survival analysis, especially for the interpretation of future earthquake probabilities for a set of elapsed times and residual times (Sornette and Knopoff 1997; Pasari 2015).

Here we consider twelve reference probability distributions to examine various statistical properties of interoccurrence times in the Nepal Himalaya. We estimate model parameters via the maximum likelihood estimation and assess modeling uncertainty through Fisher information and Cramer–Rao bound. Using three model selection approaches, we rank the suitability of these competitive distributions. The conditional probability plots (hazard function curves) and their implications for future earthquakes are also discussed in detail.

2 Study area and earthquake data

2.1 Tectonic background

The study area comprises entire Nepal, southern part of Tibet, some portion of the fertile north Indian river plain, and its adjoining areas bounded by 26° N to 31° N and 79° E to 89° E (Fig. 1). The region is one of the most seismically active areas in the world due to the high deformation rates from India–Eurasia plate collision (Bilham et al. 2001; Yin 2006). The earthquakes have mostly of thrust focal mechanisms occurring at shallow to intermediate depths (depth 10–80 km). The Himalayan megathrust system comprising a series of three east–west megathrust faults, namely the Himalayan Frontal Thrust (HFT), Main Boundary Thrust (MBT) and the Main Central Thrust (MCT), has a clear demarcation in this region (Fig. 1) (Sharma 2021). This megathrust system is assumed to merge beneath the Himalaya at ~ 20–25 km depth into a basal decollement, known as the Main Himalayan Thrust (MHT). The MHT, serving as an interface between the down-going Indian plate and the overriding Himalayas, has hosted many devastating earthquakes in the region. A number of geodetic studies have suggested that the deeper part of the MHT is creeping under the higher Himalaya, whereas the shallower part (i.e., MFT) is locked in the lesser Himalaya with a slip deficit rate of 14–18 mm/yr (Sharma et al. 2023a, b). This deficit rate accumulated over time gathers sufficient energy to produce large Himalayan earthquakes in the region (Bilham et al. 2001; Ader et al. 2012; Sreejith et al. 2018; Sharma et al. 2020). To mention, the western Nepal that belongs to the “central seismic gap” bounded by the epicenters of the 1905 Kangra earthquake and the 1934 Bihar–Nepal earthquake is believed to host one or more highly damaging earthquakes in the near future (Khattri and Tyagi 1983; Sreejith et al. 2018; Sharma et al. 2020).

Fig. 1
figure 1

Seismotectonic map of the Nepal Himalaya (denoted by the blue solid line) and its adjacent regions. Spatial distribution of large earthquakes (M > 6.0) occurred on the study area (bounded by 26° N to 31° N and 79° E to 89° E) (Table 1). Red stars indicate the location of the 1934 Bihar–Nepal event (Mw 8.0) and the most recent 2015 Gorkha event (Mw 7.8); HFT—Himalayan Frontal Thrust, MBT—Main Boundary Thrust and MCT—Main Central Thrust

In the Nepal Himalaya, there are more than 90 peaks of elevation 7000 m including the world’s highest mountain peak, the Everest (8,848 m). The Siwalik group forms a folded Cenozoic piedmont region along the Nepal Himalaya. The higher Himalaya comprises ~ 10 km thick succession of crystalline rocks and fossiliferous sedimentary rocks, whereas the lesser Himalaya in this part consists of the un-fossiliferous sedimentary and meta-sedimentary rocks of Precambrian to Miocene age (Yin 2006). Three major formations in the Nepal Himalaya are the Kathmandu Nappe (~ 3–4 km thick Lower-Paleozoic Himalayan strata), Dadeldhura Klippe (a synformal klippe that is a continuation of the Almora Klippe in the east direction) and the Jajarkot Nappe that represents the lowest metamorphic grade exposures of the Himalayan orogeny (Sharma 2021). In addition, the region includes the Faizabad Ridge (FZR), a subsurface extension of the Bundelkhand massif with a very low-magnitude seismicity. This ridge detaches Gandak depression in the east from Sarda depression in the west (Sharma et al. 2020). Although the region shows several seismotectonic variations including spatial heterogeneity in seismicity pattern and faulting mechanism, such effects will not be considered in the present empirical analysis that focuses on the statistical distribution of earthquake interevent times in a simple point process (Utsu 1984; Pasari and Dikshit 2015a, b).

2.2 Earthquake data

Nepal and its adjacent regions have a long history of large earthquakes. For the present analysis, we compile earthquake data of M ≥ 6 events from 1800 through 2022 using three seismicity databases, namely the Indian Meteorological Department (IMD), International Seismological Centre (ISC) and Global Centroid Moment Tensor (GCMT) database. We note that the earthquakes are not homogeneous in magnitude, particularly the events listed in IMD during the pre-instrumental era. Using empirical relations (Scordilis 2006; Figures S1, S2, S3) mentioned below, we convert all different magnitudes to the moment magnitude scale \(\left( {M_{W} } \right)\).

$$\begin{gathered} M_{{\text{S}}} - M_{{\text{W}}} :\, \hfill \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,M_{{\text{W}}} = 0.67\left( { \pm 0.005} \right)M_{{\text{S}}} + 2.07\left( { \pm 0.03} \right),\,\,{\text{for}}\,\,\left( {3.0 \le M_{{\text{S}}} \le 6.1} \right) \hfill \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,M_{{\text{W}}} = 0.99\left( { \pm 0.020} \right)M_{{\text{S}}} + 0.08\left( { \pm 0.13} \right),\,\,{\text{for}}\,\,\left( {6.2 \le M_{{\text{S}}} \le 8.2} \right) \hfill \\ m_{{\text{b}}} - M_{{\text{W}}} :\, \hfill \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,M_{{\text{W}}} = 0.85\,\left( { \pm 0.040} \right)m_{{\text{b}}} + 1.03\left( { \pm 0.23} \right),\,\,{\text{for}}\,\,\left( {3.5 \le m_{{\text{b}}} \le 6.2} \right) \hfill \\ M_{{\text{L}}} - M_{{\text{W}}} :\, \hfill \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,M_{{\text{W}}} = 0.934\left( { \pm 0.135} \right)M_{{\text{L}}} + 0.356\left( { \pm 0.714} \right),\,\,{\text{for}}\,\,\left( {3.4 \le M_{{\text{L}}} \le 7.6} \right) \hfill \\ \end{gathered}$$
(1)

The homogenized catalog comprises 48 events with magnitude Mw ≥ 6. The geographic location of epicenter, earthquake depth, event size (magnitude) and time of occurrences of these large earthquakes are mentioned in Table 1. The catalog, however, contains dependent events, such as foreshocks and aftershocks, which must be identified and subsequently removed to adhere to the assumption of independent sequence of earthquake point processes in the study region (Utsu 1984). For this, we classify the dependent events using a space–time window approach which states that any event in the proximity of another larger event in both space and time should be treated as a dependent event (Gardner and Knopoff 1974). Moreover, realizing the earthquake size dependency in aftershocks, we consider a dynamic space–time window method (Uhrhammer 1986) with a more conservative choice (Pasari 2018). After examining several distance and time windows in producing robust and consistent conclusions to remove dependent events, we add/subtract 60 days to the time window and 15 km to the distance window of Uhrhammer (1986) relations. Therefore, the search radius for earthquake declustering is considered as

$$r = \exp \left( { - 1.024 + 0.804M_{W} } \right) \pm 15,$$
(2)

and the time window as

$$t = \exp \left( { - 2.870 + 1.235M_{W} } \right) \pm 60.$$
(3)
Table 1 Focal parameters of \(M_{W} \ge 6.0\) events in the Nepal Himalaya and its neighboring areas during 1800–2022

Based on the above space–time window, we mark nine events (~ 19%) as dependent events, including one aftershock of the May 1852 event (Mw 6.9), one foreshock and one aftershock of the June 1966 Dharchula event (Mw 6.3), one aftershock of the May 2008 earthquake (Mw 6.7) and five aftershocks of the 2015 Gorkha (Mw 7.8) earthquake. Without dependent events, the catalog finally contains 39 large events (Table 1), producing 38 interevent times for further analysis.

After homogenization and declustering, we employ a magnitude–frequency-based visual cumulative test to analyze time completeness of the present catalog (Mulargia and Tinti 1985; Pasari 2015). For this, we first obtain the cumulative number of earthquakes versus occurrence times plot, followed by a linear fit to the data through least-squares regression (Pasari 2018). We note that a near-perfect linear trend is present with R2 = 0.97 (Fig. 2). Therefore, the present catalog (Table 1) is deemed to be time complete, indicating that earthquake rates and moment releases in the Nepal Himalaya are consistent over sufficiently longer time period. Employing the maximum curvature (MAXC) technique through ZMAP software (Wiemer 2001), the magnitude completeness threshold (Mc) of the present catalog turns out to be 6.0, indicating that all earthquakes of magnitude 6.0 and higher have been appropriately considered in the study (Figure S4). Moreover, the annual occurrence rate of earthquakes in the study region is observed approximately 23.4, 3.6 and 0.2 for magnitudes 4, 5 and 6, respectively.

Fig. 2
figure 2

Cumulative number of earthquakes of magnitude (as listed in Table 1) versus occurrence times plot in the study region during 1800–2022

3 Methods and results

The principal task in the present study is to investigate seismic recurrence time statistics distribution between successive large earthquakes in the study region. Such interoccurrence time analysis is commonly used to characterize long-term earthquake hazard in terms of occurrence probability. We assume that the observed interevent times exhibit no correlations among them, and they constitute a random sample corresponding to a positive continuous random variable (Utsu 1984). Under this setup, the adopted methodology has three primary steps: choice of reference probability distributions, statistical inference involving estimation and model testing, and estimating conditional probability for a future earthquake. In the first step, we choose a set of twelve reference probability distributions based on previous studies and highlight their important characteristics in data analysis. In the second step, we use the maximum likelihood estimation (MLE) method to infer model parameters based on the available sample data. We examine the estimated parameters’ confidence bounds through Fisher information and Cramer–Rao lower bound, whereas we test the performance of the applied distributions using three model selection approaches, namely the Akaike information criterion (AIC), Kolmogorov–Smirnov (K–S) goodness-of-fit test and the Chi-square test. Finally, we use the best-fit probability distribution to derive a number of occurrence probability curves for different elapsed times and residual times.

4 Reference probability distributions

Let \(T\) denote the random variable of interoccurrence times of Mw ≥ 6 events (main shocks) in the study region with its probability density function \(f\left( t \right)\), cumulative distribution function \(F\left( t \right)\) and hazard function \(h\left( t \right)\). Based on previous studies (e.g., Utsu 1984; Parvez and Ram 1997, 1999; Yadav et al. 2008, 2010, 2012; Bantidi et al. 2022; Working group 2013; Pasari 2015, 2018; Pasari and Dikshit 2014a, 2014b, 2015a, 2015b, 2018; Bajaj and Sharma 2019; Kourouklas et al. 2022), we consider 12 candidate reference probability distributions to model the observed sample \(\{ t_{1} ,t_{2} ,t_{3} , \cdots ,t_{38} \}\) of size 38. These distributions include the time-independent exponential model and time-dependent gamma, lognormal, Weibull, Levy, Maxwell, Pareto, Rayleigh, inverse Gaussian, inverse Weibull, exponentiated exponential and exponentiated Rayleigh models. Specifically, following the reliability theory, recurrence statistics of earthquakes has been discussed by several Japanese researchers (e.g., Utsu 1972, 1984; Hagiwara 1974; Rikitake 1976) in the early stages of implementation. Among these Utsu (1984) has formally discussed the recurrence of earthquakes in Japan through four renewal models, namely exponential, gamma, lognormal and Weibull. Later on, a series of studies (e.g., Utsu 1984; Parvez and Ram 1997, 1999; Yadav et al. 2008, 2010, 2012, Working group 2013; Pasari 2015, 2018; Pasari and Dikshit 2014a, 2014b, 2015a, 2015b, 2018; Bajaj and Sharma 2019) have concentrated on these four probability distributions for earthquake interoccurrence time analysis. To this end, Weibull distribution appears to be the most popular and versatile probability model in statistical seismology (Pasari 2015). Abaimov et al. (2008) also suggested that the Weibull distribution is the favored model for describing recurrence times on the San Andreas fault. Corral (2003, 2004) employed global catalogs and observed that the gamma distribution provides a good fit for intermediate and large values of recurrence time. To note, Kagan and Knopoff (1987) introduced the inverse Gaussian distribution, which Matthews et al. (2002) adopted as the BPT distribution for their specific regions.

For the studied probability distributions, we mention density functions, their supports and a basic explanation of model parameters in Table 2. We note that except Pareto, all distributions consider positive real line as their support. Out of these 12 distributions, four distributions (exponential, Levy, Maxwell and Rayleigh) have one parameter, whereas the rest of them have two parameters. The exponential distribution appears in seismology to describe earthquake interevent times under a homogeneous Poisson process, though it produces a constant hazard function over time. The gamma and Weibull distributions have two parameters, one scale parameter (responsible to control the spread of distribution) and one shape parameter (responsible to produce various appearances). The shape parameter particularly brings out a large variety of flexibility. When the shape parameter takes unit value, both distributions become identical to an exponential distribution. Moreover, as these distributions are popular in survival analysis to model residual times (also known as waiting time or time to failure), they are commonly used in seismic interoccurrence time analysis (Pasari 2015). Under different conditions, these two distributions enable monotonically increasing, decreasing and constant failure rate. The lognormal distribution, like Levy, Pareto or inverse Weibull (Frechet), is a commonly used heavy-tailed distribution that puts higher probability to large events. While the lognormal distribution has distinctive applications in modeling maintenance time of a system, the heavy-tailed models are generally used in modeling huge insurance losses, income data, wildfire and earthquake sizes (Foss et al. 2011). The hazard functions of lognormal and Frechet distributions are non-monotone unimodal upside-down (concave-down \(\cap\) shape) bathtub shape, whereas the hazard rate pattern of Pareto is decreasing (Johnson et al. 1995). The Maxwell and Rayleigh distributions belong to one-parameter family of distributions, with Maxwell having extensive applications in particle speed analysis in statistical physics, while Rayleigh with increasing hazard rate has found applications in medical statistics and oceanographic studies among others. Both distributions have been used in statistical seismology. The inverse Gaussian distribution, also known as BPT distribution, is a popular temporal model (e.g., Matthews et al. 2002; Pasari and Dikshit, 2015a, 2015b, 2018) for interarrival times and consequent long-term seismic forecasting. Unlike many probability models, this distribution with a non-monotone hazard pattern asymptotically attaining a nonzero value enables noteworthy connection to the earthquake mechanics of stress and strain accumulation (Matthews et al. 2002). In addition, we consider two distributions from the exponentiated group, namely the exponentiated exponential and the exponentiated Rayleigh (Mudholkar and Srivastava 1993). With an additional shape parameter, these distributions generalize the exponential and Rayleigh distributions, respectively. Both distributions have many characteristics with commonly used renewal time distributions, such as exponential, gamma and Weibull distributions (Gupta and Kundu 1999; Pal et al. 2006; Pasari and Dikshit 2015a; Mahmoud and Ghazal 2017).

Table 2 Probability distributions, density functions and associated Fisher information matrices (FIM)

4.1 Statistical inference

In order to carry out statistical inference based on available earthquake interarrival times, we utilize the maximum likelihood estimation (MLE) method for parameter estimation of the studied distributions, the Fisher information matrix to compute the variance–covariance matrix associated with the MLE estimates and three model selection approaches to arrange the distributions’ performance. The MLE method works on the principle that the estimated model parameters must maximize the joint likelihood for a given set of sample data points (Johnson et al. 1995). The method often requires to solve a set of linear or nonlinear likelihood equations. On the other hand, the Fisher information matrix (FIM), coupled with the Cramer–Rao lower bound, provides a measure of uncertainty in terms of asymptotic standard deviations and confidence limits of the estimated model parameters (Johnson et al. 1995; Hogg et al. 2005; Pasari 2015).

Let \(\theta = \left( {\theta_{1} ,\theta_{2} , \cdots ,\theta_{p} } \right)\) denote the parameters for a reference distribution. Then, the symmetric and positive semi-definite FIM \(I_{p \times p} \left( \theta \right)\) can be defined (Hogg et al. 2005) as below.

$$\begin{aligned} I_{p \times p} \left( \theta \right) &= \left( {I_{ij} \left( \theta \right)} \right)_{i,j = 1,2, \cdots ,p} \\ &= E\left( { - \frac{{\partial^{2} \ln f\left( {T;\theta } \right)}}{{\partial \theta_{i} \partial \theta_{j} }}} \right)_{i,j = 1,2, \cdots ,p} \\& = E\left[ {\left( {\frac{{\partial \ln f\left( {T;\theta } \right)}}{{\partial \theta_{i} }}} \right)\left( {\frac{{\partial \ln f\left( {T;\theta } \right)}}{{\partial \theta_{j} }}} \right)} \right]_{i,j = 1,2, \cdots ,p} \\& = \frac{1}{n}E\left( { - \frac{{\partial^{2} L\left( {T;\theta } \right)}}{{\partial \theta_{i} \partial \theta_{j} }}} \right)_{i,j = 1,2, \cdots ,p} \\ \end{aligned}$$
(4)

Here, \(E\) denotes the expectation, \(f\left( {t;\theta } \right)\) is the density function, and \(L\left( {T;\theta } \right)\) is the log-likelihood function. After obtaining Fisher information matrix, we compute the asymptotic variance–covariance matrix \(\sum_{{\hat{\theta }}}\) for each distribution through the Cramer–Rao bound (Hogg et al. 2005) defined as \(\sum_{{\hat{\theta }}} \ge \left[ {nI\left( {\hat{\theta }} \right)} \right]^{ - 1}\), where \(\hat{\theta }\) is the maximum likelihood estimate of \(\theta\). Finally, we provide a 95% two-sided confidence limit of the estimated model parameters as \(\hat{\theta } - 1.96\sqrt {\left[ {\sum_{ii} \left( {\hat{\theta }} \right)} \right]}_{i = 1,2, \cdots ,p} < \theta < \hat{\theta } + 1.96\sqrt {\left[ {\sum_{ii} \left( {\hat{\theta }} \right)} \right]}_{i = 1,2, \cdots ,p} .\) To note, the exact (not asymptotic) standard deviations are available for the Pareto distribution (Quandt 1966; Pasari 2015), whereas the uncertainty analysis for the exponentiated Rayleigh distribution cannot be performed as the explicit formulation of the FIM is unavailable (Pal et al. 2006). In Table 3, we list the estimated parameter values along with their associated uncertainties.

Table 3 Estimated parameter values with their asymptotic standard deviations and confidence bounds

The estimated parameter values given in Table 3 suggest the following noteworthy characteristics of the underlying earthquake system: (i) as the shape parameter \(\left( {\hat{\beta }} \right)\) in each of gamma, Weibull and exponentiated exponential is less than 1.0, the associated hazard function turns out to be monotonically decreasing, indicating that the expected time to the next earthquake will increase with an increasing elapsed time (Sornette and Knopoff 1997; Pasari 2015); (ii) inverse Gaussian and lognormal distributions exhibit non-monotone hazard shapes that gradually reach to a constant asymptotic value \(\frac{\beta }{{2\alpha^{2} }} \approx 0.0015\) and zero, respectively (Pasari 2018); (iii) as the estimated shape parameter \(\left( {\hat{\beta }} \right)\) of the Pareto distribution is less than 1, it does not allow us to compute mean interevent time or standard deviation (Johnson et al. 1995); (iv) the hazard function associated with the exponentiated Rayleigh distribution assumes a “bathtub-type” shape as \(\hat{\beta } < 0.5\) (Pal et al. 2006; Pasari and Dikshit 2018), and (v) for exponentiated exponential distribution with \(\hat{\beta } < 1\), we cannot evaluate the parametric uncertainty for this distribution as \(\psi^{\prime}\left( {\hat{\beta } - 1} \right)\) is not defined (Gupta and Kundu 1999).

After parameter estimation and uncertainty treatment, we now rank the reference distributions according to their performance against the observed interevent times. For this, we use three popular model selection approaches, namely the AIC, K–S and the Chi-square criterion. The AIC in general penalizes a model with more parameters and it is defined as \(AIC = 2k - 2L\), where \(k\) denotes the number of parameters and L is the log-likelihood value. Therefore, a model with the least AIC value is deemed to be the most preferable model for a given dataset. In contrast, the K–S nonparametric goodness-of-fit test compares the distance between the cumulative distribution function (CDF) of the tested distribution and the empirical distribution function (EDF), under the null hypothesis that the data are distributed according to the distribution. Arranging the K–S distances in an increasing order, we rank the studied reference distributions (Table 4). In addition to the K–S point measures, we also present a number of K–S plots in Figs. 3 and 4, and Figure S5 to examine the overall fit of the reference CDFs with the EDF. After AIC and K–S tests, we employ the minimum Chi-square criterion that uses observed and expected frequencies to prioritize a group of distributions. For computation of Chi-square value, here we use six classes (< 1, 1–3, 3–5, 58, 8–14, > 14). The detailed results of model selection tests are summarized in Table 4.

Table 4 Model selection results using three goodness-of-fit tests and criteria
Fig. 3
figure 3

Comparison of the best five CDF (i.e., exponential, gamma, Weibull, exponentiated exponential and exponentiated Rayleigh) of the tested distributions against EDF through K–S plots

Fig. 4
figure 4

Comparison of the a gamma against Weibull, b exponential against exponentiated exponential, c gamma against exponentiated exponential, d gamma against exponentiated Rayleigh, e Weibull against exponentiated exponential, f Weibull against exponentiated Rayleigh, g exponential against exponentiated Rayleigh and h exponentiated exponential against exponentiated Rayleigh distributions through K–S plots

Table 4 shows that the exponentiated Rayleigh distribution has the lowest AIC value, least K–S distance and the minimum Chi-square value, whereas the exponentiated exponential, Weibull, exponential and gamma distributions also reveal a satisfactory fit to the observed interarrival times in the study region. Therefore, the best performance comes from the exponentiated Rayleigh (rank 1), exponentiated exponential (rank 2), Weibull (rank 3), exponential (rank 4) and the gamma distribution (rank 5), (ii) an intermediate fit comes from the lognormal (rank 6) and the inverse Weibull distribution (rank 7), whereas (iii) the distributions, namely Maxwell (rank 8), Rayleigh (rank 9), Pareto (rank 10), Levy (rank 11) and inverse Gaussian (rank 12), show poor fit based on the AIC, K–S and Chi-square criteria. For further insights of the model suitability, we demonstrate a pairwise comparison (Fig. 4) between the competitive distributions, such as (a) gamma–Weibull, (b) exponential–exponentiated exponential, (c) gamma–exponentiated exponential, (d) gamma–exponentiated Rayleigh, (e) Weibull–exponentiated exponential, (f) Weibull–exponentiated Rayleigh, (g) exponential–exponentiated Rayleigh and (h) exponentiated exponential–exponentiated Rayleigh. We observe that Weibull is almost inseparable from the exponentiated exponential, though there is a noticeable difference between exponentiated exponential and its parent distribution (exponential), exponentiated exponential and exponentiated Rayleigh, or any two other distributions. Overall, the proximity or farness among these best-fit distributions motivate researchers to further investigate model efficacy in domain-specific practical applications (Johnson et al. 1995; Gupta and Kundu 1999; Scholz 2019).

4.2 Large earthquake occurrence probabilities

After analyzing the performance of the reference distributions, our final task is to assess long-term earthquake occurrence probabilities in the study region. For this, we utilize the best-fit exponentiated Rayleigh distribution to compute cumulative probability and a series of conditional probabilities for different elapsed times (time elapsed since the last large earthquake) and residual times (time to a future large earthquake). The conditional probability values are often represented in terms of hazard function curves of the best performed model (Fig. 5) for scientific, public and commercial purposes (Yadav et al. 2008; Pasari and Dikshit, 2014a, b). The conditional probability of a future event in the time window \(\left( {\tau ,\,\,\tau + v} \right)\) for a given elapsed time \(\tau\) and prospective residual time \(v\) can be mathematically defined (Pasari 2015) as

$$P\left( {V \le \tau + v|V \ge \tau } \right) = \frac{{F\left( {\tau + v} \right) - F\left( \tau \right)}}{1 - F\left( \tau \right)}\,\,\,\,\,\,\,\,\,\,\,\,\left( {v > 0} \right)$$
(5)
Fig. 5
figure 5

Conditional probability curves Hazard function curves for various elapsed times (\(\tau = 8,10,15,...,35\) years) as computed from a gamma, b Weibull, c exponentiated exponential and d exponentiated Rayleigh for M≥6 events in the Nepal Himalaya. The dot line represents the hazard curve corresponding to an elapsed time of 8 years (since the last powerful Gorkha earthquake in April 2015)

To note, the cumulative probability describes the occurrence of a future event within a certain time from the last earthquake, whereas the conditional probability determines the chance of an earthquake in the interval \(\left( {\tau ,\,\,\tau + v} \right)\), knowing that there has been no large event in the last \(\tau\) years. Using the most suitable exponentiated Rayleigh distribution, it is found that the estimated cumulative probability of a M ≥ 6 event in the Nepal Himalaya reaches 0.90–0.95 by 2028–2031, whereas the conditional probability reaches 0.90–0.95 by 2034–2037. These probability values are alarmingly high to draw attention of the disaster management authorities in Nepal.

As the last large earthquake in Nepal occurred on April 25, 2015, the elapsed time as of today is \(\tau = 8\) years (i.e., April 25, 2023). Therefore, for \(\tau = 8\) years and \(v = 1,4,7, \cdots ,25\) years, we compute conditional probabilities and their 95% confidence bounds (Table 5) using all the four best performed models, namely the gamma, Weibull, exponentiated exponential and exponentiated Rayleigh. Results show that the conditional probabilities according to all distributions reach 0.95 in about 14–25 years from now (2037–2048). Similarly, varying both elapsed time \(\left( {\tau = 8,10,15, \cdots ,35{\text{ years}}} \right)\) and residual time, we calculate a series of conditional probabilities and graphically display them in Fig. 5 and Figure S6 to examine hazard for large earthquakes in the study region. The curves in Fig. 5 show a consistent pattern for gamma, Weibull and exponentiated exponential distributions, though the exponentiated Rayleigh exhibits a relatively higher conditional probability value. The curves in Figure S6 highlight the short-term earthquake occurrence behavior corresponding to \(\tau = 8{\text{ (2023) and }}\tau = 10\,\,(2025)\) with future time \(v = 10\) year. To investigate more, we present conditional probabilities (Table 6) computed from the best-fit exponentiated Rayleigh distribution corresponding to \(\tau = 8\,(2023),10\,(2025),15\,(2030), \cdots ,35\,(2050)\) and \(v = 1,2,3, \cdots ,15.\) We note that unlike the exponential distribution that provides conditional probability independent of the elapsed time, the conditional probabilities corresponding to the exponentiated Rayleigh increase with increasing elapsed times. In fact, this observation is contrasting to the conditional probability calculated from gamma, Weibull or exponentiated exponential as each of these distributions with \(\hat{\beta } < 1\) exhibits a decreasing hazard rate for the present earthquake catalog.

Table 5 Estimated conditional probabilities for an elapsed time of 8 years (i.e., April 2023) in the Nepal Himalaya and its adjoining regions
Table 6 Estimated (from exponentiated Rayleigh model) conditional probabilities that an earthquake with magnitude \(M_{W} \ge 6\) will occur in next \(v\) years (residual time), given that no \(M_{W} \ge 6\) event has occurred in last \(\tau\) years (elapsed time) since the last Gorkha event in April 2015

5 Discussion

The Nepal Himalaya lies in the middle of the Himalayan collision boundary between India and Eurasia continental plates. Comprising major faults, lineaments and several physiographic units, the region regularly experiences large earthquakes at shallow to intermediate depths, with dominant thrusting mechanism. The most recent devastation in Nepal comes from the Mw 7.8 Gorkha earthquake on April 25, 2015, at a depth of around 8.2 km from the surface. This earthquake has reignited the necessity of long-term earthquake hazard analysis in the Nepal Himalaya. Among the wide range of techniques, such as satellite-based measures of surface deformation, geophysical approaches, seismological and geological studies, the present analysis concentrates on deriving the recurrence-time statistics of large (M ≥ 6) powerful earthquakes in the study region. Based on more than 200 years (1800–2022) of seismicity data that includes hypocentral information, the location of initial slip, magnitude and the origin time, we tabulate interevent times (between the main shocks) in the specific space–time window and implement twelve reference probability distributions to fit the data. These distributions include exponential, gamma, lognormal, Weibull, Levy, Maxwell, Pareto, Rayleigh, inverse Gaussian, inverse Weibull, exponentiated exponential and exponentiated Rayleigh. Identifying the most suitable reference distribution(s), we finally estimate long-term earthquake hazards in terms of conditional probabilities for various elapsed times and residual times.

There have been many previous studies to identify the most appropriate renewal time distributions. For example, Utsu (1984) employed four distributions (i.e., exponential, gamma, lognormal and Weibull) in Japan and observed that the lognormal distribution provides the best representation in some cases but worst in others, whereas gamma and Weibull provide an intermediate fit. Later, Nishenko and Buland (1987) developed recurrence time statistics using lognormal and Weibull models, and noted that lognormal enables a better fit. In a similar effort, Rikitake (1991) used Weibull and lognormal distributions for earthquake hazard analysis in Tokyo, Japan. For different seismotectonic provinces of Iran, Yazdani and Kowsari (2011) used five reference distributions, namely exponential, Pareto, lognormal, Rayleigh and gamma. They found that exponential and Pareto provide a reasonable recurrence time prediction in Iran. This type of interarrival time analysis has been performed by Parvez and Ram (1997, 1999) for the Hindu Kush and northeast India, and later for the entire India using Weibull, lognormal and gamma distributions. It was observed that gamma and Weibull models have better performance for the Indian subcontinent. Likewise, Yadav et al. (2008) carried out probabilistic assessment of earthquake hazards in three regions of Gujarat using lognormal, gamma and Weibull distributions. It was noted that for two regions, gamma turns out to be the best model, whereas for the other region, lognormal has the best fit to the observed data. Later, Yadav et al. (2010) implemented same set of three reference models in northeast India and concluded that gamma distribution is the most preferable one for the study region. For the northwest India and its adjoining regions, Chingtham et al. (2016) applied two stochastic models (lognormal and Weibull) and remarked that lognormal has better fit than the Weibull distribution. Recently, Pasari and Dikshit (2015a, 2015b) and Pasari (2018) used a versatile set of eleven to thirteen reference probability distribution to compute earthquake hazards for large earthquakes in the seismotectonically active regions of northeast India, western India (Kachchh, Gujarat) and northwest Himalaya. These studies have suggested that for the northeast India, the best representation comes from gamma, exponentiated exponential and Weibull, whereas for the Kachchh region, the time-independent exponential distribution performs best, and for the northwest Himalaya and its adjoining areas, the exponentiated Weibull, exponentiated exponential, Weibull and gamma distributions have relatively better suitability than the others. In a distinctive effort, Bajaj and Sharma (2019) divided the entire Himalayan arc into four seismogenic source zones (northwestern Himalaya, region of central seismic gap, eastern Nepal and Sikkim, and eastern Himalaya) and examined long-term earthquake hazards from large events in these regions. Based on four stochastic models, namely lognormal, gamma, inverse Gaussian and Weibull, their study has shown that the performance of the probability models depends on the selected magnitude range. For instance, in the northwestern Himalaya, gamma has the best fit for interarrival times corresponding to M ≥ 6 events, whereas inverse Gaussian has the best representation for M ≥ 7 events; for the central seismic gap region, inverse Gaussian is the best for M ≥ 6 earthquakes and lognormal for M ≥ 7 events; in the eastern Nepal and Sikkim, the best-fit distribution turns out to be lognormal for both magnitude ranges, and for the eastern Himalaya, the best fit comes from inverse Gaussian (M ≥ 6) and gamma (M ≥ 7) models. Recently, Pasari (2019b) studied the the similitude of inverse Gaussian and lognormal distributions in earthquake forecasting and they found that the heavy-tailed lognormal distribution has better fit to the observed interevent time data for all three studied regions. Therefore, above studies suggest that the empirical fitting of a reference distribution many a times varies from region to region and catalog to catalog. In light of this, here we explore the suitability of twelve competitive distributions and develop conditional probability curves through the best-fit models. It was found the best representation comes from the exponentiated Rayleigh distribution, though exponential, gamma, Weibull and exponentiated exponential provide a similar performance.

The concept presented in this work originally stems from the idea of stochastic renewal process of earthquake “cycles” in a large geographic region. The earthquake cycle defined here is of course different from the recurring events associated with a single fault or a fault section, as such a rigid concept of seismic cycle does not allow researchers to examine statistical distribution of seismicity in space, time and size. Throughout the literature, on the grounds of familiarity, simplicity and convenience, four members of renewal time distributions, namely exponential, gamma, lognormal and Weibull, have been extensively used to study recurrence times and associated earthquake forecasting, though there are some instances where the underlying model (say, Brownian passage time and exponentiated Weibull) is constructed based on theoretical grounds (Matthews et al. 2002; Pasari and Dikshit 2018). In contrast to earthquake forecasting that looks ahead in time, Rundle et al. (2016) recently formulated the concept of “earthquake nowcasting” based on the idea of renewal process of earthquake cycles through a “short-term fault memory.” Following the original formulation, Pasari et al. (2021) developed seismic nowcasting in Nepal to quantify the current level of seismic progression at 24 major cities through irregular repetitive cycle of regional earthquakes. Using natural times, the intersperse event counts of “small” earthquakes (say, M ≥ 4) between a pair of subsequent “large” earthquakes (say, M ≥ 6), the authors have developed five reference distributions (exponential, gamma, Weibull, exponentiated exponential and exponentiated Weibull) to empirically derive earthquake potential score (EPS) of these cities in a 0–100% scale of extremity. The nowcasting technique, unlike forecasting, does not require to separate out dependent events, such as foreshocks and aftershocks (Rundle et al. 2016). The analysis in Nepal assigns EPS values between 59 and 99% to 24 major cities, with the scores of metropolitan areas Kathmandu (95%), Pokhara (93%), Lalitpur (95%), Bharatpur (93%), Biratnagar (92%) and Birganj (93%), sub-metropolitan cities Janakpur (95%), Ghorahi (67%), Hetauda (93%), Dhangadhi (94%), Tulsipur (59%), Itahari (93%), Nepalgunj (97%), Butwal (96%), Dharan (93%), Kalaiya (93%) and Jitpur Simara (93%), and seven municipality areas Birtamod (89%), Damak (92%), Budhanilkantha (95%), Gokarneshwar (95%), Bhimdatta (94%), Birendranagar (99%) and Tilotamma (97%) (Pasari et al. 2021). Physically, these scores indicate, for example, that the capital city Kathmandu has reached the rear end of its earthquake cycle of magnitude 6.0 or higher events, whereas the Tulsipur city is near the middle of its earthquake cycle. Therefore, the results of earthquake nowcasting, in contrast to the earthquake forecasting, have some theoretical grounds on the basis of tectonic stress accumulation and energy release in the study region. Like conditional probabilities in forecasting, the EPS values enable a systematic ranking of the cities based on their “current” exposure to earthquake risk. In an effort to characterize earthquake hazard in Nepal using geodetic data, several researchers (e.g., Bilham et al. 2001; Ader et al. 2012; Lindsey et al. 2018; Sreejith et al. 2018; Bilham 2019; Sharma et al. 2020) have highlighted high-risk areas in terms of crustal velocity, strain distribution, moment deficit rate and earthquake potential. It has been suggested that the 2015 Gorkha earthquake has partially released the accumulated strain and the central seismic gap region covering the western part of Nepal is the most earthquake prone area to experience a great Himalayan earthquake (Mw ≥ 8.0) in the near future (Sreejith et al. 2018; Bilham 2019; Sharma et al. 2020). Therefore, with the high volume of observational data and sophisticated methodology, we believe that interdisciplinary efforts in future will provide more realistic earthquake hazard assessment strategy in the Nepal Himalaya.

6 Conclusions

The present study leads to the following conclusions:

  1. (i)

    The best performed distributions are the exponentiated Rayleigh (rank 1), exponentiated exponential (rank 2), Weibull (rank 3), exponential (rank 4) and the gamma (rank 5), whereas lognormal (rank 6) and the inverse Weibull distribution (rank 7) reveal intermediate suitability, and the rest, namely Maxwell (rank 8), Rayleigh (rank 9), Pareto (rank 10), Levy (rank 11) and inverse Gaussian (rank 12), have poor suitability to the observed interarrival times of large events in the Nepal Himalaya.

  2. (ii)

    The best-fit exponentiated Rayleigh model shows that the estimated cumulative and conditional probability of a M ≥ 6 event in the study region reach 0.90–0.95 in the next 5–8 years (2028–2031) and 11–14 years (2034–2037), respectively.

  3. (iii)

    Emanated conditional probability curves (hazard function curves) for different elapsed times and residual times reveal long-term seismic hazard in the study region. The probability values are alarmingly high to draw attention of the disaster management authorities in Nepal.