1 Introduction

The time series of earthquakes in a catalog is often treated as a Poisson process. The more such a time series is thinned by eliminating the smaller shocks, the more closely it resembles a Poisson process. The reason is that any stationary point process can be made to tend to a Poisson process by applying an entropy-increasing operation such as superposition, thinning, or random translation (Daley and Vere-Jones 2002; Lomnitz 1994). In reality, the earthquake process may not satisfy the conditions of independence, stationarity, and orderliness that define the Poisson process. Nevertheless, the Poisson assumption can be useful as a model when one is dealing with large, rare events.

Consider a sequence of earthquakes modeled as a Poisson process and let the rate of occurrence be λ events per unit time. The probability that an interval of time of length θ contains n events is

$$P\left( {n,\theta } \right) = \frac{{\left( {\lambda \theta } \right)^{n} }}{n!}e^{ - \lambda \theta } ;\quad n = 0,1,2, \ldots .$$
(1)
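As a quick numerical check, Eq. (1) can be evaluated directly. The following is a minimal Python sketch (the rate value is purely illustrative):

```python
import math

# Eq. (1): Poisson probability of n events in a time window of length theta,
# for a process with rate lam (events per unit time).
def poisson_prob(n, theta, lam):
    return (lam * theta) ** n / math.factorial(n) * math.exp(-lam * theta)

lam = 0.14  # illustrative rate, events per year
# Sanity check: the probabilities over all n sum to 1 for any window length.
total = sum(poisson_prob(n, 10.0, lam) for n in range(50))
assert abs(total - 1.0) < 1e-12
```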

Many seismic risk estimates are based on the assumption of Poisson-distributed seismicity, meaning that the intervals τ between events are distributed exponentially (cf. Parzen 1960):

$$f(\tau ) = \lambda e^{ - \lambda \tau } ;\quad \tau \ge 0;$$
(2)

where f(τ) is the probability density function (pdf) of τ. As Eq. (2) shows, f(τ) attains its maximum, equal to λ, at τ = 0. Figure 1 illustrates the exponential pdf and its cumulative distribution for two very different values of λ: λ = 0.14 year−1 corresponds to the global rate of M ≥ 8.5 significant earthquakes reported by NOAA over the last 130 years (NOAA 2013), and λ = 3.25 year−1 represents the rate of M ≥ 7.5 earthquakes.

Fig. 1

Top Exponential probability density distributions of interseismic intervals for two different occurrence rates. Bottom The corresponding cumulatives with vertical dash-dotted lines indicating the intervals for which the cumulative probability equals 0.5

Distribution (2) is monotonically decreasing, so that f(τₐ) < f(τᵦ) for all τₐ > τᵦ. For any value of λ, f(τ = 0) = e·f(τ = 1/λ). This might be interpreted as saying that even if the mean return period of a megaquake is a thousand years, it is e = 2.718… times more likely to occur in a small interval around today than around the average recurrence time!
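The factor of e can be verified directly; this sketch, assuming nothing beyond Eq. (2), shows that the ratio f(0)/f(1/λ) equals e regardless of the rate:

```python
import math

# Eq. (2): exponential pdf of the interseismic interval tau.
def exp_pdf(tau, lam):
    return lam * math.exp(-lam * tau)

# The ratio f(0) / f(1/lam) equals e for any rate lam.
for lam in (0.14, 3.25, 0.001):  # rates in events per year (last one illustrative)
    ratio = exp_pdf(0.0, lam) / exp_pdf(1.0 / lam, lam)
    assert math.isclose(ratio, math.e)
```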

2 The roles of the cumulative distribution and the seismicity rate

Let us proceed with our argument. It is well known that the probability of any specific interval τ is f(τ)dτ which is infinitesimal. Finite probabilities are obtained for finite interval ranges, so that

$$\int\limits_{0}^{{\tau_{0} }} {f\left( \tau \right){\text{d}}\tau } > \int\limits_{{\tau_{0} }}^{{2\tau_{0} }} {f\left( \tau \right){\text{d}}\tau } > \int\limits_{{2\tau_{0} }}^{{3\tau_{0} }} {f\left( \tau \right){\text{d}}\tau } > \cdots ,$$

and so on. Thus, the most probable interval range is the one that includes τ = 0, a rather alarming result. What does this say in terms of disaster prevention?
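The chain of inequalities can be checked numerically. This sketch (with an arbitrary, illustrative bin width τ₀) evaluates successive equal-width bin probabilities in closed form:

```python
import math

# Probability that an exponential interval falls in [k*t0, (k+1)*t0),
# i.e. F((k+1)*t0) - F(k*t0), with F(t) = 1 - exp(-lam * t).
def bin_prob(k, t0, lam):
    return math.exp(-lam * k * t0) - math.exp(-lam * (k + 1) * t0)

lam, t0 = 0.14, 1.0  # rate per year; one-year bins (illustrative choice)
probs = [bin_prob(k, t0, lam) for k in range(5)]
# Each successive bin is less probable than the one before it.
assert all(a > b for a, b in zip(probs, probs[1:]))
```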

The probability of small intervals should be considered in terms of the cumulative probability distribution

$$F\left( \tau \right) = \int\limits_{0}^{\tau } {f\left( u \right){\text{d}}u = 1 - e^{ - \lambda \tau } ;\quad \tau \ge 0,}$$
(3)

so that the probability of an interval having a duration between τ₁ and τ₂ is given by

$$P\left( {\tau_{1} ,\tau_{2} } \right) = F\left( {\tau_{2} } \right) - F\left( {\tau_{1} } \right);\quad \tau_{2} \ge \tau_{1} \ge 0.$$
(4)

From our earlier argument, the probability of occurrence of an interval τ in the range [τ₁, τ₁ + Δτ) is indeed maximal for τ₁ = 0 and decreases with increasing τ₁. The decrease is faster for larger values of λ. For instance, when λ = 0.14 year−1 the probability of an interval between 0 and 1 day is P(0, 1/365) = 0.00122, larger than the probability of an interval between 1 and 2 days, P(1/365, 2/365) = 0.001216, and so on.

However, the probability of an interval being longer than 1 day is 0.99878. Thus, an interval in the range of [0,1 day) is indeed the most probable one-day interval, but its actual probability is very small compared with that of its complement. In fact, for this value of λ, the range of smaller intervals has to become as large as [0,5.006 year) to reach a probability of 0.5. In conclusion, even though smaller intervals are more probable, an even chance of success requires having an interval in the range between 0 and 5 years.
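The comparison between a short leading interval and its complement can be reproduced from Eqs. (3) and (4). The sketch below uses λ = 0.14 year⁻¹; the exact decimals depend on the precision of the rate, so only the qualitative relations are asserted:

```python
import math

# Eq. (3): cumulative distribution of the interseismic interval.
def F(tau, lam):
    return 1.0 - math.exp(-lam * tau)

# Eq. (4): probability of an interval with duration between tau1 and tau2.
def interval_prob(tau1, tau2, lam):
    return F(tau2, lam) - F(tau1, lam)

lam = 0.14          # events per year
day = 1.0 / 365.0   # one day, in years
p_first_day = interval_prob(0.0, day, lam)
# The first one-day bin beats any later one-day bin...
assert p_first_day > interval_prob(day, 2 * day, lam)
# ...but is dwarfed by the probability of an interval longer than one day.
assert 1.0 - p_first_day > 0.99
```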

As illustrated in Fig. 1 (bottom), the uncertainty is less extreme for larger occurrence rates. Thus, for λ = 3.25 year−1, an even chance of success is attained for the range of 0–0.21 years. And if the rate of occurrence of smaller earthquakes is on the order of hundreds or thousands per year, the most likely interval does tend to zero.
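The "even chance" range is the median of the exponential distribution, ln 2/λ; a short sketch confirms the orders of magnitude quoted above (small differences from the quoted figures reflect rounding of λ):

```python
import math

# Median of the exponential distribution: F(tau) = 0.5 gives tau = ln(2) / lam.
def median_interval(lam):
    return math.log(2.0) / lam

print(median_interval(0.14))  # about 4.95 yr, close to the ~5 yr quoted above
print(median_interval(3.25))  # about 0.21 yr
```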

This uncertainty is also reflected in histograms of observed intervals. If the class intervals are too small or too large, the exponential fit becomes worse. There is an optimal class width, which may be selected by comparing the (theoretical) number of elements in any given class with the value of the distribution at the middle of that class. Figure 2 shows, for our two examples, the theoretical histograms and distributions for the class widths that yielded the best fit. For λ = 0.14 year−1, the most probable histogram interval is from 0 to 6.13 years, with probability 0.57209; for λ = 3.25 year−1, the most probable histogram interval is from 0 to 0.10 years, with probability 0.27775.

Fig. 2

Exponential probability density distributions of interseismic intervals for two different occurrence rates and the corresponding theoretical histograms for class widths that resulted in the best fits

The optimal class width is consistent with a well-known statistical rule of thumb: the number of classes over the range of observations should approximately equal the square root of the number of observations. Figure 3 shows the distribution of interval maxima obtained by Monte Carlo simulation with 10,000 realizations for each value of λ; circles represent the mean value and bars indicate ±1 standard deviation.

Fig. 3

Monte Carlo simulation results for the range of maximum intervals τ max as a function of λ; the circles represent the mean value of the maximum intervals, and the bars indicate ±1 standard deviation

For T = 130 year and λ = 0.14 year−1, N = 18 and \(\left\langle {\tau_{\hbox{max} } } \right\rangle = 23.53\,{\text{year}}\) result in four classes 5.88 years wide; for λ = 3.25 year−1, N = 423 and \(\left\langle {\tau_{\hbox{max} } } \right\rangle = 2.04\,{\text{year}}\) result in 21 classes 0.097 years wide. These class widths agree with those determined theoretically, with variations in the expected observation ranges well below one standard deviation.
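The simulation is straightforward to reproduce in outline. The sketch below draws exponential waiting times over T = 130 yr, records the largest inter-event interval of each realization, and applies the square-root rule of thumb; the seed and bookkeeping details are assumptions of this sketch, not the authors' code:

```python
import math
import random

# Monte Carlo sketch: mean and spread of the maximum interseismic interval
# for a Poisson process of rate lam observed over T years.
def max_interval_stats(lam, T=130.0, n_runs=10_000, seed=1):
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_runs):
        t, last, longest = 0.0, 0.0, 0.0
        while True:
            t += rng.expovariate(lam)      # exponential waiting time, Eq. (2)
            if t > T:
                break
            longest = max(longest, t - last)
            last = t
        maxima.append(longest)
    mean = sum(maxima) / len(maxima)
    var = sum((m - mean) ** 2 for m in maxima) / (len(maxima) - 1)
    return mean, math.sqrt(var)

lam = 0.14
mean_max, sd_max = max_interval_stats(lam)
n_events = lam * 130.0                  # expected N, about 18
n_classes = round(math.sqrt(n_events))  # square-root rule of thumb: 4 classes
width = mean_max / n_classes            # class width, roughly 6 yr
print(mean_max, sd_max, n_classes, width)
```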

Finally, consider the problem of forecasting occurrence times, when the intervals are distributed according to (2), for some time t beyond the last earthquake, which occurred at time t₀. Forecasting at t₀ for any possible t leads to the well-known waiting time paradox (cf. Feller 1971; Daley and Vere-Jones 2002). However, forecasting for a given t₁, given that the expected event has not yet occurred, presents no problem. For a given time t₁, the interval distribution is given by (2) conditioned on the fact that intervals shorter than τ₁ = t₁ − t₀ did not occur. This conditional density is closely related to the well-known hazard function:

$$f\left( {\tau |\tau \ge \tau_{1} } \right) = \frac{{\lambda e^{ - \lambda \tau } }}{{1 - \int_{0}^{{\tau_{1} }} {\lambda e^{ - \lambda \tau } {\text{d}}\tau } }} = \lambda e^{{ - \lambda (\tau - \tau_{1} )}} ,$$
(5)

which is equivalent to (2) for intervals \(\tau^{\prime } = \tau - \tau_{1} ;\, \tau^{{\prime }} \ge 0\).

This is a well-known property of the exponential distribution, due to the memorylessness of the Poisson process. Hence, our original discussion about the interval to the next earthquake applies at any given time.
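Memorylessness in Eq. (5) can be verified pointwise: the conditional density of τ given τ ≥ τ₁ coincides with the unconditional density of the shifted interval τ′ = τ − τ₁. A minimal sketch (rate and elapsed time are illustrative):

```python
import math

# Eq. (2): unconditional exponential pdf.
def exp_pdf(tau, lam):
    return lam * math.exp(-lam * tau)

# Eq. (5): pdf of tau conditioned on no event during the first tau1 time units.
def conditional_pdf(tau, tau1, lam):
    return lam * math.exp(-lam * tau) / math.exp(-lam * tau1)

lam, tau1 = 0.14, 10.0  # illustrative rate (per year) and elapsed time (years)
for dt in (0.0, 1.0, 5.0, 20.0):
    # Conditional density at tau1 + dt equals the plain density at dt.
    assert math.isclose(conditional_pdf(tau1 + dt, tau1, lam), exp_pdf(dt, lam))
```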

3 Conclusions

Although the exponential distribution of inter-earthquake intervals may suggest that shorter interval ranges are more probable than longer ones, the actual probability of a small range may be dwarfed by the probability of the interval falling outside it. This effect is particularly pronounced at low occurrence rates, as observed for very large earthquakes and megaquakes, which is the case of most practical interest. For the shorter intervals to be most probable in absolute terms, the interval range would have to be made long enough to exceed a probability of 0.5, which means that the actual value of the interval remains uncertain: it could take any value in the range, and the next earthquake could occur at any time within it.