1 Introduction

The territory of Azerbaijan is located within the central part of the Mediterranean tectonic belt, whose seismicity is caused by intensive geodynamic interaction of the Eurasian and Arabian lithospheric plates (McKenzie 1972; Reilinger et al. 2006; Kadirov et al. 2012; Kadirov et al. 2015).

Azerbaijan is one of the most seismically active regions worldwide, and several strong and catastrophic earthquakes with magnitude M ≥ 6 have occurred there in the past. These include the Goygol earthquake (1139), the Ganja earthquake (1235), the Eastern Caucasian earthquake (1668), the Mashtaga earthquake (1842), the numerous Shamakhi earthquakes (1192, 1667, 1669, 1828, 1859, 1868, 1872, 1902), and the Caspian earthquakes (957, 1812, 1842, 1852, 1911, 1935, 1961, 1963, 1986, 1989, 2000), which changed the relief, destroyed buildings completely, and caused numerous casualties. Nevertheless, it was only after the Caspian Sea earthquake of November 25, 2000 (M = 6.3; φ = 40°, λ = 50°, h = 35 km) that the seismic monitoring of Azerbaijan was improved by the installation of modern telemetric stations with a satellite communication system (Yetirmishli et al. 2013; Kadirov et al. 2013), providing rather good spatial coverage of the whole territory and surrounding regions.

The territory of Azerbaijan also hosts numerous mud volcanoes, which may be influenced by crustal deformation and earthquakes (Fig. 1). More than 250 mud volcanoes are located on land and at sea, and many are currently active (Yakubov et al. 1971; Aliyev et al. 2009, 2015). The relationship between the occurrence of large earthquakes and the eruptions of nearby mud volcanoes is well known and has been studied by Mellors et al. (2007) and Babayev et al. (2014). By analyzing their temporal–spatial relationships, especially in Azerbaijan, Mellors et al. (2007) showed that a statistically significant number of mud eruptions are triggered by earthquakes. Babayev et al. (2014) evaluated both the static and dynamic strains induced by earthquakes in the substratum of mud volcanoes. The relationship between crustal deformation (GPS velocity) and the activity of mud volcanoes has been studied by Kadirov et al. (2014) and Kadirov and Safarov (2013). The analysis of the relationships between contemporary deformation processes and mud volcanic activity demonstrates the overall predominance of vertical displacements, which may result from horizontal compression, with horizontal strains acting as triggers.

Fig. 1

Overview of the tectonics of Azerbaijan. Black vectors are GPS velocities relative to Eurasia from Reilinger et al. (2006). Red triangles are mud volcanoes. The focal mechanism solutions are from the Global CMT catalog (Ekström and Nettles 1997; Huang et al. 1997; Chen et al. 2001) and the white star marks the approximate location of the 1902 Shamakhi and 1139 Ganja earthquakes. NCT North Caucasus Thrust fault, MCT Main Caucasus Thrust fault, LCT Lesser Caucasus Thrust fault, WCF West Caspian fault, NCF North Caspian fault. The figure was generated using the Generic Mapping Tools (GMT) software (Wessel et al. 2013)

With the exception of some case studies mainly focused on seismic hazard and risk in some regions of Azerbaijan (Babayev 2010; Babayev and Telesca 2014), an exhaustive statistical analysis of the space-time-magnitude distribution of the entire instrumental seismic catalog of Azerbaijan is, to our knowledge, still lacking.

Therefore, the aim of the present study is to provide a detailed picture of the statistical properties of the most up-to-date seismic catalog of Azerbaijan and surrounding regions, covering 2003 to 2016.

2 Seismo-tectonic settings

The territory of Azerbaijan comprises the mountainous sections of the Greater Caucasus and the Lesser Caucasus, the Kur depression zone, and the South Caspian Basin (Fig. 1). The mountains of the Greater and Lesser Caucasus extend between the Black and Caspian seas and form part of the continuous Alpine-Himalayan orogenic belt (Nemčok et al. 2011; Kadirov et al. 2012; Alizadeh et al. 2016) (Fig. 1). The Greater and Lesser Caucasus are the main orogens of earthquake-prone Azerbaijan. The territory has been exposed to the continuous collision between the Arabian and Eurasian plates (McKenzie 1972; Sengor et al. 1985; Jackson 1992; Philip et al. 2003; Reilinger et al. 2006; Kadirov et al. 2012; Kadirov et al. 2015; Alizadeh et al. 2016). The collision closed the Greater Caucasus region and further deformed it together with the Eurasian Platform during the Middle-Late Miocene, and the Kur Basin and the Greater Caucasus became zones of maximum underthrusting (Nemčok et al. 2011).

Plate tectonic reconstructions provide only broad constraints on the timing of the initial collision of the Arabian Plate with Eurasia, placing it between 10 and 30 Ma BP (e.g., Allen et al. 2004; Kadirov et al. 2008; Alizadeh et al. 2016), and indicate that the rate of northward motion of Arabia relative to Eurasia has remained more or less constant at about 20 mm/year since the collision began (Reilinger et al. 2006). These reconstructions imply that Arabia has advanced 200 to 600 km “into” space formerly occupied by Eurasian continental lithosphere. This “intrusion” of Arabia into Eurasia continues to be accommodated by lithospheric shortening on roughly E-W striking thrust faults and by lateral displacement of lithosphere out of the collision zone along right-lateral strike-slip faults (McKenzie 1972; Sengor et al. 1985; Jackson 1992; Reilinger et al. 2006). These regional tectonic processes give rise to earthquakes that have devastated the Caucasus region throughout recorded history.
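As a quick consistency check of the figures quoted above, a constant convergence rate of about 20 mm/year sustained for 10 to 30 Myr reproduces the stated 200 to 600 km. The following minimal Python sketch only redoes this arithmetic; the function name is illustrative and does not come from the cited works.

```python
def total_convergence_km(rate_mm_per_year: float, duration_myr: float) -> float:
    """Total convergence in km for a constant rate (mm/year) over a duration (Myr)."""
    years = duration_myr * 1.0e6
    return rate_mm_per_year * years / 1.0e6  # 1 km = 1e6 mm

low_km = total_convergence_km(20.0, 10.0)   # collision started 10 Ma ago
high_km = total_convergence_km(20.0, 30.0)  # collision started 30 Ma ago
print(low_km, high_km)
```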

Repeated GPS measurements in Azerbaijan during the period 1998–2016 provide direct observations of present-day surface motions (Fig. 1). They clearly define active convergence between the Lesser Caucasus/Kur depression and the Greater Caucasus, with strain concentrated along the MCT (Philip et al. 2003; Reilinger et al. 2006; Kadirov et al. 2008; Kadirov et al. 2012; Telesca et al. 2015). Present-day slip rates on the MCT decrease from 10 ± 1 mm/year in eastern Azerbaijan to 4 ± 1 mm/year in western Azerbaijan (Kadirov et al. 2008; Kadirov et al. 2015). In the Lesser and Greater Caucasus, the observed stress pattern shows lateral variations. The seismic activity pattern provides important information about the recent block differentiation.

The predominant faults in Azerbaijan are longitudinal (sublatitudinal) faults following the Caucasus strike, which considerably obscure the expression of the transverse faults. The tectonics of Azerbaijan is characterized by the following main fault structures: the North Caucasus Thrust fault (NCT), the Main Caucasus Thrust fault (MCT), the Lesser Caucasus Thrust fault (LCT), the West Caspian fault (WCF), and the North Caspian fault (NCF) (Fig. 1). Compression is observed in the western part of Azerbaijan across the MCT, and the depression occurs southward alongside the northern edge of the mountain ring. In addition, an obvious transition from left-lateral to mostly right-lateral strike-slip occurs towards the southern part of the Greater Caucasus Mountain Range. Reverse dip slips in the north-northeastern direction are predominant along the MCT, which results in crustal contraction along the MCT (Nemčok et al. 2011; Kadirov 2000; Babayev and Telesca 2014; Telesca et al. 2013; Kadirov et al. 2015; Alizadeh et al. 2016). Figure 1 shows faults determined by surface geological mapping and those interpreted from earthquake and gravity data. The majority of this set of faults strike NW-SE. Mapped faults (Alizadeh 2008; Shikhalibeyli 1996; Kadirov 2000), based on observation of the dip-slip displacement component, indicate that some NW-SE striking faults comprise reverse and normal faults (Kadirov 2000; Agayeva and Babayev 2009). They dip predominantly towards the NE. Mapped NNW-SSE to NE-SW faults in the Greater Caucasus region indicate that they formed as dextral and sinistral strike-slip faults accommodating inhomogeneous shortening (Alizadeh 2008; Shikhalibeyli 1996).

Taking into account the geological structure, the level of seismicity, and a complex analysis of GPS velocities (Reilinger et al. 2006), seismicity (Kondorskaya and Shebalin 1982; Gasanov 2003; Babayev 2010; Babayev et al. 2010), the fractal dimension of the earthquakes, and the stress state of the Earth’s crust (Agayeva and Babayev 2009), the territory of Azerbaijan can be divided into several large zones: the southern slope of the eastern part of the Greater Caucasus (SSGC), the Kur depression (KD), the northern slope of the Lesser Caucasus (NSLC), the Gusar-Shabran depression (G-SD), the Absheron Peninsula (AP), the Talish Zone (TZ), and the Caspian Sea. There are four seismogenic zones along the southern slope of the eastern part of the Greater Caucasus: Balaken-Zagatala, Sheki-Gabala, Shamakhi-Ismayilli, and Absheron (Kadirov et al. 2013) (Fig. 2). The Balaken-Zagatala and Shamakhi-Ismayilli zones are characterized by extension, and the displacements over those areas are mainly normal dip slips and normal dip slips with strike-slip motion. The Sheki-Gabala and Absheron zones are mostly under compression, with thrust and reverse faults (Kadirov et al. 2013). The Lesser Caucasus is characterized by strike-slip faulting, while the Talysh Mountains are characterized by thrust regimes.

Fig. 2

The seismogenic zones along the southern slope of the eastern part of the Greater Caucasus: I, Balaken-Zagatala; II, Sheki-Gabala; III, Shamakhi-Ismayilli; and IV, Absheron

According to the map of focal mechanisms and stress distribution (Fig. 1), thrust faulting occurs under horizontal compression trending north-northeast in the western part of the southern Caucasus and east-northeast within the eastern part of the Greater Caucasus (Agayeva and Babayev 2009; Kadirov et al. 2013). Figure 1 shows the focal mechanisms of earthquakes with magnitudes larger than 5.

3 Data

Our multiparametric statistical analysis of the seismicity relies on the catalog of the Republican Seismic Survey Center of the Azerbaijan National Academy of Sciences, which lists earthquakes with local magnitude M ≥ 2 and is available at http://www.seismology.az/en/earthquakes#.WMVMgDieaPg (Fig. 3 shows the seismic network of the Republican Seismic Survey Center). Figure 4 shows the spatial distribution of the investigated seismic catalog from 2003 to 2016.

Fig. 3

Seismic network of the Republican Seismic Survey Center of the Azerbaijan National Academy of Sciences

Fig. 4

Spatial distribution of seismicity in Azerbaijan and surrounding regions from 2003 to 2016. The sizes of the crosses are proportional to the magnitude of the events

4 Methods and results

In this study, we investigate the seismicity that occurred from January 1, 2003, to April 21, 2016, in the territory of Azerbaijan and surrounding regions by employing several independent statistical approaches. Our aim is to obtain the most exhaustive description possible of the earthquake process in the territory of Azerbaijan by providing a complete space-time dynamical characterization of the Azerbaijani seismicity, which, to our knowledge, has not been performed so far.

4.1 The frequency-magnitude distribution

The frequency-magnitude distribution (FMD) in tectonic areas can be fitted by the Gutenberg-Richter (GR) law (Gutenberg and Richter 1944), a power-law relationship between a threshold magnitude M th and the cumulative number of seismic events with magnitude larger than that threshold. It is generally expressed as \( \log_{10} N = a - b M_{\mathrm{th}} \) (a straight line in semi-log scales), where N represents the cumulative number of events whose magnitude is above the threshold, a represents the earthquake productivity, and b is a critical parameter describing the size distribution of earthquakes (Gutenberg and Richter 1944; Ishimoto and Iida 1939). A large (small) b value indicates a relatively larger (smaller) proportion of less intense events with respect to the more intense ones. In particular, the b value can indirectly quantify crustal stress conditions (Scholz 1968; Wyss 1973) or identify volumes of active magma bodies (Wiemer et al. 1998). It is even employed to discriminate purely tectonic seismicity (b < 1.5–1) from volcano-tectonic earthquakes (b > 1.5), which are principally caused by hydraulic fracturing of the host rock induced by overpressurized magma and/or associated fluids. Variations in the seismic b value of acoustic emission events during stress buildup and release on laboratory-created fault zones have been investigated (Goebel et al. 2013), and evidence has been shown that the b value of the size distribution of acoustic emission events decreases linearly with differential stress (Scholz 2015). Tormann et al. (2015) found that b changes in space, mirroring the tectonic regime. Spada et al. (2013) found a negative correlation between b value and differential stress, thus confirming the idea of b as a stress meter in the Earth’s crust (Schorlemmer et al. 2005).

The reliable estimation of b is an important task for characterizing the different stages of seismicity evolution and, thus, changes in the dynamic processes; it is also of great importance in seismic hazard assessment (Naylor et al. 2009).

In this paper, we estimate the b value by means of the maximum likelihood method (Aki 1965),

$$ b=\frac{\log_{10}(e)}{\langle M\rangle-\left({M}_{\mathrm{c}}-\Delta {M}_{\mathrm{bin}}/2\right)}, $$
(1)

where <M> is the mean magnitude of the subset of seismic events with magnitude larger than or equal to the completeness magnitude M c, and ΔM bin is the binning width of the catalog (Utsu 1999).

The standard deviation of the estimate of b is calculated by using the formula of Shi and Bolt (1982),

$$ {\sigma}_b=2.3{b}^2\sqrt{\frac{\sum_{i=1}^N{\left({M}_i-\langle M\rangle\right)}^2}{N\left( N-1\right)}}. $$
(2)
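Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ZMAP implementation: the function name `b_value_aki`, the bin width of 0.1, and the synthetic test catalog (drawn from a GR law with b = 0.85) are all hypothetical choices made here for demonstration.

```python
import numpy as np

def b_value_aki(mags, m_c, dm_bin=0.1):
    """Maximum likelihood b value, Eq. (1), and Shi-Bolt uncertainty, Eq. (2)."""
    mags = np.asarray(mags, dtype=float)
    m = mags[mags >= m_c - 1e-9]          # keep events at or above M c (float tolerance)
    mean_m = m.mean()
    b = np.log10(np.e) / (mean_m - (m_c - dm_bin / 2.0))
    n = m.size
    sigma_b = 2.3 * b**2 * np.sqrt(np.sum((m - mean_m) ** 2) / (n * (n - 1)))
    return b, sigma_b, n

# Synthetic catalog (an assumption for illustration, not the Azerbaijani data):
# continuous GR magnitudes above M c - dM/2, then rounded to the bin width.
rng = np.random.default_rng(0)
b_true, m_c, dm = 0.85, 3.0, 0.1
continuous = (m_c - dm / 2.0) + rng.exponential(1.0 / (b_true * np.log(10)), size=5000)
catalog = np.round(continuous / dm) * dm

b_est, sigma_b, n_used = b_value_aki(catalog, m_c, dm)
print(f"b = {b_est:.3f} +/- {sigma_b:.3f} from {n_used} events")
```

The half-bin term in the denominator is what compensates for the rounding of magnitudes; dropping it would bias b upward on binned catalogs.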

The estimation of the b value depends on the estimate of the completeness magnitude M c which represents the minimum magnitude above which the seismic catalog can be considered complete. Selecting only the earthquakes with magnitude M ≥ M c ensures that the results of the statistical analysis are reliable.

Figure 5 shows the cumulative (CFMD, squares) and non-cumulative (NCFMD, triangles) frequency-magnitude distributions of the Azerbaijani seismicity, which are used to evaluate the completeness magnitude M c. Several methods perform such an estimation. Wiemer and Wyss (2000) proposed the maximum curvature (MAXC) method, which gives a simple estimate of M c as the magnitude corresponding to the highest frequency of earthquakes in the NCFMD (Fig. 5a). Another method, the so-called entire magnitude range (EMR) method (Woessner and Wiemer 2005), considers the entire magnitude set. Its complete part is modeled by a power law with a and b values estimated by the maximum likelihood method, while the incomplete part is modeled by a normal cumulative distribution function that describes the detection capability as a function of magnitude. This function is fitted to the data and depends on μ (the magnitude at which 50% of the earthquakes are detected), σ (the standard deviation describing the width of the range where earthquakes are partially detected), and M c, which represents the lower limit of the magnitudes that are detected with probability 1. The M c corresponds to the magnitude that maximizes the log-likelihood function of a, b, μ, and σ (Fig. 5b). Wiemer and Wyss (2000) also proposed a method based on the goodness of fit (GFT), calculated as the absolute difference in the number of earthquakes in the magnitude bins between the observed CFMD and synthetic CFMDs, which are computed using the a and b values of the GR law of the observed dataset for M ≥ M th as a function of increasing threshold magnitude M th. The M c of the catalog is taken as the magnitude above which 90% of the observed data are well modeled by a straight line (Fig. 5c). All these methods are implemented in the freely available software package ZMAP (Wiemer 2001).
In our case, the three methods give different values of M c: 2.1 (MAXC), 2.5 (EMR), and 2.9 (GFT), and correspondingly different values for the pair (a, b): (4.45, 0.507) (MAXC), (4.68, 0.579) (EMR), and (5.3, 0.763) (GFT). The standard deviations σ b, calculated by using Eq. (2), are 0.008 (MAXC), 0.01 (EMR), and 0.02 (GFT). The shape of the NCFMD is quite unusual: it appears bimodal, with two very close maxima, one at 2.1 and one at 3.0, although the absolute maximum is at 2.1, as identified by the MAXC method. The bimodal shape of the NCFMD clearly degrades the estimation performance of the MAXC and EMR methods; in fact, the GR power law (red line) does not adequately fit the NCFMD for magnitudes larger than M c, especially at higher ranges. The bimodal shape of the NCFMD is very probably due to a mixing of seismic data recorded by different spatial configurations of the seismic network: the maximum at 3.1 could be associated with a regional network and the maximum at 2.1 with a local network. Mignan (2012) found the same phenomenon in the Nevada earthquake catalog, whose NCFMD displayed two different maxima due to the superposition of two different NCFMDs arising from a regional and a local seismic network. Wiemer and Wyss (2002) also examined several cases of bimodal NCFMDs due, for instance, to contamination by explosions or to the onset of volcano-related events.
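The MAXC estimate described above, that is, the magnitude bin with the highest count in the NCFMD, can be sketched as follows. This is a simplified stand-in for the ZMAP routine; the function name `mc_maxc` and the toy magnitude counts are invented for illustration and are not taken from the Azerbaijani catalog.

```python
import numpy as np

def mc_maxc(mags, dm_bin=0.1):
    """Completeness magnitude as the mode of the non-cumulative FMD (MAXC)."""
    mags = np.asarray(mags, dtype=float)
    idx = np.round(mags / dm_bin).astype(int)   # integer bin index avoids float-edge issues
    counts = np.bincount(idx - idx.min())       # events per magnitude bin
    return (idx.min() + np.argmax(counts)) * dm_bin

# Toy catalog (hypothetical counts): the most populated bin is M = 2.1.
toy_mags = np.repeat([1.8, 1.9, 2.0, 2.1, 2.2, 2.3], [2, 5, 9, 12, 10, 8])
mc_toy = mc_maxc(toy_mags)
print(round(mc_toy, 1))
```

Because `np.argmax` returns the first maximum, a bimodal NCFMD with two nearly equal peaks, as discussed above, makes this estimate sensitive to small count fluctuations.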

Fig. 5

NCFMD (triangles) and CFMD (squares) of the Azerbaijani catalog and estimation of the completeness magnitude by using the MAXC (a), EMR (b), and GFT (c) methods. The red line represents the GR law with a and b values estimated on the basis of the value of the completeness magnitude M c. d b value versus magnitude threshold

In our case, therefore, for the Azerbaijani seismic catalog, there should exist a spatial heterogeneity in M c that significantly alters the shape of the NCFMD. The estimation of the M c by using the GFT method, thus, seems more reliable.

Figure 5d shows the variation of the b value with the magnitude threshold from 2.0 to 4.5, with error bars calculated by using Eq. (2). For thresholds up to 2.9 the b value increases with the threshold, whereas it is quite stable for 3.0 ≤ threshold ≤ 3.8, with a b value around 0.85; for thresholds larger than 3.8, the b value peaks at 4.2 but stabilizes again around 0.85 at the largest thresholds.

Thus, on the basis of all these results, we can assert that the completeness magnitude of the Azerbaijani catalog during the period 2003–2016 is 3.0 and the b value of the GR law is about 0.85.

In the following statistical analyses, we therefore consider the subset of seismic events with magnitude M ≥ 3.0. The number of events with M ≥ 3.0 is 1163. The spatial distribution of the events is shown in Fig. 4. Figure 6 shows the time-magnitude plot (Fig. 6a) and the time-depth plot (Fig. 6b) for the events with M ≥ 3.0.

Fig. 6

a Time-magnitude plot and b time-depth plot of the seismicity of the investigated area (M ≥ 2.9) (the plots were produced by ZMAP software)

4.2 The coefficient of variation

The coefficient of variation C v is a simple quantity used to investigate the properties of the temporal distribution of a seismic series. It is defined as

$$ {C}_{\mathrm{v}}=\frac{\sigma}{\mu}, $$
(3)

where σ and μ are the standard deviation and the average of the interevent times (Fig. 7), respectively. If C v is smaller than, equal to, or larger than 1, the seismic series is regular (or periodic), purely random (or Poissonian), or clustered, respectively (Kagan and Jackson 1991). The coefficient of variation has been extensively employed to identify the type of temporal distribution of earthquakes in many seismic zones worldwide. Recently, Telesca et al. (2016) used the local coefficient of variation L v, defined by Shinomoto et al. (2005), to analyze the volcano-related seismicity at El Hierro, Canary Islands (Spain):

$$ {L}_{\mathrm{v}}=\frac{1}{N-1}\sum_{i=1}^{N-1}3\frac{{\left({T}_i-{T}_{i+1}\right)}^2}{{\left({T}_i+{T}_{i+1}\right)}^2}, $$
(4)

where T i is the ith interevent time and N is the number of interevent times.

Fig. 7

Interevent time series of the Azerbaijani seismic catalog

The value of C v and L v is 1 for a Poisson process (with exponential probability density function of the interevent times) and is 0 for a periodic process. C v is able to identify global variability of a whole interevent sequence and can be affected by event rate fluctuation, while L v identifies local stepwise variability of interevent times, because it is rather independent of slow variation in average rate. Just as an example, if one joins two periodic point processes like those in Fig. 8, C v ≫ 1 because globally the process appears highly clustered, but L v ~0, due to the regular character of the process at a local scale.
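The contrast between the global and the local measure can be illustrated with a short Python sketch of Eqs. (3) and (4). The function names and the three test sequences are assumptions made here for illustration; in particular, the "joined" sequence (many short intervals followed by a few long ones) is only one hypothetical way to concatenate two periodic processes and is not a reconstruction of Fig. 8.

```python
import numpy as np

def coefficient_of_variation(intervals):
    """Global coefficient of variation, Eq. (3): std / mean of interevent times."""
    t = np.asarray(intervals, dtype=float)
    return t.std() / t.mean()

def local_variation(intervals):
    """Local coefficient of variation, Eq. (4), averaged over consecutive pairs."""
    t = np.asarray(intervals, dtype=float)
    a, b = t[:-1], t[1:]
    return np.mean(3.0 * (a - b) ** 2 / (a + b) ** 2)

rng = np.random.default_rng(0)
periodic = np.ones(1000)                                       # one periodic process
joined = np.concatenate([np.ones(990), np.full(10, 1000.0)])   # two periods, concatenated
poisson = rng.exponential(scale=4.18, size=5000)               # exponential interevent times

for name, seq in [("periodic", periodic), ("joined", joined), ("poisson", poisson)]:
    print(name, coefficient_of_variation(seq), local_variation(seq))
```

In the joined case only the single pair of intervals at the junction contributes to L v, which is why the local measure stays near 0 while C v becomes large.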

Fig. 8

Example of superposition of two periodic finite point processes

We calculated both the global and the local coefficient of variation for the seismicity of Azerbaijan for the earthquakes with magnitude M ≥ 3.0 and obtained C v ~1.25 and L v ~1.28. We compared these values with those obtained from 10,000 randomly generated Poisson processes with the same size (N = 1163) and mean (<T> ~4.18 days) as the original seismic interevent time series. The 95% confidence intervals, given by the 2.5th and 97.5th percentiles of the distributions of C v and L v of the Poissonian surrogates, are [0.9448, 1.0579] for C v and [0.9393, 1.0630] for L v. Since both observed values lie above these intervals, the distribution of the Azerbaijani earthquake occurrence times is clustered both globally and locally.
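The surrogate test described above can be sketched as follows, assuming that the Poissonian surrogates are sequences of independent exponential interevent times with the stated size and mean. The function name is hypothetical, and 2000 surrogates are used here instead of 10,000 only to keep memory use modest; the resulting bands should be close to, but not identical with, the intervals reported in the text.

```python
import numpy as np

def poisson_surrogate_bands(n_intervals, mean_dt, n_surr=2000, seed=0):
    """95% bands of C v and L v for Poissonian (exponential) interevent surrogates."""
    rng = np.random.default_rng(seed)
    t = rng.exponential(mean_dt, size=(n_surr, n_intervals))   # one surrogate per row
    cv = t.std(axis=1) / t.mean(axis=1)
    a, b = t[:, :-1], t[:, 1:]
    lv = np.mean(3.0 * (a - b) ** 2 / (a + b) ** 2, axis=1)
    return np.percentile(cv, [2.5, 97.5]), np.percentile(lv, [2.5, 97.5])

cv_band, lv_band = poisson_surrogate_bands(1163, 4.18)
cv_obs, lv_obs = 1.25, 1.28        # observed values quoted in the text
print("C v band:", cv_band, "clustered:", cv_obs > cv_band[1])
print("L v band:", lv_band, "clustered:", lv_obs > lv_band[1])
```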

4.3 The scaling exponent of the magnitude time series

The Detrended Fluctuation Analysis (DFA) (Peng et al. 1994) is a well-known method employed to detect long-range correlations in non-stationary series; it has been used in many scientific fields (Telesca and Lovallo 2009; Telesca and Lovallo 2010; Telesca and Lovallo 2011; Telesca et al. 2012). Telesca et al. (2016) found a relationship between the enhancement of the scaling exponent calculated by the DFA (see later in the text) and the reactivation periods of the volcanic activity at El Hierro, Canary Islands (Spain), in 2011–2014. Varotsos et al. (2014) applied the DFA to the magnitude series of earthquakes that occurred in different seismo-tectonic zones worldwide, found characteristic variations in the temporal correlations between earthquake magnitudes, and interpreted such variations in terms of earthquake prediction. Lennartz et al. (2008) used the DFA to analyze the long-range correlations of the magnitude series of earthquakes that occurred in Northern and Southern California and showed that the temporal fluctuations of magnitudes are characterized by long-term memory. Varotsos et al. (2012) found that in stationary regimes, California seismic activity is characterized by long-range temporal correlations among magnitudes (indicated by a DFA scaling exponent ~0.6), while before the occurrence of large shocks, these correlations break down.

From the above, we can argue that the analysis of long-range correlations in earthquake magnitude series can provide insight into the dynamics of a seismic process.

The DFA method is described below:

(i) The magnitude series M i , where i = 1,…,N and N is the total number of events, is integrated:

$$ {y}_k=\sum_{i=1}^k\left({M}_i-\langle M\rangle\right), $$
(5)

where <M> is the average magnitude of the sequence;

(ii) The integrated series y k is divided into windows of the same length n; for each window of size n, the least squares line y n,k is fitted to y k and subtracted from it;

(iii) The fluctuation F n is calculated:

$$ {F}_n=\sqrt{\frac{1}{N}\sum_{k=1}^N{\left[{y}_k-{y}_{n, k}\right]}^2}; $$
(6)

(iv) Steps ii–iii are repeated for all the available window sizes n; if the relationship between F n and n is a power law, the magnitudes are long-range correlated:

$$ {F}_n\sim {n}^{\alpha}; $$
(7)

(v) The numerical value of the scaling exponent α gives information about the type of correlations: if the magnitudes are uncorrelated, then α = 0.5; if the magnitudes are persistently correlated (meaning that a large (small) magnitude, compared to the mean, is more likely to be followed by a large (small) magnitude), then α > 0.5; if the magnitudes are antipersistently correlated (meaning that a large (small) magnitude, compared to the mean, is more likely to be followed by a small (large) magnitude), then α < 0.5.
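The steps above can be sketched in Python as follows. This is a minimal first-order DFA under stated assumptions, not the authors' code: the function names and the range of window sizes are hypothetical, and, as a simplification, the samples left over after the last complete window are discarded, so the average in Eq. (6) runs over the retained samples only.

```python
import numpy as np

def dfa_fluctuation(x, n):
    """Fluctuation F n for window size n (steps i to iii, linear detrending)."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                 # Eq. (5): integrated series
    n_win = y.size // n
    y = y[: n_win * n].reshape(n_win, n)        # non-overlapping windows (tail discarded)
    k = np.arange(n)
    coef = np.polyfit(k, y.T, 1)                # one least squares line per window
    trend = np.outer(coef[0], k) + coef[1][:, None]
    return np.sqrt(np.mean((y - trend) ** 2))   # Eq. (6)

def dfa_exponent(x, scales):
    """Scaling exponent alpha from the log-log slope of F n versus n, Eq. (7)."""
    f = np.array([dfa_fluctuation(x, n) for n in scales])
    alpha, _ = np.polyfit(np.log10(scales), np.log10(f), 1)
    return alpha

rng = np.random.default_rng(1)
scales = np.unique(np.logspace(np.log10(8), np.log10(500), 12).astype(int))
white = rng.normal(size=10000)                  # uncorrelated series
alpha_white = dfa_exponent(white, scales)
alpha_walk = dfa_exponent(np.cumsum(white), scales)   # strongly persistent series
print(alpha_white, alpha_walk)
```

For an uncorrelated series the exponent should come out near 0.5, and for its running sum near 1.5, which gives a simple sanity check before applying the procedure to a magnitude series.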

Figure 9 shows the fluctuation function F n of the magnitude series (M ≥ 3.0) of the Azerbaijani catalog plotted in log-log scales. The fluctuation function displays a very clear power-law behavior (indicated by the linear shape in bilogarithmic scales). The slope of the least squares line fitting the fluctuation function gives an estimate of the scaling exponent, α ~0.53, which is close to 0.5 and suggests that the magnitudes are uncorrelated. This indicates that, in the period analyzed and for the area investigated, the magnitude of an event occurring at a certain time and location neither depends on the magnitude of past events nor influences the magnitude of future events. The significance of α is evaluated by using the method of surrogates, that is, uncorrelated magnitude sequences constructed by randomly shuffling the magnitudes of the original series. The DFA is then performed on the surrogates, and the scaling exponent α S of each surrogate is computed. After generating a sufficiently large number of surrogates and calculating α S for each one, the 2.5th and 97.5th percentiles of the distribution of α S give the 95% confidence band. If α lies within the 95% confidence band, the original sequence is statistically indistinguishable from an uncorrelated one; otherwise, it is correlated (persistent if above, or antipersistent if below the confidence band). In our case, on the basis of 10,000 random surrogates, the 95% confidence band is [0.40, 0.62]; thus, the magnitude series is uncorrelated.

Fig. 9

Log-log plot of the fluctuation function of the magnitude series (M ≥ 3.0) for the Azerbaijani catalog. The 95% confidence interval on the basis of 10,000 surrogates is [0.41, 0.57]

This finding is not trivial when compared with analogous results obtained in previous studies. Lippiello et al. (2008), for instance, based the feasibility of earthquake prediction also on the dependence of the magnitude of an event on those of past earthquakes. In fact, while temporal and/or spatial clustering is nowadays accepted by the seismological community, the presence of correlations in the magnitude series is much more debatable. Lippiello et al. (2008) suggested that seismic events occur with enhanced probability close in time, space, and even magnitude to previous earthquakes. Sarlis et al. (2009), using the natural time approach, verified that correlations between magnitudes are larger for earthquakes closer in time, when the maximum interevent interval varies from half a day to 1 min.

In our case, we found that the magnitudes are independent, owing to the absence of correlations, and thus they are in principle unpredictable. This finding is, in any case, useful information in the context of the seismic hazard analysis of the area.

This finding supports current short-term earthquake clustering models like ETAS, which draw the magnitudes of future events randomly from a GR distribution.

4.4 The correlation dimension of the spatial distribution of the epicenters

The spatial distribution of the earthquake epicenters was analyzed by using the Grassberger-Procaccia method (Grassberger and Procaccia 1983), which is well known in spatial statistics for its efficiency and low noisiness in estimating the correlation dimension D c of datasets, even of small size (Doxas et al. 2010).

Let N R<r be the number of pairs of points separated by a distance R less than r; then, the correlation integral is defined as the fraction of pairs of points whose interdistance is less than r:

$$ C(r)=\frac{2{N}_{R< r}}{N\left( N-1\right)} $$
(8)

For fractal spatial point processes, \( C(r)\approx {r}^{D_{\mathrm{c}}} \). The numerical value of the correlation dimension D c (which is an estimate of the fractal dimension) reveals the spatial pattern of the point process. In bidimensional systems, D c lies between 0 and 2; for D c = 0, all the points are clustered into one; for D c = 2, the points are homogeneously distributed. D c is given by the slope of the least squares line that fits, within its linear (scaling) range, the correlation integral versus r plotted in bilogarithmic scales.
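A minimal Python sketch of Eq. (8) and of the slope estimate is given below. It assumes planar Euclidean distances, whereas real epicenters would require great-circle or projected distances; the function names, the test point sets, and the fitting range are illustrative assumptions.

```python
import numpy as np

def correlation_integral(points, radii):
    """C(r) of Eq. (8): fraction of point pairs with separation smaller than r."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    d = np.sort(dist[np.triu_indices(len(pts), k=1)])   # N(N-1)/2 distinct pairs
    counts = np.searchsorted(d, radii, side="left")      # pairs with distance < r
    return counts / d.size

def correlation_dimension(radii, c, r_min, r_max):
    """Least squares slope of log10 C(r) versus log10 r inside [r_min, r_max]."""
    radii, c = np.asarray(radii), np.asarray(c)
    sel = (radii >= r_min) & (radii <= r_max) & (c > 0)
    slope, _ = np.polyfit(np.log10(radii[sel]), np.log10(c[sel]), 1)
    return slope

rng = np.random.default_rng(2)
radii = np.logspace(-2.5, 0.0, 30)
square = rng.random((1163, 2))                            # homogeneous in the unit square
line = np.column_stack([rng.random(1163), np.zeros(1163)])  # points on a segment
d_square = correlation_dimension(radii, correlation_integral(square, radii), 0.01, 0.1)
d_line = correlation_dimension(radii, correlation_integral(line, radii), 0.01, 0.1)
print(d_square, d_line)
```

The two test sets bracket the expected behavior: a homogeneous planar distribution should give a slope close to 2 and points on a line a slope close to 1, with small deviations caused by edge effects.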

The estimation of the correlation dimension, however, could be biased by the limited size of the dataset and by an improper choice of the scaling range. In order to check these effects, we applied the Grassberger-Procaccia method to a synthetic spatial monofractal Sierpinski dataset of 1163 points (as many as the epicenters of the Azerbaijani catalog), generated by using the method of Kamer et al. (2013), whose theoretical fractal dimension is D ~1.585 (Fig. 10).
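For readers who wish to reproduce a comparable test set, a random Sierpinski point set can be generated with the classical chaos game. This is a swapped-in, commonly used technique and is not claimed to be the method of Kamer et al. (2013); the function name, the burn-in length, and the triangle vertices are assumptions. The theoretical dimension quoted above follows from log 3 / log 2.

```python
import numpy as np

def sierpinski_chaos_game(n_points, seed=0, burn_in=20):
    """Random points on the Sierpinski triangle via the chaos game (illustrative)."""
    rng = np.random.default_rng(seed)
    vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3.0) / 2.0]])
    p = rng.random(2)                            # arbitrary starting point
    out = np.empty((n_points, 2))
    picks = rng.integers(0, 3, size=n_points + burn_in)
    for i, v in enumerate(picks):
        p = (p + vertices[v]) / 2.0              # jump halfway to a random vertex
        if i >= burn_in:                         # skip the transient before the attractor
            out[i - burn_in] = p
    return out

sierpinski_pts = sierpinski_chaos_game(1163)
d_theory = np.log(3.0) / np.log(2.0)             # three copies at half the scale
print(sierpinski_pts.shape, round(d_theory, 3))
```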

Fig. 10

Random simulation of Sierpinski set

Figure 11 shows the correlation integral of the spatial point process shown in Fig. 10. The correlation integral is not linear over all spatial scales: at large scales (for r > L max) it tends to deviate from linearity and then to bend down. In order to better determine the value of L max, we calculated the slope (namely the correlation dimension D c) of the line best fitting the correlation integral curve in a spatial scale range [L min, L max], with L min corresponding to the lowest spatial scale and L max varying up to the largest available spatial scale. Figure 12 shows the variation of D c versus L: the minimum absolute deviation from the theoretical fractal dimension is found for L max = 0.455, which corresponds to ~43% of r max. For this value of L max, the correlation dimension of the Sierpinski set is 1.584. This result is in agreement with Dongsheng et al. (1994), who found that a value of L max ~20–30% of r max should be enough to avoid the finite-size effect visible at large spatial scales (see also Telesca et al. (2001)). Such a small absolute deviation enables us to use the correlation dimension for quantifying the type of spatial distribution of the epicenters of the Azerbaijani seismicity. Figure 13 shows the correlation integral for the Azerbaijani seismicity for spatial scales ranging from ~10⁻¹ to ~878 km, the latter being the maximal interevent distance. As observed for the synthetic monofractal Sierpinski set, here too the relationship between log10(N R<r ) and the spatial scale log10(r) is not linear. Analyzing the first derivative (Fig. 14), that is, the local slope of the curve, we can see that it is rather constant between L min ~10 km and L max ~262 km; within this range, the correlation dimension is D c ~1.38, which indicates that the spatial distribution of epicenters is fractal. Note that the upper value of this range (L max) is about 30% of the maximal interevent distance, in agreement with the results of Dongsheng et al. (1994). The lower value of that range (L min) is about 1% of the maximal interevent distance. The adequacy of the dataset size for this estimate can be checked by applying Smith's criterion (Smith 1988):

Fig. 11

Correlation integral of the random Sierpinski set shown in Fig. 10

Fig. 12

Variation of D c versus L

Fig. 13

Correlation integral for the Azerbaijani seismicity for spatial scales ranging from ~10⁻¹ to ~878 km

Fig. 14

Local slope of the curve shown in Fig. 13

$$ N_{\min} \ge \left( \frac{L_{\max}}{L_{\min}} \, \frac{2-Q}{2\left(1-Q\right)} \right)^{[D]} $$
(9)

where [D] is the integer part of the fractal dimension, 0 ≤ Q < 1 is a quality factor, and N min represents the minimum number of points required to estimate the fractal dimension. For [D] = 1, Q = 0.95, L min = 10 km and L max = 262 km, we obtain N min ≥ 275; this criterion is fully satisfied by our dataset.
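As an illustration of the two steps above, the following minimal sketch estimates a correlation dimension as the least-squares slope of log10 C(r) versus log10(r) within [L min, L max] and evaluates the bound of Eq. (9). It is not the code used in this study: the function names are hypothetical, distances are Euclidean on a plane (an assumption; epicentral distances would normally be computed on the sphere), and the correlation integral is normalized by the number of pairs.

```python
import math
from bisect import bisect_left


def correlation_integral(points, radii):
    """Fraction of point pairs whose separation is smaller than each radius."""
    n = len(points)
    dists = sorted(
        math.dist(points[i], points[j]) for i in range(n) for j in range(i + 1, n)
    )
    # bisect_left on the sorted distances counts the pairs with distance < r
    return [bisect_left(dists, r) / len(dists) for r in radii]


def correlation_dimension(points, l_min, l_max, n_radii=20):
    """Least-squares slope of log10 C(r) vs log10 r for l_min <= r <= l_max."""
    step = (math.log10(l_max) - math.log10(l_min)) / (n_radii - 1)
    log_r = [math.log10(l_min) + k * step for k in range(n_radii)]
    c = correlation_integral(points, [10.0 ** x for x in log_r])
    xy = [(x, math.log10(v)) for x, v in zip(log_r, c) if v > 0]
    mean_x = sum(x for x, _ in xy) / len(xy)
    mean_y = sum(y for _, y in xy) / len(xy)
    num = sum((x - mean_x) * (y - mean_y) for x, y in xy)
    den = sum((x - mean_x) ** 2 for x, _ in xy)
    return num / den


def smith_min_points(l_min, l_max, q, d_int):
    """Right-hand side of Eq. (9): minimum number of points (Smith 1988)."""
    return ((l_max / l_min) * (2.0 - q) / (2.0 * (1.0 - q))) ** d_int


# Values quoted in the text: [D] = 1, Q = 0.95, L min = 10 km, L max = 262 km
print(round(smith_min_points(10.0, 262.0, 0.95, 1), 1))
```

With the values quoted in the text, the ratio L max/L min = 26.2 is multiplied by (2 − Q)/(2(1 − Q)) = 10.5, which reproduces the bound N min ≥ 275.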

4.5 Analysis of the temporal variation of the seismic parameters

The analysis of the time variation of the statistical parameters defined in the previous sections is important to check whether the parameters change with time or remain rather stable; a change through time would reveal changes in the dynamics underlying the seismic process. We analyzed the time variation of the statistical seismic parameters by using two approaches: a window of (i) fixed event number (W N ) or (ii) fixed day length (W D ) sweeping the entire catalog with a shift of 1 event or 10 days, respectively. In each window, the completeness magnitude M c was calculated by the GFT method; only when the number of events with magnitude M ≥ M c was larger than or equal to 50 (in agreement with Woessner and Wiemer (2005)) were the statistical seismic parameters calculated and their values associated with the time of the last event in the window. This minimum number of events per window is enough to calculate the following seismic parameters reliably and with good time resolution: completeness magnitude, b and a values, mean magnitude, C v and L v. We did not calculate the time variation of the DFA scaling exponent of the magnitude series or the correlation dimension of the epicenters, since both parameters would require a significantly larger window size, which would in practice prevent an analysis of the time variation with good time resolution.
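The two windowing schemes can be sketched as follows. This is an illustrative outline under stated assumptions, not the implementation used here: event times are assumed to be a non-empty, sorted list expressed in days, all function names are hypothetical, and `estimate_mc` is a placeholder callable standing in for the GFT estimate of M c, which is not reproduced.

```python
from bisect import bisect_left


def fixed_count_windows(times, n_events, shift=1):
    """Yield (start, stop) index pairs of windows containing n_events events."""
    for start in range(0, len(times) - n_events + 1, shift):
        yield start, start + n_events


def fixed_duration_windows(times, length_days, shift_days=10.0):
    """Yield (start, stop) index pairs of windows spanning length_days days."""
    k = 0
    while True:
        w_start = times[0] + k * shift_days
        w_end = w_start + length_days
        if w_end > times[-1]:
            break
        start = bisect_left(times, w_start)
        stop = bisect_left(times, w_end)  # events with w_start <= t < w_end
        if stop > start:
            yield start, stop
        k += 1


def sweep(times, mags, windows, estimate_mc, statistic, min_events=50):
    """Evaluate `statistic` on the complete part of each window.

    The value is attached to the time of the last event in the window and is
    skipped when fewer than min_events events have M >= M c.
    """
    results = []
    for start, stop in windows:
        window_mags = mags[start:stop]
        mc = estimate_mc(window_mags)  # placeholder for the GFT method
        complete = [m for m in window_mags if m >= mc]
        if len(complete) >= min_events:
            results.append((times[stop - 1], statistic(complete)))
    return results
```

For example, `sweep(times, mags, fixed_count_windows(times, 100), estimate_mc, mean_magnitude)` would give the time variation of the mean magnitude for W N = 100, where `mean_magnitude` is any user-supplied function of a list of magnitudes.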

Figure 15 shows the time variation of the seismic parameters for the fixed event number window with two different lengths: W N = 100 (blue circles) and W N = 200 (red circles). In particular, the following parameters were calculated: M c (Fig. 15a), number of events with magnitude larger than or equal to M c (Fig. 15b), b value (Fig. 15c) (the b value along with its error bar calculated by using formula (2) is shown in Fig. S1), and a value (Fig. 15d). Figure 15e, f shows the departure of C v and L v from C v,Pois(97.5%) and L v,Pois(97.5%), respectively. C v,Pois(97.5%) (and, analogously, L v,Pois(97.5%)) was calculated in the following manner: (i) 1000 Poissonian sequences were randomly generated with the same length and the same rate as the actual earthquake sequence contained in the moving window; (ii) for each Poissonian sequence, C v,Pois was computed; (iii) the 97.5th percentile of the distribution of the 1000 values of C v,Pois was calculated; (iv) C v,Pois(97.5%) then represents the upper bound of the 95% confidence band of the C v,Pois distribution, meaning that if the value of C v of the actual earthquake sequence contained in that window is larger than C v,Pois(97.5%), the sequence is significantly globally clusterized; otherwise, it is consistent with a Poissonian sequence. Figure 16 shows the same seismic parameters as Fig. 15, but for a fixed day length window in two different cases: W D = 180 days (blue circles) and W D = 365 days (red circles). The time evolution of the seismic parameters is quite robust against the type and length of the moving window; the parameters show approximately the same trend in all cases. The completeness magnitude ranges between 2 and 3.5 and is quite stable at 3.0 between 2007 and 2011 (Fig. 15a); during this period, no relatively strong events occurred.
Relatively short-range fluctuations in M c are especially evident from 2011 to 2016, when the parameter varies between 2.0 and 3.5. During the same period, the number of events with magnitude larger than or equal to M c is also quite unstable through time; this is probably due to the occurrence of several relatively strong earthquakes (M ≥ 5) along with their aftershock sequences and to the mixing of different seismic sources in the same time window. Similar behavior is shown by the other parameters (a value, b value), which fluctuate strongly between 2011 and 2016. In particular, a certain relationship is observed between the range of variability of the b value and the occurrence frequency of relatively strong earthquakes (Figs. 15c and 16c): between 2007 and 2011, the b value ranges between ~0.7 and ~1.25, and between 2011 and 2016, it ranges between ~0.5 and ~1.0. This indicates that in the first period a relatively large number of small events were generated and a low level of stress characterized the investigated area, while in the second period relatively more large earthquakes were generated and a high level of stress was present in the area. On the basis of the time variation of C v − C v,Pois(97.5%) (Figs. 15e and 16e) and L v − L v,Pois(97.5%) (Figs. 15f and 16f), we can see that the peaks of time clusterization at both global and local scales could be associated with the strongest events of the sequence.
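The surrogate procedure (i)–(iv) used to obtain C v,Pois(97.5%) can be sketched as below. This is a minimal illustration, not the code used for Figs. 15 and 16. It assumes that C v is the ratio of the standard deviation to the mean of the interevent times, that a Poissonian sequence is simulated by drawing exponential interevent times with the observed mean rate, and that the percentile is taken by the nearest-rank rule; the function names and the fixed random seed are hypothetical choices.

```python
import math
import random
import statistics


def coefficient_of_variation(intervals):
    """Global coefficient of variation: std / mean of the interevent times."""
    return statistics.stdev(intervals) / statistics.mean(intervals)


def poisson_cv_threshold(intervals, n_surrogates=1000, percentile=97.5, seed=0):
    """Upper percentile of C v over Poissonian surrogates of the same
    length and the same mean rate as the observed sequence."""
    rng = random.Random(seed)
    n = len(intervals)
    rate = 1.0 / statistics.mean(intervals)
    surrogate_cvs = sorted(
        coefficient_of_variation([rng.expovariate(rate) for _ in range(n)])
        for _ in range(n_surrogates)
    )
    # Nearest-rank percentile, clipped to the last available index
    rank = min(math.ceil(percentile / 100.0 * n_surrogates), n_surrogates) - 1
    return surrogate_cvs[rank]


def cv_departure(intervals, **kwargs):
    """C v minus C v,Pois(97.5%); positive values suggest global clustering."""
    return coefficient_of_variation(intervals) - poisson_cv_threshold(intervals, **kwargs)
```

A positive value of `cv_departure` for the interevent times in a window corresponds to the condition C v > C v,Pois(97.5%) discussed above. The same scheme would apply to L v once the local coefficient of variation is substituted for `coefficient_of_variation`.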

Fig. 15

Time variation of M c (a), number of events with magnitude larger than or equal to M c (b), b value (c), a value (d), departure of C v from C v,Pois(97.5%) (e), and departure of L v from L v,Pois(97.5%) (f), in 100 event number window size (blue) and 200 event number window size (red). The minimum and maximum errors on the b value, calculated by using Shi and Bolt’s formula (1982), are 0.04 and 0.18 (W N = 100) and 0.03 and 0.13 (W N = 200)

Fig. 16

Time variation of M c (a), number of events with magnitude larger than or equal to M c (b), b value (c), a value (d), departure of C v from C v,Pois(97.5%) (e), and departure of L v from L v,Pois(97.5%) (f), in 180-day window size (blue) and 365-day window size (red). The minimum and maximum errors on the b value, calculated by using Shi and Bolt’s formula (1982), are 0.04 and 0.17 (W D = 180) and 0.03 and 0.17 (W D = 365)

4.6 Analysis of the spatial variation of M c and b value

Since the data are sampled over a large and potentially strongly inhomogeneous area, we performed an analysis of the spatial variability of the completeness magnitude and b value. In fact, the gradient shown in the time variation of b could mean either that the conditions in the same region change, or that no temporal changes exist but different regions with different overall b values have been activated; a mixture of the two is also possible. Analogously, M c depends on network configuration: unless there have been campaigns during which additional stations were deployed evenly over the whole study region, the periods of apparently improved M c, by as much as one unit of magnitude, could be associated with periods in which the seismicity concentrated in a region of better overall network coverage or with a local campaign with additional stations. Therefore, in order to check all these aspects, we superimposed on the investigated territory a spatial grid with square cells of side 0.01° in latitude and longitude. For each node, we considered a circle of 50 km radius and calculated, for the earthquakes included within it, the number of earthquakes (Fig. 17a), the completeness magnitude M c (Fig. 17b), and the number of events with M ≥ M c (Fig. 17c); only when this number was larger than or equal to 50 was the b value estimated (Fig. 17d). The spatial variation of the b value can be interpreted in terms of stress changes across the collision zone along the Main Caucasus Thrust (MCT) fault and the West Caspian fault (WCF). The decrease of the b value from south to north along the WCF is an indication of higher stress in the northern part of the region. It might be reasonable to connect this low b value region with a “harder” patch on the fault.
Based on new GPS observations on the Absheron peninsula and along the western coast of the Caspian Sea south of the Absheron peninsula, there is evidence that south of 40° N latitude, between 48° and 49° E longitude, the MCT turns sharply to the south, crossing the Kur depression and extending along the western side of the Caspian Sea. While the MCT is predominantly a thrust fault, the WCF has a substantial right-lateral strike-slip component, at least in the region immediately south of the Absheron peninsula. The existence of the WCF is also supported by the topographic path of the Kur River, which turns sharply to the south at 40° N latitude. This inflection is partly marked by a zone of submeridional discontinuities, generally typical of the plastic rocks of the depression (Philip et al. 1989; Saintot et al. 2006; Kadirov et al. 2012; Alizadeh et al. 2016). The central part of this strike-slip fault zone (near the SALY, BLVR, and NEFT GPS points (see Fig. 1)) is characterized by very low seismicity. However, whether this segment is creeping aseismically or accumulating strain without generating seismicity still remains uncertain (Kadirov et al. 2015).
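The node-by-node mapping described at the beginning of this section can be outlined with the following sketch. It is illustrative only and rests on several assumptions that differ from the actual processing: M c is passed in as a known value rather than estimated at each node by the GFT method, the b value is computed with the Aki-Utsu maximum likelihood estimator for magnitudes binned at 0.1 units, distances are great-circle distances on a sphere of radius 6371 km, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def b_value_aki(mags, mc, bin_width=0.1):
    """Aki-Utsu maximum likelihood b value (an assumed choice of estimator)."""
    mean_mag = sum(mags) / len(mags)
    return math.log10(math.e) / (mean_mag - (mc - bin_width / 2.0))


def node_statistics(node_lat, node_lon, events, mc, radius_km=50.0, min_events=50):
    """Return (n_events, n_complete, b) for one grid node.

    events is a list of (lat, lon, magnitude) tuples; b is None when fewer
    than min_events events with M >= M c fall within radius_km of the node.
    """
    local = [
        m for lat, lon, m in events
        if haversine_km(node_lat, node_lon, lat, lon) <= radius_km
    ]
    complete = [m for m in local if m >= mc]
    b = b_value_aki(complete, mc) if len(complete) >= min_events else None
    return len(local), len(complete), b
```

Looping `node_statistics` over all nodes of the 0.01° grid would yield maps analogous to the panels of Fig. 17, with nodes left blank wherever `b` is `None`.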

Fig. 17

Spatial distribution of the number of events (a), M c (b), number of events with M ≥ M c (c), and b value (d)

The low b values (0.60–0.70) in the central part of the southern Talysh seismic zone, surrounded by areas of increased b values (≥ 0.90), indicate the presence of relatively more solid crustal material than in the Kur depression; here, folded rocks of Oligocene-Miocene age lie on volcanic rocks of Eocene age (Alizadeh 2008; Alizadeh et al. 2016).

Except for the western part of the Lesser Caucasus, there is very low strain accumulation within the Kur basin; in contrast, very low b values can be observed along the MCT. In particular, the b value reaches its minimum in the central part of the MCT, which is the most seismically active segment. This part of the MCT has broken historically in shallow (~15–20 km) continental thrust events, including highly destructive earthquakes in 1191, 1859, and 1902 in the Shamakhi region (e.g., Kondorskaya et al. 1977; Triep et al. 1995). It is possible that the differences in seismic regime and in the estimates of the b value between the MCT and the Kur depression are due to differences in the nature of the crust and lithosphere, which is softer and thinner, composed of Quaternary marine and marine-terrigenous sediments, in the Kur depression, and more solid, with very thick and consolidated sediments, in the MCT and Talysh (Saatli ultradeep well 1999; Allen et al. 2004; Vincent et al. 2005). It is also possible that the different stress regimes (thrusting along the MCT, strike-slip along the WCF) reflect different dynamics. Furthermore, the presence of mud volcanoes to the east of the WCF, in the eastern Kur depression and on the Absheron peninsula, may also explain the lower seismicity (due to more plastic crustal material).

5 Conclusions

The present study furnishes a detailed picture of the statistical properties of the spatial, temporal, and magnitude distribution of the 2003–2016 instrumental seismic catalog of Azerbaijan and surrounding regions that represent one of the most seismically active areas worldwide. The statistical analysis has been performed by using standard and non-standard methodologies to get the most exhaustive description of the catalog. The main findings are as follows:

  1. The Gutenberg-Richter law is satisfied for M ≥ 2.9 with a b value of 0.76, by using the GFT method.

  2. The time clustering of the complete catalog was investigated by using the global (C v) and local (L v) coefficients of variation, obtaining the values C v ~1.23 and L v ~1.26 with 95% confidence intervals of [0.9459, 1.0566] and [0.9391, 1.0598], respectively; both values lie above these intervals, which indicates time clusterization of the events at both global and local scales.

  3. The magnitude series are uncorrelated, and this indicates that they are in principle unpredictable; this finding represents important information in the context of seismic hazard analysis of the area.

  4. The spatial distribution of the epicenters of the Azerbaijani earthquakes is fractal for spatial scales ranging from ~10 to ~262 km.

  5. The time variation of the analyzed seismic parameters (M c, number of events with magnitude larger than or equal to M c, b value, a value, C v, and L v) shows that from 2011 to 2016, the completeness magnitude is weakly fluctuating, along with the number of events; other parameters (a value, b value, <M>) show a very visible variation during the same period. The behavior shown by these seismic parameters between 2011 and 2016 could be linked with the occurrence of the strongest events associated with a high level of stress in the area, with which high time clusterization at both global and local scales could also be associated.

  6. The spatial variation of the seismic parameters (M c, number of events with magnitude larger than or equal to M c, b value) shows a high spatial variability connected with the peculiar seismo-tectonic settings of the different areas of Azerbaijan.