Introduction

Western Turkey is one of the most rapidly deforming continental regions on Earth, and the widespread seismicity of the region is an indicator of this deformation (Alptekin et al. 1990). The active deformation of western Turkey is governed by the interaction of three major plates (Eurasia, Arabia, and Africa). The North Anatolian Fault (NAF), East Anatolian Fault (EAF), Bitlis Thrust Belt (BTB), and Aegean graben systems are the main tectonic structures of Anatolia (Fig. 1). Most of the deformation in western Turkey has been caused by subduction and collision-related processes; however, the present-day deformation is closely related to the stage of collision, specifically to the rate of convergence and subduction of some of the plates (Royden 1993a, b). The present geomorphology of the region is characterized by a series of major east-west-trending grabens with secondary (cross-cutting) northeast-southwest-trending grabens (Koçyiğit et al. 1999).

Fig. 1 Tectonic map of Turkey showing the study area

The activity of the basin-bounding faults is shown by numerous historical earthquakes, such as the September 20, 1899, Menderes; November 18, 1919, Soma (M = 6.9); March 31, 1928, Torbalı (M = 6.3); April 23, 1933, Gökova (M = 6.5); September 22, 1939, Dikili-Bergama (M = 6.5); October 6, 1942, Gulf of Edremit-Ayvacık (M = 6.8); July 16, 1956, Söke-Balat (M = 7.1); March 23, 1969, Demirci (M = 5.9); March 28, 1969, Alaşehir (M = 6.5); and March 28, 1970, Gediz (M = 7.2) earthquakes. More recently, the April 10, 2003, Urla (M = 5.7) and the October 17–21, 2005, Sığacık (M = 5.7, M = 5.9, and M = 5.9) earthquakes were other important seismic events in the region.

Various studies have been undertaken to determine the seismic features of Western Turkey. Sayıl and Osmanşahin (2008) divided the region into 13 sub-regions on the basis of seismotectonic characteristics, plate tectonic models, and the geology of the region. According to their estimates, the highest probabilities of an earthquake with surface wave magnitude MS ≥ 7.0 occurring in the next 100 years are 80.6% (σ = 0.20, R = 0.87) for sub-region 9 and 77.8% (σ = 0.17, R = 0.90) for sub-region 1, where R and σ are the correlation coefficient and standard deviation, respectively. They found the recurrence intervals for earthquakes of this magnitude to be 61 and 67 years in these sub-regions (Sayıl and Osmanşahin 2008).

In addition to conventional techniques, earthquake prediction has recently been studied with artificial intelligence methods. Bodri (2001) used artificial neural networks (ANNs) to predict the occurrence time of M ≥ 6 earthquakes from seismicity rate variations in the Carpathian–Pannonian region, Hungary, and the Peloponnesos–Aegean area, Greece. Alves (2006) used ANNs to predict the time of occurrence and the locations of earthquakes within a specified time interval, based on the seismicity of the Azores region. Qiang (2000) analyzed the predictability of time series and introduced a method for applying ANNs to the forecasting of chaotic time series of earthquake precursors.

Sri Lakshmi and Tiwari (2009) evaluated the monthly occurrence frequency of M ≥ 4 earthquakes with MLPNN and nonlinear forecasting techniques, using earthquake catalog data from Northeast India. Earthquakes of the East Anatolian Fault System (EAFS) were predicted from changes in radon using ANNs (Külahçı et al. 2009). Feed-forward neural networks (FFNNs), adaptive neuro-fuzzy inference systems (ANFIS), and probabilistic neural networks (PNNs) were used to discriminate between earthquakes and quarry blasts in Istanbul and its vicinity (the Marmara region) (Yıldırım et al. 2011).

Reyes et al. (2013) used a new ANN-based prediction system to predict earthquakes in Chile. Morales-Esteban et al. (2013) built two multilayer feed-forward ANNs to predict earthquake occurrences and tested them on two areas of high seismic activity in the Iberian Peninsula: the Alboran Sea and the Western Azores–Gibraltar Fault. Zamani et al. (2013) applied neural network and ANFIS models to earthquake occurrence in Iran. Buscema et al. (2015) used the USGS and ISIDe (the Italian seismic instrumental and parametric database) catalogs to predict the magnitude of earthquakes with ANNs. Their data set was composed of 324,542 events. Similarly, Alexandridis et al. (2014) used the Southern California Seismic Network catalog, which contains 313,068 seismic events ranging in magnitude from 1.5 to 7.5, to estimate large earthquake occurrence with radial basis function neural networks (RBFNNs). Wang et al. (2015) estimated the seismic hazard potential in the Sichuan–Yunnan region, western China, with three different methods. Their catalog includes M ≥ 5.0 earthquakes in the Sichuan–Yunnan region from 1500 to 2013.

Zakeri and Pashazadeh (2015) used a neural network to describe the nature of earthquakes induced by seismic sources as a nonlinear function. ANNs and ANFIS are used for classification, prediction, and modeling of nonlinear problems and are therefore good candidates for processing earthquake data. In this study, two different ANNs (MLPNN and RBFNN) and ANFIS were applied to earthquake frequency data from Western Turkey to predict possible earthquake frequencies.

Artificial neural networks and adaptive neuro-fuzzy inference systems

A neural network (NN), or artificial neural network (ANN), is an information processing system composed of a large number of highly interconnected processing elements (neurons) that emulates a biological neural system. There are different types of networks for a broad range of application areas, including pattern recognition, time series analysis, signal processing, and control. Adaptive neuro-fuzzy inference systems (ANFISs) combine the learning capabilities of neural networks and the reasoning capabilities of fuzzy logic to provide enhanced prediction capabilities. In this study, three networks were applied to earthquake frequency data of Western Turkey.

Multilayer perceptron neural networks

Multilayer perceptron neural networks (MLPNNs) are one of the most important classes of neural networks and have many areas of application ranging from finance to engineering. The network consists of an input layer, one or more hidden layers, and an output layer (Fig. 2b). The input layer is only responsible for feeding the input data to the neurons of the second layer, which is the first hidden layer. The outputs of the second layer are used as the input to the third layer, and so on, for the entire network. The computation only occurs at the hidden and output layer neurons. The connections between all the elements of the networks are realized through synaptic weights. These synaptic weights are adjusted via a back-propagation algorithm to provide a nonlinear mapping.

Fig. 2 a Single neuron structure. b MLPNN structure

As shown in Fig. 2a, the output of the jth neuron in the hidden layer is given by

$$ {y}_j= f\left( v\right),\quad v=\sum_{i=1}^N{w}_{j i}{x}_i+{b}_j $$
(1)

where x_i is the ith element of the input vector, N is the number of inputs, w_ji is the synaptic weight between input i and neuron j, y_j is the output of the jth neuron, b_j is the bias (Fig. 2a), and f(v) is the activation function. The sigmoid and the hyperbolic tangent are the most commonly used activation functions.
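As a minimal sketch of Eq. (1), the output of a single neuron can be computed as below. The function names and the numerical values are illustrative assumptions and are not taken from the study.

```python
import math

def sigmoid(v):
    """Logistic sigmoid activation f(v) = 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(x, w, b, f=sigmoid):
    """Return y_j = f(sum_i w_ji * x_i + b_j) for one neuron, as in Eq. (1)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    v = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return f(v)

# Illustrative values only (hypothetical, not from the paper).
x = [0.5, -1.0, 2.0]      # inputs x_i
w = [0.4, 0.3, -0.1]      # synaptic weights w_ji
b = 0.3                   # bias b_j

y_sigmoid = neuron_output(x, w, b)               # here v is (numerically) zero
y_tanh = neuron_output(x, w, b, f=math.tanh)     # same neuron, tanh activation
```

Passing the activation as a parameter makes it easy to compare the sigmoid and hyperbolic tangent on the same weighted sum.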

MLPNN is a supervised neural network, trained by presenting inputs together with the corresponding desired outputs. The standard back-propagation algorithm for training the network is based on the steepest-descent gradient approach, applied to minimize an energy function defined from the instantaneous error between the desired and actual outputs. After training, the network is also tested for generalization performance with a separate test data set.
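The steepest-descent idea can be illustrated on a single sigmoid neuron, which is the building block that back-propagation repeats layer by layer. The sketch below assumes the instantaneous energy function E = 0.5 (d - y)^2; the function name, data, and learning rate are hypothetical, and the code is not the training program used in the study.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def train_step(x, d, w, b, eta):
    """One steepest-descent update on E = 0.5 * (d - y)**2 for a sigmoid neuron.

    Returns the updated weights, the updated bias, and E before the update.
    """
    v = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    y = sigmoid(v)
    error = d - y
    # dE/dv = -(d - y) * f'(v), with f'(v) = y * (1 - y) for the sigmoid.
    delta = error * y * (1.0 - y)
    new_w = [w_i + eta * delta * x_i for w_i, x_i in zip(w, x)]
    new_b = b + eta * delta
    return new_w, new_b, 0.5 * error ** 2

# Illustrative pattern: one input vector and its desired output.
x, d = [1.0, 0.5], 0.9
w, b = [0.0, 0.0], 0.0
errors = []
for _ in range(200):
    w, b, e = train_step(x, d, w, b, eta=0.2)
    errors.append(e)
```

In a multilayer network the same `delta` term is propagated backward through the hidden layers to update their weights as well.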

Radial basis function neural networks

The radial basis function neural network (RBFNN) is a feed-forward network trained using a supervised training algorithm and has been suggested by many scientists as an alternative to the MLPNN. An RBFNN performs a nonlinear mapping between the input and output vector spaces and can learn a given training set in a shorter time than an MLPNN. It typically has a fully connected feed-forward structure consisting of three layers, as shown in Fig. 3: an input layer, a hidden layer with a nonlinear radial basis activation function, and a linear output layer (Ham and Kostanic 2000). The output of any neuron in the output layer of an RBFNN is calculated as

$$ {y}_i=\sum_{j=1}^N{w}_{i j}{\varphi}_j\left(\left\Vert x-{c}_j\right\Vert \right),\quad i=1,2,\dots, m $$
(2)

Fig. 3 Radial basis function neural network structure

where φ_j(·) is a set of N arbitrary functions known as radial basis functions, ‖·‖ denotes the Euclidean norm, w_ij is the weight connecting hidden neuron j to output neuron i, N is the number of neurons in the hidden layer, x ∈ ℝ^(n×1) is an input vector, and c_j ∈ ℝ^(n×1) are the centers of the radial basis functions in the input vector space. First, the Euclidean distance between the input vector and the center of the basis function is computed for each unit in the hidden layer. The output of each hidden unit is a nonlinear function of this distance and gives a score for the match. The output of the network for each output neuron is obtained using Eq. (2) as the weighted sum of the hidden layer outputs. The activation function is generally based on a Gaussian function and is given as

$$ \varphi (x)= \exp \left(-{x}^2/{\sigma}^2\right) $$
(3)

where σ is the spread parameter, which controls the width of the radial basis functions. Training an RBFNN involves determining the weights in the output layer, the centers of the radial basis functions, and the spread parameter of the Gaussian function. The simplest form of RBFNN training is obtained with a fixed number of centers. If the number of centers is made equal to the number of input vectors, a configuration known as exact RBFNN, then the error between the desired and actual network outputs for the training data set is equal to zero (Haykin 1999). In this work, the exact RBFNN was used. A further advantage of the exact RBFNN is that its training has a closed-form solution, which gives it a training time advantage (Ham and Kostanic 2000).
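The closed-form training of an exact RBFNN can be sketched as follows, assuming a single output neuron and the Gaussian of Eq. (3). With one center per training vector, the hidden-layer outputs form a square matrix Φ with entries φ(‖x_i − x_j‖), so the output weights follow from solving Φw = d. The function names, the small linear solver, and the data are illustrative assumptions, not the implementation used in the study.

```python
import math

def gaussian(r, sigma):
    """Gaussian radial basis function of Eq. (3): exp(-r^2 / sigma^2)."""
    return math.exp(-(r * r) / (sigma * sigma))

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def solve(A, y):
    """Solve A w = y by Gaussian elimination with partial pivoting.

    Assumes A is square and nonsingular (true for distinct centers).
    """
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * w[c] for c in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def fit_exact_rbf(X, d, sigma):
    """Use every training vector as a center and solve Phi w = d."""
    Phi = [[gaussian(distance(xi, cj), sigma) for cj in X] for xi in X]
    return solve(Phi, d)

def predict(x, centers, w, sigma):
    """Eq. (2) for a single output neuron."""
    return sum(wj * gaussian(distance(x, cj), sigma)
               for wj, cj in zip(w, centers))

# Illustrative data only: two consecutive values as input, next value as target.
X = [[3.0, 5.0], [5.0, 4.0], [4.0, 8.0], [8.0, 6.0]]
d = [4.0, 8.0, 6.0, 7.0]
sigma = 2.0
w = fit_exact_rbf(X, d, sigma)
train_fit = [predict(xi, X, w, sigma) for xi in X]
```

Because the network interpolates the training targets, the training error carries no information about generalization; only the error on separate test data can guide the choice of σ.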

Adaptive neuro-fuzzy inference system

The adaptive neuro-fuzzy inference system (ANFIS) was proposed by Jang (1993) and has been used in various applications, including control systems, prediction of chaotic time series, and signal processing. ANFIS combines the advantages of neural networks with the linguistic interpretability of fuzzy inference systems. It can serve as a basis for constructing a set of fuzzy if–then rules with appropriate membership functions to generate the stipulated input–output pairs. Here, a hybrid learning algorithm is used to identify the parameters of the fuzzy inference system: the parameters are fitted to a given training data set by combining the least squares method with the back-propagation gradient descent method.
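To make the structure concrete, the sketch below shows only the forward computation of a first-order Sugeno-type fuzzy system of the kind ANFIS tunes. It assumes Gaussian membership functions and product rule firing; the parameter layout, names, and values are hypothetical, and the hybrid training step itself is not shown.

```python
import math

def gauss_mf(x, c, s):
    """Gaussian membership function with center c and width s."""
    return math.exp(-0.5 * ((x - c) / s) ** 2)

def anfis_forward(x, premise, consequent):
    """Forward pass of a first-order Sugeno fuzzy system.

    premise[k][i] = (center, width) of rule k's membership function for input i
    consequent[k] = (p_1, ..., p_n, r): rule k outputs sum_i p_i * x_i + r
    """
    # Membership grades, combined by product into rule firing strengths.
    strengths = [
        math.prod(gauss_mf(xi, c, s) for xi, (c, s) in zip(x, mfs))
        for mfs in premise
    ]
    total = sum(strengths)
    # Normalized firing strengths.
    norm = [w / total for w in strengths]
    # Linear output of each rule.
    rule_out = [
        sum(p * xi for p, xi in zip(params[:-1], x)) + params[-1]
        for params in consequent
    ]
    # Overall output: weighted sum of the rule outputs.
    return sum(wk * fk for wk, fk in zip(norm, rule_out))

# Hypothetical two-rule system with two inputs.
premise = [[(1.0, 2.0), (3.0, 2.0)], [(5.0, 2.0), (7.0, 2.0)]]
consequent = [(1.0, 0.5, 2.0), (-0.5, 1.0, 0.0)]
x = [2.0, 4.0]
y = anfis_forward(x, premise, consequent)
```

The output is a convex combination of the rule outputs, so it always lies between the smallest and largest rule output for the given input.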

Earthquake catalog data of Western Turkey

Western Turkey is one of the regions where many earthquakes, both historical and instrumentally recorded, have occurred owing to graben-related tectonic movements. The fault lines along the grabens are the places where intense earthquakes occur. So far, various techniques have been evaluated to predict the occurrence time and intensity of earthquakes. In this study, earthquake catalog data have been evaluated with MLPNN, RBFNN, and ANFIS.

The catalog data used in this study comprise earthquakes with M ≥ 3 that occurred between 37°–39.30° north latitude and 26°–29.30° east longitude from 1975 to 2009 (Kandilli Observatory and Earthquake Research Institute and Republic of Turkey Prime Ministry Disaster and Emergency Management Presidency). In this time interval, there were 10,333 earthquakes in the area. All of the earthquakes are shown in Fig. 4. The monthly earthquake frequency data set was obtained by counting the total number of earthquakes that occurred in each month. In this way, a total of 408 monthly earthquake frequency values were calculated for the studied area. The earthquake frequency data set is given in Fig. 5.

Fig. 4 Study area and the epicenters of the earthquakes with M ≥ 3 from 1975 to 2009

Fig. 5 Monthly earthquake frequency data set of Western Turkey from 1975 to 2009
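Building a monthly frequency series from a catalog amounts to counting events per calendar month. The following is a minimal sketch under the assumption that the catalog has already been filtered by magnitude and area and reduced to a list of event dates; the function name and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

def monthly_frequency(event_dates, start_year, end_year):
    """Count events per calendar month, keeping months with no events as 0."""
    counts = Counter((d.year, d.month) for d in event_dates)
    return [
        counts.get((year, month), 0)
        for year in range(start_year, end_year + 1)
        for month in range(1, 13)
    ]

# Hypothetical event dates, for illustration only.
events = [
    date(1975, 1, 3),
    date(1975, 1, 20),
    date(1975, 3, 7),
    date(1976, 12, 31),
]
series = monthly_frequency(events, 1975, 1976)
```

Months without any recorded event appear explicitly as zeros, which keeps the series evenly spaced in time.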

Data analysis with ANN and ANFIS

The full set of 408 monthly earthquake frequency values from the western Turkey earthquake catalog was divided into two parts for training and testing the neural networks and ANFIS, approximately 85 and 15%, respectively. Three hundred fifty monthly earthquake frequency values were used for training only. To predict future values of the monthly earthquake frequency series, it is necessary to find the best number of past values to use as inputs to the networks. For this purpose, correlation coefficients of the data series were calculated for different numbers of consecutive monthly frequency values, up to six. Moreover, for each number of inputs, MLPNN, RBFNN, and ANFIS were tested to calculate the errors between the desired response and the network response, that is, the predicted monthly earthquake frequency. The arrangement of the ANN and ANFIS inputs and the corresponding data for different numbers of inputs is illustrated in Fig. 6. The initial number of inputs was one, and the last tested number of inputs was six. The circles with solid lines represent the currently used network inputs, whereas the circles with dotted lines represent the next input group to the network. In a similar manner, the arrows below with solid and dotted lines correspond to the network outputs for two consecutive training sessions.

Fig. 6 Input–output pairs
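The input–output arrangement can be sketched as a sliding window over the series: a fixed number of consecutive monthly values forms the input and the following value is the target. The sketch assumes that the window advances one month at a time and that the split is chronological; neither detail is stated explicitly in the text, and the names and data are illustrative.

```python
def make_pairs(series, n_inputs):
    """Build (inputs, target) pairs: n_inputs consecutive values predict the next."""
    return [
        (series[i:i + n_inputs], series[i + n_inputs])
        for i in range(len(series) - n_inputs)
    ]

def chronological_split(pairs, n_train):
    """Keep the first n_train pairs for training and the rest for testing."""
    return pairs[:n_train], pairs[n_train:]

# Illustrative monthly counts (hypothetical, not the catalog data).
series = [12, 7, 9, 15, 11, 8, 20, 14, 10, 13]
pairs = make_pairs(series, n_inputs=3)
train, test = chronological_split(pairs, n_train=5)
```

Repeating this for `n_inputs` from 1 to 6 yields the six data arrangements that are compared for each model.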

To find the optimum network structure for MLPNN and RBFNN, the number of inputs was varied from one to six during training, as shown in Fig. 6. For the MLPNN training sessions, various numbers of hidden layers were used, and the number of neurons in these layers was also varied. As a result of the training and test sessions, the optimum network for the given data set was found to be an MLPNN with one hidden layer consisting of 15 neurons, a learning rate parameter of 0.2, and a target training MSE of 0.0001. With this network, the lowest RMSE and the highest correlation coefficient were obtained with four consecutive inputs. The predicted earthquake frequency values are given in Fig. 7. The correlation coefficients and the RMSE values between observed and predicted earthquake data are given in Figs. 8 and 9, respectively.

Fig. 7 Results of MLPNN, RBFNN, and ANFIS for inputs from 1 to 6

Fig. 8 Correlation coefficients of predicted frequency data for MLPNN, RBFNN, and ANFIS results

Fig. 9 RMSE values of MLPNN, RBFNN, and ANFIS results

For RBFNN, the exact structure, in which the number of centers is equal to the number of input data, was used. Moreover, the number of centers was fixed during training. With this network, the training error was zero. A Gaussian function was used as the activation function of the hidden-layer units. To find the optimum spread parameter σ, a program was written to test various spread parameters and select the one giving the minimum error value. The optimum RBFNN, which provides the minimum RMSE and the maximum correlation coefficient between observed and predicted earthquake data, was found with two consecutive inputs and 350 centers. For the ANFIS applications, the frequency data used in the preceding steps were evaluated with different membership functions. Various numbers of rules were tested, and the best results were obtained with two and three rules. Data sets with 1 to 6 consecutive inputs were evaluated separately for each membership function, and the correlation coefficients and RMSE values of the results were analyzed.

The predicted frequency values for all methods can be seen in Fig. 7. Correlation coefficients and RMSE values are given in Figs. 8 and 9, respectively. For the frequency values predicted by ANFIS, four consecutive inputs gave the maximum correlation coefficient and the minimum RMSE. With a single input, the correlation coefficient is at its minimum and the RMSE is at its maximum.

When the results for all methods in Figs. 8 and 9 are examined, it can clearly be seen that RBFNN provides lower RMSE values and higher correlation coefficients. RBFNN therefore gives better results for the data set studied.

Correlation coefficient and root-mean-squared error

Correlation coefficients and RMSE values between observed and predicted earthquake data were calculated to compare MLPNN, RBFNN, and ANFIS. The calculated correlation coefficients and root-mean-squared errors are given in Figs. 8 and 9, respectively. Table 1 shows that, in general, both neural networks give their highest correlation coefficients for two, three, and four consecutive input values. For MLPNN, the highest correlation coefficient, 0.38, was obtained for four consecutive input values, whereas for RBFNN, the maximum correlation coefficient, 0.93, was obtained for two consecutive input values. The maximum correlation coefficient for ANFIS, 0.44, was obtained for four consecutive input values. Overall, the correlation coefficients for MLPNN and ANFIS are similar, and the coefficient for RBFNN is higher than for the other two.

Table 1 Correlation coefficients of MLPNN, RBFNN, and ANFIS results

The root-mean-squared error values obtained for the test data set are given in Fig. 9 and Table 2. The lowest error value is obtained with four inputs for MLPNN and ANFIS and with two inputs for RBFNN. The correlation coefficients and RMSE values are thus consistent with each other: both indicate four inputs for MLPNN and ANFIS and two inputs for RBFNN.

Table 2 RMSE values of predicted earthquake frequency data for MLPNN, RBFNN, and ANFIS
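The two comparison measures can be computed as in the sketch below. The text does not state which correlation coefficient was used, so the Pearson coefficient is assumed here; the function names and sample values are illustrative only.

```python
import math

def rmse(observed, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def correlation(observed, predicted):
    """Pearson correlation coefficient (assumed; not specified in the text)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical observed and predicted monthly frequencies.
observed = [10, 12, 9, 15]
predicted = [11, 11, 10, 14]
error = rmse(observed, predicted)
r = correlation(observed, predicted)
```

RMSE is expressed in the same units as the data (earthquakes per month), while the correlation coefficient is dimensionless, which is why the two measures are reported side by side.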

Conclusions

Western Turkey is one of the most seismically active regions in the world because of the approximately E–W-trending grabens and their basin-bounding active normal faults. There have been many historical and recent major earthquakes along these graben systems. Different techniques have been applied to seismological data to predict earthquakes and to determine the tectonic features of the region. In this study, MLPNN, RBFNN, and ANFIS were applied to the earthquake catalog data of Western Turkey. The networks were trained and tested for various numbers of consecutive inputs, up to six. The correlation coefficients were in the range of 0.18–0.38 for MLPNN, 0.17–0.44 for ANFIS, and 0.3–0.93 for RBFNN. The RMSE values are between 39.6 and 57.3 for MLPNN, 33.7 and 45.6 for ANFIS, and 13.8 and 38.8 for RBFNN. The results show that four consecutive inputs for MLPNN and ANFIS and two consecutive inputs for RBFNN give the most accurate predictions. The test results also show that the RMSE values of RBFNN are lower than those of MLPNN and ANFIS. The MLPNN and ANFIS results are similar, although ANFIS performs somewhat better than MLPNN. RBFNN gave better results than the other two methods. Therefore, it can be said that RBFNN provides a better prediction of the monthly earthquake frequency data of the region.