1 Introduction

Earthquake prediction is one of the most difficult problems in the world. The main purpose of earthquake studies is to estimate, as reliably as possible, the probability that an earthquake will occur in a given area. Many studies have tried to work out earthquake mechanisms, and many others have developed different earthquake parameters. Because the stress level, which is the most significant earthquake parameter, cannot be measured directly and no observations can be made inside the earth's crust, earthquake prediction remains difficult [33]. Monitoring crustal motions and observing the changes in their velocities are therefore of great significance in earthquake prediction. Knowing the crustal motion velocities that lead to earthquakes gives an idea of when the energy accumulation that causes an earthquake on a fault in a specific region is likely to occur; it is therefore crucial to follow the crustal motions and to observe the variations in their velocities.

Geodetic deformation networks need to be established in order to determine the movements and velocities of the crust. The data obtained from these networks are distributed over a regional area, so spatial statistics are used to analyze them. One of the methods of spatial statistics is kriging, an optimal interpolation method based on regression. Measurements made for earthquake prediction contain uncertainties arising from environmental effects, the limits of human senses, malfunctioning measuring devices and changes in the structure of the data. Kriging does not account for this measurement uncertainty. Neural networks and fuzzy systems can be used in earthquake prediction because of their ability to handle such uncertainties.

Giacinto et al. [15] used neural networks to evaluate the earthquake risk in regions where the risk is already present. Muller et al. [24] used fuzzy methods to classify low-magnitude seismic events recorded in France by a seismometer network. Huang and Leung [17] suggested using fuzzy neural networks to estimate the correlation between earthquake field and magnitude. Artificial neural networks have also been used effectively for identifying electrical earthquake signals, for producing response spectra of artificial earthquakes, and for estimating the radon concentration in the soil as an earthquake signal, the magnitude of medium-intensity earthquakes and the basic motion records of the crust [3, 8, 23, 25, 27]. Bodri [10] evaluated the applicability and benefits of neural networks for earthquake estimation. Fuzzy methods have proven useful for classifying strong ground motion records [4]. Fuzzy methods have also been used to predict reservoir-induced earthquakes, to record strong ground motions and to predict the subsequent seismic moment [2, 6, 22, 36]. Baldovino and Dadios [9] showed that the earthquake simulator system they developed with a fuzzy logic algorithm yielded accurate, reliable and robust results. Aboonasr et al. [1] used a fuzzy inference system to define the earthquake potential and seismic zones of the Iran-Zagros orogenic belt. Likewise, to analyze the seismic risk in the city of Kunming, China, Andric and Lu [7] suggested a new approach based on fuzzy logic techniques and probability theory. Ameur et al. [5] used ANFIS to obtain a robust ground motion prediction model.

In this study, we aim to predict crustal velocities with the AFNN approach. AFNN is a kind of artificial neural network based on the Takagi–Sugeno fuzzy inference system. Combining fuzzy inference systems with neural network learning can help to improve the performance of earthquake prediction.

In the AFNN approach, fuzzy clustering is first used to fuzzify the study area. Fuzzification relies on a membership function, which specifies the degree to which a given input belongs to a set [34]. Different types of membership function yield different types of fuzzy sets. One of the main difficulties in fuzzy set theory has been the meaning and measurement of membership functions [11]. In this study, since the data are spatial and the distances between them are important, we propose to select the variogram function as the membership function in order to represent this factor in the fuzzy logic. The variogram describes the distance-dependent relation between the variables [14].

According to the suggested algorithm, fuzzy rules are obtained from the network. Based on these rules, the motion velocities at unobserved points are predicted. To evaluate the performance of the approach, the suggested AFNN is compared with kriging. To make it clearer, the structure of this study is given in Fig. 1.

Fig. 1 Structure of the study

For the data analysis, spatial prediction techniques are used, since the data on crustal motion velocities are collected over a spatial region. These techniques are presented in Sect. 2. The rest of the paper is organized as follows. The AFNN inference system is explained in Sect. 3. Section 4 presents the suggested algorithm based on AFNN. An application to the prediction of crustal motion velocities in the Marmara Region, Turkey, is presented in Sect. 5. The results are compared in Sect. 6, and the conclusions are given in Sect. 7.

2 Spatial Prediction

The values of spatial variables are known only at the sampled locations of the study area. When the unknown values at unsampled locations are needed, they are calculated from the known values at the sampled locations. The calculation of a spatial variable at an unsampled location is called prediction [14, 31].

The variogram function is used in spatial prediction. The variogram, \(2\gamma \left( {\mathbf{h}} \right)\), characterizes the distance-dependent relation between two random variables separated by the lag \({\mathbf{h}}\). Provided that \(Z({\mathbf{x}})\) is a spatial variable whose increments have zero mean, it is defined as follows:

$$2\gamma \left( {\mathbf{h}} \right) = Var\left( {Z\left( {\mathbf{x}} \right) - Z\left( {{\mathbf{x}} + {\mathbf{h}}} \right)} \right) = E\left[ {\left( {Z\left( {\mathbf{x}} \right) - Z\left( {{\mathbf{x}} + {\mathbf{h}}} \right)} \right)^{2} } \right]$$
(1)

To determine the variogram, the semi-variogram is first estimated from the sample as follows:

$$\hat{\gamma }({\mathbf{h}}) = \frac{1}{{2N({\mathbf{h}})}}\sum\limits_{i = 1}^{{N( \, {\mathbf{h}})}} {(z({\mathbf{x}}_{i} ) - z({\mathbf{x}}_{i} + {\mathbf{h}}))^{2} }$$
(2)

In Eq. (2), \(N({\mathbf{h}})\) is the number of pairs separated by lag \({\mathbf{h}}\), and \(z({\mathbf{x}}_{i} )\) and \(z({\mathbf{x}}_{i} + {\mathbf{h}})\) are the values at the locations \({\mathbf{x}}_{i}\) and \({\mathbf{x}}_{i} + {\mathbf{h}}\), respectively [14, 16, 18, 28, 29]. The semi-variogram values obtained for each lag \({\mathbf{h}}\) are plotted, and a model function is fitted to them [18, 31]. The most widely used variogram models in the literature are the Gaussian, exponential, spherical, nugget effect, linear, logarithmic, quadratic, rational quadratic, cubic, power, wave and pentaspherical models. A variogram model has four defined parameters: \(C_{0}\), the nugget effect; \(C\), the structural variance; \(C_{0} + C\), the sill (threshold); and \(a\), the range (structural distance) [31].
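As a minimal sketch of Eq. (2), the following Python function estimates an omnidirectional sample semi-variogram by grouping station pairs into distance bins. The function name, the binning by `lag_width` and the toy data are assumptions made for illustration; they are not taken from the study.

```python
import math

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Omnidirectional estimate of Eq. (2), binned by separation distance."""
    sums = [0.0] * n_lags    # running sum of squared differences per lag bin
    counts = [0] * n_lags    # N(h): number of pairs per lag bin
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(coords[i], coords[j])
            k = int(dist // lag_width)      # bin k covers [k*w, (k+1)*w)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # gamma_hat = sum / (2 N(h)); bins without pairs are reported as None
    return [
        ((k + 0.5) * lag_width,
         sums[k] / (2 * counts[k]) if counts[k] else None,
         counts[k])
        for k in range(n_lags)
    ]

# Hypothetical toy data: four stations on a line with one value each
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
result = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=4)
for lag_center, gamma, n_pairs in result:
    print(lag_center, gamma, n_pairs)
```

Each returned triple holds the bin center, the semi-variogram estimate and the number of pairs; a variogram model would then be fitted to these points.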

Spatial prediction is calculated through the equation below:

$$\hat{z}({\mathbf{x}}_{0} ) = \sum\limits_{i = 1}^{N} {w_{i} z({\mathbf{x}}_{i} )}$$
(3)

In Eq. (3), \(\hat{z}({\mathbf{x}}_{0} )\) is the prediction for the location \({\mathbf{x}}_{0}\); \(z({\mathbf{x}}_{i} )\) are the values of the variable observed at each location \({\mathbf{x}}_{i}\); \(w_{i}\) are the weights corresponding to each \(z({\mathbf{x}}_{i} )\); and N is the number of points used in the prediction of \(\hat{z}({\mathbf{x}}_{0} )\) [14, 18]. The weights in Eq. (3) have to be determined.

Kriging determines the weights in Eq. (3) in such a way that the mean prediction error is zero and the error variance is minimum [31, 35]. After the weights are determined, the prediction for a given location in the study area is calculated through Eq. (3). In the kriging algorithm, the weights must be recalculated for every new prediction point [18].
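The weight determination described above can be sketched as follows, assuming ordinary kriging (weights constrained to sum to one) and the Gaussian variogram model of Eq. (15). The parameter values, the helper `solve` and the toy data are hypothetical and only serve the illustration.

```python
import math

def gaussian_variogram(h, c0=0.1, c=1.0, a=1.5):
    """Gaussian model of Eq. (15); parameter values here are hypothetical."""
    return 0.0 if h == 0.0 else c0 + c * (1.0 - math.exp(-(h / a) ** 2))

def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def ordinary_kriging(coords, values, target, gamma=gaussian_variogram):
    n = len(coords)
    # Variogram matrix bordered by the unbiasedness constraint sum(w) = 1
    lhs = [[gamma(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
           for i in range(n)]
    lhs.append([1.0] * n + [0.0])
    rhs = [gamma(math.dist(p, target)) for p in coords] + [1.0]
    weights = solve(lhs, rhs)[:n]     # the last unknown is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))   # Eq. (3)
    return weights, prediction

# Hypothetical toy data: four stations on the corners of a unit square
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]
weights, prediction = ordinary_kriging(coords, values, (0.5, 0.5))
print(weights, prediction)
```

Because the right-hand side depends on the target location, the linear system has to be solved again for every new prediction point, which is the repeated weight calculation mentioned above.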

3 AFNN Inference System

A fuzzy inference system is a computing system based on fuzzy set theory and fuzzy if–then rules. A fuzzy if–then rule has the following form:

$$R_{k} :{\text{If}}\,\,\,x \in A_{k} \,\,\,{\text{then}}\,\,y \in B_{k} ,\,\,\,\,k = 1,2, \ldots ,K$$
(4)

where \(R_{k}\) is the kth rule; \(A_{k}\) and \(B_{k}\) are fuzzy sets defined by membership functions; x is the linguistic input variable; and y is the linguistic output variable. The part between “if” and “then” gives the input information (premise), and the part following “then” gives the output information (consequent) [12, 30]. The AFNN system was put forward by Jang [20]. In the AFNN system, a relationship between the input and output variables of the if–then rule is established by using the learning ability of artificial neural networks, and the fuzzy rules are determined by means of this relationship. The system is a feed-forward network with five layers connected to each other by directed links, some of whose neurons are adaptive. Adaptive neurons have parameters whose values are determined by learning [19, 20]. The structure of a fuzzy adaptive network with two inputs and two rules is given in Fig. 2. The operation of the network is described below [12, 20]:

Fig. 2 Structure of fuzzy adaptive network with two inputs and two rules [12]

Layer 1

The fuzzy sets of the fuzzy if–then rules are denoted by \(F_{1}\), \(F_{2}\), \(F_{3}\) and \(F_{4}\). The neurons in this layer are adaptive, and with \(\mu_{{F_{h} }}\) denoting the membership function of \(F_{h}\), the output of the hth neuron is defined as follows:

$$\begin{aligned} f_{1,h} = \mu_{{F_{h} }} (x_{1} ),{\text{ for }}h = 1, \, 2 \, \hfill \\ f_{1,h} = \mu_{{F_{h} }} (x_{2} ),{\text{ for }}h = 3, \, 4 \, \hfill \\ \end{aligned}$$
(5)

Layer 2

The neurons in this layer are denoted by \(\Lambda_{l}\) (l = 1, …, 4), and they are fixed neurons. Each neuron receives two input signals from Layer 1. The output of \(\Lambda_{l}\) is the product of these input signals, and the neural functions of this layer are expressed as follows:

$$\begin{aligned} f_{2,1} & = w^{1} = \mu_{{F_{1} }} (x_{1} ) \cdot \mu_{{F_{3} }} (x_{2} ) \\ f_{2,2} & = w^{2} = \mu_{{F_{1} }} (x_{1} ) \cdot \mu_{{F_{4} }} (x_{2} ) \\ f_{2,3} & = w^{3} = \mu_{{F_{2} }} (x_{1} ) \cdot \mu_{{F_{3} }} (x_{2} ) \\ f_{2,4} & = w^{4} = \mu_{{F_{2} }} (x_{1} ) \cdot \mu_{{F_{4} }} (x_{2} ) \, \\ \end{aligned}$$
(6)

Layer 3

The neurons in this layer are fixed neurons denoted by \(N_{l}\) (l = 1, …, 4). The output of this layer is the normalization of the outputs of Layer 2, and the neural function is defined as:

$$f_{3,l} = \bar{w}^{l} = \frac{{w^{l} }}{{\sum\nolimits_{t = 1}^{4} {w^{t} } }},\,\,\,\,l = 1, \ldots , 4$$
(7)

Layer 4

The neurons in this layer are adaptive neurons whose neural functions are expressed as follows:

$$f_{4,l} = \bar{w}^{l} \cdot \hat{Y}^{l} ,\,\,\,\,l = 1, \ldots , 4$$
(8)

\(\hat{Y}^{l}\) is the consequent of a fuzzy if–then rule and is defined as follows:

$$\hat{Y}^{l} = \, c_{0}^{l} + c_{1}^{l} x_{1} + c_{2}^{l} x_{2}$$
(9)

The coefficients \(c_{i}^{l}\) in Eq. (9) are fuzzy numbers expressed as \(c_{i}^{l} = (a_{i}^{l} , b_{i}^{l} )\) (i = 0, 1, 2; l = 1, …, 4), and they are the consequent parameters.

Layer 5

The single neuron in this layer is a fixed neuron that calculates the overall output as follows:

$$f_{5,1} = f_{\text{output}} = \hat{Y} = \sum\limits_{l = 1}^{4} {\bar{w}^{l} \cdot \hat{Y}^{l} } = \frac{{\sum\nolimits_{l = 1}^{4} {w^{l} \hat{Y}^{l} } }}{{\sum\nolimits_{l = 1}^{4} {w^{l} } }}$$
(10)
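The five layers of Eqs. (5)–(10) can be traced in a short forward pass. This is a minimal sketch under two assumptions that are not part of the description above: the membership functions are Gaussian with a center and a deviation, and the consequent coefficients are crisp numbers rather than fuzzy numbers. All parameter values are hypothetical.

```python
import math

def gauss_mf(x, center, sigma):
    """Assumed Gaussian membership function with a center and a deviation."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def afnn_forward(x1, x2, premise, consequents):
    # Layer 1 (Eq. 5): memberships of F1, F2 on x1 and F3, F4 on x2
    mu = [gauss_mf(x1, *premise[0]), gauss_mf(x1, *premise[1]),
          gauss_mf(x2, *premise[2]), gauss_mf(x2, *premise[3])]
    # Layer 2 (Eq. 6): firing strength of each of the four rules
    w = [mu[0] * mu[2], mu[0] * mu[3], mu[1] * mu[2], mu[1] * mu[3]]
    # Layer 3 (Eq. 7): normalized firing strengths
    total = sum(w)
    w_bar = [wl / total for wl in w]
    # Layer 4 (Eqs. 8-9): linear rule outputs, here with crisp coefficients
    y_rule = [c0 + c1 * x1 + c2 * x2 for c0, c1, c2 in consequents]
    # Layer 5 (Eq. 10): weighted sum of the rule outputs
    y_hat = sum(wb * yl for wb, yl in zip(w_bar, y_rule))
    return y_hat, w_bar

# Hypothetical premise parameters (center, deviation) for F1..F4
premise = [(0.0, 1.0), (1.0, 1.0), (0.0, 1.0), (1.0, 1.0)]
# Hypothetical consequent coefficients (c0, c1, c2) for the four rules
consequents = [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0), (2.0, 1.0, 1.0), (0.5, -1.0, 2.0)]
y_hat, w_bar = afnn_forward(0.3, 0.7, premise, consequents)
print(y_hat, w_bar)
```

When all four rules share the same coefficients, the normalized weights cancel and the output reduces to that single linear model, which is a convenient sanity check.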

The aim of AFNN is to learn the relationship between the given input–output data pairs. The required model is obtained by a learning algorithm. Different error measures can be used to assess the performance of AFNN. The error measure is defined from the difference between the outputs of the model and the target outputs. The training of the network is terminated when this error falls below a prespecified small value.

Different methods are used for the premise and consequent parameters in the training of AFNN. Backpropagation is used for the training of premise parameters, and likelihood linear programming is used for the training of consequent parameters [13, 20, 21].

4 Suggested Algorithm for Spatial Prediction

The main problem in a spatial prediction is to obtain the best prediction of a spatial variable at an unsampled location. To obtain the prediction value for the unsampled location, the observation values of the sampled locations are used. In order to exemplify this problem, a sample prediction problem is given in Fig. 3. The objective here is to obtain the prediction value for the location q with the help of the observation values of the other five locations.

Fig. 3 A sample prediction problem

The value shown in Fig. 3 for the location q1 also represents an area around that location. The size of this area, however, is fuzzy. This fuzziness should be taken into account in the predictions, but classical spatial statistics methods do not consider this uncertainty. For this reason, the AFNN approach is suggested for this prediction problem.

In prediction through AFNN, the independent variables that constitute the inputs of the network are the latitude and the longitude. Spatial prediction with AFNN starts with determining the number of sub-clusters of the independent variables and the membership function. First, the study area is divided into sub-clusters by cluster analysis, which divides data into groups that are meaningful or useful. There are different methods for cluster analysis; in this study, the subtractive clustering algorithm is used. Afterward, the study area is converted into a fuzzy area. This is known as fuzzification, the process of changing a real scalar value into a fuzzy value. Membership functions are used in the fuzzification. They can take different forms, such as triangular, trapezoidal, piecewise linear or Gaussian [19]. Because this study deals with spatial variables, a membership function that models the distance-dependent relation between the variables is suggested, and the variogram function is found to be suitable for this purpose.
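To make the clustering step concrete, the sketch below implements a commonly described potential-based form of subtractive clustering. It is an illustration only and not the implementation used in the study; it assumes that the data are already scaled to the unit interval, and the default parameter values simply mirror those reported in Sect. 5.2. The function name and the toy data are hypothetical.

```python
import math

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def subtractive_clustering(points, ra=0.5, squash=1.25, accept=0.5, reject=0.15):
    """Return cluster centers chosen among the data points (assumed scaled to [0, 1])."""
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (squash * ra) ** 2
    # Potential of each point: high when many other points lie close to it
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        ratio = potential[k] / first_peak
        if ratio < reject:
            break                       # remaining potential too low: stop
        if ratio <= accept:
            # Grey zone: accept only if far enough from the existing centers
            d_min = min(math.sqrt(sq_dist(points[k], c)) for c in centers)
            if d_min / ra + ratio < 1.0:
                potential[k] = 0.0      # discard this candidate, try the next
                continue
        centers.append(points[k])
        peak = potential[k]
        # Subtract the influence of the new center from all potentials
        potential = [p - peak * math.exp(-beta * sq_dist(x, points[k]))
                     for p, x in zip(potential, points)]
    return centers

# Hypothetical one-dimensional data with two well-separated groups
points = [(0.0,), (0.05,), (0.1,), (0.9,), (0.95,), (1.0,)]
centers = subtractive_clustering(points)
print(sorted(centers))
```

The number of centers returned would play the role of the number of fuzzy sub-clusters for one input variable.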

The fuzzy model is constituted based on the Sugeno fuzzy logic method. The fuzzy rules are defined in Table 1.

Table 1 Fuzzy rules

In Table 1, \(C_{i}\) (i = 1, …, m) are the sub-clusters for longitude and \(D_{j}\) (j = 1, …, n) are the sub-clusters for latitude; \(x_{1}\) is the longitude value and \(x_{2}\) is the latitude value. The outputs Y are the values of spatial variables such as the radon concentration, the intensity of the earthquake and the crustal motion velocities. The values \(\hat{Y}^{l}\) (l = 1, …, n × m) are the outputs corresponding to each rule. For each of the n × m rules, the unknown coefficients \(c_{i}^{l}\) (i = 0, 1, 2; l = 1, …, n × m) need to be found.

The algorithm for determining the most suitable values of the coefficients \(c_{i}^{l}\) and the premise parameters is defined as follows:

Step 1 The number of fuzzy sub-clusters is obtained with the subtractive clustering algorithm.

Step 2 Variogram model suitable for the structure of the data is determined.

Step 3 The variogram model determined in Step 2 will be selected as the membership function.

Step 4 Depending on the number of the fuzzy sub-clusters and the value range of the independent variables determined in the first step, the premise parameters are determined.

Step 5 For every cluster to which each of the independent variables belongs, the membership level is determined. With these membership levels, the weights that are the outputs of the second layer of the adaptive network are obtained. These weights are normalized to obtain the output of the third layer of the adaptive network.

With the use of the weight values obtained from the third layer, consequent parameter set is determined.

With the use of the consequent parameter set, the models belonging to the fuzzy rules are defined as:

$$\hat{Y}^{l} = \, c_{0}^{l} + c_{1}^{l} x_{1} + c_{2}^{l} x_{2}$$
(11)

Using the models constituted and the weights \(\bar{w}^{l}\) determined in Step 5, the estimation values are calculated as follows:

$$\hat{Y} = \sum\limits_{l = 1}^{n \times m} {\bar{w}^{l} \cdot \hat{Y}^{l} }$$
(12)

The error for each observation is calculated, and the overall error of the model is calculated as:

$$\hat{\varepsilon } = \frac{1}{N}\sum\limits_{k = 1}^{N} {\left( {y_{k} - \hat{y}_{k} } \right)^{2} }$$
(13)

Provided \(\phi\) is the amount of error prespecified by the decision maker, if \(\hat{\varepsilon } < \phi\), then go to Step 8. If \(\hat{\varepsilon } \ge \phi\), then go to Step 6.

Step 6 The backpropagation error that is used in the updating of the premise parameter set is calculated and the premise parameter is updated. The updating formula is

$$\Delta \rho = - \eta \frac{{\partial \left( {y_{k} - \hat{y}_{k} } \right)^{2} }}{\partial \rho } \,$$
(14)

where ρ is a parameter of the lth neuron at layer r and η is the learning rate.

Step 7 Go to Step 4.

Step 8 The algorithm is stopped. The consequent parameter set is taken as the parameter set of the model to be established. The center and deviation values likewise form the premise parameter set.
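The loop formed by Steps 5 to 8 can be illustrated with a deliberately reduced sketch. It makes several assumptions that depart from the algorithm above and should be read as such: the network has one input and two rules, the consequents are fixed constants, the gradient in Eq. (14) is taken on the overall error of Eq. (13) and approximated by central differences instead of backpropagation, and the data are synthetic. All names and values are hypothetical.

```python
import math

def model_output(x, centers, sigma=0.5, consequents=(0.0, 1.0)):
    """One-input, two-rule network: Gaussian memberships, constant consequents."""
    w = [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]
    return sum(wl * yl for wl, yl in zip(w, consequents)) / sum(w)

def mse(centers, xs, ys):
    """Error measure of Eq. (13)."""
    return sum((y - model_output(x, centers)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

def train_premise(centers, xs, ys, phi=1e-5, eta=1.0, max_iter=500, step=1e-6):
    centers = list(centers)
    for iteration in range(max_iter):
        error = mse(centers, xs, ys)
        if error < phi:                      # stopping rule of Step 5
            return centers, error, iteration
        # Step 6: gradient w.r.t. each premise parameter (finite differences)
        grad = []
        for i in range(len(centers)):
            up = centers[:i] + [centers[i] + step] + centers[i + 1:]
            down = centers[:i] + [centers[i] - step] + centers[i + 1:]
            grad.append((mse(up, xs, ys) - mse(down, xs, ys)) / (2.0 * step))
        # Update in the spirit of Eq. (14): move against the gradient
        centers = [c - eta * g for c, g in zip(centers, grad)]
    return centers, mse(centers, xs, ys), max_iter

# Synthetic targets generated with the "true" centers 0.0 and 1.0
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [model_output(x, [0.0, 1.0]) for x in xs]
start = [0.2, 0.8]
initial_error = mse(start, xs, ys)
trained, final_error, n_iter = train_premise(start, xs, ys)
print(initial_error, final_error, n_iter)
```

In the full algorithm the consequent parameters would also be re-estimated in each pass before the error is evaluated.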

Figure 4 shows the flow chart of the algorithm.

Fig. 4 Flow chart of algorithm

5 Application on the Prediction of the Crustal Motion Velocities in Marmara Region, Turkey

For the application, the geographic positions of the crustal motion velocity measurements and the corresponding measured values reported by Reilinger et al. [26] are used. After the 1999 earthquakes in Turkey, many national and international research projects were started in the country with the aim of obtaining more parameters about the regional geography. For this reason, the study area is chosen as the region between latitudes 39°–42° and longitudes 26°–31°, which covers the Marmara Region and its surroundings. The 73 values within this region constitute the data set (Fig. 5).

Fig. 5 Locations of the data

As the motion velocities are given in two directions, northward and eastward, the analyses are made separately for the north and east directions. Some descriptive statistics of the crustal motion velocities are given in Table 2.

Table 2 Descriptive statistics for crustal velocity

5.1 Kriging Application

The kriging predictions are computed with GS+ (Gswin7) and Surfer (Version 8.2). First of all, the variogram model needs to be determined. The sample variograms are calculated independently of direction, considering only the locations of the data. Therefore, the average variogram is used in variogram modeling.

The calculated sample variograms and the revised model variograms are shown in Fig. 6a for north direction and in Fig. 6b for east direction.

Fig. 6 Sample and model variogram. a North direction and b east direction

When the graphs are examined, the sample variograms are seen to follow a Gaussian (normal distribution) shape for both directions. Accordingly, the Gaussian model is accepted as the variogram model for both variables. The Gaussian model is as follows:

$$\gamma (h) = C_{0} + C\left( {1 - \exp \left( {\frac{{ - h^{2} }}{{a^{2} }}} \right)} \right)$$
(15)

The model parameters are calculated and given in Table 3. After the determination of the variogram model, prediction is computed in kriging method. The predictions are obtained by computing the weights defined in Eq. (3).

Table 3 Parameters of variogram model

The accuracy of the predictions is verified with the cross-validation technique. Each of the 73 locations is removed from the data set in turn, and the kriging prediction is performed at that location using the remaining data values. The error values are obtained as the differences between the predicted and the measured values (Table 4).

Table 4 Premise parameters

The error of the kriging prediction is measured by the mean square error (MSE) criterion and is found for the north and east directions, respectively, as follows:

$$\begin{aligned} {\text{MSE}}_{\text{Kriging - n}} & = \frac{{\sum\nolimits_{i = 1}^{N} {(Y_{i} - \hat{Y}_{i} )^{2} } }}{N} = 3.4286 \\ {\text{MSE}}_{\text{Kriging - e}} & = \frac{{\sum\nolimits_{i = 1}^{N} {(Y_{i} - \hat{Y}_{i} )^{2} } }}{N} = 5.2120 \\ \end{aligned}$$
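The leave-one-out procedure and the MSE criterion can be sketched generically as follows. To keep the example short, an inverse-distance-weighting predictor stands in for kriging; this substitution, the function names and the toy data are assumptions for illustration only.

```python
import math

def idw_predict(coords, values, target, power=2.0):
    """Stand-in predictor (inverse-distance weighting), NOT kriging."""
    weights = [1.0 / math.dist(p, target) ** power for p in coords]
    return sum(w * z for w, z in zip(weights, values)) / sum(weights)

def leave_one_out_mse(coords, values, predict):
    """Remove each location in turn, predict it from the rest, return (MSE, errors)."""
    errors = []
    for i in range(len(coords)):
        rest_coords = coords[:i] + coords[i + 1:]
        rest_values = values[:i] + values[i + 1:]
        errors.append(values[i] - predict(rest_coords, rest_values, coords[i]))
    return sum(e * e for e in errors) / len(errors), errors

# Hypothetical toy data: three distinct stations on a line
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [1.0, 2.0, 3.0]
cv_mse, cv_errors = leave_one_out_mse(coords, values, idw_predict)
print(cv_mse, cv_errors)
```

Any predictor with the same call signature, for example a kriging routine, could be passed in place of `idw_predict`.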

5.2 Application of AFNN

For the AFNN application, the Anfis Editor under the fuzzy logic module of MATLAB is used. First of all, the data set is separated into a training set and a test set: 75% of the data (55 data points) is in the training set and 25% (18 data points) is in the test set. The training set is used to train the network, and the test set is used to measure the performance of the training.
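A split of this size can be reproduced with a few lines of Python. How the observations were actually assigned to the two sets is not stated, so the seeded random shuffle below is an assumption, as are the function name and the use of integer indices in place of the real observations.

```python
import random

def split_train_test(items, train_fraction=0.75, seed=0):
    """Shuffle with a fixed seed and split into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# 73 observations, represented here only by their indices
train, test = split_train_test(range(73))
print(len(train), len(test))
```

With 73 observations, rounding 0.75 × 73 = 54.75 gives 55 training points and leaves 18 test points.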

The training data loaded in the Anfis Editor are shown in Fig. 7.

Fig. 7 Training data. a North direction and b east direction

Application with the AFNN is shown algorithmically as follows:

Step 1 The numbers of fuzzy sub-clusters for the independent latitude and longitude variables are calculated using the subtractive clustering algorithm. The parameters required for the algorithm are selected as squash factor η = 1.25, range of influence \(r_{a} = 0.5\), accept ratio \(\overline{\varepsilon } = 0.5\) and reject ratio \(\underline{\varepsilon } = 0.15\) (Fig. 8).

Fig. 8 Subtractive clustering window

As a result of the clustering algorithm, the number of sub-clusters for north and east directions is determined as four for both of the latitude and longitude variables. Figure 9 shows the Anfis model structure.

Fig. 9 ANFIS model structure

The number of fuzzy rules, obtained by multiplying the numbers of sub-clusters determined for the two variables, is sixteen.

Step 2. For both of the variables, Gaussian distribution model is selected as the variogram model (Eq. 15).

Step 3. According to Step 2, the membership functions are taken as Gaussian distribution and

Step 4. For the four clusters, the premise parameters are obtained as follows:

Step 5. Based on preliminary tests, 500 iterations are found suitable for training. \(\phi\) is selected as 0.001. The training of the network is started. If \(\hat{\varepsilon } \ge \phi\), then go to Step 6. Otherwise, go to Step 8.

Step 6. The backpropagation error is calculated by using Eq. (14), and the premise parameter set is updated with the training of the network.

Step 7. Go to Step 4.

Step 8. The algorithm is stopped. The result of the training is given in Fig. 10.

Fig. 10 Result of training. a North direction. b East direction

The consequent parameter set is taken as the parameter set of the model. The center and deviation values determined initially and those obtained as a result of the training are given in Tables 5 and 6 for the north and east directions, respectively.

Table 5 The center and deviation values for the north direction
Table 6 The center and deviation values for the east direction

The fuzzy rules that are formed using the values of consequent parameter set are given in Tables 7 and 8 for north and east directions, respectively. In Tables 7 and 8, the x1 and x2 coordinates are the values of the latitude and the longitude, respectively, and \(C_{i} \,\), \(D_{j} \,\) \((i,j = 1,2,3,4)\) clusters are the fuzzy sub-clusters for the latitude and longitude values, respectively. The predictions for the crustal motion velocities are calculated by means of these models.

Table 7 Fuzzy rules for north direction
Table 8 Fuzzy rules for east direction

For the evaluation of the results obtained through the method, performance of the test set is considered. Figure 11 shows the testing data and FIS output for test data.

Fig. 11 Testing data and FIS output. a North direction. b East direction

The MSE values for the test set are calculated for the north and the east directions as follows:

$$\begin{aligned} {\text{MSE}}_{\text{AFNN - n}} = \frac{{\sum\nolimits_{k = 1}^{N} {(y_{k} - \hat{y}_{k} )^{2} } }}{N} = 5.6503 \hfill \\ {\text{MSE}}_{\text{AFNN - e}} = \frac{{\sum\nolimits_{k = 1}^{N} {(y_{k} - \hat{y}_{k} )^{2} } }}{N} = 3.0725 \hfill \\ \end{aligned}$$

6 Comparison of Results

To evaluate the results obtained through both methods, the contour maps are compared first, and then the performance on the test set is considered. The prediction errors are given as contour maps in Fig. 12. The errors are overlaid on a map of the Marmara Region together with the fault lines (red lines). The region lies between latitudes 39°–42° and longitudes 26°–31°.

Fig. 12 Errors on crustal velocity prediction (Marmara Region). a For Kriging. b For AFNN (for north and east directions, respectively) (errors are in mm and equivalent error range is 1 mm)

When the errors are evaluated, it is seen that for the kriging method the contours become denser near the fault lines and larger errors occur. For the AFNN method, however, the contours are less dense and the errors are small.

The performance on the test set is also considered when comparing the results of the two methods.

The coordinates for the test set and the velocities observed at these coordinates (Y), the prediction values (\(\hat{y}_{\text{AFNN}}\)) and errors (\(\hat{e}_{\text{AFNN}}\)) obtained from AFNN approach and the prediction values (\(\hat{y}_{\text{Kriging}}\)) and errors (\(\hat{e}_{\text{Kriging}}\)) obtained from Kriging method are given in Tables 9 and 10 for the north and east directions, respectively.

Table 9 Value of prediction and error of test set for north direction
Table 10 Value of prediction and error of test set for east direction

Figure 13 shows the errors obtained from both methods for the test data.

Fig. 13 Errors for test data. a North direction and b east direction

When the MSE values are compared for the test set, the error obtained in AFNN is found to be larger for the north direction and smaller for the east direction (Table 11).

Table 11 Comparison of MSE for test set

Unlike the kriging method, the adaptive network can calculate the prediction for a given location directly from the models of the fuzzy rules obtained, without repeating the calculations from scratch.

7 Conclusions

Determining the crustal motion velocities that cause earthquakes is a hard and prolonged process. Observation stations are established to determine the motion velocities, and data are obtained from measurements made at these stations in different periods. Motion velocities at unobserved coordinates can be predicted from the velocities determined at the observation stations. Because the motion velocities are spatially related, spatial prediction is used for this purpose. In this paper, as an alternative to the existing methods for spatial prediction problems, we suggest the AFNN approach with the variogram function, which accounts for spatial dependence, as the membership function. To assess the efficiency of the system, it is compared with the kriging method. The MSE values obtained from the test data show that the AFNN system gives results as efficient as those of the kriging method, which is known to be the best prediction method: its error is smaller for the east direction and larger for the north direction. The AFNN therefore appears well suited to this prediction task and, given enough data, could be used in solving earthquake prediction problems.

The advantage of the AFNN system over the kriging method is that it provides explicit models for prediction. With these models, the crustal motion velocities at any location in the study area can be calculated quickly. In the kriging method, the weights used in a prediction are calculated from the distances between the locations, so they need to be recalculated for each new prediction point. The AFNN eliminates this problem: once the fuzzy models have been obtained by training the network, predictions can be made at any time. This is the strength of the AFNN.

Besides these advantages, there are some points to be careful about when applying the AFNN system. If there are not enough data, suitable models may not be established. In that case, even if satisfactory results are obtained for the training set, the predictions at the sample locations may have large errors. Another point to watch is the selection of the membership function: a variogram model that is not appropriate for the structure of the data will affect the performance of the AFNN system, the values of the premise parameters and the models of the fuzzy rules obtained through the network. For this reason, attention should be given to the selection of the membership function. These are the weaknesses of the AFNN.

Predicting crustal motion velocities is significant for earthquake studies. Research on crustal motions makes it possible to calculate the annual deformation velocity in the seismically active portions of the earth’s crust. Taking this velocity into account, the time at which the deformation reaches its saturation point can be predicted, and earthquakes might thus be predicted in advance. This study is therefore expected to contribute to earthquake prediction studies.

This study is applied in the Marmara Region, Turkey. Researchers can similarly use the AFNN approach in any spatial region.