Introduction

Loess is one of the major problematic soils in Golestan Province. The loess of Golestan Province has a potential to collapse, which has caused severe settlement problems for many structures founded upon it. For this reason, the collapse potential of loess in Golestan Province was studied. Loess is a fine-grained, clastic sedimentary rock, primarily of Aeolian origin, which covers nearly 10 % of the Earth’s surface (Heller and Evans 1995), and these soils are generally composed of homogeneous and angular particles. The particles are mostly silt sized (50–90 %) and are accompanied by illite and sometimes sand. In the field, these deposits are recognized by their lack of stratification and their homogeneous sorting. Loess is classified into three groups: silt, clay, and sand groups (Jinfeng et al. 2006). In terms of engineering behavior, loess soils are considered problematic materials, and collapse is a common risk in this type of soil (Sariosseiri and Muhunthan 2009). A collapsible soil is defined as a soil that is susceptible to a large and sudden reduction in volume upon wetting. Collapsible soil deposits share two main features: first, they are loose, cemented deposits, and second, they are naturally quite dry (Day 2000). A collapsible soil can withstand a large applied vertical stress with a small amount of compression but then shows much larger settlement upon wetting, with no increase in vertical stress (Jennings and Knight 1975). Collapse behavior of soil can have disastrous consequences for structures unwittingly built on such deposits. Unsaturated soil can experience significant collapse under the following conditions (Mitchell 1993; Rogers 1995; Fookes 1997):

  • The soil has an open, potentially unstable and unsaturated structure (high void ratio, low dry density, high porosity).

  • A high enough value of external stress is applied to develop a metastable condition.

  • A high enough value of suction is available to stabilize intergranular contacts whose reduction on wetting leads to collapse of the soil.

  • The interparticle bond strength is low.

Thick loess deposits cover more than 17 % of the area of Golestan Province (Feiznia et al. 2005). Numerous studies have been carried out on these soils, including the effect of soil structure on the behavior of loess (Gerard et al. 2007), the effect of intergranular cement on the mechanical strength of loess (Sariosseiri and Muhunthan 2009), the influence of climate and secondary changes (Derbyshire et al. 1997), and the effect of physical characteristics on deformation properties (Reznik 2007).

Gorgan and its surroundings (northeastern Iran), which have suffered from soil collapse for many years, were selected as the application site of this study. Determining the collapse potential and estimating the collapse rates of the loess soils of Golestan Province on the basis of experimental tests, particularly for managing railways, roadways, and other civil structures in the study area, requires a simple and effective assessment tool. Artificial neural networks have been shown to be quite effective in estimating collapsibility (Salehi 2011).

A neural network is a massively parallel-distributed processor made up of simple processing units called neurons, which have a natural propensity for storing experiential knowledge (Haykin 1999). Each neuron has a number of inputs and one output. It also has a set of nodes called synapses that connect to the inputs, the output, or other neurons. The motivation for the development of neural network technology stems from the desire to develop an artificial system that could perform “intelligent” tasks (Hayati and Shirvany 2007).

Learning a mapping between an input and an output space from a set of input-output data is the core concern in diverse real-world applications. Function approximation, which finds the underlying relationship in a given finite set of input-output data, is the fundamental problem in a vast majority of real-world applications, such as prediction, pattern recognition, data mining, and classification. Various methods have been developed to address this problem, one of which is the artificial neural network (ANN).

In short, the main concern is to minimize the error function. In other words, the principal objective of function approximation is to enhance the accuracy of the estimate. According to Cybenko (1989) and Hornik et al. (1989), there exists a three-layer neural network that is capable of estimating an arbitrary nonlinear function f with any desired accuracy. Hence, it is not surprising that ANNs have been employed in various applications (Doulati Ardejani et al. 2013; Jodeiri Shokri et al. 2013), especially in problems related to function approximation, because of their capability to find the pattern within input-output data without the need for predetermined models.

Therefore, in this paper, the collapse potential of loess soils in Golestan Province is estimated by using neural networks. Determining an appropriate architecture of the neural network for a particular problem is an important issue, since the network topology directly affects its computational complexity and generalization capability (Kisi and Uncuoglu 2005). Moreover, because of geological complexity, many variables (quantitative, semi-quantitative, and qualitative) are involved in an engineering geology evaluation, and these variables have a highly nonlinear relation with the evaluation results. It is therefore necessary and meaningful to establish an engineering geology evaluation method that can minimize artificial influences and deal with both quantitative and qualitative indexes. ANN is a relatively new discipline that offers such a method. A comparison study is also carried out to show the performance of three different neural networks, the multilayer perceptron (MLP), the radial basis function (RBF) network, and the adaptive neuro-fuzzy inference system (ANFIS), using different training algorithms and different sets of parameters for each network. Finally, the best network with the optimal parameters is presented.

The organization of this paper is as follows: the “Materials and methods” section details the geological setting and the study area, and the “Results and discussions” section presents the design of the neural networks and the numerical results. The neural network analysis consists of three subsections; in each of them, one neural network is simulated and the corresponding results are analyzed. The last section is the “Conclusion.”

Materials and methods

The geological setting

The study area

Golestan Province is located at the following geographical coordinates: lat. 36.30–38.15 (north); long. 54.00–56.00 (east), and the capital of this province is Gorgan. It is surrounded by Mazandaran, Semnan, and Khorasan provinces. The northeastern part of this province has a common border with the Turkmenistan Republic. The Caspian Sea borders the northwestern corner of the province. Golestan Province has a mild Mediterranean climate in most of its regions. The Golestan plain, which is close to the Turkmenistan desert and lacks high mountains, has a warm semi-desert climate. The humid forest regions of northern Alborz extend to the Minoodasht and Golestan forests. Loess in Golestan Province has a high thickness (varying from 30 to 150 m) and covers more than 17 % of the province area (388,000 ha) (Feiznia et al. 2005).

This province is situated in two different structural domains. The northern sector is part of the Turan plate, and the southern part belongs to the Iranian plate. The Pliocene-Quaternary loess of the Caspian Sea region is known as a sediment wind-refrigerator (Andalibi 1994). According to recent studies (Salehi 2011), the loesses of Golestan Province are highly sensitive to water erosion. In terms of origin, the loesses of this area are divided into two groups: East and West Caspian Sea loess. The East Caspian Sea loess contains minerals such as quartz, calcite, feldspar, dolomite, mica, and clay and has a sedimentary genesis, while the West Caspian Sea loess is attributed an igneous origin because of the presence of materials such as hornblende, basalt, and volcanic ash (Fig. 1).

Fig. 1 Geotechnical zoning map of Golestan Province soils (Rezaei 2008)

Based on sedimentology and geotechnical characteristics, the loess of this province extends over three regions, 1, 2, and 3 (Fig. 2). This study is the outcome of experiments in seven different zones of Golestan Province. The research aimed to estimate the collapse potential of Golestan loess in order to prevent danger and damage to structures founded upon it (Table 1).

Fig. 2 Extension of loess deposits in Golestan Province: Region 1: Seid-Miran, Aliabad, Agh-emam; Region 2: Gonbad-Kalaleh, Golestan dam; Region 3: Alagol, Tangli

Table 1 The geological situation and climatic conditions of Golestan Province

Characteristic properties of Golestan loess

Loess is an Aeolian sediment formed by the accumulation of wind-blown silt, typically in the 20–50-μm size range, with 20 % or less clay and the balance made up of roughly equal parts of sand and silt, loosely cemented by calcium carbonate. It is usually homogeneous and highly porous and is traversed by vertical capillaries that permit the sediment to fracture and form vertical bluffs (Donahue et al. 1977).

Golestan Province, in the northeastern part of Iran, has an extent of 21,000 km², of which 3880 km² is covered with loess deposits which originate from the south part of China. In terms of distribution, the loesses can be divided into three areas. The first area has an extent of about 1380 km² and consists of the mountainous part and foothills of the south. The second area, with an extent of about 1540 km², is located in the middle part of the province, and the third, with an extent of about 960 km², consists of the northern and border part of the province. This area mainly consists of loess deposits of different thicknesses on formation outcrops (Salehi 2011). These homogeneous, porous loesses are also friable, pale yellowish brown or buff in color, slightly coherent, very light and dry, typically nonstratified, and often calcareous (the deposits are composed largely of silt-size grains that are loosely cemented by calcium carbonate).

The loess grains of Golestan Province are angular, with little polishing or rounding, and are composed of crystals of quartz, feldspar, mica, and other minerals. Loess can be described as a rich, dust-like soil. Some of the samples show macroscopic pores that are easily visible (Fig. 3). Soil erosion is one of the most serious ecological and environmental problems in the loess of Golestan Province. Many kinds of landforms occur in this loess, such as landslides, gullies, and subsidence features. All of these landforms result from high collapsibility (Salehi 2011).

Fig. 3 Macroscopic pore in the samples of the northern area

Mechanism of collapse

Collapse is typical of unsaturated, rather loose, low-plasticity soils, and these characteristics are typical of loess. Collapse is a significant volume reduction observed when an unsaturated sample is wetted under load. This phenomenon has long been described in arid regions. Jennings and Knight proposed the double-oedometer method in 1957 and the single-oedometer test in 1975 to estimate the magnitude of possible collapse. In 1962, Gibbs and Bara also proposed a simple collapse criterion based on the dry density and the liquid limit.

The most common types of collapsible soil are as follows (Mansour et al. 2008):

  • Alluvial (water-deposited) and colluvial (gravity-deposited) soils.

  • Wind-deposited (Aeolian) soils, such as fine sands, volcanic ash tuffs, and loess.

  • Residual soils formed by extensive weathering of parent materials. For example, weathering of granite can yield loose collapsible soil deposits.

As noted by Houston (1995), numerous potential sources of water can wet the soil to the point of collapse:

  • Broken water lines, canals, and landscape irrigation

  • Roof run-off and poor surface drainage

  • Intentional and unintentional recharge

  • Rising ground water table

  • Damming due to cut/fill construction

  • Moisture migration due to capillarity and protection from the sun

Barden et al. (1973) gave three conditions to observe collapse in a soil:

  • An open potentially unstable partly saturated structure

  • A high enough value of applied stress component to develop a metastable condition

  • A high enough value of suction (or other bonding or cementing agent) to stabilize intergranular contacts, and whose reduction on wetting will lead to collapse

Collapsible soil deposits share two main features:

  i. They are loose, cemented deposits.

  ii. They are naturally quite dry.

Simple capillary forces have often been mentioned as a possible binding agent. However, as stated by Barden, the majority of collapsing soils involve the action of clay plates in the bonds between the bulky sand and silt grains. Possible effects of other chemical cementing agents, such as iron oxide or calcium carbonate, have also been mentioned (Rogers 1994). Loess soils have several bonding factors, such as clay minerals and carbonate bonding (calcium carbonate agents), that could affect the mechanism of hydroconsolidation. Thus, hydroconsolidation shows a large range of lateral and vertical variation in this type of area.

Results and discussions

Neural networks analysis

The multilayer perceptron network

The most common neural network model is the multilayer perceptron (MLP). This type of neural network is known as a supervised network because it requires a desired output in order to learn. Its popularity is primarily associated with its ease of implementation and testing.

The goal of this type of network is to create a model that correctly maps the input to the output using historical data so that the model can then be used to produce the output when the desired output is unknown. Therefore, an MLP network can be seen as a very versatile interpolator that produces a set of output values (output vector) for a given set of input values (input vector), thus being able to mimic complex mappings between input and output variables in situations where the physical relationship linking these variables is difficult or even impossible to obtain.

To achieve this, the MLP possesses a number of basic units called neurons (Wasserman 1989; Cichoki and Unbehauen 1993). The fact that every single neuron applies a nonlinear function to the sum of its weighted inputs makes the MLP capable of representing complex functions. These weights are modified during the training phase, in which sets of input vectors and their associated output vectors are sequentially presented to the MLP. The weights are modified automatically by the training algorithm so as to minimize the difference between a calculated output vector and the corresponding desired output vector. MLP training is an iterative procedure with a few control parameters for evaluating convergence and deciding when to stop iterating. After the training phase is completed, all MLP weights have well-defined values that do not change over time (unless more training iterations are performed). At this point, the MLP is ready for use in processing mode, where only input vectors are presented to it.

One can conclude that in the MLP ANN, the number of inputs, the number of hidden nodes, transfer functions, and training methods affect the prediction performance and hence need to be chosen carefully. The most important work in building our soft computing-based collapse potential prediction models is the selection of the input variables.

The network is implemented by using the MATLAB Neural Networks Toolbox. The size of the input matrix is 17 × 17 (17 input variables for each of 17 samples). The inputs include the particle size factors (gravel percentage, sand percentage, silt percentage, clay percentage), the physical and mechanical properties (natural moisture content, void ratio, dry density, degree of saturation, liquid limit, activity, inverse of the liquidity index), the chemical properties (calcium carbonate), and other properties such as the type of sediment, climate, age, precipitation, and vegetation (Appendix 1). The size of the target vector is 1 × 17 in this structure, which corresponds to the collapse potential (Ic%) of the 17 samples. Two further samples (samples 10 and 18) are chosen to be used as “the performance verification samples” (17 × 2 inputs and 1 × 2 targets) (Appendix 2).

Different training algorithms with various network structures are implemented to achieve the best performance, which is measured for the two verification samples by the mean square error (\( \mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}{\left({\widehat{x}}_i-{x}_i\right)}^2 \), where \( {\widehat{x}}_i \) and \( {x}_i \) are the estimated and true values, respectively), the mean absolute percentage error (\( \mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{{\widehat{x}}_i-{x}_i}{{x}_i}\right| \)), and the mean absolute error (\( \mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{\widehat{x}}_i-{x}_i\right| \)). The corresponding numerical results are presented in Table 2. In all the simulations, the training ratio, the test ratio, and the validation ratio are fixed at 70, 15, and 15 %, respectively.
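The three error indexes defined above can be computed directly from their formulas. The following Python sketch is only an illustration (the study itself used MATLAB); the function names and the sample values are hypothetical. MAPE is returned as a fraction, as in the formula, and it is assumed here that the percentages reported in the text correspond to this value multiplied by 100.

```python
def mse(estimated, true):
    """Mean square error between estimated and true values."""
    return sum((xh - x) ** 2 for xh, x in zip(estimated, true)) / len(true)


def mape(estimated, true):
    """Mean absolute percentage error, returned as a fraction.

    Multiply by 100 to express it in percent (assumed convention).
    """
    return sum(abs((xh - x) / x) for xh, x in zip(estimated, true)) / len(true)


def mae(estimated, true):
    """Mean absolute error between estimated and true values."""
    return sum(abs(xh - x) for xh, x in zip(estimated, true)) / len(true)


# Hypothetical estimates and true values for two verification samples.
estimated_ic = [1.0, 2.0]
true_ic = [1.0, 4.0]
print(mse(estimated_ic, true_ic), mape(estimated_ic, true_ic), mae(estimated_ic, true_ic))
```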

Table 2 The collapse potential estimation results using MLP network

One can see that the best performance is reached by the Levenberg-Marquardt (LM) training algorithm (MSE = 6.3771 × 10⁻⁸, MAPE = 0.0398 %, MAE = 1.9578 × 10⁻⁴), using 5, 25, and 5 neurons in the first hidden layer, the second hidden layer, and the output layer, respectively. The first hidden layer uses the hyperbolic tangent sigmoid transfer function, and the other two layers use the linear transfer function. This training algorithm is more precise and lets the neural network work with a simpler structure and fewer neurons than the other training algorithms.
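The processing mode of such a network, with a hyperbolic tangent first layer followed by linear layers, can be sketched as a plain forward pass. This Python sketch is not the authors' MATLAB implementation: the weights are random placeholders rather than LM-trained values, the helper names are hypothetical, and the single output neuron is an assumption made here so that the network returns one scalar estimate.

```python
import math
import random


def linear(z):
    """Linear (identity) transfer function."""
    return z


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W . inputs + b) per neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Placeholder weights and biases (stand-ins for trained values)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def mlp_forward(x, layers):
    """Propagate an input vector through all layers in order."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


rng = random.Random(0)
sizes = [17, 5, 25, 1]                      # 17 inputs; output size 1 is assumed
activations = [math.tanh, linear, linear]   # tanh first layer, then linear layers
layers = [random_layer(n_in, n_out, rng) + (act,)
          for n_in, n_out, act in zip(sizes[:-1], sizes[1:], activations)]

estimate = mlp_forward([0.5] * 17, layers)  # hypothetical normalized input vector
print(estimate)
```

Training would adjust the weights and biases so that the outputs match the measured collapse potentials; only the evaluation step is shown here.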

The radial basis function network

The MLP network itself has certain shortcomings. Firstly, the MLP tends to get trapped in undesirable local minima while searching for the global minimum of a very complex search space. Secondly, training the MLP is highly time-consuming because of its slow convergence. Thirdly, the MLP may fail to converge when high nonlinearities exist. These drawbacks reduce the accuracy of the MLP in function approximation (Haykin 1999; Ahmad Fadzil et al. 2001; Zainuddin and Evans 2003).

To overcome the obstacles encountered by using an MLP, a radial basis function (RBF) network, which has been introduced by replacing the global activation function in the MLP with a localized radial basis function, has been found to perform better than the MLP in function approximation (Moody and Darken 1989; Broomhead and Lowe 1988).

The radial basis function network was first introduced by Broomhead and Lowe (1988). It is simply the combination of radial functions in a single-hidden-layer neural network. An RBF network is a standard network consisting of d input nodes, one hidden layer of m radial basis functions in the hidden nodes, and a linear output layer. Each hidden node has an activation function φ that receives multiple inputs \( \overrightarrow{x}=\left({x}_1, \dots, {x}_d\right) \) and produces one output y. The sizes of the input vector and the target vector, the training ratio, the test ratio, and the validation ratio are the same as in the previous subsection. The training performance goal is set to 1 × 10⁻²⁰.

Simulation results show that the RBF network performs nearly as well as the MLP network trained by the LM algorithm. This network is implemented with a maximum of 20 neurons, and the spread of the radial basis transfer functions is set to 1.5. The best achieved performances are MSE = 2.6631 × 10⁻⁷, MAPE = 0.1026 %, and MAE = 3.6490 × 10⁻⁴ (better than all the simulated MLP networks in Table 2 except the one trained by the LM algorithm).
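The structure described above (d inputs, m radial basis hidden nodes, and a linear output layer) can be sketched in a few lines of Python. This is an illustrative sketch, not the toolbox implementation: the Gaussian form of φ and the way the spread enters it are assumptions, and the centers, weights, and bias are hypothetical values rather than trained ones.

```python
import math


def gaussian_rbf(x, center, spread):
    """Radial basis activation: depends only on the distance from x to the center."""
    r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-r2 / (2.0 * spread ** 2))  # assumed Gaussian parameterization


def rbf_forward(x, centers, spread, weights, bias):
    """Hidden layer of radial basis functions followed by a linear output node."""
    hidden = [gaussian_rbf(x, c, spread) for c in centers]
    return sum(w * h for w, h in zip(weights, hidden)) + bias


# Hypothetical network with d = 2 inputs and m = 2 hidden nodes.
centers = [[0.0, 0.0], [3.0, 4.0]]
y = rbf_forward([0.0, 0.0], centers, spread=1.5, weights=[2.0, 1.0], bias=0.5)
print(y)
```

Because each hidden node responds only near its own center, the input in this example activates the first node fully and the second one only weakly.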

The adaptive neuro-fuzzy inference system

An adaptive neuro-fuzzy inference system (ANFIS) is a fuzzy Takagi-Sugeno model put in the framework of adaptive systems to facilitate learning and adaptation (Jang 1993). In the ANFIS architecture, the first layer is formed by adaptive nodes that give the degree of fuzzy membership of the input. The second layer computes the firing strengths of the associated rules. The neurons constituting the third layer are fixed neurons that normalize the firing strengths from the previous layer. The fourth layer is adaptive and gives the product of the normalized firing level and a first-order polynomial. Finally, the last layer sums all incoming signals.
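The five layers described above can be traced in a short forward pass. The Python sketch below is a minimal illustration under stated assumptions: Gaussian membership functions are assumed, one membership function per input per rule is used, and all rule parameters are hypothetical rather than values identified in this study.

```python
import math


def gaussian_mf(x, center, sigma):
    """Layer 1: degree of membership of x in a Gaussian fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(x, rules):
    """Evaluate a first-order Takagi-Sugeno system for one input vector."""
    # Layers 1 and 2: membership degrees and their product (rule firing strength).
    firing = [math.prod(gaussian_mf(xi, c, s)
                        for xi, c, s in zip(x, rule["centers"], rule["sigmas"]))
              for rule in rules]
    # Layer 3: normalize the firing strengths.
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4: normalized strength times a first-order polynomial of the inputs.
    weighted = [w_bar * (sum(p * xi for p, xi in zip(rule["coeffs"], x)) + rule["const"])
                for w_bar, rule in zip(normalized, rules)]
    # Layer 5: sum of all incoming signals.
    return sum(weighted)


# Two hypothetical rules on two inputs.
rules = [
    {"centers": [0.0, 0.0], "sigmas": [1.0, 1.0], "coeffs": [1.0, 1.0], "const": 0.0},
    {"centers": [1.0, 1.0], "sigmas": [1.0, 1.0], "coeffs": [0.0, 0.0], "const": 3.0},
]
output = anfis_forward([0.5, 0.5], rules)
print(output)
```

The input lies midway between the two rule centers, so both rules fire equally and the output is the average of the two rule polynomials.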

Two types of data are loaded in this network:

  1. Training data: the size of the training data is 17 × 18, and the last column corresponds to the output.

  2. Testing data: the size of the testing data is 2 × 18, and the last column corresponds to the output. This type consists of the collected data of two samples which are chosen to be used as “the performance verification samples.”

The subtractive clustering partition method is used for generating the fuzzy inference system (FIS). Four parameters are set in this method:

  • The range of influence: this variable is a vector of entries between 0 and 1 that specifies a cluster center’s range of influence in each of the data dimensions, assuming the data falls within a unit hyperbox. Small values of this parameter generally result in many small clusters, and large values in a few large clusters. In this paper, the best performance is achieved for 1.0.

  • The squash factor: this variable determines the neighborhood of a cluster center, so as to squash the potential for outlying points to be considered as part of that cluster. This variable is between 1.0 and 2.0. In this paper, the best performance is achieved for 2.0.

  • The accept ratio: this factor sets the potential, as a fraction of the potential of the first cluster center, above which another data point is accepted as a cluster center. This constant is between 0 and 1. In this paper, the best performance is achieved for 0.9.

  • The reject ratio: this factor sets the potential, as a fraction of the potential of the first cluster center, below which a data point is rejected as a cluster center. This constant is between 0 and 1. In this paper, the best performance is achieved for 0.85.
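The role of the four parameters above can be illustrated with a compact version of subtractive clustering. The Python sketch below follows a commonly used formulation of the method; it is an assumption that the toolbox routine used in this study behaves in the same way, and the constant 4 in the exponents, the default parameter values, and the six data points are illustrative only.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def subtractive_clustering(points, radius, squash=1.25,
                           accept_ratio=0.5, reject_ratio=0.15):
    """Return cluster centers for points assumed to lie in a unit hyperbox."""
    alpha = 4.0 / radius ** 2              # range of influence of a center
    beta = 4.0 / (squash * radius) ** 2    # wider neighborhood used for squashing
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        peak = potential[k]
        # Stop when the best remaining potential falls below the reject ratio.
        if peak <= 0.0 or peak < reject_ratio * first_peak:
            break
        # Between the reject and accept ratios: accept only if the candidate
        # is far enough from the existing centers.
        if centers and peak <= accept_ratio * first_peak:
            d_min = min(math.sqrt(sq_dist(points[k], c)) for c in centers)
            if d_min / radius + peak / first_peak < 1.0:
                potential[k] = 0.0
                continue
        centers.append(points[k])
        # Squash the potential of points close to the new center.
        potential = [p - peak * math.exp(-beta * sq_dist(q, points[k]))
                     for p, q in zip(potential, points)]
    return centers


# Hypothetical data: two tight groups in the unit square.
points = [(0.1, 0.1), (0.12, 0.1), (0.1, 0.12),
          (0.9, 0.9), (0.88, 0.9), (0.9, 0.88)]
centers = subtractive_clustering(points, radius=0.5)
print(centers)
```

The values reported in this paper (range of influence 1.0, squash factor 2.0, accept ratio 0.9, reject ratio 0.85) can be passed through the same arguments. Each resulting center would then define one rule of the initial fuzzy inference system.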

The next step is to choose how to train the FIS. In this paper, the hybrid optimization method is used, which combines least-squares estimation with the backpropagation gradient descent method. Two membership functions are considered for each input. The error tolerance is set to 1 × 10⁻² and is reached at the second epoch. The best achieved performance is MAE = 0.029185. This result is poorer than those of the two other neural networks, but the estimation is obtained in a very small number of epochs (two epochs).

One can see that the MLP network trained by the LM algorithm performs better than the other networks in all the error indexes. The RBF network also estimates the collapse potential better than the ANFIS network, and its error indexes are very close to those of the MLP. The Ic (%) values estimated by the MLP network trained with the different algorithms for samples 10 and 18 are presented in Table 2.

Data analysis applied on neural networks

In this section, the simulated neural networks are the same as the last section and their structures do not change.

The first case—related parameters suppression

In this subsection, the following inputs are omitted from simulations, as they are related to other important inputs (by equations (Salehi 2011) and/or by conditions):

  • The degree of saturation (S) which is related to the natural moisture content (w) by the following equation:

    $$ S\cdot e={G}_{\mathrm{s}}\cdot w $$
    (1)

    where e is the void ratio and \( {G}_{\mathrm{s}} \) is the specific gravity of the soil solids.

  • The dry density (\( {\gamma}_{\mathrm{d}} \)), which is related to the void ratio (e) by the following equation:

    $$ {\gamma}_{\mathrm{d}}=\frac{G_{\mathrm{s}}\cdot {\gamma}_{\mathrm{w}}}{1+e} $$
    (2)

    where \( {\gamma}_{\mathrm{w}} \) is the water density.

  • The liquid limit (\( {l}_{\mathrm{l}} \)), which is related to the inverse of the liquidity index (\( {l}_{\mathrm{i}} \)) by the following equation:

    $$ {l}_{\mathrm{i}}=\frac{w-{P}_{\mathrm{l}}}{l_{\mathrm{l}}-{P}_{\mathrm{l}}} $$
    (3)

    where \( {P}_{\mathrm{l}} \) is the plastic limit.

  • The rainfall which is related to the natural moisture content (w) by conditions.

  • The vegetation.
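Equations (1) to (3) show why the omitted inputs carry little independent information: each can be computed from inputs that are kept. The Python sketch below simply evaluates the three relations; the function names and the numerical values are hypothetical, and the water unit weight of 9.81 kN/m³ is an assumed default.

```python
def degree_of_saturation(w, e, gs):
    """Eq. (1): S * e = Gs * w, solved for S (w and S as fractions)."""
    return gs * w / e


def dry_unit_weight(e, gs, gamma_w=9.81):
    """Eq. (2): gamma_d = Gs * gamma_w / (1 + e); gamma_w in kN/m^3 (assumed)."""
    return gs * gamma_w / (1.0 + e)


def liquidity_index(w, plastic_limit, liquid_limit):
    """Eq. (3): l_i = (w - P_l) / (l_l - P_l), all in percent."""
    return (w - plastic_limit) / (liquid_limit - plastic_limit)


# Hypothetical soil: w = 10 %, e = 0.8, Gs = 2.7, P_l = 18 %, l_l = 30 %.
print(degree_of_saturation(w=0.10, e=0.8, gs=2.7))
print(dry_unit_weight(e=0.8, gs=2.7))
print(liquidity_index(w=20.0, plastic_limit=18.0, liquid_limit=30.0))
```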

“The performance verification samples” are the same (samples 10 and 18). The corresponding numerical results are presented in Table 3. One can see that the RBF network performs better than the other networks in all the error indexes.

Table 3 The collapse potential estimation results using different neural networks in the case of related parameters suppression

The second case—changing the numbers of train data and test data

Until now, the size of the input vector was 17 × 17, and the size of the target vector was 1 × 17. Two samples were also chosen to be used as “the performance verification samples” (17 × 2 inputs and 1 × 2 targets). But in this subsection, we will change the noted sizes of input and test vectors (the performance verification samples).

In the first step, 7 samples are chosen as the test samples, and thus the remaining 12 samples are used as the neural network inputs. The corresponding numerical results are presented in Table 4. One can see that the MLP network performs better than the other networks in all the error indexes.

Table 4 The collapse potential estimation results using different neural networks in the case of 7 test samples and 12 input samples

In the second step, 4 samples are chosen as the test samples, and thus the remaining 15 samples are used as the neural network inputs. The corresponding numerical results are presented in Table 5. One can see that the MLP network performs better than the other networks in all the error indexes.

Table 5 The collapse potential estimation results using different neural networks in the case of 4 test samples and 15 input samples

The third case—in-depth analysis of samples

As noted before, there are 19 soil samples from seven zones. In each zone, the first sample is the shallowest one and the last sample is the deepest one below the surface. In this subsection, we analyze the effect of choosing either the shallowest or the deepest samples as the test data. Thus, 7 samples are used as the test samples.

In the first step, the shallowest samples are chosen as the test data.

The corresponding numerical results are presented in Table 6. One can see that the MLP network again performs better than the other networks in all the error indexes.

Table 6 The collapse potential estimation results using different neural networks in the case of the shallowest samples as test data

In the second step, the deepest samples are chosen as the test data.

The corresponding numerical results are presented in Table 7. One can see that the MLP network again performs better than the other networks in all the error indexes.

Table 7 The collapse potential estimation results using different neural networks in the case of the deepest samples as test data
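The two depth-based splits used in this subsection amount to holding out one sample per zone. The Python sketch below shows one way to build such a split; the sample identifiers, zone labels, and depths are entirely hypothetical and do not reproduce the 19 samples of this study.

```python
def split_by_depth(samples, pick):
    """Hold out one sample per zone as test data.

    samples: list of (sample_id, zone, depth) tuples.
    pick: "shallowest" or "deepest".
    """
    if pick not in ("shallowest", "deepest"):
        raise ValueError("pick must be 'shallowest' or 'deepest'")
    by_zone = {}
    for sample in samples:
        by_zone.setdefault(sample[1], []).append(sample)
    chooser = min if pick == "shallowest" else max
    test = [chooser(group, key=lambda s: s[2]) for group in by_zone.values()]
    test_ids = {s[0] for s in test}
    train = [s for s in samples if s[0] not in test_ids]
    return train, test


# Hypothetical samples: (id, zone, depth in m).
samples = [(1, "A", 2.0), (2, "A", 5.0),
           (3, "B", 1.0), (4, "B", 3.0), (5, "B", 6.0)]
train_shallow, test_shallow = split_by_depth(samples, "shallowest")
train_deep, test_deep = split_by_depth(samples, "deepest")
print([s[0] for s in test_shallow], [s[0] for s in test_deep])
```

With seven zones, either choice yields the 7 test samples and 12 training samples described above.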

Conclusion

In this paper, the prediction of the collapse potential of loess soils in Golestan Province is investigated. The estimation is carried out by using three different neural networks: MLP, RBF, and ANFIS. Simulation results and comparison studies are presented to demonstrate the effectiveness and performance of the proposed networks. Numerical results show that the best estimation is achieved by the MLP network with the Levenberg-Marquardt backpropagation training algorithm, although the error indexes of the RBF network are very close to those of the MLP network. The designed ANFIS is not suitable for this estimation because of its high MSE (relative to the other networks), but it converges in just two epochs, while the MLP network takes six epochs to reach its best performance. Finally, the three neural networks have been tested on different data sets. Firstly, five related parameters were omitted, and the simulations showed that the RBF network estimates the collapse potential slightly better in this case. Secondly, the numbers of training and test data were changed in two ways, and the numerical results showed that the MLP network trained by the LM algorithm estimates the collapse potential better than the RBF network. In the last case, the test samples were chosen first from the shallowest samples and then from the deepest samples. From the simulation results, one can conclude that the MLP network trained by the LM algorithm estimates the collapse potential more precisely than the RBF network, although for some of the test samples the RBF network gives estimates very close to those of the MLP.