1 Introduction

Since the soil liquefaction was found to be the main cause of engineering failure and natural disaster, many researchers have been trying to predict and evaluate the liquefaction potential of one certain in situ in advance, so that the relevant loss can be relieved. The simplified procedure of comparing the cyclic shear ratio CSR with the cyclic resistance ratio CRR proposed by Seed and Idriss (1971) is a milestone of developing the prediction model of soil liquefaction. This procedure is always based on the in situ tests, which mainly consist of the cone penetration test (CPT) and standard penetration test (SPT). Comparing to the SPT, a primary advantage of the CPT is the nearly continuous information provided along with the depth of the target stratum, and is also more consistent and repeatable (Kohestani et al. 2015), so many researchers have been proposing and developing the empirical prediction model based on the CPT data. Robertson and Wride (1998) described a new method to estimate grain characteristics directly from the CPT and to incorporate this into one of the methods for evaluating cyclic resistance. Olsen (1997) described the cone resistance-based and the CPT soil characterization chart-based techniques for estimating liquefaction resistance. Juang et al. (2000) developed a mapping function between the factor of safety and the actual probability of liquefaction based on field case records; Juang et al. (2008) presented a piezocone penetration test (CPTu) method for evaluating soil liquefaction potential covering a more comprehensive range of soil types than previous approached and using the simplified stress-based procedure. Moss et al. (2006) presented a complete methodology for both probabilistic and deterministic assessment of seismic soil liquefaction triggering potential based on CPT. Idriss and Boulanger (2004) recommended revised CPT-based liquefaction correlations for use in practice based on these re-evaluations of the CPT case history databases.

The models above are established by deriving empirical formulations for cyclic resistance ratio (CRR) and cyclic shear ratio (CSR) based on massive historical data and in situ test data, including physical index such as cone tip resistance, overburden loading, water depth et al., they are unable to adequately learn and reflect the complexity of mesoscopic mechanism behind the soil liquefaction. As the development of the technology of computer science and machine learning, they are gradually adopted into the actual engineering prediction, such as water inrush incident (Yonggang Zhang & Yang, 2020), displacement prediction of the landslide (Safa et al., 2020; Zhang et al. 2020a, b), underground mining (Zhao et al. 2020) and concrete technology (Shariati et al. 2020a, b; Shariati et al. 2020a, b). Besides, they are also tried to learn the relationship between soil liquefaction and influencing factors. Samui (2007) proposed the use of the Relevance Vector Machine (RVM) to determine the liquefaction potential of soil by using actual cone penetration test (CPT) data. Goh and Goh (2007) trained the support vector machines (SVM) model and tested it on 226 field records of liquefaction performance and cone penetration test measurements. Sadoghi Yazdi et al. (2012) employed the Support Vector Data Description (SVDD) strategy to ‘‘up sample’’ the minority data to overcome learning bias to the majority class in the prediction model. Xue and Yang (2013) developed an integrated fuzzy neural network model, called Adaptive Neuro-Fuzzy Inference System (ANFIS) which was revealed to be capable of representing the complicated relationship between seismic properties of soils and liquefaction potential. Muduli and Das (2015) developed an empirical model for determining the CRR using multigene genetic programming (MGGP) based on the post-liquefaction CPT data. Kohestani et al. (2015) and Nejad et al. (2018) developed the random forest (RF) models on 226 and 415 field records, respectively, and indicated RF models provided more accurate results by comparing with the available artificial neural network (ANN) and SVM models. Ahmad et al. (2019) investigated the performance of Bayesian belief network (BBN) and C4.5 decision tree (DT) models to evaluate soil liquefaction, showed that the BBN model was preferred over the other approaches for evaluation of seismic soil liquefaction potential. Das et al. (2020) proposed and applied multi-objective feature selection algorithms (MOFS) to highly unbalanced databases of in situ tests including SPT, CPT, and Vs, to effectively select the optimal parameters and simultaneously minimize the error.

To make the development of machining learning be better adopted in soil liquefaction, and provide more choices of establishing prediction model, a new ANNs called ELM which is faster than traditional methods and suitable for predicting nonlinear problems was introduced to predict the soil liquefaction based on CPT. Firstly, seven prediction parameters based on the in situ CPT were analyzed and determined; then the 226 samples cited from Juang et al. (2003) were divided into the training set and test set; by comparing the training accuracy and speed of model when setting the number of the neuron of the hidden layer from 5 to 16 and the activation function as \({\text {sig}}\), \({\text {sin}}\), \({\text {hardlim}},\) the parameter of ELM model was assured. Finally, the performance of this ELM model was tested through the test set, and the cause of low overall accuracy was analyzed, its ability to predict the happening of soil liquefaction was also indicated.

2 Determination of the training database

2.1 The selection and analysis of parameters

According to the framework of the simplified procedure of evaluation of soil liquefaction (Seed and Idriss 1971), the \({\text{CSR}}\) and \({\text{CRR}}\) need to be calculated and compared. The \({\text{CSR}}\) is generally obtained by using Eq. (1) in actual engineering.

$$ {\text{CSR}} = 0.65\frac{{a_{{{\text{max}}}} }}{g}\frac{{\sigma_{v0} }}{{\sigma_{v0}^{^{\prime}} }}r_{d} $$
(1)

where \(a_{{{\text{max}}}}\) is the peak acceleration of surface, \(\sigma_{v0}\), \(\sigma_{v0}^{^{\prime}}\) is the total overburden stress and effective overburden stress, respectively; \(r_{d}\) is the reduction index of shear stress, it is calculated by Eq. (2);

$$ r_{d} = \exp \left[ {a\left( z \right) + \beta \left( z \right)M_{w} } \right] $$
$$ a\left( z \right) = - 1.012 - 1.126{\text{sin}}\left( {\frac{z}{11.73} + 5.133} \right) $$
$$ \beta \left( z \right) = 0.106 + 0.118{\text{sin}}\left( {\frac{z}{11.28} + 5.142} \right) $$
(2)

where \(a\left( z \right)\) and \(\beta \left( z \right)\) are functions of depth \(z\), the concrete expression can be checked in Idriss (1999), \(M_{w}\) is the seismic magnitude.

As for the cyclic resistance ratio \({\text{CRR}}_{7.5}\) it can be estimated by using Eq. (3) proposed by Robertson and Wride (1998) and Robertson (2009).

$$ {\text{CRR}}_{7.5} = \left\{ {\begin{array}{*{20}l} {0.833\left( {\frac{{\left( {q_{c1N} } \right)_{cs} }}{1000}} \right)^{3} + 0.05 \left( {50 \le \left( {q_{c1N} } \right)_{cs} < 160} \right)} \hfill \\ {93\left( {\frac{{\left( {q_{c1N} } \right)_{cs} }}{1000}} \right)^{3} + 0.08 \left( {\left( {q_{c1N} } \right)_{cs} < 50} \right)} \hfill \\ \end{array} } \right. $$
(3)

where the \(\left( {q_{c1N} } \right)_{cs}\) is the normalized penetration resistance of pure sand, it is related to the normalized penetration resistance \(q_{c1N}\) in Eq. (4).

$$ \left( {q_{c1N} } \right)_{cs} = K_{c} q_{c1N} $$
(4)

where the value of \(K_{c}\) is decided according to the value of \(I_{c}\) in Eq. (5), the \(q_{c1N}\) is the normalized penetration resistance which can be calculated in Eq. (6).

$$ I_{c} = \left[ {\left( {3.47 - lgQ} \right)^{2} + \left( {1.22 + lgF} \right)^{2} } \right]^{0.5} $$
(5)
$$ q_{c1N} = \left( {q_{c} /p_{a2} } \right)\left( {p_{a} /\sigma_{v0}^{^{\prime}} } \right)^{0.75} $$
(6)

where \(Q\), \(F\) is the modified tip resistance \(q_{c}\) and sleeve friction \(f_{s}\).

So the selected parameters are \(M_{w}\), \(\sigma_{v0}\), \(\sigma_{vo}^{^{\prime}}\), \(a_{{{\text{max}}}}\), \(z\), \(q_{c}\), \(f\).

Another reason of choosing \(M_{w}\) is that the seismic magnitude generally represents the seismic energy, under the same condition, the higher value of \(M_{w}\) means the more energy input into the stratum, the liquefaction happens more easily; however, due to the different thickness of the soil layer and other reasons, the surface produces a different intensity of vibration for the different places within the same range of intensity, thus the seismic magnitude \(M_{w}\) and the peak surface acceleration \(a_{\max }\) are required simultaneously. The total overburden stress \(\sigma_{v0}\) and effective overburden stress \(\sigma_{vo}^{^{\prime}}\) represent the total and effective vertical axial stress loaded on the target stratum, respectively; they can also indirectly represent the total and effective lateral axial stress if the lateral pressure coefficient is fixed; moreover, the corresponding effective overburden stress \(\sigma_{vo}^{^{\prime}}\) includes not only the information of the depth of underwater but also the contact force among grains. The \(q_{c}\), \(f\) can reflect the relative density of the target stratum, besides they can also indirectly reflect the friction coefficient and the particle geometry. Thus, after the analysis above, it can be indicated these selected macroscopic parameters contain the information of the mesoscopic property of the target stratum, so it is possible to dig out the inner relationship with developed computer technology.

2.2 The introduction of CPT database

The database used in this paper was cited from the summary of 226 cases from Juang et al. (2003), this database collects CPT data from over 52 sites including 6 different earthquakes, the detailed information can be checked in the paper above. Part of the database is listed in Table 1. \(LI\) represents the index of liquefaction, 0 and 1 corresponds to the unhappen and happens of the liquefaction. The \(f\) is symbolized by the friction ratio \(R_{f}\) in Eq. (7).

$$ R_{f} = f/q_{c} \times 100\% $$
(7)
Table 1 Part of the CPT database (Hsein Juang et al. 2003)

According to the study of Zhang and Gu (2005), there is a linear relationship between clay content \({\text{FC}}\) and \(R_{f}\) as Eq. (8), so the value of \(R_{f}\) in the database can indirectly reflect the clay content.

$$ {\text{FC}} = 7.28R_{f} + 0.58 $$
(8)

3 The introduction of the ELM

The structure of the ELM is illustrated in Fig. 1, and its topological structure is similar to other networks, such as BP single hidden layer neural network and RBF neural network, it is a new kind of single hidden neural network. It is characterized by online simulation, dynamic calculation, and faster speed than traditional methods; ELM is suitable for all kinds of model calculation and can avoid common problems of local optimal and too many iterations without affecting the training speed. It had been used to predict the displacement of landslides (Huang et al. 2017), the prediction of the strength of concrete (Shariati et al. 2020a, b), and the photovoltaic power output (Zhou et al. 2020).

Fig. 1
figure 1

The structure of the extreme learning machine

During the adoption, the weight and the threshold values between input layer and neuron will be randomly generated after the number of neurons is set; then, these values will be input into the feature space of ELM through the activation function; then real values are obtained by using the method of mathematics solving; finally, the trained network will be verified by the test set.

The scheme of establishing the relationship between the sand liquefaction determination result and relevant parameters is explained as follows:

  1. (1)

    The liquefaction and non-liquefaction CPT samples \(\left( {x_{i} ,y_{i} } \right)\) are divided into the training set and test set according to the optimist proportion ratio, the training set is used to calibrate the coefficients of prediction model while the test set is used to exam the performance of this model; \(x_{i}\) and \(y_{i}\) represent the input data and output data, respectively.

  2. (2)

    Set the number of the neuron of the hidden layer as \(N\), the potential activation function includes \(sig\), \(sin\), \(hardlim\), then train the model. The specific formulation is expressed as follows;

    $$ O_{i} = y_{i} = \mathop \sum \limits_{i = 1}^{D} \beta_{i} g\left( {x_{i} } \right) = \mathop \sum \limits_{i = 1}^{D} \beta_{i} g\left( {mx_{i} + n_{i} } \right) $$
    (9)

    where \(O_{i}\) is the output value, \(D\) represents the number of the neuron, \(m_{i}\) and \(n_{i}\) means the connection weights and thresholds between the ith neuron and the input layer nodes, respectively; \(\beta_{i}\) is the weight vector between the ith neuron and the output layer.

  3. (3)

    The value of \(\beta_{i}\) is calculated by fitting the result of the model to the actual result of the training sample with zero error in Eq. (10)

    $$ \sum\limits_{{i = 1}}^{D} {\left\| {O_{i} - t_{i} } \right\|} = 0 $$
    (10)

    where the \(t_{i}\) is the actual result. By combining Eqs. (9) and (10), the matrix form of \(\beta_{i}\) can be expressed in Eq. (11).

    $$ \beta = H^{ + } T = \left( {H^{T} H} \right)^{ - 1} H^{T} T $$
    (11)

    where \(H = g\left( {mx + n} \right)\), \(T\) is the matrix form of \(t_{i}\), \(H^{ + }\) is the Moore–Penrose generalized inverse of the hidden layer output matrix \(H\).

4 The establishment of the prediction model

4.1 The setting of model parameter

In the process of adopting the ELM into the liquefaction prediction based on CPT, the optimist network topology can be obtained by trial and error method; the number of the neuron was tried to be determined by setting this number from 5 to 16 when the activation was fixed, the corresponding accuracy of the training set and test set is illustrated in Fig. 2a. It was shown the value of these two accuracies was 81.43% and 87.5%, respectively, which is the highest when the number of the neuron was 12.

Fig. 2
figure 2

The accuracy of different number of the neuron

Simultaneously, the variation of operation rate with the increase in the number of the neuron is shown in Fig. 3a, it changed from 1.50 to 1.52 s; it can be accepted this rate was not the fastest when this number was 12 considering the high accuracy; thus, this number was assured as 12. Besides, to verify the advantage of CPT over SPT, the ELM model established through the same procedure based on the SPT data is cited in Figs. 2 and 3. It can be seen the highest accuracy of the training set and test set of CPT model are all higher than that of SPT model, and the operation time of CPT model in Fig,3(a) is lower than that of SPT model in Fig. 3b. So the model of CPT performs better than that of SPT.

Fig. 3
figure 3

The operation rate of different number of the neuron

After that, the activation function was chosen from \(sig\), \(sin\), \(hardlim\) by comparing the accuracy and operation rate of them when the number of the neuron was 12; the results were listed and are indicated in Table 2, the function \(sig\) should be selected because of its highest accuracy and fastest rate. Finally, according to the number of parameters and liquefaction index, the structure of the ELM model was decided as 7-12-1.

Table 2 Accuracy of three activation function

4.2 The analysis of prediction result

The performance of the ELM model on the test set is illustrated in Fig. 4, among the 16 samples, 14 samples were predicted correctly; the overall accuracy was 87.5% which was a decent result. To furtherly explain the advantage of ELM, the specific test result is listed in Table 3.

Fig. 4
figure 4

The performance of the trained model

Table 3 Prediction of Liquefied test sample

It can be found that the accuracy of 7 liquefied cases was 100%, but that of 9 non-liquefied cases was 77.78%; thus the overall accuracy of 87.5% was due to the low accuracy of the non-liquefied case, which can be explained as the low proportion of non-liquefied samples in CPT database. So the accuracy of the liquefied set can satisfy the request of the engineering.

5 Conclusions

Considering the advantage of CPT over the SPT and the complexity of the mesoscopic mechanism of the soil liquefaction, the ELM was tried to train a prediction model based on 266 CPT samples. The ELM model based on CPT was established by choosing function \({\text{sin}}\) and 12 neurons, in which the accuracy of using was 81.43% and 87.50% for the training set and test set, respectively; at the same time, the operation was 1.5055 s which was not much different from other two functions. It was indicated that the prediction model based on CPT perform better than that of SPT and can obtain a highly accurate prediction of 100% for the liquefied case and overall accuracy of 87.5% which can be promoted by adding more non-liquefied cases into the training set. ELM was proved to be feasible to be used and developed into the in situ evaluation.