1 Introduction

The uplift capacity (P) of suction caissons is an important parameter in the design of offshore structures, so its determination is an important task in ocean engineering. The value of P depends on several factors: the passive suction developed under the sealed caisson cap, the self-weight of the caisson, the frictional resistance along the soil–caisson interface, the submerged weight of the soil plug inside the caisson and the uplift (reverse end bearing) pressure of the soil (Albert et al. 1987; Rauch 2004). Researchers have used different methods for the determination of P of suction caissons (Goodman et al. 1961; Hogervorst 1980; Tjelta et al. 1986; Larsen 1989; Steensen-Bach 1992; Dyvik et al. 1993; Clukey and Morrison 1993; Whittle and Kavvadas 1994; Clukey et al. 1995a, b; Cauble 1996; Datta and Kumar 1996; Singh et al. 1996; Rao et al. 1997a, b; El-Gharbawy and Olson 2000; Zdravkovic et al. 2001; Cao et al. 2001, 2002a, b; Luke 2002; Cho et al. 2002). Clukey et al. (1995) determined the response of suction caissons in normally consolidated clays under cyclic TLP loading conditions. Cauble (1996) conducted experiments on the behavior of suction caissons in clay. Cao et al. (2001) discussed centrifuge test results for suction caissons in clay. Luke (2002) described experimental results on the axial pullout of suction caissons. El-Gharbawy and Olson (2000) used the finite element method (FEM) to verify experimental results. The available methods have their own limitations (Rahman et al. 2001). The artificial neural network (ANN) has been successfully used for the prediction of P of suction caissons (Rahman et al. 2001). However, ANN has several drawbacks, such as low convergence speed, low generalization capability, its "black-box" character and the overtraining problem (Park and Rilett 1999; Kecman 2001).

This article examines the capability of Gaussian process regression (GPR), minimax probability machine regression (MPMR) and the extreme learning machine (ELM) for the determination of P of suction caissons. This study adopts the database collected from the work of Rahman et al. (2001). The database contains information about L/d (where L is the embedded length of the caisson and d is the diameter of the caisson), the undrained shear strength of the soil at the depth of the caisson tip (su), D/L (where D is the depth of the load application point below the soil surface), the load inclination angle (θ), the load rate parameter (Tk) and P. Table 1 shows the statistical parameters of the dataset. GPR is developed within a Bayesian framework, in which the model parameters are treated as random variables. Researchers have successfully applied GPR to different engineering problems (Kongkaew and Pichitlamken 2012; Chen et al. 2013, 2014; Cheng et al. 2015; Alborzpour et al. 2016). MPMR is a probabilistic model developed by Lanckriet et al. (2002a, b). Different applications of MPMR are available in the literature (Zhou et al. 2013; Shen et al. 2013; Yang and Ju 2014; Huang et al. 2015; Yang and Sun 2016). ELM is a modified version of the single-hidden-layer feed-forward network (SLFN) (Huang et al. 2006, 2011) and gives excellent generalization performance (Huang et al. 2006). Many applications of ELM are available in the literature (Lu and Sho 2012; Xie et al. 2013; Bazi et al. 2014; Singh et al. 2015; Zhang et al. 2016). The developed GPR, MPMR and ELM models have been compared with an ANN model.

Table 1 Statistical parameters of the dataset

1.1 Details of GPR

Let us consider the following datasets (D).

$$D = \left\{ {\left( {x_{i} ,y_{i} } \right)} \right\}_{i = 1}^{N} ,$$
(1)

where x is called input, y is called output and N is the number of datasets.

This article uses L/d, D/L, su and θ as input variables. The output of GPR is P.

So, \(x = \left[ L/d, D/L, s_{u}, \theta \right]\) and \(y = \left[ P \right]\).

GPR uses the following equation for prediction of y.

$$y_{i} = f\left( {x_{i} } \right) + \varepsilon_{i} ,$$
(2)

where f is a latent real-valued function and ε is the observation error.

For a new input \(x_{N + 1}\), the joint distribution of the training outputs y and the new output \(y_{N + 1}\) is given as follows:

$$\left( {\begin{array}{*{20}c} y \\ {y_{N + 1} } \\ \end{array} } \right)\sim{\rm N}\left( {0,K_{N + 1} } \right),$$
(3)

where \(K_{N + 1}\) is the covariance matrix, whose expression is given as follows:

$$K_{N + 1} = \left[ {\begin{array}{*{20}c} {\left[ K \right]} & {\left[ {k\left( {x_{N + 1} } \right)} \right]} \\ {\left[ {k\left( {x_{N + 1} } \right)^{T} } \right]} & {\left[ {k_{1} \left( {x_{N + 1} } \right)} \right]} \\ \end{array} } \right],$$
(4)

where \(k\left( {x_{N + 1} } \right)\) is the vector of covariances between the training inputs and the test input, and \(k_{1} \left( {x_{N + 1} } \right)\) denotes the autocovariance of the test input.

The distribution of yN+1 is Gaussian (Williams 1998), and its mean \(\left( {\mu \left( {x_{N + 1} } \right)} \right)\) and variance \(\left( {\sigma^{2} \left( {x_{N + 1} } \right)} \right)\) are given as follows:

$$\mu \left( {x_{N + 1} } \right) = k\left( {x_{N + 1} } \right)^{T} K^{ - 1} y,$$
(5)
$$\sigma^{2} \left( {x_{N + 1} } \right) = k_{1} \left( {x_{N + 1} } \right) - k\left( {x_{N + 1} } \right)^{T} K^{ - 1} k\left( {x_{N + 1} } \right).$$
(6)

For developing the above GPR, the datasets have been divided into the following two groups:

Training dataset: This is used to construct the GPR model. This study adopts 51 datasets out of 62 as the training dataset (see Table 2).

Table 2 Training datasets

Testing dataset: This is used to verify the developed GPR model. The remaining 11 datasets have been used as the testing dataset (see Table 3).

Table 3 Testing datasets

The datasets are scaled between 0 and 1. The radial basis function (\(\exp \left[ {\frac{{ - \left( {x_{i} - x} \right)\left( {x_{i} - x} \right)^{T} }}{{2\sigma^{2} }}} \right]\), where σ is the width of the radial basis function) has been used as the covariance function. The GPR program has been written in MATLAB.
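A minimal sketch of the GPR predictor of Eqs. (5) and (6) is given below in Python with NumPy (the study's own programs were written in MATLAB). The function names and the way the noise term eps is added to the diagonal of K are illustrative assumptions, not the authors' code; the default parameter values follow the tuning reported in the Results section.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Radial basis function covariance between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def gpr_predict(X_train, y_train, X_test, sigma=0.5, eps=0.003):
    """Predictive mean and variance of GPR, Eqs. (5) and (6)."""
    # K with the observation-noise term added to the diagonal
    K = rbf_kernel(X_train, X_train, sigma) + eps * np.eye(len(X_train))
    k_star = rbf_kernel(X_train, X_test, sigma)           # k(x_{N+1})
    k_ss = rbf_kernel(X_test, X_test, sigma).diagonal()   # k_1(x_{N+1})
    mean = k_star.T @ np.linalg.solve(K, y_train)         # Eq. (5)
    var = k_ss - np.einsum('ij,ij->j', k_star,
                           np.linalg.solve(K, k_star))    # Eq. (6)
    return mean, var
```

The inputs are assumed to be already scaled between 0 and 1, as described above.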

1.2 Details of MPMR

In MPMR, the relation between input (x) and output (y) is given as follows:

$$y = \sum\limits_{i = 1}^{N} {\beta_{i} K\left( {x_{i} ,x} \right) + b} ,$$
(7)

where N is the number of datasets, K(xi, x) is the kernel function, and βi and b are the outputs of the MPMR algorithm.

This article uses L/d, D/L, su and θ as input variables. The output of MPMR is P.

So, \(x = \left[ L/d, D/L, s_{u}, \theta \right]\) and \(y = \left[ P \right]\).

The following two classes are created from the available training datasets.

$$u_{i} = \left( {y_{i} + t,x_{i1} ,x_{i2} , \ldots ,x_{id} } \right),$$
(8)
$$v_{i} = \left( {y_{i} - t,x_{i1} ,x_{i2} , \ldots ,x_{id} } \right).$$
(9)

where xi1, …, xid are the components of the input xi and t is the amount by which each output yi is shifted to create the two classes. The classification boundary between the two classes is taken as the regression surface.

MPMR uses the same training datasets, testing datasets and normalization technique as the GPR model. The radial basis function has been used as the kernel function. The MPMR program has been written in MATLAB.
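The construction of the two shifted classes in Eqs. (8) and (9) is sketched below in Python. Only the data-augmentation step is shown; solving the resulting minimax probability classification problem for βi and b is omitted, and the default t = 0.007 is taken from the tuning reported in the Results section.

```python
import numpy as np

def mpmr_classes(X, y, t=0.007):
    """Build the two shifted classes u_i and v_i of Eqs. (8) and (9).

    Each training pair (x_i, y_i) produces one point with its output
    shifted up by t and one shifted down by t; the MPMR regression
    surface is the boundary separating the two classes.
    """
    u = np.column_stack([y + t, X])   # Eq. (8)
    v = np.column_stack([y - t, X])   # Eq. (9)
    return u, v
```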

1.3 Details of ELM

Let us consider the following N training samples.

$$\left\{ {\left( {x_{i} ,y_{i} } \right)} \right\}_{i = 1}^{N} ,$$
(10)

where x is called input and y is called output.

This article uses L/d, D/L, su and θ as input variables. The output of ELM is P.

So, \(x = \left[ L/d, D/L, s_{u}, \theta \right]\) and \(y = \left[ P \right]\).

A single-hidden-layer feed-forward network (SLFN) uses the following equation for the prediction of y.

$$\sum\limits_{i = 1}^{L} {\beta_{i} G\left( {a_{i} ,x_{j} ,b_{i} } \right)} = y_{j} ,\quad \, j = 1, \ldots ,N,$$
(11)

where L is the number of hidden nodes (not to be confused with the embedded caisson length), βi is the output weight connecting the ith hidden node to the output and G(ai, xj, bi) is the activation function of the ith hidden node with parameters ai and bi.

Equation (11) can be written in matrix form as follows.

$$H\beta = T,$$
(12)

where \(H = \left[ {\begin{array}{*{20}c} {G\left( {a_{1} ,x_{1} ,b_{1} } \right)} & \cdots & {G\left( {a_{L} ,x_{1} ,b_{L} } \right)} \\ \vdots & \ddots & \vdots \\ {G\left( {a_{1} ,x_{N} ,b_{1} } \right)} & \cdots & {G\left( {a_{L} ,x_{N} ,b_{L} } \right)} \\ \end{array} } \right]_{N \times L}\), \(\beta = \left[ {\begin{array}{*{20}c} {\beta_{1}^{\text{T}} } \\ \vdots \\ {\beta_{L}^{\text{T}} } \\ \end{array} } \right]_{L \times m}\) and \(T = \left[ {\begin{array}{*{20}c} {t_{1}^{\text{T}} } \\ \vdots \\ {t_{N}^{\text{T}} } \\ \end{array} } \right]_{N \times m}\).

The value of β is determined from the following expression:

$$\beta = H^{\dagger} T,$$
(13)

where \(H^{\dagger}\) is the Moore–Penrose generalized inverse of H (Serre 2002).

ELM uses the same training datasets, testing datasets and normalization technique as the GPR and MPMR models. The ELM program has been written in MATLAB.
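A minimal ELM sketch following Eqs. (11)–(13) is given below in Python (the study's own program was written in MATLAB). The random initialization of the hidden-node parameters ai and bi and the exact form of the RBF activation are assumptions of this sketch; ten hidden nodes are used, as reported in the Results section.

```python
import numpy as np

rng = np.random.default_rng(0)

def hidden_output(X, a, b):
    """RBF activation G(a_i, x_j, b_i) = exp(-b_i * ||x_j - a_i||^2)."""
    sq = ((X[:, None, :] - a[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-b[None, :] * sq)

def elm_train(X, y, n_hidden=10):
    """Compute the output weights beta via Eqs. (12) and (13).

    The centers a_i and widths b_i are drawn at random and never
    retrained; only beta is fitted, using the Moore-Penrose
    pseudoinverse of H.
    """
    a = rng.uniform(0.0, 1.0, size=(n_hidden, X.shape[1]))  # random centers
    b = rng.uniform(0.1, 1.0, size=n_hidden)                # random widths
    H = hidden_output(X, a, b)            # Eq. (12)
    beta = np.linalg.pinv(H) @ y          # Eq. (13)
    return a, b, beta

def elm_predict(X, a, b, beta):
    """Eq. (11): y_j = sum_i beta_i * G(a_i, x_j, b_i)."""
    return hidden_output(X, a, b) @ beta
```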

2 Results and Discussion

The performance of GPR depends on the proper choice of ε and σ. The design values of ε and σ have been determined by a trial-and-error approach. The developed GPR gives its best performance at ε = 0.003 and σ = 0.5. Figure 1 shows the plot of actual versus predicted P for the training and testing datasets. The performance of GPR has been assessed in terms of the coefficient of correlation (R). For a good model, the value of R should be close to one. As shown in Fig. 1, the value of R is close to one for the training as well as the testing datasets.
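The trial-and-error tuning of ε and σ can be mechanized as a simple grid search, sketched below under the assumption that candidate pairs are scored by the correlation coefficient R; gpr_predict is the hypothetical helper from the GPR sketch above, and the grid values are illustrative. Scoring on a held-out split rather than the training data would be the more cautious choice.

```python
import numpy as np

def tune_gpr(X, y, sigmas=(0.1, 0.3, 0.5, 0.8, 1.0),
             epsilons=(0.001, 0.003, 0.01)):
    """Grid search for (sigma, eps) maximizing R between y and predictions."""
    best_sigma, best_eps, best_r = None, None, -np.inf
    for s in sigmas:
        for e in epsilons:
            pred, _ = gpr_predict(X, y, X, sigma=s, eps=e)
            r = np.corrcoef(y, pred)[0, 1]
            if r > best_r:
                best_sigma, best_eps, best_r = s, e, r
    return best_sigma, best_eps, best_r
```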

Fig. 1 Performance of the GPR model

The success of MPMR depends on the proper choice of t and σ. A trial-and-error approach has been used to determine the design values of t and σ. The developed MPMR gives its best performance at t = 0.007 and σ = 0.8. Figure 2 depicts the performance for the training and testing datasets. Figure 2 shows that the value of R is close to one for the training as well as the testing datasets. Therefore, the developed MPMR demonstrates its ability to predict P.

Fig. 2 Performance of the MPMR

For developing the ELM, the radial basis function has been adopted as the activation function. There are ten hidden nodes in the ELM. The performance of the ELM is depicted in Fig. 3. As shown in Fig. 3, the value of R is close to one. So, the developed ELM predicts P reasonably well.

Fig. 3 Performance of the ELM

The developed GPR, MPMR and ELM models have been compared with the ANN and FEM models. The comparison has been made for the testing datasets (Rahman et al. 2001). Figure 4 illustrates the bar chart of R values of the different models. The developed models are also assessed in terms of the root mean square error (RMSE) (Kisi et al. 2013), mean absolute percentage error (MAPE), root mean square error-to-observation's standard deviation ratio (RSR) (Moriasi et al. 2007), normalized mean bias error (NMBE) (Srinivasulu and Jain 2006), weighted mean absolute percentage error (WMAPE), Nash–Sutcliffe coefficient (NS) (Nash and Sutcliffe 1970), variance account factor (VAF), determination coefficient (R2), performance index (PI) and adjusted determination coefficient (adj. R2) (Ceryan 2014; Chandwani et al. 2015). Table 4 shows the values of the above parameters for the developed models. The value of NS should be close to one for a perfect model, and for a good model the value of VAF should be close to 100. The performance of the GPR, MPMR, ELM and ANN models is almost the same. The developed MPMR has control over future predictions, whereas the developed GPR, ELM and ANN do not. GPR assumes that the datasets follow a Gaussian distribution; the developed MPMR, ELM and ANN do not assume any data distribution. The developed GPR, ELM and MPMR use two tuning parameters each, whereas the ANN model uses many tuning parameters. The major limitation of the developed models is that the design parameters are determined by a trial-and-error approach. This study adopts a sensitivity (S) analysis to determine the effect of each input on P; the concept has been taken from the work of Liong et al. (2000). Figure 5 shows that Tk has the maximum impact on P.
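For reference, several of the error measures reported in Table 4 can be computed as sketched below; the formulas follow the standard definitions in the works cited above (e.g., NS from Nash and Sutcliffe 1970, RSR from Moriasi et al. 2007), and the function name is an assumption of this sketch.

```python
import numpy as np

def error_measures(y_obs, y_pred):
    """Standard forms of a few of the measures used in Table 4."""
    e = y_obs - y_pred
    rmse = np.sqrt(np.mean(e ** 2))
    mape = 100.0 * np.mean(np.abs(e / y_obs))   # assumes no zero observations
    wmape = 100.0 * np.sum(np.abs(e)) / np.sum(np.abs(y_obs))
    ns = 1.0 - np.sum(e ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)
    rsr = rmse / y_obs.std()
    vaf = 100.0 * (1.0 - np.var(e) / np.var(y_obs))
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    return {'R': r, 'RMSE': rmse, 'MAPE': mape, 'WMAPE': wmape,
            'NS': ns, 'RSR': rsr, 'VAF': vaf}
```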

Fig. 4 Bar chart of R values of the different models

Table 4 Comparison of parameters of the developed models

Fig. 5 Effect of inputs on P

3 Conclusion

This article has examined the capability of GPR, ELM and MPMR for the prediction of P of suction caissons, and the detailed methodologies of the three models have been described. The developed GPR, ELM and MPMR models give excellent performance, comparable with that of the ANN model. Users can employ the developed models as quick tools for the prediction of P. The developed ELM is fast compared with the other developed models. It can be concluded that the developed GPR, ELM and MPMR are excellent models for the prediction of P of suction caissons.