1 Introduction

In the vadose zone of geotechnical engineering, rainfall is a critical indicator of variations in the properties of unsaturated soil and is also an important parameter in analysis of hydrological processes, because rainfall infiltration can seriously reduce the effective stress and shear strength of unsaturated soil [17, 27, 33, 39, 43, 49, 56, 57]. In recent decades, heavy rainfall events have become increasingly likely as a result of climate change (e.g., global warming, the urban heat island, and rain island effects) caused by human activities [23, 41]. Heavy rainfall can cause natural disasters such as collapse, landslide, debris flow, and waterlogging, bringing significant economic loss and threatening human safety. Accordingly, in-depth analysis of variations in unsaturated soil’s parameters as a result of different rainfall events can offer insights into the characteristics of rainfall (e.g., amount, intensity, duration, pattern) and their effects on unsaturated soil parameters in a specified area. These can serve as a reference for engineers to evaluate and predict unsaturated soil’s properties, informing actions taken to prevent disasters, reduce economic loss, and ensure human safety.

Numerous studies have indicated that rainfall amount, intensity, and duration can influence variations in soil parameters, triggering natural disasters [4, 5, 19, 23, 25,26,27, 43, 46]. In some early studies, rainfall pattern was not considered in hydraulic analysis of shallow unsaturated soil’s properties. Later, it was demonstrated that peak rainfall rates of a single rainfall event may be dozens of times higher than the mean event rate, requiring determination of the temporal position of peak intensity within an event [2, 35]. Tsai [43] investigated the effects of rainstorm pattern on shallow landslide and found that rainstorm pattern affects slope stability. Ng et al. [29] observed that rainfall pattern influences pore-water pressures in soil layers near the ground surface. Rainfall accompanying an advanced storm pattern of 24 h duration was found to be critical for causing the highest pore-water pressure in a slope. Rahimi et al. [34] reported that the rate of decrease in the factor of safety was fastest for the advanced rainfall pattern, followed by the normal and delayed rainfall patterns. Zhan et al. [51] found that volumetric water content was greatest in deep soil layers induced by the advanced pattern, followed by the central, uniform, and delayed rainfall patterns, and vice versa for volumetric water content in shallow soil layers. Wu et al. [48] showed that under similar slope conditions, the factor of safety changes with the precipitation pattern, being largest under the linear delayed pattern. A et al. [2] reported that the delayed rainfall pattern can lengthen the response time of increase in soil water content to rainfall. Zhai et al. [53] demonstrated that rainfall pattern is a key parameter in providing warnings about some disasters. 
Some researchers have argued that rainfall pattern is randomly distributed [47], but rainfall pattern in a certain region can be derived based on the measured data to identify one or two prevalent rainfall patterns in the region [2, 20, 35, 45, 50]. Too few studies have focused on rainfall patterns’ influence on soil properties under natural environmental conditions, and many have analyzed it only using rainfall simulation experiments [2, 14, 52]. Analysis of natural rainfall events is needed to arrive at a comprehensive understanding of the effects of various rainfall events on shallow soil’s properties.

In this study, variations in the soil suction, weather, and vegetation parameters of three monitoring points were recorded under natural environmental conditions from September 2016 to March 2019. The three monitoring points were located at different distances from the selected trees (0.5 m, 1.5 m, and 3.0 m) at the same depth below the ground surface (0.2 m). An image processing technique was used to quantify the vegetation parameter and the radius of the tree canopy so as to represent the effect of trees. Initial statistical analysis indicated that rainfall intensity, amount, duration, and pattern influence soil suction in different ways. A machine learning method, multi-gene genetic programming (MGGP), was then used to develop multivariate models that formulate the relationships between the influential parameters and two key indices reflecting variations in soil suction. Based on the obtained multivariate models, each input’s influence on the output variables was assessed through global sensitivity analysis, and the effects of selected rainfall, weather, and vegetation parameters on soil suction response were investigated through a parametric study.

2 Field monitoring scheme

The field monitoring test was carried out at a vegetated site on the campus of University of Macau (Hengqin Island, Zhuhai, China, 113.55 °E, 22.13 °N), as Fig. 1 shows. The monitoring site is covered with various types of vegetation amid a range of landscapes. Detailed characteristics of the local climate and topography can be found in Zhou et al. [55].

Fig. 1 Site selection for field monitoring

Over the entire monitoring period, from September 2016 to March 2019, three trees (Fig. 2) were targeted. Various sensors and pieces of equipment were arranged at different distances from the target trees and at different depths below the ground surface to quantify soil suction, vegetation, and atmosphere parameters. Data were recorded at half-hour intervals. Grass and tree species common in Macau grow at the selected monitoring site. As Fig. 2a shows, the site has primarily two types of grass: carpet grass and Manila grass. The selected tree species was Elaeocarpus apiculatus Masters, which is common in southern China (e.g., Macau).

Fig. 2 Vegetation coverage at the field monitoring site

Soil samples were taken from shallow layers around the target trees for laboratory experiments to determine several physical properties of the soil. Sampling depths ranged from 10 to 80 cm, as determined by the embedment depths of the sensors. Sieve analysis showed that the shallow soil at the monitoring site was of a single type. As Fig. 3 shows, according to the Unified Soil Classification System [1], the soil in the shallow layers of the study area (0–0.8 m below the ground surface) was well-graded sand (SW).

Fig. 3 Particle size distribution of shallow soil at the monitoring site

Variations in soil suction at three monitoring positions A, B, and C (Fig. 4), distributed at different distances from the target tree (0.5 m, 1.5 m, and 3.0 m) at a constant depth of 0.2 m, were used as the objective unsaturated soil parameter for analysis of response to various rainfalls while taking into account the effects of atmosphere and vegetation factors. Employed sensors and equipment included a soil suction sensor (MPS-6), an air temperature/relative humidity sensor (VP-4), a rainfall gauge, and a drone. The air temperature/relative humidity sensor and rainfall gauge were distributed at the open center of the monitoring site, and the drone was used to record variations in tree canopy, representing the impact of vegetation.

Fig. 4 Cross section of the three monitoring points’ positions

3 Monitoring results and statistical analysis

3.1 Quantification of tree canopy

The selected target trees can significantly affect soil suction variations through root water uptake [54]. In this study, radius of tree canopy and distance from tree reflect the influence of the target tree. Various images of the tree canopy were obtained during the monitoring period by using a drone. The obtained images were processed to obtain the radius of tree canopy. As Fig. 5 shows, an original image can be converted to a black-and-white image showing only the tree canopy, whose radius can then be calculated by assuming that the canopy is circular [42].
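The equivalent-circle calculation described above can be sketched as follows. This is a minimal illustration, not the image-processing pipeline used in the study: it assumes the canopy has already been segmented into a binary mask, and the function names and the `metres_per_pixel` scale are hypothetical.

```python
import numpy as np


def canopy_radius(mask: np.ndarray, metres_per_pixel: float) -> float:
    """Radius of the circle whose area equals the canopy area in a binary mask.

    mask: 2-D array, non-zero where a pixel belongs to the canopy (assumed input).
    metres_per_pixel: ground size of one pixel side (assumed known from the flight).
    """
    area_px = np.count_nonzero(mask)             # canopy area in pixels
    area_m2 = area_px * metres_per_pixel ** 2    # canopy area in square metres
    return float(np.sqrt(area_m2 / np.pi))       # A = pi * r^2  ->  r = sqrt(A / pi)


def distance_ratio(distance_m: float, radius_m: float) -> float:
    """Dimensionless ratio of distance from the tree to canopy radius."""
    return distance_m / radius_m
```

A ratio below 1 would place a monitoring point under the canopy, and a ratio above 1 would place it outside the canopy edge.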

Fig. 5 Image processing for tree canopy quantification

The entire monitoring period can be divided into six monitoring subperiods reflecting adjustments to the monitoring plan in response to termite infestations, typhoon, and embedding of new sensors. During different monitoring subperiods, image acquisition was conducted at least twice to analyze variations in tree canopy. It was found that the tree canopy did not vary meaningfully across subperiods, so a mean value for the tree canopy’s radius during every subperiod was used for modeling analysis. Table 1 summarizes the radius values used for the tree canopy during different monitoring periods. A dimensionless parameter, ratio of distance from tree to radius of tree canopy, was used to build a model reflecting the effects of the targeted trees.

Table 1 Results of image processing

3.2 Rainfall characteristics and their effects on soil suction

To reveal the effects of different rain events on soil suction, the definition of a single rainfall event is important, because it affects the quantification of rainfall properties. Many time intervals can be used to sort rainfall data, such as 5, 10, 20, 30, 40, 60, 80, 90, and 100 min and 2, 3, 4, 12, 18, and 24 h [19]. The method used to define a single rainfall event should be chosen according to the specific study objectives [14]. Because field-monitored soil suction decreases quickly and more data benefit model development, each rainfall event in this study was delineated using the minimum inter-event time (MIT) approach, with MIT = 6 h. Thus, a period of rainfall bounded by dry periods of at least 6 h each was regarded as a distinct event. An MIT value of 6 h has been adopted in many studies. According to this criterion, 300 rainfall events were recorded during the entire monitoring period.
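The MIT criterion can be illustrated with a short sketch. It assumes a regularly sampled rainfall series (0.5 h records, as in the monitoring scheme) and counts a dry gap by the number of consecutive zero-rainfall records; the function name and this counting convention are assumptions made for illustration.

```python
from typing import List, Tuple


def split_events(rain_mm: List[float], step_h: float = 0.5,
                 mit_h: float = 6.0) -> List[Tuple[int, int]]:
    """Return (start, end) record indices, inclusive, of each rainfall event.

    Two wet records belong to different events when they are separated by
    at least mit_h hours of consecutive dry records.
    """
    min_dry_records = int(round(mit_h / step_h))  # e.g. 12 records for 6 h at 0.5 h
    events = []
    start = None      # index of the first wet record of the current event
    last_wet = None   # index of the most recent wet record
    for i, depth in enumerate(rain_mm):
        if depth <= 0:
            continue
        if start is None:
            start = i
        elif i - last_wet - 1 >= min_dry_records:
            events.append((start, last_wet))  # dry gap long enough: close the event
            start = i
        last_wet = i
    if start is not None:
        events.append((start, last_wet))
    return events
```

Event duration, amount, and mean intensity can then be computed from the records between each returned pair of indices.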

Based on the distribution of rainfall intensity during an entire rainfall event, rainfall events can be classified into three patterns—advanced, intermediate, and delayed—as proposed by Horner and Jens [18]. Rainfall pattern can be determined using the equation:

$$r = \frac{{T_{{{\text{RI}}_{\max } }} - T_{0} }}{T}$$
(1)

where \(T_{{{\text{RI}}_{{{\text{max}}}} }}\) is the time at which rainfall intensity reaches its maximum; T0 is the start time of the rainfall event; T is the rainfall duration; and r is the relative temporal position of the peak within the event. Rainfall patterns are defined as advanced, intermediate, and delayed when values of r are 0.0–0.4, 0.4–0.6, and 0.6–1.0, respectively [3, 13]. This method is simple and convenient for mathematical modeling and analysis. In the initial statistical analysis, a fourth pattern, the uniform rainfall pattern, is also considered, in which there is no obvious peak rainfall intensity during a single rainfall event. Of the 300 rainfall events identified from field monitoring, some included only two data points, making classification of rainfall patterns difficult. Further screening retained 149 rainfall events with at least three data points for analysis of rainfall patterns. As Table 2 shows, these 149 rainfall events comprised 55 advanced, 47 delayed, 31 uniform, and 16 central (intermediate) patterns during the monitoring period. The proportions of each rainfall pattern for each year, with total rainfall, are shown in brackets, indicating that the advanced and delayed patterns were prevalent at the field monitoring site, followed by the uniform pattern, with the central pattern rarely appearing. Rainfall patterns offer insights into the distribution of peak rainfall intensity during a single rainfall event. Knowing how regularly each rainfall pattern occurs at a monitoring site, as obtained from the rainfall statistics, allows the safety of geotechnical infrastructure to be estimated in advance.
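Eq. (1) and the three-way classification can be written as a small sketch. The function names are hypothetical, and the handling of the boundary values r = 0.4 and r = 0.6 (assigned here to the intermediate pattern) is an assumption, since the stated ranges share their end points. The uniform pattern is not covered, because no numerical criterion for it is given.

```python
def pattern_ratio(t_peak_h: float, t_start_h: float, duration_h: float) -> float:
    """Eq. (1): relative position r of the intensity peak within the event."""
    if duration_h <= 0:
        raise ValueError("rainfall duration must be positive")
    return (t_peak_h - t_start_h) / duration_h


def classify_pattern(r: float) -> str:
    """Map r to a pattern name; boundary handling is an assumption."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    if r < 0.4:
        return "advanced"
    if r <= 0.6:
        return "intermediate"
    return "delayed"
```

For example, a 4 h event whose peak intensity occurs 1 h after the start gives r = 0.25 and is classified as advanced.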

Table 2 Ratio of different rainfall patterns during the monitoring period

Under natural environmental conditions, soil, meteorological, and plant parameters are uncontrolled, making it difficult to find multiple cases in which all conditions are the same except one control variable. Some extreme cases can occur, however, under complicated natural climate conditions. For example, when the initial soil suction of one wetting process is small, heavy rainfall ensues of high intensity and long duration, causing a sudden drop in soil suction. To better analyze different rainfall events’ effects on variations in soil suction, several cases were selected, as Fig. 6 shows. For these cases, initial soil suction at the same monitoring point was in the approximate range 20–600 kPa, with corresponding rainfall events of different temporal patterns featuring appropriate rainfall amount and intensity. According to Fig. 6, it is found that rainfall events of various rainfall amounts, intensities, and temporal patterns can cause different impacts on the response of vegetated soil that can be evaluated using two indices: time to descent of soil suction after the beginning of a single rainfall event (t) and mean descent rate of soil suction (vs) during a wetting process. The soil suction cannot immediately decrease when a rainfall event happens. The time to descent of soil suction after the beginning of a single rainfall event is a characteristic physical parameter reflecting the relationships between the status of soil suction and relevant influential factors. It is noted that the resolution and accuracy of the time to descent of soil suction are limited by the monitoring equipment and record setting (i.e., time interval of data record is 0.5 h). After a rain, an obvious reduction period of soil suction is observed, and the reduction rate of soil suction is greatly affected by the environmental parameters. In this study, the reduction period starts from a peak soil suction to an initial soil suction value of a stable stage. 
The value at the stable stage is not always the minimum value that the sensor can measure (e.g., 10 kPa). The gradient between two adjacent recorded data points in the reduction period can be computed, but some values are too large for use in statistical and modeling analysis. Accordingly, the mean descent rate of soil suction was calculated by averaging the gradients between all pairs of adjacent data points in a reduction period and was used for further analysis.
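The averaging of adjacent gradients can be sketched as below. The function name is hypothetical, the input is assumed to hold only the records of one reduction period (from peak suction to the first stable value), and the resulting unit of kPa/h is an assumption that follows from suction in kPa and a 0.5 h record interval.

```python
import numpy as np


def mean_descent_rate(suction_kpa, step_h: float = 0.5) -> float:
    """Mean of the gradients between adjacent records in a reduction period.

    Returns a positive number when suction falls over the period.
    """
    s = np.asarray(suction_kpa, dtype=float)
    if s.size < 2:
        raise ValueError("need at least two records")
    gradients = -np.diff(s) / step_h   # drop in suction per hour between neighbours
    return float(gradients.mean())
```

With equally spaced records this average equals the total drop in suction divided by the length of the reduction period, so a single steep step raises the mean only in proportion to its share of the total drop.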

Fig. 6 Soil suction variations induced by various rainfalls

Table 3 summarizes rainfall parameters and two indices representing vegetated soil’s response in eight selected cases during wetting processes. All other influential factors being equal, higher rainfall intensity clearly increases the reduction rate of soil suction, as shown by cases 1, 4, and 8 in Table 3. Moreover, temporal rainfall pattern significantly influences the response of vegetated soil, as shown by cases 1–5 in Table 3. Within a certain range of rainfall intensity, soil suction begins to fall later under a delayed pattern, but its mean reduction rate is larger than under the uniform and central patterns, indicating that the response time of soil suction reduction associated with a single rainfall is closely related to the rainfall’s pattern. For instance, for cases 1 and 8 in Table 3, when peak rainfall intensity appeared in the second half of the rainfall duration (delayed pattern), the time to descent of soil suction was relatively long. Conversely, the advanced pattern usually triggered a shorter time to the start of reduction and quicker reduction of soil suction, as in case 7. In addition, although the amount and intensity of the rainfall event for case 6 were larger than for case 1, the mean reduction rate was smaller, reflecting differences in initial soil suction and other factors. The influence of initial soil suction and other factors requires further analysis and is examined in the modeling analysis that follows. Finally, the effects of rainfall pattern, in addition to those of rainfall amount and intensity, are significant. In analyzing variations in shallow soil parameters, the initial state condition and other factors (e.g., tree canopy, air temperature) must also be considered.

Table 3 Two indices representing vegetated soil’s response to different rainfall events

4 Modeling soil suction variation induced by rainfall

4.1 Multi-gene genetic programming method

MGGP is a variant of genetic programming. Accordingly, to better understand the working mechanism of MGGP, genetic programming is first introduced. As an evolutionary algorithm technique [31], GP can automatically program a computer model to solve a problem through Darwinian natural selection [21]. GP was first proposed by Cramer [10] and attracted much attention with the publication of Koza’s book [24]. As Fig. 7 shows, the GP equation is usually given as a syntax tree, consisting primarily of functional and terminal nodes. The former can include basic arithmetic operators (e.g., + , –, × , ÷), nonlinear functions (e.g., sin, cos, exp, power), and any other mathematical functions defined by the user, and the latter primarily contain the arguments for the functions, such as numerical constants and variables [38]. Functions and terminals are usually chosen at random. Compared with other similar techniques, GP has two main advantages, flexibility and interpretability, when used to model nonlinear systems [15]. Some modeling and prediction techniques are considered black box methods, because they generate models that are hard to understand; such techniques traditionally include ANNs and SVMs. Unlike such techniques, tree models generated by GP are simpler and are easier to evaluate; what’s more, their complexity can be controlled by setting maximum GP tree depth and length.

Fig. 7 A typical genetic programming (GP) syntax tree

Given input variables and corresponding outputs, GP can develop a mathematical model based on the user’s set parameters [38]. The initial population is created through a blind random search for solutions in the large space of possible solutions. Once a population of models is created at random, the GP algorithm evaluates individuals; selects individuals for reproduction; generates new individuals by mutation, crossover, and direct reproduction; and finally creates a new generation in each iteration [7]. The evolutionary process continues by evaluating the fitness of the new population and starting a new round of reproduction and crossover. The best program to appear in any generation, the best-so-far solution, defines the output of the GP algorithm [8]. A typical fitness function for assessing the performance of the generated population is the root-mean-square error (RMSE), given by

$${\text{RMSE}} = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left| {P_{i} - A_{i} } \right|^{2} } }}{n}}$$
(2)

where Pi is the estimated value of the ith data point, Ai the measured value of the ith data point, and n the number of data points.
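Eq. (2) translates directly into a few lines of code. The sketch below uses NumPy; the function name is illustrative.

```python
import numpy as np


def rmse(predicted, measured) -> float:
    """Root-mean-square error of Eq. (2) between estimated and measured values."""
    p = np.asarray(predicted, dtype=float)
    a = np.asarray(measured, dtype=float)
    if p.shape != a.shape:
        raise ValueError("predicted and measured must have the same shape")
    return float(np.sqrt(np.mean(np.abs(p - a) ** 2)))
```

Because the errors are squared before averaging, a few large misfits dominate the score, which is why a lower RMSE indicates a fitter individual in the population.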

As a variant of GP, MGGP can capture the dynamics of complex systems involving multiple input–output parameters [16]. In this study, where the relationship between the selected output variables and the other six parameters is complicated, the MGGP method is used to build a computational model whose parameter settings are as simple as possible. Each model evolved using the MGGP is a weighted linear combination of the outputs of several GP trees. The model is linear in its parameters with respect to coefficients W0, W1, W2, …, Wn despite using nonlinear terms. Figure 8 shows the MGGP’s working procedure and principle. Within a certain range, more complicated parameter settings can help generate an optimal model, reducing error. For a certain problem, it is better to choose simpler functions and terminals that still produce acceptable performance for the obtained solution, simple structure being more conducive to analysis of the relationship between different variables. Generally, GP generates a simpler model than MGGP, but in this study, it is difficult for GP to develop a good model having high accuracy, as validated by various trials—primarily because a complicated relationship between multiple variables must be formulated using limited data. MGGP can accomplish this difficult task, making it better suited for some complicated and difficult cases. Accordingly, in this study, MGGP was employed for multivariate model development, with the obtained explicit multivariate models used for sensitivity analysis and parametric study.
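The statement that an MGGP model is linear in its coefficients W0, W1, …, Wn can be made concrete with a short sketch. It assumes the gene (tree) outputs have already been evaluated for every sample and estimates the weights by ordinary least squares; the least-squares step and all names here are illustrative assumptions, not a description of the software used in this study.

```python
import numpy as np


def fit_gene_weights(gene_outputs, y) -> np.ndarray:
    """Least-squares estimate of [W0, W1, ..., Wn] for y ~ W0 + sum_i Wi * gene_i.

    gene_outputs: array of shape (n_samples, n_genes), one column per GP tree.
    """
    G = np.asarray(gene_outputs, dtype=float)
    X = np.column_stack([np.ones(G.shape[0]), G])   # leading column of ones for W0
    w, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return w


def predict(gene_outputs, w) -> np.ndarray:
    """Weighted linear combination of the gene outputs plus the bias W0."""
    G = np.asarray(gene_outputs, dtype=float)
    return w[0] + G @ w[1:]
```

Each gene may be strongly nonlinear in the inputs, yet the weights enter only linearly, so they can be found in closed form once the trees are fixed.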

Fig. 8 Working procedure of the MGGP method

4.2 Parameter settings

4.2.1 Settings of input and output variables

Analysis showed that during the wetting process of in situ soil suction, different types of rainfall events generally cause variations in the time to start of descent after the beginning of a rainfall event and in the reduction rate of soil suction. In practice, the time to start of soil suction descent reflects the rate at which rainwater infiltrates into rooted soil at a certain depth. When a rainfall event of high intensity and long duration occurs, soil suction decreases sharply; otherwise, the curve of soil suction over time drops slowly. The rate of reduction in soil suction is an important parameter indicating the effects of relevant factors (e.g., rainfall amount, rainfall intensity, rainfall pattern) on variations in soil suction. Accordingly, in this study, two characteristic parameters, the time to start of descent of soil suction after the beginning of a rainfall event, t, and the average rate of decline of soil suction during the wetting process, vs, were taken as output variables for development of a multivariate model, as Fig. 9 shows. Although 149 rainfall events were screened for statistical analysis, only 74 values of t and vs were selected for model development and analysis over the entire monitoring period, because soil suction remained low during some rainfall events, hindering quantification of the time to descent start and of the mean descent rate of soil suction. The data used in this study are in chronological order. According to the analysis results, the training data used for model development possess enough variability to ensure acceptable performance of the obtained MGGP models.

Fig. 9 Selected two characteristic parameters representing vegetated soil response

Field monitoring testing indicates that rainfall amount, intensity, and pattern can critically influence in situ soil suction variations in a shallow soil layer. What’s more, variations of in situ soil suction are closely related to local air temperature, relative humidity, and vegetation conditions. Initial soil suction is also an important parameter reflecting the initial energy state of unsaturated soil. The state variable, initial soil suction, plays a controlling role in the infiltration process, affecting the time to start of descent of soil suction after the beginning of a single rainfall event observed in field monitoring [9, 29]. Accordingly, as Fig. 10 shows, six parameters were selected as the input variables for multivariate modeling in this study: rainfall amount (x1), mean rainfall intensity (x2), normalized value of rainfall pattern (x3), ratio of distance from tree to radius of tree canopy (x4), air temperature (x5), and initial soil suction (x6).

Fig. 10 Selected six input variables for modeling analysis

Best practices for multivariate modeling require a low level of correlation between each pair of input variables. Multicollinearity between input variables was assessed using the variance inflation factor (VIF), based on the following criterion: VIF = 1/T < 5 indicates that multicollinearity problems are unlikely [22]. The relationship among VIF, tolerance (T), and correlation coefficient (R) can be described as VIF = 1/T, with \(T = 1 - R^{2}\). Table 4 gives the results of multicollinearity evaluation, showing that there is low correlation between each pair of selected input variables, reflecting good modeling practice.
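The pairwise check defined by VIF = 1/T and T = 1 − R² can be sketched as follows. The function name is hypothetical, and R is taken to be the Pearson correlation coefficient between each pair of input columns, which is an assumption about how R was computed.

```python
import numpy as np


def pairwise_vif(X) -> np.ndarray:
    """Matrix of VIF = 1 / (1 - R^2) for every pair of input columns.

    X: array of shape (n_samples, n_inputs). The diagonal is set to NaN
    because a variable is perfectly correlated with itself.
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)     # pairwise Pearson correlation matrix
    tolerance = 1.0 - R ** 2
    with np.errstate(divide="ignore"):   # the diagonal has zero tolerance
        vif = 1.0 / tolerance
    np.fill_diagonal(vif, np.nan)
    return vif
```

Under the stated criterion, any off-diagonal entry of 5 or more would flag a pair of inputs as potentially collinear.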

Table 4 Multicollinearity evaluation of input variables used for multivariate modeling

4.2.2 Parameter settings for MGGP

Of the 74 data sets selected for model development, 60 were used as training data for building both the t-model and the vs-model, and the remaining 14 were used as testing data for validating the reliability of the obtained multivariate models. Table 5 shows MGGP parameter settings for the two models. The probability rates of crossover, mutation, and reproduction were set at 85%, 10%, and 5%, respectively, for maximum model efficiency [28, 32]. Other parameter settings were determined by trial and error with a view to finding an optimal model efficiently and economically, that is, a model with better fitness and simpler structure.

Table 5 Parameter settings of MGGP for the t and vs models

4.3 MGGP modeling and performance evaluation

Based on the variables and parameters set for the MGGP program, two explicit multivariate models were obtained to formulate the relationships between the two output variables, the time to descent of soil suction after the beginning of a single rainfall event (t) and the average reduction rate of soil suction (vs), and the six selected influential parameters. It has been demonstrated that MGGP performs well at fitting highly scattered data. However, owing to its working principle, it usually gives a relatively complicated explicit formula for the given parameter settings. Two multivariate models were obtained using MGGP, based on simplified parameter settings within an acceptable degree of error:

$$\begin{aligned} t & = 0.085x_{2} x_{4}^{2} x_{6} /(x_{1} x_{5} \sin x_{6} ) - 0.255(x_{5} + 147.177)/(\sin x_{5} )^{2} \\ & \quad \times (x_{6} - 705.058) - 1.619x_{6} /(x_{5} (\sin x_{5} )^{2} )(x_{5} + x_{6} ) \\ & \quad - 0.082(x_{5} /x_{3} )^{2} \sin x_{4} /(x_{6} - 842.092) + 1.439 \\ \end{aligned}$$
(3)
$$\begin{aligned} v_{\text{s}} & = 0.203/\sin x_{1} - 0.013x_{6} (1 + x_{3} ) + 0.013x_{4} /x_{3}^{3} \\ & \quad + 1.02 \times 10^{-4} x_{3} x_{5}^{2} x_{6} - 3.34 \times 10^{-9} x_{1} x_{6}^{3} (x_{1} - x_{6} )/(x_{6} - 144.738) \\ & \quad + 3.24 \times 10^{-3} x_{6} (x_{2} - x_{4} )/x_{2} /\sin (x_{6} + 101.079) + 1.238 \\ \end{aligned}$$
(4)

Figure 11 compares the estimated and measured values of the two selected parameters representing variations in soil suction. Figure 12 evaluates the obtained t and vs models on the training and testing data. In addition to the fitness function, three metrics are used to assess the performance of the obtained multivariate models: mean absolute percentage error (MAPE, %) [Eq. (5)], relative error (%) [Eq. (6)], and coefficient of determination (\(R^{2}\)) [Eq. (7)]:

$${\text{MAPE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {\frac{{A_{i} - P_{i} }}{{A_{i} }}} \right|} \times 100$$
(5)
$${\text{Relative}}\;{\text{error}} = \frac{{\left| {P_{i} - A_{i} } \right|}}{{A_{i} }} \times 100$$
(6)
$$R^{2} = \left[ {\frac{{\sum\nolimits_{i = 1}^{n} {(A_{i} - \overline{{A_{i} }} )(} P_{i} - \overline{{P_{i} }} )}}{{\sqrt {\sum\nolimits_{i = 1}^{n} {(A_{i} - \overline{{A_{i} }} )^{2} \sum\nolimits_{i = 1}^{n} {(P_{i} - \overline{{P_{i} }} )^{2} } } } }}} \right]^{2}$$
(7)

where Ai and Pi are the measured and estimated values, respectively; \(\bar{A}_{i}\) and \(\bar{P}_{i}\) are the mean values of the measured and estimated values, respectively; and n is the number of training or testing samples. It can be concluded that the two multivariate models obtained by MGGP have learned the relationships between the two selected output variables and the six influential parameters. Although the two multivariate models, developed using limited data, perform relatively poorly on the testing data, they can be used to estimate or predict soil suction’s response to various rainfall events with an acceptable degree of error. Moreover, the two models can be used to better analyze each influential parameter’s effect on variations in soil suction in shallow soil layers.
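Eqs. (5)–(7) can be written as a short sketch for checking a model against measured data. The function names are illustrative; the measured values are assumed to be non-zero (and positive for the relative error), as the denominators of Eqs. (5) and (6) require.

```python
import numpy as np


def mape(measured, predicted) -> float:
    """Eq. (5): mean absolute percentage error, in percent."""
    a = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((a - p) / a)) * 100.0)


def relative_error(measured, predicted) -> np.ndarray:
    """Eq. (6): relative error of each data point, in percent."""
    a = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return np.abs(p - a) / a * 100.0


def r_squared(measured, predicted) -> float:
    """Eq. (7): squared correlation between measured and estimated values."""
    a = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    da, dp = a - a.mean(), p - p.mean()
    return float((np.sum(da * dp) / np.sqrt(np.sum(da ** 2) * np.sum(dp ** 2))) ** 2)
```

Note that Eq. (7) measures linear association only: estimates that are offset or scaled relative to the measurements can still give a value of 1, which is why it is reported alongside the error metrics.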

Fig. 11 Values measured and calculated by MGGP

Fig. 12 Performance of two MGGP models

5 Sensitivity analysis and parametric study

5.1 Contribution of each input to t and vs

Sensitivity analysis, which can be used to study how variations in the output variable can be attributed to different input variables [37], offers insights into the relationships between input and output variables in an obtained model. For a complicated model consisting of several types of variables, global sensitivity analysis (GSA) usually attracts more interest for its ability to reflect integrated sensitivity over the entire input variable space [6, 30, 36, 37]. Among GSA methods, the variance-based method is seen as a versatile and reliable indicator of parameter importance, providing quantitative sensitivity indices that capture both the effect of a single input and the coupled effects of different parameters [12, 30]. The Fourier amplitude sensitivity test (FAST) [11] and the Sobol method [40] are both robust approaches to performing variance-based global sensitivity analysis [6]. This study uses both methods to investigate the sensitivity of each output variable to each input variable based on the two multivariate models.
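To illustrate what a variance-based first-order index measures, the sketch below estimates Sobol indices by plain Monte Carlo sampling with a standard pick-and-freeze estimator. It is not the implementation used in this study: the inputs are assumed independent and uniformly distributed between given bounds, and `model`, `bounds`, and the sample size are hypothetical.

```python
import numpy as np


def sobol_first_order(model, bounds, n: int = 10000, seed: int = 0) -> np.ndarray:
    """Monte Carlo estimate of first-order Sobol indices.

    model: callable mapping an (n, k) array of inputs to an (n,) array of outputs.
    bounds: sequence of (low, high) pairs, one per input (uniform, independent).
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    k = lo.size
    A = lo + (hi - lo) * rng.random((n, k))   # two independent sample matrices
    B = lo + (hi - lo) * rng.random((n, k))
    fA, fB = model(A), model(B)
    var_y = np.var(np.concatenate([fA, fB]))  # total output variance
    S = np.empty(k)
    for i in range(k):
        AB = A.copy()
        AB[:, i] = B[:, i]                    # A with column i taken from B
        S[i] = np.mean(fB * (model(AB) - fA)) / var_y
    return S
```

For an additive model the first-order indices sum to about 1; a sum well below 1 indicates that interactions between inputs account for part of the output variance.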

Figures 13 and 14 show the results of sensitivity analysis, with both methods giving similar results. Air temperature (x5) has the greatest impact on time to start of descent of soil suction after the beginning of a rainfall event (t), and initial soil suction (x6) is the second most important parameter. The other four parameters, rainfall amount (x1), mean rainfall intensity (x2), rainfall pattern (x3), and ratio of distance from tree to radius of tree canopy (x4), have a similar sensitivity to the output (t), indicating that temporal rainfall pattern and the vegetation parameter are almost as important as rainfall amount and rainfall intensity. During the wetting process, initial soil suction (x6) and rainfall amount (x1) are the first and second most important influences on the descent rate of soil suction, respectively. Rainfall pattern (x3) and the ratio of distance from tree to radius of tree canopy (x4) strongly influence the descent rate of soil suction. Air temperature (x5) has a greater influence than mean rainfall intensity (x2) on the descent rate of soil suction.

Fig. 13 Effects of input variables on time to start of descent with different sensitivity analysis methods

Fig. 14 Effects of input variables on descent rate of soil suction with different sensitivity analysis methods

5.2 Effect of atmosphere and vegetation factors

To better understand the behavior of the developed MGGP models, parametric analysis can be performed to reveal the interaction mechanism between different parameters in soil–vegetation–atmosphere interaction and to verify the rationality and effectiveness of the two obtained multivariate models [44]. Based on the two multivariate models obtained using MGGP, rooted soil’s response to different influential factors during wetting processes is investigated by varying one input variable at a time while keeping the other variables constant. In this study, when one input variable varies over its monitored range, the other input variables are fixed at their minimum, mean, or maximum values calculated from the field monitoring data. However, because the relationships between the selected input and output variables are complicated, the results of parametric analysis cannot reflect the full range of output variation induced by the input variables. Table 6 shows the minimum, mean, and maximum values of input variables, including rainfall amount (x1), mean rainfall intensity (x2), rainfall pattern (x3), the ratio of distance from tree to radius of tree canopy (x4), air temperature (x5), and initial soil suction (x6).
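The one-at-a-time procedure described above can be sketched as follows. The function name is hypothetical, `model` stands for any fitted model that accepts an array of inputs, and sweeping the varied input over an evenly spaced grid between its observed minimum and maximum is an assumption made for illustration.

```python
import numpy as np


def one_at_a_time(model, data, vary: int, n_points: int = 50):
    """Sweep one input over its observed range with the others held fixed.

    data: array of shape (n_samples, n_inputs) of monitored input values.
    vary: column index of the input to sweep.
    Returns the sweep grid and one response curve per baseline level.
    """
    data = np.asarray(data, dtype=float)
    levels = {
        "min": data.min(axis=0),
        "mean": data.mean(axis=0),
        "max": data.max(axis=0),
    }
    grid = np.linspace(data[:, vary].min(), data[:, vary].max(), n_points)
    curves = {}
    for name, baseline in levels.items():
        X = np.tile(baseline, (n_points, 1))  # every row starts at the baseline
        X[:, vary] = grid                     # only the chosen input changes
        curves[name] = model(X)
    return grid, curves
```

Comparing the three curves shows how strongly the effect of one input depends on the level at which the remaining inputs are held.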

Table 6 Settings of inputs’ values for parametric study

Figures 15 and 16 show the results of parametric analysis for the two multivariate models. To better display the trends of the curves under different settings, several vertical axes are broken and a break marker is provided, as can be seen in Fig. 15c, d. Figure 15a, b shows that the time to start of descent of soil suction decreases with increasing rainfall amount and mean rainfall intensity within a certain range; once rainfall amount and intensity exceed a certain value, the time (t) remains at a small constant value. Figure 15c indicates that the advanced rainfall pattern can cause a more rapid response of rooted soil than the intermediate and delayed rainfall patterns when other variables reach a certain value. As Fig. 15d shows, when values of variables other than the tree canopy-related parameter are below a certain value, the rooted soil closer to the edge of the tree canopy (with a ratio of distance from tree to radius of tree canopy of 1.2–1.4) responds more slowly to rainfall events. Interruptions of tree canopy are more prominent when other influential parameters exceed a certain value. Higher air temperature and larger initial soil suction can lengthen the time to start of descent (Fig. 15e, f).

Fig. 15 Parametric analysis of the t MGGP model

Fig. 16 Parametric analysis of the vs MGGP model

As Fig. 16a shows, the closer the monitoring point is to the tree, the smaller the change in soil suction, indicating that interruptions in the tree canopy play a key role and that rainfall amount is less influential when other variable values are below a certain value. Conversely, when other input variables are at their maximum, the descent rate of soil suction clearly increases with rainfall amount. Figure 16b shows that, within a certain range (< 2 mm/h), rainfall intensity induces opposite changes depending on the values of the other variables. When mean rainfall intensity is less than 2 mm/h and the other variables remain at larger values, a single rainfall event of delayed pattern with longer duration (large rainfall amount and small rainfall intensity) reduces the descent rate of soil suction at the monitoring point farther from the tree; the opposite holds when the values of variables other than mean rainfall intensity are below a certain value. However, the descent rate of soil suction remains almost constant once mean rainfall intensity exceeds 2 mm/h. As Fig. 16c shows, different rainfall pattern values affect variations in soil suction differently: if the pattern value is less than 0.2, the descent rate of soil suction decreases with pattern value; otherwise, it increases with pattern value. Figure 16d shows that the tree canopy significantly influences variations in soil suction when rainfall parameters are at a low level. Variations in soil suction become more pronounced as air temperature increases (Fig. 16e). When other influential parameters are larger, the descent rate of soil suction rises more steeply with higher initial soil suction. Conversely, a reduction in rainfall near the tree is unlikely to cause a significant descent in soil suction when initial soil suction is large (Fig. 16f).
In addition, the results of parametric analysis are limited by the data used for model development, so they are specific to the monitored conditions.

6 Conclusions

A field monitoring test was conducted to record and quantify soil suction variations at three monitoring points (0.5 m, 1.5 m, and 3.0 m from the tree, 0.2 m below the ground surface), corresponding rainfall events, tree canopy, and air condition. Characteristics of rainfall at the monitoring site and effects on soil suction variation were analyzed, after which a machine learning method, MGGP, was used to develop multivariate models. Finally, based on global sensitivity analysis and parametric study of two obtained multivariate models, the effects of rainfall parameters, tree canopy, and air condition on soil suction variation were revealed.

(a) Two multivariate models created through the MGGP method can be used to estimate and predict the time to start of descent of soil suction (t) and the mean descent rate of soil suction (vs) within an acceptable degree of error, showing that for a complicated system, MGGP is suitable for developing a computational model under conditions of limited data.

(b) Global sensitivity analysis shows that, as a stress state variable, initial soil suction plays a significant role in evaluating rooted soil’s response to different types of rainfall events. Rainfall pattern’s effect is similar to or greater than those of rainfall amount and intensity, underscoring the importance of rainfall pattern in analysis of variations in shallow soil parameters.

(c) Parametric study indicates that the advanced rainfall pattern can cause a more rapid response of rooted soil than intermediate and delayed rainfall patterns when other variables reach a certain value. Interruptions in tree canopy are more prominent when other influential parameters exceed a certain value. Higher air temperature and larger initial soil suction can lengthen the time to start of descent.

(d) The impacts of rainfall amount, tree canopy, air temperature, and initial soil suction on the average descent rate of soil suction are monotonic. However, the influence of mean rainfall intensity is limited by changes in other variables’ values. Rainfall pattern’s impact on the descent rate of soil suction is not monotonic. The tree canopy significantly affects soil suction decrease when rainfall parameters are at a low level.

This study emphasizes the response of vegetated soil’s parameters to different rainfall events and supports implementation of a machine learning method, MGGP, for evaluation of vegetated soil response. The short monitoring period produced limited data, which restricts the generalizability of the proposed multivariate models and the conclusions derived from their use. In the future, more data collected from field monitoring tests or laboratory tests with artificial rainfall can be used to further verify the performance of the obtained MGGP models. The results of the analysis offer insights into the effects of rainfall events on the properties of shallow soil, and the proposed multivariate models are useful for evaluating and predicting variations in shallow soil’s parameters.