1 Introduction

Dispersive soils are widely distributed throughout the world, especially in arid and semiarid areas (Abbaslou et al. 2016). They are water-sensitive soils with very low resistance to water erosion. Dispersive soils readily give rise to piping, caves, gullies and other damage, which poses a serious threat to the safety of buildings and structures.

Dispersion occurs when the repulsive forces between individual clay particles exceed the attractive forces so that when the clay mass is in contact with water, individual clay particles are progressively detached from the surface and go into suspension. If the water is flowing, the dispersed clay particles are carried away (Sherard et al. 1972). The susceptibility to dispersion is governed by the physical, chemical and mineralogical properties of the soil and by the environmental water. The dispersion of soil is positively correlated with the content of dissolved sodium cations, exchangeable sodium cations and montmorillonite and with pH, and is negatively correlated with the content of organic matter and clay particles. The lower the content of dissolved salts in the water, the greater the susceptibility of the soil to dispersion. Such conclusions have been widely accepted (Richards et al. 1954; Sherard et al. 1972; Holmgren et al. 1977; Chorom et al. 1994; Marchuk et al. 2012, 2013). Zhao et al. (2015) applied gray system theory to analyze the relationship between the physical, chemical and mineral properties and the dispersibility of 123 groups of soil samples. It was concluded that dispersive soils routinely have a high exchangeable sodium percentage and a high pH and that dispersibility cannot be predicted solely from sodium or montmorillonite content. Thus, the factors influencing soil dispersibility have been well studied.

Conventional soil mechanics parameters, such as particle relative density, particle gradation, boundary moisture content, cohesion and friction angle, cannot directly reflect the degree of soil dispersion. Different dispersibility discrimination tests (pinhole test, crumb test, double-hydrometer test, pore water test and exchangeable sodium percentage test) were designed from different perspectives (simulation model, colloidal properties, particle size distribution and soil chemistry). However, owing to the complexity of dispersive soils, the results obtained for the same samples with different dispersibility tests are often inconsistent with each other (Maharaj et al. 2015). At present, comprehensive identification methods are mostly used. Bell and Walker (2000) developed a rating system that requires results from the pinhole test, exchangeable sodium percentage test, crumb test and pore water test. Fan et al. (2013) and Ju et al. (2016) designed a weight analysis method that requires results from the pinhole test, crumb test, double-hydrometer test, pore water test and exchangeable sodium percentage test. Fan and Kong (2013) established an empirical equation with four soil physical and chemical indicators: liquid limit, clay content, sodium percentage in the pore water and pH. Maharaj (2013) proposed a complete process that requires results from the in situ crumb test, laboratory crumb test, double-hydrometer test and chemical analysis tests.

An artificial neural network (ANN) is a highly self-learning and self-adaptive nonlinear system. Even when the physical and mathematical models of a natural phenomenon are unknown, the output obtained through the self-learning of an ANN can agree with the expected output within a certain error range. The aim of ANN research is to explore the mystery of human intelligence by studying the thinking mechanism of the human brain and then to simulate the working structure of the human brain so that machines acquire human-like intelligence. ANN has been applied in pattern recognition, machine learning, expert systems and other fields and has become an active field in artificial intelligence (AI) research. In hydraulic engineering and civil engineering, ANN has worked well for the deep mixing technique (Hosseini et al. 2020), for identifying and categorizing water and soil (Olawoyin et al. 2013; Wu et al. 2018) and for estimating and predicting their indices of air permeability (Cai et al. 2020), hydraulic properties (Schaap et al. 1998; Tamari et al. 1996), thermal characteristics (Erzin et al. 2008; Ozturk et al. 2011; Jalal et al. 2021), chemical composition (Aitkenhead et al. 2012; Sarani et al. 2016) and mechanical behavior (Attoh-Okine 2004; Mohanty et al. 2019; Jalal et al. 2021).

According to a study published by IEEE Spectrum, Python has become the most popular programming language in the world, with Java and C ranking second and third, respectively. Python is an object-oriented programming language and can easily connect modules written in other languages (especially C/C++). It has a large number of third-party libraries, so developers can complete most of their work simply by making good use of these libraries.

The aim of this study is to propose a new and reliable method for identifying the dispersibility of soils. As research on ANN has flourished, it has come to be widely used in various disciplines. The relationship between the comprehensive discrimination result of soil dispersibility and the physical, chemical, mineral and dispersibility discrimination test indices is essentially a complex nonlinear system. This study used information on 80 samples obtained from different provinces in western China to build an ANN model with TensorFlow, an open-source library for Python.

2 Materials and Methods

2.1 Data Sources and Selection of Discriminant Parameters

There are 80 sets of soil data. They came from Dashixia Hydropower Station and Xijiao and Sanping reservoirs in Xinjiang Province, the main channels of an irrigation area in Heilongjiang Province, Ningmute and Banduo hydropower stations in Qinghai Province, Wenjiagou, Majiashu and Nanping hydropower stations in Ningxia Hui Autonomous Region, Heihe Water Control Project in Shaanxi Province and Shangma Reservoir in Shanxi Province. In all, there are 11 water conservancy projects. Detailed soil information is shown in Table 1.

Table 1 Sources and distributions of data

Based on the double-layer theory of the soil–water–electrolyte system, Fan et al. (2012) systematically studied the relationship between the basic physical, chemical and mineral indices and the dispersibility of soils. Dispersive soils were considered to be characterized by three features: low clay content, high sodium content and high pH. With reference to the relevant research results, the input parameters were determined as liquid limit (Ww), plastic limit (Wp), clay content (Pc), dispersive degree (D), pH, organic matter (Wu), montmorillonite content (M), illite content (I), exchangeable sodium percentage (ESP), pore water sodium ion content (CNa+) and sodium percentage (PS). Statistical results of the data are shown in Table 2. The output parameter is one of three discriminant results: dispersive soil (DS), transitional soil (TS) or non-dispersive soil (NS).

Table 2 Statistical results of data

2.2 Artificial Neural Network

The concept of ANN originates from the biological neural network (BNN). Each neuron in the human brain has thousands of channels that are widely interconnected with other neurons to form a complex biological neural network. In a BNN, signals can be transmitted through new connections generated by stimulation to form feedback. An ANN, however, cannot generate new connections randomly, so it updates its neurons to form a good nervous system.

A diagram of the ANN model used in this paper is shown in Fig. 1. Each circle stands for one neuron, and all the circles together form the ANN. The information transmission between neurons is expressed by the function y = f (weights * x + bias). Different weights (w) and biases (b) give different output results. The computing process consists of two parts: the forward propagation of information and the back propagation of error. In forward propagation, the node data are calculated in accordance with formulas (1), (2) and (3). sn in formula (3) is the probability of each of the three discriminant results, and the final output is one of them. The commonly used activation functions are sigmoid, tanh and relu. In back propagation, a gradient descent algorithm is often used to update w and b. Common choices include the gradient descent optimizer, the momentum optimizer and the Adam optimizer. Put simply, the self-learning process of an ANN is an iterative process.

$$z_{j} = f\left( {\sum\limits_{{i = 1}}^{{11}} {w_{{ij}}^{{(xz)}} x_{i} + b_{j}^{{(xz)}} } } \right),\;(j = 1,2,...,11)$$
(1)
$$y_{k} = f\left( {\sum\limits_{{j = 1}}^{{11}} {w_{{jk}}^{{(zy)}} z_{j} + b_{k}^{{(zy)}} } } \right),\;(k = 1,2,3)$$
(2)
$$s_{n} = \frac{{e^{{y_{n} }} }}{{\sum\nolimits_{{m = 1}}^{3} {e^{{y_{m} }} } }},(n = 1,2,3).$$
(3)
Fig. 1 Diagram of the dispersivity discriminant ANN model

Thus, ANN is intelligent and computer-based. There is no need to establish, on the basis of some theory, an empirical statistical relationship between the discriminant parameters and the prediction targets.
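The forward propagation described by formulas (1), (2) and (3) can be illustrated with a minimal NumPy sketch. The random parameters, the sample values and the class order DS, TS, NS are assumptions made only for illustration; they are not the trained values or data of this study, and tanh is taken as one possible choice of f.

```python
import numpy as np

def softmax(y):
    """Formula (3): convert the three output values into probabilities."""
    e = np.exp(y - np.max(y))  # subtracting the maximum avoids overflow
    return e / e.sum()

def forward(x, w_xz, b_xz, w_zy, b_zy, f=np.tanh):
    """Forward propagation for one sample x with 11 input indices."""
    z = f(x @ w_xz + b_xz)  # formula (1): hidden-layer values z_j
    y = f(z @ w_zy + b_zy)  # formula (2): output-layer values y_k
    return softmax(y)       # formula (3): probabilities s_n

rng = np.random.default_rng(0)
# Illustrative random parameters (assumption), not trained weights.
w_xz, b_xz = rng.normal(size=(11, 11)), np.zeros(11)
w_zy, b_zy = rng.normal(size=(11, 3)), np.zeros(3)

x = rng.uniform(0.01, 0.99, size=11)  # one hypothetical normalized sample
s = forward(x, w_xz, b_xz, w_zy, b_zy)

labels = ["DS", "TS", "NS"]           # class order assumed for illustration
predicted = labels[int(np.argmax(s))]
print(s, predicted)
```

The class with the largest probability is taken as the discriminant result, which corresponds to selecting one of the three outputs in formula (3).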

2.3 Modeling Tool

The Python language was first released by Guido van Rossum in 1991. As a powerful cross-platform object-oriented programming language, it is one of the most widely used computer languages. Its advantages are as follows: (1) simple grammar, clear code, and free and open source; (2) easy connection with modules written in other languages (C, Java, C++, VB, etc.); (3) easy porting to Linux and Windows; (4) support for an interactive mode, in which code can be entered and results obtained from a terminal without a compilation step during development (Python is a dynamic language); and (5) a rich set of powerful software libraries. The libraries used in this article are TensorFlow, NumPy, Pandas and scikit-learn. The min–max scaler and train–test split functions of the scikit-learn library were used to normalize the data and to divide them into training and test sets, respectively.

The machine learning flowchart is shown in Fig. 2. The basic flow of the ANN discriminant model built with the TensorFlow module is as follows: (1) Define a function for adding layers: the node data are calculated as w * upper-layer node data + b, and an activation function is applied to the result only if one is specified for the layer. (2) Define the node placeholders: the input layer has 11 nodes and the output layer has 3. (3) Define the layers: the ANN in this study consists of three layers. The input layer only holds the input data and has no activation function; the activation functions of layer 1 (hidden layer) and layer 2 (output layer) are tanh and softmax, respectively. (4) Define the loss function: cross-entropy was used, because cross-entropy combined with the softmax function is suited to binary or multi-class classification problems with mutually exclusive categories. (5) Choose an appropriate optimizer to minimize the loss: the gradient descent algorithm was used with a learning rate of 0.5. (6) Initialize all variables and start the computation. The maximum number of iterations is set to 3001. To verify the correctness of the model, 70% of the 80 data samples were randomly selected as training samples and the remaining 30% were used as test samples, so the training and test samples did not overlap.
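Steps (1) to (6) can be sketched as follows. This is a minimal illustration written directly in NumPy rather than with the TensorFlow module used in this study. The synthetic data, the labelling rule, the choice of 11 hidden neurons (taken from formula (1)) and the use of full-batch updates are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 80 normalized samples (assumption, not the study data).
X = rng.uniform(0.01, 0.99, size=(80, 11))
labels = np.argmax(X[:, :3], axis=1)   # synthetic labelling rule with 3 classes
Y = np.eye(3)[labels]                  # one-hot targets

# Random 70/30 split; training and test indices do not overlap.
order = rng.permutation(80)
train, test = order[:56], order[56:]

# 11 inputs -> 11 hidden nodes (tanh) -> 3 outputs (softmax).
W1, b1 = rng.normal(scale=0.5, size=(11, 11)), np.zeros(11)
W2, b2 = rng.normal(scale=0.5, size=(11, 3)), np.zeros(3)

def forward(X):
    """Return hidden-layer values and softmax probabilities."""
    Z = np.tanh(X @ W1 + b1)
    logits = Z @ W2 + b2
    E = np.exp(logits - logits.max(axis=1, keepdims=True))
    return Z, E / E.sum(axis=1, keepdims=True)

def cross_entropy(S, Y):
    """Mean cross-entropy between probabilities S and one-hot targets Y."""
    return -np.mean(np.sum(Y * np.log(S + 1e-12), axis=1))

learning_rate, steps = 0.5, 3001
losses = []
for step in range(steps):
    Z, S = forward(X[train])
    losses.append(cross_entropy(S, Y[train]))
    # Back propagation of the error.
    d_logits = (S - Y[train]) / len(train)     # gradient for softmax + cross-entropy
    dW2, db2 = Z.T @ d_logits, d_logits.sum(axis=0)
    dZ = (d_logits @ W2.T) * (1.0 - Z**2)      # derivative of tanh
    dW1, db1 = X[train].T @ dZ, dZ.sum(axis=0)
    # Gradient descent update of weights and biases.
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

_, S_test = forward(X[test])
test_accuracy = float(np.mean(S_test.argmax(axis=1) == labels[test]))
print(losses[0], min(losses), test_accuracy)
```

Because the data here are synthetic, the printed accuracy says nothing about dispersive soils; the sketch only shows how the layers, the loss, the optimizer and the 70/30 split fit together.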

Fig. 2 Machine learning flowchart of the dispersivity discriminant ANN model. (a) Activation function = Sigmoid, learning rate = 0.001. (b) Activation function = Sigmoid, learning rate = 0.01. (c) Activation function = Sigmoid, learning rate = 0.1. (d) Activation function = Tanh, learning rate = 0.001. (e) Activation function = Tanh, learning rate = 0.01. (f) Activation function = Tanh, learning rate = 0.1. (g) Activation function = Relu, learning rate = 0.001. (h) Activation function = Relu, learning rate = 0.01. (i) Activation function = Relu, learning rate = 0.1

3 Results and Discussion

3.1 Data Preprocessing

The data show that the factors affecting soil dispersion have different dimensions. A model trained on data with differing dimensions tends not to converge, so the data should be normalized. The min–max scaler function was used to normalize the input layer data to the range 0.01–0.99. The scaling principles are shown in formulas (4) and (5), where xstandard is the result of standard normalization, that is, the data normalized to [0,1]; xmax and xmin are the maximum and minimum values in the data, respectively; max and min are the maximum and minimum values of the normalization range; and xscaler is the result of normalizing the data to [0.01,0.99]. Results of data preprocessing are shown in Table 3. Due to limited space, only some data are listed.

$$x_{{{\text{standard}}}} = \frac{{x - x_{\min } }}{{x_{\max } - x_{\min } }}$$
(4)
$$x_{{{\text{scaler}}}} = x_{{{\text{standard}}}} * (\max - \min ) + \min .$$
(5)
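Formulas (4) and (5) can be written out as a short NumPy sketch. The function name min_max_scale and the example values are hypothetical and are not taken from Table 2; the scikit-learn MinMaxScaler with feature_range=(0.01, 0.99) performs the same column-wise mapping.

```python
import numpy as np

def min_max_scale(x, new_min=0.01, new_max=0.99):
    """Scale each column of x to [new_min, new_max] following formulas (4) and (5)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    x_standard = (x - x_min) / (x_max - x_min)         # formula (4): values in [0, 1]
    return x_standard * (new_max - new_min) + new_min  # formula (5)

# Hypothetical values of two indices for three samples (assumption).
raw = np.array([[20.0, 7.0],
                [30.0, 8.0],
                [40.0, 10.0]])
scaled = min_max_scale(raw)
print(scaled)
```

Each column is scaled independently, so the smallest value of every index maps to 0.01 and the largest to 0.99. A column whose values are all equal would lead to division by zero and would need separate handling.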
Table 3 Results of data preprocessing

3.2 Model Building and Discussion

The influence of the number of neurons in the hidden layer and the number of iterative steps on the dispersivity discriminant ANN model under different activation functions and learning rates is shown in Fig. 3. It can be concluded that gradient descent, and hence the search for the minimum of the loss function, is slow when the learning rate is small. Even when the iterative calculation runs for up to 10,000 steps, the accuracy of the ANN model with a small learning rate is still low. Therefore, a learning rate of 0.1 is suggested. As shown in Fig. 3c, e and i, a higher and more stable accuracy, ranging from 0.80 to 0.84, is obtained with fewer iterative steps when the activation function is relu. It is further suggested that the number of neurons in the hidden layer should be 29 and the number of iterative steps should be 2500. In summary, for the ANN discriminant model of dispersive soil, the suggested settings are the relu activation function, a learning rate of 0.1, 29 neurons in the hidden layer and 2500 iterative steps.

Fig. 3 Relationships between iterative steps, numbers of neurons in hidden layer and accuracy of dispersivity discriminant ANN models
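A comparison of this kind, over activation functions, learning rates and hidden-layer sizes, can be organized as a simple grid search. The sketch below is only an illustration under stated assumptions: it uses a NumPy network instead of TensorFlow, synthetic data with a synthetic labelling rule, a reduced grid (hidden sizes 5 and 29, 500 steps) and a hypothetical helper train_and_score.

```python
import itertools
import numpy as np

# name: (activation, derivative expressed through the activation output a)
ACTIVATIONS = {
    "sigmoid": (lambda v: 1.0 / (1.0 + np.exp(-v)), lambda a: a * (1.0 - a)),
    "tanh": (np.tanh, lambda a: 1.0 - a**2),
    "relu": (lambda v: np.maximum(v, 0.0), lambda a: (a > 0).astype(float)),
}

def train_and_score(X_tr, y_tr, X_te, y_te, activation, lr, hidden, steps, seed=0):
    """Train a one-hidden-layer softmax classifier; return its test accuracy."""
    rng = np.random.default_rng(seed)
    f, df = ACTIVATIONS[activation]
    W1, b1 = rng.normal(scale=0.1, size=(X_tr.shape[1], hidden)), np.zeros(hidden)
    W2, b2 = rng.normal(scale=0.1, size=(hidden, 3)), np.zeros(3)
    Y = np.eye(3)[y_tr]
    for _ in range(steps):
        Z = f(X_tr @ W1 + b1)
        logits = Z @ W2 + b2
        E = np.exp(logits - logits.max(axis=1, keepdims=True))
        S = E / E.sum(axis=1, keepdims=True)
        d_logits = (S - Y) / len(X_tr)      # softmax + cross-entropy gradient
        dZ = (d_logits @ W2.T) * df(Z)      # computed before W2 is updated
        W2 -= lr * (Z.T @ d_logits)
        b2 -= lr * d_logits.sum(axis=0)
        W1 -= lr * (X_tr.T @ dZ)
        b1 -= lr * dZ.sum(axis=0)
    pred = (f(X_te @ W1 + b1) @ W2 + b2).argmax(axis=1)
    return float(np.mean(pred == y_te))

rng = np.random.default_rng(1)
X = rng.uniform(0.01, 0.99, size=(80, 11))  # synthetic stand-in data (assumption)
y = np.argmax(X[:, :3], axis=1)             # synthetic labelling rule (assumption)
X_tr, y_tr, X_te, y_te = X[:56], y[:56], X[56:], y[56:]

grid = itertools.product(["sigmoid", "tanh", "relu"], [0.001, 0.01, 0.1], [5, 29], [500])
results = {
    (act, lr, hidden, steps): train_and_score(X_tr, y_tr, X_te, y_te, act, lr, hidden, steps)
    for act, lr, hidden, steps in grid
}
best = max(results, key=results.get)
print(best, results[best])
```

With real data, the grid would be extended to the hidden-layer sizes and iteration counts of interest, and each accuracy would ideally be averaged over several random splits, since a single 24-sample test set gives a coarse estimate.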

3.3 Distribution of Weights and Bias Values

The difficulty in an ANN model lies in changing the weights and biases iteratively so that the prediction result approaches the actual result as closely as possible. The histograms of the weights and biases in the hidden layer and the output layer, which reflect their statistical characteristics, are shown in Fig. 4. As shown in Fig. 4a, the weights in the hidden layer lie between −3.06 and 3.06, with the most common values being −0.106 and 0.528, and the weights in the output layer lie between −3.05 and 3.71, with the most common value being −0.717. As shown in Fig. 4b, the biases in the hidden layer lie between −0.655 and 1.31, with the most common value being 0.0898, and the biases in the output layer lie between −0.209 and 0.417, with the most common value being −0.209.

Fig. 4 Distribution of weights and bias values in different layers. (a) Weights histograms. (b) Bias histogram
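Summary statistics of this kind (range and most common value) can be read from a histogram of the parameter values. The following sketch shows one possible way to compute them with NumPy. The helper summarize, the number of bins and the random stand-in weights are assumptions, not the procedure or values behind Fig. 4.

```python
import numpy as np

def summarize(values, bins=30):
    """Return the minimum, the maximum and the centre of the most populated bin."""
    values = np.ravel(np.asarray(values, dtype=float))
    counts, edges = np.histogram(values, bins=bins)
    top = int(np.argmax(counts))                       # index of the tallest bin
    mode_centre = 0.5 * (edges[top] + edges[top + 1])
    return float(values.min()), float(values.max()), float(mode_centre)

rng = np.random.default_rng(0)
hidden_weights = rng.normal(size=(11, 29))  # hypothetical stand-in for trained weights
low, high, mode = summarize(hidden_weights)
print(low, high, mode)
```

The "most common value" obtained this way depends on the bin width, so it should be read as the centre of the most populated bin rather than as an exact value.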

3.4 Model Validation

To test the reliability of the established identification system, ten new data sets that were not used in constructing the neural network model were discriminated and classified; these data are given in Table 4. The test data were taken from the Nanyin project, Taoshan Reservoir in Heilongjiang Province and a dam in Northeast China. As shown in Table 5, the prediction results are in good agreement with the actual results, which verifies that the ANN model performs well in predicting the dispersibility of soil.

Table 4 New data for validation
Table 5 Validation results of the dispersivity discriminant ANN model

4 Conclusion

ANN is intelligent and computer-based. It does not require an empirical statistical relationship between discriminant parameters and prediction targets to be established on the basis of some theory, and it is very effective in predicting soil dispersion.

Python is a high-level language characterized by fast computing speed, a rich computing ecosystem and simple grammar. The TensorFlow library offers good visualization and makes the model calculation process easy to understand.

The combination of Python and ANN is suitable for the identification and simulation of highly complex nonlinear problems and represents one of the future trends in hydraulic engineering and civil engineering.