Introduction

Several main factors affect soil CO2 flux, including soil organic matter content, soil type, tillage and management systems, and root respiration. The decomposition of soil organic matter releases CO2 (Kuzyakov 2002; Fender et al. 2013). Fertilization, especially N fertilization, accelerates CO2 flux through its effects on root development (Shao et al. 2013) and microbial activity (Yan et al. 2010; Fangueiro et al. 2008). Soil temperature and soil moisture affect soil CO2 flux because of their direct impact on microbial activity (Risk et al. 2002; Rustad et al. 2001). Soil respiration increases with increasing soil temperature (Kirschbaum 1995; William et al. 1994; Lou et al. 2003; Lu et al. 2008).

Various methods have been used to model CO2 flux from soil to the atmosphere, and studies in the literature have applied a range of techniques (Oprea and Iliadis 2011; Ibarra-Berastegi et al. 2008; Huebnerova and Michalek 2014). Among these techniques, multiple linear regression (MLR) and artificial neural networks (ANN) have been used most often (Huebnerova and Michalek 2014; Elangasinghe et al. 2014; Kurt and Oktay 2010; Banja et al. 2012).

ANN has been successfully used to model many complex systems (Droulia et al. 2009) and is an efficient method for modeling nonlinear systems. The method predicts output parameters from input parameters using different combinations of transfer and learning functions and different numbers of neurons (Franch and Panigrahi 1997). In addition, different neural network types have been used, such as the back-propagation neural network (Van Wijk and Bouten Verstraten 2002).

ANN is frequently applied in ecological modeling studies, such as temperature and rainfall prediction (Somaratne et al. 2005; Zhuang et al. 2012; Wen et al. 2014; Li et al. 2017). Papale and Valentini (2003), Hagen et al. (2006) and Song et al. (2014) also stated that the ANN model is a very appropriate method for efficiently predicting soil respiration. In many studies, the ANN method has been used successfully to model gas emission in forest soils. For example, Van Wijk and Bouten Verstraten (2002) and Papale and Valentini (2003) stated that CO2 emissions were successfully modeled in European forests using the ANN method.

MLR is another method that has been widely used in modeling studies (Hutchinson et al. 2000; Welles et al. 2001) and has been adopted in most approaches to CO2 modeling (Pedersen 2000). In studies using the MLR method, model performance has been evaluated using R2 values (Pedersen et al. 2001). The closer the R2 value is to 1, the more accurate the model is considered to be (Hutchinson and Livingston 2001).

The performance of such linear models is judged by R2, with values close to 1 indicating a good model. Bond-Lamberty and Thomson (2010) reported a linear model (R2 = 0.32) relating soil CO2 flux to soil temperature and moisture, with temperature and moisture as inputs and CO2 flux as the output. Similarly, Chen et al. (2013) obtained an R2 of 0.40 for a linear model between CO2 flux and soil temperature and moisture content.

Principal component analysis (PCA) condenses the input parameters into a smaller set called principal components (Johnson and Wichern 2002). In this study, MLR and ANN were employed to model CO2 flux from soil to the atmosphere. In addition, two hybrid models were formed: PCA + MLR and PCA + ANN. For the ANN, 36 different structures were used, with different transfer and learning functions and numbers of neurons. Manure norm, soil type, soil temperature, soil moisture content, soil depth and photosynthetically active radiation were the input parameters, and CO2 flux was the output parameter.

The amount of CO2 released from the soil to the atmosphere is directly affected by factors such as soil type, fertilizer norm and application method, soil temperature, soil moisture content and soil management practices. Continuous observation of CO2, one of the most effective greenhouse gases in the atmosphere, is very important for a sustainable environmental approach. From this point of view, modeling allows the CO2 emitted from the soil to the atmosphere to be monitored continuously. Artificial neural networks and hybrid models can capture the relationships among nonlinearly varying factors and model them with high accuracy.

The purpose of this research is to investigate the effects of different soil conditions on CO2 flux from soil to atmosphere and to determine the best CO2 flux model using artificial neural networks and hybrid models.

Materials and methods

Laboratory experiments

In this study, two soil types (normal and saline), two farmyard manure norms (2 and 4 t ha−1) and two manure application methods (surface and subsurface) were examined under laboratory conditions for modeling CO2 flux from soil to atmosphere.

Saline and normal soil samples were collected from the pasture east of Iğdır and the pasture west of Iğdır, Turkey, respectively. The pasture east of Iğdır has saline soil. Soils in this region have become saline as a result of poor field practices such as excess irrigation and conventional agriculture. The properties of the soils used in the laboratory experiments are given in Table 1.

Table 1 Properties of soil examples

The manure used in the experiments was applied by two methods: surface and subsurface. In the surface application, manure was spread homogeneously on the soil surface. In the subsurface application, manure was placed at 10 cm soil depth and then mixed with a paddle. The chemical content of the farmyard manure is given in Table 2.

Table 2 Chemical content of the farmyard manure

A flux-type temperature resistance was used in the laboratory experiments. The resistance was placed approximately 15 cm below the soil surface. The electronic control unit was used to prevent temperature fluctuations. The automated ACE and soil CO2 exchange system was used to determine the CO2 flux. The technical information of the CO2 exchange system is given in Table 3.

Table 3 Technical information of CO2 exchange system

Before the experiments started, the soil was saturated with water. After 2 days, the soil was heated from 20 to 50 °C in steps of 0.5 °C. An electronic temperature control unit (ECU) with a flexible temperature resistance was used for this purpose. After the maximum temperature was reached, the temperature resistance and ECU were deactivated until the soil temperature returned to 20 °C. This process lasted about 48 h for each factor.

The resistance equipped with an electronic control unit and the soil CO2 exchange system are shown in Fig. 1. Volumetric soil moisture (%) and temperature (°C) were measured simultaneously by the sensors of the automated ACE and soil CO2 exchange system.

Fig. 1 CO2 flux, temperature resistance and electronic control unit

Dataset for CO2 flux modeling

In this research, 27,713 data points (7 parameters × 3959 observations) were used for the CO2 flux prediction model. These data were recorded by the automated ACE and soil CO2 exchange system over the 48 h for all of the factors.

The modeling with multiple linear regression

The MLR model and its architecture are given in Eq. 1 and Fig. 2, respectively. In the equation, Y is the model’s predicted value, xi (i = 1, …, n) are the input parameters, and ai (i = 0, …, n) are the regression coefficients.

$$Y = a_{0} + a_{1} x_{1} + a_{2} x_{2} + \cdots + a_{n} x_{n}$$
(1)

The MATLAB software was used for the MLR model. The input and output parameters for this model are given in Table 4.
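As a rough illustration of how the coefficients in Eq. 1 can be estimated, the following Python sketch fits an MLR model by ordinary least squares. The study itself used MATLAB; the function names `fit_mlr` and `predict_mlr` and the small synthetic dataset are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares estimate of [a0, a1, ..., an] in Y = a0 + a1*x1 + ... + an*xn."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept a0.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Evaluate Eq. 1 for each row of X."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

# Synthetic placeholder data (NOT measurements from the study):
# the response is built exactly as y = 1 + 2*x1 - 3*x2.
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1]
coef_demo = fit_mlr(X_demo, y_demo)
```

With the six inputs of Table 4 as the columns of `X` and the measured CO2 flux as `y`, the same call would return seven coefficients, one intercept and one per input.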

Fig. 2 Model architecture of MLR

Table 4 The input and output parameters

The modeling with principal component analyses

Principal component analysis (PCA) was used to reduce the number of input parameters. The new input parameters are called principal components (PCs) and are defined by eigenvectors. MathWorks MATLAB was used to construct the principal components. MATLAB’s pca function uses the singular value decomposition (SVD) algorithm by default and returns the percentage of the total variance explained by each principal component. In general, the smallest number of components explaining 80–99% of the total variance is chosen, in line with common PCA practice.
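The component-selection rule described above can be sketched in Python as follows. This is a minimal, assumption-laden illustration and not the MATLAB code used in the study: `pca_svd` and `n_components_for` are hypothetical helper names, the data are centered but not scaled, and the toy matrix is constructed for the example.

```python
import numpy as np

def pca_svd(X):
    """PCA through the SVD of the column-centered data matrix.

    Returns the component directions (rows of Vt), the scores, and the
    percentage of total variance explained by each component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (len(X) - 1)              # variance along each component
    explained = 100.0 * variances / variances.sum()
    scores = Xc @ Vt.T                           # data projected onto the components
    return Vt, scores, explained

def n_components_for(explained, threshold=80.0):
    """Smallest number of leading components whose cumulative share reaches threshold (%)."""
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))

# Toy data (not from the study): one high-variance and one low-variance column.
X_demo = np.column_stack([
    10.0 * np.array([-1.0, 1.0, -1.0, 1.0]),
    np.array([-1.0, -1.0, 1.0, 1.0]),
])
_, scores_demo, explained_demo = pca_svd(X_demo)
k_demo = n_components_for(explained_demo, threshold=95.0)
```

Because the variance shares depend on the units of each column, standardizing the inputs before PCA is a common choice; whether the study did so is not stated, so the sketch leaves the data unscaled.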

The modeling with artificial neural network (ANN)

The other model used in the research is the artificial neural network (ANN). Artificial neural networks are frequently used in modeling studies where the variables are nonlinearly related. In this method, models are built with appropriate transfer and activation functions, numbers of neurons and learning algorithms, chosen according to the structure of the problem (Gardner and Dorling 1998). In this research, combinations of two learning functions, three transfer functions and three different neuron numbers were used in the ANN structures to model CO2 flux from soil to air (Table 5). The artificial neural network architecture is given in Fig. 3.
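To make the role of the transfer functions and neuron numbers concrete, the sketch below computes the forward pass of a network with one hidden layer. It is an illustration under stated assumptions, not the study's implementation: the single-hidden-layer layout, the `forward` helper and the random, untrained weights are hypothetical, and the function names `tansig`, `logsig` and `purelin` follow the MATLAB conventions. Training (for example with the Levenberg–Marquardt algorithm) is not shown.

```python
import numpy as np

def tansig(z):
    """Hyperbolic tangent sigmoid; output lies in (-1, 1)."""
    return np.tanh(z)

def logsig(z):
    """Logistic sigmoid; output lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def purelin(z):
    """Linear (identity) transfer function."""
    return z

def forward(X, W1, b1, W2, b2, hidden=tansig, output=purelin):
    """Forward pass: inputs -> hidden layer -> single output."""
    H = hidden(X @ W1 + b1)        # hidden-layer activations
    return output(H @ W2 + b2)     # network prediction

# Hypothetical dimensions: 6 inputs and 30 hidden neurons.
rng = np.random.default_rng(0)
n_inputs, n_hidden = 6, 30
W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))   # untrained weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.array([0.5])

X_demo = np.zeros((4, n_inputs))          # placeholder inputs, not study data
y_demo = forward(X_demo, W1, b1, W2, b2)  # one prediction per row
```

Swapping `hidden` and `output` among the three functions, and changing `n_hidden`, gives the kind of structural variants that a grid such as Table 5 enumerates.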

Table 5 Functions and neurons numbers used in the ANN

Fig. 3 Artificial neural network architecture

Principal component analysis with multiple linear regression

In this method, the PCs obtained from the principal component analysis were used as input parameters to the MLR model for modeling CO2 emission (Fig. 4).
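The hybrid idea can be sketched as a two-step pipeline: project the centered inputs onto the leading PCs, then regress the output on those scores. The Python code below is a minimal illustration, not the study's MATLAB workflow; `fit_pca_mlr`, `predict_pca_mlr` and the toy data are hypothetical.

```python
import numpy as np

def fit_pca_mlr(X, y, n_components=2):
    """Fit an MLR model on the first n_components principal-component scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered inputs.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]
    scores = (X - mean) @ components.T
    # Least-squares regression of y on the scores, with an intercept.
    A = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return mean, components, coef

def predict_pca_mlr(model, X):
    """Project new inputs onto the stored PCs and apply the regression."""
    mean, components, coef = model
    scores = (np.asarray(X, dtype=float) - mean) @ components.T
    return coef[0] + scores @ coef[1:]

# Toy data (not from the study): y is an exact linear function of three inputs.
X_demo = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1] + 0.5 * X_demo[:, 2]

full_model = fit_pca_mlr(X_demo, y_demo, n_components=3)      # keeps all components
reduced_model = fit_pca_mlr(X_demo, y_demo, n_components=2)   # drops the last one
```

Keeping all components reproduces the ordinary MLR fit, while dropping components discards whatever part of the output was explained along the dropped directions.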

Fig. 4 The architecture of principal component analysis with multiple linear regression

Principal component analysis with artificial neural network

As in the PCA + MLR method, the PCs were used as input parameters in this method. The same transfer and learning functions and neuron numbers used in the ANN method were used together with the PCs for modeling CO2 emission (Table 6). Figure 5 illustrates the architecture of principal component analysis with artificial neural network.

Table 6 Functions and neurons numbers used in the PCA + ANN

Fig. 5 The architecture of principal component analysis with artificial neural network

Statistical analysis for the dataset

Analysis of variance (ANOVA) was used to assess the significance of each treatment on soil properties and CO2 fluxes. When the F test for a treatment was significant at the 5% level, means were compared using Duncan’s multiple range test.

Performance evaluation for hybrid models

The accuracy of the models was assessed using the root mean-square error (RMSE, also known as root mean-square deviation), the mean absolute error (MAE) and the coefficient of determination (R2). A model is considered highly accurate when R2 approaches 1 and RMSE and MAE approach zero.

$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {Y_{pi} - Y_{di} } \right)^{2} }$$
(2)
$${\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {Y_{pi} - Y_{di} } \right|$$
(3)
$$R^{2} = 1 - \left( {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {Y_{pi} - Y_{di} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {Y_{di} - \bar{Y}} \right)^{2} }}} \right)$$
(4)

In these equations, n is the number of observations, Ypi is the predicted value for observation i, Ydi is the real value for observation i, and Ȳ is the average of the real values.
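The three measures can be computed directly from paired predicted and real values, as in the Python sketch below. The function names and the four-point example are illustrative only; R2 is computed with the conventional total sum of squares taken around the mean of the real values.

```python
import numpy as np

def rmse(y_pred, y_real):
    """Root mean-square error between predicted and real values."""
    y_pred, y_real = np.asarray(y_pred, float), np.asarray(y_real, float)
    return float(np.sqrt(np.mean((y_pred - y_real) ** 2)))

def mae(y_pred, y_real):
    """Mean absolute error between predicted and real values."""
    y_pred, y_real = np.asarray(y_pred, float), np.asarray(y_real, float)
    return float(np.mean(np.abs(y_pred - y_real)))

def r_squared(y_pred, y_real):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_pred, y_real = np.asarray(y_pred, float), np.asarray(y_real, float)
    ss_res = np.sum((y_pred - y_real) ** 2)
    ss_tot = np.sum((y_real - y_real.mean()) ** 2)   # spread of the real values
    return float(1.0 - ss_res / ss_tot)

# Illustrative numbers (not from the study): only the last prediction is off, by 2.
y_real_demo = [1.0, 2.0, 3.0, 4.0]
y_pred_demo = [1.0, 2.0, 3.0, 6.0]
print(rmse(y_pred_demo, y_real_demo))       # 1.0
print(mae(y_pred_demo, y_real_demo))        # 0.5
print(r_squared(y_pred_demo, y_real_demo))  # approximately 0.2
```

In this example the squared errors sum to 4 and the real values have a total sum of squares of 5, so R2 = 1 − 4/5.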

Results and discussion

The results of statistical analyses for the dataset

Soil type, farmyard manure norm, manure application technique and soil temperature each had a statistically highly significant effect on soil CO2 flux (p < 0.001), but this was not the case for the interaction terms (Table 7).

Table 7 The results of variance analysis for the dataset

At the initial temperature range (20–25 °C), the CO2 flux was 1.173 µmol g cm−3, and it rose gradually with increasing soil temperature. When the soil temperature reached its maximum level (45–50 °C), the CO2 flux from soil to atmosphere was 6.62 µmol g cm−3. The CO2 flux under subsurface manure application was approximately 50% higher than under surface application. The CO2 flux also increased with increasing manure norm: it was 2.754 and 3.975 µmol g cm−3 for the 2 and 4 t ha−1 manure norms, respectively. With respect to soil type, the highest CO2 flux was observed in the normal soil (3.758 µmol g cm−3) and the lowest in the saline soil (2.971 µmol g cm−3).

The results of multiple linear regression (MLR) modeling

First, a multiple linear regression model was used to estimate the CO2 flux. Soil temperature (St), soil moisture content (Sm), soil type (Sty), fertilizer norm (Fn), soil depth (Sd) and photosynthetically active radiation (PAR) were used as input parameters for predicting CO2 flux. Table 8 gives the statistical results of the MLR. As Table 8 shows, the R2 and P values are 0.681 and 0.000, respectively. The equation of the MLR model and the predicted and observed values are given in Eq. 5 and Fig. 6, respectively.

$${\text{CO}}_{2} \,{\text{flux}} = - \,8.60 - 1.40x_{1} + 1.1 x_{2} + 0.24x_{3} + 0.03x_{4} + 0.22x_{5} 9.045x_{6}$$
(5)

In this equation, x1: St, x2: Sm, x3: Sty, x4: Fn, x5: Sd, x6: PAR.

Table 8 The statistical results for MLR analysis

Fig. 6 Observed and predicted CO2 flux in the multiple linear regression model

The results of the principal component analysis (PCA) modeling

The principal component analysis (PCA) results showed that the first two principal components, PC1 and PC2, explained 82.39% and 15.78% of the variance for all areas, respectively, and together accounted for 98.17% of the variance (Table 9). A comparable result was reported by Panosso et al. (2011) for CO2 fluxes, where the PCs together explained 70% of the variability of soil attributes (physical and chemical), with PC1 explaining 52% and PC2 18%.

Table 9 The eigenvalues of principal components analyses

The result of multiple linear regression with principal component analyses (PCs + MLR) hybrid modeling

In this method, the PCs (PC1 and PC2) were used as input parameters to predict CO2 flux. The results of the statistical analyses and the equation of this model are given in Table 10 and Eq. 6, respectively. In this equation, x1 and x2 denote PC1 and PC2, respectively. The observed and predicted values are shown in Fig. 7.

$${\text{CO}}_{2} \,{\text{flux}} = - \,4.8 + 0.21x_{1} + 0.05x_{2}$$
(6)

The R2 value was calculated as 0.432, which is smaller than the R2 of the MLR model. The MLR model used six input parameters (soil temperature, soil moisture content, soil type, fertilizer norm, soil depth and photosynthetically active radiation) to predict the CO2 flux, whereas this method used only two (PC1 and PC2). This result suggests that CO2 emission is modeled better as the number of input parameters increases.
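One general reason why a regression on a few PCs can fit worse than a regression on all inputs is that PCA ranks directions by input variance, not by their relevance to the output. The Python sketch below constructs a deliberately extreme toy case to show this effect. It is not an analysis of the study's data and does not establish that this mechanism caused the result above; all names and numbers are hypothetical.

```python
import numpy as np

def r_squared(y_real, y_pred):
    """Coefficient of determination of predictions against real values."""
    ss_res = np.sum((y_real - y_pred) ** 2)
    ss_tot = np.sum((y_real - y_real.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def ols_fitted(A, y):
    """Fitted values of a least-squares regression of y on A (with intercept)."""
    A1 = np.column_stack([np.ones(len(A)), A])
    coef, *_ = np.linalg.lstsq(A1, y, rcond=None)
    return A1 @ coef

# Constructed toy inputs: x1 has a large spread, x2 a small one, and they are uncorrelated.
x1 = 10.0 * np.array([-1.0, 1.0, -1.0, 1.0])
x2 = np.array([-1.0, -1.0, 1.0, 1.0])
X = np.column_stack([x1, x2])
y = x2.copy()                      # the output depends only on the low-variance input

Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = 100.0 * s**2 / np.sum(s**2)   # percent of input variance per PC
pc1 = Xc @ Vt[:1].T                       # scores on the first PC only

r2_full = r_squared(y, ols_fitted(X, y))     # regression on both original inputs
r2_pc1 = r_squared(y, ols_fitted(pc1, y))    # regression on PC1 alone
```

Here PC1 carries about 99% of the input variance, yet the regression on PC1 explains essentially none of the output, while the regression on both inputs fits it exactly.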

Table 10 The statistical results of PCs + MLR

Fig. 7 Predicted and observed values of the PCs and MLR

The results of the artificial neural network (ANN) modeling

For the ANN, 36 different neural structures were used, with different learning and transfer functions and different neuron numbers. The statistical results of these structures are given in Table 11. Among these structures, the best results were obtained from the ANN18 structure. This network used the Levenberg–Marquardt (Trainlm) learning function and the Tansig–Purelin transfer functions with 30 neurons.

Table 11 The statistical results of the ANN model

The ANN18 structure gave the highest R2 and the lowest MAE values, 0.983 and 0.024, respectively. The R values for test and validation were above 0.99 (Fig. 8). Figure 9 illustrates the observed and predicted values of the ANN18 structure.

Fig. 8 Performance evaluation of the ANN18 structure for ANN model

Fig. 9 Predicted and observed CO2 flux in the ANN18 structure for ANN model

The results of the artificial neural network with principal components analysis (PCs and ANN) hybrid modeling

In this model, the PCs were used as input parameters, and the 36 different ANN structures were examined for the CO2 flux. Table 12 gives the statistical results of the PCs and ANN model. The best predictions were obtained from the ANN16 structure, with R2 and MAE values of 0.756 and 0.051, respectively. As Table 12 shows, the ANN16 structure used the Levenberg–Marquardt (Trainlm) learning function and the Logsig–Tansig transfer functions with 30 neurons. The R values for training and validation were 0.86 and 0.87, respectively (Fig. 10).

Table 12 The statistical results of the PCs and ANN model

Fig. 10 Performance evaluation of the ANN16 structure for PCs and ANN model

The R2 value of the PCs and ANN model was smaller than that of the ANN model. This result may be caused by the difference in input parameters: the PCA-based model has two inputs (PC1 and PC2), whereas the ANN model has six (soil temperature, soil moisture content, soil type, fertilizer norm, soil depth and photosynthetically active radiation), and this affected model performance. A similar pattern was observed between the MLR and PCs + MLR models. Predicted and observed values of the CO2 flux for the PCs and ANN model are given in Fig. 11.

Fig. 11 Predicted and observed CO2 flux in the ANN16 structure for PCs and ANN model

Conclusion

Among the methods used to model CO2 flux, the ANN gave the best results. Reducing the six parameters to two principal components with PCA is useful for visualizing the data in a 2D plot. However, the PCA + MLR and PCA + ANN combinations performed worse than the MLR and ANN methods used on their own. Because the PCs captured over 98% of the variance, the MLR and ANN results would be expected to be close to the PCA + MLR and PCA + ANN results, respectively. In terms of R2 alone, however, MLR (0.681) differs from PCA + MLR (0.432), and ANN (0.983) from PCA + ANN (0.756). A difference of this size may be caused by the nonlinearity of our parameters. Further research may work on simultaneous-, progressive-, successive-, prioritized- (Liu and Chang 2007), independent-, sparse-, sparse independent- (Lee et al. 2016), parallel-, kernel- (Jiang and Yan 2018), local- or constrained (Aversano et al. 2019) PCA. The ANN produced better models than MLR. Thus, CO2 flux appears to depend nonlinearly on the input parameters (manure norm, soil type, soil temperature, soil moisture content, soil depth, photosynthetically active radiation and possibly others). Nonlinear regression methods should be used instead of linear models.