
1 Introduction

Nitrogen is a “life element” in crop growth and development. Rapid and accurate determination of nitrogen supports real-time nitrogen management and helps secure high crop yield and quality [1, 2]. Estimating crop nitrogen from spectral reflectance and spectral indices has long been an important topic in agricultural remote sensing and has been studied extensively [3,4,5]. At the jointing stage of wheat, artificial neural networks, support vector regression and random forests were used to build estimation models of nitrogen accumulation in wheat leaves [6]. Absorption features in the 590–756 nm, 1096–1295 nm and 1295–1642 nm ranges were characterized by their depth and area, and these characteristics gave a good inversion of leaf total nitrogen content [7]. Several spectral indices were built from multi-spectral data of a Gaofen (GF) satellite to estimate the nitrogen content of wheat leaves [8]. New red-edge double-peak parameters were constructed from ground spectra, and the nitrogen content of wheat leaves was estimated by regression analysis [9]. A red-edge index was used to estimate the nitrogen nutrition index [10]. Grey relational analysis was used to analyze the correlation of commonly used vegetation indices, and five well-performing indices were selected empirically as inputs to partial least squares regression (PLSR) [11, 12]. Manhattan and Euclidean distance measures were combined with a variety of neural networks into an optimal model for predicting leaf nitrogen content, and the reported prediction accuracy reached 92.1% [13].

However, most studies do not consider whether the model established between crop nitrogen content and vegetation indices is optimal. In this study, variable importance in projection (VIP) is used to rank the selected vegetation indices. Models are then built with a gradually increasing number of input variables, with indices added in order of importance. The model with the lowest Akaike's Information Criterion (AIC) value is selected as the optimal model and used to estimate the nitrogen content of winter wheat leaves. The purpose of this study is to establish a method and an optimal model for monitoring nitrogen content in wheat leaves.

2 Material and Method

2.1 Overview of the Study Area

The experiment was conducted at the wheat experimental base of the Beijing Academy of Agriculture and Forestry Sciences in 2013–2014. The base is located in Haidian District, Beijing (39° 56′ N, 116° 16′ E), as shown in Fig. 1. The district has four distinct seasons, with cold, dry winters and hot, rainy summers, and an average annual temperature of 12.5 ℃. Wheat was sown on September 29, 2013, and the soil was tested. Soil organic matter content was 1.54–2.16 g/kg, available potassium 77.54–216.27 mg/kg, nitrate nitrogen 6.99–21.40 mg/kg, available phosphorus 29.92–136.44 mg/kg, and total nitrogen 0.09–0.12 g/kg. The tested varieties were Jing 9428 and Nongda 211. Each plot measured 1.2 × 1.5 m and was sown with 33.75 g of seed in 6 rows, giving a planting density of 6 million/hm2. Each plot received 40.5 g of urea. The design comprised 2 varieties, 5 water treatments and 4 replicates. The water treatments ranged from no irrigation to full saturation: 0 m3/ha (W1), 225 m3/ha (W2), 450 m3/ha (W3), 675 m3/ha (W4) and 900 m3/ha (W5). The measurements were carried out on April 20–21, 2014.

Fig. 1. Beijing Academy of Agriculture and Forestry Sciences

2.2 Leaf Spectrum Determination

Leaf spectra were measured with a FieldSpec® Pro FR spectrometer (ASD Inc., Boulder, Colorado, USA) coupled with a hand-held leaf clip probe (ASD Leaf Clip). The wavelength range is 350–2500 nm, with a 1 nm interval. The leaf clip has a built-in quartz halogen lamp, which keeps the light source stable. The clip encloses the leaf blade in its chamber, so the detection area is the same for every measurement. This eliminates background reflection, spectral fluctuation caused by the curved surface of the blade, and internal variation within the leaf blade. Before each measurement, dust on the leaf surface was wiped off with a clean, damp cloth, and the instrument was calibrated against a standard white reference panel to reduce error.

2.3 Leaf Nitrogen Content Determination

There were 40 wheat plots. Five samples were taken per plot, and each plant was divided into 5 layers. The leaves of each layer were combined into one sample per layer, giving 199 samples in total. The samples were placed in paper bags, heated in an oven at 105 ℃ for 30 min, and then dried at 80 ℃ for more than 48 h until constant mass. Finally, the dried leaves were crushed, and their nitrogen content (N, %) was measured with a Kjeldahl nitrogen analyzer (Buchi B-339, Switzerland).

2.4 Method

2.4.1 Selection of Vegetation Index

Based on previous research results and on the significance of the relationship between each vegetation index and leaf nitrogen content, 19 vegetation indices related to leaf nitrogen content were selected.

2.4.2 Variable Importance Projection

Variable importance in projection (VIP) is a variable screening method based on partial least squares (PLS). It describes how well each independent variable explains the dependent variable and ranks the independent variables according to their explanatory power [26, 27]. The formula is

$$ V_{\text{VIP}_{j}} = \sqrt{\frac{k}{\sum\limits_{h = 1}^{m} r^{2}(y,c_{h})} \sum\limits_{h = 1}^{m} r^{2}(y,c_{h})\, w_{hj}^{2}} $$
(1)

where k is the number of vegetation indices; m is the number of components extracted from the original vegetation index variables; \( c_{h} \) is the h-th component extracted from the correlated vegetation index variables; \( r(y,c_{h} ) \) is the correlation coefficient between the dependent variable y (leaf nitrogen content, LNC, of winter wheat) and the component \( c_{h} \), which indicates how well the component explains y; and \( w_{hj} \) is the weight of the j-th vegetation index on the h-th component. The higher the VIP value, the more strongly the vegetation index explains LNC, and vice versa.
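As a minimal sketch of Eq. (1), the function below computes VIP values from correlations and weights that are assumed to be already available from a fitted PLS model. The function name `vip_scores` and the example numbers are hypothetical and are not taken from this study.

```python
import math

def vip_scores(r, weights):
    """Compute VIP values following Eq. (1).

    r       : list of m correlations r(y, c_h), one per component
    weights : m x k nested list, weights[h][j] = w_hj
    """
    m = len(r)
    k = len(weights[0])
    r2 = [value ** 2 for value in r]
    total = sum(r2)
    scores = []
    for j in range(k):
        weighted = sum(r2[h] * weights[h][j] ** 2 for h in range(m))
        scores.append(math.sqrt(k * weighted / total))
    return scores

# Hypothetical example: 2 components, 3 vegetation indices.
# Each weight row has unit length, as is usual for PLS weights.
r = [0.8, 0.3]
weights = [[0.6, 0.8, 0.0],
           [0.0, 0.6, 0.8]]
scores = vip_scores(r, weights)

# Rank the indices from most to least important.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

When every weight vector has unit length, the squared VIP values sum to k, so their mean square is 1. This is why a VIP value near or above 1 is commonly read as above-average importance.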

2.4.3 Akaike’s Information Criterion

Akaike’s Information Criterion (AIC) [28] measures the goodness of fit of a statistical model while penalizing its complexity, so that a lower value indicates a better trade-off between the two. AIC can be expressed as

$$ AIC = n\,\ln S_{p}^{2} + 2(k + 1) $$
(2)

where n is the number of samples, \( S_{p}^{2} \) is the residual mean square, and k is the number of variables in the model.
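The selection rule implied by Eq. (2) can be sketched as follows. The helper names and the candidate residual mean squares are hypothetical values chosen for illustration. Only the sample size of 199 comes from this study.

```python
import math

def aic(n, residual_mean_square, k):
    """AIC as in Eq. (2): n * ln(S_p^2) + 2 * (k + 1)."""
    return n * math.log(residual_mean_square) + 2 * (k + 1)

def select_by_aic(candidates, n):
    """Return the (k, residual_mean_square) pair with the smallest AIC."""
    return min(candidates, key=lambda c: aic(n, c[1], c[0]))

# Hypothetical candidates: (number of variables, residual mean square).
candidates = [(1, 0.150), (3, 0.120), (5, 0.110), (7, 0.1095)]
best = select_by_aic(candidates, n=199)
```

In this illustration the seven-variable model has a slightly smaller residual than the five-variable one, but the gain does not offset the penalty term 2(k + 1), so the five-variable model is preferred.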

2.4.4 Partial Least Squares

Partial least squares is a multivariate statistical method that models the regression of one or more dependent variables on multiple independent variables. The resulting regression model has the form

$$ {\text{Y}} = a_{0} + a_{1} X_{1} + a_{2} X_{2} + \cdots + a_{n} X_{n} $$
(3)

where \( a_{0} \) is the intercept, \( a_{i} \) are the regression coefficients, and \( X_{i} \) (i = 1, …, n) are the independent variables.
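To illustrate how a PLS fit leads to coefficients of the form in Eq. (3), the sketch below implements the simplest case: a single dependent variable and a single extracted component. This is an assumption made for brevity, since the number of components used in this study is not stated here. The function name `pls1_one_component` is hypothetical.

```python
import math

def pls1_one_component(X, y):
    """One-component PLS regression for a single response.

    X : list of rows, each a list of k predictor values
    y : list of responses
    Returns (intercept, coefficients) in the form of Eq. (3).
    """
    n, k = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(k)] for row in X]
    yc = [value - y_mean for value in y]

    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(k)]
    norm = math.sqrt(sum(value * value for value in w))
    w = [value / norm for value in w]

    # Component scores and the regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(k)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(value * value for value in t)

    coefficients = [q * value for value in w]
    intercept = y_mean - sum(coefficients[j] * x_mean[j] for j in range(k))
    return intercept, coefficients

# With a single predictor, the result coincides with ordinary least squares.
intercept, coefficients = pls1_one_component([[1], [2], [3], [4]], [3, 5, 7, 9])
```

Models with more components repeat the same steps on the residuals of X and y, which is omitted here.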

2.5 Accuracy Verification

In leave-one-out cross-validation (LOOCV), one sample at a time is held out for validation and all other samples are used for training. The model is fitted and the error on the held-out sample is calculated, and this is repeated until every sample has been used for validation once. LOOCV is the extreme case of K-fold cross-validation, in which every sample takes part in both modeling and validation. It therefore avoids having to choose how to split the data into modeling and validation samples in the experimental design, and it gives an effective assessment of the reliability and stability of the model.
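The LOOCV loop can be sketched as below. For brevity the model fitted in each round is a one-variable least-squares line, used here as a stand-in for the PLS model of this study. The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def loocv_predictions(xs, ys):
    """Predict each sample from a model trained on all other samples."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]   # leave sample i out
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predictions.append(intercept + slope * xs[i])
    return predictions

# Hypothetical, noise-free data following y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 * x + 1 for x in xs]
predictions = loocv_predictions(xs, ys)
```

The list of held-out predictions has one entry per sample and can be compared with the measured values using any accuracy index.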

The coefficient of determination (R2), root mean square error (RMSE) and relative error (RE) were selected as indices to evaluate modeling and validation accuracy. R2 indicates the degree of fit between predicted and measured values: the closer R2 is to 1, the more accurate the fitted curve. RMSE reflects the deviation between predicted and measured values: the smaller the value, the higher the model accuracy. RE mainly reflects the credibility of the estimates: the smaller the relative error, the more reliable the estimated values. The formulas are as follows.

$$ R^{2} = \frac{\left( \sum xy - \frac{\sum x \sum y}{q} \right)^{2}}{\left( \sum x^{2} - \frac{(\sum x)^{2}}{q} \right)\left( \sum y^{2} - \frac{(\sum y)^{2}}{q} \right)} $$
(4)
$$ RMSE = \sqrt{\frac{\sum\limits_{i = 1}^{q} (y_{i} - \hat{y}_{i})^{2}}{q}} $$
(5)
$$ RE = \frac{\left| \sum\limits_{i = 1}^{q} \hat{y}_{i} - \sum\limits_{i = 1}^{q} y_{i} \right|}{\sum\limits_{i = 1}^{q} y_{i}} \times 100\% $$
(6)

where x is the vegetation index, y is the measured crop LNC, \( \hat{y} \) is the predicted crop LNC, and q is the number of samples.
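Eqs. (4) to (6) translate directly into code. The sketch below uses hypothetical function names and made-up values. As an assumption for illustration, `r_squared` is called on predicted and measured values.

```python
import math

def r_squared(x, y):
    """Squared correlation between two sequences, as in Eq. (4)."""
    q = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = (sum_xy - sum_x * sum_y / q) ** 2
    denominator = (sum_xx - sum_x ** 2 / q) * (sum_yy - sum_y ** 2 / q)
    return numerator / denominator

def rmse(measured, predicted):
    """Root mean square error, as in Eq. (5)."""
    q = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / q)

def relative_error(measured, predicted):
    """Relative error in percent, as in Eq. (6)."""
    return abs(sum(predicted) - sum(measured)) / sum(measured) * 100

# Hypothetical values: one over-prediction and one under-prediction.
measured = [2.0, 3.0, 4.0]
predicted = [2.5, 3.0, 3.5]
r2_value = r_squared(predicted, measured)
rmse_value = rmse(measured, predicted)
re_value = relative_error(measured, predicted)
```

Because Eq. (6) compares the sums of predicted and measured values, over- and under-predictions cancel. In this example RE is 0 even though RMSE is not, so RE is best read together with RMSE.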

3 Results and Analysis

3.1 Analysis of VIP Between Vegetation Index and LNC of Winter Wheat

Nineteen hyperspectral indices (Table 1) were calculated from the hyperspectral data of wheat leaves. The relationship between these indices and the measured nitrogen content of wheat leaves was analyzed by VIP. Table 2 shows how well each vegetation index explains the nitrogen content of wheat leaves. At the flag leaf stage, the spectral index with the largest VIP value for leaf nitrogen content is PSSRc (VIP = 1.1944), and the index with the smallest VIP value is MCARI/MTVI (VIP = 0.8815). The top 10 spectral indices in descending order of VIP are PSSRc, GMI-2, SR705, RI-half, ZM, GMI-1, PSSRb, RI-3 dB, VOGc and CIred edge.

Table 1. Spectral indices for leaf nitrogen content
Table 2. VIP and rank of evaluation index for LNC of winter wheat (n = 199)

3.2 Selection of Optimal LNC Estimation Model for Wheat

Following the VIP ranking in Table 2, PLS models were built with different numbers of vegetation indices as independent variables, and the AIC value of each model was calculated. As shown in Table 3, the AIC value was smallest (−427.878) when 5 vegetation indices were used as independent variables. AIC favors models that fit the data well with as few parameters as possible. Accordingly, at the flag leaf stage of wheat, the PLS model built from 5 vegetation indices was taken as the optimal model. The model was \( V_{\text{LNC}} = 4.9467 + 0.5186\,V_{\text{PSSRc}} - 13.9993\,V_{\text{GMI-2}} + 43.7809\,V_{\text{SR705}} + 2.9433\,V_{\text{RI-half}} - 36.7703\,V_{\text{ZM}} \). The 5 vegetation indices were PSSRc, GMI-2, SR705, RI-half and ZM. The R2, RMSE and RE of the model were 0.79, 0.33 and 0, respectively.
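Applying the reported model to a new leaf spectrum amounts to evaluating the regression equation. The sketch below hard-codes the coefficients given above. The function name `predict_lnc` and the dictionary layout are hypothetical, and the five index values are assumed to have been computed beforehand from the leaf spectrum.

```python
# Coefficients of the five-index model reported in the text.
INTERCEPT = 4.9467
COEFFICIENTS = {
    "PSSRc": 0.5186,
    "GMI-2": -13.9993,
    "SR705": 43.7809,
    "RI-half": 2.9433,
    "ZM": -36.7703,
}

def predict_lnc(indices):
    """Estimate leaf nitrogen content (%) from the five vegetation indices.

    indices : dict mapping each index name in COEFFICIENTS to its value
    """
    return INTERCEPT + sum(
        COEFFICIENTS[name] * indices[name] for name in COEFFICIENTS
    )

# Sanity check: with all indices at zero, only the intercept remains.
baseline = predict_lnc({name: 0.0 for name in COEFFICIENTS})
```

The large coefficients of opposite sign on GMI-2, SR705 and ZM suggest that these indices are strongly correlated, so the prediction is sensitive to small errors in their values.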

Table 3. Comparison of PLS regression equations built by wheat LNC different independent variables according to VIP (flag leaf stage)

3.3 Verification of Optimal LNC Model for Winter Wheat

To verify the accuracy of VIP-PLS-AIC in predicting the nitrogen status of wheat leaves, the VIP-PLS-AIC model with 5 vegetation indices was validated against the wheat leaf nitrogen content data from the flag leaf stage in 2013–2014. The validation R2 and RMSE were 0.79 and 0.33, respectively. As shown in Fig. 2, the predicted and measured values in the scatter plot are highly consistent, which indicates that VIP-PLS-AIC estimates winter wheat leaf nitrogen status with high accuracy and good reliability.

Fig. 2. Comparison of predicted and measured wheat LNC based on VIP-PLS-AIC

4 Conclusion

The relationship between vegetation index and LNC was evaluated by VIP. The vegetation index with the highest VIP value was PSSRc, and the vegetation index with the lowest VIP value was MCARI/MTVI.

The number of vegetation indices used as input factors was increased step by step, and PLS regression was used to build seven wheat LNC regression models. The optimal LNC model was selected by the AIC criterion. The results show that the PLS model with 5 vegetation indices is the optimal model for estimating LNC.

For the wheat LNC model, the R2 of the VIP-PLS-AIC model was 0.79 and the RMSE was 0.33 for both the modeling set and the validation set. The modeling and validation results show that VIP ranks the importance of vegetation indices for wheat LNC well, and that an optimal wheat LNC estimation model can be obtained based on AIC.