Introduction

Nitrogen is one of the most critical nutrients for rice growth and accounts for the largest share of variable fertilizer input costs. Appropriate nutrient management can improve overall marketable rice yields. It is therefore necessary to monitor nitrogen status during the growing season so that fertilizers can be applied site-specifically. The spatial and temporal variations of nitrogen in rice fields need to be determined so that the supply corresponds as closely as possible to the requirements of the rice. Whether this is possible depends on the techniques available for evaluating nitrogen status.

Several different methods are available for assessing the nitrogen status of a crop. One is tissue and chemical analysis, such as the Kjeldahl nitrogen determination method. It is a direct and accurate way of detecting crop nitrogen status, but it is time-consuming and requires trained operators. Researchers have also approached nitrogen status estimation through remote sensing. Remote sensing with aerial images has been used to assess nitrogen over entire fields, but the accuracy of this technique appeared to be easily affected by low resolution and obvious soil background noise (Broge and Leblanc 2001). Noh et al. (2006) adopted multi-spectral images to increase the sensitivity and improve the testing precision of nitrogen status in the crop. Visible and near-infrared spectroscopy was also used for non-destructive detection of nitrogen in Chinese cabbage leaves, where stepwise multiple linear regression (SMLR) showed the highest determination coefficient (r²) of 0.846 (Min et al. 2006). Feng et al. (2008) established a quantitative model for real-time monitoring of leaf N status in wheat with key hyperspectral bands and estimation indices. Stroppiana et al. (2009) used radiospectrometry for crop condition monitoring, specifically for nitrogen and chlorophyll assessment. Yang et al. (2009) compared a radial basis function neural network and a regression model for estimating rice biophysical parameters by remote sensing. Because such spectral techniques produce very large datasets, multivariate calibration methods are crucial for extracting the relevant part of the information. Several common methods, including principal component regression, SMLR, and partial least squares (PLS) regression, have been developed and used for data mining in agricultural applications and are regarded as procedures for quantitative analysis using numerous correlated variables (Martens and Naes 1989; Cen et al. 2006; Liu et al. 2009a).
PLS is the most commonly used multivariate calibration method and is widely applied in the assessment of agricultural product quality and plant nutrition (Ehsani et al. 1999; Fassio and Cozzolino 2003). It is built on a linear model and assumes a linear relationship between the spectra and the properties of the samples. However, PLS is not appropriate when a nonlinear model is required (Zhang et al. 2008). Factors such as experimental conditions, instrument variation, and the characteristics being analyzed can induce nonlinearities in the spectra. An artificial neural network (ANN) is a common choice for solving this problem (Despagne and Massart 1998; Mello et al. 1999). Although this method is frequently used for nonlinear modeling, it presents difficulties such as the selection of hidden layer size, learning rate, and momentum. In addition, an ANN model requires a large amount of training data, which slows training. Overfitting is another difficulty that must be overcome to obtain good results (Moody 1992).

Recently, a promising method called the support vector machine (SVM) was proposed by Vapnik (1998a). It is becoming popular because of its many attractive features and excellent performance in a wide range of applications (Burges 1998; Vapnik 1998b; Guo et al. 2001; Comak et al. 2007; Li and He 2009). SVM has a good theoretical foundation in statistical learning theory. It embodies the structural risk minimization principle instead of the traditional empirical risk minimization (ERM) principle employed by conventional neural networks, which helps to avoid overfitting. SVM was developed as a binary classification tool but can easily be extended to regression tasks (Cogdill and Dardenne 2004). Least squares SVM (LS-SVM) is a modified version of SVM (Suykens et al. 2002) that applies a least squares error in the training error function. LS-SVM is capable of linear and nonlinear multivariate calibration and solves multivariate calibration problems relatively quickly (Suykens and Vanderwalle 1999). In standard SVM, the learning problem is formulated as a convex quadratic programming problem (Lu et al. 2003) to obtain the support vectors; LS-SVM instead adopts a least squares loss function, which leads to a linear system, and is applied in pattern recognition and nonlinear estimation. Owing to these advantages, LS-SVM has attracted attention and has been extensively applied in spectral analysis (Chauchard et al. 2004; Borin et al. 2006).

The aim of this work was to investigate the potential of reflectance spectroscopy combined with LS-SVM for nondestructive detection of nitrogen status in rice at three important growth stages. The calibration model derived from LS-SVM was compared with back-propagation ANN (BPNN) and PLS models according to the statistical parameters of the prediction results. Sensitive wavelengths (SWs) were extracted by independent component analysis (ICA) and used to build a SW-LS-SVM model for predicting the soil plant analysis development (SPAD) value in the booting stage.

Materials and Methods

Experimental Design

Each basin had an inner caliber of 30 cm, a height of 45 cm, and a soil weight of 10 kg. The basins were placed in the slotted field. To ensure that the nitrogen fertilization levels differed markedly, the soil in the basins was mixed soil taken from 20–40 cm below the surface of the experimental field in the Science Garden. Eight basins were set for each nitrogen fertility gradient (four gradients, N0, N1, N2, and N3, with two repetitions). The soil properties were organic matter 2.14%, N 121 ppm, P 126 ppm, K 175 ppm, and a pH of 7.30.

Field Data Acquisition

A total of 64 samples were obtained for spectral measurement, with one basin containing two rice plants counted as one sample. The measurements were made at the tillering (June 15), booting (July 2), and heading (July 22) stages. All rice canopy reflectance measurements were made using a portable spectroradiometer, the Field Spec Vis/NIR (325–1075 nm, Analytical Spectral Devices, Boulder, USA). The instrument uses a sensitive 512-element photodiode array with a resolution of 3.5 nm. The scan number for each spectrum was set to ten at the same position, and three reflectance spectra were taken for each sample; thus, a total of 30 individual scans per sample were stored for later analysis. Considering its 25° field of view, the spectroradiometer was placed 70 cm above the top of the rice canopy. To obtain relative reflectance measurements, a white reference (a white panel purchased with the spectroradiometer) was collected before scanning the samples until a clean 100% reference line was obtained. All samples were randomly divided into a calibration set of 48 samples and a prediction set of 16 samples. In order to compare the performance of the different calibration models, the samples in the calibration and prediction sets were kept unchanged for all calibration models.
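
As a rough check on this geometry, the diameter d of the canopy area viewed from height h with a full view angle θ can be worked out as below. This assumes a circular field of view and a sensor pointing straight down, neither of which is stated above, so the figure is indicative only.

$$ d = 2h\tan\left(\frac{\theta}{2}\right) = 2 \times 70\ \mathrm{cm} \times \tan\left(12.5^{\circ}\right) \approx 31\ \mathrm{cm} $$

Under these assumptions, the viewed area is close to the 30 cm inner caliber of a basin.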

A SPAD-502 chlorophyll meter (Minolta, Osaka, Japan) was used to measure the chlorophyll concentration of the rice. The SPAD reading of each sample was taken 30 times on the canopy leaves, and the average was used as the final value for that sample and as the reference for its nitrogen status.

Data Pretreatment

Due to potential system imperfections, obvious scattering noise was observed at the beginning and end of the spectral data. Thus, the first and last 75 wavelength data points were eliminated to improve the measurement accuracy, i.e., all visible and NIR spectroscopy analysis was based on 400–1,000 nm. This step was carried out in ViewSpec Pro V4.02 (Analytical Spectral Devices, Inc.). The spectral data were then preprocessed by Savitzky–Golay smoothing with a window width of 7 (3-1-3) points, followed by multiplicative scatter correction (Helland et al. 1995). The pretreatments were implemented in “The Unscrambler V 9.6” (Camo Process AS, Oslo, Norway).
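
The two pretreatment steps can be illustrated with a minimal NumPy sketch. It is not the Unscrambler implementation: the polynomial order of the smoothing filter is not given in the text, so the standard 7-point quadratic weights are assumed, and the edge handling (reflected padding), the use of the mean spectrum as the scatter-correction reference, and the function names `savgol7` and `msc` are illustrative choices.

```python
import numpy as np

# Standard 7-point Savitzky-Golay smoothing weights for a quadratic fit
# (the polynomial order is an assumption; the text gives only the window width).
SG7 = np.array([-2, 3, 6, 7, 6, 3, -2], dtype=float) / 21.0


def savgol7(spectra):
    """Smooth each row (one spectrum per row) with a 7-point window.

    Edges are handled by reflected padding, which is an illustrative choice.
    """
    padded = np.pad(spectra, ((0, 0), (3, 3)), mode="reflect")
    return np.array([np.convolve(row, SG7, mode="valid") for row in padded])


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum is regressed on the reference (row = slope * ref + intercept)
    and then corrected as (row - intercept) / slope.
    """
    reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(reference, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected
```

Applied in the order described above, a spectra matrix `X` (samples in rows) would be pretreated as `msc(savgol7(X))`.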

Partial Least Squares

PLS is a bilinear modeling method in which the original independent information (X-data) is projected onto a small number of latent variables (LVs) to simplify the relationship between X and Y, so that Y can be predicted with the smallest number of LVs.

The first step in PLS is to decompose the matrix, and the model is

$$ {\mathbf{X}} = {\mathbf{TP}} + {\mathbf{E}} $$
(1)
$$ {\mathbf{Y}} = {\mathbf{UQ}} + {\mathbf{F}} $$
(2)

In these equations, X contains the spectral variables, and Y contains the corresponding concentration values (SPAD values). T and U are the score matrices of X and Y, P and Q are the loading matrices of X and Y, and E and F are the residual errors of the PLS regression.

In the development of PLS model, calibration models were built between the spectral data and the SPAD values; full cross-validation was used to evaluate the quality and to prevent overfitting of calibration models. The optimal number of latent variables (LVs) was determined by the lowest value of predicted residual error sum of squares (PRESS). LVs can eliminate noises and random errors in the original data and account as much as possible for the variability of the original variables.
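
The LV selection described above can be sketched as follows. This is a minimal illustration rather than the Unscrambler procedure: it assumes a single response (PLS1) fitted with the NIPALS algorithm, and the function names `pls1_fit`, `pls1_predict`, and `press_loo` are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model with NIPALS.

    Returns (regression coefficients, column means of X, mean of y).
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean          # centred data, deflated in the loop
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f                        # weight vector
        w /= np.linalg.norm(w)
        t = E @ w                          # X scores
        tt = t @ t
        p = E.T @ t / tt                   # X loadings
        c = f @ t / tt                     # y loading
        E = E - np.outer(t, p)             # deflate X
        f = f - c * t                      # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def pls1_predict(model, X):
    """Predict the response for the rows of X."""
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean


def press_loo(X, y, n_lv):
    """Predicted residual error sum of squares under leave-one-out."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_lv)
        press += (y[i] - pls1_predict(model, X[i:i + 1])[0]) ** 2
    return press
```

With these helpers, the number of LVs would be chosen as the value of `n_lv` that gives the lowest `press_loo(X, y, n_lv)` over a candidate range.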

The prediction performance was evaluated by the residual predictive deviation (RPD) (Arana et al. 2005), the correlation coefficient (r), the root mean square error of calibration (RMSEC), validation (RMSEV), or prediction (RMSEP), and the bias. Leave-one-out cross-validation was used in this study. An ideal model should have a high RPD and r value, low RMSEC, RMSEV, and RMSEP values, and a low bias. The models were built in “The Unscrambler V 9.6” (Camo Process AS, Oslo, Norway).
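
These statistics are not defined explicitly in the text, so the sketch below uses common definitions as assumptions: bias as the mean of predicted minus measured values, and RPD as the sample standard deviation of the measured values divided by the root mean square error. The function name `prediction_metrics` is hypothetical.

```python
import numpy as np


def prediction_metrics(measured, predicted):
    """Return r, RMSE, bias, and RPD for paired measured/predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = predicted - measured
    rmse = np.sqrt(np.mean(residual ** 2))
    return {
        "r": np.corrcoef(measured, predicted)[0, 1],
        "rmse": rmse,
        "bias": residual.mean(),               # assumed sign: predicted - measured
        "rpd": measured.std(ddof=1) / rmse,    # assumed definition: SD / RMSE
    }
```

The same function gives RMSEC, RMSEV, or RMSEP depending on whether it is applied to the calibration, cross-validation, or prediction set.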

Back-Propagation Artificial Neural Network Model

The most popular type of artificial neural network in analytical applications is the BPNN, a one-way multilayer feed-forward network (Widyanto et al. 2005).

The size of the topology, including the numbers of input, hidden, and output neurons, influences the complexity of the BPNN. Reducing the number of inputs reduces the training time of the network and also reduces the repetition and redundancy of the input spectral data. PLS is a data reduction method that constructs new uncorrelated variables, known as LVs. The LVs account for as much as possible of the variability of the original variables and can be used as the inputs of the neural network.

Least Squares Support Vector Machine

LS-SVM can work with linear or non-linear regression or multivariate function estimation in a relatively fast way (Suykens and Vanderwalle 1999; Chen et al. 2007). It uses a linear set of equations instead of a quadratic programming problem to obtain the support vectors (SVs). The details of LS-SVM algorithm could be found in the literature (Suykens et al. 2002).
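
The linear set of equations mentioned above can be illustrated with a short NumPy sketch of LS-SVM regression in its usual dual form. It is not the LS-SVMlab code: the RBF kernel convention exp(−‖x − z‖²/σ²) is an assumption (toolbox versions may scale σ² differently), and the function names are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, sig2):
    """RBF kernel matrix; the scaling exp(-d^2 / sig2) is an assumed convention."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sig2)


def lssvm_fit(X, y, gam, sig2):
    """Solve the LS-SVM linear system for the bias b and the weights alpha.

    [ 0   1^T         ] [ b     ]   [ 0 ]
    [ 1   K + I / gam ] [ alpha ] = [ y ]
    """
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sig2) + np.eye(n) / gam
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]


def lssvm_predict(X_train, b, alpha, sig2, X_new):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sig2) @ alpha + b
```

Because every training sample receives a nonzero weight in general, the solution is obtained from a single linear solve rather than a quadratic program, which is where the speed advantage noted above comes from.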

In the model development using LS-SVM with a radial basis function (RBF) kernel, the parameters gam (γ) and sig2 (σ²) were adopted to regulate the models. For each combination of γ and σ², the root mean square error of cross-validation (RMSECV) was calculated, and the combination that produced the smallest RMSECV was selected as the optimum. In this study, γ was optimized in the range of 2⁻¹–2¹⁰ and σ² in the range of 2–2¹⁵ with adequate increments. These ranges were chosen from previous studies in which the magnitude of the parameters to be optimized was established (Liu et al. 2009b). The grid search had two steps: a crude search with a large step size, followed by a refined search with a small step size. The free LS-SVMlab toolbox (LS-SVM v 1.5, Suykens, Leuven, Belgium, http://www.esat.kuleuven.be/sista/lssvmlab/) was applied with Matlab 7.0 to develop the calibration models.
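
The two-step grid search can be sketched in plain Python as below. The step sizes (1 and 0.25 in log2 units) are assumptions, since the text says only that adequate increments were used, and `cv_error` stands for any user-supplied function that returns the RMSECV for a given (γ, σ²) pair; all function names are hypothetical.

```python
import itertools


def frange(lo, hi, step):
    """Evenly spaced values from lo to hi inclusive."""
    n = int(round((hi - lo) / step))
    return [lo + i * step for i in range(n + 1)]


def grid_search(cv_error, gam_exps, sig2_exps):
    """Return (error, log2 gam, log2 sig2) with the smallest error on the grid."""
    return min(
        (cv_error(2.0 ** g, 2.0 ** s), g, s)
        for g, s in itertools.product(gam_exps, sig2_exps)
    )


def two_step_search(cv_error, coarse_step=1.0, fine_step=0.25):
    """Coarse search over the full ranges, then a fine search around the best point."""
    g_min, g_max = -1.0, 10.0   # gam in 2^-1 .. 2^10
    s_min, s_max = 1.0, 15.0    # sig2 in 2^1 .. 2^15
    _, g0, s0 = grid_search(
        cv_error, frange(g_min, g_max, coarse_step), frange(s_min, s_max, coarse_step)
    )
    # Refine within one coarse step either side, clipped to the allowed ranges.
    g_lo, g_hi = max(g_min, g0 - coarse_step), min(g_max, g0 + coarse_step)
    s_lo, s_hi = max(s_min, s0 - coarse_step), min(s_max, s0 + coarse_step)
    err, g1, s1 = grid_search(
        cv_error, frange(g_lo, g_hi, fine_step), frange(s_lo, s_hi, fine_step)
    )
    return err, 2.0 ** g1, 2.0 ** s1
```

Searching on a log2 scale matches the power-of-two ranges quoted above and keeps the coarse pass to a few hundred model fits.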

Independent Component Analysis

ICA is a well-established statistical signal processing technique that aims to decompose a set of multivariate signals into a basis of statistically independent components (ICs) with minimal loss of information content. The ICs are LVs, meaning that they cannot be directly observed, and they must have non-Gaussian distributions.

There are many algorithms for performing ICA (Hyvärinen et al. 2001; Lee 1998). Among them, the fast fixed-point algorithm (FastICA), developed by Hyvärinen and Oja (2000), is a computationally highly efficient method for estimating the ICs. FastICA was chosen for ICA and carried out in Matlab 7.0 (The Math Works, Natick, USA).

Results and Discussion

Reflectance Spectral Investigation

The reflectance spectra shown in Fig. 1 exhibited the typical spectral characteristics of rice canopy reflectance in the booting stage. The spectra for the different nitrogen contents followed a similar trend but differed in reflectivity. In the blue (400–500 nm) and red (600–700 nm) regions, reflectance was low (<10%) because blue and red light are strongly absorbed for crop photosynthesis. In the green bands (560–570 nm), a small peak appeared because absorption is weaker there. Reflectance increased rapidly at about 690–760 nm (the red edge), from 10% to 50–80%. In addition, the different nitrogen treatments caused variation in the reflectance and average nitrogen of the rice canopy. Canopy reflectance decreased in both the visible and near-infrared regions as nitrogen availability increased. This trend can be explained by the increase in plant pigment content, such as chlorophyll, with growth and adequate nitrogen supply, which results in more absorption of visible light, especially in the blue and red bands. The decrease in near-infrared reflectance with increased nitrogen supply was related to the increases in plant biomass, leaf area index, and water content at the high nitrogen rate.

Fig. 1 Reflectance spectra of rice canopy with different rates of nitrogen in booting stage

Next, the correlation between the reflectance at each wavelength and the SPAD value was analyzed to examine how the SPAD value relates to individual wavelengths.

The reflectance data of a rice canopy may be affected by the height of the spectroradiometer, variation in illumination intensity, and background factors of different regions. To eliminate the influence of the background and clearly reflect the spectral variation, the first derivative of the original spectral reflectivity was taken. Using all first-derivative values would increase the calculation time, so the data from 400–1,000 nm were averaged in groups of ten, giving 60 values in all. The correlation between the SPAD value and the reflectance was computed with SPSS 12.0 software, and the result is shown in Fig. 2. The correlation changed dramatically from the visible to the near-infrared spectral region. Generally, the SPAD value showed negative correlation with reflectance mainly in the wavelength regions from 430 to 600 nm and 670 to 700 nm and positive correlation mainly from 580 to 775 nm. The reflectances near 460, 500, 550, and 690 nm had higher negative correlations, and the reflectances near 670 nm and 720–760 nm had higher positive correlations with the SPAD values. Two wavelengths matching the green peak and the red edge were included, which play an important part in assessing nitrogen status. The wavelength regions showing high correlation indicate that reflectance at these wavelengths might be important for the SPAD value.
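
The derivative, averaging, and correlation steps can be sketched in NumPy as follows. The study used SPSS for the correlation; here the first derivative is approximated by a simple difference between adjacent 1 nm points, which is an assumption, and the function name `derivative_band_correlation` is hypothetical.

```python
import numpy as np


def derivative_band_correlation(reflectance, spad, block=10):
    """Correlate block-averaged first-derivative reflectance with SPAD values.

    reflectance: array of shape (n_samples, n_wavelengths) at 1 nm spacing.
    spad: array of shape (n_samples,).
    Returns one Pearson correlation coefficient per block of wavelengths.
    """
    deriv = np.diff(reflectance, axis=1)            # first difference per nm
    n_blocks = deriv.shape[1] // block
    bands = deriv[:, :n_blocks * block].reshape(len(deriv), n_blocks, block).mean(axis=2)
    spad_c = spad - spad.mean()
    bands_c = bands - bands.mean(axis=0)
    return (bands_c.T @ spad_c) / (
        np.linalg.norm(bands_c, axis=0) * np.linalg.norm(spad_c)
    )
```

For spectra covering 400–1,000 nm at 1 nm spacing (601 points), the difference gives 600 values, and averaging in groups of ten gives the 60 values mentioned above.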

Fig. 2 Correlation between rice canopy reflectance and SPAD values

Selection of Feature Input Subset for LS-SVM Model Based on PLS Analysis

Forty-eight samples were used as the calibration set and the remaining 16 samples as the prediction set. The calibration results for SPAD value prediction in the three growth stages are shown in Table 1. Comparing the models by the aforementioned evaluation standards, the model with five LVs was the best for predicting the SPAD value in the tillering stage, four LVs was the optimum number for the booting stage, and six LVs for the heading stage. In the prediction set, the optimal PLS models gave a correlation coefficient (r_p), RMSEP, and bias of 0.8545, 0.7628, and 0.0521 for the tillering stage, 0.9082, 0.4452, and −0.0109 for the booting stage, and 0.8632, 0.7469, and 0.0324 for the heading stage. This indicates that the prediction result in the booting stage was better than in the other two stages. A possible explanation is that, in the tillering stage, the rice is not yet big enough to eliminate the influence of soil reflectivity, whereas in the heading stage, the rice begins to head, and the spike information may affect the spectral reflectivity of the canopy. In the booting stage, the rice is thriving, and monitoring its N status is very important. Figure 3 shows the measured vs predicted SPAD values in the booting stage for the PLS model. The diagonal line (y = x) shows the ideal result, in which the predicted values equal the measured values. The closer the sample points are to this line, the better the model. High RPD and r_p and low RMSEP and bias also characterize the ability of the model. In Fig. 3, the sample points of the prediction set were distributed near the ideal line but were not close enough to it. Hence, the PLS calibration models achieved an acceptable prediction performance, but the results were not satisfactory for practical analysis.

Table 1 Validation results of rice by PLS in calibration and validation sets

Fig. 3 Measured vs predicted values for SPAD value prediction by the best PLS calibration model in booting stage

LS-SVM Regression Model

To improve the training speed and reduce the training error, the SWs obtained from ICA were applied as the inputs of the LS-SVM models, because the training time increases with the square of the number of training samples and linearly with the number of variables (the dimension of the spectra) (Chauchard et al. 2004).

ICA was applied to select the SWs, which could reflect the main features of the raw absorbance spectra. FastICA (one of the ICA algorithms, introduced above) was applied to the preprocessed spectral data, and the main absorbance peaks and valleys were indicated by the spectra of the ICs. The SWs were selected from the weights of the first three ICs: the wavelengths with the highest weights were taken as the SWs. Figure 4 shows the three ICs. The strong peaks and valleys with the highest weights were taken as the SWs for predicting the SPAD value in the booting stage, namely 575–580 nm and 730 nm in IC1, 560, 700, and 730 nm in IC2, and 580 and 740 nm in IC3. Some of these wavelengths also showed a high correlation with the SPAD value in the analysis above. To evaluate the performance of the SWs, they were applied as the input data matrix to develop the SW-LS-SVM models.
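
The selection rule described above (take the wavelengths at the strongest peaks and valleys of each IC) can be illustrated with a small NumPy sketch. The exact criterion used in the study is not given, so the peak definition, the number of wavelengths kept per IC, and the function names are all assumptions.

```python
import numpy as np


def local_peaks(weights):
    """Indices where the absolute weight is a local maximum (peak or valley)."""
    mag = np.abs(weights)
    inner = (mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])
    return np.flatnonzero(inner) + 1


def select_sensitive_wavelengths(ic_weights, wavelengths, per_ic=2):
    """Pick the per_ic strongest peaks or valleys of each IC weight vector.

    ic_weights: array of shape (n_ics, n_wavelengths).
    wavelengths: array of shape (n_wavelengths,), in nm.
    Returns the sorted union of the selected wavelengths.
    """
    selected = set()
    for weights in ic_weights:
        peaks = local_peaks(weights)
        order = np.argsort(np.abs(weights[peaks]))[::-1][:per_ic]
        selected.update(int(wavelengths[i]) for i in peaks[order])
    return sorted(selected)
```

Restricting the choice to local extrema avoids picking several adjacent wavelengths from the flank of a single strong peak.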

Fig. 4 Three ICs with the highest loading weights for the prediction of SPAD value in booting stage

In the model development using LS-SVM with the RBF kernel function, the determination of the parameters γ and σ² is an important task, similar to selecting the number of factors in PLS analysis. In this study, these parameters were optimized by a grid-search technique using fivefold cross-validation, with γ in the range of 2⁻¹–2¹⁰ and σ² in the range of 2–2¹⁵ with adequate increments. For each combination of γ and σ², the RMSECV was calculated, and the combination producing the smallest RMSECV was selected. The optimal pair (γ, σ²) was found at γ = 68.5 and σ² = 23.1.

The performance of these models was evaluated with the 16 samples in the prediction set. The r_p, RMSEP, and bias for the prediction set were 0.9421, 0.2586, and −1.012 × 10⁻⁶ for the SPAD values (booting stage), as shown in Fig. 5. The results for the calibration and prediction sets showed that the SW-LS-SVM models outperformed the PLS models in this growth stage. Therefore, the SWs from ICA could represent most of the features and characteristics of the original spectra and could be applied instead of the whole wavelength region to predict the SPAD value in the booting stage. Furthermore, the SWs might be important for the development of portable instruments and for online monitoring of the N status of rice.

Fig. 5 Measured vs predicted values for SPAD value prediction by the SW-LS-SVM calibration model in booting stage

PLS Regression Model Combined with the ICA

In order to compare the performance of the LS-SVM and PLS models, a PLS model combined with ICA was also analyzed. The SWs obtained from ICA were used as the inputs of the PLS model.

As in the preceding section, the SWs of 575–580 nm and 730 nm in IC1, 560, 700, and 730 nm in IC2, and 580 and 740 nm in IC3 were applied as the input data matrix to develop the PLS models. The optimal pair (γ, σ²) was found at γ = 46.2 and σ² = 17.5.

The performance of the PLS model was evaluated with the 16 samples in the prediction set. The r_p, RMSEP, and bias for the prediction set were 0.9128, 0.4397, and 2.983 × 10⁻³ for the SPAD values (booting stage). The results for the calibration and prediction sets showed that the precision of the SW-PLS model was lower than that of the SW-LS-SVM model in this growth stage.

Comparing Predicting Results of LS-SVM, BPNN, and PLS Models

A BPNN was also established to predict the SPAD value in the booting stage with the same sample set as for LS-SVM, which was split into a calibration set of 48 samples and a prediction set of 16 samples. The three-layer BPNN model was derived using the four LVs of the PLS analysis, which were referred to above when the PLS model was discussed. The momentum was set to 0.8 after several trials in the range of 0.3–0.9 (the optimal momentum was determined by the lowest value of PRESS). The minimum learning rate was set to 0.2, the threshold residual error to 1.0 × 10⁻⁵, and the number of training iterations to 2,000. These parameters were set based on experience and previous reports (Shao et al. 2007). The resulting BPNN model had a three-layer structure, with three nodes in the input layer, six nodes in the hidden layer, and one node in the output layer, and used a sigmoid transfer function.

The performance of the BPNN models was validated with the samples in the prediction set. The residual error of prediction was −1.058 × 10⁻⁴ for the SPAD value in the booting stage. The correlation coefficient (r_p), RMSEP, and bias of prediction were 0.9185, 0.4562, and 0.0650, respectively, as shown in Fig. 6. This indicated that the performance of the BPNN was a little better than that of the PLS models. The reason might be that the BPNN model could handle certain latent nonlinear information in the spectral data, and this nonlinear information contributed to the better performance of the BPNN model.

Fig. 6 Measured vs predicted values for SPAD value prediction by the BPNN calibration model in booting stage

The performance of LS-SVM was found to be better than that of the classical linear and non-linear methods. LS-SVM produced the best r_p of 0.9421, RMSEP of 0.2586, and bias of −1.012 × 10⁻⁶ compared with the results of BPNN and PLS. LS-SVM with ten variables (560, 575–580, 700, 730, and 740 nm) from ICA succeeded in predicting the SPAD value in the booting stage to estimate the nitrogen status of rice and was clearly superior to the conventional linear and non-linear methods. These results indicate that LS-SVM is a powerful tool for regression analysis to quantify nitrogen status in rice.

In earlier work, Xue et al. (2004) found that R₈₁₀/R₅₆₀ was linearly related to total leaf N accumulation, with an estimation accuracy of 96.69% between the predicted and observed values, a root mean square error of 0.7072, and a relative error of 0.0052. Lee et al. (2008) used the reflectance at 735 nm to assess the nitrogen status of the rice canopy. They developed a simplified imaging system, which was assembled and mounted on a mobile lifter, and used a helicopter to take spectral imagery for mapping canopy N status within fields. Their results indicated that the imaging system was able to provide field maps of canopy N status with a reasonable accuracy (r of 0.465–0.912 and root mean standard error of 0.100–0.550).

Conclusions

The LS-SVM regression model using canopy reflectance produced acceptable precision and accuracy in predicting SPAD values for assessing nitrogen status in rice. The comparison showed that LS-SVM outperformed the other methods. It can be concluded that LS-SVM is a promising alternative for regression analysis to quantify nitrogen status in rice.