Introduction

Chlorophyll is the most important pigment in plant photosynthesis and has even been described as the most important organic compound on Earth (Davies 2004; Blackburn et al. 2008). Changes in chlorophyll content are associated with the physiological stress, development and senescence of plants and are closely related to primary productivity. Thus, both for scientific research, especially research on plant-environment interactions, and for applications in agriculture, forestry and environmental management, information about the temporal and spatial variation of leaf chlorophyll content is very important (Blackburn et al. 2008). According to the principles of spectroscopy, the absorption characteristics of a substance are determined by its composition and the content of its components. Extracting chlorophyll information from the spectral features of plants has therefore become a major means of estimating chlorophyll content. This approach has attracted widespread attention because it is fast, convenient and non-destructive. Numerous studies have used spectral data acquired from ground, airborne and spaceborne platforms to retrieve chlorophyll information at different temporal and spatial scales (Botha et al. 2007; Xue and Yang 2009; Huang et al. 2011; Main et al. 2011; Croft et al. 2014; Houborg 2013; Schlemmer et al. 2013; Saberioon et al. 2014).

Imaging spectroscopy can be divided into two categories according to the sensor platform and the field of application. One type is based on remote-sensing platforms such as satellites and aircraft. This type, which is the most common, is suitable for large-scale regional studies and applications. The other type is based on a small ground platform, i.e., a ground remote-sensing system. Because of its compact size, flexibility and mobility, this type of system is widely used in agriculture, including biochemical parameter retrieval (Monteiro et al. 2007; Fernández Pierna et al. 2012; Dong et al. 2013), crop/weed discrimination (Liu et al. 2010a; Zhang et al. 2012; Dale et al. 2013), crop pest and disease monitoring (Nansen et al. 2009; Larbi et al. 2013), and quality control of agricultural products and meat (El Masry et al. 2007; Gowen et al. 2007; Chen et al. 2013; Rustioni et al. 2014). Ground systems can be divided into single-sensor spectrometers and imaging spectrometers, depending on whether an image can be formed. Compared with a single-sensor spectrometer, an imaging spectroscopy system provides not only spectral information but also images rich in spatial detail. For example, the system developed in our study achieves a spatial resolution of 2 mm when placed on a 1-m-high observation platform (details are given later in this paper). Leaves often show discolored spots caused by poor health or nutritional stress, so different parts of a leaf have different spectral properties (Nansen et al. 2009). It is therefore natural to ask whether these spatial differences, i.e., image features, can be used to retrieve the chlorophyll content. Such spatial detail can only be obtained using imaging spectroscopy systems.
Currently, most ground-based studies on chlorophyll retrieval use Analytical Spectral Devices (ASD, FieldSpecFR spectrometer, Analytical Spectral Devices Inc., USA) data generated by a single sensor spectrometer or images with a small number of wavelengths that are generated by multi-spectral systems (Thenkabail et al. 2004; Grisham et al. 2010; Liu et al. 2010a, b; Prabhakar et al. 2013; Singh et al. 2013). Studies using imaging spectral data have rarely been reported. In addition, almost all the studies that involved a high spatial resolution field imaging spectroscopy system (FISS) only used the spectral information but not the spatial information; thus, the advantage of the imaging spectroscopy system was not fully exploited (Thenkabail et al. 2004; Nansen et al. 2009; Liu et al. 2010b; Prabhakar et al. 2013; Singh et al. 2013).

In the present study, 100 soybean leaf images were collected using an in-house-developed FISS (also the first FISS in China). The spectral information (e.g., leaf reflectivity) and image features (e.g., entropy and standard deviation) were extracted, key wavelengths were selected using stepwise multiple regression, and the chlorophyll content was estimated using the spectral information, the spatial information and the two in combination. In addition, ASD data of the same soy leaves were obtained for comparison with the FISS data in terms of chlorophyll content estimation and to investigate the applicability of different types of sensors for estimating the chlorophyll content of soy leaves. Our study attempted to address the following questions regarding the use of FISS to extract chlorophyll content: (1) Can the image features or spatial information be used to extract information about the chlorophyll content of soy leaves? (2) Compared with spectral information alone, can the combination of “spatial information” and “spectral information” improve the accuracy of chlorophyll content estimations for soy leaves? (3) How well does FISS perform, and what is its potential, in quantitative applications of spectral information in agriculture?

Experimental design and data preprocessing

Experimental design

Experiment

The experiment was performed in the Huanjiang Observation and Research Station for Karst Ecosystem, Chinese Academy of Sciences. Several fertilization gradients were set up, and fertilizers with different nutrient ratios were used in the experimental soybean plot to study the nutrients in rain-fed lands. Fresh leaves were picked from the upper part of the plant on August 20, 2009 in the plot, stored in coolers and then rapidly transported back to the laboratory.

Field imaging spectroscopy system

The FISS consists of an operating platform, an electronics system, an opto-mechanical system, a computer system, and auxiliary equipment (Fig. 1) (Tong et al. 2010, 2013). As the core component of the FISS, the opto-mechanical system consists of a scanning mirror, optical lenses, a spectroscopic device (ImSpector V9, Spectral Imaging, Ltd., Finland), and a CCD (charge-coupled device) camera. The electronics system includes the power supply and the motor control circuit. The motor control circuit is used to control the rotation of the scanning mirror, synchronize the beam splitter and receiver, and collect and store the data.

Fig. 1 Photograph of the FISS components

The hardware of the computer system is a portable computer used for operating the FISS. The software system includes data collection software and data processing software. The data collection software serves primarily to display spectral images and curves and to set operating parameters such as integration time, aperture and field of view (FOV). The data processing software is designed for geometric correction, radiometric correction, reflectance inversion, data format conversion, etc.

The FISS can acquire high-resolution images (spatial resolution up to 2 mm) of measured targets and extract the spectral curve of any pixel from these images. This system greatly improves the efficiency of field spectral measurements. It provides both image and spectral information for analyzing the structure of spectra, extracting pure spectra and deriving detailed spectral characteristics of target surfaces. The FISS is the first field imaging spectrometer in China based on a cooled area CCD and is designed for multiple applications in agriculture, food chemistry, online monitoring, etc. Table 1 shows the main technical parameters of the FISS (Tong et al. 2010, 2013).

Table 1 Main parameters of the FISS

Spectra measurement

Soy leaves were removed from the cooler and laid flat on the sample bench. A halogen light source was set up for the sample bench. To provide uniform light intensity on the sample surface and minimize the potential influence of measurement geometry, three measures were taken: (1) the light source and detectors were fixed nearly perpendicular to the sample bench so that the observation zenith angle was minimized; (2) the distance between the light source and the sample was kept long enough to make the light field on the sample surface as uniform as possible; and (3) each leaf sample was placed on the same marked area so that all leaf samples were measured under identical conditions.

The imaging spectral data of soybean leaves were obtained using the FISS while the ASD data were obtained using an ASD spectrometer, and the spectral data of a reference gray board were also obtained at the same time. The leaf was rapidly returned to the cooler after the measurement and taken to the chemical analysis laboratory for processing.

Chlorophyll content determination

The chlorophyll content was determined using the rapid extraction method with an ethanol-acetone mixture (1:1 by volume). After being cut into pieces, 0.2 g leaf samples were weighed and placed into a 25-mL colorimetric tube for extraction for 18–24 h in the dark. The tube was shaken once every hour. Then, the absorbance was measured with a UV spectrophotometer, and the measurement data were used for calculating the contents of chlorophyll a, chlorophyll b and carotenoids.

Data preprocessing

For the ASD data, the soybean leaf reflectivity was calculated by comparison with the gray plate. For the imaging spectral data, reflectivity images were obtained in the same way. The soy leaves were then segmented from the background using a set threshold, and shadows on the leaves were also separated and removed. The valid image data extracted in this way are used to calculate the features of the chlorophyll retrieval model in the following section.
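The preprocessing described above can be sketched as follows. This is only an illustration under stated assumptions: the panel reflectance of 0.5, the band used for thresholding and the threshold bounds are hypothetical values, since the text does not report them, and the function names `to_reflectance` and `leaf_mask` are introduced here for illustration only.

```python
import numpy as np

def to_reflectance(sample_dn, panel_dn, panel_reflectance=0.5):
    """Convert raw digital numbers (DN) to reflectance using a reference panel.

    sample_dn: array (rows, cols, bands) of DNs for the leaf scene.
    panel_dn:  array (bands,) of mean panel DNs per band.
    panel_reflectance: assumed reflectance of the gray plate (hypothetical).
    """
    return sample_dn / panel_dn * panel_reflectance

def leaf_mask(reflectance, band, low, high):
    """Keep pixels whose reflectance in one band lies within [low, high].

    Pixels below `low` are treated as shadow and pixels above `high` as
    background; both bounds are hypothetical.
    """
    img = reflectance[:, :, band]
    return (img >= low) & (img <= high)

# Tiny synthetic scene: 2 x 2 pixels, 2 bands.
sample_dn = np.array([[[100.0, 400.0], [120.0, 420.0]],
                      [[10.0, 40.0], [500.0, 900.0]]])
panel_dn = np.array([500.0, 1000.0])

refl = to_reflectance(sample_dn, panel_dn)
mask = leaf_mask(refl, band=1, low=0.1, high=0.3)
valid_pixels = refl[mask]  # shape: (number of leaf pixels, bands)
```

In this toy scene the two top pixels are kept as leaf, while the dark pixel (shadow) and the bright pixel (background) are excluded.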

Chlorophyll retrieval

Model feature

The model feature in this section refers to the information that can be used to retrieve the pigment content. The features are divided into image features (i.e., spatial information) and spectral reflectivity (i.e., spectral information).

The imaging spectral data can provide both spectral and image information, while current studies generally use only the spectral information but not the image information. Figure 2 presents the soybean leaf data obtained by FISS. The color varied across the entire soybean leaf, with a yellow color on part of the edge. In addition, the spectra of parts in different colors also varied greatly. A vigorously growing plant that has adequate nutrition will have a relatively uniform color (spectrum) across the leaf. However, if the plant lacks certain nutrients or is damaged by pests, its colors (spectrum) will change accordingly. The color variation of the leaf is also an important basis for crop disease diagnosis (Nansen et al. 2009).

Fig. 2 Soybean leaf data obtained by FISS

The spectral heterogeneity of the soy leaf reflects the chlorophyll content and other information to some extent, and the degree of heterogeneity is an image feature. In the present study, the standard deviation and entropy are used to describe the spatial distribution of heterogeneity and variation, defined as follows:

$$Std = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {(D_{i} - \bar{D})^{2} } }$$
(1)

where n is the number of pixels of the leaf, \(D_{i}\) is the digital number (DN) of pixel i and \(\bar{D}\) is the mean DN of the entire leaf.

$$E = - \sum\limits_{s = 1}^{N} {P(s)\log_{2} P(s)}$$
(2)

where N is the quantization level and P(s) is the normalized probability of the occurrence of DN value s.

P(s) can be computed by the histogram of the soy leaf image:

$$P(s) = \frac{{N_{s} }}{{N_{total} }}$$
(3)

where \(N_{s}\) is the number of pixels with a gray value of s and \(N_{total}\) is the total number of pixels in a leaf image.

Entropy is generally used as an image texture feature in image processing. It measures the randomness of the input image and indicates the complexity within an image (Wu et al. 2008); the more complex an image, the higher its entropy. In this study, vigorously growing soybeans generally have a relatively uniform color across the leaf and thus generate a lower entropy value than plants lacking certain nutrients or damaged by pests.
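Equations (1) to (3) can be illustrated with a minimal sketch for a single wavelength image. It assumes the leaf pixels have already been extracted into a one-dimensional array of DN values; the 256 quantization levels and the DN range of 0 to 256 are assumptions made for this example, not values reported in the text.

```python
import numpy as np

def leaf_std(values):
    """Eq. (1): population standard deviation of the leaf pixels (divides by n)."""
    values = np.asarray(values, dtype=float)
    return float(np.sqrt(np.mean((values - values.mean()) ** 2)))

def leaf_entropy(values, levels=256, value_range=(0, 256)):
    """Eqs. (2) and (3): Shannon entropy in bits of the DN histogram."""
    counts, _ = np.histogram(values, bins=levels, range=value_range)
    p = counts[counts > 0] / counts.sum()  # P(s); empty bins contribute nothing
    # -sum(p * log2(p)) written as sum(p * log2(1 / p)) to avoid returning -0.0
    return float(np.sum(p * np.log2(1.0 / p)))

uniform_leaf = np.full(100, 120)                  # one DN value everywhere
patchy_leaf = np.repeat([60, 100, 140, 180], 25)  # four equally frequent DN values

std_uniform, std_patchy = leaf_std(uniform_leaf), leaf_std(patchy_leaf)
ent_uniform, ent_patchy = leaf_entropy(uniform_leaf), leaf_entropy(patchy_leaf)
```

The uniform leaf gives a standard deviation and an entropy of zero, whereas the patchy leaf, with four equally likely DN values, gives an entropy of 2 bits and a standard deviation of about 44.7.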

Based on the above discussion, it is reasonable to expect that soy leaves under nutritional stress have a more heterogeneous chlorophyll distribution, and thus, the image at the respective wavelength will have a large standard deviation and entropy. Such spatial information indirectly indicates, to some extent, the magnitude of the chlorophyll content. The FISS data contain both spectral information and image information, and each sample (one leaf) corresponds to many pixels. Thus, the average reflectivity, image standard deviation and entropy at each wavelength can be obtained (note that the mean reflectivity does not contain spatial information). In contrast, ASD data cannot generate an image but only one spectrum for each sample (i.e., only spectral information but no spatial information) and therefore cannot provide the standard deviation or entropy of the leaf. The input features of both types of data that can be used in the chlorophyll content retrieval model are summarized in Table 2.

Table 2 Input features of FISS data and ASD data for the chlorophyll retrieval model

Stepwise multiple linear regression (MLR)

In the present study, stepwise MLR was used to estimate the chlorophyll content. Spectral analysis often involves a large number of highly correlated wavelengths, and high correlation between the variables usually results in unstable MLR models. Stepwise MLR is widely used for regression analysis and is a common way to reduce the correlation between independent variables. Stepwise MLR can guarantee an “optimal” regression equation for a given significance level. The main parameters are the significance level for introducing a variable (\(\alpha_{in}\)) and the significance level for deleting a variable (\(\alpha_{out}\)). Stepwise MLR was implemented in SPSS 13.0 (Gao 2005).
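The idea of stepwise selection can be illustrated with the simplified sketch below. It is not the SPSS procedure used in the study: instead of the significance levels \(\alpha_{in}\) and \(\alpha_{out}\), it uses fixed partial F thresholds (`f_in`, `f_out`), whose values are hypothetical, and all function names and the synthetic data are introduced here for illustration only.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def partial_f(rss_without, rss_with, n, k_with):
    """Partial F statistic for the single variable that separates two models."""
    return (rss_without - rss_with) / (rss_with / (n - k_with - 1))

def stepwise_select(X, y, f_in=4.0, f_out=3.9, max_vars=7):
    n, m = X.shape
    selected = []
    for _ in range(2 * m):  # hard cap on iterations as a safety net
        if len(selected) >= min(max_vars, m):
            break
        # Forward step: try every unused column, keep the one with the largest F.
        base = rss(X, y, selected)
        k = len(selected) + 1
        scores = {c: partial_f(base, rss(X, y, selected + [c]), n, k)
                  for c in range(m) if c not in selected}
        best = max(scores, key=scores.get)
        if scores[best] < f_in:
            break
        selected.append(best)
        # Backward step: drop variables made redundant by later additions.
        while len(selected) > 1:
            full = rss(X, y, selected)
            k = len(selected)
            drops = {c: partial_f(rss(X, y, [s for s in selected if s != c]),
                                  full, n, k)
                     for c in selected}
            worst = min(drops, key=drops.get)
            if drops[worst] >= f_out:
                break
            selected.remove(worst)
    return selected

# Synthetic example: 60 samples, 6 "wavelengths"; columns 1 and 4 drive y,
# and column 2 is a nearly exact copy of column 1 (strong collinearity).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
X[:, 2] = X[:, 1] + 0.01 * rng.normal(size=60)
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.1 * rng.normal(size=60)
chosen = stepwise_select(X, y)
```

On data of this kind the procedure should retain column 4 and at least one of the two nearly identical columns, which is the behavior that makes stepwise selection attractive for highly correlated wavelengths.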

The coefficient of determination (\(R^{2}\)) and root mean square error (RMSE) are used to measure the estimation accuracy, defined as follows:

$$R^{2} = \sum\limits_{i = 1}^{n} {(\hat{y}_{i} - \bar{y})^{2} } /\sum\limits_{i = 1}^{n} {(y_{i} - \bar{y})^{2} }$$
(4)
$$RMSE = \sqrt {\sum\limits_{i = 1}^{n} {(\hat{y}_{i} - y_{i} )^{2} /n} }$$
(5)

where n is the number of samples, \(y_{i}\) is the measured value of sample i, \(\bar{y}\) is the mean of the measured values and \(\hat{y}_{i}\) is the predicted value of sample i.
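Equations (4) and (5) translate directly into code. The sketch below follows Eq. (4) as written, i.e., the explained sum of squares divided by the total sum of squares; the function names and the four sample values are made up for illustration.

```python
import numpy as np

def r_squared(y, y_hat):
    """Eq. (4): explained sum of squares over total sum of squares."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    y_bar = y.mean()
    return float(np.sum((y_hat - y_bar) ** 2) / np.sum((y - y_bar) ** 2))

def rmse(y, y_hat):
    """Eq. (5): root mean square error."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

measured = [1.0, 2.0, 3.0, 4.0]    # hypothetical chlorophyll contents (mg/g)
predicted = [1.5, 1.5, 3.5, 3.5]   # hypothetical model output (mg/g)

r2 = r_squared(measured, predicted)  # 4 / 5 = 0.8
err = rmse(measured, predicted)      # every error is 0.5, so RMSE = 0.5
```

Note that this form of \(R^{2}\) coincides with the more familiar \(1 - SS_{res}/SS_{tot}\) only for a least-squares fit with an intercept evaluated on its own training data; on validation samples the two definitions can differ.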

Results and discussion

Correlation analysis of image features, spectral variables and pigment content

To compare the difference between the FISS data and ASD data for quantitative pigment content estimation and to analyze/compare the roles of the image features and spectral reflectivity as a pigment content indicator, we calculated the linear correlation coefficients between these factors and the pigment content, as presented in Figs. 3, 4, 5 and 6. The correlation coefficient curves of chlorophyll a, b, total chlorophyll and the carotenoids were similar in shape but different in ranges; namely, the curves coincided for wavelengths with high correlation coefficients, which will be illustrated in the following section using the total chlorophyll as an example.
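A correlogram of this kind can be computed as the Pearson correlation coefficient between each wavelength's feature value (entropy, mean reflectivity or standard deviation) and the measured pigment content across samples. The sketch below assumes a feature matrix with one row per leaf and one column per wavelength; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def correlogram(features, content):
    """Pearson r between each wavelength column of `features` and `content`.

    features: array (n_samples, n_wavelengths)
    content:  array (n_samples,)
    """
    f = features - features.mean(axis=0)
    c = content - content.mean()
    return (f * c[:, None]).sum(axis=0) / np.sqrt(
        (f ** 2).sum(axis=0) * (c ** 2).sum())

content = np.array([1.0, 2.0, 3.0, 4.0])    # pigment content per leaf
features = np.column_stack([
    0.5 - 0.1 * content,                    # falls linearly with content
    0.2 + 0.05 * content,                   # rises linearly with content
    np.array([0.3, 0.1, 0.1, 0.3]),         # unrelated to content
])
r = correlogram(features, content)          # approximately [-1, 1, 0]
```

Plotting `r` against wavelength gives curves of the type shown in Figs. 3 to 6.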

Fig. 3 Correlogram of entropy of soy leaf image versus pigment contents (calculated using FISS data)

Fig. 4 Correlogram of spectral reflectivity of soy leaves versus pigment contents (calculated using ASD data)

Fig. 5 Correlogram of spectral mean reflectivity of soy leaves versus pigment contents (calculated using FISS data)

Fig. 6 Correlogram of image standard deviation of soy leaves versus pigment contents (calculated using FISS data)

For entropy, image standard deviation and spectral reflectivity alike, the regions that were highly correlated with the chlorophyll content were located at 530 to 670 nm and 695 to 715 nm. Lower chlorophyll content may indicate a lower level of health and therefore more heterogeneous color or spectrum, resulting in greater variation and smaller entropy, and thus, the entropy was positively correlated with the pigment content, with a maximal coefficient of 0.84 (at a wavelength of 700 nm, as shown in Fig. 3). Figure 5 demonstrates that the strongest correlation between the FISS mean reflectivity and the chlorophyll content was −0.85 (at wavelengths of approximately 568 and 703 nm). Figure 6 shows that the correlation of the image standard deviation was generally stronger than that of the reflectance, with an extreme value of −0.87 (at wavelengths of approximately 568 and 705 nm). This result indicates that the spatial variation of the soy leaf spectra is a good indicator of the chlorophyll content and, to some extent, outperforms the mean leaf reflectivity by itself. Therefore, the spatial information can also be used for pigment content retrieval. The curves in Figs. 4 and 5 have similar shapes, but the overall correlation in Fig. 5 is stronger than that in Fig. 4. The strongest correlation of the ASD-derived reflectivity was only −0.54 (at wavelengths of approximately 699 and 578 nm), indicating that the FISS data reflect the chlorophyll content more accurately. The FISS image provides spectral information for each pixel of the leaf, overcoming the drawback of the non-imaging spectrometer that the resulting spectra cannot be precisely mapped to the measured target object. Therefore, imaging spectroscopy has features that can be exploited to obtain leaf pigment content information more reliably.

Significant wavelengths for chlorophyll content retrieval

Wavelengths that were significant in chlorophyll content retrieval were selected using stepwise selection based on the spectral variables listed in Table 2. The first 3 and 7 wavelengths selected are listed in Table 4.

Table 3 The statistics of samples for calibration and validation of chlorophyll content retrieval model

Almost all the wavelengths selected using the ASD data were included in those selected using the FISS data; however, the former did not include the FISS-data-selected wavelengths in the green region. By comparing the most important 3 wavelengths that were selected for each feature and listed in Table 2, the wavelength in the green region (531 nm) was selected based on the image feature of “standard deviation”. This result indicated that the image can provide information that cannot be provided by the spectrum and that image features can also be used for pigment content retrieval.

In summary, the wavelengths that were significant in chlorophyll content retrieval generally include wavelengths near 450, 550, 600 and 700 nm. This result can provide guidance for the development of low-cost equipment.

Chlorophyll content retrieval

There were a total of 101 valid samples for chlorophyll retrieval, of which 71 were training samples and 30 were validation samples. The statistics are presented in Table 3. The statistics of the training samples and validation samples were generally the same; thus, these samples can be used for the construction and validation of the same model.

Table 4 Significant wavelengths for chlorophyll content retrieval based on FISS and ASD data

As described in the previous section, based on each feature, significant wavelengths were selected using stepwise selection, and the chlorophyll content was retrieved with MLR. The retrieval accuracies of various cases are indicated in Table 5 (FISS data) and Table 6 (ASD data). In particular, when only one independent variable was selected, the MLR automatically became a linear regression (LR).

Table 5 Different spectral variables and chlorophyll content retrieval accuracies of different models (FISS Data)
Table 6 Different spectral variables and chlorophyll content retrieval accuracies of different models (ASD data)

Chlorophyll content estimation using spectral and spatial information combined

The estimates based on the three features obtained from the image (entropy, mean reflectivity and image standard deviation) are compared in Table 5 (FISS data). The estimation errors of entropy and standard deviation, which reflect the spatial information, were comparable to that of the spectral mean. This indicates that spatial information can also be used for chlorophyll content retrieval, with a retrieval accuracy equivalent to that of the spectral information. In particular, when 7 wavelengths were used as independent variables, the entropy gave the best prediction, with an estimation error of 0.284 mg/g and a coefficient of determination (COD, \(R^{2}\)) of 0.88.

When both the mean reflectivity and image standard deviation were used as the input features in the model, the retrieval accuracy was substantially improved compared with using a single feature alone (image standard deviation or spectral mean). For example, the MLR retrieval errors of the validation samples at three wavelengths were reduced from 0.297 mg/g (mean reflectivity) and 0.318 mg/g (image standard deviation) to 0.254 mg/g, relative reductions of 14.5 and 20.1 %, respectively. The error of the training samples exhibited a similar pattern. This result suggested that the image features were a very good indication of the chlorophyll content and could provide information that spectral variables could not, and the integration of spatial information and spectral information can greatly enhance the chlorophyll content retrieval accuracy.
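As a quick arithmetic check of the relative reductions quoted above, using only the three-wavelength validation errors stated in the text:

$$\frac{0.297 - 0.254}{0.297} \approx 0.145 = 14.5\,\% ,\qquad \frac{0.318 - 0.254}{0.318} \approx 0.201 = 20.1\,\%$$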

Comparative analysis of FISS-data-based and ASD-data-based chlorophyll content estimations

A comparison of Tables 5 and 6 clearly reveals that the FISS-data-generated results were better than the ASD-data-generated results, with a mean error reduction of 30–45 %.

When the reflectivity at a single wavelength was used as the estimation variable, the error of the ASD-data-based chlorophyll content estimation was 0.527 mg/g, while the error of the FISS-data-based single-wavelength model was 0.310 mg/g, a reduction of 41.2 % accompanied by an increase in the COD from 0.57 to 0.85. The lowest validation error of the ASD-based three-wavelength model was 0.380 mg/g, while the error of the FISS-data-based three-wavelength model was 0.254 mg/g, a reduction of 33.2 % accompanied by an increase in the COD from 0.77 to 0.91. The scatter plot of the measured chlorophyll content versus the predicted chlorophyll content (Fig. 7) also shows that the FISS-data-generated results cluster closely around y = x, while the ASD-data-generated results are more widely scattered.

Fig. 7 Comparison of the accuracies of chlorophyll content retrieval using FISS data and ASD data

From the perspective of information sources, the imaging capability of the FISS enables it to obtain not only the conventional spectral information but also spatial information (image features); this image-spectrum-in-one data set provides richer information and is more conducive to chlorophyll content retrieval. From the perspective of measurement, the FISS can obtain the spectrum of any valid pixel across the entire soy leaf and can accurately locate the target. A non-imaging spectroscopy system can only obtain information within the detector’s field of view and not the complete spectral information of the entire leaf. The chlorophyll content is related to the condition of the entire leaf, and thus, ASD spectral signals that cannot be precisely mapped to the target objects are not conducive to chlorophyll content retrieval. Therefore, compared with the single-sensor system, these advantages of the imaging spectroscopy system enable it to adapt to a wider range of agricultural applications, especially those requiring not only the material properties but also the detailed spatial structure, such as soy pest diagnosis, nutritional status monitoring, weed/crop discrimination and agricultural product quality control.

Conclusions

FISS-data-based pigment retrieval focused on chlorophyll was investigated, and the results were compared with ASD-based results. The image-spectrum-in-one feature of the imaging spectroscopy system was fully exploited to improve the accuracy of crop chlorophyll content estimation by integrating the image features and spectral information. The main conclusions are the following:

(1) The wavelengths that were sensitive indicators of pigments were largely the same whether selected based on FISS data or ASD data. The reflectivity at 530–670 and 695–715 nm was highly correlated with the pigment content, and the correlation was stronger for the FISS data than for the ASD data. The significant wavelengths in chlorophyll content retrieval should generally include wavelengths near 450, 550, 600 and 700 nm. This finding provides guidance for the development of low-cost equipment.

(2) The image features were also highly correlated with the pigment content; the spatial information can therefore be used for chlorophyll content retrieval, with an accuracy equivalent to that obtained using spectral information. Image features can also provide information that spectral variables cannot. Integrating spatial information and spectral information can greatly improve the accuracy of chlorophyll content retrieval.

(3) Because of its measurement method and image-spectrum-in-one feature, FISS data can be used to achieve more accurate chlorophyll content retrieval; the retrieval error was reduced by approximately 30–45 % compared with the use of ASD data, confirming the performance and potential of FISS in quantitative spectral analysis. FISS therefore has the potential to be widely used in the agricultural sector.