Introduction

Mature cereal grains that are rained upon while in the head, either as a standing crop or in the swath, start to germinate as a result of the hydration. The resulting so-called ‘pre-harvest sprouting’ causes progressively greater damage to the kernel’s biochemical and physical characteristics, which in turn reduces the value of the crop. To ensure that the intrinsic value of exported grain is maintained, the damaging effects of pre-harvest sprouting are limited by defined grading tolerances for sprouted kernels in Canadian wheat [1]. Canadian wheat grading has tolerances for both sprouted and severely sprouted kernels [2], based upon visual assessment by trained inspectors. The percentage of each level of sprouted kernels is expressed on a weight percent basis. For both Canada Western Red Spring (CWRS) and Canada Western Amber Durum (CWAD), the highest grades allow a maximum of 0.1% w/w severely sprouted kernels and a maximum of 0.5% sprouted grain [1]. These low tolerances reflect the severe damaging effects of highly sprouted kernels.

The germination process involves the synthesis and release of enzymes that degrade starch, protein and lipid reserves within the grain, releasing sugars and amino acids to nourish the growing embryo. Sprouted grains therefore contain degraded starch and protein components, which reduce milling yield and lower flour or semolina quality, resulting in significantly poorer end products. The most destructive of these enzymes with respect to grain quality is α-amylase. Flours with high levels of α-amylase activity require higher water levels and have poor mechanical dough handling properties, leading to an inferior ‘sticky bread’ with low loaf volume, dark crust and a sticky texture that clogs automated slicing systems. Asian style noodles produced with such flour also become sticky and disintegrate during cooking [3]. For pasta, the result is a product with a less firm texture that is susceptible to disintegration during cooking and contains more visually degrading dark specks. Because of the significant impact of α-amylase activity within sprouted wheat kernels, the measurement of this enzyme, directly by analytical tests or indirectly through its effects on starch viscosity, is used as a quantitative indicator of sprout damage in commercial wheat samples. Measuring the starch pasting properties of a representative sub-sample of ground grain by either the Falling Number (FN) or Rapid Visco Analysis (RVA) test provides a quantitative measure of the degree of sprout damage and hence of the intrinsic value of the wheat sample [4–6].

All sprout assessment methods (visual inspection, FN and RVA) are highly dependent upon the effectiveness of the initial sub-sampling process used to generate a representative sample for testing.

Grain in the earliest stages of sprouting exhibits no visible symptoms, yet the level of α-amylase may be sufficient to cause product quality degradation. Only quantitative enzyme assays or the measurement of the ground grain’s viscosity characteristics can detect and quantify this level of damage. Occasionally, sound kernels are also found with high inherent levels of α-amylase as a result of variety and environment interactions. This is best known from the phenomenon of late maturing α-amylase (LMA), which is associated with cool temperatures at approximately 10–20 days post anthesis [7]. Four routes for accumulation of α-amylase have been identified in wheat, possibly leading to sprout damage [8]. High levels of α-amylase can be produced during the later stages of ripening in the absence of sprouting [9]. When germination begins, however, the embryo and layers surrounding the starchy endosperm produce enzymes at an accelerating rate. A severely sprouted kernel contains thousands of times the amount of enzyme present in kernels that are considered ‘sound’ or not in the early stages of germination. A bulk wheat sample containing only a few severely sprouted kernels may therefore exhibit significant α-amylase activity on a ground sample basis. Because of this non-linearity in enzyme activity, starch pasting tests on bulk samples give no indication of the variation in α-amylase activity among individual kernels, even though a few kernels with high activity can have a significant impact on overall sample quality. To improve the grading system, which provides only a count of sprouted kernels, and to reduce the dependence on time-consuming starch pasting methods, a system that both identifies individual sprouted kernels and determines their individual level of sprout damage is desirable. We speculate that, with these two parameters, the impact of sprouting on processing could be estimated.

Near infrared (NIR) spectroscopy provides a single spectrum for an entire sample and is an approved method [10] for measurement of wheat protein and moisture content [11]. It has commonly been used to measure the protein content of bulk wheat samples, and more recently has been applied to single-kernel protein measurement [12]. The use of NIR for predicting sprouting damage or FN in wheat samples has not been successful and no commercial system exists for this analysis. Hyperspectral (chemical) imaging in both the visible/NIR (400–1000 nm) and shortwave infrared (1000–1700 nm) regions has been used for quantitatively or qualitatively determining properties of food and agricultural products [13–16]. Hyperspectral imaging has the advantage over conventional NIR spectroscopy of providing spectral and spatial information simultaneously. This makes it possible (1) to study the spectral information of specifically defined spatial regions, and (2) to simulate multispectral images prior to defining and purchasing hardware. The salient wavelengths for a multispectral imaging system, which is more practical in terms of speed and cost for a real-world sorting system, can be determined from hypercube analysis. A nondestructive chemical imaging (1100–1700 nm) approach to assist in testing new lines of wheat for resistance to premature germination was recently introduced [17].

Given the challenge of identifying highly sprouted kernels in a bulk sample of grain and then confirming that sub-samples remain representative, it is highly desirable to have a system that can detect and identify single sprouted kernels in a bulk grain sample. At present, hyperspectral imaging has not been applied to study the α-amylase activity of individual wheat kernels, nor to detect early stage sprout damage. The work reported here describes the first stages of development of an approach for detecting single kernels that show evidence of sprouting and for predicting single-kernel α-amylase activity using a 1000–2500 nm hyperspectral imaging system. In addition, detection of sprouted kernels at early stages using the predicted α-amylase activity was attempted.

Materials and methods

Wheat samples

Two classes of Canadian wheat, Canada Western Amber Durum (CWAD) and Canada Western Red Spring (CWRS), collected from across western Canada throughout the 2007 crop year, were used for this study. In total, 288 CWAD and 264 CWRS kernels with differing degrees of sprout damage were imaged using the hyperspectral system, and the α-amylase activity of each kernel was then determined individually.

Hyperspectral imaging system and image acquisition

A schematic of the hyperspectral imaging system used in this study is shown in Fig. 1. This integrated imaging system (Optikon, Kitchener, ON, Canada) included a Short Wavelength InfraRed camera (SpecCamSWIR, Specim, Finland) with a resolution of 320 × 240 pixels, coupled with a spectrograph (Imspector N24E, Specim, Finland) that measures reflected infrared wavelengths from 1000 to 2500 nm with a spectral resolution of about 7 nm. HyperVisual software (Themis Vision Systems, Stennis Space Center, MS 39529-0001, USA) completed the turn-key system. The system was mounted in a custom enclosure and the sample was illuminated by two 450 W lamps (NewPort model # 71582, Oriel Production Line), with the light beams reflected off two gold front surface mirrors set in front of the light paths (60° incline angle to the normal) to illuminate the samples evenly. The 30 mm focal length 900–2500 nm fore-lens (OLES30, Specim, Finland) of the imaging system was mounted on an integrated stepper-motor (Newmark NSC1, Newmark Systems Inc, Mission Viejo, USA) that moved from front to back while scanning the samples line by line using a patented system (US Patent 6,166,373) integrated by Themis Vision Systems. After scanning, a datacube with two spatial dimensions and one spectral dimension was recorded.

Fig. 1 Sketch of the hyperspectral imaging system

Dark current and white reference spectra were collected daily. During dark current acquisition, all lights were shut off and the camera lens was covered with a lens cap. A 99% reflectance Spectralon surface (Labsphere, USA) was used as the white reference.

For wheat sample imaging, the kernels were placed crease down, germ to the left in the field of view of the optics. A black foam plate was used to support the kernel to ensure an easy separation of the kernel from the background in an image. The kernel was approximately 60 mm from the lens. The reflectance image was calculated as:

$$ R = \frac{I_{\text{raw}} - I_{\text{dark}}}{I_{\text{white}} - I_{\text{dark}}} $$

where $R$ is the reflectance image; $I_{\text{raw}}$ is the original image of a sample obtained from the imaging system; $I_{\text{white}}$ is the original image of the white reference; and $I_{\text{dark}}$ is the original image of the dark current.
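The calibration above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the processing code used in the study: the function name `calibrate_reflectance`, the `eps` guard and the array values are assumptions, and the datacube is assumed to be stored as rows × columns × bands.

```python
import numpy as np

def calibrate_reflectance(raw, white, dark, eps=1e-12):
    """Apply R = (I_raw - I_dark) / (I_white - I_dark) element-wise."""
    raw = np.asarray(raw, dtype=float)
    white = np.asarray(white, dtype=float)
    dark = np.asarray(dark, dtype=float)
    denom = white - dark
    # Assumption: mark pixels/bands where white and dark coincide as NaN
    # instead of dividing by zero.
    denom = np.where(np.abs(denom) < eps, np.nan, denom)
    return (raw - dark) / denom

# Made-up 2 x 2 pixel, 3 band example.
dark = np.full((2, 2, 3), 100.0)
white = np.full((2, 2, 3), 1100.0)
raw = np.full((2, 2, 3), 600.0)
R = calibrate_reflectance(raw, white, dark)
print(R[0, 0])  # every band is (600 - 100) / (1100 - 100) = 0.5
```

Because the same dark and white references are applied to every pixel and band, the correction removes both the sensor offset and the uneven spectral output of the lamps.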

Alpha-amylase activity measurement (assay)

Individual wheat kernels were crushed into meal, weighed and placed in a 2.5 mL micro-centrifuge tube. The samples were extracted with 1.5 mL of sodium acetate buffer (pH 5.5) for 45 min at 35 °C with sonication, followed by 15 min of agitation on a revolving centrifuge tumbler at room temperature (22 °C). After centrifugation for 10 min at 10,000×g, 1 mL of supernatant was transferred to a test tube and stored at 4 °C until analysis.

A 1.0 mL aliquot of the extract was equilibrated in a water bath at 40 °C for 5 min before an Amylazyme Red tablet (Megazyme Inc, IR) was added. The contents were incubated at 40 °C for 15 min and removed from the water bath, and the reaction was terminated by adding 10 mL of Trizma base (2%, w/v). The samples were vortexed to ensure homogeneity and allowed to stand at room temperature for 5 min. The solution was filtered through a 1 μm syringe filter (Acrodisc 25 mm syringe filter) before the absorbance at 515 nm was determined using a microplate reader (Model SpectraMaxM5, Molecular Device, CA). The α-amylase activity of each single wheat kernel was calculated against SKB fungal standards (CalBiochem) and reported as SKB units/g.

Image processing and data analysis

Image processing and analysis used Matlab 2008a (The Mathworks, MA, USA) with PLS_toolbox v4.21 (Eigenvector Research Inc.).

Image preprocessing

Images were pre-processed to exclude random noise and unrelated information in both the spatial and spectral planes. Kernels were separated from the background using a simple thresholding method: if the intensity of a pixel was larger than a threshold value, the pixel was retained; otherwise it was set to 0. This procedure ensured that only signals from the kernel were included in the analyses.
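A minimal sketch of this thresholding step is given below. The band used for thresholding and the threshold value are not stated in the text, so `band_index` and `threshold` are hypothetical parameters, as is the function name `mask_background`.

```python
import numpy as np

def mask_background(cube, band_index, threshold):
    """Keep pixels brighter than `threshold` in one band; set all others to 0."""
    cube = np.asarray(cube, dtype=float)
    kernel_mask = cube[:, :, band_index] > threshold  # True on the kernel
    masked = cube * kernel_mask[:, :, None]           # broadcast mask over bands
    return masked, kernel_mask

# Made-up 2 x 2 pixel, 2 band cube: two bright kernel pixels on a dark background.
cube = np.array([[[0.05, 0.04], [0.60, 0.55]],
                 [[0.70, 0.65], [0.03, 0.02]]])
masked, kernel_mask = mask_background(cube, band_index=0, threshold=0.2)
print(kernel_mask)
```

Returning the Boolean mask alongside the masked cube is a convenience for later steps that need to select kernel pixels without relying on zero values.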

Spectral analysis

To maintain an adequate signal-to-noise ratio, only spectral data in the 1255–2300 nm region were included in the spectral analysis. Principal Components Analysis (PCA) and Partial Least Squares (PLS) regression were used. Both methods reduce the spectral information from a wide wavelength range to a few uncorrelated vectors or factors that represent most of the variation or correlation present in the raw data. The PCA components and PLS factors are linear combinations of the original variables. Raw spectra were pre-processed to remove noise arising from variations in the ambient environment during scanning, variations in instrument performance and irrelevant variations among individual kernels. Spectra were area-normalized by dividing each raw spectrum by the area under its curve.
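Area normalization can be illustrated as follows. The text does not say how the area was computed, so the trapezoidal rule used here is an assumption, as are the function name `area_normalize` and the synthetic spectra.

```python
import numpy as np

def area_normalize(spectra, wavelengths):
    """Divide each spectrum (one per row) by the area under its curve."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    wavelengths = np.asarray(wavelengths, dtype=float)
    # Trapezoidal rule (assumed): mean height of neighbouring points times spacing.
    widths = np.diff(wavelengths)
    heights = (spectra[:, 1:] + spectra[:, :-1]) / 2.0
    area = np.sum(heights * widths, axis=1, keepdims=True)
    return spectra / area

# Two synthetic spectra with the same shape but a threefold intensity difference.
wavelengths = np.arange(1255.0, 2301.0, 7.0)
base = 0.4 + 0.1 * np.sin(wavelengths / 200.0)
normalized = area_normalize(np.vstack([base, 3.0 * base]), wavelengths)
print(np.allclose(normalized[0], normalized[1]))
```

After normalization the two spectra coincide, which is the intended effect: overall intensity differences, for example from kernel size or illumination, are removed while the spectral shape is kept.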

In the PLS analysis, the spectral data were used as the regressor (X-variable) and the α-amylase activity value as the regressand (Y-variable). It is well known that α-amylase activity is mainly distributed in the germ region of a kernel. The PLS analysis was therefore applied only to spectra from the germ, in order to minimize the influence of other major components of the wheat kernel, e.g. protein, on the analysis. The germ portion was segmented based on images at 1886 nm, a wavelength at which the optical sensor has high sensitivity. The spectra from the pixels within the segmented germ region were averaged, and this mean germ spectrum was used to represent the spectrum of a kernel.
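The averaging step can be sketched as below. How the germ was segmented from the 1886 nm image is not described, so the simple threshold on that band is purely an assumption for illustration; `band_nearest`, `mean_region_spectrum` and all numeric values are hypothetical.

```python
import numpy as np

def band_nearest(wavelengths, target_nm):
    """Index of the band whose centre wavelength is closest to target_nm."""
    return int(np.argmin(np.abs(np.asarray(wavelengths, dtype=float) - target_nm)))

def mean_region_spectrum(cube, region_mask):
    """Average the spectra of all pixels selected by a 2-D Boolean mask."""
    cube = np.asarray(cube, dtype=float)
    region_mask = np.asarray(region_mask, dtype=bool)
    if not region_mask.any():
        raise ValueError("region mask is empty")
    return cube[region_mask].mean(axis=0)  # cube[mask] has shape (n_pixels, n_bands)

# Made-up 2 x 2 pixel cube with three bands.
wavelengths = np.array([1255.0, 1886.0, 2300.0])
cube = np.array([[[0.2, 0.8, 0.1], [0.4, 0.6, 0.3]],
                 [[0.3, 0.2, 0.2], [0.1, 0.1, 0.1]]])
idx = band_nearest(wavelengths, 1886.0)
germ_mask = cube[:, :, idx] > 0.5   # assumed segmentation rule, not from the study
germ_spectrum = mean_region_spectrum(cube, germ_mask)
print(germ_spectrum)
```

Each kernel is thereby reduced to a single vector with one value per wavelength, which becomes one row of the X matrix used in the PLS regression.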

Results and discussion

The α-amylase activities of each class of wheat kernels are summarized in Table 1. Both CWAD and CWRS samples displayed a range of activities from sound (0 units of α-amylase activity) to greater than 30 units/g of activity with a bias towards lower levels for CWAD and higher levels for CWRS.

Table 1 Summary of the distribution of wheat kernels at different α-amylase activity (SKB units/g) levels

Overview of hyperspectral images of wheat kernels having different α-amylase activities

Principal Components Analysis (PCA) was used to summarize the hyperspectral information in images of wheat kernels. PCA was performed for each kernel individually and the images of principal component scores were examined to discover relationships between image features and α-amylase activities. The first two principal components accounted for the majority of the variance (99.98%) in the data. Higher order principal components did not contribute to the model and contained the random noise components from the images.
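As a rough illustration of how per-kernel score images can be produced, the sketch below runs PCA on the pixel spectra of one kernel via a singular value decomposition. It is not the PLS_toolbox routine used in the study; mean-centring, the function name `pca_score_images` and the random test cube are assumptions.

```python
import numpy as np

def pca_score_images(cube, mask, n_components=2):
    """PCA over the masked pixels of one kernel; returns score images and
    the fraction of variance explained by each retained component."""
    cube = np.asarray(cube, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    pixels = cube[mask]                          # (n_pixels, n_bands)
    centered = pixels - pixels.mean(axis=0)      # assumed mean-centring
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    # Put the scores back into image geometry; background pixels stay NaN.
    images = np.full(mask.shape + (n_components,), np.nan)
    images[mask] = scores
    return images, explained[:n_components]

# Random stand-in for a kernel datacube: 6 x 5 pixels, 10 bands.
rng = np.random.default_rng(0)
cube = rng.random((6, 5, 10))
mask = np.ones((6, 5), dtype=bool)
mask[0, 0] = False                               # pretend one pixel is background
images, explained = pca_score_images(cube, mask)
print(images.shape, explained)
```

Displaying `images[:, :, 0]` and `images[:, :, 1]` as grey-scale pictures gives PC1 and PC2 scores images of the kind discussed in this section.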

The first two principal component scores images of wheat kernels are shown in Fig. 2 (CWAD) and Fig. 3 (CWRS). The number above each image is the measured α-amylase activity for the kernel. Based on visual observation, the PC1 and PC2 scores images provided similar information, except that the germ regions were displayed with opposite contrast. For both classes of wheat kernels, the germ portion always showed higher PC1 and lower PC2 scores than the rest of the kernel body. By construction, PC1 explains the maximum variation in the data, PC2 explains the maximum of the remaining variation, and so on. The higher values in the PC1 scores images suggest that most of the variation in the optical signal came from the germ region of the kernels. The α-amylase content, together with the brighter and more wrinkled surface of the germ region, could account for this. The PC2 scores images mainly explained variation stemming from the kernel body rather than the germ, which resulted in lower values in the germ regions.

Fig. 2 PC1 (a) and PC2 (b) scores images of CWAD kernels with different α-amylase activities; the α-amylase activity is labeled above the image of each kernel

Fig. 3 PC1 (a) and PC2 (b) scores images of CWRS kernels with different α-amylase activities; the α-amylase activity is labeled above the image of each kernel

Kernels with differing levels of α-amylase activity could not be classified according to the PC1 and PC2 scores images unless the difference was very large. In general, for kernels with an α-amylase activity value of 0, the germ portion appeared as an ellipse with a clear edge. The germ portion of kernels with high α-amylase activities (i.e., >30) often showed irregular shapes, and spikes were sometimes observed. This was most likely caused by a swollen germ, ruptured bran above the germ or an extruded shoot. The area of the germ portion in the image (the high value region in the PC1 scores image and the low value region in the PC2 scores image) tended to increase with α-amylase activity, but not robustly enough to predict α-amylase activity levels. The large variation among sound kernels and the small difference between sound kernels and kernels with lower α-amylase activities (i.e., <30) make it difficult to predict or classify α-amylase activity solely from morphological features extracted from the PCA scores images.

Overview of spectra of wheat kernels with different α-amylase activities

In addition to spatial information, hyperspectral imaging provided spectral information for each pixel on a kernel. Figure 4a and b show the mean plots of the normalized spectra of CWAD (288 kernels) and CWRS (264 kernels), respectively. The kernels were grouped according to the classes for α-amylase activity shown in Table 1. In the wavelength region between 1400 and 1900 nm, the spectra of CWAD kernels with different α-amylase activities show little variability. The shape of the spectra for CWRS kernels was similar to that of CWAD, however, in this case the spectra varied with α-amylase activity class. For both classes of wheat, α-amylase activity could not be linked to any single wavelength.

Fig. 4 Average spectra of wheat kernels at different α-amylase activity ranges for CWAD (a) and CWRS (b)

PLS analysis

Alpha-amylase activity prediction

PLS models were built separately for each of the two classes of wheat kernels. All kernels of each class were included in the calibration dataset and leave-one-out cross-validation was used to evaluate the model. PLS was performed on the wavelength region between 1255 and 2300 nm in 7 nm increments, and the number of latent factors was set to 6 according to the cumulative explained variance for both the X and Y variables. R² values of 0.54 and 0.73 were obtained for CWAD and CWRS, respectively.
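The evaluation scheme described above can be sketched with a small single-response PLS (NIPALS) implementation and a leave-one-out loop. This is an illustration on synthetic data, not the PLS_toolbox procedure used in the study: the function names, the synthetic data and the definition of R² as 1 − SS_res/SS_tot on the cross-validated predictions are all assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """Fit single-response PLS by NIPALS; returns centring terms and coefficients."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # unit-length weight vector
        t = Xr @ w                      # scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        qa = yr @ t / tt                # y loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - qa * t                # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def loo_cv_r2(X, y, n_factors):
    """Leave-one-out cross-validated predictions and the resulting R^2."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        x_mean, y_mean, coef = pls1_fit(X[keep], y[keep], n_factors)
        preds[i] = y_mean + (X[i] - x_mean) @ coef
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, preds

# Synthetic stand-in: 40 "kernels", 20 "bands", three underlying latent factors.
rng = np.random.default_rng(0)
T = rng.normal(size=(40, 3))
X = T @ rng.normal(size=(3, 20)) + 0.01 * rng.normal(size=(40, 20))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=40)
r2, preds = loo_cv_r2(X, y, n_factors=3)
print(round(r2, 3))
```

With real kernel spectra, `X` would hold one mean germ spectrum per row and `y` the measured α-amylase activities; the study used 6 latent factors.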

The higher R² value for CWRS might be due to larger variation in enzyme activity among CWRS kernels than among CWAD kernels. The kernels were also distributed more evenly across the different α-amylase activity ranges in the CWRS samples, which could likewise account for the better performance of the CWRS prediction model. In addition, these two classes of wheat differ in translucency, which would result in different light-path lengths within kernels and different surface reflectivity. Consequently, the spectra of CWAD and CWRS do not contain the same information for the purpose of α-amylase activity prediction and could not be combined into a single predictive model.

Alpha-amylase activity was overestimated when it was low and underestimated when it was high for both classes of wheat (Fig. 5). Including more kernels with non-zero α-amylase activity in the dataset would be expected to improve the prediction accuracy. Unfortunately, since the assay for α-amylase is destructive and time-consuming, these values are known only after a selected kernel has been imaged, which limits the ability to select kernels that provide a well distributed sample set.

Fig. 5 Predicted versus measured plot of α-amylase activity for CWAD (a) and CWRS (b) kernels

The Variable Importance in Projection (VIP) scores plot (Fig. 6), which was obtained from the PLS regression procedure, identifies key wavelengths in the analysis. Variables with scores larger than 1 are normally considered important predictors [18]. For CWRS, five peaks were therefore picked, corresponding to wavelengths centered at 1397, 1634, 1874, 2076 and 2228 nm. For CWAD, six wavelengths centered at 1347, 1410, 1627, 1721, 2064 and 2159 nm were considered important variables for predicting α-amylase activity. These selected wavelengths are located at the peaks of the reflectance spectra. The chemical meaning of these wavelengths is still under investigation. PLS regression using only the selected wavelengths gave lower R² values (0.37 for CWAD and 0.50 for CWRS) than regression using the full wavelength region, suggesting that predicting α-amylase activity requires more than just a few selected wavelengths.
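For reference, a commonly used definition of the VIP score for wavelength j in a single-response PLS model is given below. The notation is not taken from the source, and whether the software used in this study implements exactly this form is an assumption: p is the number of wavelengths, A the number of latent factors, w_{ja} the weight of wavelength j in factor a, t_a the score vector of factor a, and q_a its y-loading.

$$ \mathrm{VIP}_j = \sqrt{ \frac{ p \sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^2 }{ \sum_{a=1}^{A} SS_a } }, \qquad SS_a = q_a^2 \, \mathbf{t}_a^{\top} \mathbf{t}_a $$

Under this definition the squared VIP scores average to 1 over all wavelengths, which is why a score above 1 is read as an above-average contribution to the model.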

Fig. 6 VIP scores plot of PLS model for CWAD (a) and CWRS (b)

Although the amount of α-amylase was minuscule compared with other chemical constituents of the wheat kernel, such as protein and starch, a strong relationship with the spectral data was found using a PLS modeling approach. It is suspected that the analysis primarily detects secondary effects of α-amylase activity within the kernel, although there is no direct evidence of this. The spectral method has significant potential for damage detection in terms of both time saved and chemicals consumed. Furthermore, the non-destructive nature of the imaging technique makes it suitable for individual kernel inspection. In summary, although the prediction accuracy obtained in this study is limited, it suggests the potential to develop a high-speed optical sorter that predicts the α-amylase activity of individual wheat kernels, especially for CWRS.

Classification of CWRS kernels based on α-amylase activity

The CWRS α-amylase activities were represented approximately evenly throughout the analysis range and it seems possible to separate the kernels with high level α-amylase activities from those with low levels (Fig. 5b).

This data set was chosen to build a classification model to separate kernels according to the predicted α-amylase activity.

To verify the efficiency of this classification, the entire dataset of each class of wheat kernels was randomly split: 70% of the kernels were used as the calibration dataset and 30% as the validation dataset. The data split and classification procedure was repeated 1000 times, and the mean and standard deviation of these 1000 classification accuracies were recorded (Table 2). In the absence of a priori knowledge of the best α-amylase activity value for separation, a range of α-amylase activity values was tested. The highest classification accuracy reached 92 ± 4% for ‘high level’ and 88 ± 8% for ‘low level’ samples when the α-amylase activity threshold value was set at 5 (Table 2a). A higher separation activity value does not necessarily lead to a better classification model.
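The repeated random-split evaluation can be sketched as follows. The classifier here is a nearest-centroid rule used purely as a stand-in, not the model used in the study; the function names, the rule that activity above the threshold counts as ‘high level’, and the synthetic data are likewise assumptions.

```python
import numpy as np

def nearest_centroid_predict(X_cal, labels_cal, X_val):
    """Stand-in classifier: assign each sample to the closest class centroid."""
    classes = np.unique(labels_cal)
    centroids = np.array([X_cal[labels_cal == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(X_val[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dist, axis=1)]

def repeated_split_accuracy(X, activity, threshold, n_repeats=1000,
                            cal_fraction=0.7, seed=0):
    """Mean and standard deviation of per-class validation accuracy over
    repeated random calibration/validation splits."""
    rng = np.random.default_rng(seed)
    labels = (np.asarray(activity) > threshold).astype(int)  # 1 = high, 0 = low
    n = len(labels)
    n_cal = int(round(cal_fraction * n))
    acc = {0: [], 1: []}
    for _ in range(n_repeats):
        order = rng.permutation(n)
        cal, val = order[:n_cal], order[n_cal:]
        if len(np.unique(labels[cal])) < 2:
            continue                      # skip splits lacking one class
        pred = nearest_centroid_predict(X[cal], labels[cal], X[val])
        for c in (0, 1):
            in_class = labels[val] == c
            if in_class.any():
                acc[c].append(np.mean(pred[in_class] == c))
    return {c: (float(np.mean(v)), float(np.std(v))) for c, v in acc.items()}

# Synthetic, well separated example: 30 'low' and 30 'high' kernels, 4 features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (30, 4)), rng.normal(1.0, 0.1, (30, 4))])
activity = np.concatenate([rng.uniform(0, 4, 30), rng.uniform(10, 30, 30)])
result = repeated_split_accuracy(X, activity, threshold=5, n_repeats=50)
print(result)
```

Reporting accuracy separately for the two classes, as in Table 2, avoids a majority class masking poor performance on the other; running the function over a range of `threshold` values mirrors the search for the best separation value.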

Table 2 Classification results for CWRS kernels using (a) full wavelength region and (b) five wavelengths (1397, 1634, 1874, 2076, 2228 nm)

The same classification procedure was then repeated for CWRS kernels using only the five wavelengths selected by the VIP procedure (Fig. 6b). A classification accuracy of around 80% was achieved (Table 2b). The highest classification accuracy was achieved when the α-amylase activity threshold value was set at 20. These results show considerable promise that a quick, efficient and effective method for the identification of kernels with high level α-amylase activity could be established using a limited number of wavelengths.

Conclusions

A SWIR hyperspectral imaging system (1000–2500 nm) was used to detect sprout damage and predict the α-amylase activity at early germination stages in two classes of wheat kernels: CWAD and CWRS. The spectral information from the germ (1255–2300 nm) was used in PLS regression models to predict α-amylase activity. The PLS models for CWAD and CWRS achieved R² values of 0.54 and 0.73, respectively. For the CWRS data set, discriminant analysis was used to classify the α-amylase activity levels into high and low levels with a classification accuracy of 92 ± 4% and 88 ± 8%, respectively. It was impossible to distinguish the kernels with no α-amylase activity from those with little α-amylase activity at this time. Five key wavelengths were selected for CWRS according to the VIP scores plot and the data set classified into high and low α-amylase activity with an accuracy of about 80%. This shows the potential for multispectral imaging as a technique for α-amylase activity classification in wheat kernels.