Introduction

The potato is the world’s fourth most important carbohydrate crop and is cultivated in most climate zones inhabited by humans. However, as opposed to the other important carbohydrate crops wheat, rice and corn, potatoes are harvested and stored with fully hydrated and metabolically active cells. This means that potatoes are much more susceptible to harvest damages, disease attacks, storage loss and other detrimental factors. Potatoes furthermore undergo a natural cycle of dormancy and sprout initiation, which dramatically changes the carbohydrate composition and thereby marketability and potential use of stored potatoes.

The large heterogeneity in the raw material creates the need to sort and assess the quality of potatoes. Traditionally, potatoes have been sorted and quality has been assessed by labour-intensive manual processes, brine separation and relatively simple spot sample methods. However, increasingly larger storehouses, distribution centres and processing facilities and increasingly higher demands posed by the consumers and regulations call for improved methods of estimating various quality parameters of individual potatoes. This has created an interest and a market for implementing increasingly complex technological detectors of interior quality.

Methodologies for estimation of interior qualities in potatoes and other food items have been tested under laboratory and semi-industrial conditions for many years. Among these, X-ray (Birth 1960; Finney and Norris 1978), acoustic resonance (Elbatawi 2008) and hyperspectral imaging (Dacal-Nieto et al. 2011) have been utilized to detect hollow hearts and other internal abnormalities. Non-destructive magnetic resonance imaging has been applied to test different quality parameters and to detect internal bruises (Thybo et al. 2000, 2003, 2004). In addition, several destructive methods (e.g. photo-spectrometry, HPLC and LC-MS) have been applied to develop instruments applicable on an industrial scale.

Hyperspectral imaging belongs to a group of methodologies, which all utilize the fact that a given sample will reflect, absorb or transmit applied light of defined wavelength differently according to the molecular and physical composition of the sample. Quantitative methods based on the transmission of light through a sample often require a destructive preparation of samples of very accurate thickness, whereas methods based on reflection may be performed non-destructively on intact potatoes. Particularly, near-infrared (NIR) spectrometry has been tested extensively on potatoes (Zhang et al. 2014; Rady et al. 2014, 2015; and reviewed by López et al. (2013)), and the reported results are particularly promising for the starch-related predictions and, to some extent, for prediction of reducing sugars, proteins and amino acids.

Dielectric response is based on the phenomenon that different tissues will exhibit different electrical polarization and conductivity as the frequency of an alternating electric field is changed. The application of this effect is traditionally partitioned into low-frequency and high-frequency regimes. The low-frequency regime is measured with electric impedance spectroscopy (EIS). EIS primarily measures ionic beta-relaxation effects related to polarization on insulating surfaces in the tissue (Schwan and Foster 1980). These beta-relaxations are highly sensitive to cell damage and the technique has previously been applied for detection of freeze/thaw injury in potatoes and carrots (Zhang and Willison 1992), determination of dry matter concentration in tea leaves (Mizukami et al. 2007) and viability of snap beans (Repo et al. 2002). The high-frequency regime is measured with dielectric spectroscopy (DS). DS primarily measures the dipolar relaxations in the tissue. It has been possible to establish a good correlation between the dielectric response and the dry matter concentration for different types of food tissue. This includes materials which have relatively high dry matter concentration such as peanuts (Trabelsi and Nelson 2006), cereal grains (Trabelsi and Nelson 2003), dried carrots (Dunlap and Makower 1945), palm fruit (Yee et al. 2011) and dried tobacco leaves (Meyer and Schilz 1980).

Low-field nuclear magnetic resonance (LF-NMR) relaxometry is based on the phenomenon that the magnetic resonance of a sample proportionally reveals the molecular context of the atoms in the sample. LF-NMR is a destructive method, as it requires a very uniform sample size cut from the potato. The analysis method is, in its current form, laborious and time consuming and may only be applicable as a spot sample method. However, the instrument is comparatively inexpensive, and if the method proves to provide stable and reliable predictions of key measurements, it would be technically possible to construct automated spot sample machinery. LF-NMR has previously been successfully applied to dry matter determination in potatoes and prediction of sensory quality of potatoes for boiling (Hansen et al. 2010; Thybo et al. 2000).

The scope of the present study was to screen and develop the industrial potential of hyperspectral imaging, dielectric resonance and low-field nuclear magnetic resonance methodologies. Different setups were explored for their prediction capabilities of key quality aspects in potatoes and for their possible implementation in marketable sorting machines, spot sample instruments and hand-held scanners.

Materials and Methods

Plant Material

Tubers of 10 potato cultivars representing the variation in starch concentration marketed in Denmark were sampled. The tubers were harvested at different Danish farms during the autumn of 2013 and were stored at 5–6 °C in commercial cold storage rooms until sampling at the end of March 2014. From each cultivar, six typical tubers of 100–200 g per tuber were selected from a set of at least 100 tubers. The density and size of each tuber (length, width and height) were recorded and the 60 tubers were randomly assigned an ID number. These tubers were first used for hyperspectral imaging of intact tubers (HS-full) and then prepared for further spectral and biometric analyses.

Sample Preparation

The growth direction of each tuber was marked and the ends were cut off, leaving a central piece of 25 mm of each tuber (Fig. 1a). This piece was trimmed to remove the cortex tissue and a cylinder (Ø 15 mm) from the central region was cut out and discarded to decrease the amount of pith-derived tissue in the piece. Subsequently, three cylinders were cut out of the medullary tissue: a cylinder (Ø 15 mm) for hyperspectral imaging (HS-part), EIS and DS; a cylinder (Ø 14 mm) for LF-NMR; and a cylinder (Ø 2 mm) for microscopy. The cylinders for microscopy were oriented according to the observed growth direction of the tuber and cut obliquely about one third from the most proximal point before use. The remaining tuber tissue was cut into cubes of approximately 5 × 5 × 5 mm, thoroughly mixed and divided into two subsamples: one for freezing in liquid nitrogen, freeze drying and mixing in a bead mill (Mixer Mill MM 200, Retsch, Germany) for chemical analyses and the other for electric conductivity measurements.

Fig. 1

a Sampling of potatoes. b Hyperspectral imaging of intact potato (HS-full): a intact potato, b halogen lamps, c hyperspectral camera, d rotating rollers. c Hyperspectral imaging of potato cylinder (HS-part): a 15 × 24 mm potato cylinder, b hyperspectral camera, c halogen lamp, d cylinder manually turned four times and each quarter scanned. d Dielectric spectroscopy (DS): a 15 × 24 mm potato cylinder, b probe, c vector network analyser, d weight scale. e Electrical impedance spectroscopy (EIS): a 15 × 24 mm potato cylinder, b probe, c lock-in amplifier, d weight scale

Hyperspectral Imaging of Intact Tubers (HS-Full)

A dynamic optical setup was fabricated and built into a prototype sorting machine (Fig. 1b) to capture the entire surface of tubers. The setup consisted of a hyperspectral camera with an ImSpector V10 spectrograph (Specim, Finland), a TechSpec 16 mm VIS-NIR objective (Edmund Optics, USA), two rolling cylinders and several halogen lamps to avoid shading during the measurements. While the potatoes were rotating, 100 consecutive images were obtained within 1 s with exposure times of 2 ms to avoid motion blur. The tangential velocity of the cylinders was approximately 300 mm/s. Data were recorded from 216 channels ranging from 380 to 925 nm at a spatial resolution of approximately 1.0 mm/pixel. Barium sulphate was used as a reference for white in the analyses and each potato tuber was measured twice.
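The acquisition figures quoted above can be checked with a little arithmetic. The sketch below is only illustrative: it assumes that the tuber surface moves at the tangential velocity of the rollers (rolling without slip), and the function and constant names are hypothetical.

```python
# Values quoted in the text; all names are illustrative, not from the study.
TANGENTIAL_VELOCITY_MM_S = 300.0  # approximate roller surface speed
EXPOSURE_S = 0.002                # 2 ms exposure per image
IMAGES_PER_S = 100                # 100 images within 1 s
PIXEL_SIZE_MM = 1.0               # approximate spatial resolution


def motion_blur_mm(velocity_mm_s: float, exposure_s: float) -> float:
    """Distance the surface travels while one image is exposed."""
    return velocity_mm_s * exposure_s


def travel_between_images_mm(velocity_mm_s: float, images_per_s: float) -> float:
    """Distance the surface travels between the starts of two images."""
    return velocity_mm_s / images_per_s


blur = motion_blur_mm(TANGENTIAL_VELOCITY_MM_S, EXPOSURE_S)
step = travel_between_images_mm(TANGENTIAL_VELOCITY_MM_S, IMAGES_PER_S)

print(f"blur per image: {blur:.2f} mm ({blur / PIXEL_SIZE_MM:.1f} pixel)")
print(f"surface travel between images: {step:.1f} mm")
```

Under the no-slip assumption, the blur per exposure (0.6 mm) stays below the approximate pixel size of 1.0 mm, which is consistent with the stated aim of avoiding motion blur.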

Hyperspectral Imaging of Cylinders (HS-Part)

Potato tuber cylinders were analysed with a 4-megapixel 16-bit CMOS camera (Qtec, Denmark) equipped with an ImSpector V10E spectrograph (Specim, Finland) and a TechSpec 35 mm VIS-NIR objective (Edmund Optics, USA). This setup provided hyperspectral images of the cylinders in the region from 400 to 1050 nm with a high spatial (38 μm/pixel) and spectral resolution (1200 pixels). The samples were illuminated with a powerful halogen lamp positioned directly above the camera (Fig. 1c). The potato cylinders were placed approximately 20 cm from the camera and four hyperspectral images were taken, one of each quarter of the cylinder, giving four replicates. The exposure time was set to 29 ms and the aperture opening to F/4, which gave a depth of focus of 4 mm. Barium sulphate was used as a reference for white in the analyses.

Dielectric Spectroscopy

Dielectric spectroscopy of the potato cylinders was performed with a ZVRE network analyser (Rohde & Schwarz, Germany). The one-port scattering parameter was measured at 100 logarithmically spaced points in a frequency interval of 0.01 to 4 GHz. The applied signal had a power of 1 mW. A custom-made coaxial sensor produced from a ‘semi-rigid’ coaxial cable (EZ_250_AL_TP, Huber+Suhner, Switzerland) was used. The sensor was connected to the network analyser by a Sucoflex 101 PE high-frequency test cable (Huber+Suhner, Switzerland), and the procedure of Bao et al. (1994) was used both to calibrate the setup and to convert the reflection coefficients to dielectric functions. Reference measurements were performed with the probe short-circuited against aluminium foil, in free air and immersed in pure water. During measurement, the probe was aligned with the surface of the cylinder and a force between 1.5 and 2.0 N was applied (Fig. 1d). Each cylinder was measured three times.
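A logarithmically spaced frequency sweep such as the one described for the DS measurements can be generated as sketched below. This is an illustration rather than the instrument's own routine: it assumes that both end points of the interval are included, and the function name is hypothetical.

```python
def log_spaced(f_start_hz: float, f_stop_hz: float, n_points: int) -> list[float]:
    """Return n_points frequencies with a constant ratio between neighbours.

    Assumption: both end points are part of the sweep.
    """
    if n_points < 2:
        raise ValueError("at least two points are needed")
    if f_start_hz <= 0 or f_stop_hz <= f_start_hz:
        raise ValueError("need 0 < f_start_hz < f_stop_hz")
    ratio = f_stop_hz / f_start_hz
    return [f_start_hz * ratio ** (i / (n_points - 1)) for i in range(n_points)]


# DS sweep from the text: 100 points from 0.01 GHz to 4 GHz.
ds_freqs_hz = log_spaced(0.01e9, 4e9, 100)
print(f"first: {ds_freqs_hz[0]:.3e} Hz, last: {ds_freqs_hz[-1]:.3e} Hz")
print(f"ratio between neighbours: {ds_freqs_hz[1] / ds_freqs_hz[0]:.4f}")
```

Equal ratios between neighbouring points give every frequency decade the same number of measurements, which suits relaxation processes that extend over several decades.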

Electrical Impedance Spectroscopy

Electrical impedance spectroscopy was performed with an HF2LI Lock-in Amplifier and an HF2TA Current Amplifier (Zurich Instruments, Switzerland). A voltage of 100 mV was applied and measurements were made at 40 logarithmically spaced points in a frequency interval from 10 Hz to 1 MHz. The lateral position of the electrodes was fixed in a sample holder made of non-conducting material. Both electrodes had a surface area markedly larger than the sample cross-section, so the cell constant, $C = L/(\pi R^2)$, was determined purely by the radius (R) and the length (L) of the sample. The electrodes were made of chemical-resistant 316L stainless steel. The sample holder was placed on a weight scale to standardize the applied force among samples. A potato cylinder was placed in the sample holder and a force between 1.3 and 1.6 N was applied to ensure good contact between the sample and the electrodes (Fig. 1e). Between measurements, the sample was briefly removed from the sample holder and put back in place. Three measurements were taken on each cylinder.
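As a worked example of the cell constant, the sketch below evaluates C = L/(πR²) for a cylinder of Ø 15 mm and 24 mm length (the dimensions given in the caption of Fig. 1) and uses it to convert an impedance into a conductivity via σ = C/Z. The impedance value is hypothetical and only illustrates the conversion. The function names are likewise illustrative.

```python
import math


def cell_constant_per_m(length_m: float, radius_m: float) -> float:
    """Cell constant C = L / (pi * R^2) of a cylindrical sample, in 1/m."""
    return length_m / (math.pi * radius_m ** 2)


def conductivity_s_per_m(impedance_ohm: complex, cell_constant: float) -> complex:
    """Complex conductivity sigma = C / Z, in S/m."""
    return cell_constant / impedance_ohm


# Cylinder dimensions from the Fig. 1 caption: diameter 15 mm, length 24 mm.
C = cell_constant_per_m(length_m=0.024, radius_m=0.0075)
print(f"cell constant: {C:.1f} 1/m")

# Hypothetical impedance reading, NOT a measured value from the study.
z_example = complex(1000.0, -200.0)
sigma = conductivity_s_per_m(z_example, C)
print(f"|sigma| = {abs(sigma):.4f} S/m")
```

Because the electrodes are larger than the sample cross-section, only the sample geometry enters C, so the same constant applies to every cylinder cut to these dimensions.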

Low-Field Magnetic Resonance

A potato cylinder was placed inside a glass tube (Ø 14 mm), which was placed inside an NMR glass tube probe after the sample had been equilibrated at 25 °C for 15 min in a water bath. Proton T2 relaxation measurements were performed on a Maran Benchtop Pulsed NMR analyser (Resonance Instruments, UK) operating at a resonance frequency of 23.2 MHz. Transverse relaxation, T2, was measured using the Carr–Purcell–Meiboom–Gill (CPMG) sequence. The T2 measurements were performed with a τ value (time between the 90° and 180° pulses) of 500 μs. Data from 4096 echoes were acquired as 16 scan repetitions with a repetition time of 5 s between the scans.

Density, Dry Matter, Soluble Sugars and Starch

The density relative to water (g/ml) of potato tubers was determined as $W_o / (W_o - W_u)$, where $W_o$ is the weight over water and $W_u$ is the weight under water. The dry matter percentage was determined as $(W_d / W_f) \times 100$ from the weights of fresh ($W_f$) and freeze-dried ($W_d$) materials. The concentrations of soluble sugars (glucose, fructose and sucrose) and of starch, expressed as glucose equivalents after enzyme digestion, were determined by ion chromatography using a method slightly modified from Kjaer et al. (2014).
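The two formulas above translate directly into code. The sketch below is illustrative only: the function names are hypothetical and the weights in the example are made-up numbers, not measurements from the study.

```python
def relative_density(weight_over_water_g: float, weight_under_water_g: float) -> float:
    """Density relative to water: W_o / (W_o - W_u)."""
    displaced = weight_over_water_g - weight_under_water_g
    if displaced <= 0:
        raise ValueError("weight under water must be smaller than weight over water")
    return weight_over_water_g / displaced


def dry_matter_percent(dry_weight_g: float, fresh_weight_g: float) -> float:
    """Dry matter percentage: (W_d / W_f) * 100."""
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    return dry_weight_g / fresh_weight_g * 100.0


# Hypothetical tuber: 150.0 g over water and 12.0 g under water.
density = relative_density(150.0, 12.0)
# Hypothetical subsample: 10.0 g fresh tissue leaving 2.1 g after freeze drying.
dry_matter = dry_matter_percent(2.1, 10.0)

print(f"relative density: {density:.3f}")
print(f"dry matter: {dry_matter:.1f} %")
```

The denominator W_o − W_u is the weight of the displaced water, so the ratio is the tuber density relative to water.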

Amino Acids

Amino acids were extracted from 50 mg freeze-dried tuber material in 1 ml of a 1:1 (vol/vol) solution of water and methanol (99.8%, Sigma-Aldrich) in a sonic bath for 15 min. Samples were centrifuged (14,000 × g, 15 min, 4 °C) and 500 μl of the supernatant was evaporated to dryness under vacuum. NMR was performed on a Bruker Avance III 600 equipped with a TXI probe (Bruker Biospin, Germany) using a method slightly modified from Kjaer et al. (2014). Amino acids were quantified by automatic fitting and manual curation of experimental data to pure standards in Chenomx NMR Suite (Chenomx Inc., Canada).

Electric Conductivity

Electric conductivity was measured in potato sap pressed from tuber tissue and immediately chilled to 0 °C using an automated CDM80 Conductivity Meter equipped with a CDC114 electrode (Radiometer, Denmark).

Microscopy

The cylinders were immediately placed in formic acid anhydride (FAA) for 5 days and dehydrated for 24 h in each of a series of ethanol concentrations, i.e. 30, 70, 80, 90, 95 and 99%. The samples were cast in Technovit® (Kulzer, Germany) and sectioned in a microtome: 12 µm sections of the cast were sliced until an intact cylinder was observed, after which 8 sections were collected for analysis, 10 sections were discarded and a further 8 sections were collected for analysis. The two series of 8 sections were placed on glass slides, submerged in Calcofluor White Stain (18909-F, Sigma-Aldrich) under aluminium foil for 1 min, removed and allowed to air-dry for 10 min and mounted under cover glass. Microscope analysis was performed on a Nikon Z100 (Nikon Instruments) equipped with a 200 W fluorescent light source and FITC filters (465–495 nm/505 nm/515–555 nm) at ×200 magnification. From each of the two series, the three most intact sections were selected for image analyses using Aphelion™ Lab Image Analysis Software (Adcis, France). Only cells with areas between 100 and 40,000 μm² and a compactness index larger than 0.35 were included in the further data analyses.

Statistical Analysis

All calculations and graphics were performed in R (R Development Core Team 2015). The spectral variables were normalized to zero mean and unit variance over all 60 samples. To obtain a measure of the prediction error as close as possible to real-life application, no outliers were removed from the biochemical and spectral data. To eliminate the surrounding background in the hyperspectral data, only pixels with intensities above a fixed threshold were included, and these were corrected for non-uniform illumination by normalization over the spectral dimension at each pixel point. Finally, the average spectrum was calculated for each sample and used in the further calculations.
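The preprocessing steps described above can be sketched in a few lines. The text does not specify the threshold value, which intensity is thresholded or how the per-pixel normalization was done, so the choices below (thresholding on the mean intensity of a pixel, dividing each pixel spectrum by its own mean, and scaling with the sample standard deviation) are assumptions, and all names are hypothetical. The study itself used R; Python is used here purely for illustration.

```python
from statistics import mean, stdev


def sample_mean_spectrum(pixels: list[list[float]], threshold: float) -> list[float]:
    """Average spectrum of one sample after background removal.

    Assumptions: a pixel is kept if its mean intensity exceeds `threshold`,
    and illumination is corrected by dividing each kept pixel spectrum by
    its own mean over the spectral dimension.
    """
    kept = []
    for spectrum in pixels:
        level = mean(spectrum)
        if level > threshold:
            kept.append([value / level for value in spectrum])
    if not kept:
        raise ValueError("no pixel exceeds the threshold")
    # Average band by band over the retained pixels.
    return [mean(band) for band in zip(*kept)]


def autoscale(rows: list[list[float]]) -> list[list[float]]:
    """Scale each spectral variable (column) to zero mean and unit variance.

    Assumes every column varies across samples (non-zero standard deviation).
    """
    columns = list(zip(*rows))
    centres = [mean(column) for column in columns]
    spreads = [stdev(column) for column in columns]
    return [
        [(value - centre) / spread for value, centre, spread in zip(row, centres, spreads)]
        for row in rows
    ]


# Toy example with two bands: one dark background pixel and two potato
# pixels that differ in brightness but not in spectral shape.
toy_pixels = [[0.0, 0.1], [2.0, 4.0], [1.0, 2.0]]
toy_spectrum = sample_mean_spectrum(toy_pixels, threshold=0.5)
print(toy_spectrum)
```

In the toy example, the two retained pixels yield identical normalized spectra although one is twice as bright as the other, which is the intended effect of the illumination correction.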

Pearson’s correlation coefficients (r) were calculated between the different variables, and a one-way random effect model (Gamer et al. 2012) was applied to calculate the intra-class correlations with cultivars. A conservative nested cross-validated partial least squares regression (PLSR) procedure was used for model parameter selection and model assessment (Stone 1974). The specific algorithm was modified from the ‘method 2’ reported by Krstajic et al. (2014), omitting the stratification step. Ten repetitions of a 10-fold cross-validation with between 1 and 35 PLSR components were used in the ‘inner loop’. The optimal number of components (Tables S1–S5) was selected from the RMSE using the one-standard-error selection criterion of Breiman et al. (1984). Fifty repetitions of a 10-fold cross-validation were used in the ‘outer loop’. The final nested cross-validated error estimates are given as means and standard deviations of the RMSE and R² calculated from the outer loop (Tables S1–S5). The inner cross-validation procedure was performed with the caret package (Kuhn et al. 2015) and the orthogonal score algorithm was used in the PLSR implementation (Mevik et al. 2013). Variable importance in the projections (VIPs) was calculated as described by Chong and Jun (2005). The number of PLSR components used in the calculations of VIPs and the inter-repetition error was determined as the mode of the distribution from the inner loop. The inter-repetition error was calculated by a statistical simulation to estimate the effect of the spectral variance on the biometric variables. One spectral replicate of each sample was randomly selected to build a PLSR model and the model was evaluated using the remaining replicates. This procedure was repeated 100 times. The inter-repetition error was estimated as the RMSE of the prediction of each sample averaged over all samples.
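The one-standard-error criterion used in the inner loop can be illustrated as follows: among all candidate numbers of components, the smallest one whose mean cross-validated RMSE lies within one standard error of the best mean is chosen. The sketch below is a generic version of that rule, not the code used in the study (which was written in R with the caret package). The function name, the way the standard error is computed over resamples and the RMSE values are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_se_components(rmse_by_components: dict[int, list[float]]) -> int:
    """Pick the smallest model within one standard error of the best one.

    rmse_by_components maps a number of PLSR components to the RMSE values
    obtained for it over the cross-validation resamples.
    """
    summary = {
        k: (mean(values), stdev(values) / sqrt(len(values)))
        for k, values in rmse_by_components.items()
    }
    # Model with the lowest mean RMSE and its standard error.
    best_k = min(summary, key=lambda k: summary[k][0])
    best_mean, best_se = summary[best_k]
    limit = best_mean + best_se
    # Simplest model whose mean RMSE does not exceed the limit.
    return min(k for k, (k_mean, _) in summary.items() if k_mean <= limit)


# Made-up RMSE values for one to four components (three resamples each).
toy_rmse = {
    1: [1.00, 1.10, 0.90],
    2: [0.60, 0.68, 0.64],
    3: [0.58, 0.66, 0.62],
    4: [0.59, 0.70, 0.63],
}
selected = one_se_components(toy_rmse)
print(f"selected number of components: {selected}")
```

In the toy data, three components give the lowest mean RMSE, but two components lie within one standard error of it, so the rule selects the more parsimonious two-component model.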

Results and Discussion

The selected potato cultivars varied in tuber weight, tuber density, dry matter, starch, soluble sugars, amino acid concentrations, cell area, cell perimeter and conductivity of the cell sap (Table 1). Several of these biometric parameters were inter-correlated as shown in Fig. 2. The raw spectra of the tubers and cylinders with selected colour markings are given in Fig. S1. To evaluate the different spectral methods under realistic conditions, no outliers were excluded from the statistical analyses, although some obvious candidates were present in the spectra (Fig. S1). This may have impaired the prediction performance in some cases. The prediction performance of the hyperspectral imaging and the dielectric spectroscopy methods for density and for glucose and asparagine concentrations is shown in Fig. 3. The overall results of the modelled predictions are given in Fig. 4 and in Tables S1–S5. Figures S2–S6 show the VIPs of the biometric parameters for the different spectral methods.

Table 1 Measured variables from 60 potatoes from 10 different varieties. All biochemical concentrations were recalculated to fresh weight values. Range indicates the upper and lower values in the data sets. SD is standard deviation and SE is standard error

Fig. 2

Pearson’s correlation coefficients (r) between measured variables. Blue colours represent positive correlations and red colours represent negative correlations according to colour scale. Numbers indicate the r values of the correlations

Fig. 3

Selected examples of predicted vs. observed values in hyperspectral imaging of intact potatoes (HS-full; a–c) and dielectric spectroscopy (DS; d–f). Box plots illustrate the variation from the repeated cross-validation steps

Fig. 4

Prediction R² values for the individual variables (Tables S1–S5) of the spectral methods hyperspectral imaging of intact potato (HS-full), hyperspectral imaging of potato cylinder (HS-part), dielectric spectroscopy (DS), electrical impedance spectroscopy (EIS) and low-field nuclear magnetic resonance (LF-NMR)

Prediction of Density, Dry Matter and Starch

Starch was the most abundant compound in the potatoes, and the concentrations varied between 9.26 and 21.9 g 100 g⁻¹ fresh weight in the 10 selected potato cultivars (Table 1). As starch makes up the major part of the dry matter in potatoes and the density of potatoes increases with increasing starch concentration, density, dry matter and starch were correlated (Fig. 2). These three biometric parameters were included in the present study as they are used in the industry as measures of starch concentration. Density was determined using the intact potato tuber, whereas the dry matter and starch concentrations were determined using the medullary tissue excised from the tuber. This distinction enabled us to evaluate the accuracy of the different spectral methods in relation to the sampling method.

The results from the two hyperspectral methodologies HS-full and HS-part showed R² values in the ranges of 0.66–0.71 and 0.31–0.42, respectively (Fig. 4, Tables S1 and S2). The HS-full method generally performed better than the HS-part method, possibly because the differences in the spatial distribution of dry matter and starch are averaged out in the HS-full method, whereas the HS-part method is more sensitive to differences between the scanned tissue and the tissue sampled for biochemical analyses.

Prediction of starch and dry matter-related parameters in potato tubers by NIR-related methods has been researched by several groups, and the results are summarized by López et al. (2013). For example, Hartmann and Büning-Pfaue (1998) used NIR on homogenized potato samples and achieved R² values of 0.97 and 0.93 for dry matter and starch, respectively. Scanlon et al. (1999) used NIR on potato cylinders cut from tubers from two seasons and obtained R² values of 0.64–0.75 for density and 0.58–0.77 for dry matter. Using NIR on intact tubers, Chen et al. (2004) predicted the cumulative carbohydrate concentration with an R² value of 0.86. Although comparisons among results are difficult to make, the prediction results of the present HS-full study may be categorized as relatively good.

The results of the two dielectric methodologies DS and EIS showed R² values in the ranges of 0.60–0.71 and 0.25–0.31, respectively (Fig. 4 and Tables S3 and S4). Good correlations between the real part of the DS spectrum and density, dry matter and starch were expected, as starch has a lower dipole density than water in the applied frequency interval. The literature shows that the precision in dry matter prediction based on DS is highly dependent on the specific application. Mean absolute errors below 0.6% have been reported for homogenized fish meal (Kent and Meyer 1982) and a standard error of 1.8% for cod blended with added starch and 62.5–80.1% water (Kent et al. 2000). Application of the DS method to less homogenized samples and products with relatively low dry matter concentrations results in lower prediction values. Guo et al. (2007) observed an R² of around 0.7 for apples containing 15–23% dry matter. This is in good agreement with the present results.

Results from the LF-NMR method showed R² values in the range of 0.50–0.58 for density, dry matter and starch (Fig. 4, Table S5). The LF-NMR method primarily reflects the water distribution and properties of the sampled material (Li et al. 2014), and other studies have also used it to predict the dry matter and starch concentrations of potatoes (Thybo et al. 2000; Thygesen et al. 2001; Povlsen et al. 2003; Hansen et al. 2010). The prediction abilities in the present study (Fig. 4, Table S5) were marginally poorer than, but comparable to, those reported in these studies.

Prediction of Soluble Sugars

The soluble sugars usually constitute the largest proportion of the soluble solids in potatoes (Table 1). The concentrations of reducing sugars such as glucose and fructose are in particular important for the frying properties of potatoes as these sugars are highly correlated with non-enzymatic browning and are related to formation of carcinogenic acrylamide during processing (Tareke et al. 2002; Ayvaz and Rodriguez-Saona 2015). In the present study, the reducing sugars glucose and fructose were measured along with the non-reducing sugar sucrose.

Results from the two hyperspectral methodologies HS-full and HS-part showed R² values in the ranges of 0.19–0.20 and 0.33–0.40, respectively, for the reducing sugars glucose and fructose and 0.41 and 0.31, respectively, for sucrose (Fig. 4, Tables S1 and S2). Prediction of sugars by NIR-related methods has been performed less frequently than prediction of starch-related parameters (López et al. 2013). However, investigating the NIR spectra of homogenized potato samples, Hartmann and Büning-Pfaue (1998) found an R² of 0.70 for glucose, 0.89 for fructose and 0.62 for sucrose, and using hyperspectral imaging of sliced potatoes of two cultivars, Rady et al. (2015) reached R values for glucose of 0.70–0.94 (∼R² = 0.49–0.88) and for sucrose of 0.38–0.52 (∼R² = 0.14–0.27). In the present study, the HS-part method performed better than the HS-full method, particularly with regard to glucose and fructose, possibly because the concentrations of these sugars were more easily detected with the close-up HS-part method.

Interestingly, the two dielectric methods DS and EIS gave relatively similar results, with R² values for reducing sugars in the ranges of 0.31–0.51 and 0.31–0.47, respectively, and for sucrose of 0.01 and 0.02, respectively (Fig. 4, Tables S3 and S4). The DS method has previously been used to predict the soluble solids and sugar concentrations in honeydew melons (Nigmatullin et al. 2006), with an R² of 0.72 for soluble solids ranging from 5 to 13% (w/w). Mashimo et al. (1992) showed how the dielectric function changes in a mixture of pure water and glucose in the microwave region.

The LF-NMR method showed R² values in the range of 0.19–0.25 for the reducing sugars and 0.06 for sucrose (Fig. 4, Table S5). No other studies have been found that investigated the direct connection between LF-NMR output and sugar concentrations in potato tubers.

Prediction of Amino Acids

The amino acid asparagine is of particular interest in relation to potatoes for frying, as asparagine participates in the Maillard reaction and in formation of acrylamide (Tareke et al. 2002; Ayvaz and Rodriguez-Saona 2015). The concentrations of aspartate, glutamate and glutamine are also of interest, as asparagine is enzymatically formed from aspartate during de-amination of glutamine to glutamate (Brierley et al. 1997). Tryptophan, tyrosine and valine were also included in the study to test the overall prediction ability of the methods with regard to free amino acids. Similarly elaborate analyses of free amino acids in potatoes in relation to spectral analyses have not previously been presented.

Results from the two HS methods HS-full and HS-part showed R² values of 0.70 and 0.54, respectively, for asparagine and 0.46 and 0.01, respectively, for aspartate. For the remaining amino acids, the R² values were in the ranges of 0.10–0.60 and 0.25–0.58 for the respective methods (Fig. 4, Tables S1 and S2). Most of the analysed amino acids have not previously, to our knowledge, been predicted in potatoes by HS, but Ayvaz and Rodriguez-Saona (2015) applied a portable infrared instrument to predict asparagine and glutamine in suspensions of powdered potato with R values of 0.95–0.97 (∼R² = 0.90–0.94). The HS-full method generally performed better than the HS-part method for the prediction of amino acids.

Results from the two dielectric methods DS and EIS showed R² values of 0.45 and 0.18, respectively, for asparagine and 0.11 and 0.06, respectively, for aspartate. For the remaining amino acids, the R² values were in the ranges of 0.38–0.59 and 0.14–0.32 for the respective methods (Fig. 4, Tables S3 and S4). No literature has been found on DS and EIS in relation to determination of amino acids in food.

Results from the LF-NMR showed R² values of 0.48 for asparagine and 0.01 for aspartate. For the remaining amino acids, the R² values were in the range of 0.06–0.59 (Fig. 4, Table S5). No other studies have been found to make the direct connection between amino acid concentrations in potato tubers and the spectra obtained by the use of LF-NMR relaxation.

Prediction of Cell Size

Cell size measurements were included in the present study to investigate if inner structural differences among the potato tubers would be reflected in the predictions by the different spectral methods. For most spectral methods, the R² values of cell area and perimeter were in the range of 0.10–0.24 (Fig. 4, Tables S1, S2, S3 and S5); however, the EIS method predicted the cell area and perimeter with R² values of 0.41 and 0.45, respectively (Fig. 4, Table S4). The parameters cell perimeter and area provide indications of the inner structure of the potato tissue, particularly the relative amount of cell wall and cell membrane tissues, and it is interesting to note that the HS methodology did not pick up differences in these parameters. Similarly, only poor correlations were observed between the cell parameters and DS spectra, probably because this method primarily measures molecular dipole relaxations. A markedly better prediction was observed with the EIS spectra. Based on the results of Hayden et al. (1969), a plausible explanation is that smaller cell areas result in a higher effective cell wall area encountered by the ionic current, which would affect the relaxation in the EIS spectrum, but this requires further investigation. No previous studies have been found that investigated the direct connection between the cell dimensions and LF-NMR spectra.

Prediction of Conductivity

The conductivity of sap pressed from the potato tubers provides an indirect measure of the concentration of electrically charged molecules or ions in the sap and was primarily included to test how these affected the underlying variation in the electrical spectra of EIS and DS. Results from the methods HS-full, HS-part, DS, EIS and LF-NMR showed R² values of 0.45, 0.23, 0.75, 0.54 and 0.20, respectively. The parameter conductivity is influenced by a diversity of electrically charged molecules in the sap, and the relatively high R² values observed from the high-frequency DS spectra were expected, as the imaginary part of the dielectric spectrum is just another way to measure the average conductivity of the sample. A high correlation could also be expected in the low-frequency EIS, as it is a technique which is primarily affected by the ionic conductivity of the tissue. The relatively low prediction observed was probably because the average conductivity was measured in the cell sap, whereas the conductive information in the EIS spectra is primarily affected by the polarization of the cell membranes. The moderate to good predictions of conductivity based on the HS and the LF-NMR spectra were unexpected because of the relatively low abundance of electrically charged molecules (not measured). These predictions remain unexplained, but are primarily believed to be due to the distinct inter-correlations between the conductivity and the starch-related parameters with high prediction values. No other studies have been found that investigated the direct connection between the conductivity of the cell sap and the HS and LF-NMR spectra.

Method Considerations

The hyperspectral analyses were performed on two different structural levels of the potatoes. The HS-full scans were performed on the outer part of the intact potato, thereby primarily evaluating the skin, cortex and perimedullary parts of the potatoes, whereas HS-part was performed on cylinders excised from the perimedullary tissue. Most of the biochemical analyses were performed on tissue of comparable perimedullary origin in the potato tubers, and thus, the HS-part scans should be directly comparable with these measurements. Nonetheless, the HS-full method performed better in predicting the starch-related parameters, sucrose and most of the amino acids. The HS-part spectra were acquired with a very high spatial and spectral resolution but with fewer spectra and thus a smaller data cube compared to the HS-full method, where a complete surface scan was acquired but with lower resolution. The dynamics of this are not fully understood, but, in general, fewer measurements are more prone to higher noise levels and irregularities. Future analyses will only include the HS-full method, and optimization will focus on establishing a better and more controlled light reflection regime with LED lights, on creating more noise-robust algorithms and on establishing more stable prediction models.

The EIS and DS spectra were both obtained from a tissue volume 5–10 times smaller than the total volume of an intact potato tuber. Studies have shown a substantial spatial variation within potato tubers (Reeve et al. 1971; Subramanian et al. 2011), with dry matter concentration varying between 10 and 24% over the spatial extent of the tuber and the highest values near the cortex. Spatial variation could thus be a significant source of the errors observed with these methods. Despite this, the DS measurements showed promising predictions, especially of the dry matter-related parameters, and future analyses will only include this method. Optimization will focus on establishing a better understanding of the sources of error, and the size of the coaxial probe for the DS measurement will be increased to enlarge the electrical sensing volume.

LF-NMR has been investigated as a method of assessing quality parameters in several different types of food tissue, e.g. by Guo et al. (2014) and Shao and Li (2012, 2013), with relatively promising results, particularly for dry matter estimation. The LF-NMR method was primarily included in the present experiment as a benchmark for the hyperspectral and dielectric analyses. For most of the important parameters (starch, sugars and asparagine), the hyperspectral and dielectric methodologies produced better prediction models than the LF-NMR method. The preparation of samples for LF-NMR is relatively cumbersome and time consuming, as reliable results are highly dependent on the samples having a specific temperature. It is difficult to imagine the method in use under industrial conditions, and LF-NMR will not be included in future studies.

Effect of Inter-correlation and Sources of Error

Inter-correlations among the measured variables (Fig. 2) were a major concern in the present study, as these may have influenced the predictions of individual variables. A pronounced example of the inter-correlation issue was the prediction of asparagine and starch from HS-full (Figs. 2, 3a, c, and Table S1), with a relatively high inter-correlation (r = 0.67) between asparagine and starch. The question is, however, whether the relatively good HS-full prediction (R² = 0.70) of the modestly present asparagine is an actual independent prediction or a result of covariance with the more abundant and equally well-predicted starch concentration (R² = 0.69). The literature did not provide any clear statistical methods designed specifically for separating the contributions of inter-correlating variables to the signals used in the PLS models. However, Trygg and Wold (2002) described orthogonal signal correction (OSC) as suitable for eliminating systematic variation in, e.g., NIR and NMR spectra. Applying the method to minimize, for instance, the starch-derived contribution to the signal used to predict asparagine proved difficult. Progressively including more components carrying information orthogonal to starch, and using these to predict asparagine, resulted in unpredictable changes rather than progressively better predictions, and the method was therefore abandoned.
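One simple way to probe the question raised above is a partial correlation between predicted and measured asparagine with the measured starch concentration held constant. The following is a minimal sketch under stated assumptions, not the approach used in the study: the function names are hypothetical and the numbers are toy values, not data from the experiment.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical toy values (assumption), one entry per tuber.
starch = [12.0, 14.5, 13.1, 16.2, 15.0, 17.8]
asn_measured = [1.1, 1.6, 1.2, 1.9, 1.5, 2.3]
asn_predicted = [1.0, 1.5, 1.3, 1.8, 1.6, 2.2]

# Association between prediction and measurement not explained by starch.
r_partial = partial_corr(asn_predicted, asn_measured, starch)
```

A partial correlation close to zero would suggest that the asparagine model mainly tracks starch, whereas a clearly positive value would point to an independent spectral signal. The measure only removes linear dependence on starch, so it is a diagnostic rather than a solution to the inter-correlation problem.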

‘Variable importance in the projection’ (VIP) as defined by Andersen and Bro (2010) was instead applied as an illustrative measure of the relative importance of the spectral features in the final prediction model. The VIPs of all the spectra included in the present study are presented (Figs. S2–S5). However, it was beyond the scope of the study to investigate all the individual possible constellations. As an example, the HS-full-derived VIP values for starch and asparagine according to the wavelength of the spectra were compared (Fig. S6). A relatively high agreement between the VIP of the two models was observed below 600 nm, whereas the agreement above 600 nm was markedly less pronounced. This suggests that the prediction of asparagine may be relatively independent of the inter-correlating starch. A pronounced example of good spectral separation of measured variables in most spectral methods is seen between reducing sugars and sucrose (Fig. S3). The matter of the influence and possible elimination of the effects of inter-correlations between parameters on the PLSR predictions is unresolved, and future work will concentrate on solving this.
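To make the VIP measure concrete, the sketch below computes VIP values for a single-response PLS model from its weights, scores and y-loadings, using the commonly used definition in which each component is weighted by the variance of the response it explains. That the study used exactly this form is an assumption, and the function name and the small two-component model are hypothetical illustrations.

```python
import math

def vip_scores(weights, scores, y_loadings):
    """VIP per predictor for a single-response PLS model.

    weights[a][j] : X-weight of predictor j in component a
    scores[a][i]  : score of sample i on component a
    y_loadings[a] : regression coefficient of y on the scores of component a
    """
    n_vars = len(weights[0])
    # Response variance explained by each component: q_a^2 * t_a' t_a
    ss = [q * q * sum(t * t for t in t_a)
          for q, t_a in zip(y_loadings, scores)]
    total = sum(ss)
    vip = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ss):
            norm_sq = sum(w * w for w in w_a)  # normalise each weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        vip.append(math.sqrt(n_vars * acc / total))
    return vip

# Hypothetical model: two components, three wavelengths, four samples.
W = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
T = [[1.0, -1.0, 2.0, -2.0], [0.5, 0.5, -0.5, -0.5]]
q = [0.9, 0.4]
vip = vip_scores(W, T, q)
```

With this definition the squared VIP values average to one, so wavelengths with VIP above one contribute more than average to the model. Two models, such as those for starch and asparagine, can then be compared wavelength by wavelength on the same scale.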

Another important goal in this screening study was to investigate the origin of errors in the predictions. One way to investigate this was by propagating the uncertainty in the predictor variables to the response variables. Some work has been done on the analytical formulation of this problem in PLSR models (Faber and Kowalski 1997), but this work assumes independence between the prediction parameters, which is unlikely in our application. Therefore, we based our calculation of the inter-repetition error (Tables S1–S5) on a statistical simulation. The inter-repetition error indicates the variance originating from inherent uncertainty in the spectral measurements. Even though relatively small numbers of repetitions were used to estimate this error, these numbers provided unique information about the origin of the observed uncertainties. It is interesting to compare the inter-repetition error to the mean values of the RMSE (Tables S1–S5), as these ratios provide indirect measures of the proportion of the error in the prediction models that is explained by variation among repetitions and thereby by inherent measurement error. For the DS scans, the inter-repetition errors were typically below a quarter of the RMSE, whereas the errors were markedly higher for most other combinations of measured and spectral variables. This indicates that future work should focus on improving the reproducibility of scans, as this might eliminate a substantial part of the RMSE and thus improve the predictions.
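The details of the simulation are not given here, so the following is only a minimal sketch of how an inter-repetition error could be estimated by simulation. It assumes a linear prediction model (a regression vector such as one obtained from PLSR) and repeated scans of each tuber; whole spectra are resampled so that the dependence between wavelengths is preserved. All names and numbers are hypothetical.

```python
import math
import random

def predict(spectrum, coef, intercept):
    """Linear prediction from a regression vector (e.g. from a fitted PLSR model)."""
    return intercept + sum(c * x for c, x in zip(coef, spectrum))

def inter_repetition_error(repeats, coef, intercept, n_sim=2000, seed=0):
    """Monte Carlo estimate of the prediction spread caused by scan-to-scan noise.

    repeats: list of tubers, each a list of repeated spectra of that tuber.
    """
    rng = random.Random(seed)
    # Prediction for every scan, grouped per tuber.
    preds = [[predict(s, coef, intercept) for s in scans] for scans in repeats]
    # For a linear model, predicting the mean spectrum equals averaging predictions.
    centres = [sum(p) / len(p) for p in preds]
    sq_dev = 0.0
    for _ in range(n_sim):
        i = rng.randrange(len(preds))   # pick a tuber at random
        drawn = rng.choice(preds[i])    # one simulated rescan of that tuber
        sq_dev += (drawn - centres[i]) ** 2
    return math.sqrt(sq_dev / n_sim)

# Hypothetical data: two tubers, three repeated scans each, two wavelengths.
repeats = [
    [[1.0, 2.0], [1.2, 2.1], [0.9, 1.8]],
    [[2.0, 3.0], [2.1, 3.3], [1.8, 2.9]],
]
coef, intercept = [0.5, 1.0], 0.1
err = inter_repetition_error(repeats, coef, intercept)
rmse = 0.5                # assumed cross-validated RMSE of the model
ratio = err / rmse        # share of the RMSE attributable to repetition noise
```

Because the deviations are taken from each tuber's own mean over few repetitions, this estimate is slightly too small; with r repetitions per tuber, multiplying by sqrt(r / (r - 1)) gives a rough correction.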

Conclusions

The overall purpose of the present experiment was to perform a preliminary screening of different methodologies for predicting various measured variables in potato tubers. The aim was to explore the potential of the methodologies for direct implementation in industrially produced potato sorting machines and other quality assessment equipment for potatoes. The three types of methodologies were at very different developmental stages in relation to implementation in industrially applicable equipment. The two hyperspectral setups were based on well-tested, commercially available cameras, and only the applied light and the subsequent data treatment were still under development; thus, the method is almost ready for implementation in the industry. Using hyperspectral spectrometry on, particularly, intact potatoes (HS-full) showed promising potential for predicting starch-related variables and key amino acids, and to some extent reducing sugars. The two dielectric setups were altogether at a low developmental level, particularly the custom-made probes forming the contact point to the potatoes, and thus, the methodology is quite far from commercially applicable setups. The results of, particularly, the high-frequency part of the dielectric scans (DS) showed promising potential for predicting starch-related variables, reducing sugars and key amino acids, and future development of the method will focus on developing stable predictions in the higher frequency spectrum. The LF-NMR measurements were based on a well-established scientific instrument, but the method is quite far from implementation in the industry. The application of the LF-NMR method in the present study showed disappointing results for most parameters, and the method will not be developed further. Future work will focus on establishing analytical methods for robust prediction models and, in particular, on understanding and possibly eliminating the influence of correlations among measured variables.