Introduction

Longan (Dimocarpus longan Lour.) is an important subtropical fruit of the family Sapindaceae [1]. It has been used in traditional oriental medicine, possesses several physiological activities, and is widely consumed as a preserved snack and as an ingredient in many traditional dishes [2]. China has the largest longan production industry in the world, with a cultivation history of more than 2000 years [3]. Owing to its rich nutrient content, sweet flavor and health benefits, longan fruit has gained increasing popularity and great commercial value both domestically and globally [1, 2]. However, longan fruit has a short shelf life at room temperature because of rapid pericarp browning, fungal infection, and pest attack [4, 5]. Post-harvest preservation is also challenging, because the high temperature and humidity during the harvest season accelerate decay and favor the growth of pathogenic fungi, especially in Southwest China. Compared with fresh fruit, dried longan fruit has a wider market share owing to its longer shelf life and distinctive chewy texture. Detection of defective dried longan fruit, which is included in the Agricultural Trade Regulation in China, is rather difficult because moldy or wormy flesh is hidden under an intact peel and cannot easily be seen from the outside. Quality control of this fruit has therefore become a thorny problem for the industry. In addition, hidden defects make the fruit difficult to grade, causing dealers a great loss of credit and profit. Therefore, a non-destructive method for evaluating the quality of dried longan fruit is urgently needed and has great commercial prospects.

Non-destructive technology can potentially revolutionize fruit industry practices. For example, early in-field assessment of ripeness, prediction of harvest date and yield, and screening out of inferior fruits before sale not only enable consumers to get the tastiest and freshest fruit but also maximize suppliers’ profits through grading [6, 7]. Relaxation nuclear magnetic resonance analyzes experimental signal-decay curves in the time domain by fitting model functions to extract relaxation times. Differences in relaxation times and proton density between tissues, phases or samples make it possible to distinguish defective fruits from normal ones. Low field has been defined as the range of magnetic field strengths from B0 = 10 mT to 1 T for 1H [8]. Low-field nuclear magnetic resonance (LF-NMR) technologies have been widely applied in food quality control. For instance, LF-NMR measurements have been used to evaluate the water holding capacity of meat [9, 10]. Moreover, LF-NMR is effective in studying changes in water mobility during the cooking or drying of potatoes and tofu [11,12,13]. LF-NMR analysis has previously been applied to evaluate water distribution in fruits including blueberries [14], sweet cherry [15], and grape [16]. Low-field magnetic resonance imaging (MRI) can detect internal bruise and spraing disease symptoms in potatoes [17]. In addition, a low-field proton magnetic resonance sensor can be used to sense internal discoloration in whole apples [18]. In theory, the growth of fungi and pests may change water mobility in moldy and wormy dried longan fruits. Therefore, we proposed a method for discriminating defective dried longan fruits by examining the water status inside the fruits.

Principal component analysis (PCA), which forms linear combinations of the original variables, is commonly applied to reduce the dimensionality of large datasets [19]. A PCA model can reveal aggregation and separation trends among groups from the distribution of samples in the score plot. However, PCA has some undesirable properties when the variables have different units of measurement [19]. Machine learning has the potential to provide a more accurate and efficient solution for detecting contamination in food products [20], and several reports have highlighted its application in predicting food contamination. For example, a support vector machine (SVM) model achieved up to 85 % accuracy in identifying food-contaminating beetle species from microscope images of their elytra [20]. Moreover, hyperspectral remote sensing combined with a kernel-based extreme learning machine (KELM) was used to trace changes in the chlorophyll content of shaded tea leaves, potentially leading to a quality detection method for green tea [21]. The deep learning neural network (DLNN), one of the most powerful machine learning approaches, builds multi-layered neural networks containing many neurons to model complex relationships in big data. DLNNs have shown better prediction performance than traditional models in speech recognition, image identification and natural language processing [22]. The relationship between features and phenotypes can be learned, and a mapping from features to their corresponding phenotypes constructed, by tuning selected hyperparameters such as the number of neurons and the type of layers. The successful application of DLNNs in systems biology and computational biology to prediction problems, including gene annotation, recognition of protein folds and prediction of genome accessibility, has demonstrated their capability to learn complex relationships from biological data [23].
Given the limitation of PCA for variables with different units of measurement, the possibility of using a DLNN model based on LF-NMR relaxation features for food discrimination remains to be evaluated. Therefore, the present study aimed to apply LF-NMR combined with DLNN to the non-destructive quality evaluation of dried longan fruit, providing a new method for screening and differentiating inferior longan fruit and thereby increasing food safety for consumers and profits for the fruit industry.

Materials and methods

Sample grouping

The dried longan fruits were provided by Dengshi Specialty Company in Luzhou, Southwest China. Longan fruits were collected from September to October in 2014, 2015, and 2018 and dried in stoves. All fresh fruits were subjected to the same standard production process: the harvested fresh longan fruits were sorted, washed, initially baked, refrigerated, re-baked and packed. In terms of quality, edible dried longan fruits were labeled as normal, and moldy or wormy fruits were labeled as moldy/wormy, as judged by the naked eye after removing the pericarp (Fig. 1). Samples from 2014 (14_batch) and 2015 (15_batch) were subjected only to transverse relaxation measurement, whereas those from 2018 (18_batch) were subjected to both transverse relaxation measurement and proton density imaging analysis (PDIA). All samples underwent NMR relaxation measurement or MRI analysis approximately one month after drying. Samples from different years were compared after all experiments had been completed. Samples were weighed for quantity normalization.

Fig. 1

Dried longan fruits with the pericarp covered (PC) or removed (PR). Normal, moldy, and wormy longans without pericarp are shown; fungal film and insect eggs are indicated by the red circle and arrow

NMR relaxation measurement

LF-1H NMR measurements were performed on a 23 MHz NMR analyzer, PQ001-20-025 V (Niumag Electric Corporation, Shanghai, China), equipped with a 60 mm diameter radio frequency coil. Dried longan fruits were placed on the NMR bed and inserted into the NMR probe. The magnetic field strength was 0.5 ± 0.08 T. A Carr-Purcell-Meiboom-Gill (CPMG) pulse sequence was employed to collect the decay signals used to measure the spin-spin relaxation time (T2). SFO1 (spectrometer frequency offset of the first (observe) channel) was 21.242 MHz. The pulse durations were 5.4 and 10.64 µs for the 90° (P1) and 180° (P2) pulses, respectively. Data were acquired from 2500 echoes over 8 scans at 32 ℃, with a repetition time of 800 ms between scans. The spectral width (SW) was 125 kHz and the echo time (TE) was 0.2 ms. After the CPMG decay curves of the pericarp-covered dried longan fruits had been acquired, the T-invfit software was used to invert them into spin-spin relaxation time (T2) distributions. Afterwards, all dried longan fruits were cracked open and the pericarp was removed for verification.
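
As a rough illustration of how a relaxation time is extracted from a CPMG echo train, the sketch below fits a single exponential to synthetic data sampled with the echo time (0.2 ms) and echo count (2500) given above. It is not the T-invfit inversion used in this study, which yields a full T2 distribution; the function names, the amplitude and the 5 ms T2 are hypothetical values chosen only for the example.

```python
import math

def simulate_cpmg(amplitude, t2_ms, echo_time_ms=0.2, n_echoes=2500):
    """Ideal, noise-free mono-exponential echo train (synthetic data)."""
    times = [echo_time_ms * (k + 1) for k in range(n_echoes)]
    signal = [amplitude * math.exp(-t / t2_ms) for t in times]
    return times, signal

def fit_mono_exponential(times, signal, floor=1e-6):
    """Estimate (amplitude, T2) by least squares on ln(S) = ln(A) - t / T2.

    Echoes below `floor` times the first echo are dropped, because the
    logarithm amplifies noise in the tail of a real decay curve.
    """
    cutoff = floor * signal[0]
    pts = [(t, math.log(s)) for t, s in zip(times, signal) if s > cutoff]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope

# Hypothetical sample: amplitude 1000 (arbitrary units), T2 = 5 ms.
times, signal = simulate_cpmg(amplitude=1000.0, t2_ms=5.0)
amp_fit, t2_fit = fit_mono_exponential(times, signal)
print(f"A = {amp_fit:.1f}, T2 = {t2_fit:.2f} ms")
```

A real longan decay curve contains several water populations and noise, so a multi-exponential inversion is needed in practice; the single-exponential fit only shows the principle.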

MRI measurement

The 18_batch samples were selected for MRI measurement. Samples with the pericarp covered (PC) and/or removed (PR) were arranged for imaging. A spin echo sequence was used to obtain proton density weighted images on an NMR imaging system, MesoMR23-060 H-I (Niumag Electric Corporation, Shanghai, China). The MRI parameters were as follows: SFO1 = 23.406 MHz, RFA90° = 7.0, RFA180° = 10.6, repetition time (TR) = 300 ms, echo time (TE) = 6 ms, slice width = 5 mm, slices = 1, averages = 2, read size = 256, phase size = 128. Pseudo-color conversion is an image processing method that transforms gray images into color images [24]. The pseudo-color-coding method refers to the process of mapping a gray image to a color image and the choice of transform functions [24].
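
The pseudo-color idea can be sketched in a few lines. The transform functions actually used for the images in this study are not specified, so the blue-to-red piecewise-linear ramp below is an assumption made for illustration, and the function names are hypothetical.

```python
def _clamp01(x):
    """Limit a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))

def gray_to_pseudocolor(gray):
    """Map an 8-bit gray level (0-255) to an (R, G, B) tuple.

    Assumed transform: blue -> cyan -> green -> yellow -> red,
    built from one piecewise-linear function per color channel.
    """
    if not 0 <= gray <= 255:
        raise ValueError("gray level must be in 0..255")
    t = gray / 255.0
    red = _clamp01(4.0 * t - 2.0)
    green = _clamp01(2.0 - abs(4.0 * t - 2.0))
    blue = _clamp01(2.0 - 4.0 * t)
    return tuple(round(255 * c) for c in (red, green, blue))

def colorize(image):
    """Apply the transform to a 2-D list of gray levels."""
    return [[gray_to_pseudocolor(px) for px in row] for row in image]

# Tiny hypothetical 2 x 2 proton density image.
demo = [[0, 64], [191, 255]]
for row in colorize(demo):
    print(row)
```

With this ramp, low proton density appears blue and high proton density appears red; any other monotonic color scale would serve the same purpose.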

Discrimination based on PCA analysis

LF-NMR T2 relaxation curves of dried longan fruits were plotted using Microsoft Excel 2016. PCA was conducted with PAST v2.17. In the score plot of the first two PCs, samples were grouped in ellipses representing group membership, with the 95 % confidence limit as the cut-off distance. Positive (moldy/wormy) and negative (normal) subjects were predicted from the PCA clusters: subjects falling inside the respective ellipses were predicted to be normal, whereas those outside were deemed moldy/wormy. Recall, precision, accuracy and F-score were calculated according to Eqs. (1), (2), (3) and (4) as follows:

$$\text{Recall}=\frac{\text{TP}}{\text{TP}+\text{FN}}$$
(1)
$$\text{Precision}=\frac{\text{TP}}{\text{TP}+\text{FP}}$$
(2)
$$\text{Accuracy}=\frac{\text{TP}+\text{TN}}{\text{TP}+\text{TN}+\text{FP}+\text{FN}}$$
(3)
$$\text{F-score}=\frac{2\times \text{Precision}\times \text{Recall}}{\text{Precision}+\text{Recall}}$$
(4)

where TP: true positive; FP: false positive; TN: true negative; FN: false negative. Moldy/wormy subjects were designated as positive whereas normal ones as negative.
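
The ellipse rule and Eqs. (1) to (4) can be sketched together as follows. This is a minimal illustration under stated assumptions: the PC scores are treated as bivariate normal, so the 95 % ellipse corresponds to a squared Mahalanobis distance equal to the chi-square quantile with two degrees of freedom (PAST may construct its ellipses differently); all scores and labels are synthetic; and the function names are hypothetical.

```python
import math

# 95 % quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = -2.0 * math.log(0.05)  # about 5.99

def fit_group(scores):
    """Mean and sample covariance of 2-D PC scores of the normal group."""
    n = len(scores)
    mx = sum(x for x, _ in scores) / n
    my = sum(y for _, y in scores) / n
    sxx = sum((x - mx) ** 2 for x, _ in scores) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in scores) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in scores) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def is_defective(point, mean, cov):
    """Predict positive (moldy/wormy) when the point lies outside the ellipse."""
    (mx, my), (sxx, sxy, syy) = mean, cov
    dx, dy = point[0] - mx, point[1] - my
    det = sxx * syy - sxy ** 2
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 > CHI2_95_2DOF

def classification_metrics(tp, fp, tn, fn):
    """Eqs. (1)-(4); assumes at least one true and one predicted positive."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision,
            "accuracy": accuracy, "f_score": f_score}

# Synthetic (PC1, PC2) scores of normal fruits used to define the ellipse.
normal_scores = [(0.0, 0.0), (1.0, 0.5), (-1.0, -0.5), (0.5, -1.0),
                 (-0.5, 1.0), (0.2, 0.3), (-0.3, -0.2)]
mean, cov = fit_group(normal_scores)

# Synthetic test fruits: (scores, truly moldy/wormy?).
test_set = [((4.0, 3.0), True), ((-3.5, 4.0), True), ((0.3, 0.1), True),
            ((0.1, -0.2), False), ((-0.4, 0.6), False), ((0.8, 0.4), False)]

tp = fp = tn = fn = 0
for point, truly_defective in test_set:
    predicted = is_defective(point, mean, cov)
    if predicted and truly_defective:
        tp += 1
    elif predicted:
        fp += 1
    elif truly_defective:
        fn += 1
    else:
        tn += 1

metrics = classification_metrics(tp, fp, tn, fn)
print(tp, fp, tn, fn, metrics)
```

In this toy data one defective fruit falls inside the normal ellipse and is missed, which lowers recall while precision stays at 1.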

Machine learning modeling construction and prediction

A DLNN model was constructed with Keras from the tf-nightly-gpu (2.4.0.dev20200802) library in Python 3.8.5. Hyperparameters were specified before training, and each was tuned manually for low loss and high accuracy over the following candidate values. (1) Number of hidden layers: 2, 3; deeper networks were avoided to reduce computational complexity and overfitting. (2) Learning rate: 0.1, 0.01, 0.001, 0.0001; a suitable learning rate contributes to a smooth learning process and helps the model converge. (3) Batch size: 100, 274; a proper batch size guides the learning process better. (4) Number of units in hidden layers: 100, 200; the number of units determines the capacity of the model, and too many or too few units can lead to overfitting or underfitting. (5) Number of epochs: 300, 400; enough epochs allow the model to reach better performance (high accuracy and low loss). (6) Regularization: none, L2 (lambda: 0.01, 0.001); regularization constrains overfitting by penalizing overly large weights. (7) Activation function in hidden layers: ReLU, a nonlinear function applied to the affine transformation of each layer's inputs to enhance the capacity of the model.
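
To give a sense of the size of this search space, the candidate values listed above can be enumerated as below. This is only a sketch: tuning in this study was manual, so not every combination was necessarily evaluated, and the dictionary keys and function name are hypothetical.

```python
from itertools import product

# Candidate values taken from the list above; key names are illustrative.
search_space = {
    "hidden_layers": [2, 3],
    "learning_rate": [0.1, 0.01, 0.001, 0.0001],
    "batch_size": [100, 274],
    "units_per_layer": [100, 200],
    "epochs": [300, 400],
    "l2_lambda": [None, 0.01, 0.001],  # None means no regularization
    "activation": ["relu"],
}

def candidate_configs(space):
    """Yield one dict per combination of candidate values."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(candidate_configs(search_space))
print(len(configs))  # 192
```

The 192 combinations follow from 2 x 4 x 2 x 2 x 2 x 3 x 1 candidate values.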

In addition, different combinations of the hyperparameters above were tried. When different models gave similar outcomes, the hyperparameters that kept the model less complex and with lower capacity were preferred. The combination that yielded the most satisfactory outcome (low loss and high accuracy) among all attempts is given in Table 1. Specifically, the model consists of two hidden dense layers with 100 rectified linear units (ReLU) each. The network was trained with the Adam optimizer (batch size: 100, learning_rate: 0.01) and the binary_crossentropy loss function for 300 epochs. To assess the performance of the model, ten-fold cross-validation was performed with MultilabelStratifiedShuffleSplit from the Python package iterstrat.ml_stratifiers. Negative predictive value (NPV), specificity, recall, precision, accuracy, average precision (AP) and area under the receiver operating characteristic curve (AUC), as implemented in the Python package sklearn, were used to evaluate the model. Recall, accuracy, precision, and F-score were calculated as shown above. NPV and specificity were calculated as shown in Eqs. (5) and (6).

Table 1 Learning and architecture parameters of the deep learning neural network (DLNN) model with the best performance
$$\text{NPV}=\frac{\text{TN}}{\text{TN}+\text{FN}}$$
(5)
$$\text{Specificity}=\frac{\text{TN}}{\text{TN}+\text{FP}}$$
(6)

where FP: false positive; TN: true negative; FN: false negative. Moldy/wormy subjects were designated as positive whereas normal ones as negative.
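
The mapping computed by the architecture described in this section (two hidden dense layers of 100 ReLU units) can be written out without any framework, as sketched below. This is not the Keras code used in the study. The single sigmoid output unit is an assumption implied by the binary_crossentropy loss, the weight initialisation and the input length of 50 are arbitrary, the weights are random and untrained, and training with Adam is not shown.

```python
import math
import random

def relu(z):
    return z if z > 0.0 else 0.0

def sigmoid(z):
    # Numerically stable form for large positive or negative inputs.
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def init_layer(n_in, n_out, rng):
    """Random weights (He-style scaling, an assumption) and zero biases."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, activation):
    """Affine transformation of the inputs followed by the activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def build_network(n_features, hidden_units=100, hidden_layers=2, seed=0):
    rng = random.Random(seed)
    sizes = [n_features] + [hidden_units] * hidden_layers
    hidden = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    output = init_layer(sizes[-1], 1, rng)
    return hidden, output

def predict_proba(network, x):
    """Probability that a sample is positive (moldy/wormy)."""
    hidden, output = network
    for layer in hidden:
        x = dense(x, layer, relu)
    return dense(x, output, sigmoid)[0]

def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one sample; p is clipped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# Hypothetical input: 50 relaxation features for one fruit.
network = build_network(n_features=50)
rng = random.Random(1)
sample = [rng.random() for _ in range(50)]
p = predict_proba(network, sample)
print(f"P(moldy/wormy) = {p:.3f}, "
      f"loss if truly defective = {binary_crossentropy(1, p):.3f}")
```

Because the weights are untrained, the printed probability carries no meaning; the sketch only shows how the features pass through the two hidden layers to a single output.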

Results and discussion

Verification of defective and normal fruits

The defective and normal fruits among the 274 subjects collected in 2014, 2015 and 2018 were verified with the naked eye by removing the pericarp after LF-NMR testing. The numbers of defective versus normal subjects were relatively balanced, with ratios of 50:55 and 47:50 in 14_batch and 15_batch, respectively. However, only 10 of the 72 subjects in 18_batch proved to be moldy or wormy.

Water status in the moldy/wormy dried longan fruits

The transverse (T2) relaxation curves of the dried longan fruit samples are presented in Fig. 2. According to the spectra, a major peak identified as T21 (0-20 ms) was observed in all samples, which suggests that the water in dried longan fruits is relatively immobile. Previous studies reported that dried longan fruits retain a water content of 20-30 %, which is essential for their soft texture [25, 26]. In plant tissue, water components with T2 relaxation times of 0.01-10 ms, 10-150 ms and 150-1000 ms have been ascribed to cell wall protons, cytoplasmic water, and vacuolar water, respectively [15]. The present study confirmed that, after around 60 % of the fruit weight is lost during drying [25], a certain amount of bound water (T21) associated with the cell wall remains in the dried flesh. However, the water status of the samples varied between batches (14_batch, 15_batch, 18_batch). For example, T21 of 18_batch (Fig. 2e and f) had the shortest T2 relaxation time (0-1 ms), representing strongly bound water. By comparison, T21 ranged from 0 to 10 ms in 14_batch (Fig. 2a and b) and from 1 to 20 ms in 15_batch (Fig. 2c and d), indicating bound water with more mobility than that of 18_batch. 15_batch had relatively longer T2 relaxation times than the other batches (Fig. 2c and d). The present study, the first to investigate water distribution inside dried longan fruits based on T2 relaxation time by LF-NMR, suggests that this kind of fruit is characterized by bound water, yet the mobility of the water varies among samples. The drying process can cause this variation in water mobility. To maintain the soft texture of the dried longan fruits, drying was conducted in multiple rounds, and an interval of around 10-16 h between rounds allowed moisture in the inner fruit core to move outward and be removed in the next round.
The temperature and moisture during these intervals vary from batch to batch. Although the moisture content determined for the flesh was similar among batches, the moisture content of the whole dried fruit may not be exactly the same, which affects the migration of water components between different parts of the dried fruit. Notably, the relative amplitude of the corresponding peaks, denoted A21, was higher in 15_batch than in 14_batch and 18_batch, reflecting a higher proton density in the samples. In addition, A21 of some moldy/wormy samples showed decreased signal amplitude, while others tended to shift to the right (Fig. 2b, d, and f). Both moisture content and water status are essential factors affecting the shelf life of longan fruits; this study suggests that rotten dried longan fruits do not have a higher moisture content than normal fruits but probably the opposite, and that the water in them might be less tightly bound. Two further peaks, at 10-100 ms in 18_batch and 20-200 ms in 14_batch and 15_batch, representing immobilized water and free water, respectively, were negligibly small (Fig. 2). How these water components affect the quality of dried longan fruits, and whether their LF-NMR T2 measurements can be used for discrimination, remains unknown.
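
The T2 ranges quoted from [15] earlier in this subsection can be turned into a small lookup, sketched below. The ranges describe plant tissue in general and need not transfer exactly to dried longan; the half-open interval boundaries, the function name and the example peak positions are assumptions.

```python
# (lower bound, upper bound, assignment) in ms, after the ranges cited from [15].
T2_COMPONENTS = [
    (0.01, 10.0, "cell wall protons"),
    (10.0, 150.0, "cytoplasmic water"),
    (150.0, 1000.0, "vacuolar water"),
]

def assign_component(t2_ms):
    """Return the water component for a T2 value, using [low, high) intervals."""
    for low, high, label in T2_COMPONENTS:
        if low <= t2_ms < high:
            return label
    return "unassigned"

# Hypothetical peak positions (ms) within the T21 ranges reported per batch.
for peak in (0.5, 5.0, 15.0):
    print(peak, "->", assign_component(peak))
```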

Fig. 2

Transverse relaxation (T2) curves of dried longan fruits. a, c and e: normal samples harvested in 2014, 2015 and 2018, respectively. b, d and f: moldy/wormy samples harvested in 2014, 2015 and 2018, respectively

Water distribution in the moldy/wormy dried longan fruits

MRI analysis was performed to compare proton density, which indicates water distribution, between pericarp-covered (PC) and pericarp-removed (PR) samples and between normal and moldy/wormy samples (Fig. 3a). The proton density weighted images were processed with pseudo-color transformation. The color bar, representing different gray-scale levels, provides a relative scale for moisture content [27, 28]. Heterogeneity of moisture content was observed in the flesh tissue. For instance, the outer parenchyma close to the pericarp had brighter colors than the inner part (Fig. 3). Previously, MRI revealed heterogeneous water distribution between the inner and outer parenchyma of apples [29] and between the florets and stalks of broccoli [28]. Moreover, pericarp-removed normal, moldy, and wormy fruits, two of each, were arranged in order for imaging analysis. As shown in Fig. 3b, moldy/wormy fruits were distinguished from normal ones by a lower moisture content in the parenchyma. This result is consistent with the decreasing tendency of the T21 signal amplitude. Notably, a moldy longan fruit with its pericarp intact was distinguished by LF-NMR imaging (Fig. 3c), showing a decreased signal in the flesh.

Fig. 3

Pseudo-color transformed proton density images of normal and moldy/wormy longan fruits with or without pericarp. The gray-scale (0-255) images were converted to color images. a pericarp-covered (PC) and pericarp-removed (PR) normal longan fruits; b PR normal, moldy, and wormy longan fruits; c a PC moldy longan fruit picked out from normal ones

PCA Cluster

Previously, PCA was used to determine the sensitive wavelengths of hyperspectral data, on which SVM models were then built to classify litchis [30] or grape seeds [31] of different qualities. In this study, a simple and rapid PCA clustering method was applied to the transverse (T2) relaxation data of longan fruit to distinguish normal from moldy/wormy fruits from different years. PC1 and PC2 explained 79.10 % and 13.52 % of the variation in the data, respectively, and separated the samples according to harvest year. The normal fruits were relatively clustered but separated by year. Therefore, accuracy, recall and precision, which reflect the classification performance of PCA, were calculated separately for the dataset of each year. Few negative (normal) subjects were mixed with positive (moldy/wormy) ones; therefore, high precision values were obtained for the samples from the different years (100 %, 97 % and 100 %) (Fig. 4). By contrast, moldy/wormy subjects were relatively scattered, resulting in low recall values for discrimination by PCA (56 %, 62 %, and 60 %). The accuracy for the 2014 and 2015 datasets was 79 % and 80 %, and the F-scores were 72 % and 76 %, respectively. Owing to the low proportion of positive subjects (10 out of 72) in the 2018 dataset, the F-score (75 %) was more credible than the accuracy (94 %) in this case.
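
As a worked check of the 2018 figures, the reported percentages are consistent with the following confusion counts, which are inferred here from the percentages and the 10:62 class split rather than taken from the raw data: TP = 6, FN = 4, FP = 0, TN = 62. Substituting them into Eqs. (1) to (4) gives:

$$\text{Recall}=\frac{6}{6+4}=0.60,\qquad \text{Precision}=\frac{6}{6+0}=1.00$$
$$\text{Accuracy}=\frac{6+62}{72}\approx 0.94,\qquad \text{F-score}=\frac{2\times 1.00\times 0.60}{1.00+0.60}=0.75$$

For comparison, labeling every fruit in this batch as normal would still give an accuracy of 62/72 ≈ 0.86 with a recall of 0, which illustrates why accuracy alone is a weak indicator when positives are rare.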

Fig. 4

Principal Component Analysis (PCA) of T2 relaxation times of normal and moldy/wormy longan fruits. 2014_normal (normal samples from 2014, circle, purple); 2015_normal (normal samples from 2015, rectangle, reddish brown), 2018_normal (normal samples from 2018, square, pink), 2014_moldy/wormy (moldy/wormy samples from 2014, cross, green); 2015_moldy/wormy (moldy/wormy samples from 2015, square, blue); 2018_moldy/wormy (moldy/wormy samples from 2018, oval, turquoise)

Performance of predictive DLNN modeling

Hyperparameters including learning rate, batch size, number of units in hidden layers, number of hidden layers, number of epochs and regularization were adjusted. The final hyperparameters and architecture of the DLNN model adopted in this study are provided in Table 1 and Fig. 5. An objective and comprehensive evaluation of model performance was obtained through ten-fold cross-validation. The metrics used to evaluate each prediction are presented in Fig. 6. The values of AUC, AP, precision, recall, NPV, specificity, accuracy and F-score reached up to 95 %, 96 %, 100 %, 82 %, 89 %, 100 %, 89 % and 86 %, respectively. These results highlight the satisfactory performance of the DLNN model. The accuracy in discriminating moldy/wormy from normal dried longan fruits reached 89 %, compared with 93 % for a predictive model of in-shell shriveled walnuts [32], 88.7 % for an SVM discriminant model of grape seeds based on spectra at the effective wavelengths (EWs) [31], 93 % for the identification of rice seed varieties using NIR spectroscopy [33], and 98 % for an early identification method for cucumber diseases based on hyperspectral imaging and machine learning [34]. Interestingly, a previous study proposed a supervised SVM method based on LF-NMR relaxation features and showed that a classification accuracy of 99.04 % could be achieved when the relative position of each edible oil was determined by PCA before the binary tree structure of the SVM model was designed [35]. In this study, the deep learning approach performed better than PCA classification, showing that deep learning has significant potential as a modelling and feature extraction method for LF-NMR T2 data analysis. Importantly, the DLNN model achieved both high recall and high precision, up to 82 % and 100 %, respectively.
This performance, featuring both high recall and high precision, allows the model to detect as many defective fruits as possible while minimizing false rejection of normal fruits. The model failed to predict several outliers correctly (Fig. 6), suggesting that more samples covering comprehensive characteristics should be collected to improve it. Despite these outliers, which might result from heterogeneity among samples from three different years, the high accuracy, recall, and precision suggest that the model could reach a better level of performance if more such specimens were included. Considering its reasonable cost and simple procedure, the present study indicates that a DLNN model trained and tested on LF-NMR T2 relaxometry data is a feasible way to predict the quality of dried longan fruits effectively.

Fig. 5

Mappings in the deep learning neural network (DLNN). The LF-NMR T2 relaxometry dataset was used as input, modeled with the DLNN, and validated by several replicates of 10-fold cross-validation

Fig. 6

Performance of Machine Learning Models evaluated by negative predictive value (NPV), accuracy, specificity, recall, precision, Average Precision (AP) and Area Under the Receiver Operating Characteristic Curve (AUC)

A limitation of this study is that wormy and moldy longan fruits could not be distinguished individually and were therefore labeled together as moldy/wormy. Ideally, distinguishing moldy from wormy fruits would improve the discrimination of inferior longan fruits, assuming that wormy and moldy samples differ in moisture content and water mobility and thus in T2 relaxometry and proton density measured by LF-NMR. Unfortunately, some dried longan fruits were difficult to classify as wormy or moldy because fungal infection and pest invasion occurred together. Therefore, fruits deteriorated by fungal infection and by pest invasion were pooled for LF-NMR analysis in this study. The cause of each defect should be specified and detected separately in future studies to better discriminate inferior longan fruits from normal ones.

Conclusions

The present work proposes a non-invasive and effective method for discriminating moldy and/or wormy dried longan fruits, in which LF-NMR and MRI were used to examine the water binding status and distribution inside the fruits and a DLNN model was constructed to predict defective fruits from transverse relaxation data. The study highlights the performance of the DLNN model in discriminating defective dried longan fruits efficiently, with both high recall and high precision. Future work will focus on enlarging the sample size to improve the model's performance in experiential learning and its ability to combine and correlate diverse data (rotten longan fruits with different characteristics). Ultimately, NMR-DLNN based software is expected to be developed and applied in the longan and other fruit processing industries for routine, online quality control.