Introduction

Citrus, with its high nutritive and economic value, is one of the most widely grown fruit crops in the world. The consumption and global marketing of fresh citrus have been growing for 10 years and are expected to keep growing [1]. Freshness is a quality criterion of great importance to consumers when choosing fruit [2]. The fresher the fruit, the higher the price. For decades, determining the freshness of citrus fruits has been a challenge for producers, researchers, and food safety agencies [3, 4]. Traditional freshness tests are based on fruit firmness, total soluble solids (TSS) content, titratable acidity (TA), and other measurements, which are unsatisfactory in accuracy [3, 5]. Recently, Ahlawat et al. proposed a new method to detect the freshness of vegetables and fruits: they identified a gene associated with aging in broccoli and proposed it as a quick freshness test [6]. However, freshness defined in this way is not the same as the freshness perceived by consumers. Traditionally, freshness is judged by ripeness and storage time, but citrus judged to be stale by storage time sometimes still tastes fresh when eaten. Therefore, it is important to determine what kind of citrus consumers consider fresh and tasty.

The freshness that consumers prefer needs to be evaluated by consumers themselves. However, different people have different criteria for freshness, and human sensation is vague and uncertain, which makes it difficult to describe, let alone quantify. Consumers are not professional assessors, and consumer preference evaluation requires a large number of participants. Therefore, a simple description method is appropriate for consumer preference evaluation, and the results need to be processed with suitable analytical methods.

Fuzzy mathematics (FM) is a mathematical method for dealing with fuzzy concepts, and it can be used to evaluate concepts that are hard to quantify exactly [7, 8]. FM has been widely applied in the sensory evaluation of food [9], such as liquor [8], tea [10], fruits [11], vegetables [12], oil [13], and meat products [14]. For example, color, aroma, taste, and style were set as four factors affecting the sensory quality of Luzhou flavor liquor [8]. Each factor had four evaluation grades (excellent grade, grade 1, grade 2, grade 3) and was assigned a corresponding weight. The sensory evaluators assigned a grade to each factor of the liquor. The quality scores of five different kinds of Luzhou flavor liquor were then calculated using the FM algorithm. The higher the score, the better the quality. The results indicated that the comprehensive evaluation system was scientifically sound and reasonably constructed. To date, no study has been reported on freshness evaluation with FM.

It is impractical to organize large-scale consumer experiments frequently to judge the freshness of citrus. It is more practical to use a simple and fast instrument to discriminate citrus samples graded by sensory evaluation and then build a prediction model for unknown samples, which is especially useful for producers and enterprises. Traditional instrumental detection techniques include chromatography and spectroscopy, but they are time-consuming, complicated to operate, or expensive. The electronic nose (E-nose) and electronic tongue (E-tongue) offer fast analysis, simple operation, no reagent consumption, and high sensitivity, which makes them more suitable for industrial application. Most importantly, the E-nose and E-tongue software can establish a discriminant factor analysis (DFA) model from the sensor signals [15, 16] so as to classify citrus of different freshness. By projecting the sensor signal values of unknown samples onto the established model, the freshness of unknown samples can be predicted quickly. As rapid analytical techniques, the E-nose and E-tongue have been widely applied to the classification (or identification) and prediction of foods, such as tea of different quality [17], meats of different ingredients [18], citrus under different storage conditions [19], wines from different geographical origins [20], and tilapia fillets of different freshness [21]. To the best of our knowledge, no research has been done on the rapid detection of citrus freshness as perceived by consumers.

A rapid prediction model can help judge the freshness of a citrus sample; however, the compounds that distinguish different freshness levels are not clear. The freshness of citrus fruits is often related to physical changes (color, size, shape, and specific gravity) and chemical changes (pH, acidity, TSS, the ratio of TSS to acidity, and naringin content) [22]. However, most of these indices produce unsatisfactory results because they are determined by a combination of chemicals whose individual components fluctuate widely during fruit development and ripening [23]. Metabolomics is a powerful tool for identifying chemical changes and associating them with the quality characteristics of products [24,25,26]. If the distinguishing compounds can be found and the thresholds between fresh and stale citrus can be specified, freshness testing will become easier.

In this research, the freshness of peeled citrus fruits was studied to avoid interference from the peel. The freshness of citrus fruits of two cultivars was first evaluated by consumers, and the FM algorithm was then used to convert the fuzzy descriptions of freshness into numerical values. In this way, each citrus fruit received a freshness score rather than simply being assigned to a freshness group. The citrus samples of different freshness obtained by consumer evaluation were used to establish prediction models based on the E-nose and E-tongue. In parallel, a metabolomics method based on gas chromatography-mass spectrometry (GC–MS) was used to find the metabolites that differed between the fresh and stale groups, and the critical values for discriminating the two groups were obtained.

Materials and methods

Main chemicals and reagents

All chemicals were of analytical reagent grade unless otherwise stated, and water was obtained from a Milli-Q purification system (Millipore Sigma, Burlington, MA, USA). Phenolphthalein, sodium hydroxide, potassium acid phthalate, L(+)-ascorbic acid, ethanedioic acid dihydrate, and pyridine were purchased from Sinopharm Chemical Reagent Co., Ltd. (Beijing, China). 2,6-Dichloroindophenol sodium salt (97%) was purchased from Shanghai YuanYe Biotechnology Co., Ltd. (Shanghai, China). Sucrose, glucose, and fructose standards, HPLC-grade methanol, adonitol (99%), O-methylhydroxylamine (98%+), and BSTFA + TMCS (99:1, v/v) were purchased from Shanghai Macklin Biological Technology Co., Ltd. (Shanghai, China).

A mixed stock solution containing sucrose (20 mg/mL), glucose (10 mg/mL), and fructose (10 mg/mL) was prepared by dissolving the above-mentioned analytical standards in a mixture of methanol and pure water (7:3). Mixed standard working solutions were prepared by serial dilution of the stock solution with methanol. The stock solution was stored at −18 °C, while all of the working solutions were stored at 4 °C.

Preparation of samples

Citrus fruits of two cultivars were analyzed, namely satsuma mandarin (C. unshiu Marc.) and Newhall navel orange (C. sinensis Osbeck ‘Newhall’). They were hand-harvested from a local orchard in Yichang, Hubei, China (N 30°07′, E 111°28′) on September 23, 2021 and December 16, 2021, respectively. After harvest, 50 samples of each cultivar were stored in a cold room (4–8 ℃) and the rest were spread out in a room at room temperature (~20 ℃). Thereafter, a batch of 50 fruits was transferred from room temperature to cold storage every 3 days for satsuma mandarin and every 10 days for navel orange. The total storage time was 2 months for satsuma mandarin and 3 months for navel orange. After storage, all citrus fruits were numbered and sent to a sensory laboratory at Huazhong Agricultural University. Each fruit was peeled and divided into two halves. One half was used for sensory evaluation immediately, and the other half was used for instrumental analysis. The samples for the E-nose and E-tongue were stored at 4 ℃, and the experiments were completed within 2 days. The samples for GC–MS were frozen in liquid nitrogen and stored at −80 ℃.

Establishment of sensory evaluation model based on FM

The details are given in the Supplementary Information.

Determination of TSS and TA

TSS and TA were determined according to standard methods [27, 28].

E-nose setup and signal acquisition

A FOX 4000 Odor Fingerprint Analyzer (Alpha MOS, Toulouse, France) equipped with an array of 18 different metal oxide semiconductor (MOS) sensors was used in this research. The names of the sensors and their main responses to specific volatile compounds are presented in Table S4 (Supplementary Information). For each sample, 3 g of flesh was placed in a 20-mL headspace vial. All samples were placed in the E-nose tray for automatic sampling. The E-nose parameters were as follows: incubation time 600 s, incubation temperature 40 ℃, flushing time 120 s, agitation speed 500 rpm, agitation on 5 s, agitation off 2 s.

E-tongue setup and signal acquisition

An ASTREE II Gustatory Fingerprint Analyzer (Alpha MOS, Toulouse, France) was employed to characterize the taste of citrus samples. The instrument was mainly composed of seven cross-sensitive liquid electrodes, an Ag/AgCl reference electrode, and a data acquisition system. The seven sensing electrodes were coated with specific materials that respond to five different tastes, i.e., bitterness, savory, saltiness, sourness, and sweetness. Detailed information on the E-tongue sensors is shown in Table S4 (Supplementary Information). Five citrus slices of the same freshness were mixed and homogenized. The juice (20 mL) was diluted to 100 mL with pure water and transferred to a 120-mL beaker for analysis. The E-tongue parameters were as follows: delay 0 s, acquisition duration 120 s, acquisition period 1.0 s, stirring rate 1 r/s. All samples were measured at 20 ± 1 °C.

Statistical analysis

Multivariate statistics were employed to discriminate and predict the freshness of citrus samples by means of principal component analysis (PCA) and discriminant factor analysis (DFA) models built with Alpha Soft software. Both the PCA and DFA models were constructed using the most discriminating peaks (those with discrimination power ≥ 95%), which were selected as the variables contributing most to the discrimination of citrus samples.

Discrimination was achieved by comparing the distances and pattern discrimination indices between sample groups on a PCA score plot, whereas freshness was predicted by projecting test samples onto the DFA model and assigning them to the groups defined by the training samples.

Significance analysis was performed using IBM SPSS 26.0, at a 5% significance level. Graphs were made using Origin 2021.

Metabolomics analysis

Derivatization-GC–MS

Derivatization was carried out according to published guidelines [29]. All of the samples belonging to one cultivar were divided into fresh and stale groups according to previous sensory evaluation in order to find out differential metabolites between the two groups. Six samples were randomly selected from each group for analysis. QC samples were inserted during the experiment.

GC–MS analysis was carried out using an Agilent 7000 D GC–MS instrument (Agilent Technologies, Santa Clara, CA, USA) operating in electron impact ionization (EI) mode at 70 eV. The GC separation was performed on a fused silica HP-5MS column (30 m × 250 µm × 0.25 µm). The GC oven temperature was held at 100 ℃ for 1 min, raised to 184 ℃ at 3 ℃/min, then raised to 190 ℃ at 0.5 ℃/min and held for 1 min, and finally raised to 280 ℃ at 15 ℃/min.
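
As a quick check on the oven program, the run time implied by the ramps can be worked out as below. This is a minimal sketch: the function name and the step layout are illustrative, and it assumes there is no final hold at 280 ℃, since none is stated in the text.

```python
def program_minutes(start_temp, initial_hold, steps):
    """Total oven time in minutes for a list of (target_temp, rate, hold) steps.

    target_temp is in degrees C, rate in degrees C per minute, hold in minutes.
    """
    total = initial_hold
    temp = start_temp
    for target, rate, hold in steps:
        total += (target - temp) / rate + hold  # ramp time plus hold time
        temp = target
    return total


# Oven program as described in the text; the final hold of 0 min is an assumption.
steps = [
    (184, 3.0, 0.0),   # 100 -> 184 at 3 C/min
    (190, 0.5, 1.0),   # 184 -> 190 at 0.5 C/min, held for 1 min
    (280, 15.0, 0.0),  # 190 -> 280 at 15 C/min
]
run_time = program_minutes(100, 1.0, steps)
print(f"Approximate oven program time: {run_time:.1f} min")
```

Under these assumptions the ramps contribute 28, 12, and 6 min, and the two holds add 1 min each.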

Data pre-processing

The raw GC–MS data were processed with MZmine 2.53 software for baseline correction, mass detection, chromatogram detection, smoothing, deconvolution, alignment, normalization, etc. [30, 31]. Multivariate statistical analysis was conducted on the result matrix obtained after MZmine processing.

Multivariate statistical analysis

Statistical analysis was conducted online with Metaboanalyst [32]. For three-column data (mass, retention time, and intensity), the program further grouped peaks based on their retention time. The mass tolerance was 0.25 (m/z), and the retention time tolerance was 5 s. The default method replaced all missing values with a small value (half of the minimum positive value in the original data), which was assumed to be the detection limit. All data were processed by normalization by sum, cube root transformation, and Pareto scaling. PCA was used to verify sample differentiation and instrument stability. Differential metabolites between fresh and stale citrus were found by orthogonal partial least squares discriminant analysis (OPLS-DA). The quality of the PCA model was assessed by R2X and Q2, and the quality of the OPLS-DA model was assessed by R2X, R2Y, and Q2. Metabolites with a VIP value greater than 1 in the OPLS-DA model were selected as candidate differential metabolites. The relative contents of the candidate differential metabolites were subjected to a t test, and only metabolites with P < 0.05 were defined as differential metabolites.
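
The pre-treatment chain described above (half-minimum imputation, normalization by sum, cube root transformation, Pareto scaling) can be sketched in a few lines. This is an illustration under stated assumptions, not the Metaboanalyst implementation: rows are samples, columns are features, missing values are marked with `None`, each sample is normalized to unit sum, and the intensity matrix is hypothetical.

```python
import statistics


def impute_half_min(rows):
    """Replace missing values (None) with half of the smallest positive value."""
    positives = [v for row in rows for v in row if v is not None and v > 0]
    fill = min(positives) / 2
    return [[fill if v is None else v for v in row] for row in rows]


def normalize(rows):
    """Sum normalization per sample, cube root transform, Pareto scaling per feature."""
    rows = impute_half_min(rows)
    rows = [[v / sum(row) for v in row] for row in rows]   # normalization by sum
    rows = [[v ** (1 / 3) for v in row] for row in rows]   # cube root transformation
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    # Pareto scaling: mean-center, then divide by the square root of the
    # (sample) standard deviation of each feature.
    scales = [statistics.stdev(c) ** 0.5 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, scales)] for row in rows]


# Hypothetical peak intensities: 3 samples x 3 features, one missing value.
raw = [
    [120.0, 30.0, None],
    [80.0, 45.0, 10.0],
    [60.0, 90.0, 4.0],
]
scaled = normalize(raw)
```

Pareto scaling keeps more of the original variance structure than unit-variance scaling, which is one reason it is a common choice for metabolomics intensity data.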

Determination of the threshold for differential metabolites

Standard curves for the quantitative analysis of differential metabolites were established by the internal standard method. A total of 30 citrus samples with different fuzzy comprehensive evaluation scores were analyzed, and three parallel assays were performed for each sample. Detailed methods are given in the Supplementary Information. Fitting curves were drawn with the comprehensive evaluation score of freshness as the X-axis and the content of each differential metabolite as the Y-axis. The ordinate of the point on the fitted curve at a comprehensive evaluation score of 2.5 was taken as the threshold value of the differential metabolite.
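
The threshold step can be sketched as a curve fit followed by evaluation at a score of 2.5. The example below assumes a straight-line model fitted by ordinary least squares purely for illustration; the text does not state which model was fitted, and the score and content values are hypothetical, not measured data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def content_at(score, slope, intercept):
    """Metabolite content predicted by the fitted line at a given freshness score."""
    return slope * score + intercept


# Hypothetical data: freshness scores (X) and metabolite contents in mg/g (Y).
scores = [1.2, 1.9, 2.6, 3.1, 3.8]
contents = [44.0, 40.5, 37.0, 34.5, 31.0]

slope, intercept = fit_line(scores, contents)
threshold = content_at(2.5, slope, intercept)  # 2.5 separates the stale and fresh halves
```

The score of 2.5 is the boundary between the "relatively stale" and "relatively fresh" grades, so the fitted content at that point serves as the cut-off between the two halves of the scale.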

Results and discussion

Determination of the weight of factors

According to the analytic hierarchy process, a total of 18 matrices passed the pairwise comparison consistency test (CR < 0.1), and the weights of appearance, touch, smell, and taste were calculated to be 0.103, 0.105, 0.212, and 0.580, respectively. The detailed results are shown in Table S5 (Supplementary Information).
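
To make the weighting step concrete, the following minimal sketch derives weights and the consistency ratio from a single pairwise comparison matrix. It assumes the usual principal-eigenvector form of the analytic hierarchy process and Saaty's random index (0.90 for four factors); the matrix is hypothetical and is not one of the 18 consumer matrices, and the function names are illustrative.

```python
# Saaty's random consistency index for small matrix orders.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector (weights) and eigenvalue by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]  # keep the weights summing to 1
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


def consistency_ratio(lambda_max, n):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical judgments; factor order: appearance, touch, smell, taste.
pairwise = [
    [1.0, 1.0, 1 / 2, 1 / 5],
    [1.0, 1.0, 1 / 2, 1 / 6],
    [2.0, 2.0, 1.0, 1 / 3],
    [5.0, 6.0, 3.0, 1.0],
]
weights, lambda_max = ahp_weights(pairwise)
cr = consistency_ratio(lambda_max, len(pairwise))
```

A matrix is accepted only when CR < 0.1; how the weights from the accepted matrices were combined into the final values is described in the Supplementary Information and is not reproduced here.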

Calculation of fuzzy comprehensive evaluation scores

There were 422 and 150 evaluation matrices for satsuma mandarin and navel orange, respectively. The comprehensive evaluation of each citrus fruit was successfully converted into a freshness score ranging from 1 to 4. Citrus with a score of 1–1.75 was classed as stale, 1.75–2.5 as relatively stale, 2.5–3.25 as relatively fresh, and 3.25–4 as fresh. Across the four grades from low to high, the numbers of satsuma mandarin samples were 111, 212, 70, and 29, and the numbers of navel orange samples were 37, 40, 48, and 25, respectively.
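
One way a fuzzy evaluation matrix can be turned into a score of this kind is sketched below. The factor weights are those reported in this paper; everything else is an assumption for illustration, because the actual procedure is in the Supplementary Information: a weighted-average fuzzy operator, grade values of 1 to 4 from stale to fresh, lower-bound-inclusive grade boundaries, and a hypothetical membership matrix.

```python
WEIGHTS = [0.103, 0.105, 0.212, 0.580]  # appearance, touch, smell, taste (from the text)
GRADE_VALUES = [1, 2, 3, 4]             # assumed: stale, relatively stale, relatively fresh, fresh


def freshness_score(membership, weights=WEIGHTS, grade_values=GRADE_VALUES):
    """Weighted-average fuzzy evaluation.

    membership[i][k] is the degree to which factor i belongs to grade k;
    each row is expected to sum to 1.
    """
    combined = [
        sum(weights[i] * membership[i][k] for i in range(len(weights)))
        for k in range(len(grade_values))
    ]
    return sum(b * v for b, v in zip(combined, grade_values))


def freshness_grade(score):
    """Map a score to a grade; treating each lower bound as inclusive is an assumption."""
    if score < 1.75:
        return "stale"
    if score < 2.5:
        return "relatively stale"
    if score < 3.25:
        return "relatively fresh"
    return "fresh"


# Hypothetical single evaluation: appearance and touch rated "relatively fresh",
# smell rated "relatively stale", taste rated "fresh".
example = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
score = freshness_score(example)
grade = freshness_grade(score)
```

Because taste carries a weight of 0.580, a high taste rating pulls the score into the top grade even when the smell rating is low.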

Analysis of TSS and TA

TSS and TA reflect the sweetness and acidity of citrus, respectively. Usually, they are the most important indicators for consumers to judge the freshness of citrus. As shown in Table S6 (Supplementary Information), there were significant differences (P < 0.05) in TA and TSS between the fresh and stale groups for both satsuma mandarin and navel orange, but no significant differences between the two middle groups. The results indicated that TSS and TA could not accurately distinguish the freshness grades assigned by consumers, although citrus of different freshness did differ in acidity and sweetness.

Taste discrimination

The radar diagrams based on the responses of the E-tongue sensors are shown in Fig. S1 (Supplementary Information). Except for the SCS (bitterness) sensor, the responses of the sensors differed noticeably between freshness groups. For satsuma mandarin, the response of each sensor increased as freshness decreased, but there was no obvious trend for navel orange. Nevertheless, the E-tongue could distinguish citrus of different freshness.

The discrimination of the taste profiles of citrus samples is presented on the PCA score plot constructed using seven selected peaks (discrimination power ≥ 95%) as inputs. Discrimination power describes the contribution of each peak to differentiating the samples. All of the samples were divided into a training set and a testing set. For the E-tongue, 41 satsuma mandarin samples and 80 navel orange samples were used for training. In Fig. 1, all responses of the seven sensors (outputs) were used for PCA. When the taste characteristics of citrus samples were discriminated using PCA, the samples were clearly separated by freshness along the two principal components (PC 1 and PC 2). The first two principal components of both models explained more than 96% of the total information. This result is consistent with the sensory evaluation, in which the weight given to taste by the consumers was 0.58. It showed that consumers judged citrus freshness largely by the sour and sweet sensations in the mouth rather than by smell [2]. Therefore, the differences in taste were large enough for the E-tongue to make an effective distinction with an unsupervised learning method.

Fig. 1 PCA score plot constructed by E-tongue using the most discriminating peaks for discrimination of citrus samples from different freshness groups

After discrimination, DFA models (Fig. 2) consisting of samples from each freshness group were constructed to predict the freshness of the test samples. In the DFA models based on the E-tongue, the distance between the fresh and stale groups was large, while the distance between the remaining two groups was relatively small, which was in line with the radar diagrams. The first two discriminant factors of the two models explained 99.28% and 98.94% of the total variation, respectively.

Fig. 2 DFA models for prediction of the freshness of citrus samples by E-tongue

A total of 28 satsuma mandarin samples and 60 navel orange samples were used for testing. Figure 3 shows the recognition of the test samples by their corresponding freshness groups when they were projected onto the two models. As expected, the two DFA models showed high accuracy (100% and 98.3%). Only one error occurred, between two adjacent groups with blurred boundaries. It is difficult for sensory evaluation to distinguish citrus at the boundary between fresh and stale, which was an important reason for applying FM to sensory evaluation. Zhang et al. [33] found that sweetness was the quality that consumers cared about the most. However, the results of the traditional tests (such as TSS and TA) did not differ clearly between groups. More sensitive methods are required, and the prediction results demonstrated the feasibility of the E-tongue.
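
The reported accuracies follow directly from the test set sizes. The short check below assumes that the single misclassification occurred in the navel orange test set, which is inferred from the order of the reported values rather than stated explicitly; the function name is illustrative.

```python
def recognition_rate(n_test, n_errors):
    """Percentage of test samples assigned to the correct freshness group."""
    return 100.0 * (n_test - n_errors) / n_test


satsuma_rate = recognition_rate(28, 0)  # 28 satsuma mandarin test samples
navel_rate = recognition_rate(60, 1)    # 60 navel orange test samples, one error (assumed)
print(f"{satsuma_rate:.1f}% and {navel_rate:.1f}%")
```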

Fig. 3 Projection of test samples onto the DFA models and recognition using the training samples by E-tongue

Aroma discrimination

Fig. S2 (Supplementary Information) shows the radar diagrams of the E-nose sensors. None of the LY sensors (sensitive to toxic compounds) responded. For satsuma mandarin [Fig. S2 (a)], the other responses showed a rising trend as freshness decreased, but only the stale group differed clearly from the other groups. This is probably because consumers are insensitive to smell and tend to neglect it when judging the freshness of citrus pulp. The response values of navel orange [Fig. S2 (b)] first decreased and then increased as freshness declined, and the differences between groups were more noticeable than for satsuma mandarin. This may be attributed to the more prominent aroma of navel orange compared with satsuma mandarin [2].

A total of 28 satsuma mandarin samples and 78 navel orange samples were used for training, and 20 satsuma mandarin samples and 60 navel orange samples were used for testing. As can be seen from Fig. S3 (Supplementary Information), the samples were not well separated into groups representing their freshness. As discussed above, consumers gave a low weight to smell, so citrus graded into different freshness groups differed very little in odor characteristics. Regarding the prediction of freshness using the E-nose, the recognition values were very low because the aroma profiles of the samples used for the model were similar in intensity. Thus, the models were not valid and the results are not included in this article.

Metabolomics analysis

Principal component analysis

Based on the previous analysis, sugars and organic acids may be important metabolites for distinguishing freshness, so the main task of this analysis was to identify the sugars and organic acids that differed between the two groups. After data pre-processing, 959 features (m/z_RT pairs) were detected. PCA was used to uncover the internal structure of the variables through a small number of principal components. The relatively tight clustering of QC samples in Fig. S4 (Supplementary Information) indicated that they had similar metabolic profiles and that the analysis was stable and repeatable. All of the tested samples fell into two distinct groups, suggesting that each group had a relatively distinct freshness profile. For satsuma mandarin, the first principal component (PC 1) explained 50.6% of the features of the original dataset, and the second principal component (PC 2) explained 18.8%. As freshness decreased, the gap along PC 1 clearly increased. For navel orange, PC 1 and PC 2 explained 17.4% and 10% of the features of the original dataset, respectively. In contrast to satsuma mandarin, the gap along PC 2 increased significantly as freshness decreased.

Orthogonal partial least square discriminant analysis

OPLS-DA is a supervised multivariate pattern recognition method that can effectively remove variation irrelevant to the study and screen for differential metabolites. OPLS-DA was used to compare the fresh and stale groups pairwise and to draw the score plots. In the models, R2X and R2Y represent the proportion of the X and Y matrices explained by the model, respectively, and Q2 represents the predictive ability of the model. The OPLS-DA score plots in Fig. 4a–b showed clear separation between the compared groups. The models had one orthogonal component, with R2Y = 0.999 (total variation in Y explained by the model) and Q2 = 0.515 (goodness of prediction) for the satsuma mandarin dataset, and R2Y = 1 and Q2 = 0.518 for the navel orange dataset, indicating that the models were appropriate.

Fig. 4 OPLS-DA of metabolite features detected in citrus. a, b Score plots of all metabolite features. c, d Model overview of the OPLS-DA models. e, f Loadings V-plot showing the variable importance in a model, combining the VIP > 1 and the P < 0.05 loading profiles. g, h Volcano plots showing the differential saccharides and acids expression levels between fresh and stale samples

Analysis of differential metabolites

Pairwise comparisons were conducted between the fresh and stale samples to determine the metabolites that caused the observed differences. In the OPLS-DA models (Fig. 4a–b), fresh samples were clearly separated from stale samples, suggesting major distinctions in the metabolic profiles of the two groups.

Differential metabolites were then screened among all saccharides and organic acids based on the fold change (FC ≥ 2 or ≤ 0.5), the variable importance in projection (VIP > 1), and the P value of the t test (P < 0.05). The screening results are presented as volcano plots (Fig. 4g–h) and scatter plots (Fig. 4e–f). Eight markers identified by OPLS-DA with high VIP scores, high FC, and statistical significance (P < 0.05) are summarized in the boxplots in Fig. 5. The mass peak intensities of sucrose, fructose, glucose, and mannose (Fig. 5a–d) were significantly different between the fresh and stale groups of satsuma mandarin, while the mass peak intensities of sucrose, fructose, glucose, and galactose (Fig. 5e–h) were significantly different for navel orange. The results showed that the differential metabolites between fresh and stale citrus were sugars, indicating that consumers tended to distinguish citrus of different freshness according to sweetness, which also explains why the E-tongue could better distinguish the different levels of citrus freshness. This was also consistent with the TSS results.
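
The three screening criteria can be combined into a single filter, as sketched below. The feature names and all numbers are hypothetical placeholders; in practice the VIP scores and P values would come from the OPLS-DA model and the t test, and the fold change is taken here as stale over fresh, which is an assumption (the symmetric cut-offs make the direction irrelevant to the outcome).

```python
def is_differential(fresh_mean, stale_mean, vip, p_value):
    """Apply the screening rule: (FC >= 2 or FC <= 0.5) and VIP > 1 and P < 0.05."""
    fold_change = stale_mean / fresh_mean  # assumed direction: stale relative to fresh
    large_change = fold_change >= 2 or fold_change <= 0.5
    return large_change and vip > 1 and p_value < 0.05


# Hypothetical candidates: (name, fresh mean intensity, stale mean intensity, VIP, P).
candidates = [
    ("metabolite_A", 100.0, 250.0, 1.8, 0.01),  # FC 2.5, passes all three criteria
    ("metabolite_B", 100.0, 120.0, 1.4, 0.03),  # FC 1.2, change too small
    ("metabolite_C", 200.0, 80.0, 1.2, 0.04),   # FC 0.4, passes all three criteria
    ("metabolite_D", 100.0, 300.0, 0.7, 0.01),  # VIP below 1
    ("metabolite_E", 100.0, 400.0, 1.5, 0.20),  # not significant
]
selected = [name for name, fresh, stale, vip, p in candidates
            if is_differential(fresh, stale, vip, p)]
```

Requiring all three criteria at once guards against features that are statistically significant but change little, or that change a lot but contribute little to the group separation.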

Fig. 5 Box-plots of significant differential metabolites of satsuma mandarin (a–d) and navel orange (e–h)

Quantitative analysis of main differential metabolites

Sucrose, glucose and fructose, which were the common differential metabolites of satsuma mandarin and navel orange, were selected for quantitative analysis. According to the comprehensive evaluation scores of FM, citrus with different scores were selected and analyzed.

For satsuma mandarin, an obvious correlation was found only between freshness and the content of sucrose. As seen in Table S7 (Supplementary Information), the content of sucrose increased as freshness decreased. For navel orange, a clear trend was seen between freshness and the contents of fructose and glucose, both of which increased as freshness decreased. The optimal models were selected for data fitting, and the fitting curves are shown in Fig. 6. According to the fitting curve of sucrose, 36.57 mg/g was the critical value between fresh and stale satsuma mandarin: if the sucrose content is higher than this value, the fruit is stale; if it is lower, the fruit is fresh. Similarly, 10.35 mg/g and 10.72 mg/g were the thresholds between fresh and stale navel oranges for glucose and fructose, respectively. This is a preliminary study, and the above results need to be verified with more samples.

Fig. 6 Fitting curves for differential metabolites of satsuma mandarin and navel orange

Conclusions

In this study, consumer sensory evaluation combined with FM was applied to the evaluation of citrus freshness. In the sensory evaluation, the weight that consumers assigned to taste was 0.58, which indicated that taste was the most important factor for consumers in judging the freshness of citrus. This was consistent with the results of TSS and TA. The E-tongue could clearly and accurately distinguish citrus in the fresh, relatively fresh, relatively stale, and stale groups. Sweetness was the most important factor for consumers in judging citrus freshness. Sucrose, glucose, and fructose were identified as the common differential metabolites of satsuma mandarin and navel orange between the fresh and stale groups. In the quantitative analysis, however, obvious trends were found only in the content of sucrose in satsuma mandarin and in the contents of fructose and glucose in navel orange. The critical value of sucrose for satsuma mandarin was calculated to be 36.57 mg/g, and the thresholds of glucose and fructose for navel orange were 10.35 mg/g and 10.72 mg/g, respectively. However, more samples are needed to verify our conclusions. This is the first study to evaluate and differentiate the freshness of citrus favored by consumers. The method has the advantages of low cost, easy operation, fast analysis speed, and strong applicability.