Introduction

Adulteration of food and beverages is a significant problem that affects many different edible products. In the fruit juice industry, one of the most common frauds is the addition of a co-fruit, a fruit that is less expensive or easier to obtain, to the authentic juice. The high cost of the fruit, and the possibility of a poor harvest coinciding with high consumer demand, make this industry susceptible to adulteration. Nowadays, sweet orange (C. sinensis) juice, whose consumption has increased significantly in recent years, is adulterated with tangerine (C. reticulata), lemon (C. limon) and/or grapefruit (C. paradisi) juice [1]. Detection and prevention of fruit juice adulteration is a very complex task due to the natural variation in the cultivars [2], stages of maturity [3], environmental conditions during growth [4], storage conditions [5], postharvest treatments [6, 7], the presence of the peel in fruit-based products [8] and the extraction system [9]. Therefore, the development of analytical methods to detect the adulteration of Citrus juices is of great interest in order to guarantee the food authenticity demanded by food producers, consumers and regulatory bodies. Approaches based on chemical analysis combined with statistical and multivariate data analysis procedures have proved useful for developing models for the determination of the geographical origin or quality brand of foodstuffs and for fraud detection [10]. Several authors have already shown the potential of 1H nuclear magnetic resonance (1H NMR), infrared spectroscopy and liquid chromatography-mass spectrometry for authenticity assessment of fruit juices in combination with different chemometric tools [11–14].

For certain fruits, characteristic phenolic compounds have been successfully used to detect the adulteration of fruit juices [15, 16], nectars [17, 18] and jams [17, 19] with cheaper fruits [20]. Thus, polyphenols are very promising for the determination of food authenticity due to their taxonomic specificity [21, 22]. Phenolic composition of Citrus juice comprises flavanones (major group), flavones and flavonols [22–25], which usually occur as glycosides. Several publications reported improvements in Citrus phenolic compound determination, especially using reversed-phase high-performance liquid chromatography (HPLC) with diode array detection (DAD) for their identification and characterization [15, 26]. However, liquid chromatography (LC) coupled to electrospray ionization (ESI) and tandem mass spectrometry (MS/MS) is nowadays one of the most successful techniques applied to qualitative and quantitative determination of phenolic compounds in fruits [27]. Its superior sensitivity, high selectivity and resolution power allow direct screening of natural products avoiding the previous need for laborious isolation of phenolics [28, 29]. This technique has been applied for the identification of phenolic compounds in Citrus fruits [23, 30–34], apples [35, 36], coffee [37] and tomato [38, 39].

In the present study, the phenolic profiles of Citrus juices, previously characterized by HPLC–DAD-ESI–MS/MS [32], were quantified by HPLC–DAD, using an optimized and validated method for the simultaneous determination of several polyphenolic families in fruit juices [40]. These polyphenolic profiles, being representative of the Spanish Citrus fruit juice production, were studied with the aim of differentiating Citrus juices according to the species used for their elaboration: sweet orange, tangerine, lemon or grapefruit. The data set was analysed by statistical and chemometric techniques in order to identify possible markers and to develop classification and regression models for the authentication of Citrus juices and the detection of adulterations.

Materials and methods

Chemicals

Methanol and dimethyl sulfoxide (Romil, Chemical Ltd, Heidelberg, Germany) were of HPLC grade. Water was purified on a Milli-Q system (Millipore, Bedford, MA, USA). Glacial acetic acid, ascorbic acid and sodium fluoride provided by Merck (Darmstadt, Germany) were of analytical quality. All solvents used were filtered through 0.45 μm nylon membranes (Lida, Kenosha, WI, USA).

Phenolic standards were supplied as follows: eriodictyol-7-O-rutinoside, eriodictyol-7-O-neohesperidoside, naringenin-7-O-rutinoside, hesperetin-7-O-rutinoside, hesperetin-7-O-neohesperidoside, isosakuranetin-7-O-rutinoside, hesperetin, homoeriodictyol, ferulic acid, sinapic acid, quercetin-3-O-galactoside, quercetin-3-O-glucofuranoside, quercetin-3-O-glucopyranoside, quercetin-3-O-rhamnoside, kaempferol-3-O-glucoside, kaempferol-3-O-rutinoside, kaempferol-7-O-neohesperidoside, kaempferol-3-O-robinoside-7-O-rhamnoside, isorhamnetin-3-O-glucoside, isorhamnetin-3-O-rutinoside, isorhamnetin, tamarixetin, myricetin, scopoletin, luteolin-7-O-glucoside, luteolin-6-C-glucoside, luteolin-8-C-glucoside, luteolin-3′,7-di-O-glucoside, luteolin-4′-O-glucoside, diosmetin-7-O-rutinoside, apigenin-7-O-glucoside, apigenin-6-C-glucoside, apigenin-8-C-glucoside, apigenin-7-O-neohesperidoside, apigenin-7-O-rutinoside, diosmetin, chrysoeriol and sinensetin from Extrasynthèse (Genay, France); while naringenin, 5′-caffeoylquinic acid, caffeic acid, p-coumaric acid and quercetin-3-O-rutinoside were provided by Sigma-Aldrich Chemie (Steinheim, Germany); apigenin-8-C-glucoside-4′-O-rhamnoside, kaempferol-3-O-(p-coumaroyl)glucoside, tangeretin and nobiletin by Chromadex (Santa Ana, CA, USA); and naringenin-7-O-neohesperidoside, quercetin dihydrate and apigenin by Fluka Chemie (Steinheim, Germany).

All stock standard solutions (in concentrations ranging from 250 to 2,500 μg/mL, depending on each phenolic compound) were prepared in methanol, except for hesperetin-7-O-rutinoside, hesperetin, homoeriodictyol, chrysoeriol and isorhamnetin, which were dissolved in water-dimethyl sulfoxide (80:20, v/v). All solutions were stored at 4 °C in darkness.

Fruit samples

Citrus fruits of 18 cultivars grown and used in Spain for making juices were purchased from a local market at commercial maturity. These samples had been previously analysed by HPLC–DAD-ESI–MS/MS to identify the phenolic compounds contained [32]. The Citrus species and cultivars studied were (Online Resource 1) as follows: sweet orange Citrus sinensis (27 juice samples): Cv. Navel-Late (NVLA), Cv. Navelina (NVL), Cv. Navel (NV), Cv. Salustiana (SA), Cv. Valencia Late (VL) and Cv. Valencia (VA); tangerine Citrus reticulata and Citrus unshiu (30 juice samples): Cv. Hernandina (CLH), Cv. Marisol (CLM), Cv. Clemenule (CLN), Cv. Clementina (CL), Cv. Satsuma (SAT), Cv. Fortuna (FOR) and Cv. Clemenvilla (CLV); lemon Citrus limon (12 juice samples): Cv. Verna (V) and Cv. Primafiori (VP); grapefruit Citrus paradisi (15 juice samples): Cv. Star Ruby (SR), Cv. Red Ruby (RR) and Cv. Blanco (BL). Fruits of NVLA, NVL, SA, CLH, CL, SAT, V, VP, SR and RR cultivars were from the 2003–2004 and 2004–2005 harvests; fruits of NV, VL and FOR cultivars, from the 2003–2004 harvest; and fruits of VA, CLN, CLM, CLV and BL cultivars, from the 2004–2005 harvest.

Citrus juice preparation

For each Citrus cultivar and harvest, three independent juice samples were prepared. For this, three batches of fruit (1 kg each) were separated. Each batch was peeled, separating the flavedo and the albedo from the pulp, and squeezed using a home juicer. Although this extraction procedure is not used on an industrial scale by fruit juice manufacturers, it is widely used by small manufacturers and allows suitable control of the elaboration conditions and of the fruits used to prepare the juice. The collected juice, after measuring its volume, was mixed with 50 mL of an aqueous solution containing ascorbic acid 0.2 g/mL and sodium fluoride 0.2 g/mL, in order to inactivate polyphenoloxidases and prevent phenolic degradation [41], and was then centrifuged at 6,000 rpm for 15 min at 4 °C. Aliquots of 1 mL were sampled, stored at −20 °C and lyophilized later. The freeze-dried material was stored at room temperature in a desiccator in darkness until analysis. The juice of each fruit batch was analysed in triplicate.

Determination of phenolic compounds in Citrus juices by HPLC–DAD

The analytical method used for the determination of phenolic compounds in fruit juices was optimized and validated in a previous work [40]. The method is based on the solvent extraction of freeze-dried aliquots of fruit juices followed by the analysis of the extract by HPLC–DAD.

Solvent extraction

Freeze-dried juice aliquots of 1 mL were extracted at the time of analysis with 2 mL of a mixture of methanol–water–acetic acid (30:69:1, v/v/v) using ascorbic acid (2 g/L) as preservative. Mixing was carried out by vortex, and the extraction was performed in an ultrasonic bath for 15 min at room temperature. The extract was centrifuged at 4,000 rpm for 4 min and passed through a 0.45 μm PTFE filter (Waters, Milford, USA) prior to injection into the chromatographic system.

Reversed-phase HPLC analysis

Chromatographic analysis was performed on a Shimadzu (Kyoto, Japan) liquid chromatograph, equipped with a vacuum degasser DGU-14A, a quaternary pump LC-10DVP, a thermostatted autosampler SIL-10ADVP, a thermostatted column compartment and a DAD detector SPD-M10AVP and controlled by CLASS-VP software. A reversed-phase Phenomenex Luna C18(2) column (150 × 4.6 mm i.d. and particle size 3 μm) (Torrance, USA) with a Waters Nova-Pak C18 guard column (10 × 3.9 mm i.d., 4 μm) (Milford, USA) was used. A gradient program for general phenolic compound analysis was employed: the eluents were acetic acid–water (0.5:99.5, v/v) (solvent A) and methanol (solvent B); initially 0 % B for 2 min, a linear gradient to 15 % B at 6 min, held isocratically until 12 min, linear gradient to 20 % B at 15 min, 20 % B constant until 35 min, linear up to 35 % B at 90 min, 35 % B constant until 136 min, and finally, washing and reconditioning of the column. The flow rate was 0.8 mL/min, and the injection volume was 50 μL. The column was operated at 30 °C, and sample vials in the autosampler were kept at 4 °C. Flavanones were monitored and quantified at 280 nm, hydroxycinnamic acids at 320 nm, and flavonols, flavones and coumarins at 370 nm.
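The gradient program described above is a piecewise-linear function of time. As a minimal illustrative sketch (the function name `percent_b` and the breakpoint table are ours, transcribed from the description; the final washing and reconditioning step is not specified in the text and is therefore omitted), the mobile phase composition at any time can be computed as:

```python
# Breakpoints (time in min, % solvent B) transcribed from the gradient
# description; washing/reconditioning after 136 min is not specified and
# is therefore left out of this sketch.
GRADIENT = [(0, 0), (2, 0), (6, 15), (12, 15), (15, 20), (35, 20), (90, 35), (136, 35)]


def percent_b(t_min):
    """Return the % of methanol (solvent B) at time t_min by linear interpolation."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the programmed gradient")
    # Walk through consecutive breakpoint pairs and interpolate in the
    # segment that contains t_min.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For instance, halfway through the first ramp (4 min) the function returns 7.5 % B; the percentage of solvent A is simply 100 minus this value.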

The identification of the phenolic compounds analysed has already been reported [28, 32, 42–44]. Quantitation was performed by interpolating the peak areas on the calibration curves of the standards most similar to each phenolic compound quantified. Thus, flavanones were quantified as naringenin-7-O-rutinoside (limit of detection (LOD) = 0.02 mg/L; limit of quantitation (LOQ) = 0.04 mg/L); apigenin glycosides as apigenin-7-O-glucoside (LOD = 0.06 mg/L; LOQ = 0.2 mg/L); luteolin, diosmetin and chrysoeriol glycosides as luteolin-7-O-glucoside (LOD = 0.04 mg/L; LOQ = 0.14 mg/L); quercetin and kaempferol glycosides as quercetin-3-O-rutinoside (LOD = 0.04 mg/L; LOQ = 0.14 mg/L) and kaempferol-3-O-rutinoside (LOD = 0.04 mg/L; LOQ = 0.2 mg/L), respectively; isorhamnetin and tamarixetin glycosides as isorhamnetin-3-O-rutinoside (LOD = 0.06 mg/L; LOQ = 0.2 mg/L); ferulic and sinapic acid derivatives as 5′-caffeoylquinic acid (LOD = 0.016 mg/L; LOQ = 0.06 mg/L) and sinapic acid (LOD = 0.02 mg/L; LOQ = 0.06 mg/L), respectively; and scopoletin glycosides as scopoletin (LOD = 0.04 mg/L; LOQ = 0.1 mg/L). These concentrations were corrected with the recovery factors previously published [40].
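The quantitation step can be illustrated with a short sketch. It assumes a linear external calibration (area = slope × concentration + intercept) and that the recovery correction is applied by dividing by the fractional recovery; neither detail is stated here (both are given in [40]), and the function name and all numerical values in the usage note are hypothetical.

```python
def quantify(area, slope, intercept, recovery, loq):
    """Convert a peak area into a recovery-corrected concentration (mg/L).

    Assumptions (not taken from the text): linear calibration curve and
    recovery expressed as a fraction between 0 and 1.
    Returns None when the uncorrected value falls below the LOQ.
    """
    if slope <= 0 or not 0 < recovery <= 1:
        raise ValueError("slope must be positive and recovery within (0, 1]")
    conc = (area - intercept) / slope  # invert the calibration line
    if conc < loq:
        return None  # detected at most, but not quantifiable
    return conc / recovery  # compensate for incomplete extraction
```

With a hypothetical slope of 50 area units per mg/L, zero intercept and 80 % recovery, a peak area of 1,250 corresponds to 25 mg/L before correction and 31.25 mg/L after it.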

Data analysis

The data set, made up of the individual polyphenol concentrations (variables in columns) measured on the Citrus fruit juices (samples in rows) determined by HPLC–DAD, was first analysed by univariate procedures (ANOVA, Fisher index and box-and-whisker plots) in search of phenolic markers to distinguish the four Citrus species studied. When required, further multivariate data analysis was performed by pattern recognition techniques already described in the literature [45]: unsupervised techniques such as principal component analysis (PCA), and supervised techniques such as linear discriminant analysis (LDA) and partial least squares discriminant analysis (PLS-DA). Moreover, partial least squares regression (PLS) was used to create calibration models to determine the percentage of adulteration in Citrus fruit juices. For this purpose, a data set was constituted with the phenolic concentrations of adulterated juices theoretically calculated from the pure juice concentrations. Statistical and chemometric data analyses were performed by means of the statistical software packages Statistica 6.1 (StatSoft Inc., Tulsa, OK, USA, 1984–2004), The Unscrambler 9.7 (Camo Process AS, Oslo, Norway, 2007) and SPSS for Windows (SPSS Inc., 1989–1999).
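A simple way to obtain such theoretical profiles is linear mixing, in which each compound in the blend is the volume-weighted average of its concentrations in the two pure juices. The sketch below assumes this mixing rule, which is our assumption since the calculation is not detailed here; the function name and the example concentrations are illustrative only.

```python
def blend_profile(pure_a, pure_b, fraction_b):
    """Phenolic profile (mg/L) of juice A adulterated with juice B.

    pure_a, pure_b: dicts mapping compound name -> concentration (mg/L).
    fraction_b: volume fraction of juice B in the blend (0 to 1).
    A compound missing from one profile is treated as absent (0 mg/L).
    """
    if not 0.0 <= fraction_b <= 1.0:
        raise ValueError("fraction_b must lie between 0 and 1")
    compounds = set(pure_a) | set(pure_b)
    return {
        name: (1.0 - fraction_b) * pure_a.get(name, 0.0)
        + fraction_b * pure_b.get(name, 0.0)
        for name in compounds
    }


# Illustrative (not measured) values: a sweet orange juice containing
# 25 % (v/v) tangerine juice.
orange = {"hesperetin-7-O-rutinoside": 600.0, "apigenin-8C-glucoside-O-pentoside": 4.0}
tangerine = {"hesperetin-7-O-rutinoside": 400.0}
blend = blend_profile(orange, tangerine, 0.25)
```

Repeating the calculation over a grid of adulteration percentages and over pairs of pure juices yields a calibration set in which the adulteration percentage is the response variable.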

The linear parametric techniques LDA, PLS-DA and PLS use statistical parameters of the distribution of the objects in the derivation of the linear function used to discriminate between classes (LDA and PLS-DA) or for calibration (PLS) [45]. In LDA, the variable selection was performed using forward stepwise selection [45]. In PLS-DA and PLS, PRESS or RMSEP is plotted against the number of the PLS components to find the optimal number of them. Sometimes there are several almost equivalent local minima on the curve; the first one should be chosen to avoid overfitting (according to the principle of parsimony). The model with the smallest number of features should be accepted from among equivalent models in the training set. Once PLS components are estimated by cross-validation, the classifications in the training-test set are represented in a box and whisker plot to define half of the distance between the quartiles as the boundary.
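The parsimony rule for choosing the number of PLS components can be sketched as follows. The function name, the optional `tolerance` argument (used to treat almost equal errors as equivalent) and the error values in the test are ours and purely illustrative; the commercial packages used in this work apply their own implementations.

```python
def first_local_minimum(errors, tolerance=0.0):
    """Return the number of PLS components at the first local minimum.

    errors[i] is the cross-validated PRESS or RMSEP obtained with i + 1
    components. The search stops as soon as adding one more component no
    longer lowers the error by more than `tolerance`.
    """
    if not errors:
        raise ValueError("at least one error value is required")
    for i in range(len(errors) - 1):
        if errors[i + 1] >= errors[i] - tolerance:
            return i + 1  # the next component brings no real improvement
    return len(errors)  # the error decreased over the whole range tested
```

For an error curve such as 0.90, 0.50, 0.30, 0.32, 0.29, the function selects three components, even though the global minimum lies at five, which is the behaviour the parsimony principle calls for.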

The supervised multivariate techniques were applied to the autoscaled (or standardized) data matrix as follows: (1) the data set was divided into a training-test set and an external set; (2) the training-test set was subsequently divided into a training set and a test set several times in order to perform cross-validation; (3) the training-test set was used for the optimization of parameters characteristic of each multivariate technique by cross-validation, for instance, the number of PLS components in PLS-DA and PLS, or variable selection in LDA; (4) a final mathematical model was built using all the samples of the training-test set and the optimized parameters; (5) this model was validated using an independent test set of samples (external validation). During the parameter optimization step, the models were validated by threefold cross-validation (threefold CV) or leave-one-out cross-validation (LOO-CV). The reliability of the classification models achieved in the cross-validation was studied in terms of recognition ability and prediction ability (percentage of the samples in the training set and the test set correctly classified, respectively). The reliability of the final classification model was evaluated in terms of the prediction ability in the external validation (percentage of the samples of the external set correctly classified using the optimized model). The reliability of the regression model was evaluated in terms of the root mean square error of prediction (RMSEP) and the squared correlation coefficient (R²), which indicates the fraction of the total variance explained by the model.
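The figures of merit mentioned above can be computed as in the following sketch. It assumes R² is calculated as the explained fraction of variance (1 − SS_res/SS_tot), which is one common definition and may differ slightly from the one implemented in the software used; the function names are ours.

```python
import math


def percent_correct(true_labels, predicted_labels):
    """Recognition or prediction ability: % of samples correctly classified."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Fraction of the total variance explained (assumed definition)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot
```

The same `percent_correct` function gives the recognition ability when applied to the training set and the prediction ability when applied to the test or external set.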

Results and discussion

Polyphenolic profiles of Citrus fruit juices

Forty-nine polyphenolic compounds, previously characterized by HPLC–DAD-ESI–MS/MS [32], were quantified in Citrus fruit juices by HPLC–DAD. Nine other polyphenols were detected and identified in these Citrus juices [32], but they were present below the limit of quantitation of the DAD. These nine polyphenols were the following: dihydroquercetin-7-O-rutinoside and dihydrokaempferol-7-O-rutinoside found in tangerine juices; dihydroisorhamnetin-7-O-rutinoside in sweet orange, tangerine, lemon and grapefruit juices; hesperetin-7-O-rutinoside-3′-O-glucoside in sweet orange and tangerine juices; homoeriodictyol-7-O-rutinoside in lemon juices; naringenin-O-rhamnosylmalonylhexoside-2 in grapefruit juices; luteolin-6-C-glucoside in lemon juices; apigenin-8-C-glucoside in sweet orange and tangerine juices; and kaempferol-7-O-rutinoside in tangerine juices. Table 1 lists the retention times of each polyphenol quantified in sweet orange, tangerine, lemon and grapefruit juices. The total polyphenol contents found in the juices of the four Citrus species studied were as follows: 548–1,407 mg/L in sweet orange juices, 215–1,335 mg/L in tangerine juices, 658–1,538 mg/L in lemon juices and 1,173–2,216 mg/L in grapefruit juices. Flavanones were the major polyphenol class in Citrus juices in comparison with the other classes of flavonoids (flavones, flavonols and coumarins) and phenolic acids, which were present in considerably smaller amounts.

Table 1 Polyphenols in Citrus fruit juices from Spanish cultivars

Flavanones

Flavanones occur as glycosides in Citrus juices. The percentages of flavanones in the total polyphenol content were as follows: 85–95 % in sweet orange juices, 87–97 % in tangerine juices, 81–87 % in lemon juices and 92–94 % in grapefruit juices.

In sweet orange (Citrus sinensis) juices, the flavanone glycoside profile consisted predominantly of hesperetin-7-O-rutinoside (339–1,039 mg/L) and naringenin-7-O-rutinoside (70–223 mg/L) (Table 2). Three other flavanones detected in all studied cultivars at low concentrations were eriodictyol-7-O-rutinoside (2.4–12 mg/L), isosakuranetin-7-O-rutinoside (20–74 mg/L) and naringenin-7-O-rutinoside-4′-O-glucoside (22–64 mg/L), which have also been reported in the literature [30, 46–48].

Table 2 Polyphenolic contents (mg/L ± SD (n = 3)) in sweet orange (Citrus sinensis) juices

In tangerine (Citrus reticulata and Citrus unshiu) juices, flavanone composition was similar to that of sweet orange juice (Table 3). The main flavanones were hesperetin-7-O-rutinoside (166–879 mg/L) and naringenin-7-O-rutinoside (15–184 mg/L) followed by eriodictyol-7-O-rutinoside (3.3–8.6 mg/L), isosakuranetin-7-O-rutinoside (1.5–156 mg/L) and naringenin-7-O-rutinoside-4′-O-glucoside (1.2–29.7 mg/L), which is in agreement with published data [47, 49].

Table 3 Polyphenolic contents (mg/L ± SD (n = 3)) in tangerine (Citrus reticulata and Citrus unshiu) juices

In lemon (Citrus limon) juices, hesperetin-7-O-rutinoside was the most abundant flavanone, present in concentrations between 289 and 806 mg/L, followed by eriodictyol-7-O-rutinoside, ranging from 168 to 480 mg/L (Table 4) [9, 23, 48, 49]. Together, these two flavanones represented between 95 and 97 % of the total flavanone content of lemon juices. The high content of eriodictyol-7-O-rutinoside, in comparison with other Citrus species, is a distinctive characteristic of lemon juices. Three other flavanones were determined in lemon juices: eriodictyol-7-O-rutinoside-4′-O-glucoside, naringenin-7-O-rutinoside and isosakuranetin-7-O-rutinoside, which were found in relatively low concentrations: 9–27, 3–20 and 8 mg/L, respectively. Isosakuranetin-7-O-rutinoside was detected only in some cultivars.

Table 4 Polyphenolic contents (mg/L ± SD (n = 3)) in lemon (Citrus limon) juices

The grapefruit (Citrus paradisi) juices were those with the highest flavanone content, presenting concentrations of 1,080–2,076 mg/L (Table 5). The high flavanone content of grapefruit juices is due to the presence of two isomeric structures of flavanones: rutinoside and neohesperidoside of naringenin, hesperetin and isosakuranetin [49–51]. Naringenin-7-O-neohesperidoside was the most abundant flavanone glycoside (705–1,410 mg/L), comprising at least 60 % of the total polyphenol content, followed by naringenin-7-O-rutinoside (185–365 mg/L). Rutinosides and neohesperidosides of hesperetin and isosakuranetin were found in lower amounts than the 7-O-rutinoside and 7-O-neohesperidoside of naringenin [48]. Four other flavanone glycosides were detected in juices of all grapefruit varieties in much lower quantities than the other flavanones: naringenin-7-O-rutinoside-4′-O-glucose (11–17 mg/L), naringenin-7-O-neohesperidoside-4′-O-glucose (7.6–13.5 mg/L) and naringenin-O-rhamnosylmalonylhexoside (4.8–25 mg/L). These concentrations were similar to those reported in the literature [24, 52, 53]. Naringenin-O-hexosylhexoside was quantified here for the first time (4.7–6.0 mg/L).

Table 5 Polyphenolic contents (mg/L ± SD (n = 3)) in grapefruit (Citrus paradisi) juices

Flavones

Flavones are present in considerably smaller amounts than flavanones. The percentages of flavones of total polyphenols were as follows: 3.0–8.8 % in sweet orange juices, 0.3–9.8 % in tangerine juices, 8.0–13.6 % in lemon juices and 3.2–3.9 % in grapefruit juices.

In all cultivars of sweet orange (Citrus sinensis) juices, apigenin-6,8-di-C-glucoside, found also in all Citrus juices [54], was the most abundant flavone glycoside (Table 2); being present in concentrations that ranged from 25.8 to 69 mg/L. This flavone glycoside was also detected in tangerine and lemon juices but in lower amounts. One luteolin glycoside and three apigenin glycosides were also detected in all sweet orange cultivars: luteolin-6,8-di-C-glucoside, apigenin-8C-glucoside-O-pentoside, apigenin-6C-glucoside-O-pentoside and apigenin-8C-hexoside-O-acylglycoside. They were here determined quantitatively for the first time in sweet orange juices. Apigenin-8C-glucoside-O-pentoside had been found in leaves of sweet orange and sour orange [55, 56] and sweet orange peel [57, 58]. These four flavone glycosides were contained in considerably lower concentrations than apigenin-6,8-di-C-glucoside (Table 2).

Apigenin-6,8-di-C-glucoside and diosmetin-6,8-di-C-glucoside were always present in tangerine (Citrus reticulata and Citrus unshiu) juices [59], in concentrations of 1.6–61 and 1.2–2.9 mg/L, respectively (Table 3). The hybrid varieties, Fortuna and Clemenvilla, stood out due to their particularly high contents of apigenin-6,8-di-C-glucoside (56 and 61 mg/L, respectively). Four other flavones detected were the 7-O-rutinosides of apigenin, diosmetin, luteolin and chrysoeriol in juices of Clementina-Hernandina (harvest 2003–2004), Clementina (harvest 2003–2004) and the hybrid varieties (Fortuna and Clemenvilla). Apigenin and diosmetin 7-O-rutinosides had already been reported in tangerine juices in the literature [25, 60]. However, as far as we know, luteolin-7-O-rutinoside and chrysoeriol-7-O-rutinoside were quantified for the first time in the present work. These four flavones were present at less than 7.0 mg/L, except for apigenin-7-O-rutinoside (3.1–43 mg/L). In general, the hybrid varieties contained higher flavone amounts than the other tangerine cultivars.

The lemon (Citrus limon) juices differed from the other Citrus juices in their great diversity of flavones. Diosmetin-6,8-di-C-glucoside [23, 61] and luteolin-7-O-rutinoside [9, 62] were the major flavones in lemon juices, presenting concentrations of 14–45 and 10–27 mg/L, respectively (Table 4). The other flavones detected in lemon juices were present in lower concentrations. These flavones, some of them also described by other authors, were identified as apigenin-7-O-rutinoside (3.2–10 mg/L), diosmetin-7-O-rutinoside (2.2–29 mg/L), luteolin-6,8-di-C-glucoside (0.62–5.3 mg/L) [23], apigenin-6,8-di-C-glucoside (3.2–17 mg/L) [59, 63], chrysoeriol-6,8-di-C-glucoside (1.5–3.2 mg/L) [61], diosmetin-6-C-glucoside (2.7–25 mg/L) [25], chrysoeriol-7-O-rutinoside (2.8–7.2 mg/L) [64], apigenin-7-O-rutinoside-4′-O-glucoside (0.7–2.3 mg/L), chrysoeriol-6,8-di-C-hexosideacylhexoside (0.6–7.6 mg/L), diosmetin-6,8-di-C-hexosideacylhexoside (3.3–7.0 mg/L), apigenin-6-C-glucoside-O-pentoside (1.0–4.8 mg/L) and diosmetin-8-C-glucoside (3.9–10 mg/L). The last five compounds were determined in lemon juices for the first time in the present work.

Regarding grapefruit (Citrus paradisi) juices, apigenin-6,8-di-C-glucoside was the most abundant flavone in all the grapefruit varieties studied [25], present in concentrations of 29–44 mg/L (Table 5). Four minor flavones were also determined: apigenin-7-O-neohesperidoside (5.1–15 mg/L), which had been reported by other authors [53, 65]; and apigenin-6-C-hexoside-O-hexoside (4.8–6.1 mg/L), apigenin-6-C-glucoside-O-pentoside (6.2–11 mg/L) and luteolin-7-O-neohesperidoside-4′-O-glucoside (0.5–0.7 mg/L), which were quantified here for the first time.

Flavonols

Flavonols, together with hydroxycinnamic acids and coumarins, were the polyphenols present in the smallest amounts in Citrus juices. The percentages of flavonols in the total polyphenol content were the following: 1.0–3.5 % in sweet orange juices, 0.3–9.5 % in tangerine juices, 3.4–6.0 % in lemon and 0.1–0.2 % in grapefruit juices.

In sweet orange (Citrus sinensis) juices, the flavonol glycoside pattern consisted of quercetin, isorhamnetin and kaempferol glycosides. Some flavonol glycosides were determined quantitatively here for the first time in sweet orange juices: quercetin-3-O-rutinoside-7-O-glucoside, quercetin-3-O-rhamnoside-7-O-rhamnosylhexoside, quercetin-7-O-rutinoside, quercetin-3-O-rutinoside (previously reported only in sweet orange leaves and peel [66]), kaempferol-3-O-rutinoside-7-O-glucoside, kaempferol-3-O-rhamnosylhexoside-7-O-rhamnoside, kaempferol-3-O-rutinoside, isorhamnetin-3-O-rutinoside-7-O-glucoside, isorhamnetin-3-O-hexoside-7-O-rhamnosylhexoside, isorhamnetin-3-O-rhamnoside-7-O-rhamnosylhexoside and isorhamnetin-7-O-rutinoside. These flavonol compounds were detected in all sweet orange cultivars at concentrations lower than 6 mg/L (Table 2).

In tangerine (Citrus reticulata and Citrus unshiu) juices, flavonol composition depended on the tangerine sub-species. The hybrid cultivars, Fortuna and Clemenvilla, contained quercetin-7-O-rutinoside, quercetin-3-O-rutinoside and isorhamnetin-3-O-rutinoside in low quantities (<5.5 mg/L), quantified here in tangerine juices for the first time. Clementina juices presented the lowest concentrations of these three flavonols, as well as of quercetin-3-O-rhamnoside-7-O-rhamnosylhexoside, kaempferol-3-O-rhamnosylhexoside-7-O-rhamnoside and tamarixetin-7-O-rutinoside (Table 3). In contrast, Satsuma juices presented the highest content of flavonols (76 mg/L in harvest 2003–2004 and 92 mg/L in harvest 2004–2005), which were four quercetin glycosides, identified as quercetin-3-O-rutinoside, quercetin-7-O-rutinoside, quercetin-3-O-rutinoside-7-O-glucoside and quercetin-3-O-rhamnoside-7-O-rhamnosylhexoside; three kaempferol glycosides, identified as kaempferol-3-O-rhamnosylhexoside-7-O-rhamnoside, kaempferol-3-O-rutinoside-7-O-glucoside and kaempferol-3-O-rutinoside; and six isorhamnetin and tamarixetin glycosides, identified as isorhamnetin-3-O-hexoside-7-O-rhamnosylhexoside, isorhamnetin-3-O-rutinoside-7-O-glucoside, tamarixetin-3-O-rutinoside-7-O-glucoside, isorhamnetin-3-O-rhamnoside-7-O-rhamnosylhexoside, tamarixetin-7-O-rutinoside and isorhamnetin-3-O-rutinoside. All these flavonols were quantified in tangerine juices for the first time in the present work.

In lemon (Citrus limon) juices, quercetin-3-O-rutinoside was the most abundant flavonol [25, 67], present in all lemon cultivars and at much higher contents (7.4–34 mg/L) than in sweet orange and tangerine juices (Table 4). Two other quercetin glycosides and three isorhamnetin glycosides were determined in lemon juices: quercetin-3-O-rutinoside-7-O-glucoside (detected before in lemon juice and lemon tree [61], and leaves of other Citrus [6]); and quercetin-7-O-rutinoside, isorhamnetin-3-O-rutinoside-7-O-glucoside, isorhamnetin-7-O-rutinoside and isorhamnetin-3-O-rutinoside, which were quantified here in lemon juices for the first time. These flavonols were found in low quantities (≤9 mg/L).

The grapefruit (Citrus paradisi) juices presented the lowest content of flavonols of all the studied Citrus juices. Only quercetin-7-O-rutinoside was detected, at concentrations lower than 3 mg/L (Table 5). The content of this flavonol was determined in grapefruit juices for the first time in the present work.

Hydroxycinnamic acids

Hydroxycinnamic acid percentages in the total polyphenol content were as follows: 1.2–3.4 % in sweet orange juices, 0.6–9.6 % in tangerine juices, 1.1–1.3 % in lemon juices and 2.0–3.4 % in grapefruit juices. The hydroxycinnamic acids found in Citrus juices were the O-hexoside of ferulic acid and the O-hexoside of sinapic acid. The latter was present in smaller amounts than the former, and it was found in sweet orange, tangerine and lemon juices (Tables 2, 3, 4). In grapefruit juices, only the O-hexoside of ferulic acid was detected (Table 5).

Coumarins

Coumarins appeared only in grapefruit juices, representing 0.2–0.4 % of the total polyphenolic content. The coumarins quantified were scopoletin-O-hexoside (2.3–4.2 mg/L) and scopoletin-O-rhamnosylhexoside (2.4–2.8 mg/L) (Table 5). Scopoletin-O-rhamnosylhexoside was not detected in grapefruit juices from harvest 2004–2005.

Polyphenolic markers of Citrus fruit juices

The study of the detailed polyphenolic profiles of Citrus fruit juices revealed several markers that allow lemon and grapefruit juices to be distinguished, with practical certainty, from each other and from the other Citrus species. Grapefruit (Citrus paradisi) juices contained several flavanone markers: naringenin-7-O-neohesperidoside, naringenin-7-O-neohesperidoside-4′-O-glucose, naringenin-O-hexosylhexoside, hesperetin-7-O-neohesperidoside, naringenin-O-rhamnosylmalonylhexoside and isosakuranetin-7-O-neohesperidoside, which were present at concentrations higher than 2.5 mg/L. Hesperetin-7-O-rutinoside was always detected in Citrus juices at over 144 mg/L, except for grapefruit juices, which contained less than 38 mg/L. Among flavones, apigenin-6-C-hexoside-O-hexoside (4.0–6.8 mg/L) and apigenin-7-O-neohesperidoside (4.9–16.5 mg/L) were also grapefruit markers, as was the coumarin scopoletin-O-hexoside (2.3–4.2 mg/L). However, this coumarin may be present at concentrations too low in some grapefruit juices to be used to detect adulterations. Hydroxycinnamic acids should not be used for this purpose either because, even though grapefruit juices contain higher amounts of the O-hexoside of ferulic acid (>36 mg/L) than the other Citrus juices (<33.5 mg/L), the difference is too narrow. Another characteristic feature of grapefruit juices was the absence of isorhamnetin-3-O-rutinoside, which was present in all the other Citrus species (>0.3 mg/L).

Eriodictyol-7-O-rutinoside-4′-O-glucoside and eriodictyol-7-O-rutinoside were flavanone markers of lemon (Citrus limon) juices. The latter was also present in sweet orange and tangerine juices in low concentrations (<14 mg/L), whereas lemon juices contained more than 163 mg/L. Some flavones were present in both lemon and tangerine juices but at different concentration ranges: diosmetin-6,8-di-C-glucoside (tangerine juices (<3.5 mg/L), lemon juices (>13 mg/L)), diosmetin-8-C-glucoside (tangerine juices (<2.4 mg/L), lemon juices (>4 mg/L)), luteolin-7-O-rutinoside (tangerine juices (<8 mg/L), lemon juices (>9 mg/L)) and diosmetin-6-C-glucoside (tangerine juices (<1.6 mg/L), lemon juices (>3 mg/L)). In contrast, diosmetin-6,8-di-C-hexosideacylhexoside was only observed in lemon juice (3.2–7.5 mg/L). Chrysoeriol-6,8-di-C-glucoside (1.4–3.3 mg/L), apigenin-7-O-rutinoside-4′-O-glucoside (0.6–2.6 mg/L) and chrysoeriol-6,8-di-C-hexosideacylhexoside (0.4–8.0 mg/L) were other flavones that were only detected in lemon juices, but they may be present in concentrations too low to be considered for detecting juice adulteration.

Apigenin-8C-glucoside-O-pentoside (1.9–6 mg/L) may be regarded as a flavone marker for sweet orange (Citrus sinensis) juices; however, it has been detected at trace levels (under the LOQ = 0.1 mg/L) in some lemon juices. The flavone apigenin-6C-glucoside-O-pentoside was not detected in any of the tangerine (Citrus reticulata and Citrus unshiu) juice samples analysed, whereas it was always present in the other Citrus juices in concentrations over 0.9 mg/L. Thus, the absence of this flavone could be considered a characteristic feature of tangerine juices.

Regarding the adulteration of Citrus juices, the most common practice is the adulteration of sweet orange juice with grapefruit, lemon or tangerine juice, all of which come from fruits cheaper than sweet orange. The presence of grapefruit juice and/or lemon juice in sweet orange juice is easily detected by analysing at least one of the characteristic grapefruit and lemon markers described above. In contrast, detecting the adulteration of sweet orange juice with tangerine juice requires an exhaustive data analysis of the polyphenolic profiles of the juices.
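The reasoning behind using (or rejecting) a marker can be illustrated with a short worst-case calculation. The sketch below is not part of the study's procedure: it assumes that concentrations mix linearly with the volume fraction of each juice and that the authentic juice may contain none of the marker, and the function name `min_detectable_fraction` is hypothetical. Only the concentration limits are taken from the ranges reported above.

```python
def min_detectable_fraction(authentic_max, adulterant_min):
    """Smallest adulterant fraction guaranteed to lift a marker above the
    highest level seen in the authentic juice.

    Assumptions (not from the study): linear mixing by volume, and the worst
    case in which the authentic juice contributes none of the marker.
    """
    if adulterant_min <= authentic_max:
        raise ValueError("ranges overlap; the marker cannot guarantee detection")
    return authentic_max / adulterant_min


# Eriodictyol-7-O-rutinoside: <14 mg/L in sweet orange, >163 mg/L in lemon.
lemon_in_orange = min_detectable_fraction(authentic_max=14.0, adulterant_min=163.0)
print(f"{lemon_in_orange:.1%}")  # 8.6%

# O-hexoside of ferulic acid: <33.5 mg/L in other juices, >36 mg/L in grapefruit.
grapefruit_by_ferulic = min_detectable_fraction(authentic_max=33.5, adulterant_min=36.0)
print(f"{grapefruit_by_ferulic:.1%}")  # 93.1%
```

Under these assumptions, a marker with well-separated ranges responds to a small addition of the adulterant, whereas a marker with nearly touching ranges would only respond to an almost complete substitution, which is consistent with the ferulic acid hexoside being described as unsuitable.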

Pattern recognition of sweet orange and tangerine juices

The phenolic composition of sweet orange and tangerine juices was studied by statistical and chemometric techniques. The data set was made up of 57 samples (27 sweet orange juices and 30 tangerine juices) and 49 variables, which were the concentrations of the phenolic compounds determined by HPLC–DAD. Analysis of variance (ANOVA) and box-and-whisker plots of this matrix disclosed that the contents of luteolin-6,8-di-C-glucoside, apigenin-8C-glucoside-O-pentoside, apigenin-6C-glucoside-O-pentoside and isorhamnetin-7-O-rutinoside were totally discriminant between the two Citrus species. Indeed, these polyphenols were always present in sweet orange juices but were not detected in tangerine juices. Therefore, they are not useful as markers to detect sweet orange juice adulterated with tangerine juice. Moreover, these compounds were present at low concentrations. Taking into account the LOQ for each polyphenol analysed, the minimum amount of a compound required for it to be considered a marker of juice adulteration was established: a marker should be present at a concentration of at least 2.5 mg/L in the pure juice. Accordingly, for further data analysis, the data set consisted of a 57 × 7 matrix, in which the rows represented the 57 sweet orange and tangerine juices and the columns the 7 polyphenols present in the pure juices at concentrations higher than 2.5 mg/L: naringenin-7-O-rutinoside-4′-O-glucoside, eriodictyol-7-O-rutinoside, naringenin-7-O-rutinoside, hesperetin-7-O-rutinoside, isosakuranetin-7-O-rutinoside, apigenin-6,8-di-C-glucoside and the O-hexoside of ferulic acid. ANOVA performed on this data set revealed significant differences between sweet orange and tangerine juices for several variables.
The Fisher test allowed us to detect the most discriminant variables (p < 0.01) between these Citrus juices, which were naringenin-7-O-rutinoside-4′-O-glucoside, eriodictyol-7-O-rutinoside, naringenin-7-O-rutinoside and apigenin-6,8-di-C-glucoside. However, the box-and-whisker plots of these variables showed an overlap in the concentration ranges of these compounds, so none of the measured variables was able, by itself, to discriminate the sweet orange juice samples from the tangerine ones. For this reason, it was necessary to apply multivariate data analysis in order to distinguish them.
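The univariate screening described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names and the concentration values are hypothetical, and only the F statistic is computed (the p value would be obtained from the F distribution, for instance with `scipy.stats.f_oneway`). The second function expresses the overlap criterion read from the box-and-whisker plots.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic and its degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the samples around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def ranges_overlap(a, b):
    """True if the concentration ranges of two groups overlap."""
    return min(a) <= max(b) and min(b) <= max(a)


# Hypothetical concentrations (mg/L) of one polyphenol in each class.
orange = [1.0, 2.0, 3.0]
tangerine = [2.0, 3.0, 4.0]

f_stat, df1, df2 = one_way_f([orange, tangerine])
print(f_stat, df1, df2)                   # 1.5 1 4
print(ranges_overlap(orange, tangerine))  # True
```

A variable can show a significant difference between class means and still have overlapping ranges, which is why a single compound may fail to classify individual samples.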

Principal component analysis was performed on the autoscaled 57 × 7 data set. The first two principal components accounted for 66 % of total system variability. The two-dimensional plot of the sample scores in the space defined by the first principal component (PC1, 48 % of total variability) versus the second principal component (PC2, 19 % of total variability) indicated a natural separation of sweet orange and tangerine juices, even though the Salustiana (SA) sweet orange samples fell in the tangerine cluster and the Satsuma (SAT) tangerine juices in the sweet orange cluster. SA concentrations of naringenin-7-O-rutinoside and isosakuranetin-7-O-rutinoside were close to those of tangerine juices, whereas SAT concentrations of naringenin-7-O-rutinoside-4′-O-glucoside, naringenin-7-O-rutinoside and isosakuranetin-7-O-rutinoside were in the range of sweet orange juices. These variables, together with apigenin-6,8-di-C-glucoside, were the most influential variables on PC1. Apigenin-6,8-di-C-glucoside was present at particularly high concentrations in tangerine hybrid varieties (Fortuna and Clemenvilla).
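For readers unfamiliar with the procedure, autoscaling followed by PCA can be sketched with NumPy as below. This is an illustrative outline only: the random matrix is a placeholder with the same 57 × 7 shape as the data set and does not reproduce the measured concentrations, and the function names are hypothetical.

```python
import numpy as np


def autoscale(X):
    """Centre each column to zero mean and scale it to unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pca(X_scaled, n_components=2):
    """PCA by singular value decomposition of a column-centred matrix."""
    U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of variance per PC
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T             # one column per component
    return scores, loadings, explained[:n_components]


# Placeholder data (NOT the study's measurements): 57 juices x 7 polyphenols.
rng = np.random.default_rng(0)
X = rng.normal(size=(57, 7))

scores, loadings, explained = pca(autoscale(X))
print(scores.shape, loadings.shape)  # (57, 2) (7, 2)
```

Plotting the first column of `scores` against the second gives the kind of score plot described above, and the variables with the largest absolute values in the first column of `loadings` are those with the greatest influence on PC1.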

LDA and PLS-DA were applied to the autoscaled 57 × 7 data set to produce classification models to distinguish sweet orange and tangerine juices. Of the 57 juices (prior probabilities were 0.47 for the sweet orange class (Or) and 0.53 for the tangerine class (Ta)), 45 juices (21 Or and 24 Ta) were included in the training-test set to perform threefold cross-validation, and 12 juices (6 Or and 6 Ta) in the external set to carry out the external validation of the classification models. The LDA model correctly classified all sweet orange and tangerine juices, in both the cross-validation and the external validation, using four selected variables: naringenin-7-O-rutinoside-4′-O-glucoside, naringenin-7-O-rutinoside, hesperetin-7-O-rutinoside and apigenin-6,8-di-C-glucoside. The PLS-DA model, using 3 PLS components and the boundary at 0.595, obtained similarly satisfactory results: only one sample was misclassified in the cross-validation. According to the weighted regression coefficients, the most influential variables in the PLS-DA model were naringenin-7-O-rutinoside-4′-O-glucoside, apigenin-6,8-di-C-glucoside, isosakuranetin-7-O-rutinoside and naringenin-7-O-rutinoside. The fact that models obtained by different techniques were based largely on the same variables implies that the results were reliable and not random. The results obtained by pattern recognition techniques showed that the phenolic composition of sweet orange and tangerine juices contains adequate information to achieve their differentiation.
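The sample bookkeeping behind this validation scheme can be sketched as follows. The class sizes are those given above; the stratified assignment of samples to folds, the random seed and the function names are assumptions made for illustration, since the text does not state how the folds were drawn.

```python
import random


def class_priors(labels):
    """Prior probability of each class, taken as its share of the samples."""
    n = len(labels)
    return {c: labels.count(c) / n for c in sorted(set(labels))}


def stratified_folds(labels, k=3, seed=0):
    """Split sample indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for c in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class out to the folds in turn.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


all_labels = ["Or"] * 27 + ["Ta"] * 30
priors = class_priors(all_labels)
print({c: round(p, 2) for c, p in priors.items()})  # {'Or': 0.47, 'Ta': 0.53}

train_labels = ["Or"] * 21 + ["Ta"] * 24
folds = stratified_folds(train_labels, k=3)
print([len(f) for f in folds])  # [15, 15, 15]
```

With 21 Or and 24 Ta samples, each of the three folds holds 7 sweet orange and 8 tangerine juices, so every cross-validation round tests on a class mix close to that of the full set.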

Adulteration of sweet orange juices with tangerine juices

Regarding the adulteration issue, PLS was used to develop predictive models to estimate the percentage of tangerine juice in sweet orange juice. For this purpose, a new data set was built from the phenolic concentrations in pure sweet orange and tangerine juices in order to estimate the phenolic composition of sweet orange juices adulterated with tangerine juice at the following percentages: 10, 20, 30, 50 and 70 %. The data set consisted of 280 adulterated juice samples and the 7 phenolic compounds present in the pure juices at concentrations higher than 2.5 mg/L. The composition of the adulterated juices was theoretically calculated from the pure juice values, using the average of three analysed replicates for each cultivar and harvest. The tangerine cultivars used for adulteration were Clemenule, Clementina-Hernandina and Clementina-Marisol. A Navel-Late sweet orange juice that appeared as an extreme sample in a preliminary PCA study was not included in the data set. A PLS regression model was developed using the autoscaled data matrix and leave-one-out cross-validation. Three PLS components were selected. The prediction ability of the regression model was evaluated by the root mean square error of prediction (RMSEP) in the cross-validation and the external validation (Table 6). RMSEP is an indicator of the average error in the analysis for each component and of how well the model fits the data. The overall RMSEP values in the cross-validation and the external validation were 7.5 and 7.4 %, respectively. The squared correlation coefficient (R2) of the regression model was 0.942 in the training set and 0.937 in the test set in cross-validation. The close RMSEP values in cross-validation and external validation, and the R2 values in the modelling step, indicated that the regression model was robust. The regression coefficients of the PLS model are shown in Table 7.
The most influential variable in the model is apigenin-6,8-di-C-glucoside, which was present in higher amounts in sweet orange juices than in tangerine juices, except in the tangerine hybrid varieties.
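The construction of the simulated samples and the error measure can be illustrated with a short sketch. It assumes that the composition of an adulterated juice is the volume-weighted linear combination of the two pure juices, which is one plausible reading of "theoretically calculated" rather than a documented formula; the concentration values and function names are hypothetical. RMSEP is computed with its usual definition, the square root of the mean squared difference between predicted and true percentages.

```python
from math import sqrt


def blend(orange, tangerine, tangerine_pct):
    """Composition (mg/L) of sweet orange juice containing tangerine_pct %
    tangerine juice, assuming linear mixing by volume (an assumption)."""
    f = tangerine_pct / 100.0
    return [(1.0 - f) * o + f * t for o, t in zip(orange, tangerine)]


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical mean concentrations (mg/L) of two polyphenols in pure juices.
orange = [10.0, 40.0]
tangerine = [20.0, 30.0]

levels = [10, 20, 30, 50, 70]  # adulteration percentages used in the study
simulated = {pct: blend(orange, tangerine, pct) for pct in levels}
print(simulated[50])  # [15.0, 35.0]

# Hypothetical predicted percentages for four simulated samples.
print(rmsep([10, 20, 30, 50], [12, 18, 32, 48]))  # 2.0
```

Because RMSEP is expressed in the same units as the response, a value of about 7 means that predicted tangerine percentages deviate from the true ones by roughly 7 percentage points on average.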

Table 6 RMSEP in the cross-validation and external validation of the PLS regression model to determine the percentage of tangerine juice in sweet orange juice
Table 7 Regression coefficients of PLS regression model to determine the percentage of adulteration of sweet orange juice with tangerine juice

Conclusions

The polyphenolic profiles of Citrus fruit juices from 18 cultivars of sweet orange, tangerine, lemon and grapefruit grown in Spain, identified by HPLC–DAD-ESI–MS/MS [32], were quantitatively determined by HPLC–DAD. The exhaustive study of the polyphenolic composition of Citrus juices disclosed the presence of several markers in grapefruit and lemon juices, some of them reported here for the first time. Each of these characteristic markers of grapefruit or lemon allows the detection of the adulteration of other Citrus fruit juices with grapefruit and/or lemon juice. Sweet orange and tangerine juices showed characteristic compositions regarding certain polyphenols; however, these compounds were not suitable for detecting juice adulteration. Therefore, the polyphenolic profiles of sweet orange and tangerine juices were submitted to multivariate data analysis, considering only the polyphenols present in these Citrus juices in amounts higher than 2.5 mg/L. LDA and PLS-DA provided classification models that correctly identified the sweet orange and tangerine juices (all samples with LDA, and all but one in the cross-validation with PLS-DA). Moreover, PLS afforded a regression model to determine the percentage of tangerine juice used to adulterate sweet orange juice. The PLS regression model allowed the successful detection of adulteration at the 10–70 % level with a prediction error (RMSEP = 7 %) suitable for screening purposes. Although more studies and a comprehensive external validation with real adulterated samples are required, the regression model presented here seems promising for detecting the adulteration of sweet orange juice with tangerine juice.