Introduction

Honey has been considered a natural therapeutic agent since antiquity and has been used as a drug in traditional medicine for its antibacterial, antioxidant, anti-inflammatory and antitumoral properties. The beneficial role of honey is attributed to its antioxidant activity, exerted by several compounds such as flavonoids (chrysin, pinocembrin, pinobanksin, quercetin, kaempferol, luteolin, galangin, apigenin, hesperetin, myricetin), phenolic acids (caffeic, p-coumaric, ferulic, ellagic, chlorogenic) and phenolic derivatives.

The amount and type of antioxidants in honey depend largely upon the floral source or variety of the honey, and a correlation between antioxidant activity and total phenolic content has been demonstrated. Therefore, these metabolites may be used as potential markers to identify the botanical origin of honey and to assess its quality, as their concentration is strongly correlated not only with antioxidant capacity but also with antimicrobial effects [1, 2].

Several phenolic compounds have been identified as markers of honey botanical origin: hesperetin for citrus honey, caffeic and coumaric acids for chestnut honey, abscisic acid for strawberry tree honey, and ellagic acid and kaempferol for acacia honey [3, 4].

Di Marco et al. [3] demonstrated that each Italian monofloral honey analysed was characterized by a specific metabolic profile linked to certain flowerings. The total antioxidant capacity was attributed to the overall synergistic effect of the different phenols, and dark honeys proved to have stronger antiradical properties than light ones as a consequence of their higher concentration of phenolic compounds.

Perna et al. [5] described the phenolic composition of four Italian monofloral honeys (sulla, chestnut, eucalyptus and citrus) and of multifloral honey samples. Multifloral honey proved to have a high concentration of antioxidant compounds, while chestnut honey was the richest among the unifloral varieties.

Petretto et al. [6] evaluated several phenolic acids and flavonoids in unifloral honeys from Sardinia; among them, strawberry tree honey showed exceptionally high antioxidant properties and phenolic content.

Chemometric tools can be a reliable aid for exploring large amounts of complex data and for extracting and maximizing information for classification and authentication purposes [7].

Paliuc et al. [8] confirmed that phenolic compounds and phenolic acids can be used to authenticate Romanian honeys, and in the work by Ciucure and Gaena [9] phenolic acids and flavonoids made it possible to distinguish Romanian honeydew and polyfloral honeys from acacia and rape honey by principal component analysis (PCA). Chemometric techniques such as PCA, linear discriminant analysis (LDA) and hierarchical cluster analysis (HCA) effectively classified four unifloral honey types from the Kashmir valley using macro mineral (K, Ca, Na and P) and polyphenolic compound data [10].

Oroian and Ropciuc [11] used physico-chemical parameters and phenolic compound content to classify Romanian acacia, sunflower, linden, polyfloral and honeydew honey samples by PCA, LDA and artificial neural networks (ANN), but in that study no phenolic compounds were identified as chemical markers of floral origin. Ciulu et al. used four physico-chemical parameters to classify Sardinian unifloral honeys with an LDA model that reached a predictive accuracy higher than 95% [7].

LDA was also successfully applied to classify commercial honey samples of various geographical origins and floral sources on the basis of their phenolic compounds [12].

The aim of our study was to quantify the total phenolic content and the individual polyphenols and to evaluate the antioxidant activity of Italian honeys from different floral origins. The correlations between the analysed parameters were also studied, and chemometric tools were applied to the data to identify potential markers of floral origin that might be useful for authentication. Physicochemical parameters were also examined to assess the overall quality of the honey.

Materials and methods

Samples

A total of 48 commercial honey samples of different brands and from diverse floral sources were collected from the supermarket in Rome in 2020. The floral origin was assigned on the basis of the label declaration; the honey types were acacia, orange, honeydew, chestnut, strawberry tree, sulla, eucalyptus, dandelion, linden and polyfloral.

The honey samples originated from different geographical regions: Lazio, Sicily, Sardinia, Friuli Venezia Giulia and Trentino Alto Adige. Samples were stored at room temperature in a dark place until analysis. The storage time was less than a month.

Chemicals and materials

Acetonitrile, methanol (HPLC grade), sodium carbonate, hydrochloric acid, and Folin–Ciocalteu reagent were purchased from Merck (Darmstadt, Germany).

2,2′-Azino-bis(3-ethylbenzothiazoline-6-sulphonic acid) diammonium salt (ABTS), potassium persulphate, 2,2-diphenyl-1-picrylhydrazyl (DPPH), the cis,trans-abscisic acid standard, the flavonoid standards rutin (quercetin 3-O-rutinoside), luteolin, quercetin, apigenin, kaempferol, naringenin, hesperetin and chrysin, and the phenolic acid standards gallic acid, ellagic acid, syringic acid, caffeic acid, chlorogenic acid and coumaric acid were all purchased from Sigma–Aldrich (Steinheim, Germany).

Physicochemical parameters

The pH of the samples was determined according to the International Honey Commission (IHC) [13], and total acidity was determined with the volumetric method [14]. Hydroxymethylfurfural (HMF) was determined by an HPLC/UV method according to the IHC [13].

The colour intensity, defined as the net absorbance (the difference between Abs450 and Abs720), was determined as described by Beretta et al. [15].

Determination of antioxidant capacity and total phenolic content

Samples were prepared according to a slightly modified version of the method proposed by Lamien-Meda et al. [16]. The total phenolic content (TPC) was determined spectrophotometrically with the Folin–Ciocalteu method reported by Singleton et al. [17], with some modifications; results were expressed as mg of gallic acid equivalents (GAE) per 100 g of honey.

The antioxidant capacity (AC) of the honey extracts was evaluated by two assays that measure the radical scavenging activity (RSA) towards two different radicals: 2,2-diphenyl-1-picrylhydrazyl (DPPH) and 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid) diammonium salt (ABTS), as described by Gasic et al. [18] and Jasicka-Misiak et al. [19].

RSA was calculated for both radicals as a percentage of radical discolouration.
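The percentage of radical discolouration is usually computed from the absorbance of the radical solution measured without and with the honey extract. The sketch below assumes that common expression, since the exact formula is not stated here; the function name and the absorbance readings are hypothetical.

```python
def radical_scavenging_activity(abs_control: float, abs_sample: float) -> float:
    """Return RSA (%) as the relative loss of radical absorbance.

    abs_control: absorbance of the radical solution without honey extract.
    abs_sample:  absorbance after reaction with the honey extract.
    Assumed formula: RSA% = (A_control - A_sample) / A_control * 100.
    """
    if abs_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (abs_control - abs_sample) / abs_control * 100.0


# Hypothetical readings, for illustration only (not data from this study).
rsa = radical_scavenging_activity(abs_control=0.80, abs_sample=0.20)
print(f"RSA = {rsa:.1f}%")
```

The same function applies to both the DPPH and the ABTS assay, since only the radical and the reading wavelength change.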

Identification and quantification of individual phenolic compounds

Extraction of polyphenols for chromatographic analysis

Honey samples were treated by the SPE extraction method described in Michalkiewicz et al. [20].

HPLC/DAD analysis

HPLC analyses were performed on a Shimadzu HPLC system, a LC-10AT liquid chromatograph equipped with four pumps FCV-10AL, a degasser DGU-14A, a Rheodyne 7725i injector with a 20 μL sample loop (Rheodyne, Berkeley, CA, USA) and a photodiode array detector SPD-M20A.

The column was a C18 Kinetex (150 mm × 4.6 mm, 5 μm particle size) fitted with a guard column, both supplied by Phenomenex (Torrance, CA, USA).

The mobile phase consisted of 2% acetic acid in water (solvent A) and acetonitrile (solvent B). The final optimized conditions were as follows: a linear gradient from 3% B to 45% B in 55 min, then an increase to 100% B in 60 min, held for 2 min, at a constant flow rate of 0.7 mL/min and at room temperature (25 °C). The eluted compounds were monitored at 280, 320 and 350 nm, and absorption spectra were recorded between 200 and 600 nm.

Phenolic compounds were identified by comparing the retention times and spectral characteristics of unknown analytes with those of reference standards. The HPLC method was linear for all analytes over a range of three orders of magnitude. The limits of detection and quantification were in the range of 0.03–0.42 mg/L and 0.10–1.39 mg/L, respectively. The calculated recovery of each phenolic compound ranged from 92 to 99%. A typical HPLC chromatogram of the phenolic compounds studied is shown in Fig. 1.

Fig. 1 HPLC chromatograms of an acacia honey sample: 1 Gallic acid, 2 Chlorogenic acid, 3 Caffeic acid, 4 Coumaric acid, 5 Ellagic acid, 6 Rutin, 7 Abscisic acid, 8 Quercetin, 9 Luteolin, 10 Naringenin, 11 Hesperetin, 12 Kaempferol, 13 Apigenin, 14 Chrysin. Wavelengths used: 280, 320 and 350 nm

Statistical analysis

Each chemical parameter was measured in triplicate, and the data were expressed as mean values ± standard deviations. Analysis of variance (ANOVA) was used to evaluate significant differences among honey types with regard to phenolic composition and bioactive properties, and the Tukey–Kramer test was used to discriminate among honey categories (p ≤ 0.05). Pearson's correlation coefficients between the different parameters were also calculated to highlight associations between variables. PCA and HCA with Ward's clustering were carried out as unsupervised data analysis techniques to evaluate the ability of the phenolic profile, in combination with bioactive properties, to classify honey samples according to their floral origin. Starting from these results, a linear discriminant analysis (LDA) model was constructed. All computations were performed using V-PARVUS [21].
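As an illustration of the first step of this workflow, the one-way ANOVA F statistic can be computed from the between-group and within-group sums of squares. This is a minimal sketch in plain Python, not the V-PARVUS procedure; the function name and the toy values are hypothetical, and the p-value and the Tukey–Kramer post hoc comparison are left to a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of measurements per honey type.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy triplicates for three hypothetical honey types (not study data).
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom indicates that at least one honey type differs from the others, which is what the post hoc test then resolves pairwise.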

Results and discussion

Physicochemical parameters

The physicochemical parameters investigated were pH, free acidity, lactones, total acidity, HMF and colour, since they are important for the quality control of honey. Results are displayed in Table 1. In the samples studied, pH values ranged from 3.51 (sulla honey) to 4.64 (chestnut honey), in accordance with those previously reported for Italian honeys [14]. No sample had a free acidity greater than the limit set by EU Directive 63/2014 (50 meq/kg); the maximum value was found in honeydew honey (33.91 meq/kg). These results confirm the good quality and preservation of the samples, with no deterioration from fermentative processes. The total acidity of honey must be evaluated as the sum of free and lactone acidities, as glucono-δ-lactone hydrolyses to gluconic acid, the major contributor to honey acidity [14]. No significant differences emerged among the samples; the maximum value, 44.4 meq/kg, was found in honeydew honeys.

Table 1 Physicochemical parameters of the honey samples analyzed

HMF values ranged from 8.74 mg/kg in sulla honeys to 26.00 mg/kg in eucalyptus honeys, and no sample exceeded the EU legal limit of 40 mg/kg. HMF is formed by the degradation of fructose, and its level depends largely on poor storage and overheating [8]; its large variability in the samples analysed can therefore be attributed to differences in geographical origin and producer, besides floral origin.

The colour intensity of honey is linked to the formation of Maillard reaction products during storage but also to the presence of pigments, such as polyphenols and carotenoids, and therefore to the floral origin of the honey, so that it can be used in the classification of unifloral honeys [22]. Colour has been reported to correlate with the total phenolic content and antioxidant capacity of honey, with darker honeys often having higher antioxidant properties [23, 24]. Can et al. [25] explained that honey colour depends mostly on nectar sources and pollen contents, which contain various colour pigments (i.e. anthocyanins, phenolic acids, proanthocyanidins and flavonoids) and mineral constituents.

In this paper, honeys were classified as dark if their colour intensity was above 200 mAU. Eucalyptus, chestnut, strawberry tree, dandelion and honeydew honeys were therefore “Dark”, while polyfloral, sulla, acacia, orange and linden honeys were “Light”.
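The dark/light rule can be sketched in a few lines. The sketch assumes that colour intensity is the net absorbance, taken here as the difference between the readings at 450 and 720 nm converted from AU to mAU; the function names and the readings are hypothetical, and only the 200 mAU threshold comes from the text.

```python
DARK_THRESHOLD_MAU = 200.0  # threshold stated in the text


def colour_intensity_mau(abs_450: float, abs_720: float) -> float:
    """Net absorbance in mAU, assumed here to be (A450 - A720) * 1000."""
    return (abs_450 - abs_720) * 1000.0


def colour_class(intensity_mau: float) -> str:
    """Label a honey 'Dark' above the 200 mAU threshold, otherwise 'Light'."""
    return "Dark" if intensity_mau > DARK_THRESHOLD_MAU else "Light"


# Hypothetical absorbance readings in AU (not data from this study).
readings = {"sample_a": (0.950, 0.050), "sample_b": (0.150, 0.020)}
for name, (a450, a720) in readings.items():
    intensity = colour_intensity_mau(a450, a720)
    print(name, round(intensity), colour_class(intensity))
```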

Total phenolic and antioxidant capacity assays

Total phenolic content was estimated in the samples studied, and the results are displayed in Table 2. The highest total phenol content was found in strawberry tree honey (102.81 mg GAE/100 g), followed by honeydew, chestnut and eucalyptus (47.89 mg GAE/100 g) honeys. The other honey types were more similar to each other, with no statistical differences, and ranged from acacia (27.23 mg GAE/100 g) to polyfloral (33.34 mg GAE/100 g).

Table 2 Antioxidant capacity and total phenolic content of the honey samples analyzed

These results confirm data previously reported in the literature, which revealed a high presence of phenolic compounds in strawberry tree and honeydew honey [6, 7, 26]. Previously reported findings on the correlation between total phenol content and colour intensity were confirmed (r² = 0.80, p < 0.05), with darker honeys having a higher phenolic content and antioxidant capacity [23].

Antioxidant capacity showed a strong correlation with colour intensity (0.79 for the DPPH test and 0.84 for the ABTS test, p < 0.05) and with total phenol content (0.83 for the DPPH test and 0.97 for the ABTS test, p < 0.05), with strawberry tree, chestnut, honeydew and eucalyptus honeys showing the highest antioxidant power. Honey colour can therefore be considered a suitable indicator of the antioxidant characteristics of honey.
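Correlation coefficients of this kind follow from the standard Pearson formula, the covariance of two variables divided by the product of their standard deviations. The sketch below is a plain-Python illustration with a hypothetical function name and toy values; it does not reproduce the study data or the software used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Toy values (not study data): colour intensity vs. an antioxidant reading.
colour = [120.0, 180.0, 260.0, 400.0, 650.0]
antioxidant = [21.0, 30.0, 38.0, 55.0, 80.0]
r = pearson_r(colour, antioxidant)
print(f"r = {r:.2f}, r squared = {r * r:.2f}")
```

Squaring r gives the coefficient of determination, which is why a value reported as r² is always lower in magnitude than the corresponding r unless the relation is perfectly linear.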

Regarding the profile of the 15 individual polyphenols determined by HPLC/DAD, the results are presented in Table 3.

Table 3 Polyphenols content in different types of Italian honey

The six phenolic acids were present in all the honey varieties studied, except for syringic acid in acacia, chestnut and linden honeys, ellagic acid in honeydew and sulla honeys, and caffeic acid in sulla honey. The most abundant acids were chlorogenic and caffeic, in particular caffeic acid in chestnut (14.74 mg/kg), linden (17.73 mg/kg) and polyfloral (14.32 mg/kg) honeys.

Appreciable amounts of caffeic and coumaric acids in chestnut honey were reported by Tomas Barberan et al. [27], with values higher than those found in this study (30–150 mg/kg and 10–100 mg/kg, respectively), while those reported by Perna et al. [5] in Italian honeys were quite similar.

A high content of these two organic acids also emerged in honeydew samples, with higher values than in other papers [3, 27], probably linked to their different botanical and geographical origin.

Linden honeys had a high concentration of caffeic acid, not reported elsewhere, probably because this honey variety has not been widely studied. Ellagic acid, the dimer of gallic acid, was present in strawberry tree and eucalyptus honeys in quantities comparable to those reported for heather honey (3–11 mg/kg) by Tomas Barberan et al. [27]. Those authors assigned this compound marker status for heather honey because it was seldom found in individual samples of other floral origins.

The concentration of the 7 flavonoids ranged from 0.24 mg of hesperetin per kg of acacia honey to 4.49 mg of luteolin per kg of strawberry tree honey. Luteolin was not detected in chestnut, sulla and linden honeys, quercetin in chestnut and strawberry tree honeys, rutin in linden honey, and hesperetin and apigenin in chestnut honey. Literature data show a strong variability of these compounds in honey varieties from different geographical origins, which can be attributed to differences in floral diversity and maturity [5, 6]. This aspect is particularly evident in polyfloral honeys, which show the largest variability.

In citrus honey, hesperetin is considered a marker of botanical origin [27, 29], and the results of this paper confirm this evidence, with orange honeys showing a mean content of this compound more than double that of the other honey types.

The hesperetin detected in multifloral honeys is attributable to the simultaneous presence of pollen and nectar from different botanical species, in particular officinal herbs, in which the concentration of this flavonoid is high [30].

Strawberry tree honey also differed from the other samples in its relevant content of rutin and luteolin. To our knowledge, this is the first paper reporting rutin in this honey type, while luteolin was also determined by Petretto et al. [6] at a similar level.

Abscisic acid is a plant hormone involved in the protection of plants against drought and environmental stress. Its presence and amount could be used as a complementary means to determine the botanical origin of honey [27]. Abscisic acid was present in all honeys analysed, ranging from 0.99 mg/kg in sulla honey to 8.62 mg/kg in linden honey. It was also present in quite large amounts in strawberry tree honey, although at lower values than those reported by Tuberoso et al. [31].

Honeydew samples showed significantly higher levels of kaempferol, apigenin and chrysin, with values much higher than those previously reported [9, 28], which can be explained by the different geographical and botanical origin of the samples analysed. Furthermore, the presence and quantity of apigenin and chrysin in honey depend on the degree of contamination of the honey with propolis [27, 32].

Significant positive correlations with antioxidant capacity were found for syringic, ellagic and coumaric acids in both the DPPH (r = 0.69, 0.67 and 0.73, respectively; p < 0.05) and ABTS (r = 0.88, 0.83 and 0.79, respectively; p < 0.05) tests.

Among the flavonoids, only luteolin and rutin showed a strong positive correlation with antioxidant capacity in both tests (luteolin vs DPPH test r = 0.79, vs ABTS test r = 0.87; p < 0.05). The same results were obtained for the correlation with total phenol content, except for rutin, whose positive correlation was not significant.

The correlations observed can also be related to the chemical interactions that usually occur in complex food matrices such as honey, for example synergism or antagonism among phenolic compounds.

Multivariate analysis

Unsupervised pattern recognition techniques such as cluster analysis (CA) and principal component analysis (PCA) were used to evaluate the data matrices in order to highlight natural groupings among samples. Prior to the chemometric analysis, the data for phenolic compounds, antioxidant capacity, total phenolic content and colour intensity were autoscaled to standardize the statistical importance of all responses. A matrix of samples (n = 48) and response variables (n = 20) was then built, with samples as rows and variables as columns.
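Autoscaling means centring each variable on its mean and dividing by its standard deviation, so that variables measured on very different scales carry equal weight. A minimal sketch follows, assuming the sample standard deviation (n − 1 denominator) and a samples-by-variables layout; the function name and the toy matrix are hypothetical.

```python
from statistics import mean, stdev


def autoscale(matrix):
    """Column-wise z-scores for a matrix with samples as rows.

    Each variable (column) ends up with mean 0 and unit sample
    standard deviation.
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    if any(sd == 0 for sd in sds):
        raise ValueError("a constant variable cannot be autoscaled")
    return [
        [(value - m) / sd for value, m, sd in zip(row, means, sds)]
        for row in matrix
    ]


# Toy matrix (not study data): 3 samples x 2 variables on different scales.
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
for row in autoscale(toy):
    print(row)
```

After scaling, both toy columns hold the same values, which shows how a variable expressed in large units no longer dominates distance-based methods such as CA or variance-based methods such as PCA.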

The dataset was then subjected to LDA, a supervised statistical technique used to find a linear combination of features that characterizes or separates two or more classes of objects, with the aim of classifying the honey samples according to botanical origin.

Cluster analysis

The first multivariate approach used was cluster analysis to verify whether the collected data would be able to identify groups among the honey samples. For CA, sample similarities were calculated on the basis of squared Euclidean distance, and the Ward hierarchical agglomerative method was used to group samples into clusters.
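The agglomerative procedure can be illustrated with a naive implementation of Ward's criterion, in which the two clusters whose fusion causes the smallest increase in within-cluster sum of squares are merged at each step. This is a sketch under simplifying assumptions, not the software routine used in the study: it stops at a chosen number of clusters instead of building a dendrogram, and all names and points are hypothetical.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Mean point of a cluster."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def ward_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares if the clusters merge."""
    na, nb = len(cluster_a), len(cluster_b)
    dist = squared_euclidean(centroid(cluster_a), centroid(cluster_b))
    return na * nb / (na + nb) * dist


def ward_clustering(points, n_clusters):
    """Merge the cheapest pair repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Toy autoscaled samples (not study data): two obvious groups.
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
print(ward_clustering(toy, n_clusters=2))
```

For two single samples the merge cost reduces to half their squared Euclidean distance, which links the criterion to the distance measure named above.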

The result of the cluster analysis, presented as a dendrogram (Fig. 2), shows the honey samples grouped into 9 clusters at a similarity level of 0.5, ordered from the richest in antioxidants to the poorest. Clusters one, three, four, seven and nine each contain honey samples of a single floral type (strawberry tree, honeydew, chestnut, eucalyptus and orange). The second cluster includes both eucalyptus and dandelion samples; the fifth includes both linden and polyfloral honeys; the sixth contains all the acacia samples but also two dandelion samples; and the eighth contains all the sulla samples plus one polyfloral and one dandelion sample.

Fig. 2 Dendrogram obtained by cluster analysis performed using the single Ward algorithm (Euclidean distance) on the data obtained on honey samples grouped by botanical origin. STRAW strawberry tree honey, EUC eucalyptus honey, DAN dandelion honey, HDEW honeydew honey, ORA orange honey, CHES chestnut honey, POLY polyfloral honey, LIN linden honey, SULLA sulla honey, ACA acacia honey

This preliminary analysis made it possible to visualize the differences in antioxidant properties and compounds between honey types, and it was immediately evident that dandelion, polyfloral and eucalyptus honeys were not unequivocally classified.

Principal component analysis (PCA)

PCA was applied to separate the samples according to the phenolic composition, colour, and antioxidant activity.

Since the PCA performed on the total dataset did not yield a two-dimensional graph able to visually distinguish the natural grouping of the different honey types, the dataset was divided in two: one subset included the honeys classified as dark (colour intensity > 200 mAU) and the other the light honeys.

Fig. 3A therefore shows the PCA plot for dark honeys (eucalyptus, chestnut, strawberry tree, dandelion and honeydew), while Fig. 3B shows the PCA plot for light honeys (polyfloral, sulla, acacia, orange and linden).

Fig. 3 Biplot of the first two components obtained by PCA performed with data obtained for honey samples grouped by botanical origin. Graph A: dark honey, samples with colour intensity above 200 mAU. Graph B: light honey, samples with colour intensity under 200 mAU. STRAW strawberry tree honey, EUC eucalyptus honey, DAN dandelion honey, HDEW honeydew honey, ORA orange honey, CHES chestnut honey, POLY polyfloral honey, LIN linden honey, SULLA sulla honey, ACA acacia honey

In the PCA of Fig. 3A, principal component 1 (PC1) explained 30.38% of the total variance and PC2 explained 17.73%. Thus, the two-dimensional graph explained 48.11% of the variability in the experimental data.

The variables that contributed most to PC1 were the sum of individual polyphenols, syringic, caffeic, chlorogenic, ellagic and abscisic acids, apigenin, luteolin and rutin. PC2 was associated with the antioxidant capacity assays, TPC, colour intensity, kaempferol, chrysin and coumaric acid.

Strawberry tree honeys, characterized by a high content of abscisic and syringic acids, luteolin and rutin, were clearly separated along PC1 from chestnut honeys on the opposite side, with their high content of caffeic acid, naringenin and quercetin. Honeydew honeys, characterized by kaempferol, coumaric acid and chrysin, were grouped along PC2, as were eucalyptus and dandelion honeys, which were characterized by gallic acid and by lower values in the antioxidant capacity tests and TPC than the other dark honeys.

The PCA carried out on light honeys revealed only two groups clearly separated along PC1, which accounted for 33.79% of the variability out of a total explained variance of 58.38%: linden honeys, with a high content of chlorogenic, caffeic, coumaric and abscisic acids and high TPC, and, on the opposite side of the plot, sulla honeys, with high DPPH and gallic acid and low luteolin, naringenin and apigenin. As seen in HCA, polyfloral honeys were scattered over the whole graph, revealing the high variability among samples derived from diverse floral sources. Acacia and orange honey samples were gathered in the lower-left area of the graph, characterized by high values of hesperetin, kaempferol and rutin and low values of colour intensity, TPC and antioxidant capacity. These samples were not definitively separated into two groups and so could not be clearly classified.
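The variance figures quoted for the two PCA models can be checked with simple arithmetic. In the sketch below the percentages come from the text, while the PC2 share for light honeys is inferred on the assumption that the 58.38% total refers to the first two components only; the function name is hypothetical.

```python
def cumulative_variance(*components_pct: float) -> float:
    """Sum the percentages of variance explained by the given components."""
    return sum(components_pct)


# Dark honeys: PC1 and PC2 percentages as reported in the text.
dark_total = cumulative_variance(30.38, 17.73)

# Light honeys: PC1 and the total are reported; PC2 is inferred here,
# assuming the total covers the first two components only.
light_pc2 = 58.38 - 33.79

print(f"dark honeys, PC1 + PC2: {dark_total:.2f}%")
print(f"light honeys, inferred PC2: {light_pc2:.2f}%")
```

In both models roughly half of the total variance lies outside the two-dimensional plot, which is consistent with the incomplete visual separation of some honey types.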

Polyphenols mostly originate from the nectar collected by bees and depend greatly on the floral sources, so different polyphenol profiles can reasonably be linked to the floral origin of honey [12]. After a preliminary discrimination based on the colorimetric results, the PCA of dark honeys shows that caffeic acid and naringenin can be suggested as markers of chestnut honey, while kaempferol, chrysin and coumaric acid represent markers of honeydew honey. Rutin, luteolin, syringic acid and abscisic acid characterize strawberry tree honeys. For light honeys, gallic acid and high antioxidant properties are representative of sulla honey, hesperetin has been confirmed as a marker for orange honey, and the organic acids coumaric, chlorogenic and caffeic can be regarded as markers for the poorly studied linden honey.

Linear discriminant analysis (LDA)

Starting from the encouraging results of the exploratory multivariate analyses (CA and PCA), LDA was performed on the entire dataset in order to obtain a mathematical model able to correctly classify the honey types under study. The LDA model was validated with a leave-one-out cross-validation (LOOCV) procedure. The polyphenol contents, antioxidant capacity tests, TPC and colour were used as independent variables, and the honey types were used as responses.
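The leave-one-out procedure itself is independent of the classifier: each sample is held out in turn, the model is fitted on the remaining samples, and the held-out sample is predicted. The sketch below illustrates the loop with a nearest-centroid rule as a deliberately simple stand-in; it is not LDA, and all names and data are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT LDA): assign the class with the closest mean."""
    by_class = {}
    for row, label in zip(train_x, train_y):
        by_class.setdefault(label, []).append(row)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in by_class.items()
    }

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: dist2(centroids[label], query))


def loocv_accuracy(x, y, predict):
    """Fraction of samples predicted correctly when each is held out once."""
    hits = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_x, train_y, x[i]) == y[i]
    return hits / len(x)


# Toy autoscaled data (not study data): two well separated honey types.
x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
y = ["A", "A", "A", "B", "B", "B"]
print(f"LOOCV accuracy: {loocv_accuracy(x, y, nearest_centroid_predict):.2f}")
```

Passing the prediction function as an argument keeps the validation loop unchanged if the stand-in is replaced by an actual LDA implementation.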

The LDA model proved to be more accurate and efficient than PCA and CA in distinguishing among honey varieties, with a classification ability of 100% and a prediction capacity of 95.8%, as shown in Table 4 and Fig. 4.

Table 4 Confusion matrices of calibration (top) and prediction (bottom) for the LDA model based on phenolic compounds, antioxidant properties and colour, according to a leave-one-out cross-validation procedure, with % of correct classifications or predictions

Fig. 4 2D scatter plot (root 1 vs. root 2) using linear discriminant analysis to classify honey of different botanical origin based on phenolics contents, antioxidant properties and color intensity

Only the polyfloral category was not satisfactorily classified, with a prediction score of 71.43%, which affected the total prediction rate of the model. The misclassification concerned two polyfloral samples that were classified as linden honey. These results were expected, given the large variability in the floral origin of polyfloral honey samples [12].

Reducing the number of variables resulted in a poorer prediction rate of 85.4% when only polyphenols were considered. The prediction rate was 75% for sulla and 80% for acacia, and decreased to 28.6% for polyfloral honey.

Better results were achieved when polyphenols and colour were considered, with an overall prediction rate of 87.5% and a prediction rate for polyfloral honey that increased to 42.9%.
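The reported prediction rates can be related to sample counts with simple arithmetic. In the sketch below only the total of 48 samples and the two misclassified polyfloral samples come from the text; the number of polyfloral samples and the counts of correct predictions are inferred from the percentages and should be read as assumptions.

```python
def percent(correct: int, total: int) -> float:
    """Percentage of correctly predicted samples."""
    return 100.0 * correct / total


N_SAMPLES = 48     # stated in the text
N_POLYFLORAL = 7   # inferred: 2 misclassified and 71.43% correct imply 5 of 7

# Inferred numbers of correct predictions for each set of variables.
overall_correct = {
    "all variables": 46,
    "polyphenols only": 41,
    "polyphenols + colour": 42,
}
polyfloral_correct = {
    "all variables": 5,
    "polyphenols only": 2,
    "polyphenols + colour": 3,
}

for model in overall_correct:
    overall = percent(overall_correct[model], N_SAMPLES)
    poly = percent(polyfloral_correct[model], N_POLYFLORAL)
    print(f"{model}: overall {overall:.1f}%, polyfloral {poly:.1f}%")
```

With all variables, the two polyfloral errors account for every misprediction in the model, since 46 of 48 correct predictions leave exactly two errors.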

Although a limited number of samples was used, LDA appears to be a suitable approach to discriminate Italian unifloral honey samples according to their phenolic composition, antioxidant properties and instrumental colour. In terms of prediction ability, the method led to a good classification of the samples according to their botanical origin, with 95.8% of the samples correctly classified. Discrimination of the unifloral honeys studied (acacia, orange, strawberry tree, honeydew, chestnut, eucalyptus and linden) was more accurate than that of polyfloral honey samples, owing to the large variability of the latter in floral composition and, consequently, in antioxidant composition.

Thus, more samples should be used to validate the proposed model and other statistical approaches should be applied to find the key markers for each honey type.

Conclusions

Characterization of the antioxidant capacity and of the polyphenol profile and content of honey is important to determine its health benefits.

In the present study, samples of 9 Italian unifloral honeys and of polyfloral honey were examined in order to identify possible markers of their botanical origin for authentication purposes. Polyphenols and antioxidant capacity were chosen as promising parameters for this goal.

The results confirmed that the content and type of phenolic compounds in honey and its antioxidant effectiveness are linked to floral origin, as they are strictly correlated with the chemical composition of nectar and pollen, which is highly variable among plant species.

The application of an LDA model succeeded in classifying the 9 unifloral honeys according to their botanical origin, as they differed greatly in individual polyphenol content and colour attributes. Polyfloral honeys, owing to their natural variability in pollen origin, showed a high level of misclassification and were not correctly classified by the proposed model.

The current study has highlighted the role of colour attributes in honey quality assessment: a good correlation was found between the antioxidant parameters and colour intensity, and dark-coloured honeys were confirmed to have higher antioxidant activity and phenolic content. Among them, strawberry tree honey and honeydew honey had the highest values.

Furthermore, colour data proved to be a key parameter for discriminating among unifloral honeys and might represent a useful starting point for the development of a preliminary spectrophotometric screening method when a large number of varieties is considered.