Introduction

Over recent decades, consumer demand for minimally processed natural foods has grown rapidly as part of a healthy and conscious lifestyle. For that reason, the food and pharmaceutical industries are increasingly interested in superfoods as drug alternatives and as functional or dietary ingredients [1]. Propolis is one of the important api-products, known for its nutritional and therapeutic value. It is a highly complex material reported to contain hundreds of compounds that contribute to its bio-functionalities [2]. Propolis generally comprises 40–70% resin and/or vegetable balsam, which carries high concentrations of phytochemicals such as phenolic acids, prenylated benzophenones, flavonoid glycosides, flavonoid aglycones and their esters, volatile organic compounds and their esters, phenolics, sesquiterpenes, quinones, coumarins, steroids, aldehydes, alcohols, ketones, and amino acids, which are primarily responsible for its bioactivity and diverse functionalities [3].

Propolis is commercially available in various forms and formulations, such as capsules, powders, aqueous or ethanolic tinctures, and extracts [4]. According to various studies, extracts prepared with 70–80% ethanol show the highest biological and chemical activities [5]. Recently, several molecular simulation studies have indicated that flavonoid-rich propolis extract reduces viral replication. During the COVID-19 pandemic, many studies and clinical trials reported its antiviral activity against SARS-CoV-2. COVID-19 patients who consumed propolis were observed to recover faster, show earlier signs of viral alleviation, and have a lower mortality rate than other patients [6]. Several studies have also demonstrated high cytotoxicity of propolis toward tumor cell lines [7]. Furthermore, numerous investigations have provided substantial evidence of other bio-functionalities of propolis, including anti-bacterial, anti-fungal, anti-inflammatory, immunomodulatory, hepatoprotective, anti-septic, anti-carcinogenic, anti-tumor, and antioxidant properties [8].

However, information on the chemical composition of propolis from diverse origins is still scarce, which limits its use in commercial medicinal and food formulations [9, 10]. Propolis is generally categorized into seven groups (poplar, birch, green, red, clusia, pacific, and Mediterranean) based on geographical and botanical origin and on the specific marker compounds primarily responsible for its diverse bioactivities [11]. Several countries, such as Brazil, Russia, Argentina, Turkey, Ukraine, Mexico, China, Taiwan, Japan, Korea, Serbia, Poland, and Finland, have conducted extensive studies on the chemical profiling of their native propolis and have also established standards for its quality parameters, as discussed in a previous publication [3, 10].

India is a hub of diverse flora owing to its tropical and subtropical regions, and it has great potential to produce high-quality apicultural products. Nevertheless, scientific studies and information on the value of Indian propolis are inadequate [2]. Information on the chemical profile of a superfood is crucial if it is to be used as a functional ingredient in food and nutraceutical formulations. Hence, a systematic investigation of the chemical profile of propolis from India is needed.

In view of the above, the present study set out to perform an in-depth chemical characterization of Indian propolis, including qualitative and quantitative analysis by advanced chromatographic techniques. The study analyzed the polyphenolic composition, antioxidant capacity, and marker compounds of Indian propolis collected from different locations in northern India. An artificial neural network (ANN) was also employed to predict the antioxidant capacity of propolis from multiple botanical sources according to their polyphenolic content. Further, the propolis samples were differentiated using principal component analysis (PCA) and hierarchical cluster analysis (HCA) to understand the relationship between the geographical location of propolis and its polyphenolic composition. This approach could support rapid screening of the geographical origin of propolis from its antioxidant potential or polyphenolic composition. The detailed data may thus aid further food formulation and processing and assist in establishing quality parameters for Indian propolis.

Materials and methods

Chemicals, reagents, and standards

Gallic acid, rutin, quercetin, syringic acid, caffeic acid, galangin, pinocembrin, catechin, hesperidin, naringenin, kaempferol, chrysin, myricetin, benzoic acid, luteolin, cinnamic acid, apigenin, beta-carotene, tannic acid, ellagic acid, chlorogenic acid, caffeic acid phenylethyl ester (CAPE), p-coumaric acid, ferulic acid, ascorbic acid, 2,2-diphenyl-1-picrylhydrazyl (DPPH), 2,4,6-tripyridyl-S-triazine (TPTZ) reagent, and ammonium molybdate were purchased from Sigma-Aldrich (St. Louis, Missouri, USA). HPLC-grade methanol, acetonitrile, water, and formic acid for chromatographic analysis were obtained from Merck (Darmstadt, Germany). Analytical-grade reagents and chemicals such as Folin–Ciocalteu reagent, ferrous sulphate, sodium carbonate, aluminium chloride, sodium nitrite, and sodium hydroxide were bought from Loba Chemie (Mumbai, India).

Propolis samples collection

The bee propolis samples (n = 30) were collected from different geographical locations in the Indian states of Himachal Pradesh, Punjab, Haryana, and Rajasthan from April to October 2019 with the help of local beekeepers in each area (Table S1). Vegetation from genera and families such as Acacia, Ocimum, Ziziphus, Azadirachta, Eucalyptus, Tagetes, Rosaceae, Morus, Mangifera, Brassicaceae, Dahlia, Trifolium, Albizia, Murraya, and Plumeria was commonly found near the apiaries. Vegetation such as Cedrus (cedar), Quercus, Pinus, Dalbergia, Anogeissus, and Rhododendron was particularly abundant near the apiaries in the Himachal Pradesh region, whereas genera such as Cucumis, Gossypium, Acacia, and Butea were observed mainly around the periphery (1–2 km) of the apiaries in the Rajasthan region. The beehives under study were fitted with propolis traps, and propolis samples were collected twice a month. The propolis traps were wrapped in polyethylene sheets and placed in a freezer at −18 °C for 2 h, after which the propolis was scraped out of the traps. The collected propolis chunks were sorted to remove impurities; further processing steps are described in a previous publication [2].

Preparation of propolis extract

Ethanolic extracts of Indian propolis (EEIP) were prepared by suspending 10 g of propolis sample in 70% ethanol (1:10, w/v) under continuous stirring for 24 h, followed by probe sonication (NexTgen Lab 500, SinapTech) at 35 ± 2 °C with 30 s on/off pulses at 30% amplitude for 30 min. Afterward, the solutions were placed in a rotary vacuum evaporator (Rotavapor R-300, BUCHI, Switzerland) at 45 °C for 1 h to remove ethanol and water, and the resulting concentrated dried propolis extracts were stored at −20 °C until further analysis.

Total phenolic content (TPC)

The total polyphenol content of EEIP was determined by the method of [12] with slight modification. An aliquot of 0.5 mL of each extract (2 mg/mL) was mixed with 5 mL of 0.2 N Folin–Ciocalteu reagent and left for 5 min. Afterward, 4 mL of 7.5% sodium carbonate was added to the mixture. The final mixtures were held for 45 min at room temperature. The absorbance of the reaction mixture was measured at 765 nm using a UV–vis spectrophotometer. The standard curve was prepared using gallic acid (0 to 2 mg/mL), and the total polyphenol content was expressed as milligrams of gallic acid equivalent per gram of ethanolic extract (mg GAE/g).
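The conversion from absorbance to mg GAE/g can be sketched as follows. This is a minimal illustration, not the authors' actual calculation: the absorbance readings are hypothetical, the function names are introduced here for illustration, and it assumes that standards and extracts go through the same reaction volumes so that concentrations can be compared on a stock-solution basis.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tpc_mg_gae_per_g(absorbance, slope, intercept, extract_mg_per_ml=2.0):
    """Convert a sample absorbance to mg gallic acid equivalent per g extract."""
    gae_mg_per_ml = (absorbance - intercept) / slope
    # mg GAE per mg extract, scaled to mg GAE per g extract
    return gae_mg_per_ml / extract_mg_per_ml * 1000.0


# Hypothetical gallic acid standards (mg/mL) and absorbances at 765 nm
std_conc = [0.0, 0.5, 1.0, 1.5, 2.0]
std_abs = [0.02, 0.27, 0.52, 0.77, 1.02]

slope, intercept = fit_line(std_conc, std_abs)
sample_tpc = tpc_mg_gae_per_g(0.12, slope, intercept)  # extract at 2 mg/mL
```

The same pattern would apply to the flavonoid and TAC assays, with quercetin or ascorbic acid standards in place of gallic acid.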

Total flavonoid content by aluminum chloride method

The total flavonoid content of EEIP was evaluated by the procedure given by [13]. About 1 mL of EEIP solution (2 mg/mL) was mixed with 1 mL of 2% AlCl3 in 50% methanol, and the mixture was left for 45 min at room temperature. The absorbance of the solution was measured at 420 nm. The standard curve was prepared using solutions of quercetin in methanol at concentrations ranging from 0 to 1 mg/mL, and the total flavonoid content was read from the calibration curve and expressed as milligrams of quercetin equivalent per gram of ethanolic extract.

DPPH radical scavenging assay

The DPPH assay of EEIP was performed following the methodology established by [14]: 0.2 mL of each EEIP was combined with 4 mL of 0.1 mM DPPH solution in 80% methanol. The mixtures were kept in the dark for 30 min. The absorbance was measured at 517 nm against a blank solution containing the same quantity of methanol and DPPH. Ascorbic acid (10–1000 μg/mL) was used as the standard, and the results were expressed as free-radical scavenging activity (FRSA, %):

$${\text{FRSA}}\left( \% \right) = \left[ {1{-}{\text{As}}/{\text{Ac}}} \right] \times 100,$$

where Ac and As are the absorbance of the control and the sample, respectively. The data were also expressed as SC50 (μg/mL), the concentration required to scavenge 50% of the DPPH radicals.
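As a worked illustration of the FRSA equation and of how an SC50 value can be read from a dose-response series, consider the sketch below. The absorbance values are hypothetical, and linear interpolation between the two bracketing concentrations is an assumption made here; the source does not state how SC50 was derived.

```python
def frsa_percent(a_sample, a_control):
    """Free-radical scavenging activity: [1 - As/Ac] x 100."""
    return (1.0 - a_sample / a_control) * 100.0


def sc50(concentrations, frsa_values):
    """Concentration giving 50% scavenging, by linear interpolation.

    Assumes concentrations are sorted in ascending order and FRSA
    increases with concentration. Returns None if 50% is not bracketed.
    """
    points = list(zip(concentrations, frsa_values))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= 50.0 <= f1:
            if f1 == f0:
                return c0
            return c0 + (50.0 - f0) * (c1 - c0) / (f1 - f0)
    return None


# Hypothetical readings at 517 nm
a_control = 0.80
conc_ug_per_ml = [100, 200, 400, 800]
a_samples = [0.64, 0.48, 0.32, 0.16]

frsa = [frsa_percent(a, a_control) for a in a_samples]
sc50_value = sc50(conc_ug_per_ml, frsa)  # lies between 200 and 400 ug/mL
```

A lower SC50 means that less extract is needed to reach 50% scavenging, which is why it is read as higher antioxidant potential.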

Ferric reducing antioxidant power (FRAP) assay

The ferric reducing antioxidant power was determined by the procedure of Kuś et al. [15] with minor modifications. The reagent was formulated by combining 20 mmol/L ferric chloride with 10 mmol/L TPTZ reagent in acetate buffer at pH 3.6. The calibration curve was prepared using ferrous sulfate as an external standard over a concentration range of 0.1–2.5 mmol/mL. About 0.3 mL of each EEIP extract at a concentration of 0.4 mg/mL was mixed with 3 mL of the ferric complex. The solutions were left to stand for 10 min, and the absorbance was then read at 593 nm. The quantitative results were expressed in micromoles of Fe2+ per gram of EEIP. All measurements were performed in triplicate.

Total antioxidant capacity (TAC) assay

The total antioxidant capacity of EEIP was determined from the formation of the phosphomolybdenum complex, following the process described by Prieto et al. [16]. An aliquot of 0.5 mL of each EEIP solution was mixed with 5 mL of phosphomolybdenum reagent. The test tubes were covered and incubated in a hot water bath at 95 °C for 1.5 h. The sample solutions were then cooled to room temperature, and the absorbance of the green-colored solution was measured at 695 nm. Ascorbic acid solutions (0–1 mg/mL) were prepared for the calibration curve. The TAC was expressed as milligrams of ascorbic acid equivalent per gram of ethanolic extract of Indian propolis (EEIP).

LC-ESI-QTOF-MS analysis

The concentrated dried EEIPs were analyzed on a Waters Micromass Q-Tof micro system equipped with a C18 column, as described by Martini et al. [17] with slight modifications. The system was operated at a flow rate of 1 mL/min with solvents A and B [A: deionized water and formic acid (99:1 v/v); B: acetonitrile and formic acid (99:1 v/v)]. The gradient started at 4% B for 0.5 min, followed by a linear gradient up to 30% B in 60 min. To wash the column, the concentration was raised to 100% B in 1 min and held for 5 min before returning to the initial conditions. After the column, the eluate was split, and 0.3 mL/min was directed to the mass spectrometer. A Dual Agilent Jet Stream electrospray ionization (ESI) source was used for mass spectrometry (MS) in negative-mode polarity under the following operating conditions: 100–800 m/z (mass-to-charge) range, 4 kV source potential, 1.5 spectra/s scan rate, 400 °C capillary temperature, 12 L/min gas flow, 30 psig nebulizer pressure, −500 V nozzle voltage, and 190–650 nm detection scanning range. The chromatographic peaks in the propolis samples were assigned by matching the retention times and MS data of the observed peaks with information documented in the literature.

HPLC analysis

Polyphenol analysis of EEIP samples was performed on a Waters 2489 HPLC system equipped with a C18 column (Phenomenex, 100 mm × 4.6 mm, 5 µm), as described by Martini et al. [17] with some modifications. The mobile phase was run at a flow rate of 1 mL/min and consisted of solvent A [acetonitrile and formic acid (99.8:0.2 v/v)] and solvent B [deionized water, acetonitrile, and formic acid (96:3.8:0.2 v/v)]. The elution gradient was as follows: 5% A at 5 min, 15% A at 25 min, 20% A at 30 min, 25% A at 39 min, 45% A at 43 min, 95% A at 48 min, 95% A at 50 min, 20% A at 55 min, 100% A at 60 min. Then, 10 µL of 500 ppm sample extract was injected, and phytochemicals were detected at 280 nm over a 60 min run time. Each phenolic peak was identified by comparing its retention time with that of an authentic standard, and quantification was done using standard curves. Validation of the HPLC method covered several parameters: linearity, limit of detection (LOD), limit of quantification (LOQ), stability, system suitability, and robustness. Linearity was assessed by constructing calibration curves at 6 different concentrations for all 24 reference compounds. These curves were generated by plotting peak area against the concentration of the reference compound, and regression analysis was used to derive the regression equations and correlation coefficients. The LOD and LOQ were determined by injecting diluted standard solutions until the signal-to-noise ratio was about 3 and 10, respectively. To determine the stability of the analytes, stock solutions of the reference compounds and the sample extracts were stored at 4 °C in a refrigerator for 24 and 48 h. In addition, the stability of reference compounds and samples at room temperature was evaluated over 24 and 48 h. System suitability was ensured by injecting 2 mL of a standard solution mixture at least five times during the analysis. Robustness testing involved making minor adjustments to method conditions, including variations in the mobile phase, flow rate, and column temperature, and evaluating the results.
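The linearity and signal-to-noise steps of the validation can be sketched as below. All numbers are hypothetical and the helper names are introduced here for illustration only; the sketch simply shows a six-point calibration regression with its correlation coefficient, and the selection of the lowest dilution whose signal-to-noise ratio reaches 3 (LOD) or 10 (LOQ).

```python
import math


def linear_regression(x, y):
    """Return slope, intercept, and Pearson correlation coefficient r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def lowest_conc_meeting_sn(dilutions, noise, threshold):
    """Smallest concentration whose peak height / noise >= threshold.

    dilutions is a list of (concentration, peak_height) pairs.
    Returns None if no dilution reaches the threshold.
    """
    passing = [conc for conc, height in dilutions if height / noise >= threshold]
    return min(passing) if passing else None


# Hypothetical six-point calibration for one reference compound
conc = [1, 5, 10, 25, 50, 100]            # ug/mL
area = [12 * c + 3 for c in conc]         # idealized peak areas
slope, intercept, r = linear_regression(conc, area)

# Hypothetical dilution series: (concentration in ug/mL, peak height)
dilutions = [(0.05, 0.75), (0.1, 1.5), (0.2, 3.0), (0.5, 7.5), (1.0, 15.0)]
baseline_noise = 0.5
lod = lowest_conc_meeting_sn(dilutions, baseline_noise, 3)    # S/N >= 3
loq = lowest_conc_meeting_sn(dilutions, baseline_noise, 10)   # S/N >= 10
```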

Statistical analysis

All assays were conducted in triplicate, and results are reported as mean ± standard deviation. Differences among propolis samples were assessed by one-way analysis of variance (ANOVA) followed by Duncan’s multiple range test (DMRT) (p < 0.05) in Statistica v.12 (StatSoft, Tulsa, Oklahoma, USA). For the multivariate analysis, a data matrix was first built to obtain a large, representative data set and thus consistent statistical results; PCA and HCA were then performed using Statistica v.12. For the artificial neural network, a three-layered feed-forward network was built in MATLAB R2016a (MathWorks, Massachusetts, USA), with the neurons organized into input, hidden, and output layers. The network was trained under the following conditions: 5 inputs, 10 hidden neurons, 1 output, 1000 epochs, and the Levenberg–Marquardt back-propagation method. The inputs were TPC, TFC, CAPE, galangin, and beta-carotene content, which are strongly associated with the antioxidant properties of the test samples, and the output was DPPH FRSA (%) [18].
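To make the network architecture concrete, the sketch below shows the forward pass of a feed-forward network with 5 inputs, one hidden layer of 10 neurons, and 1 output. It is an illustration only: the weights are random placeholders, the tanh hidden activation and linear output are assumptions, the input values are hypothetical scaled values, and the Levenberg–Marquardt training performed in MATLAB is not reproduced.

```python
import math
import random

N_INPUTS = 5    # TPC, TFC, CAPE, galangin, beta-carotene
N_HIDDEN = 10   # neurons in the single hidden layer


def init_network(seed=0):
    """Create placeholder weights; a trained model would replace these."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
                for _ in range(N_HIDDEN)]
    b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b_out = rng.uniform(-0.5, 0.5)
    return w_hidden, b_hidden, w_out, b_out


def forward(x, network):
    """Forward pass: tanh hidden layer (assumed), linear output neuron."""
    w_hidden, b_hidden, w_out, b_out = network
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


network = init_network()
# Hypothetical inputs, already scaled to the 0-1 range
sample = [0.8, 0.6, 0.3, 0.5, 0.1]
prediction = forward(sample, network)  # untrained, so not a meaningful FRSA
```

In practice the output would be compared with the measured DPPH FRSA (%) for each sample, and the weights adjusted during training to minimize that error.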

Results and discussion

Total phenolic content (TPC) and flavonoid content (TFC)

In the current investigation, the TPC of EEIP increased with geographical location in the following order (Table 1): Rajasthan propolis (RP) < Haryana propolis (HP) < Punjab propolis (PP) < Himachal Pradesh propolis (HPP), while TFC was highest in PP and lowest in RP samples (Table 1). Significant differences (p < 0.05) were found in the TPC and TFC of bee propolis extracts from different geographical origins, which might be due to variation in the botanical sources as well as the climatic conditions of the different regions of northern India. The phenolic concentrations found here were slightly higher than those reported for Brazilian propolis (120 mg/g of GAE) [19], Turkish propolis (27.48–103.88 mg/g of GAE) [20], and Thai propolis (31.2 mg/g of GAE) [13], but lower than those of Chinese propolis from Hebei (302 ± 4.3 mg/g of GAE) [21], Korean propolis from Yeosu (212.7 mg/g of GAE) [19], Portuguese propolis from Bornes (151 mg/g of GAE), Anatolian propolis (Aegean region, 198 mg/g of GAE), and propolis from Argentina, Australia, Brazil, Bulgaria, and Chile with 298 mg/g of GAE [22, 23]. Northern Indian propolis also had a higher flavonoid concentration than propolis from China, Macedonia, Iran, USA, Brazil, Thailand, and New Zealand. In contrast, propolis from different regions of Mexico showed higher TFC (243, 269, and 379 mg/g, respectively) than northern Indian propolis [24]. The TPC and TFC of propolis collected from various locations are significantly influenced by the concentration and distribution of different phytocompounds, which depend on botanical and geographical origin.

Table 1 Total phenolic content (TPC), total flavonoid content (TFC), and antioxidant activity of Indian propolis from different geographical locations

Antioxidant capacity: DPPH, FRAP, and TAC assay

The EEIPs from different locations were potent free-radical scavengers, as indicated by their low SC50 values (Table 1). HPP samples showed the highest antioxidant activity among the samples, with an SC50 value of 0.71 ± 0.11 mg/mL. A lower SC50 value corresponds to a higher antioxidant potential [25]. This relationship was only partially observed in the present study and did not hold in every case, because the bioactivity of individual polyphenolic compounds, particularly flavonoids, varies widely.

Similarly, the HPP extracts showed the maximum FRAP and TAC values (Table 1), confirming the DPPH findings. The lowest FRAP and TAC values were found for PP and RP extracts, respectively. However, only the TAC value of RP extracts was consistent with the DPPH and TPC results. The lowest FRAP value of the PP extract could be due to a comparatively higher amount of non-phenolic compounds, such as amino acids or proteins, in the sample, which can bind to polyphenols and eventually reduce their antioxidant potential as measured by FRAP [2, 18]. Furthermore, the present TAC results for EEIP were similar to those for Turkish propolis (1370.6–6332.9 TE/100 g EEP) [20], but comparatively higher than those for Mexican propolis (39–54 TE/g of propolis) [24]. Likewise, the current FRAP values were higher than those of Croatian propolis (0.04–1.3 mmol Fe2+/g) [26]. However, comparisons of the present findings with other published reports are of limited value because of differences in extraction methods, methodologies, and ways of presenting data [27].

Qualitative analysis by LC-ESI-QTOF-MS

LC-ESI-QTOF-MS experiments were conducted to investigate the phytochemical composition of northern Indian propolis extracts. LC-ESI-QTOF-MS offers a highly selective and specific approach for detecting multiple compounds in a single run, which increases confidence in the results and supports the authentication and determination of marker compounds in the material. Overall, 67 chromatographic peaks were observed in this study, of which 24 compounds were quantified using standards by HPLC. In total, 67 phytochemicals were identified, belonging to various classes including flavonoids, phenolic acids, carotenoids, phenolamides, phytosterols, and their derivatives (Table 2, Fig. 1a).

Table 2 Polyphenolic compounds identified in Indian propolis using LC-ESI-QTOF-MS in negative mode at 280 nm
Fig. 1

Representative chromatograms of the Punjab propolis: (a) LC-ESI-QTOF-MS and (b) HPLC. LC-ESI-QTOF-MS: gallic acid (1), hesperidin (2), catechin (3), apigenin (4), ferulic acid (5), benzoic acid (6), p-coumaric acid derivative (7), luteolin (8), syringic acid (9), naringenin (10), gallic acid O-hexoside (11), kaempferol (12), spinacetin (13), chrysin (14), isoliquiritigenin (15), pinocembrin (16), quercetin (17), galangin (18), 3-O-p-coumaroyl quinic acid (19), myricetin (20), chlorogenic acid (21), digalloylglucose (22), valoneic acid dilactone or its isomer (23), caffeic acid phenylethyl ester (24), p-coumaroyl quinic acid O-glucuronic acid (25), lutein (26), amentoflavone (27), beta-carotene (28), C-hexosyl-luteolin-O-p-coumaroyl hexoside (29), isorhamnetin-O-glucuronide (30), N1, N5, N10-tri-caffeoyl-N14-hydroxy feruloyl spermine (31), quercetin-3-O-rutinoside (32). HPLC: gallic acid (1), chlorogenic acid (2), caffeic acid (3), syringic acid (4), catechin (5), ellagic acid (6), p-coumaric acid (7), ferulic acid (8), benzoic acid (9), rutin (10), hesperidin (11), tannic acid (12), kaempferol (13), quercetin (14), pinocembrin (15), myricetin (16), luteolin (17), caffeic acid phenylethyl ester (18), galangin (19), chrysin (20), naringenin (21), apigenin (22), beta-carotene (23)

Identification of phenolic acids

The hydroxycinnamic acid chlorogenic acid was detected in all the samples at RT = 25.92 min, with the distinct deprotonated [M–H] molecule observed at the ion peak m/z 191 [28]. In addition, several derivatives of chlorogenic acid, which are formed through the shikimate and phenylpropanoid pathways in plants, were also observed in the samples. Esterification of cinnamate derivatives with quinic acid gives rise to compounds such as dicaffeoylquinic acid, 3,5-dicaffeoylquinic acid, and 4,5-dicaffeoylquinic acid, whose precursor ions were detected at m/z 515, 518, and 518, respectively (with a common fragment ion at m/z 353); these were found primarily in all HP samples. 3-O-p-Coumaroyl quinic acid O-hexoside, obtained by condensation of the carboxy group of 4-coumaric acid with the 3-hydroxy group of quinic acid, was detected at m/z 501 with a fragment at m/z 353 and loss of a hexoside molecule at ion peak m/z 180, most probably glucose [28]. Ferulic acid and p-coumaric acid were also detected in the samples at ion peaks m/z 134 and m/z 119, respectively. These compounds have also been detected in green Brazilian and Siberian propolis [29, 30]. The hydroxycinnamoyl derivatives coniferin and feruloyl syringic acid (a derivative of ferulic acid) were identified only in HPP samples, at m/z 179 and 197 (Table 2). Coniferin is a phenolic glycoside found primarily in the Coniferae family, in which the phenolic structure is attached to a glycosyl moiety (m/z 179); it is the 4-O-glucoside of coniferyl alcohol and, on hydrolysis, contributes to the synthesis of lignin in plants [31]. However, the detection of phenolic glycosides in propolis is unusual because of (i) the hydrophobic nature of plant resins and (ii) the presence of β-glucosidase enzymes during propolis collection and processing [32]. Moreover, the chromatographic spectra also confirmed the presence of the simpler hydroxybenzoic acid gallic acid, with the distinct deprotonated [M–H] molecule observed at the ion peak m/z 125 after removal of CO2, followed by benzoic acid, protocatechuic aldehyde, and syringic acid with fragment ions at m/z 77, 109, and 182, respectively; these have also been detected in Portuguese and Siberian propolis [30, 32].
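As a cross-check on such assignments, the theoretical m/z of a deprotonated ion can be computed from a molecular formula. The sketch below is illustrative only: the molecular formulas (C16H18O9 for chlorogenic acid, C7H6O5 for gallic acid) are standard values supplied here as assumptions rather than taken from the source, and the helper names are hypothetical. It computes the monoisotopic [M–H] ion and the gallic acid fragment after loss of CO2.

```python
import re

# Monoisotopic atomic masses (u) for the elements needed here
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "O": 15.99491462,
    "N": 14.00307401,
}
PROTON_MASS = 1.00727646  # u


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C7H6O5'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def deprotonated_mz(formula):
    """m/z of the singly charged [M-H] ion in negative mode."""
    return monoisotopic_mass(formula) - PROTON_MASS


chlorogenic_mz = deprotonated_mz("C16H18O9")   # approximately 353.088
gallic_mz = deprotonated_mz("C7H6O5")          # approximately 169.014
# Fragment of gallic acid after neutral loss of CO2, approximately 125.024
gallic_minus_co2 = gallic_mz - monoisotopic_mass("CO2")
```

The value near m/z 125 matches the gallic acid fragment reported after removal of CO2, and m/z 353 corresponds to the fragment ion reported for the caffeoylquinic acid derivatives.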

Tannins

A few gallotannin and ellagitannin derivatives were also identified in all the propolis samples. Two compounds detected at m/z 469 (Rt = 28.92 and 28.96 min) in both PP and HP samples gave spectra with prime fragment ions of ellagic acid at m/z 425 after removal of CO2. These compounds, assigned as valoneic acid dilactone and its isomers, are hydrolysable tannins that can be isolated from Eucalyptus species, among others [33]. According to Hirano et al. [34], valoneic acid dilactone has an inhibitory effect on 5α-reductase, which is involved in steroid metabolism and prostate cancer. A compound detected at m/z 783 yielded fragment ions at m/z 301 (indicating loss of hexahydroxydiphenoyl (HHDP) glucose) and at m/z 481 (ellagic acid; deprotonated HHDP glucose, showing loss of HHDP), corresponding to loss of a di-HHDP glucose unit and presumably confirming pedunculagin or its isomers [33]. Methyl ellagic acid glucose (galloyl ester) was identified at m/z 629, with a fragment at m/z 477 (methyl ellagic acid hexoside, loss of a galloyl group) [33]. A phenylpropanoid glyceride was also confirmed by the spectrum at m/z 431, which showed a complex mixture of fragment ions from both feruloyl and dihydrocaffeoyl residues at m/z 275; the compound was identified as 1,3-O-feruloyl-dihydrocaffeoylglycerol. This compound has also been found in wholegrain sorghum extract [35]. Minor amounts of phenylpropanoid glycerides are also found in European propolis, indicating an origin from Liliaceae, Juncaceae, and Gramineae plant species [36].

Identification of flavonoids

Glycosylated flavonoids

The analytical data revealed flavonoid glycosides in the propolis extracts, namely derivatives of apigenin, luteolin, quercetin, hesperidin, isorhamnetin, and kaempferol, carrying the most common and consistent sugar moieties found in nature, such as glucosides, rutinosides, and glucuronides. At Rt = 37.57 min, the presence of quercetin-3-O-rutinoside (rutin; m/z 609) was confirmed, corresponding to the precursor ion at m/z 301; it was detected in all the samples except HPP. Similarly, compounds with [M–H] ion peaks at m/z 491, m/z 637, and m/z 785 were identified as methylated derivatives of quercetin glycosides, namely isorhamnetin-O-glucuronide, quercetin-dimethyl-ether-O-rutinoside, and isorhamnetin-3-O-rutinoside-7-O-glucoside, detected only in the propolis extracts. For isorhamnetin-O-glucuronide (m/z 491), the base peak in the MS2 spectrum was at m/z 315, arising from the loss of a glucuronide residue, and a further fragment at m/z 70 corresponded to the isorhamnetin unit (methyl group at the C-3′ position). For isorhamnetin-3-O-rutinoside-7-O-glucoside, the spectrum showed loss of a rutinoside unit (m/z 308) and of the glycosidic residue at position 7, corresponding to a hexosyl residue ([M–H] m/z 162); this compound was detected only in HP samples [37]. These compounds have also been confirmed in European, Asian, South American, and Portuguese propolis [32]. Three luteolin derivatives were also identified in the propolis matrices, at Rt = 30.82, 34.78, and 32.71 min, with [M–H] ion peaks at m/z 609, m/z 756, and m/z 709, respectively. The aglycone fragment ion at m/z 285 corresponded to luteolin, and another fragment ion at m/z 447 showed loss of glycosidic units, tentatively confirming luteolin-di-glucoside. The loss of a hexoside molecule at ion peak m/z 180, most probably glucose, corresponded to C-hexosyl-luteolin-O-p-coumaroyl hexoside [38]. The third derivative, at m/z 709, showed a precursor ion at m/z 299 corresponding to chrysoeriol, the methyl derivative of luteolin; further fragmentation at m/z 285 confirmed luteolin after loss of the CH3 group, and the fragment at m/z 447 showed loss of glycosidic units, confirming chrysoeriol O-hexosyl-O-malonylhexoside in RP and HP samples only. Glycosylated flavonoids, predominantly quercetin and luteolin derivatives, have been reported to form a large number of hubs in the leaf-type network, pointing to a vital role for flavonoids in plant resistance [38]. In addition, a kaempferol derivative with an ion peak at m/z 771 presented a base peak product ion at m/z 285, with a fragmentation pattern similar to that of kaempferol, and a mass loss at m/z 447 corresponding to two beta-D-glucosyl units, indicating the presence of kaempferol-3,7-diglucoside in the propolis matrices [39]. A compound at m/z 563 was assigned to apigenin-O-pentosyl hexoside; its deprotonated [M–H] ion at m/z 563 gave a precursor ion fragment at m/z 269 attributable to the apigenin moiety and a fragmentation pathway at m/z 149 reflecting the presence of a pentosyl (1 → 2) hexoside [24]. A compound detected at m/z 609 was confirmed as hesperidin from its mass spectrometric fragmentation pattern, in which the most stable ion at m/z 286 formed after the loss of CH3 from m/z 301 [40].

Iso-flavonoids

These phytocompounds are primarily recognized for their anti-aging and anti-carcinogenic characteristics [41]. In the current investigation, three iso-flavonoids were identified only in RP samples and could therefore be assigned as marker compounds: daidzein (Rt = 15.84 min), formononetin (Rt = 28.42 min), and medicarpin (Rt = 31.68 min). For daidzein (a phytoestrogen), the [M–H] ion peak was observed at m/z 253, which yielded fragment ions at m/z 199 and m/z 145 due to the removal of 2CO (carbon monoxide) and C6H6O2 (benzenediol), respectively [41]. For formononetin, the precursor ion [M–H] at m/z 266 gave the product ion at m/z 252, whereas medicarpin showed [M–H] at m/z 269 in the ESI spectra, with a characteristic fragment ion at m/z 254 from loss of CH3 [42, 43]. According to the literature, these isoflavones are commonly found in leguminous plants (Fabaceae or Leguminosae) such as soybean and pea, and have also been detected in Brazilian red propolis [42].

Flavanone, flavone, and flavanol

In northern Indian propolis, six flavanones and flavones, along with nine flavanols, were also detected (Table 2). Among the flavanones, pinocembrin (Rt = 20.00 min) was detected in all the samples, and its presence was confirmed by the precursor ion at m/z 255, followed by major fragmentation at m/z 213 due to the elimination of a C2H2O group [44]. Two further compounds commonly detected in HP samples were assigned as naringenin and a naringenin derivative at Rt = 18.86 and 26.94 min, respectively. MS fragmentation at m/z 271 confirmed the presence of naringenin, whereas the second compound showed a further fragment at m/z 317, corresponding to loss of a sugar unit, and was tentatively assigned as naringenin-O-malonylhexoside [45]. Naringenin, which has antioxidant properties and occurs in grapefruit juice, was present in all bee propolis samples in the current study. Apigenin was also confirmed in PP and RP samples, whereas a compound with a precursor ion at m/z 272 was tentatively identified as butein in RP samples only [46]. A compound detected at Rt = 15.78 min showed a major precursor ion peak at m/z 285 corresponding to pinobanksin, with further fragments at m/z 267 and 235 due to loss of CH3 and C2H5 groups, and was tentatively identified as pinobanksin-5-methylether-3-O-acetate. This compound was detected only in HP and RP samples and has also been found in Portuguese propolis [32]. Among the flavones and flavanols, a deprotonated [M − H] precursor ion at m/z 255 at Rt = 19.22 min, observed in all propolis samples, confirmed chrysin (a flavone), which is known for its potent biological properties against inflammation, diabetes, atherosclerosis, cancer, and other chronic diseases [47]. Moreover, compounds such as catechin (Rt = 13.08 and 13.09 min), luteolin (Rt = 16.37 min), kaempferol (Rt = 17.83 min), quercetin (Rt = 20.34 min), myricetin (Rt = 25.55 min), and caffeic acid phenylethyl ester (CAPE, Rt = 29.89 min) were also detected in northern Indian propolis, with deprotonated [M − H] major ion peaks at m/z 289, 285, 285, 301, 269, 317, and 283, respectively, and were later quantified by standard curves using HPLC. These compounds have also been detected in Algerian, Siberian, Portuguese, Italian, Chilean, and Turkish propolis [20, 32, 44, 48, 49, 50]. In addition, two quercetin derivatives, taxifolin (MS2 fragment at m/z 303) and quercetin-3,4′-O-di-β-glucopyranoside (precursor ion fragment at m/z 627), were confirmed in HPP and HP samples. Furthermore, two O-methylated flavones were detected in individual samples, with [M − H] major ion peaks at m/z 347 and 283 confirming spinacetin and acacetin in PP and RP samples, respectively. Acacetin has also been found in Portuguese and Algerian propolis [32, 44]. Amentoflavone (a biflavonoid of apigenin) was found only in PP samples and has likewise been confirmed in propolis samples collected from Agri (Turkey) [51]. Furthermore, isoliquiritigenin (a chalcone) was detected in PP and HP samples. Isoliquiritigenin is also known as a natural aldose reductase inhibitor, and its presence might be due to leguminous crops around the periphery of the apiaries, since this compound is commonly found in the Leguminosae family, such as G. glabra, Amaryllidaceae, etc. [52].

Other compounds

Phenolamides

Two phenolamides were detected in HP and PP samples (Table 2). A compound at Rt = 11.84 min exhibited the precursor molecular ion at m/z 436 [M − H] and yielded three MS/MS fragments at m/z 292 (corresponding to the loss of C9H6O2), m/z 204 (characteristic fragment ion of p-coumaroyl spermidine), and m/z 147 (indicating a p-coumaroyl residue), which confirmed the presence of N1(Z), N10(Z)-di-p-coumaroyl spermidine in HP samples. Spermidine-based compounds are associated with antimicrobial activity and with delayed aging in humans [53]. Another compound, N1, N5, N10-tri-caffeoyl-N14-hydroxy feruloyl spermine, was confirmed by a deprotonated quasi-molecular ion [M − H] at m/z 879, which yielded a fragment ion at m/z 719 [M − H − C9H6O3]; loss of the caffeoyl and hydroxy feruloyl residues generated fragment ions at m/z 557 and 527, corresponding to the loss of C9H6O3 and C10H8O4, respectively. Further fragment ions at m/z 250 and m/z 483 indicated a hydroxy feruloyl residue substituted at N1 or N14 of spermine. These compounds have likewise been detected in rapeseed and buckwheat bee pollens [54].

Carotenoids, anthocyanidins, and phytosterols

Carotenoid compounds detected at Rt = 32.21 and 33.89 min exhibited precursor molecular ions at m/z 551 and m/z 537 and were assigned as lutein and beta-carotene, respectively [55]. Beta-carotene was widely distributed in all the samples except the HP propolis sample; this might be due to pollen particles in the propolis matrix, from which carotenoids leach into the propolis during the extraction process, or to the abundance of mustard plants near the apiaries. This compound was later also quantified by HPLC analysis. Further, two phytosterols, stigmasterol and cholesteryl linoleate, were found in RP and HP samples, respectively (Table 2). Stigmasterol commonly occurs in the fats or oils of many plants, such as soybean, calabar bean, and rapeseed, and in various herbs used in herbal medicine, and it can inhibit the development of various cancer cells by suppressing their growth and promoting apoptosis [56]. Similarly, cholesteryl linoleate has also been found in wheat [57]; this indicates the presence of the same crops or plantations around the apiaries. Furthermore, some other flavonoids, including anthocyanins (anthocyanidins with sugar moieties), anthocyanidins, and flavan-3-ols (contributing to proanthocyanidins), were detected only in HPP samples. Proanthocyanidins such as procyanidin dimers A2 and B2 were confirmed by [M − H] at m/z 579 and 577, respectively. Procyanidin B2 (an epicatechin–epicatechin dimer) is abundant in fruits such as peach, nectarine, plum, and apple, and in vegetables such as broad beans. Procyanidin A2 is generally found in the mountain cranberry (V. vitis-idaea), horse chestnut (A. hippocastanum), and other fruits, and has antioxidant, anti-inflammatory, anti-bacterial, and anti-diabetic properties [58], which points to plantation sources near the apiaries in Himachal Pradesh.
Anthocyanin compounds, such as cyanidin 3-O-glucosyl-malonylglucoside and cyanidin O-diacetyl hexoside-O-glyceric acid, were confirmed by deprotonated quasi-molecular ions at m/z 697 and 620, respectively (Table 2). These compounds are also found in blackberry, raspberry, common grapes, highbush blueberries, chicories, mango, sweet oranges, lupines, passion fruits, garden onion, etc. [59].

Quantitative analysis by HPLC

The HPLC analysis quantified phenolic acid content at 2.71 to 8.42%, flavonoid content at 47.69–73.39%, and carotenoid content at 1.73 to 21.74% in 1 mg of dry propolis extract (Table 3). HPLC chromatograms of the representative PP sample are shown in Fig. 1b. Among the 24 standards, galangin, beta-carotene, and p-coumaric acid were recorded as the most common and dominant phytocompounds (from the flavonoid, carotenoid, and phenolic acid categories, respectively) and were present in all propolis samples (n = 30). Nonetheless, the concentration of compounds varied significantly depending on the geographical and botanical origin.
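
Quantification against external standards typically rests on a linear calibration curve relating peak area to concentration. The following is a minimal sketch of that idea and not the authors' procedure: the standard concentrations, peak areas, and function names are hypothetical values chosen for illustration only.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to estimate concentration from peak area."""
    return (area - intercept) / slope

# Hypothetical standard series (ug/mL) and detector peak areas (arbitrary units)
standard_conc = [5.0, 10.0, 20.0, 40.0, 80.0]
standard_area = [52.0, 101.0, 203.0, 398.0, 805.0]

slope, intercept = fit_line(standard_conc, standard_area)
sample_conc = concentration_from_area(300.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_conc:.2f} ug/mL")
```

The estimate is only meaningful when the sample peak area falls within the range spanned by the standards.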

Table 3 Quantification of phenolic compounds (mg/g) in Indian propolis from different geographical locations

In the current investigation, the most abundant flavonoids quantified in all propolis samples were galangin, CAPE, and pinocembrin, followed by apigenin and quercetin. The concentrations of flavonoids such as galangin, naringenin, and rutin were highest in RP samples (Table 3). In contrast, RP samples contained the lowest amounts of CAPE, pinocembrin, quercetin, and chrysin among the propolis samples, as well as the lowest flavonoid content, i.e., 42.01%. In addition, luteolin was recorded as a major flavonoid in HP samples, followed by PP and HPP samples, but was not detected in RP samples. Similarly, the catechin and hesperidin contents were higher in HP samples than in other samples, and neither was detected in RP samples. The kaempferol and myricetin concentrations were highest in PP samples (Table 3).

According to several studies, CAPE is one of the significant phytocompounds associated with the high antioxidant activity of poplar propolis [60]. For instance, a previous study [20] reported the highest concentration of CAPE (2857.33 ppm) among other compounds in Turkish propolis. Similarly, Pellati et al. [48] and Escriche and Juan-Borrás [61] reported a high content of CAPE in poplar-type propolis from Italy and the temperate region of Spain. It can therefore be concluded that temperate-region propolis is rich in CAPE, which can be considered one of its marker compounds. Propolis from Europe, Ukraine, and Bulgaria predominantly contained chrysin as the major flavonoid (120.4 mg/g of ethanolic extract) [13]. Likewise, in our findings, HPP samples (temperate region of India) contained the maximum CAPE and chrysin concentrations among the samples, i.e., 174.65 mg/g and 41.72 mg/g, respectively.

On the other hand, the most abundant phenolic acids quantified in all propolis samples were p-coumaric acid and tannic acid, followed by gallic acid (Table 3). In the present investigation, HPP samples contained the highest phenolic acid content (8.42%), followed by PP (6.67%), while RP possessed the lowest (2.31%). In HPP samples, tannic acid contributed the most, followed by p-coumaric acid, ferulic acid, gallic acid, and chlorogenic acid (2.56%). Similarly, propolis obtained from the northern region of Europe contained high concentrations of phenolic acids, i.e., p-coumaric acid, ferulic acid, benzoic acid, benzyl p-coumarate, glycerol ester, and phenolic glycerides [62].

However, cinnamic acid and caffeic acid were not detected in the HPP samples. In contrast, temperate-region propolis from Beijing contained caffeic acid (3.74 mg/g) as the prime phenolic acid [63]. Moreover, cinnamic acid was recorded as a major phenolic acid specifically in HP samples, followed by syringic acid among other samples, whereas the benzoic acid concentration was higher in PP samples (Table 3). Escriche and Juan-Borrás [61] reported higher concentrations of phenolic acids such as p-coumaric (278–284 mg/g balsam), ferulic (243–260 mg/g balsam), caffeic (79–88 mg/g balsam), and cinnamic acid (48–59 mg/g balsam) in propolis extract from Romania. Furthermore, beta-carotene (a carotenoid) was quantified in Indian propolis for the first time; the highest concentration was found in RP and the lowest in HPP. Mouhoubi-Tafinine et al. [64] reported a higher carotenoid content in Algerian propolis (45 mg/100 g) than in honey (1 mg/100 g). Also, the carotenoid content in propolis could vary with factors such as the plants foraged by bees, culture conditions, and fruit maturity [65].

Apart from natural factors such as geographical location, climatic conditions, and botanical origin, the method of extraction (solvent type, extraction conditions, extraction efficiency, etc.) also has a large impact on the balsam yield, on the type and concentration of phytochemicals in the extract, and on the antioxidant potential of the extract [66]. In the present findings, the flavonoid concentration obtained from the hydro-alcoholic extract was considerably higher than the amounts extracted by methanol and aqueous media [67]. The comparatively lower phenolic content, by contrast, could be due to leaching during removal of waxy compounds or at other processing steps.

Multivariate statistical analysis by PCA

PCA was conducted to distinguish the Indian propolis samples by geographical location based on antioxidant activities and polyphenol profile. Three principal components (PCs) were derived according to the Kaiser criterion, with eigenvalues of 11.06, 4.94, and 2.63, which together explain 98.1% of the variation in the examined northern Indian propolis samples. PC1, PC2, and PC3 accounted for 47.68, 28.32, and 23.18% of the variance, respectively (Table S2). According to the factor loading analysis (Table S2), PC1 was positively related to TPC, gallic acid, p-coumaric acid, DPPH (%), CAPE, tannic acid, and ferulic acid. In contrast, it was negatively associated with galangin, apigenin, and beta-carotene (Fig. 2a). These factors significantly influenced PP and HPP samples in the first and fourth quadrants, signifying dissimilarities among the samples, while galangin revealed similarities between RP and HP samples in the third quadrant. PC2 was positively correlated with TFC and DPPH SC50 but negatively associated with FRAP (Fig. 2b). Similarly, PC3 was positively linked with ellagic acid and negatively associated with pinocembrin, directly influencing PP and HPP samples, respectively (Fig. 2a). Hence, the propolis samples from different geographical locations and botanical sources can be efficiently classified using antioxidant properties and phenolic compounds. Furthermore, hierarchical cluster analysis (HCA) revealed clusters (Fig. 2c), with sample similarities evaluated using Euclidean distances (single linkage). The dendrogram displayed four clusters corresponding to each geographical location of the propolis samples. The results confirmed that 100% of the propolis samples were correctly categorized with respect to their geographical region.
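
The Kaiser criterion mentioned above retains only those principal components whose eigenvalue exceeds 1. The following is a minimal sketch of that selection step, assuming a PCA on the correlation matrix (so that the eigenvalues sum to the number of variables). The eigenvalues in the example are hypothetical and are not taken from Table S2, and the function name `kaiser_select` is an illustrative choice.

```python
def kaiser_select(eigenvalues, threshold=1.0):
    """Return (PC index, eigenvalue, percent of total variance) for retained PCs."""
    total = sum(eigenvalues)
    retained = []
    for i, ev in enumerate(eigenvalues, start=1):
        if ev > threshold:
            retained.append((i, ev, 100.0 * ev / total))
    return retained

# Hypothetical eigenvalues for a six-variable correlation-matrix PCA (sum = 6)
eigenvalues = [3.1, 1.6, 0.7, 0.3, 0.2, 0.1]
for idx, ev, pct in kaiser_select(eigenvalues):
    print(f"PC{idx}: eigenvalue={ev:.2f}, variance={pct:.1f}%")
```

In this toy case only the first two components pass the threshold; the remaining components each explain less variance than a single standardized variable would.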

Fig. 2 Principal component analysis score plots of northern Indian bee propolis samples from four different locations: a projection of variables on factor-plane (1 × 2), b projection of botanical origins on factor-plane (1 × 2), and c classification of bee propolis samples from four geographical origins using hierarchical cluster analysis. HPP Himachal Pradesh propolis, PP Punjab propolis, HP Haryana propolis, RP Rajasthan propolis
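
The hierarchical cluster analysis described above (Euclidean distances, single linkage) can be illustrated with a small agglomerative routine. This is a minimal sketch in plain Python, not the software used in this study; the two-feature sample profiles are hypothetical, and the function names are illustrative.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Cluster distance is the minimum pairwise Euclidean distance between
    members (single linkage). Returns a list of clusters, each a sorted
    list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters

# Hypothetical two-feature profiles for eight samples drawn from four groups
samples = [(1.0, 1.1), (1.2, 0.9), (5.0, 5.2), (5.1, 4.9),
           (9.0, 1.0), (9.2, 1.2), (1.0, 9.0), (0.9, 9.3)]
print(single_linkage(samples, n_clusters=4))
```

Cutting the merge sequence at four clusters mirrors reading a dendrogram at the height where four branches remain.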

Artificial neural networks (ANN)

An artificial neural network consists of nodes that interact as a group. It can build a memory from processed data and predict the desired outcome. Here, a neural networking tool (Fig. 3a) was developed for the discrete networks with ten hidden layers, which predicted the antioxidant potential of test samples based on TPC, TFC, CAPE, galangin, and beta-carotene content. While training and optimizing the model, the regression curve displayed satisfactory validation (R = 0.9998) after 1000 epochs (Fig. 3b), and the best validation performance of the model, 0.3983, was observed at epoch 16 (Fig. S3). The R-value, which signifies a very strong positive linear relationship between the variables analyzed, indicates the high strength of the model. Hence, the validation run confirmed that the network can forecast the antioxidant potential of propolis samples (Fig. 3b). Furthermore, the predicted values were then tested to confirm the authenticity and accuracy of the results. Absolute errors between the predicted data and the experimental data of 30 propolis samples were approximately ± 0.04.
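
The two figures of merit reported above, the regression R-value and the absolute error between predicted and experimental values, can be computed as in the following minimal sketch. The data are hypothetical and do not come from this study, and the function names are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def max_absolute_error(experimental, predicted):
    """Largest absolute deviation between experimental and predicted values."""
    return max(abs(e - p) for e, p in zip(experimental, predicted))

# Hypothetical experimental vs. network-predicted antioxidant values
experimental = [62.1, 70.4, 55.8, 81.3, 66.0]
predicted = [62.13, 70.37, 55.84, 81.28, 65.97]
print(f"R = {pearson_r(experimental, predicted):.4f}")
print(f"max |error| = {max_absolute_error(experimental, predicted):.2f}")
```

An R-value close to 1 only shows that predictions track the measurements linearly; the absolute error is what indicates how far individual predictions deviate.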

Fig. 3 a Neural network showing five inputs and ten hidden layers and b regression plot for the predicted network of test data

Conclusion

The LC-ESI-QTOF-MS qualitative analysis revealed an extensive distribution of flavonoids, phenolics, and their derivatives in all propolis samples and indicated the existence of multiple plant sources near the apiaries of the respective zones. The following compounds could be considered biomarker compounds in northern Indian propolis of particular regions: (i) proanthocyanidins and hydroxycinnamoyl derivative, (ii) quinate and flavonolignan, (iii) iso-flavonoids, and (iv) carotenoids and phytosterols. However, extensive study is still required on the quantification of these biomarkers to outline the range and variation of their concentrations in particular samples. Quantitative data are paramount for validating the range of biomarker compounds in region-specific propolis and for setting regulations for authentication and chemical standardization. Therefore, Indian propolis should be explored extensively so that it can be utilized as a nutraceutical or functional ingredient in the food processing segment.