Introduction

Pulses, defined as leguminous seeds harvested for dry grain, are the second most important food source for humans after cereals [1]. Pulses are high in proteins, complex carbohydrates (resistant starch and dietary fibre), minerals, vitamins, and other phytochemicals while being low in fat, calories, and sodium. Lentils belong to the genus Lens, which includes many species and hybrids. This genetic diversity is likely to result in variability of their phytochemical composition and associated bioactive properties [2]. In lentil seeds it is possible to distinguish three different parts, namely cotyledon, seed coat, and embryo, which represent 89, 10, and 1%, respectively, of the total seed weight. Among them, the cotyledon contains the main reserve substances, essentially proteins and carbohydrates. In contrast, the external tegument represents a protective barrier for the cotyledon and has the highest concentration of phenolic compounds [3]. Dehulling, which is the removal of the pulse seed coat, is a primary process to produce peeled lentils and ground flour. The lentil seed coat, generated by the dehulling process, is generally used in animal rations as a cheap feed source. However, this by-product is rich in bioactive molecules and contains health-promoting compounds. Specifically, several phenolic compounds have been detected in lentil seed coats, such as phenolic acids, stilbenes, flavonoids (flavan-3-ols, flavanones, flavones, flavonols, anthocyanidins, isoflavones) and tannins, including condensed tannins or proanthocyanidins [4], and several studies have correlated the dark external tegument of legumes with high amounts of anthocyanins and condensed tannins [5, 6].

Globally, lentils are grown in three major distinct zones (Mediterranean, subtropical savannah, and northern temperate) and, although lentils’ nutritional and chemical characteristics are well known, cultivar, location, and growing conditions are the major factors in the production of bioactive compounds in lentils [7]. Different studies investigating the phenolic compounds in lentils have already been published [8, 9]. However, most of them have focused on genetic diversity [7]. Little information is available about the effect of the growing conditions in different locations on the health-promoting compounds of lentil cultivars. Currently, the application of pulse seed coats in human nutrition is still limited, although they may offer exciting solutions for the valorization and recycling of agri-food by-products. Accordingly, more effective uses of seed coats, including their addition to processed foods, are being proposed by food scientists. For example, lentil seed coat has been added to whole wheat bread [10, 11] and beef burgers [12], improving their nutritional traits. Besides food fortification, the water extract of lentil seed coats was used in chicken bologna to control lipid peroxidation without modifying its texture and sensory profile [13]. In this framework, understanding how the cultivation area (i.e., the pedo-climatic conditions) and the plant genotype may affect lentils' phytochemical characteristics may serve as a valuable strategy to efficiently market food by-products. Nevertheless, the use of lentil seed coats as animal feed should not be underestimated, considering the possibility of biofortifying feed with fatty acids and bioactive compounds [14].

In light of the above, this research aimed to systematically investigate the phenolic profiles of lentil seed coats from 11 different genotypes grown in two different Italian regions (Basilicata and Marche) and to determine whether there is a relationship between the phenolic content and the color of the seed coats.

Materials and methods

Sample collection and preparation

In the present work eleven lentil (Lens culinaris Medik) genotypes, including both landraces and cultivars, were evaluated. Seeds for each genotype were made available by Università degli Studi della Basilicata, Università Politecnica delle Marche and by two seed companies: the Italian ISEA and the French Agri Obtention (Fig. 1) (Table 1).

Fig. 1 The eleven lentil genotypes grown in the Basilicata (from 1 to 11) and Marche (from 12 to 22) regions of Italy

Table 1 Lens culinaris Medik samples examined

Genotypes were grown in small plots (45 seeds/plot) during the growing season 2018/2019 in two different environments: at A.A.S.D. Pantanello of ALSIA in Metaponto (MT, Basilicata region, Southern Italy; 40° 23′ 27.7″ N, 16° 47′ 15.1″ E, 15 m a.s.l.) and at the Research Centre for Cereal and Industrial Crops (CREA-CI) experimental station of Osimo (AN, Marche region, Central Italy; 43° 27′ 00″ N, 13° 29′ 27″ E; 41 m a.s.l.).

The lentil seed coat was separated from the cotyledon using a semi-industrial husker (OTAKE, model FS20-SM) at a rotor speed of 2300 rpm.

Extraction procedure

Phenolic compounds were extracted from lentil seed coats using a slightly modified version of the method described by Zhang et al. [8]. Two hundred mg of lentil seed coats were extracted overnight with 20 mL of 70% MeOH containing 0.1% HCl (v/v) at 170 rpm using an orbital shaker (Thermolyne AROS 160, Barnstead International, Boston, MA) at room temperature. After 15 h of extraction, the mixture was centrifuged at 3000 × g for 10 min. The residue was re-extracted twice more, each time with 10 mL of the same solvent and under the same conditions. The collected supernatants were filtered through a 0.45 μm PTFE membrane filter in preparation for spectrophotometric and UHPLC-ESI/QTOF mass spectrometry analysis.

Analysis of phenolic compounds

The content of total phenolic compounds (TPC) of lentil seed coats was measured using the method described by Galgano et al. [4]. The total flavonoid content (TFC) was determined as reported by Caruso et al. [15], while the condensed tannin content (CTC) of the extract was measured by the vanillin assay, as reported by Caruso et al. [16].

Colour of lentils seed coat

Lentil seed coat colour was measured using a Minolta CR-400 spectrophotometer (Minolta Camera Co, Osaka, Japan) according to the International Commission on Illumination (CIELab 1978). After dehulling, samples were transferred into a Petri dish, and lightness (L*) and color coordinates (+a: red; −a: green; +b: yellow; −b: blue) were assessed. The a and b values were used to define the psychometric parameters chroma (C) and hue angle (h) according to Eqs. (1) and (2):

$$C_{ab} = \left[ a^{2} + b^{2} \right]^{1/2}$$
(1)
$$h_{ab} = \arctan\frac{b}{a}$$
(2)
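As an illustration only, Eqs. (1) and (2) can be evaluated with a few lines of Python. The function names and the example readings below are hypothetical and are not taken from the study. One assumption is flagged explicitly: the sketch uses `atan2(b, a)` rather than a plain `arctan(b/a)`, so that samples with a negative a value are assigned a hue angle in the second quadrant instead of a negative angle; the source does not state how such cases were handled.

```python
import math


def chroma(a: float, b: float) -> float:
    """Chroma C = sqrt(a^2 + b^2), as in Eq. (1)."""
    return math.hypot(a, b)


def hue_angle_deg(a: float, b: float) -> float:
    """Hue angle in degrees, mapped to the range [0, 360).

    Assumption: atan2 is used instead of arctan(b / a) so that the
    quadrant is resolved when a is negative or zero.
    """
    return math.degrees(math.atan2(b, a)) % 360.0


# Hypothetical colorimeter reading (not a value from Table 4)
a_value, b_value = 3.0, 4.0
print(chroma(a_value, b_value))         # chroma of the reading
print(hue_angle_deg(a_value, b_value))  # hue angle in degrees
```

For readings with a positive a value the two formulations agree; they differ only for greenish samples (negative a), where the plain arctangent would return a negative angle.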

UHPLC-QTOF-MS untargeted profiling of phenolic compounds

Ultra-high-performance liquid chromatography (UHPLC) coupled with quadrupole-time-of-flight mass spectrometry (UHPLC-QTOF-MS) was used to investigate the untargeted phytochemical profile of the different lentil extracts. A 1290 series UHPLC and a 6550 iFunnel QTOF, both from Agilent Technologies (Santa Clara, CA, USA), were used, following extraction by Ultra-Turrax in 80% methanol, according to a previously reported method [17]. The chromatographic separation was based on a binary mixture of water and acetonitrile (LC–MS grade, VWR, Milan, Italy) acidified with 0.1% formic acid (v/v), and a C18 Zorbax Eclipse Plus column (50 mm × 2.1 mm, 1.8 μm, Agilent Technologies). Accurate masses were acquired in positive scan mode (100–1200 m/z range) at a rate of 0.8 spectra/s (absolute peak height threshold 3000 counts, relative height threshold 0.0001%), with a 6 μL injection for each triplicate sample. The following source conditions were used: nitrogen as both sheath gas (10 L/min and 350 °C) and drying gas (8 L/min and 330 °C), nebulizer pressure = 60 psi, nozzle voltage = 300 V, and capillary voltage = 3.5 kV.

The software Agilent Profinder B.06 (Agilent Technologies, Santa Clara, CA, USA) was used for compound identification from raw mass features, according to the ‘find-by-formula’ algorithm. The compounds were putatively annotated according to level 2 of confidence, as set out in the COSMOS standards in metabolomics [18]. Annotation was achieved using the whole isotope pattern (monoisotopic mass, isotopic spacing, and isotopic ratio), following mass and retention time alignment, with a 5-ppm tolerance for mass accuracy. The reference database for annotations was Phenol-Explorer 3.6 [19]. Only the compounds identified in 100% of replicates in at least one treatment were retained in the dataset. Thereafter, the compounds were first classified into phenolic classes and subclasses, and cumulative abundances were used to estimate the semi-quantitative content of each class, according to a representative standard compound per class [17]. In detail, cyanidin (for anthocyanins), catechin (for flavan-3-ols), luteolin (for flavones and other flavonoids), quercetin (for flavonols), ferulic acid (for hydroxycinnamic acids and other phenolic acids), sesamin (for lignans), resveratrol (for stilbenes), and tyrosol (for tyrosols and other remaining phenolics) were used. The results were expressed as mg equivalents/g dry weight (DW).
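The two filtering criteria described above (a 5-ppm mass accuracy tolerance and retention of compounds found in all replicates of at least one treatment) can be sketched as follows. This is a minimal illustration, not the Profinder implementation: the function names, the data layout, and the m/z values are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def keep_compound(detections: dict[str, list[bool]]) -> bool:
    """True if the compound was detected in every replicate of at
    least one treatment.

    detections maps a treatment label to one flag per replicate
    (hypothetical layout chosen for this sketch).
    """
    return any(len(reps) > 0 and all(reps) for reps in detections.values())


# Hypothetical m/z values: an error of about 4 ppm passes, about 6 ppm fails
print(within_tolerance(300.0012, 300.0000))
print(within_tolerance(300.0018, 300.0000))

# Hypothetical detection pattern over two treatments, three replicates each
print(keep_compound({"treatment_A": [True, True, True],
                     "treatment_B": [True, False, True]}))
```

Under this reading, a compound missing from one replicate of every treatment would be discarded even if it was detected in most injections overall.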

Statistical analysis

All the experiments were performed at least in triplicate with consistent results. Statistical analyses were carried out with Xlstat (Version 2019.4.2, Addinsoft SARL, Paris, France) and the data were expressed as the mean ± SD. The results were analysed using two-way ANOVA, and comparison between groups was performed using the LSD test. Pearson's correlation coefficients were calculated to highlight the significant correlations (p < 0.01 and p < 0.05; two-tailed) between phenolic contents and biological activities, using IBM SPSS 21 software. Regarding the phenolic dataset, data were filtered and normalized using the software Agilent Mass Profiler Professional B.06 [17]. Therein, compound abundances were Log2 transformed, normalized at the 75th percentile, and then baselined against the median of each compound in all samples. Afterwards, unsupervised hierarchical cluster analysis (HCA, Euclidean distance, Ward’s linkage rule) was used to investigate similarities/dissimilarities across samples without prior class information. The metabolomics dataset was then loaded into SIMCA 13 (Umetrics, Malmo, Sweden) for supervised OPLS-DA modeling. Confidence limits of 95% and 99% were used to check for the presence of outliers (suspect and strong outliers, respectively), according to Hotelling’s T2. The model was cross-validated using CV-ANOVA (p < 0.01), and a permutation test (N = 200) was done to exclude overfitting. The model parameters related to prediction were finally recorded (R2Y correlation and Q2Y prediction ability), and the variable importance in projection (VIP) approach was used to identify the most discriminant metabolites (VIP score > 1).
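The three pre-processing steps applied to the phenolic dataset (Log2 transformation, 75th percentile normalization, and baselining to the per-compound median) can be illustrated with a short sketch. This is one plausible reading of those steps, not the Mass Profiler Professional code: it assumes that the percentile normalization is a per-sample subtraction of the 75th percentile on the log scale, that every compound has a positive abundance in every sample, and the sample and compound names are hypothetical.

```python
import math
import statistics


def normalize(abundances: dict[str, dict[str, float]]) -> dict[str, dict[str, float]]:
    """abundances[sample][compound] -> normalized log2 abundance."""
    # Step 1: log2 transform every abundance
    logged = {s: {c: math.log2(v) for c, v in comps.items()}
              for s, comps in abundances.items()}

    # Step 2 (assumption): shift each sample by its own 75th percentile
    shifted = {}
    for s, comps in logged.items():
        p75 = statistics.quantiles(list(comps.values()), n=4,
                                   method="inclusive")[2]
        shifted[s] = {c: v - p75 for c, v in comps.items()}

    # Step 3: baseline each compound to its median across all samples
    compounds = list(next(iter(shifted.values())).keys())
    baselined = {s: {} for s in shifted}
    for c in compounds:
        med = statistics.median([shifted[s][c] for s in shifted])
        for s in shifted:
            baselined[s][c] = shifted[s][c] - med
    return baselined


# Hypothetical raw abundances for three samples and three compounds
example = {
    "S1": {"x": 4.0, "y": 16.0, "z": 64.0},
    "S2": {"x": 8.0, "y": 32.0, "z": 128.0},
    "S3": {"x": 2.0, "y": 16.0, "z": 32.0},
}
normalized = normalize(example)
print(normalized["S3"])
```

After the last step each compound has a median of zero across samples, so the values fed to HCA and OPLS-DA express deviations from the typical sample rather than absolute signal intensities.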

Results and discussion

Total phenolic compounds, flavonoids, and condensed tannins of lentils seed coat

Studies on lentils as functional food are increasing due to their high nutritive value, bioactive compound composition, and polyphenol content, which play an important role in preventing degenerative diseases like diabetes, obesity, and cancers [20]. It is well known that the functional profile of plant-based foods depends on a series of factors that include the interaction of the genetic potential (the cultivar) with edaphic conditions, climate, agronomic management, and post-harvest handling [21]. Eleven lentil genotypes, collected from two different regions (Basilicata and Marche, southern and central Italy, respectively), were characterized in terms of TPC, TFC and CTC. The results provided in Table 2 indicate that TPC varied significantly among the tested lentils: the genotype Palazzo San Gervasio grown in Basilicata had the highest TPC at 70.37 mg GAE/g, whereas Crimson grown in Marche had the lowest at 30.60 mg GAE/g. The results reported here are consistent with those described by Boudjou et al. [22], who evaluated the phenolic content of different legume fractions, reporting mean TPC values of 57.19 mg GAE/g in the seed coats of a blond lentil variety (Lens esculenta var. Petite blonde de Dahra). A similar trend was observed for the TFC and CTC. The genotype Anicia grown in Basilicata showed the highest values of TFC and CTC, with 2.14 and 77.49 mg CAE/g, respectively, whereas the genotype Elsa grown in the Marche region was characterized by the lowest values of TFC (0.86 mg CAE/g) and CTC (22.34 mg CAE/g). According to Pearson’s correlation, CTC was strongly correlated with the TPC (R2 = 0.933), and moderate correlations were found between TPC and TFC (R2 = 0.843) and between TFC and CTC (R2 = 0.706).

Table 2 Total phenolic compounds (TPC, mg GAE/g), total flavonoid compounds (TFC, mg CAE/g), and total condensed tannin content (CTC, mg CAE/g) of lentil seed coats from the 11 different genotypes grown in two different areas (Basilicata and Marche, Italy)

Lower values in terms of TPC and CTC were detected by Sun et al. [23] on ADM Red, ADM Laird green and ADM Eston green cultivars. These differences could be attributed to the source of legume seeds and the cultivation region [24]. To further explore the effect of the growing location (Basilicata or Marche) and the genotype on TPC, TFC and CTC, a two-way analysis of variance (ANOVA) was conducted. The results, reported in Table 2, demonstrated a significant effect of both factors and their interaction on all the determined parameters. All the samples grown in Basilicata showed higher concentrations of phenolic compounds, except for the Santo Stefano genotype (for which there was no significant difference between the two areas, p = 0.05) and Onano (which showed the opposite trend).

Comprehensive phenolic profiling in lentils seed coat

As reported above, geographical origin and genotype were considered as the factors able to affect the untargeted phenolic composition of this food matrix.

The UHPLC/QTOF analysis revealed 402 compounds, with flavonoids (209 compounds), phenolic acids (86 compounds), and low-molecular-weight phenolic compounds (77 compounds) being the most frequent. In particular, the anthocyanins delphinidin 3-O-sambubioside, peonidin 3-O-rutinoside, delphinidin 3-O-(6''-p-coumaroyl-glucoside), cyanidin 3-O-rutinoside/pelargonidin 3-O-sophoroside and cyanidin 3-O-(6''-p-coumaroyl-glucoside), the flavones luteolin 7-O-(2-apiosyl-glucoside) and hesperidin, together with the flavan-3-ols prodelphinidin dimer B3 and (-)-epicatechin/(+)-catechin, were found to be the most abundant among flavonoids. Also, 5-nonadecylresorcinol (belonging to the alkylphenols) and hydroxybenzoic and gallic acids (both belonging to the hydroxybenzoic acids) were the most representative compounds among low-molecular-weight phenolics. These findings agree with the scientific literature, with monomeric forms of flavonols and anthocyanins and proanthocyanidins (polymeric flavonoids) being reported as the most abundant classes described in pulse crops [25]. Unsupervised hierarchical cluster analysis (HCA) allowed a hierarchical comparison of the effect of origin and cultivar on the different phenolic profiles detected (Fig. 2).

Fig. 2 Hierarchical cluster analysis (Euclidean distance) carried out from the untargeted phenolic profiles in lentils seed coat, as determined through UHPLC/QTOF-MS analysis on samples of different cultivars in two different geographical origins

Notably, except for Anicia and S. Stefano, the heat map suggested that the origin was more important than the genotype considered. Also, a similar phenolic profile was found for the Crimson and Elsa varieties, whereas Gaia, from Metaponto (Basilicata), was included in a separate sub-cluster, thus indicating a more distinct phenolic fingerprint.

The phenolic compounds detected were quantified as mg/kg equivalents according to a representative standard per class/sub-class. The results of the semi-quantitative analysis are listed in Table 3. Notably, Crimson, Elsa, Flora, Palazzo, and Itaca from Basilicata showed a higher content of polyphenols (mainly flavonoids and phenolic acids) compared to the genotypes from Marche, with a semi-quantitative content ranging from 1800 to 2200 mg/kg. Conversely, in S. Stefano, Onano, Gaia, IGP Altamura, Lenticchia Rossa, and Anicia, the TPC was found to be higher in those genotypes cultivated in Basilicata.

Table 3 Semi-quantitative data on the different classes of phenolic compounds

Interestingly, the Anicia genotypes from Basilicata and Marche showed the highest content of anthocyanins (136.66 mg/kg and 141.00 mg/kg), flavones (305.66 mg/kg and 257.33 mg/kg) and flavonols (262.00 mg/kg and 335.33 mg/kg). Lenticchia Rossa and Palazzo from Basilicata were the genotypes with the highest flavanol content, at 605.00 mg/kg and 682.66 mg/kg, respectively, whereas Gaia was the genotype with the lowest flavanol content from both Basilicata (142.33 mg/kg) and Marche (208.66 mg/kg).

Also, Flora extracts were characterized by the highest lignan abundance, with values of 153.33 mg/kg (Basilicata) and 197.66 mg/kg (Marche), while S. Stefano showed the lowest, with a content of 97.03 mg/kg and 110.00 mg/kg. Considering the class of phenolic acids, a higher content was found in Lenticchia Rossa, Anicia and Itaca. In Basilicata, the highest phenolic acid content was found in the Itaca variety (230.36 mg/kg), whereas among cultivars from Marche the highest abundance was recorded in Anicia (242.06 mg/kg).

As the next step, a supervised OPLS-DA model was built considering each variety; it confirmed the HCA outcome, since a clear separation between Anicia and Gaia was also achieved by this supervised modelling (Fig. 3). The model parameters were good, with R2Y (goodness-of-fit) = 0.95 and Q2Y (goodness-of-prediction) = 0.73. The robustness of the OPLS model (CV-ANOVA p < 0.01) was confirmed by cross-validation, and the permutation test (number of random permutations = 100) allowed overfitting to be excluded (Table S1 and Fig. S1). The VIP selection method was then carried out to identify those phenolic compounds most influenced by the genotype, considering a VIP score > 1. The compounds possessing the highest discrimination potential (VIP score > 1.4) were flavonoids, mainly flavones and flavanones, as well as phenolic acids (hydroxybenzoic and hydroxycinnamic acids). The flavonoids 6''-O-malonyldaidzin (isoflavonoids), 7-methylnaringenin (flavanones), and malvidin 3-O-arabinoside (anthocyanins) were those possessing the highest discrimination potential. Regarding phenolic acids, the compound 1,2′-disinapoyl-2-feruloylgentiobiose showed a VIP score = 1.34.

Fig. 3 OPLS-DA modelling carried out from the untargeted phenolic profile in lentils seed coat, as determined through UHPLC/QTOF-MS analysis on samples of different cultivars in two different geographical origins. The “cultivar” factor was used for the supervised analysis

Another OPLS-DA model was built to discriminate the geographical origin (Fig. 4). The model parameters were good, with R2Y (goodness-of-fit) = 0.96, Q2Y (goodness-of-prediction) = 0.69, and cross-validation CV-ANOVA p < 0.01. As for the previous OPLS-DA model, the permutation test (number of random permutations = 100) excluded overfitting (Table S2 and Fig. S2). Considering a VIP score > 1, VIP analysis identified 139 discriminant phenolics. The compounds possessing the highest discrimination potential (VIP score > 1.4) were mainly flavonoids (anthocyanins, flavones, and flavonols), followed by phenolic acids and stilbenes. The Venn analysis carried out on the VIP phenolic compounds discriminating cultivars and origin, respectively (Fig. 5), highlighted that only a limited portion of markers (80 compounds, 34.9% of the whole VIP markers) were shared between the two factors, whereas 34.9% discriminated the cultivars and the remaining 25.8% the origin.

Fig. 4 OPLS-DA modelling carried out from the untargeted phenolic profile in lentils seed coat, as determined through UHPLC/QTOF-MS analysis on samples of different cultivars in two different geographical origins. The “geographical origin” factor was used for the supervised analysis

Fig. 5 Venn analysis concerning the phenolic markers identified as discriminants of origin and cultivar in lentils seed coats

Despite being recognized as a valuable source of phenolics, the composition of legume plants (including lentils) is known to be affected by a series of abiotic (drought, soil salinity, agricultural practices, extreme temperatures) and biotic (pathogens, soil biodiversity and nematodes) factors [26]. Nonetheless, it is well-known that crop ecotypes result from the interaction between genetic background and environment, finally shaping the profile of bioactive compounds [26]. Therefore, it is not surprising that the combined effect of both genetic bases (the cultivar) and origin (pedoclimatic conditions) ultimately resulted in distinctive phenolic profiles in lentils, as our untargeted metabolomics approach highlighted.

Finally, Pearson’s coefficients were calculated to investigate the potential correlations between the different phenolic classes and the in vitro spectrophotometric results on antioxidant compounds (i.e., total phenolic, total flavonoids, and total condensed tannins content). A significant (p < 0.01) correlation coefficient was found between cyanidin equivalents and TPC (0.369), TFC (0.503), and CTC (0.663); similar results were found for catechin equivalents with a correlation coefficient of 0.484 (TPC), 0.529 (TFC), and 0.382 (CTC). Regarding quercetin and luteolin equivalents, a significant and positive correlation coefficient was recorded with CTC, being 0.450 and 0.333, respectively. Also, ferulic acid correlated with CTC, with a coefficient of 0.266 (p < 0.05).
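For readers who wish to reproduce this kind of analysis, the Pearson coefficient between two measurement series can be computed as sketched below. The function name and the example numbers are hypothetical and do not come from Tables 2 or 3; the study itself reports using IBM SPSS for this step.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical class content and assay values for five samples
class_equivalents = [1.0, 2.0, 3.0, 4.0, 5.0]
assay_values = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(class_equivalents, assay_values)
print(r, r ** 2)  # coefficient and its square
```

Note that r and its square are different quantities: a coefficient of 0.5 corresponds to a squared value of 0.25, so it matters which of the two a given table or sentence is reporting.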

Lentil seed coat color

Lentil seed coats can be of different colors, such as yellow, orange, red, green, brown, or black, depending on the cultivar and the composition of the seed coats and cotyledons [5]. The L* value of the 22 samples ranged from 36.71 for the Anicia genotype grown in the Marche region to 59.8 for IGP Altamura (Marche) (Table 4). The a* values varied from -0.84 for the Anicia genotype (Marche) to 7.92 for Crimson (Marche). The b* values ranged from 10.43, found in the Anicia genotype (Marche), to 26.57 for the Onano genotype (Marche). Chroma ranged from 10.48 for the Anicia genotype (Basilicata) to 26.86 for the Gaia genotype grown in the Marche region. The lowest h° values were recorded for the Anicia genotype (both Basilicata and Marche), whereas the highest was recorded for Palazzo San Gervasio (Basilicata). The statistical analysis showed, as expected, a significant difference between the samples; in general, samples grown in the Marche region showed higher lightness and chroma values, with a more reddish and yellow color than the samples grown in Basilicata, as evidenced by the parameters a* and b*, respectively.

Table 4 Color parameters of lentil seed coats from the 11 different genotypes grown in two different areas (Basilicata and Marche, Italy)

Generally, there are conflicting opinions on the correlation between phenolic compounds and the color of the seed coats [27]. In this context, the correlation between TPC, TFC and CTC, as well as the semi-quantitative data on phenolic classes, and the color parameters was investigated through Pearson’s coefficients. There were no correlations between TPC and color values, unlike TFC and CTC. In detail, TFC was negatively correlated with the a* value (r = − 0.650; p < 0.01), whereas CTC was correlated with both a* (r = − 0.613; p < 0.01) and h° (r = − 0.451; p < 0.05) values. Similarly, Mojica et al. [28] reported no significant correlation between TPC and a*, whereas Giusti et al. [29] found a weak negative correlation between TPC and h° and between TPC and C. Considering the comprehensive phenolic profiling of the lentil seed coats, all the color parameters were negatively correlated with the cyanidin and luteolin equivalents. Catechin, sesamin and tyrosol equivalents were not correlated with any color parameter, whereas ferulic acid equivalents were correlated with b* (r = − 0.469; p < 0.05), C (r = − 0.520; p < 0.05) and h° (r = − 0.487; p < 0.05). Several authors have reported that the differences in phenolic composition between varieties grown in specific geographical areas could be related to the color of the seed [5, 30]. However, the results here only partially confirm those of Xu et al. [5], who reported a good correlation among the color parameters, TPC, TFC and CTC. Based on the results obtained, it is worth pointing out that the variability found in the polyphenolic profile seems to be only partially due to the seed coat color. In particular, the effect of the geographic location and the genotype seems to be more relevant. Rodríguez Madrera et al. [31] came to a similar conclusion when evaluating the phenolic content in seeds of common bean (Phaseolus vulgaris L.).

Conclusion

The phenolic profile of lentil seed coats was investigated through a combination of spectrophotometric quantification of phenolics, flavonoids and tannins, together with comprehensive profiling through untargeted metabolomics. The approach highlighted that these lentil by-products can be a valuable source of phenolics and thus a promising food ingredient. At the same time, the geographical origin, followed by the genetic background, was identified as a significant factor to be considered when lentil seed coat is to be used as a food ingredient to improve the functional traits of processed foods.