Introduction

Interest in the characterization of condensed tannins has increased in the last few decades. These phenolic compounds are oligomers and polymers of flavan-3-ols, which belong to the flavonoid class of polyphenols widely distributed in the plant kingdom. Tannin composition may differ in the nature and number of constitutive units [1], in the type (B or A) and location of the interflavanic linkages connecting the monomeric units (mostly C4–C8 and rarely C4–C6) and, lastly, in the conformation (linear versus branched) of the polymers [2]. All these parameters are believed to be genetically controlled because they vary with plant species and, within a given species, with plant tissue, organ and physiological stage. In studies of the biological, nutritional and sensory properties of this class of compounds [3–7], the degree of polymerisation is of particular interest because of the extent to which it may influence these properties.

Thiolysis and phloroglucinolysis [8, 9] of a tannin fraction are currently the methods most often used to obtain an average composition, that is, the nature of the constitutive units of the tannins, their relative proportions and an estimate of the average degree of polymerisation (aDP). These depolymerisation methods are difficult to implement and give no information about the polymer distribution of a tannin fraction, because all the polymers contained in the fraction are cleaved into monomer units in the course of the reaction. Moreover, of the few efficient fractionation and analytical techniques that exist for characterizing tannin composition, none gives access to a reliable polymer distribution of the corresponding tannin fractions. Condensed tannins can be separated up to decamers on a liquid chromatography diol phase, whereas beyond decamers they are not retained on the stationary phase with methanol elution and are therefore not fractionated. Most size exclusion chromatography (SEC) techniques [10–13] do not work properly with tannins because the tannins adsorb on the support materials. Nevertheless, tannins can be fractionated according to their degree of polymerisation by SEC by using polar aprotic solvents such as dimethylformamide (DMF) or by saturating hydrogen-bonding sites with urea [8, 14].

Mass spectrometry (MS) is an interesting alternative technique that allows condensed tannins to be detected without sample pre-treatment. Electrospray ionization (ESI) [15] and matrix-assisted laser desorption ionization (MALDI) [16] are soft ionization methods suited to ionising non-volatile analytes such as biopolymers. Since the early 1990s, these methods have been applied to the analysis of phenolic compounds in plants. Although MALDI-MS allows the detection of slightly higher degrees of polymerisation (DPs) than those observed by ESI-MS (up to DP10 [17–19] or DP15 [20] in ESI-MS, and up to DP10 [21, 22], DP15 [23] or DP20 [16] in MALDI-MS), difficulty in detecting higher DPs [24] has been reported with both techniques. In ESI-MS, discrimination in favour of the lower molecular weight ions is observed, so that DPs greater than 15 are not detected even when the average degree of polymerisation determined by depolymerisation methods is higher than this value.

These limitations suggest that the technique may be relevant only in a restricted number of situations and that its actual range of application remains to be determined. The objective of this work was therefore to determine the conditions under which mass spectrometry can properly describe the mass distribution of a tannin fraction. For this purpose, four fractions with different aDP values were analysed under varying conditions defined by a full factorial experimental design. By considering all the mass spectra acquired and the reliability of the fingerprints of the fractions, we sought a better understanding of the behaviour of tannins in ESI-MS so that guidelines for their analysis could be established.

Experimental

Preparation of tannin fractions

Fractions of tannins were obtained from two cider apple cultivars (Avrolles (A) and Kermerrien (K)) in order to have fractions with various aDP values. Freeze-dried powders of apple cortex tissues were kindly prepared as described previously [25] and provided by S. Guyot. Tannins were extracted from the apple powder of each variety with methanol (K1 and A1) and then with water/acetone (K2 and A2). The methanol and acetone layers were concentrated separately under vacuum and each extract was then fractionated by Fractogel chromatography [26, 27] (methyl acrylate copolymer in solution in aqueous ethanol 20%, HW-50F). The fractionation was achieved with a CH3OH/CH3COOH solution (99:1 v/v), then higher molecular weight tannins were eluted with a H2O/CH3COCH3/CH3COOH solution (39.5:59.5:1 v/v/v). The fractions studied in this work are the last fractions eluted with the H2O/CH3COCH3/CH3COOH solution for each of the four extracts (K1, K2, A1 and A2).

Depolymerization conditions (thiolysis) and analytical reversed-phase high-performance liquid chromatography (HPLC)

A 1 mg L−1 solution of each dry extract was prepared in methanol. In a 250-μL glass insert, 100 μL of the solution was introduced and 100 μL of toluene-α-thiol (5% v/v in methanol acidified with 0.2 N HCl) was added. The glass insert was sealed with an inert cap and the reactions were then carried out at 90 °C for 2 min. Thiolysis reaction media (10 μL) were injected directly into the HPLC system, a Waters Alliance (Milford, MA) (2690 separation module, 2966 photodiode array detector, Empower Pro 2 software). The column (125 × 4 mm, 3 μm, 120 Å) was a Nucleosil 120-3 C18 endcapped (Macherey-Nagel, Seden). The flow rate was 0.8 mL min−1 and the gradient conditions were solvent A (H2O/HCOOH, 98:2 v/v); solvent B (CH3CN/H2O/HCOOH, 80:18:2 v/v/v); initial 15% B; 0–15 min, 75% B linear; 15–20 min, 100% B linear; 20–22 min, 100% B isocratic; 22–25 min, 15% B linear.

Mass spectrometry

Analyses were performed on an AccuTof JMS-T100LC ESI time-of-flight (TOF) mass spectrometer (Jeol, Japan). Tannin fraction samples were analysed by flow injection analysis (FIA) in H2O/CH3CN (50:50 v/v) with a flow of 0.250 mL min−1. Source temperature was set at 250 °C and that of curtain gas flow at 80 °C; the spray voltage was set at 2 kV in the positive ionization mode or −2 kV in the negative ionization mode. The mass spectra were acquired over a mass range 200–4,000 Th in positive and negative ionization modes.

Experimental design

The ionization mode, the cone voltage and the solvent acidity are known to greatly influence tannin response in ESI-MS. The influence of these instrumental factors was investigated for the four tannin fractions (K1, K2, A1, A2) according to a full factorial design crossing:

  • Three levels of solvent acidity: 0.1%, 1% or 2% of formic acid, corresponding to pH 6, pH 4 and pH 2, respectively.

  • Two ionization modes: ESI+ (positive) and ESI− (negative).

  • Three levels of cone voltages: 45 V, 65 V and 85 V.

Each experiment was carried out in duplicate, which leads to 72 mass spectra for each ionization mode (4 fractions × 3 acidity levels × 3 cone voltages × 2 replicates).

Throughout this paper, experiments are labelled as follows:

  • First character refers to the apple cultivar from which the tannin fraction was prepared: K (Kermerrien) or A (Avrolles).

  • Second character refers to the solvent used for tannin extraction: 1 for methanol, 2 for the water/acetone mixture.

  • Third and fourth characters correspond to the cone voltage (45, 65 or 85 V) used in the experiment.

  • Fifth character corresponds to the solvent acidity used in the experiment: a = 0.1%; b = 1%; c = 2% of formic acid added.

  • Sixth character (1 or 2) identifies the replicate.
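The labelling scheme and the size of the design can be checked with a short script. The following is a minimal sketch in Python (the original routines were written in Matlab); the function names `design_labels` and `parse_label` are hypothetical and introduced here only for illustration.

```python
from itertools import product

# Factor levels taken from the experimental design described above.
FRACTIONS = ["K1", "K2", "A1", "A2"]
CONE_VOLTAGES = [45, 65, 85]                     # volts
ACIDITY_CODES = {"a": 0.1, "b": 1.0, "c": 2.0}   # % formic acid
REPLICATES = [1, 2]


def design_labels():
    """Enumerate every experiment label for one ionization mode."""
    return [
        f"{fraction}{voltage}{acid}{rep}"
        for fraction, voltage, acid, rep in product(
            FRACTIONS, CONE_VOLTAGES, ACIDITY_CODES, REPLICATES
        )
    ]


def parse_label(label):
    """Decode a six-character label such as 'K245b1' into its factors."""
    return {
        "cultivar": label[0],
        "extraction": int(label[1]),
        "cone_voltage": int(label[2:4]),
        "formic_acid_percent": ACIDITY_CODES[label[4]],
        "replicate": int(label[5]),
    }


labels = design_labels()
print(len(labels))            # number of spectra per ionization mode
print(parse_label("K245b1"))
```

Running the enumeration once per ionization mode reproduces the 72 spectra per mode stated above.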

Estimation of aDP from the mass spectrum

For each experiment, the closeness between the aDP value calculated from the mass spectrum and the aDP value measured by thiolysis (considered as the standard value) was chosen as a quality criterion. This comparison provides information about the accuracy of tannin detection. If the calculated aDP value is close to the aDP value obtained chemically, we can assume that the tannin ions present in the sample were appropriately detected and that the mass spectrum therefore gives accurate information about the tannin composition of the sample. Furthermore, comparing the estimates obtained under different analytical conditions allows the influence of the factors studied to be established, so that optimised experimental conditions can then be used to get closer to the standard aDP values.

The aDP value for each experiment was estimated automatically from the mass spectrum using a Matlab (version 7.3, The MathWorks Inc., MA) routine developed in house. The aDP is calculated after automatic reading of the ion intensities at the known masses of the tannin oligomers: as the tannins analysed in this study are composed solely of a variable number of epicatechin subunits [28], two tannin species with consecutive DPs give singly charged ions separated by 288 Th on the mass spectrum (144 Th for doubly charged ions). Within each isotopic cluster, informative mass peaks were selected as a function of the ion charge state (singly charged: the first peak, which corresponds to the monoisotopic ion, and the second one; doubly charged: second and third peaks; triply charged: third and fourth peaks). Finally, as doubly charged ions of even DP are superimposed on singly charged ions (Fig. S1 in the “Electronic supplementary material”) and some triply charged ions are located under some singly and doubly charged ions, a correction was applied to each singly charged oligomer by subtracting the contribution of the third isotopic peak of the doubly charged species that contains exactly twice the number of constitutive units.
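The peak positions and the overlap that motivates the correction can be illustrated numerically. The sketch below is not the in-house Matlab routine. It assumes B-type linkages only, uses standard monoisotopic masses, and takes the aDP as the intensity-weighted (number-average) DP, which presumes that ion intensity is proportional to molar abundance. All function names are hypothetical.

```python
H_ATOM = 1.007825        # monoisotopic mass of a hydrogen atom (u)
PROTON = 1.007276        # mass of a proton (u)
UNIT = 288.063388        # C15H12O6, one epicatechin unit within a B-type chain (u)
ISOTOPE_STEP = 1.003355  # 13C - 12C mass difference (u)


def neutral_mass(dp):
    """Monoisotopic mass of a B-type epicatechin oligomer with dp units."""
    return dp * UNIT + 2 * H_ATOM


def ion_mz(dp, charge, positive=True):
    """m/z of [M + zH]z+ (positive=True) or [M - zH]z- (positive=False)."""
    sign = 1 if positive else -1
    return (neutral_mass(dp) + sign * charge * PROTON) / charge


def isotope_mz(dp, charge, k, positive=True):
    """m/z of the k-th isotopic peak (k = 0 is the monoisotopic peak)."""
    return ion_mz(dp, charge, positive) + k * ISOTOPE_STEP / charge


def estimate_adp(intensity_by_dp):
    """Assumed estimator: intensity-weighted mean DP over all detected DPs."""
    total = sum(intensity_by_dp.values())
    if total <= 0:
        raise ValueError("no signal to average")
    return sum(dp * i for dp, i in intensity_by_dp.items()) / total


# Third isotopic peak of the doubly charged octamer versus the
# monoisotopic peak of the singly charged tetramer (negative mode).
overlap = isotope_mz(8, 2, 2, positive=False) - ion_mz(4, 1, positive=False)
print(f"m/z difference: {overlap:.4f} Th")
```

Under these assumptions the two peaks differ by only a few thousandths of a Th, which is why the contribution of the doubly charged species has to be subtracted from the singly charged signal before the intensities are averaged.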

Chemometrics

Data pre-treatment

Fingerprints were derived from the total ion current (TIC) using the Mass Center Software (Jeol, Tokyo, Japan) and exported in ASCII format to the Matlab software. Mass spectra were first smoothed using the Savitzky–Golay method [29] and aligned on a reference (a vector of the known masses of the tannin oligomers) in order to obtain a homogeneous data matrix for each ionization mode.
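The two pre-treatment steps can be sketched as follows. The window length and polynomial order used by the authors are not given, so the five-point quadratic Savitzky–Golay filter below is an assumption, as are the alignment tolerance and the function names `savgol5` and `align_to_reference`.

```python
# Five-point quadratic/cubic Savitzky-Golay smoothing coefficients (sum = 35).
SG5 = (-3, 12, 17, 12, -3)


def savgol5(y):
    """Smooth a list of intensities; the two points at each edge are left unchanged."""
    out = [float(v) for v in y]
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + j - 2] for j, c in enumerate(SG5)) / 35.0
    return out


def align_to_reference(mz, intensity, reference_mz, tol=0.5):
    """Read one intensity per reference mass: the maximum within +/- tol Th.

    Reference masses with no data point in the window get 0.0, so every
    spectrum yields a row of the same length (a homogeneous data matrix).
    """
    row = []
    for ref in reference_mz:
        window = [i for m, i in zip(mz, intensity) if abs(m - ref) <= tol]
        row.append(max(window) if window else 0.0)
    return row


smoothed = savgol5([0, 0, 0, 35, 0, 0, 0])
print(smoothed)
print(align_to_reference([289.0, 289.2, 577.1], [1.0, 5.0, 2.0],
                         [289.07, 577.13, 865.20]))
```

A polynomial filter of this kind reduces noise while preserving peak shape better than a plain moving average, which matters when intensities are read at fixed oligomer masses.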

Data treatment

In order to visualize the influence of tannin fraction, cone voltage and solvent acidity on the whole set of spectra recorded, a principal component analysis (PCA) was performed for each mode. When applied to spectral data, PCA allows similarity maps of the samples (or scores plots) and spectral patterns (or loadings) to be drawn. The similarity maps allow the spectra to be compared in such a way that two similar spectra are represented by two neighbouring points, whereas the spectral patterns show the m/z values that explain the similarities observed on the maps. Before PCA, the number of data points in each spectrum was reduced by selecting the maximum intensity value in each 0.1-m/z interval. Each mass spectrum was then normalised by dividing each recorded intensity by the sum of all the intensities.
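The data reduction and normalisation that precede PCA can be sketched in a few lines. This is a minimal illustration, not the authors' routine: the function names are hypothetical and the 200–4,000 m/z limits are taken from the acquisition range given in the mass spectrometry section.

```python
def bin_max(mz, intensity, width=0.1, mz_min=200.0, mz_max=4000.0):
    """Keep the maximum intensity found in each m/z interval of the given width."""
    n_bins = int(round((mz_max - mz_min) / width))
    binned = [0.0] * n_bins
    for m, i in zip(mz, intensity):
        if mz_min <= m < mz_max:
            idx = min(int((m - mz_min) / width), n_bins - 1)
            if i > binned[idx]:
                binned[idx] = i
    return binned


def normalise(spectrum):
    """Divide every intensity by the sum of all intensities in the spectrum."""
    total = sum(spectrum)
    if total == 0:
        raise ValueError("cannot normalise an empty spectrum")
    return [value / total for value in spectrum]


mz = [289.01, 289.04, 289.12, 577.13]
intensity = [10.0, 40.0, 25.0, 50.0]
row = normalise(bin_max(mz, intensity))
print(len(row), round(sum(row), 6))
```

Each spectrum processed this way becomes one row of the data matrix. PCA itself would then be run on the mean-centred matrix, for example through a singular value decomposition.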

Results and discussion

Annotation of spectra

All mass spectra were annotated for the masses corresponding to the different DPs. The aDP values determined by thiolysis followed by reversed-phase HPLC are presented for the four tannin fractions in Table 1. As an example, the annotated spectra of fraction K2 (aDP15.5) obtained in positive and negative ionization modes for a cone voltage of 45 V and a solvent acidity of 1% are shown in Fig. 1. The ESI+ mass spectrum in Fig. 1 consists of the superposition of two distributions corresponding to the singly and doubly charged species obtained for the different tannin oligomers and polymers present in the fraction (mass difference between singly charged species = 288; mass difference between doubly charged species = 144). The ESI− mass spectrum consists of the superposition of three distributions corresponding to singly, doubly and triply charged species (mass difference between triply charged species = 96), the last of these species not being detected in the positive mode.
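The spacings quoted above follow from dividing the mass of one epicatechin unit by the charge. As a hedged illustration, the sketch below infers the charge state of a series from the spacing between consecutive peaks; the helper names and the 1 Th tolerance are assumptions.

```python
UNIT = 288.063  # mass added per epicatechin unit in a B-type chain (u)


def spacing_for_charge(charge):
    """Expected m/z spacing between consecutive DPs at a given charge state."""
    return UNIT / charge


def charge_from_spacing(delta_mz, max_charge=3, tol=1.0):
    """Return the charge whose expected spacing matches delta_mz, else None."""
    for charge in range(1, max_charge + 1):
        if abs(delta_mz - spacing_for_charge(charge)) <= tol:
            return charge
    return None


for delta in (288, 144, 96):
    print(delta, "->", charge_from_spacing(delta))
```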

Table 1 Composition of tannin fraction estimated by chemical depolymerization (thiolysis)
Fig. 1 K2 (aDP15.5) mass spectra in ESI-MS: a positive mode and b negative mode. Singly charged ions (*), doubly charged ions (○) and triply charged ions (◊)

In theory, tannins are deprotonated more readily than they are protonated. This can be related to the fact that, of the five hydroxyl groups present in a tannin monomer unit, four are phenolic hydroxyls: deprotonation is favoured by the stability of the phenolate ion. Protonation of each monomer unit can take place preferentially at two sites of the heterocycle referred to as the C ring (first on the oxygen in position 1 of the heterocycle and then on the hydroxyl in position 3). Nevertheless, these first results show that the positive ionization mode gives higher sensitivity and allows ions corresponding to higher DPs to be detected than the negative ionization mode. Depending on the fraction composition, the highest detectable DPs can be either similar or very different in the two ionization modes. The highest DPs observed in the mass spectra obtained in the positive and negative ionization modes are close for the K2 (DP25 (ESI+), DP22 (ESI−)) and A2 (DP26 (ESI+), DP22 (ESI−)) tannin fractions but differ markedly for the K1 (DP22 (ESI+), DP11 (ESI−)) and A1 (DP20 (ESI+), DP9 (ESI−)) tannin fractions. The highest DP detected by ESI-MS (DP26) was observed with the A2 tannin fraction. This value is a clear improvement on the maximum values reported in the prior literature [20]. A summary of the multicharged species encountered in both ionization modes in relation to the DP value of a given tannin is shown in Fig. 2.

Fig. 2 Ionic species detected by ESI-MS in positive and negative ionization modes

Principal component analysis

PCA was applied to the spectral libraries recorded in ESI− and ESI+ ionization modes. The similarity map (or scores plot) defined by principal components 1 and 2 (accounting for 86.1 and 87.8% of the spectral variance for ESI− and ESI+, respectively) and the corresponding spectral patterns (or loadings) are presented in Fig. 3.

Fig. 3 PCA of the 72 spectra recorded by ESI-MS in each of the positive and negative ionisation modes: a layout of spectra in the first principal plane (each spectrum is represented by the label AAXXz1: AA = tannin fraction, XX = cone voltage (V), z = formic acid percentage (a = 0.1%, b = 1%, c = 2%), 1 or 2 = duplicate); b loadings for the first principal component and c loadings for the second principal component for each mode. The highest masses are shown in the insets

Figure 3a shows that similar results were obtained for both ionization modes, although some minor differences were observed. Component 1 discriminates between the fractions regardless of the acquisition conditions: along this axis, from left to right, lie K1 (aDP6.7), A1 (aDP20.9) and then K2 (aDP15.5) and A2 (aDP49.5), which are superimposed. On the basis of the data in Table 1, component 1 therefore does not sort the samples in order of increasing aDP. As shown in Fig. 3b, the negative part of the loading for the first principal component is composed of low DP tannins mostly detected as singly charged ions (for ESI−: m/z = 289, 577, 865,…), in contrast to the positive part, where higher DPs are observed as multicharged ions (for ESI−: m/z = 720, 864, 1,008, 1,152, 1,296, 1,440, 1,584,…). Multicharged species were therefore mainly produced for the K2 and A2 fractions. Nevertheless, the presence of multicharged ions cannot be directly linked to aDP: their relative proportion in the mass spectrum of fraction K2 is roughly greater than or equal to that in the spectra of the fractions with higher aDP (A1 and A2). Although a common behaviour is apparent, some differences between the two ionization modes can be identified from Fig. 3b. On the one hand, as previously stated, more multicharged ions are produced in ESI− (m/z = 1,343, 1,439, 1,538,…, corresponding to triply charged species). On the other hand, the polymer distribution reflected by the loadings differs between the ionization modes: in the positive ionization mode the negative part of the PC1 loading (low DPs) has a Gaussian appearance with a large envelope composed of tannin ions ranging from DP2 to DP7 (m/z = 579 to m/z = 2,021), centred on DP4, whereas in the negative ionization mode this envelope is narrower (between DP2 and DP5) with two major tannin ions (DP3 and DP4).

This apparent difference in polymer distribution profile may be due to the fact that, in the negative ionization mode, the polymers are distributed over three charge states, whereas in the positive ionization mode the same polymers are distributed over two charge states. In the negative ionization mode, deprotonation is favoured by the stability of the phenolate ion, which gives rise to more multicharged ions that appear at much lower intensities. These results are further arguments in favour of the positive ionization mode, which seems more suitable for tannin analysis.

For both ionization modes, component 2 separates the samples according to the cone voltage and, to a lesser extent, the solvent acidity. Spectra acquired with a high cone voltage (85 V) and the lowest solvent acidity (0.1%) have positive coordinates on axis 2, whereas negative coordinates are obtained for spectra acquired with a low cone voltage (45 V) and solvent acidity values greater than 0.1%.

The ions contributing to the loadings of principal component 2 (Fig. 3c) are mainly located at the oligomer masses decreased by 2 units (ESI−: m/z = 287, 575, 863,…). These ions can be considered as markers of the fragmentation of interflavan linkages (for further details see Fig. S2 in the “Electronic supplementary material”) and they appear in the positive part of the loading, whereas fragment ions and/or multicharged ions are observed in its negative part. Increasing the cone voltage and decreasing the percentage of acid in the solvent are therefore reflected in the mass spectrum by an increase in fragmentation and a decrease in the proportion of multicharged species.
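The marker positions quoted for the negative mode can be reproduced by removing two hydrogen atoms from each deprotonated oligomer. The sketch below assumes B-type epicatechin oligomers and standard monoisotopic masses; the function names are hypothetical.

```python
H_ATOM = 1.007825   # monoisotopic mass of a hydrogen atom (u)
PROTON = 1.007276   # mass of a proton (u)
UNIT = 288.063388   # C15H12O6, one epicatechin unit within a B-type chain (u)


def deprotonated_mz(dp):
    """m/z of the singly charged [M - H]- ion of an oligomer with dp units."""
    return dp * UNIT + 2 * H_ATOM - PROTON


def fragment_marker_mz(dp):
    """m/z of the marker ion located 2 units below the [M - H]- ion."""
    return deprotonated_mz(dp) - 2 * H_ATOM


print([round(fragment_marker_mz(dp)) for dp in (1, 2, 3)])  # [287, 575, 863]
```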

Again, although both ionization modes behave very similarly overall, some differences between them can be highlighted. In the negative ionization mode, lowering the solvent acidity favoured the deprotonation of tannins, and concomitantly increasing the cone voltage led to more fragmentation. This phenomenon is observed with the higher aDP fractions: they are likely to be composed of high molecular weight tannins, which are converted in the ESI source into multicharged ions that are more sensitive to fragmentation. Indeed, more fragment ions are produced from the A2 (and K2) fractions than from K1 at a high cone voltage and a low acidity. Comparison of the mass spectra of the A2 and K1 tannin fractions shows that fragment ions are more abundant in the A2 (aDP49.5) mass spectrum than in the K1 (aDP6.7) spectrum (see Fig. S3 in the “Electronic supplementary material”). Hence, the maximum of the A2 ion envelope at 85 V is shifted toward lower m/z values corresponding mainly to monomer, dimer and trimer species, whereas the maximum of the K1 ion envelope is unchanged at 85 V. In the positive ionization mode, however, this is not observed (see Fig. S3b in the “Electronic supplementary material”): the maximum intensity of the K1 ion envelope moved to m/z 579, which corresponds to the dimer species.

In the positive ionization mode, the positive part of the loading for principal component 2 (Fig. 3c) is composed of m/z − 2 values coming mostly from the DP2 fragment along with the DP1 and DP3 fragments, in contrast to the negative ionization mode, where the m/z − 2 values correspond to a larger range of fragments (DP1 to DP5). This difference could be explained by the difference in the yield of multicharged ions between the two ionization modes (fewer multicharged ions in the positive ionization mode). As in the negative ionization mode, lowering the acidity affects tannin fragmentation, but to a lesser extent. Thus, the distribution along the vertical axis of the samples analysed at a cone voltage of 85 V is homogeneous in the positive ionization mode. The heterogeneity observed in the negative ionization mode for the samples analysed under the same conditions seems to result from the interaction between three factors, namely the nature of the fraction, the cone voltage and the solvent acidity.

Calculated aDP and characteristics of ESI mass spectra under optimum conditions

The estimated aDP values obtained from each experiment of the factorial design are presented in Table 2. The best results were obtained in positive and negative ionization modes with a cone voltage of 45 V and more than 1% of formic acid; these settings are henceforth referred to as the optimised conditions. These results can be interpreted in the light of the spectral changes described in the preceding section: aDP estimates are higher when the proportion of doubly and triply charged ions is high (influence of fraction composition and cone voltage) and when the level of fragmentation is low (influence of cone voltage and acidity).

Table 2 Comparison between aDP value established by thiolysis and calculated from the mass spectrum

The aDP values calculated from mass spectra generated under the optimised conditions are close to the aDP values estimated by thiolysis for half of the fractions studied (aDP5.2 (MS) vs. aDP6.7 (thiolysis) and aDP11.1 (MS) vs. aDP15.5 (thiolysis) for the K1 and K2 fractions, respectively).

Conversely, the aDP values determined from mass spectra for the A1 and A2 fractions are much lower than those obtained by thiolysis (aDP5.7 (MS) vs. aDP20.9 (thiolysis) and aDP11.8 (MS) vs. aDP49.5 (thiolysis), respectively). None of the 36 experimental conditions tested gave aDP values close to the standard ones for the Avrolles (A1 and A2) tannin fractions. Table 2 shows that a decrease in cone voltage greatly improves the relevance of the mass spectrum obtained. However, preliminary tests conducted to determine the range of variation for the factors of the experimental design showed that the quality of the spectrum deteriorates markedly below 40 V. This leads us to believe that the best results in Table 2 are probably the best that can be obtained.
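The size of the underestimation is easier to compare across fractions when it is expressed as the ratio of the MS estimate to the thiolysis value. The short calculation below uses only the aDP values quoted in the text; the `recovery` helper is a hypothetical name.

```python
# (aDP from mass spectrum, aDP from thiolysis) under the optimised conditions
ADP = {
    "K1": (5.2, 6.7),
    "K2": (11.1, 15.5),
    "A1": (5.7, 20.9),
    "A2": (11.8, 49.5),
}


def recovery(adp_ms, adp_thiolysis):
    """Fraction of the thiolysis aDP that is recovered from the mass spectrum."""
    return adp_ms / adp_thiolysis


for fraction, (adp_ms, adp_thiolysis) in ADP.items():
    print(f"{fraction}: {recovery(adp_ms, adp_thiolysis):.2f}")
```

The Kermerrien fractions recover roughly three quarters of the standard value, whereas the Avrolles fractions recover only about a quarter.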

The pairs K2/A2 and K1/A1 each give similar mass spectra under the optimised acquisition conditions (Fig. 4), as previously evidenced by PCA. The highest DP detected in this work by ESI+-MS is DP26, which reflects the limit of detection of direct mass spectrometry analysis. Thus, direct ESI-MS analysis of tannin fractions gives an accurate estimate of tannin composition provided that the tannin fraction is mainly composed of tannins below DP26. In this study, direct ESI-MS analysis is suitable for the K1 and K2 tannin fractions but does not provide accurate information about tannin composition for the A1 and A2 fractions. These discrepancies between the aDP values obtained by thiolysis and those calculated from mass spectra demonstrate how difficult it is, for some tannin fractions, to obtain the actual tannin distribution by direct mass spectrometry analysis.

Fig. 4 Mass spectra of the tannin fractions generated under optimised conditions by ESI+-MS: a K1 (aDP6.7 from thiolysis); b K2 (aDP15.5 from thiolysis); c A1 (aDP20.9 from thiolysis) and d A2 (aDP49.5 from thiolysis)

Moreover, it should be noted that the K2 and A1 fractions, which have similar standard aDP values, generate mass spectra with different profiles. Thiolysis of the A1 fraction gave an aDP of 20.9, but only low DP tannins appear prominently in the mass spectrum, so that the aDP value calculated from the mass spectrum is only 5.7. This suggests that the composition of the tannin fraction, that is, its polymer distribution or polydispersity, has a great influence on the mass spectrum profile. This difference in polymer distribution between the tannin fractions can be explained by the two cider apple cultivars used in these experiments. Kermerrien cultivars are composed of tannins with aDP12 [30], whereas Avrolles cultivars showed significant concentrations of tannins with aDP50 [31]. Thus the tannin fractions extracted from Avrolles (A1 and A2) should cover a larger range of tannin DPs than those extracted from Kermerrien (K1 and K2). The A1 fraction should contain, to a larger extent than K2, tannins whose DPs cannot be detected by direct MS analysis because of the limit of detection of this analytical technique.

To address the difficulty in detecting the higher DP tannins, a complementary experiment was carried out to determine whether the presence of tannins of lower molecular weight (DP2 to DP7 in the A1 fraction and DPs around 15 in the A2 fraction) was responsible for spectral discrimination against high DP tannins. The K1 and K2 tannin fractions were chosen for this experiment because their aDP values estimated by mass spectrometry are the closest to the aDP values obtained by thiolysis. Solutions of both tannin fractions (K1 (aDP6.7) and K2 (aDP15.5)) were analysed under the same conditions (130 μM) (Fig. S4a, b in the “Electronic supplementary material”). The intensities differ between the fractions: the K1 fraction, composed of low DP tannins, gives a mass spectrum with a maximum ion intensity ten times higher than that observed for the K2 fraction. The mass spectrum obtained from the equimolar combination of these two tannin solutions (Fig. S4c in the “Electronic supplementary material”) shows no spectral discrimination against the highest DPs in favour of the lowest, because the ratio of 10 is conserved. Therefore, poor ionization of the highest molecular weight tannins is most likely the reason for the absence of higher DP ions in the mass spectra. Secondly, these results, which show the absence of competition between small DP tannins, confirm the ability of mass spectrometry to analyse them.

The poorer ionization (or the absence of ionization) of high DP tannins led us to significantly underestimate the aDP values calculated from mass spectra, especially in two cases. This underestimation can be problematic when the fraction mainly contains tannins with DPs higher than the limit of detection (DP26), such as the A1 fraction, and may be even more pronounced when the fraction analysed is highly polydisperse. Our results illustrate quite faithfully the various scenarios that may be encountered during the analysis of a tannin fraction by direct ESI-MS. They show that this approach can only be meaningful if the magnitude of the aDP value of the analysed fraction is known. This information may be available before analysis (for example, it is well known that aDP values for grape seed tannins are usually less than 10). Otherwise, the aDP value must be determined by chemical depolymerisation. For fractions of low polydispersity that are mainly composed of tannin polymers below DP26, the mass spectrum obtained has a real chance of reflecting the distribution of tannins. Otherwise, the mass spectrum must be used to assess the difference between the two aDP values (chemical depolymerisation and mass spectrum processing) and thereby infer the polydispersity of the sample.
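The effect of a detection ceiling on the calculated aDP can be illustrated with a toy model. The two distributions below are entirely hypothetical and are not data from this study. The model also assumes that every tannin up to DP26 is detected in proportion to its molar abundance and that nothing above DP26 is detected at all.

```python
def number_average_dp(abundance_by_dp):
    """Number-average DP of a distribution given as {DP: molar abundance}."""
    total = sum(abundance_by_dp.values())
    if total <= 0:
        raise ValueError("empty distribution")
    return sum(dp * a for dp, a in abundance_by_dp.items()) / total


def observed_adp(abundance_by_dp, max_detectable_dp=26):
    """aDP computed only from the species at or below the detection limit."""
    visible = {dp: a for dp, a in abundance_by_dp.items() if dp <= max_detectable_dp}
    return number_average_dp(visible)


# Hypothetical fractions: one narrow, one broad (polydisperse).
narrow = {4: 1, 5: 2, 6: 3, 7: 2, 8: 1}
broad = {5: 5, 10: 3, 40: 1, 80: 1}

for name, dist in (("narrow", narrow), ("broad", broad)):
    print(name, number_average_dp(dist), observed_adp(dist))
```

In this sketch the narrow fraction is unaffected by the ceiling, whereas the broad fraction loses most of its aDP although only a fifth of its molecules lie above DP26. This mirrors the contrast described above between fractions of low and high polydispersity.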

Conclusions

This study has helped to define more precisely the difficulties encountered in the MS analysis of tannins. The aDP values obtained by chemical depolymerisation were compared with those obtained by processing mass spectra. The underestimation of the aDP values calculated from mass spectra underlines the difficulty in detecting high DP tannins: the highest tannin DPs detected vary between DP11 and DP26 depending on the fraction composition. This underestimation is particularly significant for the Avrolles tannin fractions. This finding is consistent with the tannin composition of the Avrolles cultivar, which is known to contain broad polymer distributions (large polydispersities), a desired property of Avrolles for cider production.

Chemometric analysis of the whole MS library collected showed that several analytical parameters (solvent acidity, ionization mode, cone voltage) strongly affect the mass spectrometric response of tannins, without, however, extending the limit of detection beyond DP26.

ESI-MS is suitable for estimating the aDP of fractions that display a narrow polymer distribution and have standard aDP values below 20. Thus, direct analysis by mass spectrometry is appropriate for studying samples composed of condensed tannins with low molecular weights, which are frequently encountered in numerous species of the plant kingdom [32, 33].