Introduction

In grape, ripening corresponds to the second growth period of the non-climacteric berries and entails profound changes in fruit composition: grapes soften, probably because of modifications in specific cell wall components; become sweet through the accumulation of the hexoses glucose and fructose; lose acidity as malic acid declines; and change skin color because of the accumulation of many secondary metabolites together with chlorophyll degradation (Kanellis and Roubelakis-Angelakis 1993; Nunan et al. 1998; Coombe and McCarthy 2000; Boss and Davies 2001).

The attention paid to improving wine quality calls for a deeper understanding of the biochemical events that characterize maturation, in order to obtain tools for evaluating grape potential and the degree of berry ripening, and to pinpoint other aspects that are crucial for winemaking, such as the aromatic and phenolic potentials (Mattivi 2006).

Recently, thanks also to the wealth of genetic information that followed the sequencing of the Vitis genome (Jaillon et al. 2007), several proteomic analyses combining protein separation by two-dimensional gel electrophoresis (2-DE) with protein characterization by mass spectrometry have been performed on grape ripening (Sarry et al. 2004; Deytieux et al. 2007; Negri et al. 2008a, 2008b; Giribaldi et al. 2007; Zhang et al. 2008). These studies provided a first outline of grape proteomes, overcoming the difficulties linked to protein extraction from such a “recalcitrant” material (Carpentier et al. 2005) and opening new perspectives on the role of some proteins/processes during maturation.

Unfortunately, an integrated understanding of these biochemical events, and the concomitant search for biomarkers of grape ripening, is generally hampered by the large number of spots present on each gel and by the low reproducibility of 2-DE. Multivariate techniques have recently been applied to proteomic datasets from different fields to overcome such hurdles (Marengo et al. 2007 and 2008), since they take into account the complex relationships and correlations existing between variables (Marengo et al. 2004; Karp et al. 2005), and some of them also achieve dimensionality reduction and filter out experimental error.

In order to isolate a panel of proteins that could be tested as ripening markers, we carried out a time-course comparative 2-DE analysis of skins of cv. Pinot Noir grapes collected at three different moments from véraison, defined as the moment at which 50% of the berries start to change color, to full maturation. A first principal component analysis (PCA) was carried out on the spot volume dataset obtained after software-assisted gel alignment; then, a forward stepwise linear discriminant analysis (FS-LDA) was performed on the calculated principal components (PCs), allowing the discrimination of the classes of samples and the ranking of the variables according to their relevance. Thirty-six spots among the most significant in the models were characterized through liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS), and the biochemical role of the corresponding proteins is discussed. In addition, the same statistical approach was repeated restricting the analyses to the last two ripening stages in order to confirm its robustness in discriminating the last stages of berry ripening, which are known to be difficult to distinguish from a phenological point of view.

Materials and methods

Plant material and experimental design

Thirty clusters were harvested at three different moments of the 2005 season (28/07–11/08–25/08) from Vitis vinifera L. cv. Pinot Noir plants grown at the experimental station of the “Ente Regionale per i Servizi all’Agricoltura e alle Foreste” of Regione Lombardia (Montebello della Battaglia, PV, Italy). The sampling stages corresponded to 35 (termed “RS35”, véraison berries), 37 (“RS37”, not quite ripe berries), and 38 (“RS38”, harvest-ripe berries) of the modified E–L system (Coombe 1995). The assignment of the developmental stages of this material through the modified E–L phenological system was corroborated by some physiological parameters (total solids, pH, titratable acidity and anthocyanins) determined according to Negri et al. (2008a). These data are reported in the supporting information (Online Resources 1).

The samples consisted of skins that were detached immediately by squeezing the berries to remove the seeds and the bulk of the mesocarp, then pressing and smearing the inner part of the skin on two layers of cheesecloth to completely remove the residual pulp. Afterwards, the samples were frozen in liquid nitrogen and stored at −80°C until use.

In order to obtain three groups of 100 skins for each sampling stage, ten berries/cluster were collected. The pool of skins from each group was intended as a biological replication and underwent an independent protein extraction. For each extracted sample, two 2-DE gels were obtained. Consequently, six gels were analyzed for each time-point of the analysis.

Protein extraction and 2-DE

Protein extraction and 2-DE analysis were carried out as described in the work by Negri et al. (2008a). Briefly, the acetone powders, obtained starting from the frozen skin samples, were resuspended in an aqueous extraction buffer and then centrifuged to clarify the extract. The final protein suspension was obtained through the phenol-phase partition method. Proteins were resuspended in IEF buffer [7 M urea, 2 M thiourea, 3% (w/v) CHAPS, 1% (v/v) NP-40, 50 mg mL−1 DTT, and 2% (v/v) IPG Buffer pH 3–10 (GE Healthcare)] and their concentration was determined by 2-D Quant Kit (GE Healthcare).

For the IEF, 200 μg of the protein sample were loaded on pH 3–10, 24 cm IPG strips (GE Healthcare) after passive overnight rehydration. IEF was carried out at 20°C with a current limit of 50 μA/strip for about 90 kVh in an Ettan IPGphor (GE Healthcare) using the following settings: 5 min gradient 200 V, 1 h at 200 V, 5 min gradient 500 V, 1 h at 500 V, 5 min gradient 1000 V, 6 h at 1000 V, 3 h gradient 8,000 V, and 9 h at 8,000 V. Prior to the second dimension, strips were equilibrated in order to reduce and alkylate proteins. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE; Laemmli 1970) was performed using 12.5% acrylamide gels placed in the ETTAN DALT six apparatus (GE Healthcare). Gels were run at 5 W/gel for 30 min, then at 15 W/gel until the bromophenol blue front ran off the gels.
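The stated total of about 90 kVh can be checked against the listed focusing program. The short sketch below sums the volt-hours of each step; it assumes that every gradient step ramps linearly and that the first ramp starts from 0 V, neither of which is stated in the text.

```python
# Each step: (duration in hours, start voltage, end voltage).
# Assumptions (not stated in the text): gradient steps ramp linearly,
# and the first ramp starts from 0 V.
IEF_STEPS = [
    (5 / 60, 0, 200),      # 5 min gradient to 200 V
    (1, 200, 200),         # 1 h at 200 V
    (5 / 60, 200, 500),    # 5 min gradient to 500 V
    (1, 500, 500),         # 1 h at 500 V
    (5 / 60, 500, 1000),   # 5 min gradient to 1000 V
    (6, 1000, 1000),       # 6 h at 1000 V
    (3, 1000, 8000),       # 3 h gradient to 8,000 V
    (9, 8000, 8000),       # 9 h at 8,000 V
]


def total_volt_hours(steps):
    """Sum volt-hours using the mean voltage of each (linear) step."""
    return sum(hours * (v_start + v_end) / 2 for hours, v_start, v_end in steps)


total_vh = total_volt_hours(IEF_STEPS)
print(f"{total_vh / 1000:.1f} kVh")
```

Under these assumptions the program adds up to roughly 92 kVh, in line with the "about 90 kVh" reported.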

Protein visualization and image analysis

The colloidal Coomassie Brilliant Blue G-250 (cCBB) procedure, as described by Neuhoff et al. (1988), was employed to stain the proteins. The gels were then scanned in an Epson Expression 1680 Pro Scanner and gel images were saved as gray-scale TIFF files with a resolution of 360 dpi. Gel alignment was performed with the ImageMaster 2-D Platinum Software v. 6.0 (GE Healthcare), matching the gels using the landmark-assisted procedure followed by a manual check. The molecular weight of the spots was estimated on the basis of the migration of SigmaMarkers™ wide range (MW 6,500–205,000), while pI was deduced according to the strip manufacturer’s instructions (GE Healthcare).

LC-ESI-MS/MS analysis and protein characterization

Spots excised from cCBB-stained 2-DE gels were digested as previously described by Prinsi et al. (2009) and the resulting tryptic peptides were resuspended in 0.1% (v/v) formic acid (FA).

The LC-ESI-MS/MS experiments were conducted with a Surveyor (MS pump Plus) HPLC system connected to the ESI source of a Finnigan LCQ DECA XP MAX spectrometer (ThermoFisher Scientific Inc., Waltham, USA). Chromatographic separations were carried out on an INERSTIL WP300 C18 column (200 μm I.D. × 150 mm length, 5 μm particle size) with a flow of 2 μl/min, using a nonlinear gradient of solvent B [5% for 5 min, up to 16% in 10 min, up to 40% in 60 min, up to 80% in 10 min then fixed for 10 min, followed by a column equilibration step in 5% of solvent B, where solvent B: 0.1% (v/v) FA in acetonitrile and solvent A: 0.1% (v/v) FA]. ESI was performed in positive ionization mode with spray voltage and capillary temperature set at 2.5 kV and 220°C, respectively. Data were collected in the full-scan and data-dependent MS/MS mode with a collision energy of 35% and a dynamic exclusion window of 3 min.

Spectra were searched using TurboSEQUEST® incorporated in BioworksBrowser 3.2 software (ThermoFisher Scientific Inc., Waltham, USA) against the Vitis protein subset extracted from the National Center for Biotechnology Information (NCBI). These analyses were performed assuming parent ion and fragment ion mass tolerances of ±2 and ±1 Da, respectively, two possible missed cleavages per peptide, fixed carboxyamidomethylation of cysteine and variable methionine oxidation. Positive hits were filtered on the basis of peptide scores [Xcorr ≥ 1.5 (+1 charge state), ≥2.0 (+2 charge state), ≥2.5 (≥3 charge state), ΔCn ≥ 0.1, peptide probability <1 × 10−3, and Sf ≥ 0.70]. If needed, identified peptides were used in a protein similarity search performed by alignment analyses against the NCBI-nr database using the FASTS algorithm (Mackey et al. 2002).
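The acceptance thresholds listed above can be written as a simple filter. The following is only an illustrative sketch: the `PeptideHit` record and its field names are hypothetical and do not correspond to the output format of the search software, and the example hits are invented values.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    """Hypothetical record holding the scores of one peptide match."""
    charge: int
    xcorr: float
    delta_cn: float
    probability: float
    sf: float


def min_xcorr(charge: int) -> float:
    """Charge-dependent Xcorr threshold: 1.5 (+1), 2.0 (+2), 2.5 (+3 or higher)."""
    if charge <= 1:
        return 1.5
    if charge == 2:
        return 2.0
    return 2.5


def passes_filters(hit: PeptideHit) -> bool:
    """Apply all the score thresholds quoted in the text to one hit."""
    return (
        hit.xcorr >= min_xcorr(hit.charge)
        and hit.delta_cn >= 0.1
        and hit.probability < 1e-3
        and hit.sf >= 0.70
    )


# Invented example hits, for illustration only.
hits = [
    PeptideHit(charge=2, xcorr=2.4, delta_cn=0.15, probability=1e-4, sf=0.85),
    PeptideHit(charge=2, xcorr=1.8, delta_cn=0.20, probability=1e-5, sf=0.90),
    PeptideHit(charge=3, xcorr=2.6, delta_cn=0.05, probability=1e-4, sf=0.80),
]
accepted = [h for h in hits if passes_filters(h)]
```

Here only the first hit is retained: the second fails the Xcorr threshold for doubly charged ions and the third fails the ΔCn threshold.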

Theoretical molecular weight and pI of characterized proteins were calculated by processing sequence entries at http://www.expasy.org/tools/pi_tool.html.

In order to assign putative functions to proteins derived from the two grapevine sequencing projects (Jaillon et al. 2007; Velasco et al. 2007) that are defined as “hypothetical protein” or “unnamed protein product”, we used the Blast2GO software (Conesa et al. 2005), aligning the sequences against the NCBI non-redundant (nr) database.

Software

Principal component analysis and linear discriminant analysis were carried out with Statistica v. 7.1 (StatSoft Inc., Tulsa, OK, USA) and Matlab R2007b (The Mathworks, Natick, MA, USA). All graphical representations were produced with Statistica, Matlab and Excel 2003 (Microsoft Corporation, Redmond, WA, USA).

Theory

PCA and LDA

PCA (Massart et al. 1988; Vandeginste et al. 1988) is a multivariate pattern recognition method that represents the objects, described by the original variables, in a new reference system given by new variables called PCs, which are linear combinations of the original variables. PCs are mutually orthogonal and are calculated hierarchically in decreasing order of explained variance, so that experimental noise and random variations are collected in the last PCs. PCA therefore offers a powerful tool for dimensionality reduction. PCA provides two tools for data analysis: the scores (the co-ordinates of the objects on the PCs), which allow the identification of groups of samples showing similar or opposite behavior, and the loadings (the weights of the original variables on each PC), which give insight into the reasons for the differences observed among the objects.
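As a minimal illustration of scores, loadings and explained variance, PCA on autoscaled data can be sketched with a singular value decomposition. This is not the Statistica/Matlab procedure used in this work; the function name and the random 18 × 50 matrix are placeholders chosen for the example.

```python
import numpy as np


def pca_autoscaled(X):
    """PCA via SVD on autoscaled (mean-centred, unit-variance) data."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0                    # guard against constant columns
    Z = Xc / std
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * s                         # co-ordinates of the objects on the PCs
    loadings = Vt.T                        # weights of the original variables on each PC
    explained = s**2 / np.sum(s**2)        # fraction of variance per PC, decreasing
    return scores, loadings, explained


# Synthetic stand-in for a samples x spots matrix (random numbers, not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(18, 50))
scores, loadings, explained = pca_autoscaled(X)
```

The columns of `loadings` are orthonormal and `explained` is sorted in decreasing order, which is what allows the last PCs to be discarded as mostly noise.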

LDA (Massart et al. 1988; Vandeginste et al. 1988; Eisenbeis 1972; Klecka 1980) is a Bayesian classification method that classifies the objects taking into account the multivariate structure of the data. In Bayesian methods each class is usually described by a Gaussian multivariate probability distribution and each object is assigned to the class g for which the so-called discriminant score \( D_g \) is minimum:

$$ {D_g}\left( {{x_i}} \right) = {\left( {{x_i} - {{\bar{x}}_g}} \right)^T}S_g^{{ - 1}}\left( {{x_i} - {{\bar{x}}_g}} \right) + \ln \left| {{S_g}} \right| - 2\ln {P_g} $$
(1)

where \( S_g \) is the covariance matrix of class g; \( {\bar{x}_g} \) is the centroid of class g; \( x_i \) is the vector representing the i-th object and \( P_g \) is the prior probability of class g.

In LDA, \( S_g \) is approximated by the pooled (between the classes) covariance matrix; this corresponds to considering all the classes as having a common shape (i.e., a weighted average of the shapes of the classes present).
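The discriminant score of Eq. 1 with a pooled covariance matrix can be sketched as follows. This is a hedged illustration, not the implementation used in this work: the function names are hypothetical, the priors are assumed to equal the class frequencies, and the data are synthetic.

```python
import numpy as np


def fit_lda(X, y):
    """Estimate class centroids, pooled covariance matrix and priors."""
    classes = np.unique(y)
    n, p = X.shape
    centroids = {g: X[y == g].mean(axis=0) for g in classes}
    pooled = np.zeros((p, p))
    for g in classes:
        d = X[y == g] - centroids[g]
        pooled += d.T @ d                  # within-class scatter
    pooled /= n - len(classes)             # pooled covariance
    priors = {g: np.mean(y == g) for g in classes}  # assumption: class frequencies
    return classes, centroids, pooled, priors


def discriminant_scores(x, classes, centroids, pooled, priors):
    """Eq. 1 for every class, with S_g replaced by the pooled matrix."""
    inv = np.linalg.inv(pooled)
    _, logdet = np.linalg.slogdet(pooled)
    scores = {}
    for g in classes:
        d = x - centroids[g]
        scores[g] = d @ inv @ d + logdet - 2 * np.log(priors[g])
    return scores


def classify(x, model):
    """Assign x to the class with the minimum discriminant score."""
    scores = discriminant_scores(x, *model)
    return min(scores, key=scores.get)


# Synthetic, well-separated classes (six objects each), for illustration only.
rng = np.random.default_rng(1)
centers = [[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]]
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(6, 2)) for c in centers])
y = np.repeat([0, 1, 2], 6)
model = fit_lda(X, y)
```

Because the covariance matrix is shared, the \( \ln|S| \) term is the same for every class and does not affect which class attains the minimum.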

The variables contained in the LDA model discriminating the classes can be chosen by a stepwise algorithm, selecting iteratively the most discriminating variables. Here, a forward selection (FS) procedure is applied: the method starts with a model where no variables are included and gradually adds a variable at a time until a determined criterion of arrest of the procedure is satisfied. The variable being included in the model in each step is the one providing the greatest value of an F-Fisher ratio, so that the j-th variable is included in the model, with p variables already included, if:

$$ F_j^{ + } = {\max_j}\left[ {\frac{{{\text{RS}}{{\text{S}}_p} - {\text{RS}}{{\text{S}}_{{p + j}}}}}{{s_{{p + j}}^2}}} \right] > {F_{{{\rm{to}} - {\rm{enter}}}}} $$
(2)

where \( s_{{p + j}}^2 \) is the variance calculated for the model with p variables plus the j-th variable; \( \text{RSS}_p \) is the residual sum of squares of the model with p variables; \( \text{RSS}_{p+j} \) is the residual sum of squares of the model with p variables plus the j-th variable.

The F value thus calculated is compared to a reference value (\( F_{\text{to-enter}} \)), usually set at values ranging from 1 (more permissive selection, including a larger number of variables in the final model) to 4 (more severe selection).
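The forward selection criterion of Eq. 2 can be illustrated with a short sketch. Several choices below are assumptions, since the text does not specify them: the residual sum of squares is taken from an ordinary least-squares fit (with intercept) of a generic numeric response, and the variance \( s_{p+j}^2 \) is taken as \( \text{RSS}_{p+j} \) divided by its residual degrees of freedom. It is therefore a simplified stand-in for the discriminant-analysis case, not the procedure implemented in the software used here.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)


def forward_select(X, y, f_to_enter=4.0):
    """Add one variable at a time while the best F exceeds F-to-enter."""
    n, m = X.shape
    selected = []
    rss_p = rss(X, y, selected)
    while len(selected) < min(m, n - 2) and rss_p > 0:
        best_f, best_j, best_rss = -np.inf, None, None
        for j in range(m):
            if j in selected:
                continue
            rss_pj = rss(X, y, selected + [j])
            # Assumed variance estimate: RSS over residual degrees of freedom
            # (n minus intercept minus the p + 1 variables in the model).
            s2 = rss_pj / (n - len(selected) - 2)
            f = (rss_p - rss_pj) / s2 if s2 > 0 else np.inf
            if f > best_f:
                best_f, best_j, best_rss = f, j, rss_pj
        if best_f <= f_to_enter:
            break                          # stopping criterion reached
        selected.append(best_j)
        rss_p = best_rss
    return selected


# Synthetic example: only variables 2 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = 3 * X[:, 2] - 2 * X[:, 4] + 0.1 * rng.normal(size=40)
selected = forward_select(X, y, f_to_enter=4.0)
```

With a lower threshold (for example `f_to_enter=1.0`) the procedure is more permissive and tends to admit more variables, as described above.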

Here, FS-LDA was applied to principal components rather than to the original variables. The loadings of the PCs allow the calculation of the final weight of each original variable on the LDA model built on PCs. The classification performance of the LDA models can be evaluated by the calculation of three parameters:

  • Non-error rate (NER%): represents the percentage of overall correct assignments:

$$ {\hbox{NER}}\% = \frac{{\sum\limits_{{g = 1}}^G {{c_{{gg}}}} }}{n} \times 100 $$
(3)

where \( c_{gg} \) is the number of objects from class g correctly classified in class g; n is the overall number of objects; G is the overall number of classes.

  • Selectivity \( S_g \): represents the percentage of non-overlap between the classes:

$$ {S_g} = \left( {1 - \frac{{\sum\limits_{{i = 1,\,i \ne g}}^G {{c_{{ig}}}} }}{{n - {n_g}}}} \right) \times 100 $$
(4)

where \( c_{ig} \) (with \( i \ne g \)) is the number of objects from class i wrongly assigned to class g; n is the overall number of objects; \( n_g \) is the number of objects in class g; G is the overall number of classes.

  • Specificity \( \text{Sp}_g \): represents the NER% of each class:

$$ {\hbox{S}}{{\hbox{p}}_g} = \frac{{{c_{{gg}}}}}{{{n_g}}} \times 100 $$
(5)

where \( c_{gg} \) is the number of correct classifications of class g; \( n_g \) is the number of objects of class g.

These parameters can be calculated both in calibration and in cross-validation to evaluate the predictive ability of the calculated LDA model. Due to the small number of samples in each class, a leave-one-out procedure is applied here as the validation method.
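The three parameters defined above can be computed directly from a confusion matrix. The sketch below follows the definitions as worded in the text (selectivity counts the objects of other classes wrongly assigned to class g; specificity is the per-class rate of correct assignments); the function name and the confusion matrix are hypothetical.

```python
import numpy as np


def classification_metrics(conf):
    """conf[i, j] = number of objects of true class i assigned to class j."""
    conf = np.asarray(conf, dtype=float)
    n = conf.sum()                         # overall number of objects
    n_g = conf.sum(axis=1)                 # objects belonging to each class
    correct = np.diag(conf)                # c_gg
    ner = correct.sum() / n * 100
    wrongly_assigned = conf.sum(axis=0) - correct   # from other classes into g
    selectivity = (1 - wrongly_assigned / (n - n_g)) * 100
    specificity = correct / n_g * 100
    return ner, selectivity, specificity


# Hypothetical confusion matrix: three classes of six objects each.
conf = [[4, 2, 0],
        [2, 4, 0],
        [0, 0, 6]]
ner, selectivity, specificity = classification_metrics(conf)
```

For this matrix, 14 of 18 objects are correctly assigned, so NER% is about 77.8; the first two classes have a selectivity of about 83.3% and a specificity of about 66.7%, while the third class reaches 100% for both.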

Results and discussion

2-DE and image analysis

The samples under investigation are skins of the cv. Pinot Noir at three different ripening stages. The stages were coded according to the modified E–L system (Coombe 1995) as RS35 (véraison berries), RS37 (not quite ripe berries) and RS38 (harvest-ripe berries). This phenological classification is also consistent with the results of the chemical analysis conducted on the berries (total solids, pH, titratable acidity, and anthocyanins) shown in the supporting information (Online Resources 1). Proteins were extracted by washing the homogenized samples with acetone and using phenol, as described by Negri et al. (2008a). In the derived gels, an average of about 750 spots gel−1 was found. This value did not vary significantly among the three stages, as can be seen in the representative gels shown in the supporting information (Online Resources 2). In order to evaluate the quantitative changes occurring among the three stages, the relative volume (%Vol) of the 652 matched spots was considered. A first visual inspection of the maps showed that none of them was affected by poor protein separation.

Statistical analyses and biomarkers isolation

Principal component analysis

Data were arranged in an 18 × 652 matrix: 18 being the number of samples (six replicate gels for each ripening stage) and 652 the number of spots matched among gels. PCA was carried out after autoscaling. The explained variance is spread over several PCs, indicating a weakly correlated data structure (PC1 = 14.7%, PC2 = 11.8%, PC3 = 8.1%, PC4 = 7.7%, PC5 = 7.1%). The low correlation could be explained by a significant sample heterogeneity, which can also be noticed in the score plot of the first two PCs (Fig. 1), and it confirms the absence of deep changes in protein patterns among the stages that emerged from gel analysis. The score plot shows that the replications within each class are characterized by a rather large dispersion. Such heterogeneity could be partially explained by the large variability that is expected in the data, since the replications are not only instrumental but also biological, accounting for samples that underwent the environmental pressures typical of field conditions. PCA partly solves this problem through a first dimensionality reduction: the last PCs, mainly accounting for noise and experimental error, are not considered further in this study. Moreover, biological variability seems larger than instrumental variability, since the two instrumental replications are usually close to each other, while biological replications are more dispersed. The largest variability concerns the two pn3.2 replications, which appear quite distant from each other and also from the other RS38 samples. Sample pn2.2a is instead quite different from its corresponding instrumental replication pn2.2b.

Fig. 1 PCA on the overall dataset: score plot of the first two PCs (objects from different classes, RS35, RS37, RS38, are represented by different markers). The samples under investigation are coded by a six-character label: the first two are letters (pn) corresponding to the genotype (Pinot Noir); the third is a number related to the ripening stage (1 = RS35, 2 = RS37, 3 = RS38), followed by a point; the fifth is a number corresponding to the biological replication (1, 2, or 3), while the last is a letter corresponding to the instrumental replication (a or b)

As regards the separation between the classes, the score plot shows that the samples are quite well separated, in spite of the large variability of the results; this is particularly true for the third ripening stage, RS38, located at negative values on PC2. The other two groups, RS35 and RS37, appear instead to overlap considerably, showing that the main differences in 2-DE patterns concerned the last ripening period. This result was quite surprising because some oligo/microarray analyses (Waters et al. 2005; Terrier et al. 2005) indicated that most changes in mRNA profiles characterized the first ripening phases in cv. Shiraz berries immediately after véraison, while no significant variations were observed towards maturity.

Even considering that genotype peculiarities and vintage effects may weigh heavily on this phenomenon, it is possible that mRNA and protein levels give rise to different outcomes. In support of this hypothesis, according to the hierarchical cluster analysis in a 2-DE work on Barbera skins (Negri et al. 2008a), the greatest variations in the proteome profile did not take place immediately after the onset of ripening but later, between 14 and 28 days after véraison. This evidence is quite interesting, since the common phenotypical and chemical parameters used to define the degree of ripening are not sufficient to distinguish the last phases of berry development and to indicate the right moment for harvesting. By contrast, it seems that 2-DE analysis assisted by multivariate statistics could make it possible to clearly separate these similar samples on the basis of the altered levels of a panel of proteins.

Linear discriminant analysis on the overall dataset

The identification of possible markers was then achieved by linear discriminant analysis applied to the first ten PCs: using the PCs instead of the original variables allows dimensionality reduction and the elimination of noise by excluding the last PCs. This is particularly important in this case, where a large variability within each class was identified. Furthermore, a variable selection procedure was applied, exploiting a forward stepwise algorithm (\( F_{\text{to-enter}} \) = 4) to select only the PCs effective for classification purposes. The final model contains PC1, PC2, PC5, PC6, and PC8 (Table 1) at a significance level of 95% (p level < 0.05); Table 1 reports the coefficients of each PC included in the model for each model built (one for each ripening stage) together with the value of the intercept. LDA provided very good results in calibration, since all samples are correctly classified (NER and specificity of 100% for all classes) and no overlap is detected (selectivity of 100% for all classes). The performance of the model in prediction was evaluated in cross-validation by a leave-one-out procedure: the first two ripening stages show a partial overlap, with two RS35 samples (pn1.1a and pn1.2b) classified as RS37 and two RS37 samples (pn2.2a and pn2.2b) classified as RS35 (selectivity = 83.33% and specificity = 66.67% for both classes, i.e., four of six samples correctly classified). The model for RS38 shows instead both selectivity and specificity of 100%. The overall NER in cross-validation is 77.78%.
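A leave-one-out evaluation of a pooled-covariance discriminant model can be sketched as below. This is an illustrative assumption-laden example rather than the procedure run in Statistica or Matlab: the function names are hypothetical, the priors are taken as class frequencies, the data are synthetic, and the features are treated as fixed (the text does not state whether the PCs were recomputed at each left-out sample).

```python
import numpy as np


def lda_predict(X_train, y_train, x):
    """Classify x by minimum Mahalanobis-type score with pooled covariance."""
    classes = np.unique(y_train)
    n, p = X_train.shape
    pooled = np.zeros((p, p))
    centroids = []
    for g in classes:
        Xg = X_train[y_train == g]
        c = Xg.mean(axis=0)
        centroids.append(c)
        pooled += (Xg - c).T @ (Xg - c)
    pooled /= n - len(classes)
    inv = np.linalg.inv(pooled)
    priors = [np.mean(y_train == g) for g in classes]
    # The ln|S| term is common to all classes and is therefore omitted.
    d = [(x - c) @ inv @ (x - c) - 2 * np.log(pr)
         for c, pr in zip(centroids, priors)]
    return classes[int(np.argmin(d))]


def leave_one_out(X, y):
    """Predict each object with a model built on all the other objects."""
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        preds[i] = lda_predict(X[keep], y[keep], X[i])
    return preds


# Synthetic, well-separated classes (six objects each), for illustration only.
rng = np.random.default_rng(2)
centers = [[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]]
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(6, 2)) for c in centers])
y = np.repeat([0, 1, 2], 6)
preds = leave_one_out(X, y)
ner_cv = np.mean(preds == y) * 100
```

With real, overlapping classes the cross-validated NER% is expected to be lower than the calibration value, which is the behavior reported above for RS35 and RS37.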

Table 1 Coefficients of the significant PCs included in the LDA model for all the ripening stages with the corresponding p-level and F-Fisher values

Since PCs and original variables (spot volumes) are related by a linear combination, it is possible to express each discrimination model in terms of the original variables, i.e., the spots present on the maps. The results are represented in Fig. 2; the representation is divided into six panels to allow the clear identification of the original 652 variables, indicated at the bottom of each panel by their identification number. The variables are represented on a color scale: black corresponds to spots showing a large negative effect on the model of the corresponding ripening time (the corresponding spot is less abundant), while white corresponds to spots showing a large positive effect (the corresponding spot is present in a higher amount). In this case the spots are sorted according to their rank in the RS38 model for convenience: spots from left to right in each panel are sorted from those with the largest negative weight on the RS38 model to those with the largest positive one on the same model. Figure 2 shows that the models of the first two ripening stages are the most similar and show similar contributions of the same spots: spots more abundant in one model are up-accumulated also in the other; the differences lie in the magnitude of the up- or down-accumulation. The third ripening stage instead shows a completely different behavior, with most of the variables increasing their levels as ripening goes on.
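The back-calculation from PC coefficients to weights on the original variables amounts to a matrix product with the loadings of the selected PCs. The sketch below uses a random autoscaled matrix and invented coefficients; only the set of selected PCs (PC1, PC2, PC5, PC6, PC8) is taken from the text, and the resulting weights refer to the autoscaled variables.

```python
import numpy as np

# Synthetic autoscaled data standing in for the samples x spots matrix.
rng = np.random.default_rng(0)
Z = rng.normal(size=(18, 40))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)

# Loadings of all PCs (columns) obtained by SVD.
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
loadings = Vt.T

# PCs retained in the model: PC1, PC2, PC5, PC6, PC8 (zero-based indices).
selected = [0, 1, 4, 5, 7]

# Hypothetical LDA coefficients of one class model on the selected PCs.
b = np.array([0.8, -1.2, 0.5, 0.3, -0.7])

# Weight of each original (autoscaled) variable on the class model.
weights = loadings[:, selected] @ b

# Ranking of the variables from the most negative to the most positive weight.
order = np.argsort(weights)
```

Since the scores are `Z @ loadings`, applying `b` to the selected scores gives the same values as applying `weights` directly to the autoscaled variables, which is why the variables can be ranked by their weights.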

Fig. 2 Effect of the original variables on the LDA models calculated for the three ripening stages: the variables are represented on a color scale according to the increasing negative (from light gray to black) or positive (from light gray to white) coefficient. The representation is separated in six panels; original variables are represented at the bottom of each panel sorted according to their significance on the RS38 model: spots are sorted from left to right from those with the largest negative weight on the RS38 model to those with the largest positive weight. The spots indicated in bold characters were identified by mass spectrometry

Biological significance of some identified proteins

Some of the variables showing the largest positive and negative coefficients in the LDA models were characterized by LC-ESI-MS/MS as potentially interesting markers and are discussed in the following sections with regard to their biological role during ripening.

For each of the three models, the 30 most significant variables, characterized by the largest positive and negative coefficients, were considered. Since many of the selected spots overlapped among the three models and some spots could not be picked because of their very low abundance, the selection provided a set of 39 spots.

The variables were ranked according to their decreasing positive or negative coefficient in the LDA models and were selected for MS characterization in this order. We identified 36 of the 39 analyzed spots. The recent availability of the Vitis genome sequence surely contributed to this high number of characterizations. It is interesting to note that the two distinct genome sequencing projects (Jaillon et al. 2007; Velasco et al. 2007) worked on two different lines of the Pinot Noir cultivar, which was also employed in this work. This circumstance also avoided the difficulties linked to the broad intervarietal heterogeneity that is typical of V. vinifera. The impact of these new genetic resources is evident when considering that 24 of the 36 identified proteins were characterized as sequences derived from the two projects and that in this work, unlike in past studies on grape (Negri et al. 2008b), it was not necessary to resort to EST databases.

The characterized proteins are listed in Table 2 and are shown in supporting information (Online Resources 3 and 4) that reports the identified spots on one of the six maps from RS38. Further details of mass spectrometry analyses are reported in the supporting information (Table 1S).

Table 2 List of the spots identified by LC-ESI-MS/MS and bioinformatic analysis. Statistical information about mass spectrometry analysis is reported in Online Resources 4

The results discussed below, obtained through this integration of 2-DE and multivariate statistical techniques, mainly depict a typical proteomic pattern, similar in many traits to that described in previous works on grape berry, and open new perspectives on other proteins that have not yet been found to be involved in ripening.

Pathogenesis related proteins

As observed in previous works, among the proteins that change their levels of expression during ripening, there are many pathogenesis-related proteins (PRs). Being highly abundant at ripening and generally weakly expressed or absent in unripe berries, they are variables with high discriminating power for mature samples in our models (Fig. 2). Among these, we found spots identified as chitinases of classes III and IV, β-1,3-glucanases and pathogenesis-related proteins 10 (PRP-10). While chitinases and β-1,3-glucanases are known to have fungal targets and probably act by hydrolyzing the cell walls of fungal hyphae (Tattersall et al. 1997; Pocock et al. 2000), PRP-10 display antimicrobial and ribonuclease activities (Liu and Ekramoddoullah 2006).

Chitinases belong to a large family of enzymes with hydrolytic activity against chitin, which is a major component of the cell walls of most pathogenic fungi and of the exoskeleton of insect pests. We identified three spots (spots 2611, 2648, 2769) as class IV and two (spots 2669 and 2723) as class III chitinases. In agreement with other recent proteomic papers (Deytieux et al. 2007; Negri et al. 2008a; Giribaldi et al. 2007), we observed that during ripening there is a surge in the abundance of enzymes belonging to class IV chitinases. An increase in the levels of class III chitinases during ripening has not yet been reported in the literature, and it was argued that such proteins are present in low amounts in grape and that their levels may change only in the event of pathogen attacks (Robinson et al. 1997). Conversely, in our work we observed that these enzymes are highly expressed (up to 1% of the %Vol) and that their amount increased after véraison. One of these, spot 2669, was only detectable in RS38, while the other one (spot 2723) showed an upward trend during maturation.

Along with the increase in the amount of chitinases, we observed two spots identified as β-1,3-glucanases (spots 2528 and 2538) that are particularly significant in RS38; our data therefore agree with recent papers showing that β-1,3-glucanase levels rose sharply after véraison (Deytieux et al. 2007; Negri et al. 2008a; Giribaldi et al. 2007).

The fact that healthy fruit accumulate many PR proteins during their development suggested that such enzymes could be synthesized and stored in an inactive form (Robinson et al. 1997; Robinson and Davies 2000). Alternatively, it is also possible that they may have some role during berry ripening. For example, it was observed that some chitinases do not show any antifungal property in vitro (Kasprzewska 2003) while there is strong evidence that chitinases catalyze the hydrolytic decomposition of arabinogalactan proteins (AGPs) that are present in plant cell walls (van Hengel et al. 2001, 2002).

Abiotic stresses response proteins

The hypothesis of a possible involvement of oxidative stresses during grape ripening has recently been reinforced by the work of Pilati et al. (2007), who observed a burst of hydrogen peroxide around véraison that might modulate the coordinated expression of at least 32 transcripts involved in ROS scavenging. Among them, a peak was reported in the levels of ascorbate peroxidase (AsPOX), which is considered mainly responsible for H2O2 removal in plant cytosol, mitochondria, and chloroplasts (Dalton et al. 1991; Chew et al. 2003). In agreement with these results, we identified a spot corresponding to AsPOX (spot 2677) showing opposite coefficients for the RS35 and RS38 models, reflecting the decreasing trend that we observed through the progression of ripening (Fig. 2).

An analogous behavior can be described for two other proteins known to be involved in oxidative processes, i.e., polyphenol oxidase (PPO, spots 2936 and 2940) and lipoxygenase (LOX, spot 1756). PPOs, generally highly expressed in young developing tissues, constitute a large class of enzymes that catalyze the formation of o-quinones through the O2-dependent oxidation of monophenols and o-diphenols (Mayer and Harel 1991). It was proposed that, in addition to their involvement in browning reactions as a consequence of pathogen infection, wounding and organ senescence, they may participate in other activities such as proanthocyanidin (Dixon et al. 2004) and aurone (Nakayama et al. 2000) biosynthesis. PPOs are synthesized in grape as a 67 kDa precursor that is then processed to remove a 10.6 kDa chloroplast transit peptide from the N-terminus and a 16.2 kDa peptide of unknown function from the C-terminus; the resulting 40.5 kDa protein is considered to be the chloroplast catalytic unit (Dry and Robinson 1994). The two PPO spots may correspond to this C-terminal peptide, having a deduced mass of about 18 kDa. In addition to the similarity of the molecular weight, this idea is supported by the evidence that the detected tryptic peptides fall within the part of the sequence between the hypothesized cleavage site and the C-terminus. The fact that this fragment is detectable in the gels, and is thus not discarded after cleavage, suggests that it may retain a function during ripening. It was observed, for instance, that its tertiary structure is similar to that of hemocyanin, an oxygen-binding protein isolated from the blood of mollusks, the main function of which resides in O2 storage and transport (Marusek et al. 2006). A drop in the levels of spots that correspond to this PPO peptide was also reported after véraison in the proteomic analysis of Negri et al. (2008a) performed on cv. Barbera skins. In the light of this evidence, it would be interesting to further investigate the possibility of employing such proteins as markers for grape ripening.

LOX are non-haem iron-containing dioxygenases that are widely distributed in the plant kingdom and have diverse roles (Porta and Rocha-Sosa 2002). LOX act through the peroxidation of polyunsaturated fatty acids and the formation of a class of end-products known as oxylipins, which have specific functions in signaling and plant defense responses (Nooden and Leopold 1988; Rogiers et al. 1998; Zhang et al. 2006). It was observed that in senescing tissues and during the ripening of some fruits there is a clear relationship between lipid peroxidation and increasing oxidative levels (Rogiers et al. 1998; Brennan and Frenkel 1977; Brennan et al. 1979). This suggests that, although LOX are involved in different activities, the expression of some of them during grape ripening could be linked to the oxidative burst occurring in berries around véraison.

General metabolism

As reported in many papers, grape ripening affects the expression of many enzymes involved in pathways of primary metabolism, such as glycolysis/gluconeogenesis, the pentose-phosphate pathway and the Krebs cycle (Giribaldi et al. 2007; Negri et al. 2008a; Robinson et al. 1997; Pilati et al. 2007; Ruffner and Hawker 1977; Famiani et al. 2000; da Silva et al. 2005; Terrier et al. 2005). In the past, it was proposed that glycolysis was inhibited after véraison (Famiani et al. 2000). Later, microarray analyses indicated that some glycolytic enzymes were upregulated during ripening (da Silva et al. 2005; Terrier et al. 2005). In the present work, the characterized spots included some enzymes related to glycolysis and the Krebs cycle, indicating that primary metabolism undergoes some modulation during ripening in Pinot skins as well. As far as glycolysis is concerned, we found three variables with highly significant coefficients in the models corresponding to enolase (spot 2066, Eno1; spot 2071, Eno2; spot 2158, Eno3), and one corresponding to a glyceraldehyde-3-phosphate dehydrogenase (G3PDH, spot 2314). The levels of Eno1 and Eno2 sharply decreased during ripening, while Eno3, which is more abundant than the other two, surged in ripe berries. G3PDH levels peaked at the second ripening stage, which explains the high significance of this variable for the RS37 model. In addition, we found that the discriminant coefficients of two proteins of the Krebs cycle, a cytoplasmic aconitase (Aco, spot 1736) and a fumarate hydratase (FumHyd, spot 2182), were among the most negative for the model of RS38, indicating their clear down-regulation in the last phases of ripening. A detailed study of primary metabolism during this process goes beyond the purpose of this work, but taken together these data could lead to some interesting considerations.
Although we characterized just a few discriminant proteins and we have no information about enzyme activities, the spot volume data for Pinot Noir do not suggest the evident induction of some enzymes of glycolysis, the pentose-phosphate pathway and the Krebs cycle that emerged from the other 2-DE works on grape skin (Deytieux et al. 2007; Negri et al. 2008a). This different behavior could depend on the many differences existing among these cultivars. It was hypothesized, for example, that a possible activation of glycolysis and the pentose-phosphate pathway during the ripening of Barbera skins could be linked to the high demand for carbon skeletons and reducing power required by the biosynthetic activities of secondary metabolism that operate during this period. Hence, according to this hypothesis, the fact that grape cultivars differ in their capacity to accumulate anthocyanins might imply that the expression of some enzymes of primary metabolism also differs greatly. For instance, the different considerations proposed for glycolysis and the Krebs cycle in the present work and by Negri et al. (2008a) could be a consequence of the different levels of such molecules in Barbera and Pinot Noir mature skins, which were measured as 3.2 and 1.7 mg malvidin g−1 berry, respectively. Little is known at the moment about the relationship between primary and secondary metabolism in developing berries, and these aspects deserve further investigation. The high discriminant coefficients of such proteins in our models stress the relevance of primary metabolism processes during ripening.

We found two spots (spots 2100 and 2126) corresponding to the α and β subunits of the mitochondrial ATP synthase complex (ATPsyntα and ATPsyntβ, respectively). Being localized in F1, the soluble portion of the protein, they are known to represent the catalytic unit of the complex, which is directly involved in ATP production. Both spots showed high discriminant coefficients for the RS37 model because their levels dropped at maturity. This trend of expression has already been described for the α-subunit in Nebbiolo berries, while in the same study the β-subunit appeared to be upregulated after véraison (Giribaldi et al. 2007). Little is known about this key protein, its regulation and ATP biosynthesis during this period. However, a proteomic work performed on membranes of Cabernet Sauvignon berries (Zhang et al. 2008) indicated that the levels of mitochondrial ATP synthase did not vary during ripening; it is therefore possible that the expression of these subunits is cultivar-specific and strictly linked to other traits of berry metabolism. In this view, we can argue that the low amount of this protein in ripe berries could be linked to a possible decrease in energy metabolism, in accordance with the data on glycolytic enzymes discussed above.

It was reported that during ripening, a loss of photosynthetic capacity occurs and that a marked negative regulation of the genes involved in this process, observed through microarray analyses, takes place (Pilati et al. 2007). In agreement with these results, we observed the shrinkage of spot 2579, identified as the 33 kDa precursor of one of the three extrinsic proteins that take part in the regulation of the Oxygen Evolving Complex (33OEC). 33OEC, in particular, is involved in the stabilization of the tetranuclear manganese cluster under physiological salt concentrations and in accelerating O2 evolution activity (Bricker and Frankel 1998; Popelkova et al. 2002).

GDP-mannose-3′,5′-epimerase (ManEp) catalyzes the reversible conversion of GDP-D-mannose to GDP-L-galactose and to GDP-L-gulose, and it was suggested that this reaction is a key regulatory step of ascorbic acid (AsA) biosynthesis (Wolucka et al. 2001; Major et al. 2005). A recent comparison among kiwifruit genotypes differing in AsA levels indicated that the concentration of the vitamin is correlated with a high amount of GDP-mannose-3′,5′-epimerase and GDP-L-galactose guanyltransferase; moreover, overexpressing the genes of the two enzymes in transgenic kiwifruit plants led to a significant increase in AsA levels (Bulley et al. 2009). In our work, we selected the relative spot (spot 2233) because it was characterized by an extremely positive coefficient for the RS37 model and a very negative one for RS38, reflecting the fact that the levels of ManEp peaked in the second ripening stage and dropped in mature skins. This behavior is similar to that reported for kiwifruit: after peaking 2 weeks after anthesis, the levels of ManEp declined sharply towards maturity, paralleling the trend of ascorbate levels in the fruit during development (Bulley et al. 2009).

Other proteins

S-adenosylmethionine synthetase (SAMS) is a protein involved in the biosynthesis of S-adenosyl-L-methionine (SAM). In addition to being the precursor in the biosynthesis of the polyamines spermidine and spermine and of the plant hormone ethylene, SAM is the principal methyl donor of the cell, being the substrate of many specific methyltransferases that act on a wide range of molecules, from nucleic acids to secondary metabolites (Moffatt and Weretilnyk 2001). Spot 2215, whose level declined sharply in mature skins, resulting in a very negative coefficient for RS38, was identified as SAMS. The role of this enzyme during ripening is not clear, especially for non-climacteric fruit such as grape, where the function(s) of ethylene has not yet been clarified. However, our results appear to be in agreement with those of Deytieux et al. (2007), who found three isoforms of this enzyme that were more expressed at the end of color change than at maturity.

Moreover, in plants, adenosine kinase (ADK) is essential for transmethylation activities because it is involved in the recycling of S-adenosyl-L-homocysteine (SAH), the by-product of SAM-dependent reactions. SAH is a competitive inhibitor of methyltransferases and has to be continuously removed in order to maintain cellular methylation reactions. In our work, spot 2444, identified as ADK, showed a largely positive discriminant coefficient for RS38 because it accumulated considerably in mature berries, as already described for Barbera grapes (Negri et al. 2008a). These results suggest that ADK might be an interesting marker for grape ripening and open new perspectives on the relevance of transmethylation events during the process.

We characterized two spots (Hsp70 and Hsp17.6, spots 1849 and 3006, respectively) belonging to the family of heat shock proteins (Hsp) that showed a sharp decline as maturation progressed. These data agree with the report that a peak of Hsp levels occurs around véraison (Giribaldi et al. 2007; Brennan et al. 1979) and might be interpreted as a consequence of the intense redirection of metabolism during this phase, which requires the stabilization of old and newly synthesized proteins.

The fact that in our work we found a spot corresponding to actin (Act, spot 2253) that showed a high discriminant coefficient for RS37 suggests that caution is needed when employing actin genes as references for normalization in grape expression analyses (Reid et al. 2006). In accordance with this, Giribaldi et al. (2007) noticed that four spots identified as actin were modulated throughout Nebbiolo berry development. This could be explained by the evidence that berry size increases during ripening mainly because of cell enlargement. This event has to be accompanied by changes in cytoskeleton structure and, possibly as a consequence, by variations in actin levels.

Cytochrome b5 (Cytb5) is a tail-anchored protein that is known to function in higher plants as an intermediate electron donor in the desaturation of fatty acids (Smith et al. 1990) and of sterol precursors (Rahier et al. 1997), and in the hydroxylation of oleate (Smith et al. 1992). A possible involvement of Cytb5 in the determination of flower color was also described: de Vetten et al. (1999) reported that the inactivation of the relative gene by transposon mutagenesis resulted in a reduced 3′,5′ substitution of anthocyanins. In this work, we describe for the first time, to our knowledge, the detection of this protein in grape skins. We found a spot (spot 2635) corresponding to Cytb5 that showed a highly positive discriminant coefficient for the second ripening stage, RS37, and a negative one for RS38 in the linear discriminant analysis performed on the whole dataset. In general, little information is available about this protein; further investigations are therefore needed to elucidate Cytb5 activity in grape skins.

Among the spots with high discriminant coefficients we identified some proteins, i.e., an aminopeptidase N (spot 1752), Tif3b1 (spot 1772), a 2-nitropropane dioxygenase (spot 2450), a 60S acidic ribosomal protein P0 (spot 2467), a NADH isoflavone reductase-like protein (spot 2483), and a pyridoxine biosynthesis protein (spot 2505), whose role during grape ripening is difficult to discuss because of the lack of knowledge about their activities.

Linear discriminant analysis on the last two ripening times

Since the main differences arose between RS35 and RS37 on one side and RS38 on the other, the analysis was restricted to the samples belonging to the second and third sampling periods, in order to test whether this approach can isolate a panel of candidate markers useful for discriminating the last stages of berry ripening, which are known to be quite similar according to classical phenological and chemical parameters. PCA was applied to this restricted set of data; again, the amount of information accounted for by each PC reflects poorly correlated data (PC1 = 18.8%, PC2 = 15.3%, PC3 = 11.9%, PC4 = 10.8%, PC5 = 7.8%).

Again, linear discriminant analysis was applied to the first ten PCs, using a stepwise algorithm in forward search (F-to-enter = 6) to select only the PCs effective for classification purposes. Only the second PC was included in the final model at a significance level of 95% (p level <0.05). The coefficients of the models calculated for both ripening stages show a positive contribution of PC2 to class RS37 and a negative contribution to class RS38. Samples indeed appeared well separated on the second PC, according to the scores (Online Resources 5): samples from class RS37 showed positive scores, while samples from class RS38 all showed negative scores along this principal component.
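
The entry criterion used in the forward search can be illustrated with a minimal sketch. It covers only the first step of a forward selection, where each candidate PC is screened with a univariate one-way F statistic against the F-to-enter threshold; later steps of a full stepwise algorithm would use partial F values conditional on the PCs already entered, and these are not shown. The function names and the toy scores are hypothetical and are not taken from the dataset of this work.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for one variable split into class groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of classes
    n = len(all_values)    # number of samples
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def screen_components(scores_by_class, f_to_enter=6.0):
    """Return the (0-based) indices of PCs whose univariate F exceeds f_to_enter.

    scores_by_class maps a class label to a list of samples, each sample
    being the list of its scores on PC1, PC2, ...
    """
    n_pcs = len(next(iter(scores_by_class.values()))[0])
    selected = []
    for j in range(n_pcs):
        groups = [[sample[j] for sample in samples]
                  for samples in scores_by_class.values()]
        if f_statistic(groups) > f_to_enter:
            selected.append(j)
    return selected


# Hypothetical scores on three PCs (illustration only, not measured data).
toy_scores = {
    "RS37": [[0.5, 1.2, -0.3], [-0.8, 0.9, 0.4], [0.2, 1.5, 0.1], [-0.1, 0.4, -0.2]],
    "RS38": [[0.3, -1.1, 0.2], [-0.6, -0.7, -0.4], [0.4, -1.4, 0.3], [-0.2, -0.5, -0.1]],
}

selected = screen_components(toy_scores)  # index 1 corresponds to PC2
```

In this toy example only the second PC separates the two classes, so it is the only one passing the threshold, which mirrors the situation described above.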

LDA in calibration provided very good results, since all samples were correctly classified (NER = 100% and specificity of 100% for both classes) and no overlap was detected (selectivity of 100% for both classes). The performance of the model in prediction was evaluated in cross-validation by a leave-one-out procedure; the classification matrix confirmed the good results obtained in calibration, with only one wrong assignment, corresponding to sample pn2.2a. This sample was in fact characterized by the smallest positive score along PC2, making it, within class RS37, the closest to the samples belonging to the third ripening stage. The values calculated for selectivity and specificity of both classes in validation also showed good results: selectivity = 100% and 83% for classes RS37 and RS38, respectively, and specificity = 83% and 100% for classes RS37 and RS38, respectively. The overall NER is 91.6%.
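
The validation figures can be retraced from a classification matrix with the short sketch below. Two points are assumptions and not statements of this work: the class sizes (six samples per class, inferred from the fact that 83% is close to 5/6 and 91.6% is close to 11/12) and the definitions adopted, i.e., specificity as the percentage of samples of a class that are assigned to it, and selectivity as the percentage of samples of the other classes that are not assigned to it. These definitions were chosen because they reproduce the reported values.

```python
def classification_metrics(confusion):
    """Compute NER and per-class specificity/selectivity (all in %).

    confusion[true_class][assigned_class] = number of samples.
    Definitions are assumptions made for this sketch (see text above).
    """
    labels = list(confusion)
    total = sum(confusion[t][a] for t in labels for a in labels)
    correct = sum(confusion[c][c] for c in labels)
    ner = 100.0 * correct / total  # non-error rate: overall correct assignments

    per_class = {}
    for c in labels:
        in_class = sum(confusion[c][a] for a in labels)
        outside = total - in_class
        wrongly_accepted = sum(confusion[t][c] for t in labels if t != c)
        per_class[c] = {
            "specificity": 100.0 * confusion[c][c] / in_class,
            "selectivity": 100.0 * (outside - wrongly_accepted) / outside,
        }
    return ner, per_class


# Leave-one-out classification matrix, assuming six samples per class and
# a single RS37 sample (pn2.2a) assigned to RS38.
loo_confusion = {
    "RS37": {"RS37": 5, "RS38": 1},
    "RS38": {"RS37": 0, "RS38": 6},
}

ner, per_class = classification_metrics(loo_confusion)
```

Under these assumptions the sketch returns 5/6 (about 83%) for the specificity of RS37 and the selectivity of RS38, 100% for the other two values, and 11/12 for the NER, in line with the values reported above.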

In this case too, the relationship between the PCs and the spot volumes made it possible to assess the relevance of the original variables in defining each model.
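
Since each PC is a linear combination of the original spot volumes, a discriminant coefficient defined on the PCs can be re-expressed on the spots by weighting the PC loadings. The following is a minimal sketch of this back-projection, assuming that the weight of a spot is the sum, over the PCs included in the model, of its loading multiplied by the coefficient of that PC; the spot names, loadings and coefficient are hypothetical, and the exact procedure used in this work may differ.

```python
def spot_weights(loadings, pc_coefficients):
    """Project discriminant coefficients from PCs back onto the spots.

    loadings: {spot_id: [loading on PC1, loading on PC2, ...]}
    pc_coefficients: {0-based PC index: coefficient in the class model}
    """
    return {
        spot: sum(row[k] * coef for k, coef in pc_coefficients.items())
        for spot, row in loadings.items()
    }


# Hypothetical loadings on PC1 and PC2 for three made-up spots.
toy_loadings = {
    "spot_A": [0.10, 0.30],
    "spot_B": [0.20, -0.25],
    "spot_C": [-0.30, 0.02],
}

# Hypothetical RS38 model containing only PC2 (index 1) with a negative coefficient.
rs38_model = {1: -2.0}

weights = spot_weights(toy_loadings, rs38_model)
# Rank spots from the most positive to the most negative weight for RS38.
ranking = sorted(weights, key=weights.get, reverse=True)
```

When a single PC is retained, as in the model described above, the spot weights are simply proportional to the loadings on that PC, so spots with large loadings of opposite sign end up at the two extremes of the ranking.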

The results for the third ripening stage are represented graphically in Fig. 3a–b, where the most discriminating spots are plotted on a “virtual” map according to their molecular weight and pI. The variables were separated into two representations according to their positive or negative weight in the final model and are shown on a color scale from light gray to black according to an increasingly negative (Fig. 3a) or positive (Fig. 3b) coefficient. Negative values correspond to spots showing a down-regulation, while positive values correspond to spots that are more abundant at the third stage. Obviously, the variables showed the opposite behavior at the second ripening stage, RS37. This representation should be useful for identifying the possible markers on the “virtual” map, where they appear as the darkest spots.

Fig. 3

Discriminant variables for the LDA model of the third ripening stage: negative (a) and positive (b) coefficients are represented on a “virtual” map; the spots are represented on a color scale from light gray to black according to the increasingly negative or positive coefficient

Among the spots that we identified through LC-ESI-MS/MS, some were also highly discriminant in these models. It is interesting to note, for instance, that all the previously discussed PR proteins, the levels of which rose sharply between the last two stages, showed large positive coefficients for RS38. On the other hand, PPOs, G3PDH, 33OEC, ATPsyntα and β, ManEp, SAMS, and Cytb5, being more abundant in RS37 than in RS38, were among the variables with the most negative coefficients for RS38.

Concluding remarks

In this study, the combination of 2-DE analysis and FS-LDA performed on principal components proved to be an interesting and robust tool that could be employed for the isolation of ripening markers in future studies.

The fact that, besides new candidates, many of the identified spots corresponded to proteins already known to be involved in the ripening of grape or of other fruits (e.g., chitinase, β-1,3-glucanase, aconitase, enolase, heat shock proteins) confirmed the reliability of the approach.

Moreover, the multivariate statistical analyses unexpectedly distinguished the last two collected samples, suggesting that these techniques might give some insight into the right time for grape harvesting, whereas the chemical and phenological parameters usually employed do not, in many cases, return unequivocal results. This evidence paves the way for studying the differences occurring among grape samples collected at closer intervals in the period around harvest.