Introduction

Apple (Malus x domestica Borkh.) is one of the major temperate fruit crops, and it is appreciated by consumers for its taste, flavour and nutritional attributes. Apple fruit is an important source of minerals, vitamins, fibres and antioxidants in the human diet. M. x domestica Borkh. belongs to the Rosaceae family, and it is an ancient allopolyploid species with a basic haploid number x = 17. Since the fruit is the edible part of the plant, it is of the highest importance to understand the molecular mechanisms underlying fruit development and ripening and thus to unravel the complex interactions that determine fruit quality. The identification of the genes involved in those processes will make it possible to combine alleles with positive effects on the traits of interest and thereby to select new and improved varieties. Fruit development is determined by the coordinated action of multi-faceted biological processes eventually affecting colour, texture, flavour, sugar content and many other features of the mature fruit. After fruit set, morphological changes culminate when the fruit reaches its maximum size. It is at that stage that metabolic and physiological modifications take place to confer the well-known agreeable characteristics of the mature fruit (Gillaspy et al. 1993; Alba et al. 2005). Fruits can be classified in two groups according to the ripening mechanism: (1) climacteric, when respiration and ethylene biosynthesis increase during ripening, and (2) non-climacteric, when respiration does not change significantly and ethylene production remains low (Alexander and Grierson 2002). Apple is a climacteric fruit, like tomato, which can be considered the model species of the climacteric group. In tomato, the biosynthetic pathway of ethylene is well known, and the effects of the hormone have been extensively studied.
Ethylene is formed in two steps: (1) initially, S-adenosyl-L-methionine (AdoMet) is transformed into 1-aminocyclopropane-1-carboxylic acid (ACC) by ACC synthase (ACS), and (2) subsequently, ACC is converted to ethylene by ACC oxidase (ACO) (Alexander and Grierson 2002). The plant senses the presence of ethylene through a family of histidine kinase-like receptors that negatively regulate ethylene responses (Chang and Shockey 1999). A molecular cascade transduces the signal to the nucleus, where transcription factors (e.g. in tomato, LeEIL and LeERF) regulate the expression of the ethylene-responsive genes (Adams-Phillips et al. 2004). Modifications in the expression level of ethylene-responsive genes lead to autocatalytic ethylene production, which causes changes to the cell wall metabolism, alters the synthesis of volatile compounds, increases sugars and acids, and stimulates the synthesis of carotenoids in tomato and that of flavonoids and anthocyanins in apple (Giovannoni 2001).

All of these metabolic changes contribute to fruit quality determining whether the fruit is appreciated by the consumer or not. Breeders have always selected new cultivars with improved fruit characteristics in order to meet the market demand. Nowadays, the progress of molecular biology techniques provides the breeders with new tools that can make selection strategies more efficient. However, one of the critical steps required to carry out effective molecular breeding programs is the ability to recognise the link between the molecular function of the protein performing a certain cellular reaction and the influence that such reaction has on the actual phenotype of the plant. In the vast majority of the cases, molecular breeding relies solely on the identification of the chromosomal regions where a gene controlling a complex trait is located (QTL). Several QTL mapping studies have been carried out in apple on physiological traits controlling for example: stem diameter, leaf size, flowering time, number of flower bunches, juvenile phase length, number of fruits, fruit weight, fruit flesh firmness, fruit texture and fruit acidity and sugar content (King et al. 2000; King et al. 2001; Liebhard et al. 2003). More recently, other fruit quality-related QTLs, controlling volatile compounds (Zini et al. 2005) and vitamin C content (Davey et al. 2006), have been mapped.

Breeders are mainly interested in fruit quality and disease resistance. QTLs controlling resistance against the two most important apple diseases, apple scab and powdery mildew, have been identified (Calenge et al. 2005). The necessary step to provide breeders with powerful molecular tools is therefore the identification of the genes underlying the QTLs, thus identifying the allele variant responsible for the improvement of the phenotypic trait. One possible strategy to reach such goal is through the exploitation of genetic linkage maps, enriched with gene-derived markers, to identify candidate genes co-localising with the locus responsible for the variation of the agronomic trait of interest (Pflieger et al. 2001). A valid alternative is represented by the analysis of mutants, a genomic tool largely employed in tomato (Moore et al. 2002). However, due to the complexity of the apple system (mainly long generation time and long juvenile period), such approach is not easily applicable.

Reverse genetics can be considered a valid approach to identify genes controlling traits of interest. Usually, candidate genes are selected and expressed sequence tag (EST) sequences are retrieved from public databases; allele variants are then associated with the phenotype change. This approach has been used in tomato (Causse et al. 2004), in apricot (Grimplet et al. 2005) and in peach, which can be considered the model species for fruit trees (Horn et al. 2005). Currently, more than 261,000 apple ESTs are available in GenBank, mainly produced by two apple EST sequencing projects carried out in 2006. One was performed by the Horticultural and Food Research Institute of New Zealand (Newcomb et al. 2006), and the second was carried out at Michigan State University, USA (Park et al. 2006). When the present work started, only 700 apple ESTs were publicly available; hence, within the framework of the European project High-Quality Resistant Apples for a Sustainable Agriculture (Gianfranceschi and Soglio 2004), we decided to use a cDNA microarray approach to identify genes putatively involved in fruit quality traits. cDNA microarray technology can be used without prior sequence information, and it allows the analysis of gene expression profiles in plant tissues at different developmental stages or the comparison of transcript level changes between different genotypes (Alba et al. 2004; Clarke and Zhu 2006). Since our main interest was fruit quality, we focused our study on the identification of genes differentially expressed during fruit development. A similar approach has been successfully applied in peach (Trainotti et al. 2006) and in pear (Fonseca et al. 2004), two climacteric fruits, and in citrus, a non-climacteric fruit (Cercós et al. 2006).
The ultimate goal of our work was the identification of genes responsible for QTLs controlling fruit quality traits and to provide a general overview of the major metabolic processes occurring during fruit development. Here we present the analysis of the transcription profiles as observed in the apple cultivar Prima during fruit development. Our results will be discussed and compared with those obtained from similar studies.

Materials and methods

Plant material for cDNA library preparation

In the experimental orchard of Plant Research International in Elst (The Netherlands), fruits from six trees of M. x domestica Borkh., cultivar Fiesta, were harvested at three developmental stages (an early one: 16 May 2003, fruit size 10–15 mm; a middle one: 3 July 2003, fruit size 50–58 mm and a ripening one: 26 August 2003, fruit size 50–58 mm). Fruits were always harvested from the same six trees. Young leaves from the same cultivar and from the same trees were also harvested.

cDNA microarray slide preparation

Total RNA from fruit flesh and leaves of Fiesta was isolated according to Zeng and Yang (2002). mRNA was purified from total RNA using the GE Healthcare mRNA purification kit. Two micrograms of mRNA extracted from the leaves and 3 μg of mRNA pooled from the fruits (1 μg of each fruit stage) were used to construct subtractive libraries employing the PCR-Select cDNA Subtraction kit (Clontech). Two libraries were made: fruit versus leaf (named FL) and fruit versus fruit (named FF, produced for normalisation purposes). cDNA fragments were cloned in pGEM-T easy vectors (Promega) and transformed into Escherichia coli JM109E strain. Each library consisted of 768 clones stocked in eight plates of 96 wells. One plate per library was sequenced for quality check. All the 1,536 clone inserts were amplified by colony polymerase chain reaction (PCR) (forward primer: ATACGACTCACTATAGGGCG; reverse primer: ATTTAGGTGACACTATAGAATAC). PCR products were purified using QiaQuick 96 Biorobot plates (Qiagen). Some cDNA samples were prepared to be spotted on the cDNA microarray slides as controls. Amplified inserts of three yeast clones (yeast aspartate kinase, J03526; imidazoleglycerolphosphate dehydratase, Z75110 and phosphoribosylaminoimidazole carboxylase, Z75036) were chosen as negative controls. Four amplified fragments of the luciferase gene were used as positive controls. All controls were purified by means of QiaQuick columns (Qiagen). All PCR samples and control cDNAs were transferred into five 384-well plates and dried at 37°C in a flow cabinet. After addition of 12 μl of 5× saline-sodium citrate (SSC) per well, the samples were shaken for 1 h and printed in duplicate onto 50 amino-silane coated slides from Corning. The first slide was used for checking the printing quality.

Plant material for cDNA microarray hybridisations

In the experimental orchard of Bologna University in Cadriano (Italy), fruits of three trees of M. x domestica Borkh., cultivar Prima, were harvested at five different developmental stages: 16 May 2003: 46 days after full bloom (DAFB; fruit size 20–25 mm); 16 June 2003: 76 DAFB (40–50 mm); 14 July 2003: 104 DAFB (50–60 mm); 13 August 2003: 133 DAFB (70–75 mm); 15 September 2003: 165 DAFB (70–75 mm).

Fruits were always harvested from the same three trees. Immediately after harvest, peel and seeds were removed from all fruits, and the flesh was frozen in liquid nitrogen and stored at −80°C until used.

cDNA microarray hybridisation and data analysis

The “common reference design” was adopted as the experimental design for the cDNA microarray hybridisations (Alba et al. 2004). The May cDNA sample was chosen as the “reference” sample; therefore, each cDNA from the other four developmental stages (June, July, August and September) was compared to May cDNA. Two hybridisations, with a dye-swap, were performed for each comparison.

Total RNA was isolated from fruit flesh of cultivar Prima harvested at five different developmental stages following the protocol reported by Zeng and Yang (2002). An indirect incorporation of the Cy3 and Cy5 dyes was adopted to avoid low incorporation rates and bias. Twenty micrograms of total RNA from each of the two stages being compared was reverse-transcribed and labelled using the CyScribe post-labelling kit (GE Healthcare) according to the manufacturer's protocol. The probes were dissolved in the hybridisation buffer (ArrayHyb Low Temp Hybridisation buffer, Sigma) and mixed. Twenty micrograms of denatured salmon sperm DNA was added to the probe to reduce non-specific hybridisation. The probe mix was deposited onto the slide, and the slide was placed in a hybridisation chamber (Corning). Hybridisations were performed overnight at 50°C in the dark. All washing steps were carried out in the dark, and all washing solutions were pre-warmed to 65°C. A total of five washing steps in 0.1% sodium dodecyl sulfate and decreasing concentrations of SSC (2×, 1×, 0.1×) were used. Finally, each slide was washed five times in MilliQ water for 2 min.

Slides were scanned using a ScanArray 4000 (Packard PE BioChip Technologies), employing the software ScanArray version 3.1. Laser power was set between 75% and 85%, and the photomultiplier gain was determined by the auto-balance feature of the software. Scans were conducted at a resolution of 5 μm. Two separate TIFF images were produced. Using the software QuantArray version 3.0, the two images were merged, and spot intensity was measured. Images were examined visually, and non-uniform spots were removed from further analysis.

Data collected from cDNA microarray hybridisations were analysed using GeneSpring software (SiliconGenetics). Initially, raw data were normalised employing a method based on Locally Weighted Regression Scatter Plot Smoothing (LOWESS). The software calculated the expression ratio of each gene in the two analysed developmental stages. The ratio values were then filtered on the basis of two criteria: “40% fold change” and “Expression level”. The first criterion verified the repeatability of the hybridisation replicates: for each comparison, two hybridisations (with a dye-swap) were performed, and if the intensity of two spots representing the same gene on the two slides differed by more than 40%, the spot was excluded from further analysis. Data that passed the “40% fold change” criterion were then filtered on “Expression level” with cut-offs of 0.5 and 2: clones with an intensity ratio higher than 2 or lower than 0.5 were considered differentially expressed (Alba et al. 2004; Clarke and Zhu 2006). The values of the two intra-slide replicates were not averaged but analysed separately; a clone was considered differentially expressed only when both intra-slide replicates agreed (both above 2 or both below 0.5).
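
The two filtering criteria can be illustrated with a short Python sketch. It is only an illustration under stated assumptions and not the GeneSpring procedure: the function names are hypothetical, the ratios are assumed to be expressed as stage/May after correcting for dye orientation, spots are assumed to be paired by position across the two dye-swap slides, and the 40% difference is assumed to be the absolute difference divided by the mean of the two ratios.

```python
from typing import Optional, Sequence


def relative_difference(a: float, b: float) -> float:
    """Absolute difference of two ratios relative to their mean (assumed definition)."""
    return abs(a - b) / ((a + b) / 2.0)


def classify_clone(
    slide_a: Sequence[float],
    slide_b: Sequence[float],
    max_diff: float = 0.40,
    up: float = 2.0,
    down: float = 0.5,
) -> Optional[str]:
    """Classify one clone from its replicate expression ratios.

    slide_a and slide_b hold the intra-slide replicate ratios measured on
    the two dye-swap hybridisations. Returns None when the clone fails the
    repeatability filter, otherwise 'up', 'down' or 'unchanged'.
    """
    # Criterion 1: matching spots on the two slides must differ by <= 40%.
    for a, b in zip(slide_a, slide_b):
        if relative_difference(a, b) > max_diff:
            return None
    # Criterion 2: replicates are not averaged; all of them must agree.
    ratios = list(slide_a) + list(slide_b)
    if all(r > up for r in ratios):
        return "up"
    if all(r < down for r in ratios):
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical ratios for four clones.
    print(classify_clone([2.5, 2.4], [2.6, 2.3]))
    print(classify_clone([0.30, 0.40], [0.35, 0.45]))
    print(classify_clone([2.5, 1.8], [2.4, 1.9]))
    print(classify_clone([3.0, 3.0], [1.5, 1.5]))
```

Requiring every replicate to pass the threshold, rather than their mean, mirrors the conservative choice described above: a single discordant spot is enough to keep a clone out of the differentially expressed set.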

The differentially expressed cDNA clones in at least one comparison were sequenced at Greenomics™ (Wageningen, The Netherlands). All sequences were submitted to National Center for Biotechnology Information dbEST (database of “Expressed Sequence Tags”). Sequences were assembled using the CAP3 program (overlap length cutoff, 30 bp/nt and overlap percent length cutoff, 90%) to identify redundancy (Huang and Madan 1999). Duplicated sequences were assembled into contigs.

Gene ontology (GO) annotation was employed to describe the differentially expressed genes and their products unambiguously. In order to assign a molecular function, the sequences found to be differentially expressed in the microarray experiments were compared, using blastx, with M. x domestica Putative Unique Transcripts (PUT sequences). When an apple unique transcript with high similarity was found, the protein with the highest similarity and the assigned GO terms were annotated to the sequence (www.plantgdb.org).

EPCLUST software (http://www.bioinf.ebc.ee/EP/EP/EPCLUST/) was used to cluster genes according to their expression profile. To be able to compare our results with those obtained by Fonseca and coworkers (Fonseca et al. 2004), a complete linkage hierarchical clustering (Euclidean distance) was chosen.
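
For readers unfamiliar with the method, complete linkage hierarchical clustering with Euclidean distance can be sketched in plain Python as follows. This is a naive illustration of the general technique and not the EPCLUST implementation; the function names, the gene labels, the example profiles and the choice to stop merging at a fixed number of clusters are all assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def complete_linkage(
    profiles: Dict[str, Sequence[float]], n_clusters: int
) -> List[List[str]]:
    """Agglomerative clustering; merging stops when n_clusters remain."""
    clusters: List[List[str]] = [[name] for name in profiles]

    def linkage(c1: List[str], c2: List[str]) -> float:
        # Complete linkage: distance between the two most distant members.
        return max(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > max(n_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair of clusters.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical log2 ratios (JUN, JUL, AUG, SEP versus May).
    example = {
        "gene_a": [1.0, 2.0, 3.0, 3.2],
        "gene_b": [1.1, 2.1, 2.9, 3.0],
        "gene_c": [-1.0, -2.0, -3.0, -3.0],
        "gene_d": [-0.9, -2.2, -2.8, -3.1],
        "gene_e": [0.0, 0.1, 0.0, -0.1],
    }
    print(complete_linkage(example, 3))
```

Complete linkage tends to produce compact clusters because two groups are merged only when even their most dissimilar members are close, which suits the grouping of transcripts with similar profiles across all four comparisons.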

Quantitative reverse transcription PCR

Primer3 software (Rozen and Skaletsky 2000) was used to design primers for quantitative reverse transcription PCR (RT-PCR) experiments. cDNA clone sequence information was used to obtain amplicons with a maximum size of 200 bp. The list of primer sequences is reported in Table S1 (Electronic supplementary material). One microgram of total RNA isolated from fruit flesh at each developmental stage was treated with DNase (Sigma-Aldrich) and reverse-transcribed using the iScript™ cDNA Synthesis kit (Bio-Rad). The reverse transcription was performed twice, in two independent experiments: the first time employing the same total RNA used in cDNA microarray hybridisations and the second time starting from freshly isolated total RNA. Two biological replicates were thus analysed. Quantitative RT-PCR reactions were performed in a final volume of 25 μl containing 100 ng of cDNA, 0.75 μl of specific primers 10 μM (final concentration 0.3 μM) and 12.5 μl of 2× iQ™ SYBR® Green Supermix (Bio-Rad). For each sample, three replicates were set up. Amplifications were conducted in an iCycler iQ® thermocycler (Bio-Rad) according to the following protocol: denaturation step at 95°C for 3 min followed by 45 cycles of 95°C for 10 s and 60°C for 45 s. Primer efficiency was evaluated by a standard curve created using a fourfold dilution series of May cDNA. The absence of primer dimers and the uniqueness of the amplified products were assessed by post-amplification dissociation curve analysis (denaturation at 95°C for 1 min, cooling to 55°C for 1 min, and gradual heating at 0.5°C cycle−1 to a final temperature of 95°C). In each quantitative RT-PCR experiment, 18S rRNA (DQ341382) was chosen as the reference gene for data normalisation. Quantification of gene expression level was performed following the method reported by Pfaffl (2001).
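
The quantification step can be illustrated with a minimal Python sketch of the efficiency-corrected relative expression ratio proposed by Pfaffl (2001), with the May sample as calibrator and 18S rRNA as reference gene. The function names and all numerical values are hypothetical. The amplification efficiency is derived here from the slope of the standard curve (Ct plotted against log10 of the relative template amount) as E = 10^(−1/slope), a common convention assumed for this example.

```python
import math
from typing import Sequence


def standard_curve_efficiency(quantities: Sequence[float], cts: Sequence[float]) -> float:
    """Amplification efficiency E = 10**(-1/slope) from a dilution series.

    The slope is obtained by least-squares regression of Ct on log10(quantity).
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return 10 ** (-1.0 / slope)


def pfaffl_ratio(
    e_target: float,
    e_ref: float,
    ct_target_control: float,
    ct_target_sample: float,
    ct_ref_control: float,
    ct_ref_sample: float,
) -> float:
    """Relative expression of the target gene in a sample versus the control.

    ratio = E_target**(Ct_control - Ct_sample) / E_ref**(Ct_control - Ct_sample)
    """
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


if __name__ == "__main__":
    # Hypothetical fourfold dilution series with ideal doubling per cycle.
    dilutions = [1.0, 1 / 4, 1 / 16, 1 / 64]
    cts = [20.0, 22.0, 24.0, 26.0]
    efficiency = standard_curve_efficiency(dilutions, cts)
    print(round(efficiency, 3))
    # Hypothetical Ct values: control = May, sample = a later stage.
    print(pfaffl_ratio(2.0, 2.0, 25.0, 23.0, 15.0, 15.0))
```

Using gene-specific efficiencies rather than assuming perfect doubling keeps the estimated ratio meaningful when target and reference primers amplify at different rates.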

Results

Analysis of cDNA microarray data

The expression profile of 1,536 apple fruit cDNA clones obtained from libraries enriched for fruit specific genes was analysed to identify genes differentially expressed during fruit development and maturation.

Fruit samples of the cultivar Prima were collected monthly from May to September 2003 (46, 76, 104, 133 and 165 days after full bloom, respectively). In 2003, Prima reached commercial ripening (starch index 7) in August; thus, September samples could be considered as being in the over-ripe phase.

In every experiment, repeatability was good. On average, less than 10% of the clones showed differences in the expression level exceeding 40% between replicates. On the retained data, a twofold change criterion was used to establish the differentially expressed genes (Alba et al. 2004; Clarke and Zhu 2006). A total of 285 clones were found to be differentially expressed. Each clone was sequenced, and the sequences were assembled to account for redundancy. About 49% (141) of the sequences were unique, while the remaining 51% (144) were present more than once. Redundant sequences were assembled into 36 contigs, using CAP3 software (Huang and Madan 1999). The list of EST accession numbers belonging to each contig is reported in Table S2 (Electronic supplementary material).

Thanks to the adopted experimental design, it was possible to analyse the expression pattern of all spotted genes and to compare RNA samples at different developmental stages even when the specific hybridisation was not performed. The indirect comparisons relevant to identify the expression pattern of genes during fruit development are: June–July, July–August and August–September. Thus, the 177 unique sequences (141 singletons plus 36 contigs) include 159 genes that are differentially expressed in at least one of the direct comparisons and 18 genes that are differentially expressed in at least one indirect comparison. July–August is the indirect comparison that shows the highest number of differentially expressed genes (58, 26 of which are up-regulated and 32 down-regulated). Forty-eight of those 58 genes (83%) are also up- or down-regulated in the August–September comparison. We observed that, although the changes in expression levels in the last two developmental stages are heterogeneous, in general, the transcriptional variations observed in August maintain the same trend in September.
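
The arithmetic behind an indirect comparison in a common reference design can be sketched as follows: since every stage is measured against May, the ratio between two consecutive stages is the quotient of their ratios to May (or the difference of their log2 ratios). The function name and the numerical values below are hypothetical and serve only as an illustration; they are not the procedure or data of this study.

```python
import math
from typing import Dict, List, Tuple


def indirect_comparisons(
    ratios_vs_reference: Dict[str, float], order: List[str]
) -> Dict[Tuple[str, str], float]:
    """Derive later/earlier ratios for consecutive stages.

    ratios_vs_reference maps each stage to its expression ratio versus the
    common reference (May); order lists the stages chronologically.
    """
    result: Dict[Tuple[str, str], float] = {}
    for earlier, later in zip(order, order[1:]):
        # (later / May) / (earlier / May) = later / earlier
        result[(earlier, later)] = (
            ratios_vs_reference[later] / ratios_vs_reference[earlier]
        )
    return result


if __name__ == "__main__":
    # Hypothetical ratios of one gene versus May.
    gene = {"JUN": 1.5, "JUL": 1.2, "AUG": 3.0, "SEP": 3.3}
    derived = indirect_comparisons(gene, ["JUN", "JUL", "AUG", "SEP"])
    for (earlier, later), ratio in derived.items():
        print(f"{earlier}-{later}: ratio {ratio:.2f}, log2 {math.log2(ratio):.2f}")
```

In this invented example the gene never exceeds the twofold threshold between May and July, yet the derived July–August ratio is 2.5, which shows how a gene can be detected only through an indirect comparison.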

All sequences were compared to the apple Putative Unique Transcripts (PUT sequences), available at the Plant Genome Database (www.plantgdb.org), to collect further information and to assign a putative function to each gene (Table S3 of Electronic supplementary material). All the 177 identified unique transcripts were grouped into functional categories to obtain an overview of the most relevant protein functions occurring during fruit development (Fig. 1). A high proportion (24%) of the differentially expressed sequences encode proteins with unknown function. This result is not surprising: it is in good agreement with previously published microarray experiments on apple fruit, where a similar percentage of genes with unknown function was identified (Lee et al. 2007; Schaffer et al. 2007). Furthermore, it is interesting to note that three novel sequences (1%), showing no similarity to any sequence in public databases, have been identified.

Fig. 1

Functional categories of the 177 differentially expressed genes. The relative abundance of each category is reported as percentage

As expected, August and September are the two months in which the largest number of transcriptional changes was detected. In particular, the group of 85 genes up-regulated in September is the most numerous. A closer look at the function of the differentially expressed genes reveals that “primary metabolism” is the largest category, including 29 genes. It is also worth noting that all the genes belonging to “primary metabolism” appear to be differentially expressed in August and/or in September.

Gene ontology supplies a unified and structured classification that was proposed to describe genes and their products unambiguously. GO allows results from different species to be compared. The three main organising categories of GO are: (1) cellular component, (2) molecular function and (3) biological process. Although “biological process” might be the most interesting ontology category for the identification of metabolic pathways involved in fruit development and maturation, the other two categories were also considered to construct a more accurate survey of the functions and components involved in fruit development. A blastx comparison of the 177 ESTs with the apple PUT sequences made it possible to assign at least one GO term, considering all three organising categories, to 122 of them. Since a single protein can play different roles in the cell, more than one GO term can be associated with a single EST in each GO organising category. The list of GO terms assigned to each EST is reported as Electronic supplementary material in Table S3.

Although the functional classification of the differentially expressed genes is a useful way to analyse microarray data, it is the combination of this approach with clustering methods that is more enlightening. Cluster analysis identifies groups of transcripts showing similar expression patterns, which are thus possibly involved in correlated biological processes. For this purpose, the 177 identified genes were clustered to find out whether similar expression patterns are associated with genes belonging to the same functional categories. Six different clusters were defined (Fig. 2). Clusters I and II contain genes whose expression increases from July to September, thus during the pre-climacteric phase. However, cluster I includes genes showing a dramatic increase in July, while cluster II contains genes showing a more moderate increase. Genes grouped in cluster III show either swinging expression patterns or minimal changes until August, followed by a moderate increase or decrease. Cluster IV contains genes showing a considerable decrease in expression from July. Cluster V comprises genes with a gradual decrease of expression during all the stages, while genes belonging to cluster VI show a more dramatic decrease during all stages.

Fig. 2

Hierarchical cluster analysis (using EPCLUST software) of transcript levels from the 177 genes differentially expressed during apple fruit development (JUN June, JUL July, AUG August and SEP September). Each transcript is identified by the accession numbers of ESTs, while contigs are reported with a progressive number. Each row represents the expression profile of a single EST or contig. Red boxes mean high levels of expression compared to May, and green boxes mean lower expression levels. The colour brightness is directly proportional to the expression ratio. Black boxes are genes not significantly differentially expressed. Dashed blue lines group ESTs according to the six determined clusters (I–VI). The diagrams report the expression patterns of genes belonging to each cluster. In these diagrams, y-axis represents gene expression ratio (log2) and x-axis the stages of fruit development. Line colour in the graphs on the right has the same meaning as in boxes

cDNA microarray data validation

cDNA microarray results were validated by quantitative RT-PCR (qRT-PCR), a technique that allows a more precise quantification of gene expression levels. Following different criteria, 17 genes, found to be differentially expressed from microarray experiments, were chosen for this purpose. Some of the selected genes showed significant changes in the expression level in more than one comparison; therefore, the total number of comparisons selected for validation was 26. Both down- and up-regulated genes were chosen. We decided to validate genes covering a wide range of transcriptional changes in order to verify the reliability of microarray results even when close to the selected thresholds (Table 1). A total of 18 (69%) selected microarray data were validated, whereas eight (31%) could not be confirmed by qRT-PCR experiments. The expression pattern of three genes was evaluated during all stages of fruit development by qRT-PCR and the results compared to those obtained from microarray analysis (Fig. 3). Two of the analysed genes, GD254873 (MdMYB11) and GD254975 (function unknown), show nearly the same expression pattern using the two approaches. GD254873 is characterised by a progressive decrease of expression from May to September. According to microarray results, GD254975 shows a slight decrease of expression between August and September, while according to qRT-PCR data, a very slight increase of expression in the final developmental stages was detected. For the third gene considered, GD254869 (chalcone synthase I), the order of magnitude of the relative expression obtained by the two approaches was greatly different even though both microarray and qRT-PCR show a rapid decrease in the transcript level of chalcone synthase from July to September.

Table 1 List of ESTs selected for quantitative RT-PCR validation
Fig. 3

Expression profiles of three genes obtained by cDNA microarray (left panels) and by qRT-PCR (right panels). Expression level is measured as fold change in comparison to the May sample

Discussion

Although the cDNA microarrays used in the present study have some limitations, mainly due to the number of cDNA clones (1,536) present on the slides, we showed that the strategy employed was adequate to identify genes participating in major biological processes related to fruit quality. Further evidence supporting the validity of our results comes from their comparison with those reported by Janssen et al. (2008), who utilised oligo-microarrays containing a much larger number of probes. Combining the functional classification with cluster analysis, we noticed that the majority of the differentially expressed genes are involved in primary and secondary metabolism, and as expected, their transcript level increases from May to September. The elucidation of the cellular and molecular mechanisms that lead to fruit quality is crucial to providing useful tools for molecular breeding. Our microarray analysis revealed that some key genes involved in primary metabolism are differentially expressed during fruit development. The phosphoglucomutase gene (PGM; GD254886, cluster II) is one example; its expression progressively increases from July, reaching its peak in the over-ripening stage (September). The reaction catalysed by PGM contributes to determining the fate of the cellular carbon molecules, that is, whether those molecules are routed towards malate or, alternatively, are used to synthesise complex carbohydrate polymers such as starch (Berüter 2004). During fruit growth, starch is almost entirely transformed into sucrose as respiration progressively increases. In the ripe fruit, the major substrate for respiration is malate, which can be decarboxylated to pyruvate via NADP-dependent malic enzyme (GD254910, cluster II or contig_16, cluster III). We observed that the transcript level of the gene encoding the NADP-dependent malic enzyme is highest in September fruits, when respiration activity reaches its peak.
NADP-dependent malic enzyme is also involved in malic acid degradation; thus, it affects fruit acidity (Yao et al. 2007). The reduction of malic acid can also be associated with the decrease of the enzyme involved in malic acid biosynthesis. Indeed, microarray data show a reduction in the transcript level of the NAD-dependent malate dehydrogenase (GD254856, cluster IV) in August and September.

A third very important feature contributing to fruit quality, besides sweetness and acidity, is flavour. Fruit flavour is a complex trait determined by the presence of many chemical compounds produced by interacting and interconnected biosynthetic pathways. The principal components of flavour, affecting aroma, are volatile esters, known to play a major role in the interaction between the plant and the environment. In some fruit species, such as apple, pear and banana, esters are responsible for the characteristic aroma (Beekwilder et al. 2004). Our results show that the expression of two enzymes, acetolactate synthase (GD255028, cluster III) and pyruvate decarboxylase (contig_4, cluster II), involved in ester biosynthesis (Newcomb et al. 2006), is increased in August and September. The elevated level of the transcripts agrees with the fact that it is in the final stages of ripening (August) and during the over-ripening phase (September) that the fruit acquires its characteristic aroma. Moreover, our results are supported by a previous study on pear (Fonseca et al. 2004), where an analogous expression trend of the pyruvate decarboxylase gene was reported.

Sweetness, acidity and flavour, the fruit qualities previously mentioned, affect another important trait: taste. So far, the complexity of taste remains poorly understood. Thaumatins, intensely sweet-tasting proteins, are reported to strongly affect fruit taste (Temussi 2006). In our experiments, a thaumatin-like gene (contig_14, cluster I) showed a considerable increase of expression during fruit development. However, besides the sweet taste, some thaumatins also have allergenic activity (Gao et al. 2005). Therefore, when selecting genotypes having a sweeter taste, breeders must pay particular attention to the molecules responsible for the sweet taste, taking into account the possible allergenic effect of some of those thaumatins, for example by selecting genotypes naturally producing reduced amounts of allergenic thaumatins. In recent years, consumers have become more aware of the importance of the nutritional properties of the food they eat. In this context, flavonoids, known to have antioxidant and anticarcinogenic properties (Prior 2003), might be relevant in targeting new breeding strategies. The biosynthetic pathways of two classes of flavonoids, flavonols and anthocyanins, are well known (Mehrtens et al. 2005; Newcomb et al. 2006). Chalcone synthase I (GD254869, cluster VI), chalcone-flavone isomerase (GD254847, cluster V) and putative flavanone 3-beta-hydroxylase (contig_35 and GD254896, cluster III; GD255067, cluster II) are three enzymes belonging to this pathway. The expression level of those enzymes decreases from May to September with the exception of the flavanone 3-beta-hydroxylase, for which an increase in the transcript level is observed. Flavanone 3-beta-hydroxylase is one of the key enzymes located at the bifurcation leading on the one hand to the production of anthocyanin, responsible for fruit pigmentation, and on the other hand to the production of flavonols, known to have antioxidant properties.
cDNA microarray analysis showed a remarkable decrease in the expression of the transcription factor MdMYB11 (GD254873, cluster VI) and of chalcone synthase, MdCHS1, during fruit ripening. Since transcription factors are key elements in controlling biosynthetic pathways, understanding their role is very important for breeding purposes, as demonstrated by Espley et al. (2007), who reported that MdMYB10, a transcription factor regulating anthocyanin biosynthesis, controls pigmentation of leaves and fruit flesh in the ‘Red Field’ cultivar.

Cluster analysis revealed that some genes involved in amino acid and protein metabolism, such as a ubiquitin homolog (GD254821), cullin-1 (GD254816) and ubiquitin-conjugating enzyme E2 (GD255059, GD254820), which are involved in the ATP/ubiquitin-dependent non-lysosomal proteolytic pathway, have similar expression patterns. Those genes, belonging to clusters IV (GD254821, GD254816) or V (GD255059, GD254820), show decreased transcript levels during fruit development and ripening. This result is in agreement with that reported by McClellan and Chang (2008). The authors showed that ubiquitin-mediated protein degradation is a key mechanism through which plants are able to quickly modulate their response to hormones such as auxin, gibberellins, abscisic acid and, last but not least, ethylene. The protein level of ACS, the rate-limiting enzyme controlling ethylene biosynthesis, was shown to be regulated by protein degradation. Moreover, ethylene receptor turnover is also controlled by a similar post-translational mechanism. In Arabidopsis, when the ethylene level is low, the SCF protein complex ubiquitinates EIN3, a transcription factor responsible for the activation of the ethylene response, which is then degraded. In contrast, when ethylene increases, EIN3 is not degraded; it is free to bind to its target genes, thus triggering the ethylene response (Guo and Ecker 2003). Homologs of AtEIN3 have been identified in tomato, LeEIL4 (Yokotani et al. 2003), and in apple, MdEIN3 (Newcomb et al. 2006). It is therefore reasonable to assume an involvement of EIN3 in ethylene synthesis and signal transduction in apple and tomato, too. Our microarray results did not show any significant change in the expression level of MdEIN3, which could well be due to the limitations of the microarray technique in revealing transcriptional changes of low-abundance transcripts, which are typical of transcription factors.
However, we observed significant changes in the expression level of the structural genes controlled by EIN3. Keeping in mind that EIN3 has a post-translational type of regulation, it is likely that EIN3 is actually controlling ethylene biosynthesis during apple development, even if its transcript level does not vary significantly.

Besides changes in cell metabolism, fruit development and ripening involve strong modifications in cell morphology; thus, changes in membrane transport and cell wall metabolism are expected. It has been reported that fruit development starts with an initial phase of cell division followed by cell expansion, mainly due to volume increase of the vacuole(s). In agreement with that, our cDNA microarray data show significant changes in the expression level of genes involved in the transport of solutes, occurring in the first stages of fruit development. A membrane transporter gene, the gamma tonoplast intrinsic protein (GD254866, cluster VI), was found to be expressed only in the initial phases of fruit development. Gamma tonoplast intrinsic protein is an aquaporin, and an analogous expression pattern has been reported in previous microarray experiments performed in young apple fruits by Lee et al. (2007). Also in agreement with Lee and co-workers is the observation that another transporter, the potassium transporter gene (GD254938, cluster IV), is abundantly expressed in young fruits (May). The potassium channel is also thought to be involved in potassium accumulation in the vacuole, thereby determining cell expansion and cell turgor. In order to achieve cell expansion, a critical step is the modification of cell wall structure and of cell–cell adhesion. Expansins are involved in cell wall loosening, allowing cells to expand. Only the alpha-expansin gene (GD254806, cluster V) was found to be differentially expressed in our microarray experiments, showing a gradual decrease of expression during fruit development and ripening. A similar expression pattern was observed by Janssen et al. (2008) for the same gene, suggesting that alpha-expansin might play a role in cell enlargement. However, due to the low transcription level of the gene at later developmental stages, alpha-expansin does not seem to be involved in the fruit softening that occurs at ripening.
Another enzyme that affects cell–cell adhesion by removing acetyl groups from pectin is pectinacetylesterase (contig_25, cluster IV). Deacetylation is known to increase the instability of pectin facilitating its degradation. The elevated expression of pectinacetylesterase in the early developmental stages and its decrease in August and September is inversely correlated with pectin content in ripening fruit.

A significant increase in the expression of MdACO1 (contig_26, cluster I), a gene encoding a key enzyme involved in ethylene production, is observed in July, followed by a further increase, peaking in August. The final phase of fruit maturation (September) is characterised by the stabilisation of the expression level, which nevertheless remains high. A similar expression pattern was observed for the tomato LeACO1, as reported by Alba et al. (2005). The authors detected the highest transcript levels around the breaker phase, which, in our conditions, corresponds approximately to the August sample (pre-climacteric phase). Since the creation of high-quality apple varieties necessarily involves the creation of disease-resistant cultivars, it is worth discussing our results concerning the way plants respond to stress. Our microarray data show that the expression level of many genes belonging to the “response to stress” and to the “response to abiotic or biotic stimuli” categories significantly changes during fruit development. This result differs from what was reported by Janssen et al. (2008), who did not find stress-related transcripts to be predominant in their survey. The discrepancy between the two studies could be partially explained by the use of a different cultivar. In fact, while Prima, a disease-resistant cultivar, was employed in our experiments, Janssen and co-workers utilised Royal Gala, a variety susceptible to the most common apple diseases. It is therefore possible that the ability of a disease-resistant cultivar to detect the presence of a pathogen leads to the activation or repression of the genes involved in the resistance, while in a susceptible cultivar such activation does not take place.

We are confident that the development of molecular markers from the differentially expressed genes, currently under way in our lab (Perini et al., in preparation), will make it possible to establish the involvement of those genes in fruit quality and disease resistance.

cDNA microarray data validation

Because of the error-prone nature of any high-throughput technology, microarray data need to be experimentally validated by an alternative method, such as quantitative RT-PCR (Clarke and Zhu 2006), which is more sensitive than microarray analysis. Of the 26 microarray results selected for validation, 18 (69%) were confirmed, while eight (31%) showed contrasting results. The incomplete validation of the microarray results is not unexpected: Schaffer et al. (2007) and Janssen et al. (2008) obtained similar validation results in oligo-microarray experiments on apple fruits. It is known that non-specific background signals or cross-hybridisation between gene-family members may lead to erroneous or at least unreliable data in microarray experiments. A distinctive feature of plant genomes is the presence of a large number of multigenic families. Since the cDNA clones present on our microarray slides were not sequenced beforehand, members of multigenic families are certainly present. Furthermore, it is not possible to predict the reliability of the microarray results based on the fact that a gene belongs to a multigenic family. As an example, we discuss two cases, one where microarray and qRT-PCR data showed contrasting results and the second where microarray expression data were validated by qRT-PCR. Asparagine synthetase mediates the synthesis of asparagine by transferring the amide group of glutamine (or ammonia) to aspartate. Asparagine serves as a major nitrogen transport and storage compound in many higher plants. Previous studies on asparagus (Asparagus officinalis L.) demonstrated that the transcript abundance of asparagine synthetase (GD254893 belonging to contig_33, cluster VI) is regulated by sugar level (sucrose, glucose or fructose). Specifically, it is low when the sugar content is high (Davies et al. 1996). In our cDNA microarray experiments, the asparagine synthetase transcript displays a significant decrease in August, when compared to May.
Since sugar content in August is much higher than in May, we hypothesised that the same kind of regulation described for asparagus could be effective in apple, too. Unfortunately, qRT-PCR did not confirm the expression trend observed in the microarray data: it showed no significant difference between May and August transcript levels. A BLAST search in the public sequence database revealed the existence in apple of two independent transcripts encoding asparagine synthetase, which, due to their high sequence similarity, both hybridise to the sequence present on the microarray slide. The higher specificity of the primers used in the qRT-PCR experiments led to amplification of only one gene, whose expression did not change significantly from May to August. Therefore, it is likely that the cDNA microarray data report the combined expression of the two genes, which “on average” appeared to be down-regulated in August. Evidence about the possible role of asparagine synthetase in apple fruit development is currently not available. However, further investigations on the expression of the second asparagine synthetase gene would be justified only if the gene proved to be a good candidate, co-localising with a QTL controlling fruit quality.
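The reasoning above can be made concrete with a small numerical sketch. All values below are invented for illustration and are not measurements from this study; the names gene_a and gene_b are hypothetical labels for the two asparagine synthetase transcripts, and the sketch assumes that the probe hybridises equally well to both. It shows how a probe that sums the signal of two paralogs can report a clear decrease while a gene-specific assay on one of them reports almost no change.

```python
import math

# Hypothetical relative transcript abundances (arbitrary units).
# gene_a stands for the paralog amplified by the qRT-PCR primers,
# gene_b for the second paralog that also binds the array probe.
expression = {
    "gene_a": {"May": 100.0, "August": 95.0},
    "gene_b": {"May": 300.0, "August": 60.0},
}

def log2_ratio(august, may):
    """Log2 fold change of August relative to May."""
    return math.log2(august / may)

def probe_signal(month):
    """Signal of a probe assumed to hybridise equally to both paralogs."""
    return sum(gene[month] for gene in expression.values())

# The array sees the summed signal; qRT-PCR sees gene_a only.
microarray_log2 = log2_ratio(probe_signal("August"), probe_signal("May"))
qpcr_log2 = log2_ratio(expression["gene_a"]["August"],
                       expression["gene_a"]["May"])

print(f"microarray (both paralogs): {microarray_log2:.2f}")
print(f"qRT-PCR (gene_a only):      {qpcr_log2:.2f}")
```

With these made-up numbers the combined signal falls by more than twofold, whereas the gene-specific measurement stays essentially flat, which is the pattern of disagreement described for asparagine synthetase.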

The second example concerns beta-cyanoalanine synthase, which catalyses the synthesis of cysteine, a precursor of methionine (required for ethylene biosynthesis), and is involved in cyanide metabolism. Its relevance in fruit maturation becomes clear once we consider that the main source of cyanide is the oxidation of 1-aminocyclopropane-1-carboxylic acid, a reaction of the ethylene biosynthetic pathway (Maruyama et al. 2001). Ethylene production is therefore tightly connected to, and dependent on, the ability of the plant to efficiently detoxify cyanide (Han et al. 2007). The increase in ethylene biosynthesis during fruit ripening would lead to the accumulation of cyanide, a highly toxic nitrogen compound, if it were not immediately degraded by beta-cyanoalanine synthase. Microarray and qRT-PCR data support this hypothesis, revealing an increase in the transcript level of beta-cyanoalanine synthase (GD254916, cluster II) during fruit development. cDNA microarray data show that its transcription is increased from August to September, with an expression pattern similar to the one observed for MdACO1. We decided to compare the expression levels observed in May and August by qRT-PCR. The results confirmed a trend of expression similar to that seen in the microarray experiments, even though the increase in expression measured by qRT-PCR in the August sample is much higher than the one revealed by microarray. An in silico analysis showed that at least four apple genes for beta-cyanoalanine synthase might hybridise to the probe present on the microarray slide, while only two of them could be amplified by the primers used in qRT-PCR. In this case, since the data obtained with the two approaches are in agreement, we cannot know whether all four members of the gene family have a similar expression pattern or whether just one member is responsible for the variation observed in the microarray and qRT-PCR experiments.

Although the cDNA microarray analysis performed in this work could be considered inadequate when the final aim is to provide an exhaustive picture of the biological processes occurring during fruit development, we believe that it is suitable for giving valuable insights into the metabolic pathways contributing to fruit development, leading to the identification of several key genes. A parallel work, aimed at the development of molecular markers from the differentially expressed genes and at the identification of their position on the apple genetic map, is currently under way (Perini et al., in preparation). The co-localisation of QTLs controlling fruit quality traits with functional markers will make it possible to identify candidate genes which combine a modulated expression during fruit development, a molecular function fitting the trait under investigation and a correct map location. Although the final evidence proving that a candidate gene is responsible for the QTL could come only from genetic transformation, interesting clues can derive from studies investigating the allelic variation of those genes and its correlation with the phenotype of the trait of interest. In any case, markers developed from expressed sequences will immediately be a valuable resource for apple breeding, allowing early selection for complex traits related to fruit quality.