Introduction

Color development of fleshy fruits in many plant species depends on differentiation and function of pigment-rich chloroplasts and/or chromoplasts derived from plastids (Spiegel-Roy and Goldschmidt 1996). One of the best-studied systems of color variation and fruit pigmentation is tomato (Solanum lycopersicum). Ripe tomato fruits contain significant amounts of carotenoids and are the principal dietary source of lycopene and β-carotene in human diet (Giovannoni 2004), of which the β-carotene is a pro-vitamin A compound, and lycopene is important for reducing risk of chronic disease (Key et al. 2002).

Division plays a central role in plastid differentiation and development because plastids arise by binary fission of pre-existing plastids in the cytosol rather than de novo. Plastid division in higher plants is a complex process involving organelle constriction with assembly and expansion of membrane networks (Maple and Møller 2007). Earlier studies have shown that plastid division is performed by the constriction of a ring-like division complex at the division site, spanning both inside and outside of the two envelopes (Osteryoung and Vierling 1995). Recent studies have discovered that FtsZ, a prokaryotic tubulin-like GTPase within the plastid, is directly involved in plastid division (Mori et al. 2001; Yoder et al. 2007; Schmitz et al. 2009). Prokaryotic FtsZ can spontaneously self-assemble into an FtsZ-ring (Z-ring) at the mid cell division site (Osawa et al. 2008). In plants and green algae, two nuclear-encoded forms of FtsZ, FtsZ1 and FtsZ2, are components of the plastidic Z-ring and function in plastid division (Vitha et al. 2001; Maple and Møller 2007). In addition to FtsZ, several other proteins are also involved in chloroplast division. For example, the Min (Minicell) system/complex, comprising MinC, MinD, and MinE (Åkerlund et al. 2002; Margolin 2005; Maple and Møller 2007; Loose et al. 2008), and ARC3 (Shimada et al. 2004) play an important role in regulating placement of the plastid division site. ARC6 (an ortholog of cyanobacterial division protein Ftn2) directly modulates Z-ring assembly and dynamics (Vitha et al. 2003). The plastid division proteins PDV1 and PDV2 are localized to medial plastidic rings and recruit the dynamin ARC5 to the division site (Miyagishima et al. 2006; Glynn et al. 2008). ARC5 (also called DRP5B) is a member of the dynamin superfamily of eukaryotic membrane-remodeling GTPases, which form spiral-like structures that twist to pinch membranes in later stages of plastid division (Gao et al. 2003; Miyagishima et al. 2003).

Although numerous components have been demonstrated to be involved in plastid division, little is known about the mechanisms that regulate total plastid volume/number in cells. The strict maintenance of plastid populations in dividing cells and the alteration of plastid number and function in response to internal and external signals suggest that the cellular regulation of the plastid division must be fine-tuned (Jones et al. 2002; Raynaud et al. 2005; Galpaz et al. 2008). In particular, plant hormones such as abscisic acid (ABA) and auxin are important for plastid division. For example, the ABA deficiency in the tomato mutant high-pigment 3 (hp3) leads to an increased plastid number and more fruit lycopene (Galpaz et al. 2008). The tomato auxin-responsive factor DR12, which is developmentally regulated and nucleus-localized, has been demonstrated to regulate cell division and plastid division simultaneously (Jones et al. 2002). Another molecular component linking the cell and plastid division cycles is CDT1, a cyclin-dependent kinase that forms a part of the pre-replication complex (Raynaud et al. 2005). Recently, it has been shown that U (Uniform ripening) gene, encoding a Golden 2-like (GLK) transcription factor SlGLK2, is involved in chloroplast development and fruit pigmentation in tomato (Powell et al. 2012).

In tomato, mutants with enhanced pigmentation such as hp1 and hp2 have been identified (Kendrick et al. 1997). The HP1 gene encodes a protein homologous to UV-DAMAGED DNA BINDING PROTEIN-1 (DDB1) (Lieberman et al. 2004; Liu et al. 2004). The Arabidopsis counterpart, DDB1a, interacts with DE-ETIOLATED-1 (DET1) both biochemically and genetically (Schroeder et al. 2002). DET1 was first identified in Arabidopsis as a negative regulator in photomorphogenic responses (Pepper et al. 1994). Its tomato homologue HIGH PIGMENT-2 (HP2) has been implicated in playing a negative role in fruit, leaf and seedling pigmentation (Mustilli et al. 1999; Davuluri et al. 2004, 2005). Increased chloroplast number and size have been found in fully expanded leaf and fruit pericarp cells from the hp1 mutant plant, suggesting a role of HP1/DDB1 and HP2/DET1 proteins in plastid biogenesis and development (Yen et al. 1997; Cookson et al. 2003; Kolotilin et al. 2007; Wang et al. 2008). Although the increased plastid compartment per cell is thought to contribute to the remarkably elevated accumulation of chlorophyll and carotenoid in hp mutants (Yen et al. 1997; Mustilli et al. 1999; Levin et al. 2003; Lieberman et al. 2004; Liu et al. 2004; Wang et al. 2008), the molecular aspects of how HP1/DDB1 manipulates plastid biogenesis and development in tomato fruits remain largely unknown.

RNA-Seq provides a powerful tool for transcriptome profiling, relying on deep-sequencing technology (Clark et al. 2002; Wang et al. 2009). RNA-Seq can produce millions of relatively short reads from the ends of cDNAs derived from fragments of sample RNA, and supports a number of transcriptome analyses, including transcript quantification (Katz et al. 2010; Trapnell et al. 2010; Nicolae et al. 2011), differential expression testing (Anders and Huber 2010; Robinson et al. 2010), reference-based gene annotation (Trapnell et al. 2010), and discovery of new transcription start sites (TSSs) and alternative splicing events (Trapnell et al. 2010). For example, RNA-Seq was used to gain insight into the wide range of transcriptional responses that are associated with berry development in Vitis vinifera (Zenoni et al. 2010). The results have demonstrated the utility of RNA-Seq for identifying SNPs and splice variants, and for describing how plant transcriptomes change during development (Zenoni et al. 2010).

In the current study, we carried out an RNA-Seq-based comprehensive transcriptional profiling of pericarp tissue of immature fruits (14 days after pollination, DAP) obtained from the WT or hp1 mutant. Our data suggest that global gene expression profiling is an effective way to dissect the molecular basis by which HP1/DDB1 affects the plastid level and other physiological processes during fruit development in tomato.

Materials and methods

Plant materials

The seeds of tomato high pigment 1 (accession number: LA0279) and Ailsa Craig were provided by the Tomato Genetics Resource Center (TGRC: http://tgrc.ucdavis.edu/). Tomato plants, including the wild type (WT) Solanum lycopersicum cv. Ailsa Craig (AC+) and its nearly isogenic hp1 mutant, were grown under natural conditions on the farm of Sichuan University, Chengdu, China. Age-matched (14 days after pollination, DAP) fruits were harvested for total RNA isolation.

Construction of cDNA libraries for Illumina RNA-Seq

Fresh fruits were ground to a fine powder in liquid nitrogen. Total RNA was isolated using Trizol reagent (Tiangen, Beijing, China) and treated with RNase-free DNase I (Promega), according to the manufacturers’ protocols. The RNA Integrity Number (RIN) of both samples was > 8, as checked using an Agilent 2100 Bioanalyzer.

For the Illumina RNA-Seq, two cDNA libraries generated from pericarps of whole fruit sphere of 30 WT and hp1 fruits, respectively, were prepared according to the instructions for preparing mRNA sequencing samples (Illumina, San Diego, CA, USA). Subsequently, the cDNA libraries were sequenced using Illumina HiSeq 2000 platform. The raw sequence data were deposited in SRA database of NCBI with accession number SRA058770 (http://www.ncbi.nlm.nih.gov/geo/).

Mapping of RNA-Seq reads using TopHat

The genome sequences (SL2.40 version) and annotation files (ITAG 2.3 Release) of Solanum lycopersicum were downloaded from SOL Genomics Network database (SGN, http://solgenomics.net/organism/Solanum_lycopersicum/genome). After removing reads containing sequencing adapters and reads of low quality (reads containing Ns > 5), the remaining reads were aligned to the S. lycopersicum genome using TopHat (Trapnell et al. 2009), allowing up to two base mismatches. Default parameters for TopHat were used. The mapping results were then used to identify “islands” of expression, which can be interpreted as potential exons. TopHat builds a database of potential splice junctions and confirms these by comparing the previously unmapped reads against the database of putative junctions.
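The read-filtering rule stated above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline actually used: the function name `keep_read`, the example adapter string and the toy reads are hypothetical, and only the "more than five Ns" threshold comes from the text.

```python
# Adapter string used purely for illustration (an assumption, not from the text).
EXAMPLE_ADAPTER = "AGATCGGAAGAGC"


def keep_read(seq, adapter=EXAMPLE_ADAPTER, max_n=5):
    """Return True if a read passes both filters described in the text."""
    seq = seq.upper()
    if adapter in seq:
        # Discard reads that contain adapter sequence.
        return False
    if seq.count("N") > max_n:
        # Discard reads with more than max_n ambiguous bases.
        return False
    return True


# Toy reads: one clean, one with six Ns, one containing the adapter.
reads = ["ACGTACGTAC", "ACGNNNNNNT", "ACAGATCGGAAGAGCTT"]
clean = [r for r in reads if keep_read(r)]
```

A read with exactly five Ns is kept under this reading of the threshold, since the text excludes reads with Ns > 5.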

Reads that failed to map were progressively trimmed one base at a time from the 3′ end and mapped to the genome again until a match was found or the read had been trimmed to fewer than 27 bases. For paired-end reads, the insert size between paired reads was set to 1 bp to 5 kb, allowing them to span introns of varied sizes in the genome. The same strategy was used for aligning paired-end reads to the non-redundant genes, except that the insert size was changed to 1 bp to 1 kb.
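The progressive 3′ trimming strategy can be illustrated with a toy example. The sketch below assumes exact substring search against a plain string as a stand-in for a real aligner, and assumes that 27 bases is the shortest read length still tried; the function name `align_with_trimming` is hypothetical.

```python
def align_with_trimming(read, reference, min_len=27):
    """Trim one base at a time from the 3' end until the read matches.

    Matching here is exact substring search, a simplification of what a
    real aligner does. Returns (position, matched_length), or None if no
    prefix of at least min_len bases matches.
    """
    length = len(read)
    while length >= min_len:
        pos = reference.find(read[:length])
        if pos != -1:
            return pos, length
        length -= 1  # drop one base from the 3' end and retry
    return None


# Toy reference and a read whose last three bases do not match.
reference = "T" * 10 + "ACGT" * 10 + "G" * 10
read = ("ACGT" * 10)[:30] + "CCC"
hit = align_with_trimming(read, reference)
```

In this toy case the read matches after three bases have been removed, so `hit` reports a 30-base match starting at reference position 10.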

Transcript assembly and abundance estimation using Cufflinks

The aligned read files were processed by Cufflinks v0.8.0 (Trapnell et al. 2010). Cufflinks constructed a minimum set of transcripts that describes the reads in the sequencing dataset, without using existing gene annotation during transcript assembly. This approach allows Cufflinks to identify alternative transcription and splicing events that are not described by preexisting gene models (Trapnell et al. 2010). Cufflinks uses the normalized RNA-Seq fragment counts to measure the relative abundances of transcripts. The unit of abundance measurement is Fragments Per Kilobase of exon per Million fragments mapped (FPKM). Confidence intervals for FPKM estimates were calculated using a Bayesian inference method (Jiang and Wong 2009).
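The FPKM unit can be made concrete with a short calculation. The function below implements only the plain count-based definition of the unit; Cufflinks itself estimates abundances with a statistical model, so this is a simplified sketch, and the function name and the numbers in the example are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon per Million fragments mapped.

    FPKM = fragments / (exon length in kb) / (mapped fragments in millions)
         = fragments * 1e9 / (exon length in bp * total mapped fragments)
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical transcript: 500 fragments on 2,000 bp of exon sequence,
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2000, 20_000_000)
```

Normalizing by both transcript length and library size is what makes values comparable between genes and between the two libraries.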

Gene ontology analysis

We obtained the Gene Ontology (GO) terms of each tomato gene by the software Blast2GO (version 2.5.0) (Conesa et al. 2005) using the default parameters. Blast2GO was also used for GO functional enrichment analysis of certain genes, by performing Fisher’s exact test with robust false discovery rate (FDR) correction to obtain an adjusted P value between certain test gene groups and the whole annotation.
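The enrichment test described above can be sketched with the standard library alone. This is a simplified stand-in for what Blast2GO computes: it uses a one-sided Fisher's exact test (the hypergeometric upper tail) and, as an assumption, the Benjamini-Hochberg procedure in place of the "robust" FDR correction, whose details are not given in the text. All function names and numbers are hypothetical.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """One-sided enrichment p-value, P(X >= k), X ~ Hypergeometric(N, K, n).

    k: test-set genes carrying the GO term, n: test-set size,
    K: reference genes carrying the term, N: reference size.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 2 of 4 test genes carry a term found in 3 of 10 reference genes.
p_term = fisher_right_tail(2, 4, 3, 10)
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A term would then be reported as enriched when its adjusted value falls below the chosen cutoff.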

Validation of novel untranslated regions, alternative splicing events, novel transcripts, and altered gene expression

The Cufflinks transfrags that did not have a complete match or “containment” relationship with the SGN annotations (ITAG 2.3 Release) were selected for validation of novel untranslated regions, according to previously described methods (Trapnell et al. 2010). Four novel transcripts and four alternatively spliced isoforms were chosen as cases for validation. The gene-specific primers designed for the RT-PCR validation experiments are listed in Supplemental Table S1.

To test the differential expressions between WT and hp1 mutant immature fruits, thirty genes were randomly chosen as cases for analyzing expression changes using semi-quantitative RT-PCR. The internal control gene is UBI3 (GenBank accession X58253) (Wang et al. 2008) and gene-specific primers were listed in Supplemental Table S2.

cDNA synthesis and RT-PCR analysis

The same samples of total RNAs extracted for the RNA-Seq experiment were utilized to synthesize the first-strand cDNA required for the subsequent analyses. cDNA synthesis was performed with 5 min at 65 °C, 45 min at 42 °C, and 5 min at 94 °C using High Efficient Reverse Transcription Kit ReverTra Ace (Toyobo, Osaka, Japan).

The PCR was performed using EasyTaq DNA polymerase (Transgen Biotech, Beijing, China), 1.2 mM MgCl2, and 20 pmol of each primer. The PCR program was as follows: 5 min at 95 °C; the optimal number of cycles (28 for semi-quantitative RT-PCR and 30 for RT-PCR) of 30 s at 95 °C, 30 s at the optimal annealing temperature (53–60 °C), and the optimal extension time (30 s to 1 min) at 72 °C; followed by 10 min at 72 °C. The PCR products were each loaded in a separate lane of a 1.0 % agarose gel, post-stained with ethidium bromide and visualized on a UV transilluminator.

Real-time RT-PCR

The expression of eight genes that encode proteins putatively related to chlorophyll metabolism and plastid division was validated by real-time RT-PCR analysis. The real-time RT-PCR was performed using SsoFast EvaGreen Supermix (Bio-Rad). The PCR reactions were performed in StepOne Real-Time PCR System (Applied Biosystems). The PCR conditions were as follows: 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s, 60 °C for 20 s. Melting curve analyses of amplification products were performed at the end of each PCR reaction to ensure that unique products were amplified. The gene-specific primers were listed in Supplemental Table S2. The expression level was normalized to the internal control gene UBI3 (GenBank accession X58253), using the 2^(−ΔΔCt) method (Livak and Schmittgen 2001).
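The normalization of Livak and Schmittgen (2001) reduces to a short calculation, sketched below. The function name and the Ct values in the example are hypothetical and are not taken from this study; the reference gene plays the role of UBI3 and the calibrator sample the role of WT.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method."""
    # dCt: target Ct normalized to the reference gene within each sample.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # ddCt: test sample relative to the calibrator sample.
    delta_delta = delta_sample - delta_control
    return 2 ** -delta_delta


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# mutant than in WT, while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
```

The calculation assumes that target and reference amplify with similar, near-complete efficiency, which is the usual precondition for this method.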

Statistical analysis

The reads were filtered to remove those containing sequencing adapters or of unreliably low quality (reads containing Ns > 5). GO terms for each tomato gene were obtained using Blast2GO (version 2.5.0) with the default parameters, and enrichment was assessed using Fisher’s exact test with robust false discovery rate (FDR) correction. To compare the FPKM-derived read counts, Student’s t-test was used.

For the real-time RT-PCR analysis, each bar represents three repetitions from each RNA sample (derived from pools of at least three fruits per plant). Statistical analysis was performed using Student’s t-test (*P < 0.05, **P < 0.01). Similar results were obtained in at least two independent experiments.
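For three repetitions per sample, the two-sample Student's t-test can be written out directly. The sketch below computes only the pooled-variance t statistic and its degrees of freedom; the P value would then be read from the t distribution (for 4 degrees of freedom the two-sided critical values are 2.776 at P = 0.05 and 4.604 at P = 0.01). The function name and measurements are hypothetical.

```python
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance of the two groups.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical relative expression values, three repetitions each.
mutant = [2.0, 2.2, 2.4]
wild_type = [1.0, 1.1, 0.9]
t_stat, dof = student_t(mutant, wild_type)
```

In this toy case the statistic exceeds 4.604 with 4 degrees of freedom, so the difference would be marked with two asterisks under the convention above.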

Results

RNA-Seq sequencing and mapping

To illuminate the transcriptional landscape of the WT (S. lycopersicum cv. Ailsa Craig, AC+) and hp1 mutant, RNA samples from pericarps of whole fruit sphere of thirty age-matched immature fruits (14 DAP) from 15 plants grown under the same natural condition were obtained. The immature fruits (14 DAP) were selected for analyses because they are at an active state of plastid division and development (Raynaud et al. 2005). We generated one cDNA library from each genotype for sequencing using HiSeq 2000 at BGI-Shenzhen (Shenzhen, China). Each cDNA library yielded over 20 million sequence reads (Table 1), representing > 1.6 Gb sequence data per sample.

Table 1 Summary of RNA sequencing, mapping, assembly and annotation

After removing reads containing sequencing adapters or with low quality (reads containing Ns > 5), we mapped the sequencing reads to the tomato genome sequence (SL2.40 version) using TopHat, allowing at most two mismatches (Trapnell et al. 2009). In this study, the latest gene annotation (ITAG 2.3 release) of the tomato genome was used to generate a splice junction library. Of the sequencing reads from WT and hp1, more than 83 % could be aligned to the tomato genome, with over 82 % aligned to unique locations. In addition, 55.57 % of WT and 56.76 % of hp1 sequencing reads were mapped to known exons (Table 1). Unmapped or multi-position matched reads (approximately 0.8 %) were excluded from further analyses in each sample (Table 1). Of the uniquely mapped reads, 16.34 % of WT and 17.05 % of hp1 reads were aligned to splice junctions, respectively.

Transcripts assembly and annotation

All the uniquely mapped sequence reads were assembled into gene transcripts using Cufflinks (version 6.02). In total, we obtained 27,525 and 27,184 transcripts from WT and hp1, which are encoded by 21,972 and 22,167 gene models, respectively (Table 1). The current official annotation (ITAG 2.3 Release) of the S. lycopersicum genome (SL2.40 version) contains 34,727 gene models. It appeared that, in both genotypes, over 60 % of genes overlap with or contain the known gene models. In addition, 18,909 tomato genes were expressed in WT or hp1, of which about 71 % (13,468/18,909) could be detected in both genotypes (Fig. 1).

Fig. 1

The statistics of genes expressed in WT and hp1 immature fruits. The Venn diagram shows the commonly and specially expressed genes in WT and hp1, respectively, and the overlap statistics between the expressed genes from WT, hp1, and the predicted official gene model of SGN (http://solgenomics.net)

All transcripts were queried against NCBI nr (non-redundant) databases for annotation. The annotated results were listed in Supplemental Table S3. Approximately 86 % of WT and 87 % of hp1 sequence entries produced significant hits (E ≤ 1 × 10^−5) (Table 1). The sequences with no significant hit versus the nr database were queried against the Pfam database to determine their putative function. Only a small fraction of these sequences from WT or hp1 mutant (≈ 2 %) produced significant hits (E values ≤ 1 × 10^−5) to known protein domains.

For the WT and hp1 transcripts, the best hit for each transcript queried against the nr database was utilized to assign functional Gene Ontology (GO) annotation in terms of biological process, molecular function and cellular component, respectively. The results of GO distribution were summarized in Table 1 and Fig. 2a. Using the Fisher’s exact test, GO functional enrichment analysis indicated there is no significant difference in GO annotations for biological process and cellular component between WT and hp1 mutant. Only six GO terms for molecular function, including cycloartenol synthase activity (GO:0016871), L-ascorbate peroxidase activity (GO:0016688), antioxidant activity (GO:0016209), sodium-hydrogen antiporter activity (GO:0015385), 11-beta-hydroxysteroid dehydrogenase (NADP+) activity (GO:0070524) and 11-beta-hydroxysteroid dehydrogenase [NAD(P)] activity (GO:0003845), were found significantly enriched (FDR-adjusted P-value < 0.05) in the transcriptome of hp1 mutant (Fig. 2b).

Fig. 2

Gene Ontology (GO) analysis of the genes expressed in WT and hp1 immature fruits. a Pie charts display the distributions of the categories of the total WT and hp1 by biological process, molecular function and cellular component, respectively. b Differential distribution of six significantly enriched GO-terms in the WT and hp1 mutant (P < 0.05). Column height reflects the relative percentages in WT (blue) and hp1 mutant (red)

Characterization of untranslated regions (UTRs)

UTRs are known to play important roles in the post-transcriptional regulation of gene expression, influencing mRNA transport, subcellular localization and turnover (Mignone et al. 2002). Because UTRs contain important regulatory motifs, identification of the untranslated boundaries is crucial for understanding of the post-transcriptional regulation in S. lycopersicum. Therefore, identification of UTRs of S. lycopersicum genes has the potential to improve the genome annotation and can provide important insights into the gene regulatory mechanisms underlying the fruit development and pigmentation.

We mapped the 5′- and 3′- end boundaries (5′-, 3′- UTRs) of S. lycopersicum genes by searching for a sharp reduction in RNA-Seq read signals at both ends of predicted gene models. Those genes whose 5′- or 3′- boundaries overlap with other genes were excluded from this UTR analysis. According to this RNA-based mapping, the 5′- boundary regions were identified in 12,603 WT transcripts and 12,342 hp1 transcripts, whereas 12,787 WT and 13,330 hp1 transcripts were found possessing 3′- boundary regions (Supplemental Table S4). Among these data, the 5′- boundary regions were extended in 8,833 WT transcripts and 8,263 hp1 transcripts, whereas 8,510 WT and 8,939 hp1 transcripts were extended in 3′- boundary regions (Supplemental Table S4). In all the tomato transcripts, the median lengths for 5′- and 3′- UTRs were 126 and 277 bp, respectively (Fig. 3a).
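The idea of locating a boundary at a sharp reduction in read signal can be illustrated on a per-base coverage profile. The sketch below is a deliberately simple, hypothetical rule (extend the annotated end while read depth stays at or above a fixed minimum); the function name, the depth threshold and the coverage values are all assumptions, and the criterion actually applied in the analysis is not specified in the text.

```python
def extend_boundary(coverage, annotated_end, min_depth=2):
    """Extend an annotated 3' end outward while read depth stays high.

    coverage: per-base read depths along the transcribed strand.
    annotated_end: index of the last annotated base.
    Returns the index of the last base before depth falls below min_depth.
    """
    end = annotated_end
    while end + 1 < len(coverage) and coverage[end + 1] >= min_depth:
        end += 1
    return end


# Toy coverage profile: the gene model ends at index 5, but reads
# continue at high depth for three more bases before dropping sharply.
coverage = [0, 0, 12, 15, 14, 13, 11, 9, 8, 1, 0, 0]
new_end = extend_boundary(coverage, annotated_end=5)
extension_bp = new_end - 5
```

The same walk, run in the opposite direction from the annotated start, would give the 5′ boundary.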

Fig. 3

UTRs’ length distribution and PCR validation in six randomly chosen genes. a Scatterplot and histograms showing the length distributions of 5′- and 3′- UTRs. b Significantly enriched GO terms distribution with longer UTRs (≥200 bp, right) and shorter UTRs (<200 bp, left) in WT and hp1 mutant (FDR-adjusted P < 0.05). c RT-PCR validation of six randomly chosen genes with extended 5′ UTRs in WT and hp1 mutant. Primers A and B represent the positive primers, designed for the target gene sequence in official gene models of SGN (http://solgenomics.net) and determined by RNA-Seq, respectively. Primer C represents the negative primer of the target gene

Moreover, we compared the UTR length distributions for various GO functional categories. The GO enrichment analysis of transcripts with 5′- UTRs longer than 200 bp indicated that these genes were significantly enriched in 35 GO terms in both genotypes, such as responses to external stimulus, regulation of gibberellic acid and ethylene-mediated signaling pathway, secondary growth, and plant-type cell wall biogenesis (FDR-adjusted P-value < 0.05) (Fig. 3b). In contrast, genes in another sixteen GO terms, such as cullin deneddylation, amino acid biosynthetic process, oxidation reduction, ribosome biogenesis and translation, had significantly shorter UTRs (< 200 bp) (Fig. 3b).

To validate the extended 5′- UTRs determined by RNA-Seq, we randomly selected six transcripts’ 5′- UTRs for verification. As shown in Fig. 3c, the extension of the six tested 5′- UTRs was confirmed by RT-PCR amplification using the 5′-UTR-specific primers.

Identification of splice variants

Alternative splicing (AS) plays a major role in generating protein diversity and subsequent functional complexity (Modrek and Lee 2002; McGuire et al. 2008). Seven conventional AS types have been defined: A5SS (alternative 5′ splice sites), A3SS (alternative 3′ splice sites), SE (skipped exons), RI (retained intron), MXE (mutually exclusive exons), AFE (alternative first exons) and ALE (alternative last exons) (Wang et al. 2010). In the present study, five AS types were found in both the WT and hp1 transcripts (Table 2). In WT and hp1 immature fruits, 4,321 and 4,289 genes produced 8,322 and 7,989 AS transcripts, respectively (Table 2, Supplemental Table S5). Among these AS types, A5SS and A3SS were the predominant ones, accounting for over 60 % of the total AS events (Table 2). Our data indicate that in WT and hp1 immature fruits, 72.82 and 73.36 % of annotated transcripts contain two or more exons, and 18.14 and 16.78 % of transcribed genes, respectively, were estimated to be subjected to AS (Table 1). Moreover, 2,497 genes were detected to undergo AS in both genotypes, with 3,648 and 3,545 AS events in WT and hp1, respectively (Supplemental Table S5).

Table 2 List of genes undergoing alternative splicing

To verify the AS events detected by RNA-Seq, we randomly selected 4 genes with RI alternative splicing for RT-PCR analysis. As a result, the expected multiple PCR products were visualized (Fig. 4).

Fig. 4

PCR validation of genes undergoing alternative splicing. a The schematic depicts the AS types of the genes which are located in SL2.40ch01: 87199940-87202306. The black bar represents the constitutive exons, and the gray bar represents the alternative splicing exons. The arrows show the location of specific primers used in RT-PCR validation. b RT-PCR confirmation of the constitutive and alternative splicing events in four candidate genes in WT and hp1 mutant

Discovery of novel transcripts

RNA-Seq has been a useful tool to discover novel transcripts and genes, which can complement automated genome annotation efforts (Wang et al. 2009; Trapnell et al. 2010). By comparing against the official annotation (nr database), we found 1,136 and 1,133 novel transcripts with plausible CDSs and a recognizable exon–intron structure in WT and hp1 mutant, respectively (Supplemental Table S6). Furthermore, 942 novel transcripts were concomitantly detected in both WT and hp1. To validate the novel transcripts identified by RNA-Seq, we randomly selected four novel transcripts and confirmed their existence by RT-PCR using gene-specific primers. As a consequence, PCR products of expected size were observed. Sequence analyses demonstrated that these PCR products are indeed the expected products (Fig. 5).

Fig. 5

RT-PCR Validation of novel isoforms predicted by RNA-Seq. Four randomly selected novel transcripts were confirmed in both WT and hp1 mutant. The band in each lane represents expected size of PCR product

Analysis of differential gene expressions based on RNA-Seq data

To obtain statistical verification of the differential gene expression in the WT and hp1 mutant, we compared the FPKM (Fragments Per Kilobase of transcript per Million mapped reads)-derived read counts using Student’s t-test (Trapnell et al. 2010). Compared to those in WT, the expression levels of 880 genes were significantly altered in immature hp1 fruits, among which 548 and 332 genes were up- and down-regulated, respectively (P < 0.05) (Supplemental Table S7). To verify the differential levels of expression between WT and hp1 mutant detected by RNA-Seq, 30 genes (with or without statistical significance) (Supplemental Table S2, S7) were randomly selected for RT-PCR analysis. As a result, the expression differences of all the tested genes validated by semi-quantitative RT-PCR were largely in accordance with the RNA-Seq data (Fig. 6). More specifically, several genes, encoding ethylene receptor-like protein (ETR6), homeobox-leucine zipper protein, WRKY transcription factors 16 and 51, and pectinacetylesterase (PAE), were found to be markedly up-regulated in hp1 mutant (Fig. 6b, c).
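Alongside the significance test, the direction and size of an expression change between the two genotypes are commonly summarized as a log2 fold change of FPKM values. The sketch below is an illustrative summary only and is not the test used above; the pseudocount, the two-fold threshold, the function names and the FPKM values are all assumptions.

```python
from math import log2


def log2_fold_change(fpkm_mutant, fpkm_wt, pseudocount=1.0):
    """log2 ratio of mutant to WT FPKM, with a pseudocount to avoid log(0)."""
    return log2((fpkm_mutant + pseudocount) / (fpkm_wt + pseudocount))


def classify(fpkm_mutant, fpkm_wt, threshold=1.0):
    """Label a gene 'up', 'down' or 'unchanged' at a log2 threshold.

    threshold=1.0 corresponds to a two-fold change, an arbitrary choice here.
    """
    lfc = log2_fold_change(fpkm_mutant, fpkm_wt)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical FPKM values for one gene in hp1 and WT.
example_lfc = log2_fold_change(31.0, 7.0)
example_label = classify(31.0, 7.0)
```

A fold-change summary of this kind describes effect size only; whether a change counts as significant still depends on the statistical test.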

Fig. 6

Validation of differentially expressed genes by semi-quantitative RT-PCR. a Semi-quantitative RT-PCR validation of down-regulated genes in hp1 mutant compared to the WT. b, c Semi-quantitative RT-PCR validation of up-regulated genes in hp1 mutant compared to the WT

One of the primary goals of our transcriptome sequencing analyses was to identify genes responsible for the differential pigmentation between WT and hp1 mutant. Among the 880 genes, a number of genes involved in chloroplast photosynthesis were significantly up-regulated in hp1 mutant, including photosystem I reaction center subunit, chloroplast photosystem II subunit, ATP synthase F1 delta subunit, 50S ribosomal protein, 14-3-3, and elongation factor TU (Table 3, Supplemental Table S7). To verify the altered gene expression identified by RNA-Seq, we conducted RT-PCR assays on several genes that only showed marginal induction in hp1 mutant by the RNA-Seq assay, focusing on those genes involved in chloroplast division and development (Tables 3, 4). We selected four genes coding for chlorophyll biosynthesis, including aminolevulinate dehydratase (Alad, Solyc08g069030.2.1), magnesium chelatase subunit I (CHL I, Solyc10g008740.2.1), 8-vinyl reductase (DVR, Solyc01g067290.2.1) and chlorophyll synthetase (CHLG, Solyc09g014760.2.1) (Table 3), and two plastid division-related genes, FtsZ1 (Solyc07g065050.2.1) and ARC3 (Solyc04g017620.2.1) (Table 4). As shown in Fig. 7, quantitative RT-PCR analyses indicated that the expression of these genes was significantly enhanced in the hp1 mutant fruits. In addition, quantitative RT-PCR analyses were employed to determine expression alteration of two other genes important for control of photosynthesis and chloroplast compartment size. The results demonstrated that the expression of SlGLK2 (Solyc10g008160.2.1) encoding the Golden 2-like transcription factor (Powell et al. 2012) was increased in hp1 mutant (Fig. 7b; Table 4). In contrast, the expression of DR12 gene that encodes an auxin response factor negatively affecting the plastid compartment size (Jones et al. 2002) was decreased in hp1 mutant fruits (Fig. 7b; Table 4).

Table 3 Fold changes in transcription level of genes related to the chloroplast biogenesis in early development stage fruit of hp1 relative to WT
Table 4 Fold changes in transcription level of genes related to plastid division in early development stage fruit of hp1 relative to WT
Fig. 7

Real-time quantitative RT-PCR validation of differentially expressed genes related to chlorophyll metabolism and plastid/chloroplast division in 7, 14, 21 and 28 DAP fruits of WT and hp1 mutant. a Real-time quantitative RT-PCR analyses on four genes involved in chlorophyll metabolism, including Alad (aminolevulinate dehydratase), CHL I (magnesium chelatase subunit I), DVR (8-vinyl reductase), and CHLG (chlorophyll synthetase). b Real-time quantitative RT-PCR validation of four genes important for plastid/chloroplast division, including FtsZ1 (filamenting temperature-sensitive Z), ARC3 (accumulation and replication of chloroplast), DR12 (developmentally regulated), and SlGLK2 (Golden 2-like transcription factor). Each bar represents three repetitions from each RNA sample (derived from pools of at least three fruits per plant). Error bars representing standard errors are shown in each case. Statistical analysis was performed using Student’s t-test (*P < 0.05, **P < 0.01). Similar results were obtained in at least two independent experiments

Discussion

The transcriptome is comprised of a set of transcripts from a particular stage of organism, reflecting the specific transcriptional profiles of the genome. Due to the unprecedented level of sensitivity and the high throughput of deep sequencing technology, RNA-Seq is considered as a powerful approach for detecting large scale transcriptome profiles. Using this technology in the current study, we performed the transcriptome analyses for immature fruits of WT and hp1 mutant tomato plants (samples were collected from thirty fruits of different plants) and our results revealed the effect of dynamics of high pigment mutation on gene expression during early fruit development.

We conducted extensive statistical analyses of the obtained transcriptome datasets, including identification of splice variants, characterization of untranslated regions (UTRs), and identification of novel transcripts. In the immature fruits, 8,322 alternative splicing (AS) events derived from 4,321 genes of the WT and 7,989 AS events from 4,289 genes of the hp1 mutant, respectively, were detected (Table 2). The A5SS (alternative 5′ splice sites) and A3SS (alternative 3′ splice sites) were the two predominant AS types, accounting for over 60 % of the total AS events (Table 2, Supplemental Table S5), which is similar to that found in Arabidopsis, while the percentage of RI (retained intron) in Arabidopsis (approximately 30 %) is considerably greater than that in tomato (18.84 %) (Ner-Gaon et al. 2004; Kim et al. 2007). In addition, over 8,500 UTRs were extended in the WT immature fruits, whereas over 8,200 UTRs were extended in the hp1 immature fruits. Interestingly, compared to the average length of Arabidopsis 5′- UTR (208 bp) and 3′- UTR (99 bp) (Chung et al. 2006), the average length of tomato 5′- UTR (126 bp) is shorter, but the tomato 3′- UTR (277 bp) is longer. Significantly, our RNA-Seq has identified a large number of novel transcripts shared by or unique to the WT and hp1 mutant; 942 novel transcripts were found in both WT and hp1 (Supplemental Table S6); 1,136 and 1,133 novel transcripts were identified in WT and hp1 mutant, respectively. Further functional dissection of these putative novel transcripts will provide novel insights into the molecular mechanisms of tomato fruit pigmentation and development.

The photomorphogenic mutant hp1 is characterized by dark-green immature fruit because of substantially increased chloroplast compartment size and overproduction of chlorophyll (Jarret et al. 1984; Kendrick et al. 1997; Yen et al. 1997; Cookson et al. 2003; Bino et al. 2005), suggesting that the expression of genes involved in plastid (chloroplast precursor) biogenesis and development as well as chlorophyll metabolism might be altered by the hp1 mutation. This hypothesis was supported by our comprehensive transcriptome analyses. Sequencing datasets, as well as quantitative RT-PCR analyses, revealed that the expression of various genes involved in chlorophyll biosynthesis as well as plastid biogenesis and development was up-regulated in the hp1 immature fruits (Tables 3, 4; Fig. 7). Among them, aminolevulinate dehydratase is a cytosolic enzyme, which catalyzes the third step of tetrapyrrole synthesis leading to the formation of heme and chlorophylls in plant tissues (Berthe et al. 2003). A previous transgenic study also showed that the altered expression of CHL I gene (one of the subunits of active magnesium protoporphyrin IX chelatase, Mg-chelatase) resulted in the reduction of chlorophyll synthesis (Papenbrock et al. 2000). 8-vinyl reductase (DVR) is indispensable for monovinyl chlorophyll synthesis (Nagata et al. 2005). Chlorophyll synthase (CHLG) is important for the co-regulation of the entire pathway and the coordination of chlorophyll synthesis and the chlorophyll-binding proteins (Shalygo et al. 2009). Knockdown of CHLG expression led to down-regulation of genes coding for chlorophyll biosynthesis and chlorophyll-binding proteins, whereas overexpression of CHLG resulted in enhanced aminolevulinate (ALA) synthesizing capacity and more chelatase activities (Shalygo et al. 2009). Thus, we postulate that the elevated expression of genes involved in chlorophyll biosynthesis might, at least partially, contribute to the high pigmentation in hp1 mutant.

Consistent with the enlargement of the plastid/chloroplast compartment in hp1 immature fruit (Cookson et al. 2003; Liu et al. 2004; Wang et al. 2008), altered expression was found for several well-characterized genes involved in plastid biogenesis or in the regulation of plastid division and development. Two genes highly homologous to the nuclear-encoded FtsZ1 and ARC3 are up-regulated in hp1 immature fruits (Table 4; Fig. 7b). FtsZ1 is a critical component of the plastidic Z-ring and is essential for plastid biogenesis (Vitha et al. 2001; Maple and Møller 2007), while ARC3 plays an important role in regulating placement of the plastid division site (Shimada et al. 2004). Intriguingly, increased plastid division has been found in fruits of another high pigment mutant (hp3), in which the FtsZ1 transcript level is 60 % higher than that in the WT fruits (Galpaz et al. 2007). ARC3 also specifically interacts with FtsZ1, acting as a Z-ring accessory protein and defining a unique function for this family of FtsZ proteins (Maple et al. 2007). Overexpression of ARC3 leads to aberrant chloroplast division and greatly enlarged chloroplasts (Maple et al. 2007). In addition, our RNA-Seq dataset, supported by the quantitative RT-PCR analyses, demonstrated elevated expression of the U (uniform ripening) gene in the hp1 immature fruits (Powell et al. 2012). The U gene encodes a Golden 2-like transcription factor (SlGLK2) and is responsible for an increase in the number and size of green fruit chloroplasts and for accelerated accumulation and development of grana thylakoids (Powell et al. 2012). Taken together, our data suggest that the enhanced expression of positive components or regulatory proteins involved in plastid division and/or development is, at least in part, responsible for the increase in plastid/chloroplast compartment size in the hp1 immature fruits.
By contrast, DR12, which encodes an auxin-responsive factor and acts as a negative regulator, was found to be down-regulated in hp1 immature fruits (Fig. 7b). The decreased level of DR12 transcripts might largely account for the enlargement of plastid compartment size observed in hp1 immature fruits, which is consistent with previous observations that down-regulation of DR12 causes dark-green immature fruit with an increased chloroplast number in the fruit pericarp (Jones et al. 2002).

In addition to these genes involved in plastid/chloroplast division and development, some defense-related genes, including PAE and WRKY16/51, were observed to be highly up-regulated in the hp1 mutant (Fig. 6b, c). WRKY transcription factors are global regulators of host responses to pathogen infection; they can either positively or negatively regulate defense gene expression by modulating downstream target genes (Pandey and Somssich 2009; Huang et al. 2012). Notably, Arabidopsis AtWRKY16 plays a negative role in the defense response and hypersensitive cell death (Chi et al. 2013), and AtWRKY51 acts as a negative regulator of the jasmonic acid-inducible defense response (Gao et al. 2011). Pectinacetylesterase (PAE) hydrolyzes acetyl esters in the homogalacturonan regions of pectin (a major component of the plant primary cell wall), thereby modifying cell walls during the interaction with pathogens. Van den Berg et al. (2007) reported that PAE had significantly higher expression in the Fusarium wilt-susceptible banana cultivar Williams than in the resistant cultivar GCTCV-218, although PAE expression became higher in GCTCV-218 than in Williams after inoculation with Fusarium oxysporum. Based on these previous studies and our gene expression data, we speculate that the hp1 mutant, with its up-regulated PAE and WRKY16/51 genes, might be more susceptible to pathogens. This hypothesis is partially supported by our recent study, in which tomato DDB1 was implicated in pathogenesis-related (PR) gene expression and the hp1 mutant was more susceptible to Agrobacterium infection (Liu et al. 2012). However, further experiments are needed to identify the specific defense pathway that is deficient in the hp1 mutant.

In summary, our results help to clarify the dynamic processes underlying the differential transcript levels between WT and hp1 fruits during the early stage of development. Although gene expression analysis based on a single replicate cannot capture biological variation, the RNA-Seq analysis provided a view of the expression trends of genes involved in plastid division and development in the WT and the hp1 mutant. In particular, the current analysis indicates that HP1/DDB1 may regulate the transcription of a number of genes directly involved in plastid division and development as well as hormone responses, such as Uniform ripening, FtsZ1, ARC3 and DR12, and may subsequently influence plastid compartment size and fruit pigmentation.