Introduction

Tomato (Solanum lycopersicum) is one of the most important vegetable crops in the world due to its great nutritive and commercial value. It is also a model organism for studying fleshy fruit development and ripening (Klee and Giovannoni 2011), compound leaf development, floral system and plant architecture (Kimura and Sinha 2008), as well as defense response against abiotic and biotic stresses (Kennedy 2003; Sun et al. 2011). Tomato belongs to the family Solanaceae, which includes vegetable crops such as pepper (Capsicum annuum), eggplant (Solanum melongena) and potato (Solanum tuberosum). The tomato genome is considered a reference for solanaceous species because it is one of the smallest diploid genomes within the family and because gene order is highly conserved among species, particularly within the genus Solanum (Tomato-Genome-Consortium 2012). Therefore, the study of tomato genes is important because the knowledge obtained may be readily applied to other Solanaceae species.

Gene families are groups of similar genes that arise from a common ancestor through duplication and divergence. Many genes belong to gene families. In Arabidopsis thaliana, 41 % of the predicted proteins belong to gene families containing at least five members (The-Arabidopsis-Genome-Initiative 2000). In rice (Oryza sativa), 77 % of the predicted genes are found to have at least one paralog (Goff et al. 2002). The IQD/SUN, OFP (OVATE family protein) and YABBY gene families are characterized by the IQ67, OVATE and YABBY domain, respectively (Golz and Hudson 1999; Bowman 2000; Abel et al. 2005; Hackbusch et al. 2005). In tomato, three quantitative trait loci (QTLs) controlling fruit shape have been cloned: SUN, OVATE and FASCIATED (FAS) belonging to the IQD/SUN, OFP and YABBY gene families, respectively (Liu et al. 2002; Hackbusch et al. 2005; Cong et al. 2008; Xiao et al. 2008).

The cloning of SUN revealed that the elongated fruit phenotype is caused by a 24.7-kb gene duplication that placed SUN under the control of the promoter of a defensin (DEFL1) gene, leading to high expression in the fruit (Xiao et al. 2008; Jiang et al. 2009; Wu et al. 2011). Phenotypic analysis of SUN near isogenic lines shows that high SUN expression leads to fruit elongation by increased cell number in the longitudinal direction and reduced cell number in the transverse direction of the fruit. Overexpression of SUN results in slender cotyledons and leaflets as well as extremely elongated, seedless fruits (Wu et al. 2011). SUN encodes a protein containing the IQ67 domain (Abel et al. 2005). There are 33 and 29 genes encoding proteins with the IQ67 domain in Arabidopsis and rice, respectively (Abel et al. 2005). Overexpression of AtIQD1 (At3g09710) leads to glucosinolate accumulation in Arabidopsis (Levy et al. 2005). It was recently found that AtIQD1 interacts with both kinesin light chain-related protein-1 (KLCR1) and CaM/CMLs and recruits those proteins to the microtubules (Buerstenbinder et al. 2012). However, the function of other members of this family is unknown.

OVATE also controls tomato fruit elongation (Liu et al. 2002). A single mutation leading to a premature stop codon in the OVATE gene results in the transition of tomato fruit from round to pear-shaped. Overexpression of OVATE reduces the size of floral organs and leaflets; therefore, OVATE is considered to be a negative regulator of plant growth (Liu et al. 2002). CaOvate, an OVATE-like gene of Capsicum annuum, may play a similar role in fruit shape determination because it is expressed at higher levels in cv. “Mytilini Round” than in cv. “Piperaki Long”. Down-regulation of CaOvate through virus-induced gene silencing in cv. “Mytilini Round” changes its fruit to a more oblong shape (Tsaballa et al. 2011). OVATE encodes a protein with a 60–70 amino acid C-terminal domain termed the OVATE domain (Liu et al. 2002; Wang et al. 2007). In Arabidopsis, 18 genes encode OVATE domain-containing proteins, which are named Arabidopsis thaliana OVATE family proteins (AtOFPs) (Hackbusch et al. 2005; Wang et al. 2007). Most AtOFPs appear to function as transcriptional repressors in the transient Arabidopsis protoplast expression system (Wang et al. 2011). In a yeast two-hybrid screen, nine AtOFPs were found to interact with three-amino acid loop extension (TALE) homeodomain proteins (Hackbusch et al. 2005). AtOFP1 and AtOFP5 control the subcellular localization of one of the TALE homeodomain proteins, BLH1. When coexpressed with AtOFP1 and AtOFP5 in Nicotiana benthamiana leaves, BLH1 is relocated from the nucleus to the cytoplasmic space (Hackbusch et al. 2005). These results imply that the effect on growth is controlled by interactions of OFP with TALE homeodomain transcription factors and also by direct transcriptional repression of target genes. One such target gene is AtGA20ox1 (gibberellin 20-oxidase1, a gibberellin biosynthetic gene), whose expression is reduced by AtOFP1 overexpression. 
The reduced length of above ground organs is partially restored by application of gibberellin (Hackbusch et al. 2005; Wang et al. 2007). Besides interaction with TALE homeodomain proteins, AtOFP1 also interacts with AtKu, which is involved in DNA double-strand break repair (Wang et al. 2010). In another study, AtOFP5 acts as a negative regulator of BLH1-KNAT3 activity during early embryo sac development (Pagnussat et al. 2007) and AtOFP4 plays a role in secondary cell wall formation through its interaction with KNAT7 (Li et al. 2011a). Contrary to tomato OVATE, the analysis of loss-of-function alleles of OFPs in Arabidopsis suggests that these genes have redundant functions because single knock out mutants of AtOFP1, AtOFP4, AtOFP8, AtOFP10, AtOFP15 and AtOFP16 do not show morphological defects (Wang et al. 2011). In all, the OFP proteins might regulate plant growth and development by affecting transcriptional regulation of target genes either directly or indirectly.

In contrast to SUN and OVATE, which control elongated fruit shape, a mutation in FAS results in a flat tomato due to an increase in locule number (Lippman and Tanksley 2001; Barrero et al. 2006). The mutation is the result of an inversion that knocks out the likely ortholog of Arabidopsis YABBY2, and this mutation is found in several tomato accessions with a high locule number and flat fruit shape (Cong et al. 2008; Huang and van der Knaap 2011; Rodriguez et al. 2011). YABBY proteins have conserved roles in specifying abaxial cell fate in lateral organs such as leaves, floral organs and ovules, and establishing the proper boundaries in meristems (Golz and Hudson 1999; Bowman 2000). Arabidopsis has six YABBY gene family members (Golz and Hudson 1999; Bowman 2000). Four of them, FILAMENTOUS FLOWER (FIL, also called YAB1), YABBY2 (YAB2), YABBY3 (YAB3) and YABBY5 (YAB5), have overlapping functions in Arabidopsis leaf development based on the phenotype of their loss-of-function mutants (Stahle et al. 2009; Sarojam et al. 2010). The other two Arabidopsis YABBY genes, CRC and INO, are only expressed in floral organs (Bowman and Smyth 1999; Villanueva et al. 1999; Schmid et al. 2005). CRC is required for nectary specification and carpel polarity (Alvarez and Smyth 1999; Bowman and Smyth 1999), and INO is essential for development of the outer integument (Villanueva et al. 1999). A deletion mutant of the INO ortholog in sugar apple (Annona squamosa) was found in a spontaneous seedless mutant (Thai seedless; Ts) (Lora et al. 2011). There are eight YABBY genes in rice (Toriba et al. 2007). DROOPING LEAF has diverse roles in rice leaf development and homeotic transformations of floral organs (Yamaguchi et al. 2004; Ohmori et al. 2011; Li et al. 2011b). TONGARI-BOUSHI1 (OsYABBY5) is reported to control lateral organ development and regulation of meristem organization in the rice spikelet (Tanaka et al. 2012). 
Moreover, sorghum has three different mutations in the YABBY gene Shattering1 (Sh1), which result in the loss of seed shattering in domesticated sorghum (Lin et al. 2012).

Taken together, members of the IQD/SUN, OFP and YABBY gene families play important roles in plant growth and development and may also underlie additional fruit shape genes in tomato and other Solanaceae plants. However, except for SUN, OVATE and FAS, virtually no information is available about the members of these three gene families in tomato. In this study, we identified 34 Solanum lycopersicum SUN (SlSUN) genes, 31 Solanum lycopersicum OVATE family protein (SlOFP) genes and 9 Solanum lycopersicum YABBY (SlYABBY) genes, and determined their closest orthologs in Arabidopsis based on phylogenetic relationships. We also investigated their expression patterns in 11 different tissues from tomato’s closest wild relative, Solanum pimpinellifolium, from which it is thought to have been domesticated (Peralta et al. 2008; Tomato-Genome-Consortium 2012). Our results may provide important clues for understanding the roles of the SlSUN, SlOFP and SlYABBY genes in tomato growth and development, and this information could be extended to other plants.

Materials and methods

Plant material and tissue collection for expression analysis

Seeds of S. pimpinellifolium accession LA1589 were obtained from the C.M. Rick Tomato Genetics Resource Center, Davis, California, USA. Plants were grown under standard conditions with supplemental lighting in the greenhouse in Wooster, OH, USA. Over the span of a month, seven different tissue types were collected from 17 separate LA1589 tomato plants in the greenhouse between 9:00 a.m. and 10:00 a.m. and were pooled for each tissue type. The collected tissues were immediately frozen in liquid nitrogen. The tissues collected were newly developed leaves around 5 mm long, mature green leaflets, flower buds at 10 or more days before anthesis (DBA), flowers at anthesis, 10 days post anthesis (DPA) fruit, 20 DPA fruit and 33 DPA ripening fruit. The following tissues were collected from seedlings that had germinated and grown for 7 days in a Petri dish under grow lights: whole root, hypocotyl from below the cotyledons to above the root zone, cotyledons, and vegetative meristems (including leaf primordia).

Identification of SUN, OFP and YABBY genes in tomato

The IQ67 domain (Abel et al. 2005) of SUN was used to identify the members of this family in tomato; the OVATE domain (Liu et al. 2002; Hackbusch et al. 2005; Wang et al. 2007), also known as DUF623 domain (Domain-of-Unknown-Function 623, Pfam accession PF04844), was used to identify OFP genes; the YABBY domain (Pfam accession PF04690) of FAS was used to identify YABBY genes (Cong et al. 2008; Punta et al. 2012). With these domains as initial queries, systematic BLAST searches were performed on all sequences in the International Tomato Annotation Group (ITAG) Release 2.3 predicted proteins (2.40) (BLASTP, E value ≤1e−5), and tomato WGS chromosomes (2.40) (TBLASTN, OVATE domain and YABBY domain E value ≤1e−5; IQ67 domain E value ≤100) (SGN http://solgenomics.net). We identified nine genes that were not in the ITAG Release 2.3 database but appear to have protein coding potential based on annotation by FGENESH (http://linux1.softberry.com/berry.phtml). Initial evidence of transcription of all genes was based on their identification in the Lycopersicon Combined (Tomato) Unigenes and the Solanum peruvianum de novo transcriptome available at SOL Genomics Network (SGN, http://solgenomics.net), and on full-length cDNA sequences in the KaFTom database (http://www.pgb.kazusa.or.jp/kaftom/). Further evidence of transcription, including for genes that were not annotated in the latest release of the tomato genome, was based on the expression analysis presented in this study. Only genes with an average RPKM value ≥2 in at least one of the 11 tissues were considered to be expressed. The chromosomal location of SUN, OFP and YABBY genes was initially based on both their genetic map position using segregating populations (van der Knaap and Tanksley 2001) and their position on the tomato WGS Chromosomes (SL2.40) (SGN http://solgenomics.net). 
The sequences of AtIQD, AtOFP and AtYABBY proteins were downloaded from the Arabidopsis thaliana TAIR10 Protein database (ftp://ftp.arabidopsis.org/home/tair/Proteins/TAIR10_protein_lists/TAIR10_pep_20101214). Moss (Physcomitrella patens) IQD/SUN and OFP sequences, and grape (Vitis vinifera), poplar (Populus trichocarpa) YABBY sequences were downloaded from Phytozome v9.0 (http://www.phytozome.net/). The cucumber (Cucumis sativus) YABBY sequences were downloaded from Cucumber Genome DataBase (http://cucumber.genomics.org.cn/page/cucumber/index.jsp). The potato (S. tuberosum) YABBY sequences were downloaded from Solanaceae Genomics Resource (http://solanaceae.plantbiology.msu.edu/pgsc_download.shtml). The sitka spruce tree (Picea sitchensis) YABBY sequences were downloaded from Genbank (http://www.ncbi.nlm.nih.gov/genbank/).
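The expression cut-off described above (an average RPKM of at least 2 in at least one tissue) can be illustrated with a minimal sketch. The function name, the data layout and the example values are hypothetical and are not taken from the data set of this study.

```python
from typing import Dict, List

RPKM_CUTOFF = 2.0  # cut-off stated in the text


def expressed_genes(avg_rpkm: Dict[str, List[float]],
                    cutoff: float = RPKM_CUTOFF) -> List[str]:
    """Return the genes whose average RPKM reaches the cut-off in at least one tissue.

    avg_rpkm maps a gene name to its per-tissue average RPKM values
    (hypothetical layout, one value per tissue).
    """
    return sorted(gene for gene, values in avg_rpkm.items()
                  if any(v >= cutoff for v in values))


# Hypothetical values for illustration only (three tissues instead of 11).
example = {
    "geneA": [0.0, 0.4, 2.0],   # reaches the cut-off in one tissue
    "geneB": [1.9, 1.2, 0.0],   # always below the cut-off
    "geneC": [15.3, 0.0, 0.1],  # clearly expressed in one tissue
}
print(expressed_genes(example))  # ['geneA', 'geneC']
```

A gene is retained as soon as a single tissue reaches the cut-off, so tissue-specific genes are not penalized for being silent elsewhere.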

Multiple alignment and phylogenetic analysis

The IQ67 domain was defined as described (Abel et al. 2005). The OVATE and YABBY domains were defined using the Pfam program (http://pfam.sanger.ac.uk/). Multiple alignments of the three conserved domain sequences were performed with ClustalX 2.1 (Larkin et al. 2007) using default settings. The alignment results were exported to MEGA 5.0 (Tamura et al. 2011). Unrooted phylogenetic trees were constructed with the neighbor-joining (NJ) method, the JTT model and 1,000 bootstrap replicates. The identification of paralogous and orthologous relationships was based on their phylogenies, sequence similarity and all-against-all bidirectional best hits using SSEARCH (Smith and Waterman 1981; Pearson 1991).
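The bidirectional best hit criterion mentioned above can be sketched as follows. This is only an illustration of the general idea, assuming that pairwise alignment scores (for example from SSEARCH) have already been parsed into dictionaries; the type alias, function names and sequence identifiers are hypothetical, and the sketch does not reproduce the additional phylogeny and similarity criteria used in this study.

```python
from typing import Dict, Set, Tuple

# (query, subject) -> alignment score; hypothetical, pre-parsed input
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """For each query, keep the subject with the highest score.

    Ties are resolved in favour of the first pair encountered.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Set[Tuple[str, str]]:
    """Return the pairs (a, b) in which each sequence is the other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical scores: t1 and a1 are reciprocal best hits; t2 is not paired.
tomato_vs_arabidopsis = {("t1", "a1"): 90, ("t1", "a2"): 50,
                         ("t2", "a1"): 80, ("t2", "a2"): 40}
arabidopsis_vs_tomato = {("a1", "t1"): 88, ("a1", "t2"): 79,
                         ("a2", "t2"): 45, ("a2", "t1"): 44}
print(bidirectional_best_hits(tomato_vs_arabidopsis, arabidopsis_vs_tomato))
```

In the example, t2 finds a1 as its best hit, but a1 prefers t1, so only the pair (t1, a1) satisfies the reciprocal condition.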

RNA library construction

Total RNA was extracted with Trizol (Invitrogen Inc. USA) as described by the manufacturer or using a hot borate method (only for fruit at 20 DPA or 33 DPA) (Pang et al. 2011). RNA quantity and quality were assessed using a Qubit 2.0 fluorometer RNA Assay Kit (Invitrogen Inc. USA) and an Agilent 2100 Bioanalyzer RNA 6000 Nano kit (Agilent, USA). Strand-specific RNA-seq libraries of approximately 250 bp fragments were prepared using 10 μg total RNA (Zhong et al. 2011). Libraries were barcoded and pooled to represent six libraries from different tissues per lane on the flowcell. Sequences of 51 bp were generated on an Illumina HiSeq2000 at the Genomics Resources Core Facility at Weill Cornell Medical College (New York, NY, USA).

Alignment and analysis of Illumina reads

After Illumina reads were quality checked, demultiplexed and trimmed, they were clustered per library. The reads were aligned to ribosomal RNA sequences using Bowtie (Langmead et al. 2009), allowing for two mismatches, to identify rRNA contamination. The rRNA-filtered reads were then aligned with TopHat (Trapnell et al. 2009) against the S. lycopersicum genome, allowing for a maximum intron length of 5,000 bp, a segment length of 22 bp and 1 mismatch per segment. All other parameters were set to default. Reads that mapped to up to 20 genes were counted once for each match. Aligned sequences were then separated into sense and antisense, and the count of aligned reads for each tomato gene model and from each sample was derived using an in-house Perl script. This script also counted reads that partially mapped to the UTRs. Reads per kilobase of exon model per million mapped reads (RPKM) were calculated using an in-house script based on both the ITAG 2.3 exon lengths and the total number of reads that mapped to the tomato genome. For the expression analysis of selected genes in different tissues, the average RPKM values for each tissue type were shown. All raw reads were deposited in the NCBI sequence read archive with accession number SRA061767. The average RPKM values per sample for all genes can be found at http://ted.bti.cornell.edu/cgi-bin/TFGD/digital/home.cgi.
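For orientation, the read counting and RPKM calculation can be sketched as below. The in-house scripts are not available here, so this is a minimal sketch of the standard RPKM definition, RPKM = 10^9 × C / (N × L), where C is the number of reads assigned to a gene, L the total exon length in bp and N the total number of mapped reads. The function names and the input layout are hypothetical, and discarding reads that map to more than 20 genes is an assumption, since the text does not state how such reads were handled.

```python
from collections import Counter
from typing import Dict, Iterable


def count_reads_per_gene(read_hits: Dict[str, Iterable[str]],
                         max_genes: int = 20) -> Counter:
    """Count each read once for every gene it maps to.

    read_hits maps a read identifier to the genes it aligned to.
    Assumption: reads hitting more than max_genes genes are discarded.
    """
    counts: Counter = Counter()
    for genes in read_hits.values():
        unique = set(genes)
        if 0 < len(unique) <= max_genes:
            counts.update(unique)
    return counts


def rpkm(read_count: float, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical numbers: 500 reads on a 2,000-bp exon model, 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

Averaging the RPKM values of the samples that belong to one tissue type then gives the per-tissue value used for the expression comparisons.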

Results

The SUN genes in tomato

Identification of SUN genes in tomato

Twenty-nine genes encoding the entire IQ67 domain were identified in the ITAG database version 2.3. Four additional genes that potentially encoded other members of the SUN family were found in tomato WGS Chromosomes (SL2.40) (SGN http://solgenomics.net) and evaluated using the FGENESH program (http://linux1.softberry.com/berry.phtml). Three of them consisted of a different predicted CDS of Solyc01g009340 (SlSUN2), Solyc01g097490 (SlSUN4) and SL1.00sc00090_96 (SlSUN6) (Table 1; Fig. 1; Online source 1). The SUN gene on chromosome 7, which controls elongated fruit shape, was identified as SUN. The original copy of SUN on chromosome 10 (Xiao et al. 2008) was referred to as SlSUN1. The other members were named SlSUN2–SlSUN33 according to their position from the top to the bottom on chromosomes 1–12. Twenty-five SlSUN genes were supported by unigenes or full-length cDNA sequences, and 28 SlSUN genes demonstrated expression in this study (Table 1). Evidence for the expression of the five remaining SUN-like genes either was not found or was below the cut-off in the RNA-seq dataset developed for this study.

Table 1 SUN gene family in tomato

All SlSUN-like genes had multiple introns including one that disrupted the IQ67 domain between codons 16 and 17 (Table 1). This has also been noted for most Arabidopsis IQD genes (Abel et al. 2005). SlSUN6 was the smallest member of this family. It had two exons and was predicted to encode a 128 amino acid protein (Table 1). Whereas SUN is located on chromosome 7 (Xiao et al. 2008), none of the other 33 SUN family members were located on this chromosome. SlSUN19 (Solyc08g007920.1.1) and SlSUN20 (Solyc08g007930.1.1) were close to each other, within a segment of 15 kb on chromosome 8 (Table 1; Fig. 1).

Phylogenetic analysis of Arabidopsis IQD genes and tomato SUN genes

To uncover the phylogenetic relationships between Arabidopsis IQD and tomato SUN genes, we constructed a dendrogram based on their IQ67 domain sequences (Fig. 2; Online source 2). The phylogenetic trees illustrate that the AtIQD and SlSUN genes could be divided into ten subgroups (Fig. 2). The detailed information of closest ortholog pairs between AtIQDs and SlSUNs was listed in Online source 2. SUN and SlSUN1 were paralogs of SlSUN12, and their ortholog was likely represented by AtIQD12 (Online source 2) as reported previously (Xiao et al. 2008). Several AtIQD and SlSUN proteins showed a one-to-one orthologous relationship, such as SlSUN6 and AtIQD20, SlSUN14 and AtIQD32, SlSUN22 and AtIQD6, and SlSUN31 and AtIQD5, which implied there was a common ancestor for these pairs, respectively (Fig. 2; Online source 2).

The expression pattern of SlSUN genes in wild tomato

To gain insights into the role of the SlSUN genes in tomato growth and development, we analyzed their expression patterns in different tissues and developmental stages using an RNA-seq approach. Twenty-eight SlSUN genes were expressed in this study. The average of the highest RPKM values across the 11 tissues for the 28 SlSUN genes was 135.18, and SlSUN29 showed the highest expression of this family, with an RPKM of 836.66 in one of the 11 tissues (Table 1; Online source 3). SlSUN1 was expressed slightly higher in the hypocotyl, flower at anthesis and fruit at 10 and 20 DPA (Fig. 3a; Online source 3). Some SlSUN genes were specifically expressed in certain tissues. For example, SlSUN2 was specifically expressed in the vegetative meristem, young leaf and young flower bud; SlSUN5, SlSUN21 and SlSUN27 were specifically expressed in the root; SlSUN11 and SlSUN22 were specifically expressed in the young leaf and young flower bud; SlSUN12 and SlSUN26 were specifically expressed in the hypocotyl; SlSUN24 was specifically expressed in the vegetative meristem and young flower bud; SlSUN28 was specifically expressed in ripening fruit (33 DPA fruit); SlSUN33 was specifically expressed in fruit at 20 DPA (Fig. 3; Online source 3).

The OFP genes in tomato

Identification of OFP genes in tomato

Twenty-five putative SlOFP genes encoding the OVATE domain were found in the ITAG database version 2.3 (Table 2; Fig. 1; Online source 4). Six additional putative genes that were predicted to encode the OVATE domain were found in tomato WGS Chromosomes (SL2.40) (SGN http://solgenomics.net) using the FGENESH program (http://linux1.softberry.com/berry.phtml). Two of them were found in the previous genome annotation, ITAG version 1.0: SL1.00sc02618_4 (SlOFP4) and SL1.00sc03540_201 (SlOFP31) (Table 2; Fig. 1; Online source 4). The gene locus Solyc09g065350 (SlOFP18) in the reference genome of cultivar Heinz1706 had a one-nucleotide deletion causing a nonsense mutation and the loss of the OVATE domain-coding region. The alleles in S. pimpinellifolium LA1589 and S. peruvianum had a longer CDS (coding sequence) encoding the OVATE domain (Table 2; Online source 4). In this study, the tomato OVATE gene was referred to as SlOFP1 and the other genes were named from SlOFP2 to SlOFP31 based on their position on the chromosome (Table 2; Fig. 1). There was a cluster of eight SlOFP genes on chromosome 10: SlOFP21–SlOFP28 (Table 2; Fig. 1). The expression of 20 SlOFP genes was supported by unigene, full-length cDNA, S. peruvianum de novo transcriptome and/or RNA-seq results from this study (Table 2; Online source 3). Expression for the 11 remaining SlOFP genes was below the threshold level of 2 RPKM.

Table 2 OFP gene family in tomato

Phylogenetic analysis of OFP genes in Arabidopsis and tomato

A dendrogram based on the OVATE domain was constructed to uncover the phylogenetic relationships between Arabidopsis and tomato OFPs (Fig. 4). The phylogenetic tree illustrated that the AtOFP and SlOFP proteins were divided into three subfamilies (Fig. 4). The detailed information of closest ortholog pairs between AtOFPs and SlOFPs was listed in Online source 5. OVATE was a paralog of SlOFP6, and their ortholog was likely represented by AtOFP7. In some subfamilies, SlOFP genes appeared to have expanded in tomato compared to Arabidopsis. For example, within subfamily 1, there were eight SlOFP proteins (from SlOFP22 to SlOFP29) and only one ortholog AtOFP13 in Arabidopsis. On the other hand, several AtOFP and SlOFP proteins demonstrated a one-to-one orthologous relationship, such as SlOFP5 and AtOFP5, SlOFP7 and AtOFP14, and SlOFP15 and AtOFP9 (Fig. 4; Online source 5).

The expression pattern of SlOFP genes in wild tomato

We examined the seventeen SlOFP genes that were expressed in the wild tomato tissues used in this study. SlOFP20 was the most highly expressed gene of this family, with 175.05 RPKM in one of the 11 tissues (Table 2; Online source 3). OVATE was expressed slightly higher in the vegetative meristem, young flower bud, flower at anthesis and fruit at 33 DPA (Fig. 5a; Online source 3). Several SlOFP genes were specifically expressed in one or more tissues. SlOFP7 was specifically expressed in fruit at 20 DPA; SlOFP8 and SlOFP20 were specifically expressed in anthesis-stage flowers; SlOFP10 was specifically expressed in the root and hypocotyl; SlOFP13 was specifically expressed in the root; SlOFP14 was specifically expressed in fruit at 10 and 20 DPA; SlOFP18 was specifically expressed in young flower buds; SlOFP22 was specifically expressed in young leaves; SlOFP29 was specifically expressed in fruit at 10 DPA. On the other hand, SlOFP30 demonstrated similar expression in all tissues that were evaluated (Fig. 5; Online source 3).

The YABBY genes in tomato

Identification of YABBY genes in tomato

Nine YABBY genes were identified in the tomato genome. They were named by their likely orthologous relationship with Arabidopsis YABBY genes (Table 3; Fig. 6). SlYABBY2b was renamed as FAS because its mutation underlies the FASCIATED phenotype (Cong et al. 2008). The nine YABBY genes were distributed on seven chromosomes: SlCRCa and SlYABBY1a were located on chromosome 1, SlINO and SlCRCb on chromosome 5, SlYABBY2a on chromosome 6, SlYABBY5a on chromosome 7, SlYABBY1b on chromosome 8, FAS on chromosome 11 and SlYABBY5b on chromosome 12 (Fig. 1; Table 3). Full-length cDNA or unigene sequences were available for six of these genes. All YABBY genes demonstrated expression in the tissues examined in this study (Table 3).

Table 3 YABBY gene family in tomato

Phylogenetic analysis of YABBY genes in Arabidopsis and tomato

To understand the phylogenetic relationships between YABBY proteins in Arabidopsis and tomato, we constructed a dendrogram based on the YABBY domain (Fig. 6). The phylogenetic tree showed that the AtYABBY and SlYABBY proteins were divided into five groups: INO, CRC, YAB2, YAB1/YAB3 and YAB5 (Fig. 6; Online source 6). The pattern of the tree was largely consistent with a previously reported tree (Toriba et al. 2007). Among the five orthologous groups, AtINO and SlINO in the INO group showed a one-to-one orthologous relationship; AtFIL, AtYABBY3, SlYABBY1a and SlYABBY1b in the YAB1/YAB3 group showed a two-to-two orthologous relationship; and three groups showed a one-to-two orthologous relationship: AtCRC with SlCRCa and SlCRCb in the CRC group, AtYABBY2 with SlYABBY2a and FAS (SlYABBY2b) in the YAB2 group, and AtYABBY5 with SlYABBY5a and SlYABBY5b in the YAB5 group (Fig. 6; Online source 6).

The expression pattern of YABBY genes in wild tomato

The SlYABBY genes were either not expressed or were expressed at very low levels in the root (Fig. 5c; Online source 3). SlCRCa, SlCRCb and SlINO were highly expressed in reproductive tissues. SlCRCa was specifically expressed in young flower buds; SlCRCb was specifically expressed in young flower buds and flowers at anthesis; SlINO was specifically expressed in flowers at anthesis (Fig. 5c; Online source 3). To study the three genes in more detail in reproductive tissues, we evaluated their expression pattern in floral and fruit tissues at different developmental stages using semi-quantitative RT-PCR (Online source 6). SlCRCa transcripts were only detected during the early stage of flower development, namely 10 days before anthesis (DBA) and 5 DBA. SlCRCb transcripts were detected from flowers at 10 DBA until 2 DPA in the developing fruit. The peak of SlCRCb expression was in anthesis-stage ovaries. SlINO transcripts were detected from flowers at 5 DBA until 2 DPA in the developing fruit. The peak of the SlINO transcripts was also found in anthesis-stage ovaries (Fig. 5c; Online source 6).

The other SlYABBY genes also showed different expression patterns even though they belonged to the same phylogenetic group. For example, SlYABBY1a was expressed in young flower buds at a level of 419.3 RPKM and in flowers at anthesis at a level of 121.0 RPKM, whereas SlYABBY1b was expressed in young flower buds at a level of 121.5 RPKM and in flowers at anthesis at a level of 37.8 RPKM. SlYABBY2a was expressed at much higher levels than FAS (SlYABBY2b) in all reproductive tissues. In young flower buds, flowers at anthesis, and fruit at 10, 20 and 33 DPA, SlYABBY2a was expressed at levels of 146.1, 578.4, 392.6, 191.2 and 206.1 RPKM, respectively, whereas SlYABBY2b was expressed at levels of 105.3, 81.2, 38.2, 16.0 and 11.9 RPKM, respectively. SlYABBY5a was expressed at higher levels than SlYABBY5b in all tissues examined in this study (Online source 3).

Discussion

The SlSUN genes

Orthologs are genes that originate from a single ancestral gene in the last common ancestor of the species and are likely to have equivalent functions (Fitch 1970; Koonin 2005). Four pairs of putative one-to-one orthologous genes were found between SlSUN and AtIQD genes (Fig. 2; Online source 2). Three of these pairs had a similar expression pattern in tomato and Arabidopsis: SlSUN14 and AtIQD32, and SlSUN31 and AtIQD5 are almost ubiquitously expressed, whereas SlSUN22 and AtIQD6 are highly expressed in young flower buds (Fig. 3; Online source 2, Online source 3) (Schmid et al. 2005). Their similar expression patterns suggest that these orthologous pairs may play equivalent roles in growth and development.

Paralogs are genes originating from duplication within one organism and may have more divergent functions (Fitch 1970; Koonin 2005). Eleven pairs of putative paralogs were found in the SlSUN gene family (Online source 2). Several pairs of paralogs showed a similar expression pattern, which suggests that they might share a common or similar function. For example, SlSUN11 and SlSUN22 were highly expressed in both young leaves and young flower buds, and SlSUN25, SlSUN29 and SlSUN30 were expressed almost equally (Fig. 3). Several pairs of paralogs had different expression patterns, suggesting that they play diverse roles in tomato development. For example, SlSUN1 demonstrated highest expression in fruit at 10 DPA but SlSUN12 demonstrated greatest expression in the hypocotyl; SlSUN5 showed greatest expression in the root but SlSUN28 had greatest expression in ripening fruit; SlSUN17 was evenly expressed in almost all tissues, yet SlSUN21 demonstrated highest expression in the root; SlSUN24 had greater expression in both vegetative meristems and young flower buds but SlSUN27 showed much greater expression in the root (Fig. 3; Online source 2, Online source 3).

The SlOFP genes

The tomato OVATE gene is the founding member of the OFP family. Its loss-of-function mutation results in an elongated tomato fruit. It is thought to be a plant-growth suppressor and, as determined by real-time PCR analysis, is expressed in the reproductive organs in the early stages of flower and fruit development (Liu et al. 2002). In this study, we found that OVATE was indeed expressed in the vegetative meristem, but its expression in the reproductive organs showed a different pattern from what was previously reported. OVATE demonstrated high expression in young flower buds and decreased expression in 20 DPA fruit. OVATE expression also increased at the time of fruit ripening. A similar expression pattern of the OVATE gene was found for both the tomato cultivar Heinz1706 and the same wild tomato S. pimpinellifolium accession as was used in this study (Tomato-Genome-Consortium 2012). It might be interesting to further investigate the role of OVATE at the fruit ripening stage.

Several pairs of orthologs between SlOFPs and AtOFPs were shown to have a similar expression pattern, suggesting that they might share common functions. For example, SlOFP7 and AtOFP14 demonstrated greater expression in fruit/silique; SlOFP13 and AtOFP17 were expressed much higher in the root (Fig. 5a, Online source 3, Online source 5) (Schmid et al. 2005; Wang et al. 2011).

Fourteen SlOFP genes were not expressed or were expressed at very low levels. The other members, except SlOFP30, were expressed at high levels in one or a few tissues. This suggests they have a specialized function in plant development. For example, SlOFP8 and SlOFP20 demonstrated much greater expression in anthesis-stage flowers; SlOFP10 and SlOFP13 were specifically expressed in the root and hypocotyl; SlOFP14 and SlOFP29 were expressed much higher in 10 DPA fruit. SlOFP18 was specifically expressed in young flower buds; SlOFP22 was expressed much higher in young leaves (Fig. 5; Online source 3).

The SlYABBY genes

The expression pattern of tomato YABBY genes was similar to that of Arabidopsis YABBY genes. The Arabidopsis YABBY genes are divided into two classes based on their expression pattern: the reproductive and the vegetative YABBY genes. The reproductive YABBY genes of Arabidopsis include CRC and INO, which express exclusively in floral organs (Bowman and Smyth 1999; Villanueva et al. 1999). In contrast, the vegetative YABBY genes of Arabidopsis, including FIL (YAB1), YAB2, YAB3, and YAB5, are expressed in the leaf-derived organs, such as cotyledons, leaves, and floral organs (Sawa et al. 1999; Siegfried et al. 1999; Watanabe and Okada 2003; Stahle et al. 2009; Sarojam et al. 2010). The tomato CRCa, CRCb and INO genes, the orthologs of Arabidopsis reproductive YABBY genes, were expressed in flower and the early stage of fruit development (Fig. 5c; Online source 6). On the other hand, and as expected, tomato FAS, YABBY2a, YABBY1a, YABBY1b, YABBY5a and YABBY5b genes were also expressed in vegetative tissues (Fig. 5c).

The analysis of YABBY mutants suggests that the function of YABBY genes has diversified during evolution, even among genes that belong to the same group in the phylogenetic tree (Yamaguchi et al. 2004; Cong et al. 2008). Arabidopsis CRC and O. sativa DL belong to the CRC group, and both play a role in carpel development. However, O. sativa DL is also involved in leaf development, whereas Arabidopsis CRC is expressed exclusively in floral organs (Bowman and Smyth 1999; Yamaguchi et al. 2004). Two CRC genes, SlCRCa and SlCRCb, were identified in tomato (Table 3). Both were expressed only in reproductive tissues, but their expression patterns differed. SlCRCa was specifically expressed at the early stage of flower development (flower buds at 10 days or more before anthesis), whereas SlCRCb was expressed equally at very young floral stages and at the anthesis stage (Fig. 5c; Online source 6). The different expression patterns of SlCRCa and SlCRCb suggest that they might play different roles in reproductive tissue development. Similarly, two YABBY2 genes, FAS (SlYABBY2b) and SlYABBY2a, have been identified in tomato, compared with only one YABBY2 gene in Arabidopsis. FAS and SlYABBY2a showed different expression patterns in tomato tissues. SlYABBY2a was expressed at a higher level than FAS in all reproductive tissues examined in this study (Online source 3). The knockout of the FAS gene results in an increase in carpel and locule number in tomato (Cong et al. 2008). However, there is no evidence that the Arabidopsis YABBY2 gene is involved in regulating carpel number. This suggests that the members of the YABBY2 group in tomato may have gained a new function during evolution.

Duplication mechanisms accounting for the expansion of SUN, OFP and YABBY families

We noted that certain subfamilies of the SlSUN, SlOFP and SlYABBY families showed gene expansion. Gene family expansions usually result from duplications, such as tandem duplications, segmental duplications and polyploidization or whole-genome duplications (Sankoff 2001; Adams and Wendel 2005). Whole-genome duplication has occurred in tomato, and most of the collinear blocks are located at the top and bottom parts of the chromosomes (Song et al. 2012). Most of the SlSUN, SlOFP and SlYABBY genes were also located at the top and bottom parts of the chromosomes (Fig. 1), which suggests that whole-genome duplication may have played a significant role in the expansion of the three families.

Fig. 1
Chromosomal distribution of tomato SUN, OFP and YABBY genes. The positions of the SlSUN, SlOFP and SlYABBY genes on the chromosomes are based on the tomato WGS chromosomes (SL2.40). The region of the fs8.1 locus was adapted from Ku et al. (2000), and the region of the fs10.2 locus was adapted from the review by Grandillo et al. (1999).

Fig. 2
Phylogenetic tree of the AtIQDs and SlSUNs based on their IQ67 domain sequence. The tree is unrooted and is illustrated using gene Pp1s382_30V6.1 in Physcomitrella patens subsp. patens as an outgroup. Bootstrap support values below 50 % are not shown.

Other types of duplication may also explain the expansion of the three families. SUN on chromosome 7 arose from a gene on chromosome 10 through a retrotransposon-mediated gene duplication (Xiao et al. 2008). The cluster of SlSUN19 and SlSUN20 and the cluster of SlOFPs (from SlOFP22 to SlOFP28) might have arisen from tandem duplication, because the genes within each cluster are close to each other on the chromosome and closely related in the dendrogram. Only one SUN-like gene, PGSC0003DMG400005774 (Transcript_ID, PGSC0003DMT400014796), was found in potato in the genomic region homologous to that of tomato SlSUN19 and SlSUN20. Using the divergence rate r = 6.5 × 10⁻⁹ mutations per synonymous site per year (Gaut et al. 1996), the estimated divergence time of SlSUN19 and PGSC0003DMT400014796 was ~8.2 million years (Myr), whereas the estimated divergence time of SlSUN19 and SlSUN20 was ~3.3 Myr (Online source 2). Therefore, SlSUN19 and SlSUN20 might have arisen from a tomato-specific tandem duplication. In contrast, the tandem duplication that produced the cluster of SlOFPs might be shared by tomato and potato (Online source 4).
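
The divergence times in this paragraph come from a molecular clock calculation. The following is a minimal sketch, assuming the commonly used relation T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between two sequences. The relation itself, the function names and the Ks value in the example are assumptions for illustration and are not taken from this study.

```python
# Synonymous substitution rate from Gaut et al. (1996), as quoted in the text.
RATE_PER_SITE_PER_YEAR = 6.5e-9


def divergence_time_myr(ks: float, rate: float = RATE_PER_SITE_PER_YEAR) -> float:
    """Estimate divergence time in Myr, assuming T = Ks / (2 * r).

    The factor 2 reflects that substitutions accumulate along both
    lineages after the split. This relation is an assumption of the sketch.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    return ks / (2.0 * rate) / 1e6


def implied_ks(time_myr: float, rate: float = RATE_PER_SITE_PER_YEAR) -> float:
    """Invert the relation: the Ks that corresponds to a divergence time in Myr."""
    if time_myr < 0 or rate <= 0:
        raise ValueError("time_myr must be non-negative and rate must be positive")
    return 2.0 * rate * time_myr * 1e6


if __name__ == "__main__":
    # Hypothetical Ks value, for illustration only.
    print(round(divergence_time_myr(0.13), 1))
    # Ks that the reported ~8.2 Myr estimate would imply under this relation.
    print(round(implied_ks(8.2), 3))
```

Under this assumed relation, the ~8.2 Myr estimate would correspond to a Ks of roughly 0.107 and the ~3.3 Myr estimate to roughly 0.043; the actual Ks values are those reported in Online source 2.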

Segmental duplication most likely explains the expansion of the tomato YABBY2 subfamily. In this subfamily, Arabidopsis, cucumber and poplar each have one member: AtYABBY2, Csa007814 and Potri.016G067300.1, respectively. Grape has two members, GSVIVG01022586001 (Transcript name, GSVIVT01022586001) and GSVIVG01037533001 (Transcript name, GSVIVT01037533001); potato has two members, PGSC0003DMG400002988 (Transcript_ID, PGSC0003DMT400007731) and PGSC0003DMG400005936 (Transcript_ID, PGSC0003DMT400015197); and tomato has two members, FAS (SlYABBY2b) and SlYABBY2a (Online source 6). In this study, the estimated divergence time of the tomato gene SlYABBY2a and the potato gene PGSC0003DMT400015197 was ~5.9 Myr, and that of SlYABBY2b and PGSC0003DMT400007731 was ~10.6 Myr. These divergence times are close to the one reported for the two species (~7.3 Myr ago) (Tomato-Genome-Consortium 2012). The genomic regions around the orthologous pairs SlYABBY2a/PGSC0003DMT400015197 and SlYABBY2b/PGSC0003DMT400007731 were also very similar. In contrast, the tomato genes SlYABBY2a and SlYABBY2b diverged ~50.7 Myr ago, and the potato genes PGSC0003DMT400007731 and PGSC0003DMT400015197 diverged ~41.0 Myr ago. These results indicate that the expansion of the YABBY2 subfamily in tomato and potato might have arisen from a segmental duplication that occurred before the divergence of potato and tomato (Online source 6). However, this duplication might be independent of the duplication that expanded the V. vinifera YABBY2 subfamily. In this study, potato and tomato were estimated to have separated from grape ~76.2 Myr ago, whereas the grape genes GSVIVT01022586001 and GSVIVT01037533001 separated ~60.9 Myr ago and the duplication in the potato and tomato YABBY2 subfamilies arose ~50.7 Myr ago. Therefore, the YABBY2 subfamily was duplicated independently in grape and in the lineage leading to tomato and potato, after these lineages had diverged (Online source 6).
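
The argument in this paragraph rests on comparing the divergence time of paralogs within a species with the time of a species split. A minimal sketch of that comparison, using the estimates quoted above, might look as follows; the function name and the labels it returns are illustrative and are not part of the original analysis.

```python
def duplication_relative_to_split(paralog_myr: float, split_myr: float) -> str:
    """Place a duplication relative to a species split.

    Paralogs that diverged earlier (larger Myr value) than the species
    split imply that the duplication predates the split.
    """
    return "before split" if paralog_myr > split_myr else "after split"


# Estimates quoted in the text (Myr).
TOMATO_POTATO_SPLIT = 7.3      # Tomato-Genome-Consortium 2012
SOLANUM_GRAPE_SPLIT = 76.2     # tomato/potato vs grape, estimated in this study
TOMATO_YABBY2_PARALOGS = 50.7  # SlYABBY2a vs SlYABBY2b
POTATO_YABBY2_PARALOGS = 41.0  # PGSC0003DMT400007731 vs PGSC0003DMT400015197
GRAPE_YABBY2_PARALOGS = 60.9   # GSVIVT01022586001 vs GSVIVT01037533001

if __name__ == "__main__":
    # Tomato and potato paralogs are older than the tomato-potato split.
    print(duplication_relative_to_split(TOMATO_YABBY2_PARALOGS, TOMATO_POTATO_SPLIT))
    print(duplication_relative_to_split(POTATO_YABBY2_PARALOGS, TOMATO_POTATO_SPLIT))
    # Both duplications are younger than the split from grape.
    print(duplication_relative_to_split(TOMATO_YABBY2_PARALOGS, SOLANUM_GRAPE_SPLIT))
    print(duplication_relative_to_split(GRAPE_YABBY2_PARALOGS, SOLANUM_GRAPE_SPLIT))
```

The first two comparisons place the Solanum duplication before the tomato-potato split, and the last two place both the Solanum and the grape duplications after the split from grape, which is the pattern expected for independent duplications. The sketch ignores the uncertainty of the individual estimates.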

After duplication, genes may evolve to acquire new functions, a process called neofunctionalization. A good example is SUN on chromosome 7, which, after its insertion into DEFL1, shows a different expression pattern from its ancestral copy on chromosome 10 (Fig. 3a) (Xiao et al. 2008; Xiao et al. 2009). Although the gene sequence itself did not change, this change in expression resulted in a new function, namely the elongation of the tomato fruit (Xiao et al. 2008).

Fig. 3
Expression pattern of SlSUN genes in tomato LA1589. a Genes from SlSUN1 to SlSUN13, b genes from SlSUN14 to SlSUN24, c genes from SlSUN25 to SlSUN33

Fig. 4
Phylogenetic tree of the AtOFPs and SlOFPs based on OVATE domain sequence. The tree is unrooted and is illustrated using gene Pp1s283_17V6.1 in Physcomitrella patens subsp. patens as an outgroup. Bootstrap support values below 50 % are not shown. AtOFP19 (AT2G36026.1), AtOFP20 (AT1G06923.1)

Fig. 5
Expression pattern of SlOFP and SlYABBY genes in tomato LA1589. a Genes from OVATE to SlOFP13, b genes from SlOFP14 to SlOFP31, c SlYABBY genes

Fig. 6
Phylogenetic tree of the YABBY proteins in tomato and A. thaliana based on YABBY domain sequence. The tree is unrooted and is illustrated using protein ADE77109 in Picea sitchensis as an outgroup. Bootstrap support values below 50 % are not shown.

SUN, OFP, YABBY genes and fruit shape loci

Nearly 30 loci control tomato fruit shape (Grandillo et al. 1999). Four genes underlying these loci, namely OVATE, SUN, FAS and LC (Locule Number), have been cloned (Liu et al. 2002; Cong et al. 2008; Xiao et al. 2008; Munos et al. 2011). Identification of the SUN, OFP and YABBY gene family members may help to uncover the genes underlying the other tomato fruit shape loci. For example, fs8.1 is a major locus controlling fruit elongation in tomato. It is located in the centromeric region of chromosome 8 (Grandillo et al. 1996; Ku et al. 2000), and the SlSUN22 gene maps to this region (Fig. 1). SlSUN22 was highly expressed in young flowers (Online source 3), suggesting that it might be a candidate gene for fs8.1. In addition, a cluster of SlOFPs on the bottom part of chromosome 10 (Fig. 1; Table 2) overlaps with the tomato fs10.2 region (Grandillo et al. 1999).

Varying levels of synteny exist among members of the Solanaceae family (Livingstone et al. 1999; Doganlar et al. 2002a; Tomato-Genome-Consortium 2012). QTL analysis has shown the existence of several overlapping fruit shape loci in eggplant, pepper and tomato (Doganlar et al. 2002b; Frary et al. 2003; Zygier et al. 2005; Paran and van der Knaap 2007; Borovsky and Paran 2011). Down-regulation of CaOvate changes the shape of a round pepper into a more oblong shape (Tsaballa et al. 2011), suggesting that CaOvate and OVATE might play similar roles in fruit shape determination. Thus, the identification of SUN, OFP and YABBY genes may also help to uncover the genes underlying the fruit shape loci in other Solanaceae species.

In summary, we identified 34 SlSUN, 31 SlOFP and 9 SlYABBY genes in tomato. Genome sequence analysis shows that some SlSUNs and SlOFPs map within several known fruit shape loci. The closest putative orthologs between Arabidopsis and tomato in each family were determined from their phylogenetic relationships and sequence similarity. Furthermore, some family members exhibited tissue-specific expression in the RNA-seq analysis. Our results pave the way for studying the roles of SlSUN, SlOFP and SlYABBY genes in tomato growth and development and for furthering the understanding of these families in plant biology in general.