Introduction

Cucumber (Cucumis sativus L.) is one of the most important vegetables in the world (Huang et al. 2009). Fruit color is an economically important trait of cucumber that strongly influences consumer preference. Biosynthesis of chlorophylls results in the formation of the most abundant natural pigments on earth (Chen 2014). There are nine known steps for chlorophyll synthesis from glutamate (Chen 2014). Mutation of any enzyme involved in the pathway could affect chlorophyll production and thus the color of plants. Chlorophyll content is a main factor determining the color of cucumber fruits and leaves (Lancaster et al. 1997; Sun et al. 2003). Although the biosynthesis pathway of chlorophyll is well established in higher plants, the genes involved remain largely unknown in cucumber. The only reported gene is Arc5, mutation of which resulted in fewer and larger chloroplasts, rendering light green cucumber fruits (Zhou et al. 2015).

A change in leaf color is a very common phenotype induced by EMS and can be easily identified with the naked eye. To obtain new cucumber color mutants, we carried out a genetic screen for fruit color alteration in approximately 6000 independent EMS-mutagenized lines of the elite cucumber line 406. The recently developed MutMap method is based on whole genome re-sequencing of bulked DNA of F2 segregants derived from a cross between the wild-type plant and the mutant, enabling researchers and breeders to rapidly associate phenotypic variation with genomic sequence differences (Abe et al. 2012; Takagi et al. 2015). Because the F2 progeny are derived from a cross between the mutant and its parental wild-type plant, the number of segregating loci responsible for the phenotypic change is minimal, in most cases one, and thus segregation of the phenotypes can be unequivocally observed even if the phenotypic difference is small (Abe et al. 2012). If mutagenesis is performed in an elite cultivar, the mutants and associated SNP markers can be used by breeders to produce new varieties.

In this study, a mutant showing light green fruits and leaves was obtained. The causative SNP underlying the mutant phenotype was rapidly identified through the modified MutMap. The light green mutant provides a new locus for cucumber breeding.

Materials and methods

Plant materials

The north China type cucumber inbred line 406 was used to make the mutant library. The seeds of 406 were treated with 1 % EMS (Sigma M0880) according to Tadmor et al. (2007). The first mutant generation (M1) plants were self-pollinated and brought to the second generation (M2) to make the mutant gene homozygous. A recessive mutant M218 with light green fruits and leaves was identified in an M2 population. Then it was crossed with the wild-type 406. The resulting F1 plants were self-pollinated, and 96 F2 progeny were obtained and grown in the greenhouse to evaluate the phenotype.

Isolation and transformation of protoplasts

The protoplasts of cucumber leaves were isolated following Huang et al. (2013) with slight modifications. Fresh cucumber leaves were cut into 0.5 mm wide strips using a sharp blade and transferred immediately to the enzymolysis solution, then shaken slowly in the dark for more than 3 h to release protoplasts. The enzymolysate was filtered through a 100-mesh nylon membrane and centrifuged at 100 g for 25 min, and the sediment was re-suspended in collecting solution. The protoplasts were photographed using a Zeiss 400 epi-fluorescence microscope with a PI filter. The number of protoplasts was counted under a 100× objective lens.

To study the subcellular localization of the Ycf54 protein, Ycf54 cDNA fused in frame with GFP at the C-terminus was cloned into a modified vector Psuper1300. The construct p35S::Ycf54-GFP was transformed into cucumber protoplasts according to He et al. (2007) and observed under a Carl Zeiss LSM700 META laser scanning microscope. GFP signals were collected using emission between 500 and 530 nm with excitation at 488 nm, and the red auto-fluorescence of chloroplasts was obtained using emission between 615 and 650 nm with excitation at 639 nm.

Measurement of chlorophyll

The second leaves from the bottom of 4-week-old cucumber plants were sampled and cut into small pieces. For chlorophyll extraction, 0.16 g of leaves was put into 10-ml tubes with 9 ml of an ethanol-acetone mixture (volume ratio 1:1) and kept in the dark for 2–3 days until the leaves turned white. The absorbance of the extract was measured at 645 and 663 nm with a UV spectrophotometer, and the concentration of chlorophyll was calculated using the following formulas:

$$\text{Chl}\,a\ (\text{mg/g}) = (12.7 \times A_{663} - 2.69 \times A_{645}) \times V / (1000 \times W)$$
$$\text{Chl}\,b\ (\text{mg/g}) = (22.9 \times A_{645} - 4.89 \times A_{663}) \times V / (1000 \times W)$$
$$\text{Chl}\ (\text{mg/g}) = \text{Chl}\,a + \text{Chl}\,b$$

where V is the volume of the mixture (ml), and W is the weight of samples (g) (Arnon 1949).
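
The three formulas can be applied to a pair of absorbance readings as in the following minimal sketch. The coefficients are taken exactly as printed above; the function name and the absorbance values are hypothetical and serve only as an illustration, while V = 9 ml and W = 0.16 g follow the extraction described in this section.

```python
def chlorophyll_content(a663: float, a645: float, volume_ml: float, weight_g: float) -> dict:
    """Return Chl a, Chl b and total chlorophyll (mg/g) from absorbance readings."""
    # Common scaling term V / (1000 * W) shared by both formulas.
    scale = volume_ml / (1000 * weight_g)
    chl_a = (12.7 * a663 - 2.69 * a645) * scale
    chl_b = (22.9 * a645 - 4.89 * a663) * scale
    return {"chl_a": chl_a, "chl_b": chl_b, "chl_total": chl_a + chl_b}


# Hypothetical absorbance readings (not measured values from this study).
result = chlorophyll_content(a663=0.80, a645=0.35, volume_ml=9.0, weight_g=0.16)
print(result)
```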

Whole genome re-sequencing

Genomic DNA was isolated from young leaves using the cetyltrimethylammonium bromide (CTAB) method (Fulton et al. 1995). DNA of all plants showing the mutant phenotype in the F2 population was bulked in an equal ratio to construct the mutant pool, and DNA of 20 normal F2 individuals was used to make the normal pool in the same way. The bulked DNA was subjected to whole genome re-sequencing using an Illumina HiSeq 2000 sequencer (100 bp) (BerryGenomics, China). The insert size of both libraries was 500 bp. The genomic DNA of wild-type 406 was also sequenced using the same method to generate a reference sequence for identifying mutations induced by EMS.

MutMap method

The short reads of the bulked DNA and 406 were quality-checked using the software FastQC (http://www.bioinformatics.babraham.ac.uk/projects/download.html#fastqc) and then aligned to the cucumber reference genome 9930 using BWA (Li and Durbin 2009). Alignment files were converted to SAM or BAM files using SAMtools and passed through a filter pipeline (Li et al. 2009). The filter pipeline retained bases as reliable if they met the following two criteria: (i) the Phred quality scores of both base sequencing and read mapping were higher than 20, and (ii) the read depth was between 5 and 45. The heterozygous bases in the mutant pool and 406 were removed, and the common bases among the three datasets (mutant pool, normal pool and wild-type 406) were extracted. Compared with 406, the bases showing G to A or C to T mutations in the mutant pool and heterozygous G/A or C/T in the normal pool were subjected to further analysis.
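
The filtering logic described above can be summarized in a short sketch. This is not the pipeline used in the study; it is a simplified illustration that assumes each site has already been reduced to one summary record per dataset. The `PoolCall` record, the function names and the inclusive treatment of the depth bounds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolCall:
    """Hypothetical per-site summary for one dataset (406, mutant pool or normal pool)."""
    alleles: frozenset  # bases observed at the site, e.g. frozenset("G") or frozenset("GA")
    depth: int          # read depth at the site
    base_qual: int      # Phred base quality
    map_qual: int       # Phred mapping quality


# EMS-type transitions, written as (wild-type base, mutant base).
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def is_reliable(call: PoolCall, min_qual: int = 20, min_depth: int = 5, max_depth: int = 45) -> bool:
    """Criteria (i) and (ii): both qualities above 20 and depth between 5 and 45."""
    return (call.base_qual > min_qual and call.map_qual > min_qual
            and min_depth <= call.depth <= max_depth)


def is_candidate(wild: PoolCall, mutant: PoolCall, normal: PoolCall) -> bool:
    """True if the site is a homozygous EMS-type mutation in the mutant pool
    and heterozygous for the same two bases in the normal pool."""
    if not all(is_reliable(c) for c in (wild, mutant, normal)):
        return False
    # Heterozygous bases in 406 and in the mutant pool are removed.
    if len(wild.alleles) != 1 or len(mutant.alleles) != 1:
        return False
    (wt_base,) = wild.alleles
    (mut_base,) = mutant.alleles
    if (wt_base, mut_base) not in EMS_TRANSITIONS:
        return False
    return normal.alleles == frozenset((wt_base, mut_base))


# Hypothetical site: G in 406, homozygous A in the mutant pool, G/A in the normal pool.
wild = PoolCall(frozenset("G"), depth=20, base_qual=30, map_qual=40)
mutant = PoolCall(frozenset("A"), depth=15, base_qual=35, map_qual=50)
normal = PoolCall(frozenset("GA"), depth=25, base_qual=32, map_qual=45)
print(is_candidate(wild, mutant, normal))
```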

Development of dCAPS marker

The dCAPS marker was developed to assess the association between SNPs and the mutant phenotype. The forward PCR primer was designed using the online software dCAPS FINDER 2.0 (http://helix.wustl.edu/dcaps/dcaps.html) (Neff et al. 1998), and only one mismatch was allowed in the primer sequence. The reverse primer was designed using Primer3 (http://bioinfo.ut.ee/primer3-0.4.0/primer3/). PCR program and enzyme digestion were performed following Zhou et al. (2015).

RNA extraction and quantitative RT-PCR

Total RNA from mutant and normal leaves was extracted using the EasyPure Plant RNA Kit (Tiangen, China). First-strand cDNA was synthesized with the M-MLV kit (Takara, Japan). The cDNA was diluted 10-fold and used for quantitative PCR (qPCR) with Eppendorf AG 22331 Hamburg according to the protocol provided by the manufacturer. The primers Csy-F/Csy-R (Table 1) were used to detect the expression of CsYcf54, which produced a 145 bp fragment. The cucumber Tubulin gene was used as the internal reference. Gene expression level was calculated on the basis of the 2^−ΔΔCt method (Livak and Schmittgen 2001).
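
As a brief illustration of the 2^−ΔΔCt calculation, the sketch below normalizes the target gene to the reference gene in each sample and then compares the mutant with the normal plant. The function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of the target gene in the sample relative to the control (2^-ddCt)."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the sample (e.g. mutant)
    delta_control = ct_target_control - ct_ref_control   # dCt in the control (e.g. normal)
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: identical dCt in both plants gives a fold change of 1.
unchanged = relative_expression(24.0, 18.0, 24.0, 18.0)
# One extra cycle for the target in the sample corresponds to half the expression.
halved = relative_expression(25.0, 18.0, 24.0, 18.0)
print(unchanged, halved)
```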

Table 1 Primers used in the study

Results

Phenotype of the light green mutant

The fruits and leaves were uniform in color for each F2 plant, showing either light green or normal green (Fig. 1a–d). The ratio of light green/normal green plants was 13:83 in the F2 population, suggesting that the mutant phenotype was controlled by a recessive gene. The quantity and morphology of chloroplasts as well as the chlorophyll content are main factors that affect cucumber color, so we investigated these two traits in the mutant and normal plants. Protoplasts dissociated from fresh cucumber leaves were observed under the microscope, and no pronounced difference was found in the number or shape of chloroplasts (Fig. 1e, f). In contrast, the chlorophyll content was significantly different between normal and mutant plants. Compared with normal plants, the contents of total chlorophyll (Chl), chlorophyll a (Chla) and chlorophyll b (Chlb) in mutant plants decreased by 23.98, 5.83 and 50.82 %, respectively (Fig. 1g). The marked reduction of chlorophylls is the major reason for the light green color of the mutant plants.

Fig. 1 Phenotype of the light green mutant. a, b Fruits of the normal plant and Ycf54 mutant. c, d Leaves of the normal plant and Ycf54 mutant. e, f Protoplasts isolated from the normal plant and Ycf54 mutant, scale bar = 15 μm. The number and shape of chloroplasts were not different between normal and mutant plants. g Comparison of concentration of Chla, Chlb and Chl between normal and mutant plants

Identification of the candidate gene for the light green mutant

The bulked DNA of 13 mutant F2 individuals (mutant pool) and the bulked DNA of 20 normal F2 individuals (normal pool) were subjected to whole genome re-sequencing. We obtained 45 million and 41 million paired-end reads for the mutant pool and normal pool, respectively, corresponding to >10× coverage of the cucumber genome (235 Mb). There were 35 million (78.24 %) and 33 million (79.50 %) reads mapped to the cucumber reference genome in the mutant and normal pools, respectively, and 31 million (68.52 %) and 29 million (70.27 %) reads in each pool had a quality score higher than 20. For the wild-type 406, 50 million paired-end reads were generated. Among them, 72.53 % of reads (36 million) were mapped to the reference genome, and 64.12 % (32 million) were high quality reads.
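
The coverage statement can be checked with a rough calculation. The sketch below assumes that each counted read is 100 bp long (the read length given in the Methods) and that every read contributes its full length; both are simplifications, and the function name is hypothetical.

```python
def fold_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Approximate fold coverage as total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 235_000_000  # cucumber genome size quoted in the text (235 Mb)

# Mutant pool: all reads, and only the mapped reads with quality score above 20.
mutant_raw = fold_coverage(45_000_000, 100, GENOME_SIZE_BP)
mutant_high_quality = fold_coverage(31_000_000, 100, GENOME_SIZE_BP)
print(round(mutant_raw, 1), round(mutant_high_quality, 1))
```

Under these assumptions, both the raw and the high-quality read sets of the mutant pool exceed 10-fold coverage.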

After trimming, 149,212,755 common nucleotides in the three datasets (mutant pool, normal pool and wild-type 406) were extracted. The single nucleotide polymorphism (SNP) responsible for the recessive phenotype was expected to be homozygous in the mutant pool and heterozygous in the normal pool. In total, 23 SNPs exhibited homozygous G to A or C to T mutations in the mutant pool (Table 2). Of them, 17 SNPs were located in intergenic regions (Fig. 2a). Of the remaining six genic SNPs, four were located within intronic regions without disrupting splicing sites or generating stop codons, while the other two resulted in non-synonymous mutations.

Table 2 The twenty-three G to A or C to T mutations identified through modified MutMap

Fig. 2 Analysis of the candidate gene CsYcf54. a The distribution of homozygous G to A or C to T mutations in the mutant pool. Black hollow triangles indicated mutations in the intergenic region. Red hollow triangles represented synonymous mutations and red solid triangles indicated non-synonymous mutations. The numbers (Mb) under the triangles were the position of mutations on the chromosome. b Gene structure and mutation information of CsYcf54. c Alignment of the mature amino acid sequences of Ycf54 from different species. Microphotographs showing green fluorescence of Ycf54-GFP (d), light image (e), auto-fluorescence of chloroplasts (f), and merged images (g) indicated the Ycf54 protein was localized in chloroplasts of cucumber protoplast, scale bar = 15 μm

To identify the causative SNP, the association between the two non-synonymous mutations (SNP6G9285631 and SNP7G4699781) and the mutant phenotype was determined. SNP7G4699781 was heterozygous in 13 F2 mutant plants. The causative SNP should be homozygous recessive in the mutant pool, thus SNP7G4699781 was excluded as the causative one. For SNP6G9285631, all mutants showed homozygous A, while normal plants exhibited either heterozygous G/A or homozygous G. The co-segregation between SNP6G9285631 and the mutant phenotype indicated that SNP6G9285631 was the causative SNP for the light green phenotype. Moreover, 10 out of 23 SNPs were distributed in a cluster on chromosome 6, which is a typical characteristic of EMS mutagenesis.

Characterization of the candidate gene

The causative SNP occurs in the third exon of the gene Csa6M133820 on chromosome 6 (Fig. 2b). The full coding sequence of Csa6G133820 is 681 bp, encoding a homolog of Ycf54, designated CsYcf54 here. Csa6M133820 is located on the antisense strand, thus the mutation in the coding sequence is actually C to T. The mutation (C to T) at 607 bp resulted in an amino acid substitution from hydrophobic proline (P) to hydrophilic serine (S). The expression of CsYcf54 did not differ between normal and mutant leaves (Fig. S1). Alignment of homologous proteins from distantly related species showed that the mutation (P203S) occurred in a conserved region of Ycf54 proteins (Fig. 2c).
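
The link between the nucleotide position and the amino acid change can be verified with simple codon arithmetic. The sketch below uses the standard genetic code; the exact proline codon in CsYcf54 is not given in the text, so all four proline codons are checked. The function names are hypothetical.

```python
def codon_number(cds_position: int) -> int:
    """1-based codon index for a 1-based position in the coding sequence."""
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position: int) -> int:
    """1-based position (1, 2 or 3) of the nucleotide within its codon."""
    return (cds_position - 1) % 3 + 1


# Standard genetic code: codons for proline and serine.
PROLINE_CODONS = {"CCT", "CCC", "CCA", "CCG"}
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}

# Nucleotide 607 is the first base of codon 203.
print(codon_number(607), position_in_codon(607))

# A C-to-T change at the first position turns every proline codon into a serine codon.
mutated = {"T" + codon[1:] for codon in PROLINE_CODONS}
print(mutated <= SERINE_CODONS)
```

Whichever proline codon is present, a C to T change at position 607 therefore yields serine at residue 203, in agreement with the P203S substitution.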

The TARGETP program (http://www.cbs.dtu.dk/services/TargetP) predicted that the CsYcf54 protein was targeted to the chloroplast. The green fluorescent signals co-localized with the auto-fluorescence of chloroplasts (Fig. 2d–g), indicating that CsYcf54-GFP was localized in the chloroplast.

Discussion

MutMap allows rapid identification of the putative variants responsible for a mutant phenotype (Abe et al. 2012). Here we applied MutMap to localize the gene controlling cucumber leaf color using a cross between the light green mutant and the wild-type normal green line 406. In the F2 population, 83 plants showed the normal phenotype and 13 were mutant, which did not fit the 3:1 ratio expected for Mendelian inheritance (χ² = 6.72, P < 0.05). The segregation distortion could not be explained by homozygous lethality because there was no difference in growth vigor between mutant and normal plants. It might be due to the relatively small segregating population. Conventional mapping requires a large number of plants to pinpoint genetic variants. For example, Wu et al. (2007) isolated the chlorophyll-deficient gene ygl1 in rice from over 12,000 F2 plants through map-based cloning. In contrast, MutMap analysis needs only a small number of plants to identify the genetic variants, as illustrated in this study. Here we rapidly identified 23 homozygous G to A or C to T mutations in the mutant pool. Among them, only one non-synonymous mutation (SNP6G9285631) on chromosome 6 co-segregated with the mutant phenotype, and the mutation resulted in an amino acid change (P203S) in the gene Csa6M133820. Since the other eight intergenic SNPs on chromosome 6 were over 2 kb away from the flanking genes, they are expected to have at most a marginal effect on gene function. Thus SNP6G9285631 was considered the causative SNP for the mutant phenotype. We analyzed the genome data of 18 lines with light green fruits among 115 cucumber core collection lines, and the SNP genotype was wild type in these 18 lines (unpublished data). Therefore, the mutation identified in this study represents a new locus responsible for light green color in cucumber.
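
The goodness-of-fit test against a 3:1 ratio can be reproduced from the 13:83 (light green/normal green) F2 counts reported in the Results. The sketch below is a plain chi-square test with one degree of freedom and no continuity correction; the function name is hypothetical, and the P value is obtained from the identity between the chi-square survival function with one degree of freedom and the complementary error function.

```python
import math


def chi_square_3_to_1(n_normal: int, n_mutant: int) -> tuple:
    """Chi-square statistic and P value (df = 1) for a 3:1 normal:mutant expectation."""
    total = n_normal + n_mutant
    expected_normal = 0.75 * total
    expected_mutant = 0.25 * total
    chi2 = ((n_normal - expected_normal) ** 2 / expected_normal
            + (n_mutant - expected_mutant) ** 2 / expected_mutant)
    # For one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_3_to_1(n_normal=83, n_mutant=13)
print(round(chi2, 2), p_value < 0.05)
```

With 96 plants the expected counts are 72 normal and 24 mutant, which gives the χ² value of 6.72 quoted above.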

Csa6M133820 encodes a Ycf54-like protein. Ycf54 has recently been implicated in the cyclase step of chlorophyll biosynthesis (Hollingshead et al. 2012; Bollivar et al. 2014). In this step, Mg-protoporphyrin IX methylester (MgPMe) cyclase catalyzes the incorporation of atomic oxygen into MgPMe to form the Chla precursor protochlorophyllide, which results in an isocyclic ring (Chen 2014). The ring structure is the unique feature of chlorophylls compared with all other tetrapyrroles (Chen 2014). The formation of the isocyclic ring involves a series of reactions that may be catalyzed by a multi-subunit enzymatic complex encompassing at least one soluble and three membrane-bound components (Rzeznicka et al. 2005; Bollivar et al. 2014). Ycf54 is a membrane-bound component of the cyclase complex, which was first identified in the cyanobacterium Synechocystis 6803 and then further characterized in barley (Hollingshead et al. 2012; Bollivar et al. 2014). Mutation of Ycf54 led to a decrease of Chla in Synechocystis 6803 (Hollingshead et al. 2012). In our study, Chla production was also reduced in the Csycf54 mutant, but Chlb content showed more severe depletion. This might be because Chlb can be converted to Chla, but not the opposite (Chen 2014).

The cucumber CsYcf54 mutant, showing light green leaves and fruits, allows breeding of varieties for different consumer preferences. A detailed study of CsYcf54 would contribute to a better general understanding of chlorophyll biosynthesis.