Introduction

Cucumber (Cucumis sativus L., 2n = 2x = 14) is one of the most important vegetable crops in the world (Pan et al. 2017) and is widely grown in China in both the field and greenhouses (Yang et al. 2018). Cucumber quality depends on a number of traits, including the density, size, and color of fruit spines (Xie et al. 2018), and decades of cucumber breeding have resulted in four major types based on demand in different regions, namely those with numerous spines, dense spines, or few spines, and those that are smooth. In the USA, consumers prefer cucumbers with few spines and the European market favors smooth fruits, while in China, the north China-type cucumber with dense fruit spines is the most popular (Zhang et al. 2016b). Among dense-spine cucumbers, the fruit spine number varies among cultivars. Thus, there is a need to elucidate the molecular mechanism that controls fruit spine development.

The trichomes of cucumber are multicellular and unbranched, whereas those of Arabidopsis thaliana are unicellular and branched. Nevertheless, Arabidopsis trichomes have been used as a model for studying fruit spine development, and more than 30 different genes have been implicated (Hülskamp et al. 1999; Hülskamp 2004; Schellmann and Hülskamp 2005; Machado et al. 2009; Pattanaik et al. 2014). The initiation of trichomes and the expression of downstream targets are related to the trimeric activator complex MYB-bHLH-WDR (MBW), formed from GLABRA1 (GL1), TRANSPARENT TESTA GLABRA1 (TTG1), and GLABRA3/ENHANCER OF GLABRA3 (GL3/EGL3) (Pattanaik et al. 2014). Negative regulators, namely six single-repeat R3 MYBs (Grebe 2012), and a group of C2H2 zinc finger transcription factors have also been implicated in trichome formation (Gan et al. 2006, 2007). In addition, phytohormones, including gibberellins, cytokinins, and jasmonic acid (JA), play significant roles in Arabidopsis trichome development (Pattanaik et al. 2014).

Quantitative trait loci (QTLs) were mapped for constitutive and induced trichome density in two recombinant inbred line (RIL) populations of Arabidopsis. Twelve candidate genes were mapped onto five chromosomes, and most of them were involved in the initiation and development of trichomes (Bloomer et al. 2014; Pattanaik et al. 2014). Additionally, two micro-RNAs, miR171 and miR156, were shown to control trichome distribution in Arabidopsis (Xue et al. 2014), and recent studies have demonstrated that NTL8, a membrane-associated NAC transcription factor, negatively regulates trichome formation in Arabidopsis by directly activating the R3 MYB genes TRY and TCL1 (Tian et al. 2017).

In contrast, current understanding of genes related to trichome formation and development in cucumber is limited. Map-based cloning has identified three genes, namely tiny branched hair (TBH), microtrichome (MICT), and glabrous mutant (CsGL1), which are allelic and are involved in trichome development but not trichome initiation (Chen et al. 2014; Li et al. 2015; Zhao et al. 2015). Furthermore, csgl1 has been reported to be epistatic to Tu (tuberculate fruit) (Yang et al. 2014). TRIL and CsGL3 have been mapped to the same gene, Csa6G514870, which has a significant influence on trichome differentiation and initiation (Wang et al. 2016; Pan et al. 2015; Cui et al. 2016; Zhang et al. 2016a; Liu et al. 2016). TRIL and CsGL3 may be upstream positive regulators of TBH, MICT, and CsGL1 in the regulation of multicellular trichome development, affecting epidermal cell initiation (Pan et al. 2015). Xie et al. (2018) fine-mapped the numerous spines gene, ns, and showed that ns encodes the auxin transporter-like protein 3. Chen et al. (2016) showed, using a homologous cloning strategy, that CsTTG1 is an important component of a molecular network that regulates fruit blooming and trichome and wart formation in cucumber. Yang et al. (2018) identified a CsMYB6-CsTRY complex, which negatively regulates cucumber fruit trichome formation. However, details on multicellular trichome formation in cucumber and the related genes require further investigation.

In this study, we used three RILs (LM, YR, and HR) and one double haploid (DH) (HP) population to analyze the inheritance of fruit spine density. We also mapped the same major locus fsd6.2 in all four populations. Fine mapping showed that the candidate gene of fsd6.2 is Csgl3, which controls high fruit spine density. To further investigate the domestication history of Csgl3, we used 64 core cucumber germplasms (CG) and 8754 single-nucleotide polymorphisms (SNPs) (coverage: 500 kb) in the regions upstream and downstream of the Csgl3 gene position, which indicated that China is the origin of high and ultra-high fruit spine density cucumber. China also has the highest genetic diversity in fruit spine density. In addition, we detected a novel locus fsd6.1 using GWAS that is consistent with a minor QTL detected in the HR and HP populations. Then, polymorphic SNP alignment showed that fsd6.1 regulates ultra-high fruit spine density. This study thus provides important insights into high and ultra-high fruit spine density formation in cucumber.

Materials and methods

Definition of fruit spine density

Spines were counted within a 1-cm-diameter circle on the upper, middle, and lower surface of each cucumber fruit (Fig. 1a). The average of these counts was used to estimate fruit spine density (FSD) and was used in QTL mapping.

Fig. 1

Variations in the number of fruit spines among cultivars PI183967, XTMC, YD, 65G, and 02245. a Fruit spine counting method and the phenotype of five parents with different fruit spine densities. b The number of fruit spines among five parents. c The origin of the LM, YR, HR, and HP populations and phenotyping environments

Based on the distribution of the average number of spines in the core germplasm (CG), 64 CG lines were categorized into the following three fruit spine density groups (Fig. 4d): low fruit spine density (LFSD, average spine number < 2), high fruit spine density (HFSD, 4 < average spine number < 5), and ultra-high fruit spine density (UFSD, average spine number > 7).
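The density definition and the three grouping thresholds above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the label "unclassified" for averages that fall between the stated ranges is an assumption, since the text defines no group for those values.

```python
from statistics import mean


def fruit_spine_density(upper, middle, lower):
    """Average the spine counts taken at the three sampled fruit positions."""
    return mean([upper, middle, lower])


def classify_fsd(avg_spines):
    """Assign a density group using the thresholds stated in the text.

    Averages between the stated ranges get 'unclassified' (an assumption:
    the text does not define a group for them).
    """
    if avg_spines < 2:
        return "LFSD"
    if 4 < avg_spines < 5:
        return "HFSD"
    if avg_spines > 7:
        return "UFSD"
    return "unclassified"


# Hypothetical counts for one fruit at the upper, middle, and lower positions.
example_group = classify_fsd(fruit_spine_density(8, 9, 7))
```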

Plant materials and mapping populations

Three RILs and one DH line derived from five parental lines were employed in this study (Fig. 1c). These five parental lines contained LFSD (wild cucumber PI183967, north China-type YD, and European-type 65G), HFSD (north China-type XTMC), and UFSD (north China-type 02245) (Fig. 1b).

For association analysis, 64 cucumber CG lines from the 115 cucumber core collections were selected (Qi et al. 2013). The remaining 51 lines in the core collections included ten lines with numerous spines (independent gene control) and 41 cucumber lines (17 Xishuangbanna cucumber, 13 wild cucumber, and 11 others) that either had no fruit spine phenotype because of late flowering or had no seeds available.

Phenotypic data collection and analysis

The number of fruit spines of the three RILs (XTMC × PI183967, YD × XTMC, 65G × 02245), one DH (65G × 02245), and one CG in 17 plastic house trials in two locations was determined for a total of 6 years. Details of the 17 experiments are presented in Table S1. Briefly, the 160 LM RILs (XTMC × PI183967) were grown in plastic houses in Shunyi/Nankou, Beijing, China, in the spring/fall of 2012 and 2013, respectively. The 85 YR RILs (YD × XTMC) were grown in plastic houses in Shunyi/Nankou, Beijing, China, in fall of 2014 and spring/fall of 2015. The 140 HR RILs (65G × 02245) were grown in plastic houses in Nankou, Beijing, China, in the spring/fall of 2016 and 2017, respectively. The 87 DH lines (65G × 02245) were grown in plastic houses in Nankou, Beijing, China, in the spring/fall of 2016 and 2017, respectively. The 64 cucumber core collections were grown in Nankou/Shunyi, Beijing, China, in the spring of 2016 and 2017, respectively. The temperature range data of all environments are shown in Fig. 2e.

Fig. 2

Violin and box plots depicting the phenotypic distribution of mean number of fruit spines among the LM, YR, HR, and HP populations in 15 environments

In each experiment, the fruit spine number was counted on commercially mature cucumber fruits. Five plants per replication were investigated in all trials. Statistical analysis of phenotypic data was performed using SAS v9.3 (SAS Institute Inc., Cary, NC, USA). Pearson’s correlation coefficients among different trials for each population were estimated with the PROC CORR function based on the grand mean of each RIL or DH line in all the experiments.
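As an illustration of the between-trial correlation, the following is a minimal Python sketch of Pearson's coefficient computed from per-line means in two trials. It is a stand-in for the calculation and not the SAS PROC CORR call used in the study; the function name and the example values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of line means."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical mean spine counts of the same four lines in two trials.
trial_a = [1.2, 1.5, 4.4, 4.8]
trial_b = [1.0, 1.7, 4.1, 4.9]
r_ab = pearson_r(trial_a, trial_b)
```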

Linkage map development

Cucumber or melon SSR markers described in Ren et al. (2009), Cavagnaro et al. (2010), and Yang et al. (2012) were used to screen for polymorphisms in crosses between PI183967 and XTMC, XTMC and YD, and 65G and 02245. Polymorphic markers were used to genotype the three RIL populations and the one DH population. DNA extraction, PCR amplification of molecular markers, and gel electrophoresis were performed as described by Li et al. (2011). For each marker, the Chi-square test for goodness of fit was performed against the expected 1:1 segregation ratio in the RIL/DH population. Linkage analysis was conducted using JoinMap 4.0 software. Linkage groups were determined with a minimum LOD score of 4. Genetic distance was calculated using the Kosambi mapping function.
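The two calculations named above, the goodness-of-fit test against a 1:1 ratio and the Kosambi mapping function, can be sketched as follows. This is not the JoinMap implementation; the function names are hypothetical, and the 0.05 significance level behind the critical value is an assumption because the text does not state the level used for the segregation test.

```python
from math import log

# Standard chi-square critical value for 1 degree of freedom at alpha = 0.05
# (the alpha level is an assumption; the text does not state it).
CHI2_CRIT_DF1_P05 = 3.841


def chi_square_1to1(n_a, n_b):
    """Chi-square statistic for observed allele counts against a 1:1 ratio."""
    expected = (n_a + n_b) / 2
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def kosambi_cm(r):
    """Convert a recombination fraction r (0 <= r < 0.5) to map distance in cM."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # Kosambi: d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); times 100 for cM.
    return 25.0 * log((1 + 2 * r) / (1 - 2 * r))


# Hypothetical marker: 100 lines carry one parental allele, 60 the other.
distorted = chi_square_1to1(100, 60) > CHI2_CRIT_DF1_P05
```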

QTL analysis

QTL analysis was performed with the R/qtl software package (http://www.rqtl.org/) (Broman et al. 2003). Using the individual RIL/DH plant data and the means of three replicates, a whole genome scan was performed to map the QTL with composite interval mapping (CIM) procedures (Weng et al. 2015). Genome-wide LOD threshold values (P < 0.05) for declaring the presence of QTLs were determined using 1000 permutations. For each detected QTL, a 1.5-LOD support interval was calculated and defined by left and right markers. The QTLs were named according to chromosome location and trait name as described by Bo et al. (2015). For example, fsd4.1 and fsd6.1 designated the first QTL for fruit spine density on cucumber chromosomes 4 and 6, respectively. QTLs detected from different populations but located at the same position on the same chromosome were designated the same QTL.
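The 1.5-LOD support interval can be illustrated with a short Python sketch that works on a LOD profile sampled along one chromosome. This is a simplification and not the R/qtl routine: it reports the outermost scanned positions whose LOD stays within 1.5 of the peak, without interpolating or expanding to flanking markers. The function name and the example profile are hypothetical.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, peak, right) positions for a drop-LOD support interval.

    The interval grows outward from the peak while the LOD stays within
    `drop` of the peak value (simplified: grid positions only).
    """
    if len(positions) != len(lods) or not lods:
        raise ValueError("positions and lods must be non-empty and equal length")
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]


# Hypothetical LOD profile (positions in cM) around a single peak.
cm = [70, 72, 74, 76, 78, 80, 82]
lod = [3.0, 8.0, 9.2, 10.0, 9.0, 8.4, 2.0]
interval = lod_support_interval(cm, lod)
```

In practice the interval bounds would then be reported as the nearest left and right markers, as described in the text.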

In silico bulked segregant analysis

The association between SNPs and FSD in natural populations was established using the in silico BSA strategy with the aim of narrowing down the region harboring the major QTL locus. Ten HFSD or UFSD CG lines (CG100, CG101, CG107, CG120, CG89, CG95, CG97, CG99, CG57, and CG26) and ten LFSD CG lines (CG10, CG40, CG39, CG6, CG33, CG12, CG14, CG23, CG87, and CG88) were selected for fine mapping (Table S7). The re-sequenced genome data and SNPs of these 20 CG lines are available in the NCBI database (Qi et al. 2013).

For in silico BSA, SNPs between ten lines with HFSD or UFSD and ten lines with LFSD were identified. SNPs between the HFSD or UFSD lines and LFSD lines were considered to be associated with the fsd locus.
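One way to express the comparison between the two bulks is sketched below in Python. The selection rule is an assumption, since the text does not state the exact criterion: a SNP is kept only if every high-density line carries one allele, every low-density line carries a different allele, and no call is missing. The function name, SNP identifiers, and line names are hypothetical.

```python
def bsa_candidate_snps(genotypes, high_lines, low_lines):
    """Return SNP ids that are fixed for different alleles in the two bulks.

    genotypes maps snp_id -> {line_name: allele}; a missing line counts as
    a missing call and disqualifies the SNP (assumed rule).
    """
    hits = []
    for snp_id, calls in genotypes.items():
        high = {calls.get(line) for line in high_lines}
        low = {calls.get(line) for line in low_lines}
        fixed = len(high) == 1 and len(low) == 1
        complete = None not in high and None not in low
        if fixed and complete and high != low:
            hits.append(snp_id)
    return hits


# Toy example with two lines per bulk instead of ten.
toy = {
    "snp_a": {"H1": "A", "H2": "A", "L1": "G", "L2": "G"},  # fixed difference
    "snp_b": {"H1": "A", "H2": "G", "L1": "G", "L2": "G"},  # mixed in high bulk
    "snp_c": {"H1": "T", "H2": "T", "L1": "T", "L2": "T"},  # monomorphic
    "snp_d": {"H1": "C", "H2": "C", "L1": "T"},             # missing call
}
candidates = bsa_candidate_snps(toy, ["H1", "H2"], ["L1", "L2"])
```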

Phylogenetic and geographical distribution analysis

To further understand the origin and evolution of the HFSD fsd6.2 (csgl3) allele, nucleotide variations within the fsd6.2 region were explored in 64 CG lines, including seven UFSDs, six HFSDs, and 51 LFSDs. A total of 8754 SNPs covering 500 kb flanking fsd6.2 were selected and used to construct a phylogenetic tree as described by Bo et al. (2016). Multiple sequence alignment was performed using the Clustal Omega program (http://www.ebi.ac.uk/Tools/msa/clustalo/). MEGA 6.0 was used to construct the phylogenetic tree. The geographical information of the 64 CG lines is presented in Supplemental Table S7. DIVA-GIS software was used to construct the geographical map.

To understand the evolution of the UFSD fsd6.1 allele, nucleotide variations within the fsd6.1 region in the 64 CG lines were examined using 281 SNPs (Supplemental Table S8) and used in phylogenetic tree construction.

Genetic diversity of Csgl3 in CG lines

Zhang et al. (2016a) reported a cucumber molecular marker MM3, developed from the promoter region in Csgl3 that controlled fruit spine density. To further explore the relationship between Csgl3 and the LFSD, HFSD, and UFSD phenotypes, 64 cucumber CG lines were tested using the MM3 marker.

Genome-wide association analyses of fruit spine density

A linear mixed model (LMM) was used for association tests, with an estimated relatedness matrix as covariate. GWAS was conducted, and the genome-wide lowest P value was recorded. The 5% lowest tail was taken from the 200 recorded minimal P values as the threshold for genome-wide significance. The Manhattan map for GWAS was generated using the R package qqman (Turner 2014).
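The empirical threshold described above can be sketched in Python as the lower 5% point of the recorded minimal P values. The function name is hypothetical, and the assumption that the 200 minimal P values come from repeated scans under the null (for example, with permuted phenotypes) is an inference; the text does not say how they were generated.

```python
from math import log10


def empirical_p_threshold(min_p_values, tail=0.05):
    """Return the P value at the lower `tail` fraction of recorded minima."""
    ranked = sorted(min_p_values)
    # Number of recorded minima that make up the lower tail (at least one).
    n_tail = max(1, round(tail * len(ranked)))
    return ranked[n_tail - 1]


# Hypothetical set of 200 recorded genome-wide minimal P values.
recorded = [i / 1000 for i in range(1, 201)]
threshold = empirical_p_threshold(recorded)
threshold_neg_log10 = -log10(threshold)
```

With 200 recorded values and a 5% tail, the threshold is the tenth-smallest minimum; associations with smaller P values would be called genome-wide significant.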

RNA extraction and qRT-PCR

Cucumber ovary samples at the same developmental stage (6 days before flowering) were collected from LFSD, HFSD, and UFSD lines and flash frozen in liquid nitrogen on the same day, with three biological replications of each sample. Total RNA was extracted using an RNeasy Plant Mini Kit (Qiagen, Hilden, Germany), and first-strand cDNA was synthesized using a PrimeScript RT Reagent Kit with gDNA Eraser (TaKaRa, Kyoto, Japan), according to the manufacturers’ instructions. qRT-PCR was conducted using 2 × SG Fast qPCR Master Mix (BBI), and PCR amplification was performed according to the manufacturer’s protocol. The PCR primers were designed with Primer Premier 6.0. Actin (Bo et al. 2016) was employed as the reference gene to correct gene expression values. Each qRT-PCR experiment was performed with three biological replications and three technical replicates. The analysis of candidate gene relative expression data was performed using the 2^(−ΔΔCt) method (Kenneth et al. 2001).
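The relative expression calculation can be written out as a short Python sketch. The function and argument names are hypothetical, and the Ct values in the example are invented for illustration; the reference gene plays the role of Actin and the control sample the role of the line used as the baseline.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt = Ct(target) - Ct(reference) within each sample;
    ddCt = dCt(sample) - dCt(control).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Invented Ct values: target amplifies two cycles earlier in the sample.
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)
```

A value above 1 means higher expression in the sample than in the control after normalization to the reference gene.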

Results

Phenotypic variation of fruit spine density in four populations

Phenotypic data on fruit spine density were collected from the three RIL and one DH populations in 15 experiments that were conducted over a span of 6 years (Fig. 1c, Table S1). The frequency distributions of fruit spine density among the populations from different experiments are illustrated in Fig. 2. All observed distributions showed a clear bimodal distribution, particularly in the YR RILs, which suggested that fruit spine density is controlled by a major QTL. The average number of fruit spines in the HR/HP populations was higher than that of the LM/YR populations (Fig. 2), which may be attributable to the fact that the donor parent 02245 has a significantly higher fruit spine density than XTMC (Fig. 1b).

The fruit spine density correlations among multiple environments were examined for each population, and the Spearman’s rank correlation coefficients (rs) between each environmental pair are presented in Table 1. Strong positive correlations were observed among different environments in each population.

Table 1 Spearman’s rank correlation coefficients among different environments in the LM, YR, HR, and HP populations

For the HR and HP populations in the 2017F trial, the average and maximum numbers of fruit spines were lower than those of the three other trials, which may be attributable to the low temperature during this season (Fig. 2e). Despite this observation, major and minor QTLs were detected in the HR2017F and HP2017F data that coincided with those identified in the other datasets (see below).

Linkage map construction

Cucumber or melon SSR markers and InDel markers were used to screen for polymorphisms and to construct a genetic map. The main statistics of the four maps are presented in Table S2, and detailed information on all mapped markers is shown in Tables S3 (LM), S4 (YR), S5 (HR), and S6 (HP).

A high-density map constructed using the LM population identified two QTLs that were related to fruit spine density (see below). To validate the QTLs controlling fruit spine density in natural cucumbers, three other maps were constructed using the YR, HR, and HP populations for QTL analysis.

Overall, the LM map included 315 markers that spanned 671.09 cM with an average marker interval of 2.25 cM. The YR map comprised 92 markers that spanned 701.90 cM with an average marker interval of 7.90 cM. The HR map included 139 markers that spanned 716.23 cM with an average marker interval of 6.25 cM. The HP map consisted of 86 markers that spanned 887.00 cM with an average marker interval of 13.02 cM (Table S2).

Based on the 9930 genome (V2.0) anchored by these markers, these four maps seemed to physically cover the majority of the cucumber genome, and the marker orders were also highly consistent with their physical locations. Therefore, all four genetic maps were suitable for subsequent QTL mapping.

QTL mapping

The phenotypic data for fruit spine density from all the experiments were used in the QTL analyses. Because there were a few gaps of > 10 cM in size in the genetic map, a whole genome scan was initiated for QTLs with a large window size of 25 cM. Refinement of QTL number and location was then performed with CIM using 10-cM windows. For each population, the LOD threshold to declare the significance of QTLs was determined with 1000 permutation tests (P = 0.05).

A global view of the LOD profiles of all QTLs detected across the seven chromosomes is presented in Fig. 3. Details of each detected QTL, including map location, LOD support value, percentages of total phenotypic variances explained (R2), and 1.5-LOD support interval are shown in Table 2. The chromosomal locations of the major QTLs in the four populations are shown in Fig. 4a. In total, one major QTL and two minor QTLs were detected. The details of these QTLs are described below.

Fig. 3

Whole-genome view of QTL locations for fruit spine number in LM (a), YR (b), HR (c), and HP (d) populations that were detected in 15 experiments (LM2012S, LM2012F, LM2013S, LM2013F, YR2014F, YR2015S, YR2015F, HR2016S, HR2016F, HR2017S, HR2017F, HP2016S, HP2016F, HP2017S, and HP2017F) based on the CIM model in R/qtl. For each population, the X-axis represents the linkage map of seven chromosomes, and the Y-axis is the LOD score; the horizontal line represents the LOD threshold obtained with 1000 permutation tests (P = 0.05)

Table 2 QTL for fruit spine density detected with LM, YR, HR, and HP populations in 15 experiments (LM: 2012S, 2012F, 2013S, 2013F; YR: 2014F, 2015S, 2015F; HR: 2016S, 2016F, 2017S, and 2017F; and HP: 2016S, 2016F, 2017S, and 2017F)

Fig. 4

Identification of the candidate gene for the fsd6.2 locus by combined linkage mapping and association analysis. a The candidate region of fsd6.2 in cucumber chromosome 6 based on four populations. b Distribution of 17 SNPs in 9930 cucumber chromosomes suggests a 50-kb region is associated with variations in fruit spine number in ten low fruit spine density and ten high/ultra-high fruit spine density CG lines. c csgl3 is located in the target region of the fsd6.2 locus. d Sixty-four CG cucumber lines were categorized into three groups (LFSD, HFSD, and UFSD) based on the number of fruit spines. e The MM3 marker reported by Zhang et al. (2016a) co-segregated with the spine density in 64 CG lines. The size of all fragments from the LFSD lines is 2277 bp and that from the HFSD/UFSD lines is 1475 bp. This result indicates that csgl3 is the critical gene from LFSD to HFSD/UFSD. f Expression pattern analysis of the eight predicted genes among LFSD, HFSD, and UFSD cucumbers. **Significant difference (P < 0.01)

Two QTLs, fsd4.1 and fsd6.2, were detected in the LM population. The QTL with largest effect, fsd6.2 (R2 = 44–52%), was identified in the LM2012S, LM2012F, LM2013S, and LM2013F experiments, with a highly consistent peak on the genetic map at 77.3 cM (Fig. 3a and Table 2). The minor QTL, fsd4.1 at 28 cM (R2 = 6–7%), was detected in LM2012S, LM2013S, and LM2013F (Fig. 3a and Table 2).

For the YR population, one QTL, fsd6.2 at 134 cM (R2 = 77–94%), was detected in all experiments, including YR2014F, YR2015S, and YR2015F (Fig. 3b and Table 2).

For the HR population, two QTLs, fsd6.1 and fsd6.2, were detected. The QTL with the largest effect, fsd6.2 (R2 = 43–66%), was identified in HR2016S, HR2016F, HR2017S, and HR2017F, with a highly consistent peak on the genetic map at 100 cM (Fig. 3c and Table 2). The minor QTL, fsd6.1 (R2 = 16–20%), was detected in HR2016S, HR2016F, HR2017S, and HR2017F (Fig. 3c and Table 2).

For the HP population, two QTLs, fsd6.1 and fsd6.2, were detected. The QTL with the largest effect, fsd6.2 (R2 = 37–62%), was identified in HP2016S, HP2016F, HP2017S, and HP2017F, with a highly consistent peak on the genetic map at 100 cM (Fig. 3d and Table 2). The minor QTL, fsd6.1 (R2 = 21–26%), was detected in HP2016S, HP2016F, and HP2017S but not in HP2017F (Fig. 3d and Table 2).

In summary, the major-effect QTL fsd6.2 was detected in all four populations, which indicates that fsd6.2 directly controls high fruit spine density. One additional QTL, fsd6.1, was detected in the HR and HP populations, which may be attributable to the fact that the donor parent 02245 has an ultra-high fruit spine density phenotype (the number of fruit spines of 02245 is significantly higher than that of XTMC) (Fig. 1b). Thus, fsd6.1 possibly controls UFSD together with fsd6.2.

Physical location alignment of the major-effect QTL, fsd6.2

Figure 3 shows that the major-effect QTL fsd6.2 is consistent in four different populations. To verify the QTL fsd6.2, we put the FSD candidate region (1.5-LOD interval) on the 9930 chromosome. Figure 4a shows that the LM, YR, HR, and HP populations share the same physical interval, which suggests that the fsd6.2 in the four populations is the same QTL.

Fine mapping of fsd6.2 using an in silico BSA strategy

Because fsd6.2 shared the same region in four different populations, we conducted in silico BSA to explore the natural populations to narrow down the region harboring the fsd6.2 locus.

In the 2016S and 2017S seasons, FSD data were collected from 64 cucumber CG lines. Among these lines, seven had UFSD, six had HFSD, and the rest had LFSD (Fig. 4d). All the polymorphic SNPs of the CG lines are available in the NCBI database (Qi et al. 2013). Ten UFSD or HFSD CG lines and ten LFSD CG lines were used in the analysis of unique SNPs. A total of 17 unique SNPs were detected in the present study, of which 13 (Fig. 4c, Table S9) were distributed within a ~ 50-kb region (genomic coordinates: 26,180,000–27,630,000) of the 9930 genome (Fig. 4b). Annotation of this region predicted eight genes including the Csgl3 gene (Fig. 4c), which was reported by Zhang et al. (2016a) to control fruit spine density because of a polymorphism in the promoter of Csgl3. Screening the cucumber CG lines with the MM3 marker indicated that all the LFSD lines carried the Csgl3-2,277-bp allele, and all the HFSD and UFSD lines carried the Csgl3-1,475-bp allele (Fig. 4e).

To determine the possible candidate gene, total RNA extracted from LFSD, HFSD, and UFSD lines was used to examine the expression patterns of Csa6G514870 (Csgl3), Csa6G514880, Csa6G514890, Csa6G514900, Csa6G514910, Csa6G514920, Csa6G514930, and Csa6G514940. The gene expression analysis showed that Csa6G514870 (Csgl3) has significantly higher expression in HFSD and UFSD lines than in LFSD lines (Fig. 4f), whereas the other seven genes show no significant difference in expression. These results suggested that Csgl3 might be the best candidate gene of the fsd6.2 locus that controls fruit spine density.

Evolution and geographical distribution analysis of fsd6.2 (Csgl3) in a natural population

The number of fruit spines in all 64 cucumber CG lines was counted, and the CG lines were categorized into five groups based on fruit spine density and country of origin. These lines were 17 UFSD (China), six HFSD (China), 17 LFSD (China), eight LFSD (Japan), and 26 LFSD (other countries).

To gain more information on the evolution of Csgl3 during domestication, the approximate geographical coordinates (longitude and latitude) of the origin of each line (Supplemental Table S7) were used to plot the 64 lines on a world map (Fig. 5b). The accessions from China were highly enriched with the Csgl3-1,475-bp allele (UFSD and HFSD): 57% of the cucumber accessions from China (23 out of 40) carried the Csgl3-1,475-bp allele. All the accessions from the other countries carried the Csgl3-2,277-bp allele (LFSD). These results suggested that the Csgl3-1,475-bp allele (UFSD and HFSD) originated from China. We also observed that China has genetic diversity for cucumber fruit spine density, which may be attributable to its extensive history of domestication or diversifying selection in cucumber breeding.

Fig. 5

Phylogenetic analysis and geographical distribution of the csgl3 gene. a A dendrogram of 64 cucumber CG lines based on 8754 SNPs within a 500-kb region of the csgl3 locus. b Distribution of 64 cucumber CG lines in different continents based on the genotype of the csgl3 locus

To further ascertain the origin of Csfsd6.2, a phylogenetic tree was constructed using 8754 SNPs, encompassing 500 kb of the flanking region of fsd6.2 (Fig. 5a). No distinct nucleotide polymorphisms were observed among the 26 LFSD lines from other countries. In the resulting NJ tree, the 64 lines were placed into four groups, with groups 1 and 2 comprising 27 lines that originated from more than ten different countries that were characterized by LFSD. Group 3 was composed of eight accessions originating from three different countries. All of these were LFSD except for CG95, which is a UFSD. Most UFSD lines and all the HFSD lines were clustered into group 4, which exhibited the highest genetic diversity among the groups. In groups 3 and 4, the LFSD lines from Japan were closely related to the LFSD from China, suggesting that the cucumber in Japan may have been introduced from China. In addition, most of the UFSD and HFSD lines belonged to a very small cluster in group 4. These results further confirmed our hypothesis that the Csgl3-1,475-bp allele (UFSD and HFSD) originated from China.

Candidate gene identification of the novel locus fsd6.1 using GWAS

One novel locus, fsd4.1, was detected in the LM population (Fig. 3a), and another, fsd6.1, in the HR and HP populations (Fig. 3c, d). To verify the novel loci, a GWAS for fruit spine density was performed using a linear mixed model with correction for kinship bias (Supplemental Fig. 1), which also allowed us to assess the potential of GWAS for causal gene identification. One locus was detected that consistently exceeded the significance threshold (−log10P ≥ 8.21) (Fig. 6a). A candidate region was estimated to extend from 19.493 Mb to 19.511 Mb (~ 18 kb) using pairwise LD correlations (r2 ≥ 0.6) (Fig. 6b). Interestingly, this candidate region lies within the fsd6.1 QTL region that was detected in the HR and HP populations (Fig. 6d). This consistent result suggested that the fsd6.1 locus is responsible for fruit spine density, particularly for ultra-high fruit spine density in cucumber.
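The pairwise LD statistic used to delimit the candidate region can be illustrated with a minimal Python sketch. It assumes biallelic SNPs coded 0/1 in homozygous inbred lines, so each line contributes one haplotype; this coding, the function name, and the example genotypes are assumptions and are not taken from the study's pipeline.

```python
def ld_r2(snp_a, snp_b):
    """Squared correlation (r^2) between two biallelic SNPs coded 0/1.

    Each list holds one allele code per inbred line, in the same line order.
    """
    n = len(snp_a)
    if n == 0 or n != len(snp_b):
        raise ValueError("SNP vectors must be non-empty and equal length")
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    # Frequency of lines carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Hypothetical peak SNP and neighbour across six lines.
peak_snp = [0, 0, 0, 1, 1, 1]
neighbour = [0, 0, 0, 1, 1, 1]
in_region = ld_r2(peak_snp, neighbour) >= 0.6
```

Neighbouring SNPs with r2 at or above 0.6 relative to the peak SNP would be kept inside the candidate region under the criterion quoted in the text.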

Fig. 6

GWAS for fruit spine number and identification of the causal gene for the peak detected on chromosome 6. a Manhattan plot for fruit spine number. The dashed line represents the significance threshold (−log10 P = 8.21). Arrowheads indicate the position of the strong peaks that did not localize with the known csgl3 gene investigated in this study. b Local Manhattan plot (top) and LD heatmap (bottom) surrounding the peak on chromosome 6. Dashed lines indicate the candidate region (~ 18 kb) for the peak. c Four genes were predicted in the fsd6.1 candidate region. d The fsd6.1 candidate region detected by GWAS is consistent with that detected in the HR and HP populations

According to the Cucumber Genome Browser (http://www.icugi.org/cgi-bin/ICuGI/index.cgi), four annotated genes are located in the fsd6.1 candidate region (Fig. 6c). Alignment of the polymorphic SNPs between the UFSD and non-UFSD lines indicated five unique SNPs that were associated with UFSD (Fig. 6c, Table S8). Of the five SNPs, three were located in non-genic regions and two were located within the Csa6G421750 gene. We then performed qRT-PCR using LFSD, HFSD, and UFSD lines. The results showed that Csa6G421750 has significantly higher expression in UFSD lines than in LFSD and HFSD lines (Fig. 7a). These results indicated that Csa6G421750 might be the candidate gene of the fsd6.1 locus.

Fig. 7

Phylogenetic analysis of fsd6.1 and the predicted model of cucumber ultra-high fruit spine density formation. a Expression pattern analysis of the four predicted genes among LFSD, HFSD, and UFSD cucumbers. **Significant difference (P < 0.01). b A dendrogram of 64 cucumber CG lines based on 281 SNPs within an 18-kb region of the fsd6.1 locus. c Two unique SNPs located in the candidate gene Csa6G421750 of the fsd6.1 locus. d csgl3 is the critical gene from LFSD to HFSD/UFSD, and the fsd6.1 locus controls the UFSD based on the csgl3 function

Phylogenetic analysis of the fsd6.1 locus

To further understand the relationship between the UFSD and non-UFSD lines, a phylogenetic tree was constructed using 281 SNPs in the fsd6.1 candidate region (Fig. 7b). All seven of the UFSD CG lines clustered into a small unique group located at the end of the tree. The results indicated that fsd6.1 could be a domestication locus under human diversifying selection. To confirm the candidate gene within the fsd6.1 locus, a SNP alignment using all CG lines was conducted. Two SNPs (SNP19509345 and SNP19509981) located in the Csa6G421750 gene were unique to the ultra-high fruit spine density trait (Fig. 7c), which confirmed that Csa6G421750 is the candidate gene of fsd6.1.

Fsd6.1 and fsd6.2 together control ultra-high fruit spine density formation

The major QTL, fsd6.2, was found in all four populations (Fig. 3), suggesting that fsd6.2 is the critical locus for the change from the LFSD to the HFSD phenotype. A stable minor QTL, fsd6.1, was detected in the HR and HP populations (Fig. 3c, d). Because the donor parent 02245 of the HR and HP populations has ultra-high fruit spine density, it is highly likely that the minor QTL fsd6.1 controls UFSD. In addition, the GWAS results provide additional support for this hypothesis (see above). Therefore, two key genes regulate fruit spine density from LFSD to UFSD. The first step, from LFSD to HFSD, is controlled by fsd6.2 (Csgl3). The second step, from HFSD to UFSD, is controlled by fsd6.1 (Csa6G421750) (Fig. 7d).

Discussion

Inheritance and QTL/gene responsible for cucumber fruit spine density

In the present study, three LFSD, one HFSD, and one UFSD natural inbred lines were used to develop three RILs (LM, YR, and HR) and one DH (HP) population to study the inheritance of spine density in cucumbers. Fifteen experiments were conducted over a span of 6 years. The phenotypic data between the different experiments were consistent for the same population (Fig. 2, Table 1). The distribution of fruit spine density among the four populations was bimodal rather than normal (Fig. 2), particularly in the YR RILs, suggesting that fruit spine density is controlled by a major QTL.

A number of cucumber fruit spine-related genes have been identified, such as B for black spine, ns for numerous spines, ss for small spines, s or s1 for spine size and frequency, s2 and s3 for dense and small spines, respectively, and gl (glabrous) and tril (trichome-less) (Call and Wehner 2010; Wang et al. 2016). The gl-2, csgl1, and tril/csgl3 genes that control the presence or absence of trichomes on foliage or fruit have been mapped to chromosomes 2, 3, and 6, respectively (Yang et al. 2011; Li et al. 2015; Pan et al. 2015; Wang et al. 2016; Zhao et al. 2015; Cui et al. 2016). The fs1 gene has been reported to control few spines on cucumber fruits (Zhang et al. 2016a). The black spine locus B has been mapped to the short arm of chromosome 4 (Li et al. 2013). The ss and ns genes that are related to spine number and size have been localized to a cluster of genes (u, D, Fr, te, and Tu) on chromosome 5 that control fruit epidermal features (Fanourakis and Simon 1987; Vakalounakis 1992; Walters et al. 2001; Yuan et al. 2008; Miao et al. 2011; Xie et al. 2018). The present study identified three QTLs controlling fruit spine density, namely fsd4.1, fsd6.1, and fsd6.2 (Fig. 3). Of these, fsd6.2 is the major-effect QTL that was detected in four different populations at the same physical location on the chromosome (Fig. 4a). Fine mapping indicated that fsd6.2 matches fs1 of Zhang et al. (2016a). The candidate gene for the two loci is Csgl3 (Fig. 4c). For the minor QTL fsd6.1, the candidate gene (Csa6G421750) was found using GWAS (Fig. 6). Fsd6.1 is a novel locus that is related to fruit spine density and differs from all genes previously reported to control spine traits, such as ns, ss, s1, s2, s3, and fs1. Our GWAS results also suggest that there may be other loci controlling fruit spine density that have yet to be discovered (Fig. 6a), indicating that genes controlling cucumber fruit spine density are located on different chromosomes.

Fsd6.2 and fsd6.1 control ultra-high fruit spine density

The 64 cucumber CG lines comprised seven UFSD, six HFSD, and 51 LFSD lines, indicating that both UFSD and HFSD materials are important members of the natural cucumber germplasm. To discover the loci controlling UFSD and HFSD, four different populations were developed using lines LFSD (PI183967, YD, and 02245), HFSD (XTMC), and UFSD (02245) (Fig. 1). In the populations derived from the LFSD and HFSD lines (LM and YR), the major locus fsd6.2 was identified as critical for HFSD formation (Fig. 3a, b). In the populations derived from the LFSD and UFSD lines (HR and HP), two QTLs, fsd6.1 and fsd6.2, were detected, suggesting that the UFSD parent line also carries the HFSD allele (Fig. 3c, d).

A geographical map of all the HFSD and UFSD materials among the 64 cucumber CG lines (Fig. 5b) showed that they come from China, suggesting that China is the origin of HFSD and UFSD. This could be due to the long history of cucumber domestication, to diversification through cucumber breeding, or to a combination of both. Phylogenetic reconstruction of fsd6.2 (Fig. 5a) indicated that all the HFSD lines belonged to group 4. However, these lines were dispersed into small clusters, suggesting that many of them have different origins. In addition, most of the UFSD lines were distributed within the same group as the HFSD lines, suggesting that UFSD was derived from HFSD. Phylogenetic analysis of fsd6.1 (Fig. 7b) showed that all UFSD lines are located at the end of the tree and cluster within the same group. These results support the hypothesis that some Chinese LFSD lines changed into HFSD during human domestication and selection because of a mutation within fsd6.2, and that fsd6.1 then changed HFSD into UFSD during modern cucumber breeding.

Ultra-high fruit spine density and numerous spines are controlled by different genes

Ten cucumber CG lines with numerous fruit spines controlled by the single recessive gene ns have been reported by Xie et al. (2018). These lines are not part of the 64 CG lines used in the present study because they are under different genetic control. Cucumbers with numerous spines are a unique group, with an average of more than 40 spines in the unit circle, which is about fourfold higher than in either HFSD or UFSD (Fig. 1b). Xie et al. (2018) fine-mapped ns using an F2 population of 2513 individuals derived from NCG122 (a line with numerous fruit spines) and NCG121 (a line with few fruit spines) and showed that NS encodes auxin transporter-like protein 3. Genetic analysis of the ns gene in natural populations revealed that one InDel within the coding region of ns can distinguish numerous-spines cucumbers from non-numerous-spines cucumbers. Thus, the UFSD lines and cucumbers with numerous spines have different phenotypes, and their spine numbers are most likely controlled by different genes.

Cucumber fruit spine density is one of the most important economic traits for visual appearance. Fruit spines are multicellular, non-glandular trichomes that develop from epidermal cells of the fruit skin and are similar in shape and structure to leaf trichomes (Guan 2008). The development and growth of trichomes on the foliage appear to be under the same genetic control as those on the fruits, because trichomes are absent from all aboveground parts of glabrous fruit mutants (Yang et al. 2011; Li et al. 2015; Pan et al. 2015; Wang et al. 2016; Cui et al. 2016) and present in similar numbers on the aboveground parts of ecotypes with few trichomes. The results presented in this paper will be used in the Chinese cucumber improvement program to produce a range of products that address consumer demand.