Introduction

Spinach downy mildew, caused by Peronospora farinosa f. sp. spinaciae (Pfs) Byford (= P. effusa), is one of the most destructive diseases of spinach worldwide (Qian et al. 2016). A major outbreak of downy mildew can significantly reduce spinach yield and quality, thus posing a serious threat to production. Developing new resistant cultivars is the most successful approach for managing spinach downy mildew (Correll et al. 1994).

A number of downy mildew resistance genes have been identified in other plants. A total of 28 downy mildew resistance genes (designated RPP, for Recognition of Peronospora parasitica) have been reported in Arabidopsis, six of which (RPP1, RPP4, RPP5, RPP8, RPP13, and RPP27) have been cloned (Slusarenko and Schlaich 2003; Wang and Cao 2006; Nemri et al. 2010). In addition, the resistance genes Dm3, PI8, and MrRPV1 have been cloned and characterized in lettuce, sunflower, and grapevine, respectively (Meyers et al. 1998; Radwan et al. 2005; Feechan et al. 2013). To date, all the cloned downy mildew resistance genes are members of known plant disease resistance (R) gene classes. R genes encode proteins (R proteins) that play an essential role in defense against many pathogens and pests, including bacteria, viruses, fungi, and oomycetes, as well as nematodes and insects (Dangl and Jones 2001; Meyers et al. 2005). R genes are grouped into five classes according to the structure of the encoded protein. The first and largest class encodes intracellular proteins with a nucleotide-binding site (NBS) and leucine-rich repeats (LRRs). Based on the structure of the N-terminus, these NBS-LRR proteins are divided into two subclasses: TIR-NBS-LRR (TNL) proteins, which have an N-terminal TIR (Toll/mammalian interleukin-1 receptor) domain, and CC-NBS-LRR (CNL) proteins, which have an N-terminal coiled-coil (CC) motif (Dangl and Jones 2001; Meyers et al. 2005). The second class encodes Ser/Thr kinases (S/TK), such as the Pto gene from tomato, which confers resistance to Pseudomonas syringae pv. tomato (Martin et al. 1993). The third class encodes receptor-like kinases (RLKs) with an extracellular LRR domain followed by a transmembrane region and a cytoplasmic Ser/Thr kinase domain, such as the rice Xa21 gene, which confers resistance to Xanthomonas oryzae pv. oryzae (Song et al. 1995). The fourth class encodes receptor-like proteins (RLPs), which resemble RLKs but lack the cytoplasmic Ser/Thr kinase domain. This class contains the tomato Cf-2, Cf-4, Cf-5, and Cf-9 genes that confer resistance to Cladosporium fulvum (Tör et al. 2004). The fifth class encodes proteins with only a putative coiled-coil domain, such as RPW8, a broad-spectrum powdery mildew resistance gene in Arabidopsis thaliana (Dangl and Jones 2001; Xiao et al. 2001). To date, all of the cloned genes conferring resistance to downy mildew pathogens belong to the NBS-LRR class, except for RPP27, which was the first RLP-encoding resistance gene found in Arabidopsis. A total of 139 NBS-LRR genes were identified in the spinach genome; CNL and NL are the most abundant classes, representing 40 and 36% of the NBS-LRR genes, respectively (Xu et al. 2017).

A total of six spinach downy mildew resistance genes/alleles (RPF1-RPF6, for Resistance against Peronospora farinosa) have been identified to date; however, none of them has been cloned (Correll et al. 2011). Sixteen races have been reported since 1824, 12 of which have been identified in the past 20 years (Correll et al. 2011; Plantum 2016; Kubota et al. 2017). Race identification depends on the disease reactions of a set of diverse host differentials, including open-pollinated cultivars, commercial hybrids, and breeding lines. The host differentials contain six resistance genes that confer resistance to multiple races of Pfs. For instance, RPF1 provides resistance to races 1–7, 9, 11, 13, and 15, whereas RPF2 provides resistance to races 1–10 (Irish et al. 2003, 2008; Feng et al. 2014). The codominant molecular marker DM-1, developed from an amplified fragment length polymorphism (AFLP) marker, is closely linked (≈ 1.7 cM) to the RPF1 locus (Irish et al. 2008). In addition, the marker RGA 5B14r, which was identified from a spinach bacterial artificial chromosome (BAC) library, co-segregates with DM-1 and is therefore also linked to the RPF1 locus at a genetic distance of 1.7 cM (Feng et al. 2015). Although the DM-1 marker has been widely used for marker-assisted selection (MAS), its distance from the suspected resistance gene limits its utility. A draft genome of spinach has since been published, and five genes close to the DM-1 marker were deemed to be potential candidate genes for RPF1 (Xu et al. 2017).

Bulked segregant analysis (BSA) is a method for rapidly identifying DNA markers that are tightly linked to regions of interest. Two pools are generated by bulking DNA samples of individuals from a segregating population that show contrasting extreme phenotypes (e.g., resistant and susceptible to a specific disease). Thus, the two bulked DNA samples differ genetically in the region of interest but otherwise share the same genetic background (Michelmore et al. 1991). BSA has been extensively used for mapping important traits in many crops over the past two decades, such as rice (Venuprasad et al. 2009), maize (Quarrie et al. 1999), and barley (Poulsen et al. 1995). In recent years, the advent of next-generation sequencing (NGS) technologies has led to the establishment of a large number of methods for marker development, including restriction site-associated DNA sequencing (RAD-Seq) (Baird et al. 2008), whole-genome resequencing (WGR) (Xie et al. 2010), genotyping by sequencing (GBS) (Elshire et al. 2011), and specific-locus amplified fragment sequencing (SLAF-Seq) (Sun et al. 2013), the last of which was used in the current study. The combination of BSA and SLAF-Seq is regarded as an efficient and powerful approach to map important traits in many plants, such as rice (Zheng et al. 2016), cucumber (Liang et al. 2016), barley (Qin et al. 2015), maize (Xia et al. 2015), and sweet cherry (Jin et al. 2016).

In this study, the resistant inbred line 12S3, a female and recurrent parent containing the resistance gene at the RPF1 locus, and the susceptible inbred line 12S4, a male parent lacking the resistance gene at the RPF1 locus, were used as parents. The objectives of the present study were to fine map the RPF1 locus and identify the candidate genes via BSA-SLAF-Seq technology in the BC1 and F2 populations derived from the crosses of F1/12S3 and 12S3/12S4, respectively. The results of the current study will play a crucial role in the cloning of the resistance gene and developing disease-resistant cultivars.

Materials and methods

Plant materials

The F1 population was developed from a cross between the inbred lines 12S3 and 12S4 (12S3/12S4). 12S3, the female and recurrent parent carrying the resistance gene at the RPF1 locus, is resistant to 11 races of Pfs (Pfs1–7, 9, 11, 13, 15), whereas 12S4, the male parent, is susceptible to these races. Mapping populations of 148 BC1 and 200 F2 individuals were generated by backcrossing the F1 to 12S3 (F1/12S3) and by self-pollinating F1 plants, respectively. The BC1 population was used for preliminary mapping of RPF1, and the F2 population was inoculated for genetic analysis and used for fine mapping. All of these materials were obtained from the Spinach Research Group, Institute of Vegetables and Flowers (IVF), Chinese Academy of Agricultural Sciences (CAAS).

Inoculation and genetic analysis

The F2 population was planted in commercial potting mix in 28 cm × 42 cm flats, and eight 12S4 plants were grown per flat as a susceptible control. All plants were cultivated in a greenhouse in 2015. Seedlings with two true leaves (25–30-day-old plants) were spray-inoculated with a sporangial suspension (\(2.5 \times 10^{5}\) sporangia/mL) of Pfs race 9, prepared as previously reported (Qian et al. 2016). The inoculum was applied using a sprayer (Fig. S6a). After inoculation, the plants were placed in a dew chamber maintained at 18–20 °C with 100% relative humidity for 24 h in the dark and then transferred to a growth chamber maintained at 18–20 °C with a 12 h photoperiod for 6 days. After these 6 days, the plants were returned to the dew chamber for 24 h in the dark, and the seedlings were then scored as resistant or susceptible as previously described by Correll et al. (2010) (Fig. S6b).

The segregation ratio of the F2 population was analyzed using a chi-square (\(\chi^{2}\)) test with SAS software (version 9.2; SAS Institute Inc., Cary, NC, USA).

DNA extraction and SLAF library construction for high-throughput sequencing

The fresh leaves from the two parents (12S3 and 12S4) and the BC1 individuals were collected and then frozen in liquid nitrogen prior to DNA extraction. Total genomic DNA was extracted using the cetyltrimethyl ammonium bromide (CTAB) method reported by Murray and Thompson (1980). The DNA quality and concentration were assessed by electrophoresis on 1.0% agarose gels and an ND-2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA). DNA samples from the two parents and 148 BC1 individuals were used for SLAF library construction and sequencing.

Initially, a SLAF pre-design experiment was performed to avoid repetitive SLAFs and to obtain an even distribution of SLAFs for maximum SLAF-Seq efficiency. The SLAF library was then constructed according to the pre-design scheme. The restriction enzymes RsaI and HaeIII (New England Biolabs, NEB) were used to completely digest the genomic DNA of each sample separately. After digestion, a single adenine overhang and duplex tag-labeled sequencing adapters were added to the DNA fragments. Following this, 20 polymerase chain reaction (PCR) cycles were conducted to obtain a sufficient concentration of DNA fragments to meet the sequencing requirements. The PCR products were then purified and pooled, and the pooled sample was separated by 2% agarose gel electrophoresis. DNA fragments of 364–414 bp were excised, gel-purified, and diluted for paired-end sequencing. Finally, SLAF-Seq was performed on the Illumina HiSeq 2500 platform (Illumina, Inc.; San Diego, CA, USA), and single-nucleotide polymorphism (SNP) genotyping and evaluation were then performed (Sun et al. 2013).

Analysis of SLAF-Seq data and association analysis

The SLAF-Seq data were analyzed using the procedures detailed in Sun et al. (2013). All SLAF paired-end reads with clear index information were clustered based on sequence similarity, which was detected using one-to-one alignment in BLAT (Kent 2002). Sequences with over 95% identity were grouped into one SLAF locus. SLAFs with two, three, or four tags were identified as polymorphic SLAFs.

Based on the genotypes of the BC1 population as assessed using marker DM-1, the sequencing data of 50 homozygous resistant (RR) individuals and 50 heterozygous resistant (Rr) individuals were combined to form the RR pool and the Rr pool, respectively. Association analysis was subsequently performed using the Euclidean distance (ED) method, followed by Loess regression (local polynomial of degree 1) using the corrected Akaike information criterion (AICc), to identify the regions likely to contain the causal mutation and to generate a list of trait-related candidate regions. The ED at each site was calculated as follows:

$${\text{ED}} = \sqrt {\left( {A_{\text{RR}} - A_{\text{Rr}} } \right)^{2} + \left( {T_{\text{RR}} - T_{\text{Rr}} } \right)^{2} + \left( {C_{\text{RR}} - C_{\text{Rr}} } \right)^{2} + \left( {G_{\text{RR}} - G_{\text{Rr}} } \right)^{2} }$$

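Here, \(A_{\text{RR}}\), \(T_{\text{RR}}\), \(C_{\text{RR}}\), and \(G_{\text{RR}}\) denote the depths of the four bases in the RR pool, while \(A_{\text{Rr}}\), \(T_{\text{Rr}}\), \(C_{\text{Rr}}\), and \(G_{\text{Rr}}\) denote the depths of the four bases in the Rr pool. In theory, the higher the ED value at a site, the more closely that site is linked to the target locus (Geng et al. 2016).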
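As an illustration of the ED formula above, the following minimal Python sketch computes the ED at a single site from per-base depths in the two pools. The function name and the depth values are hypothetical and are not taken from the sequencing data of this study; they only mimic a BC1 design in which the RR pool is fixed for one base and the Rr pool carries both bases.

```python
import math

BASES = ("A", "T", "C", "G")

def euclidean_distance(depth_rr, depth_het):
    """ED at one site; each argument maps a base to its read depth in a pool."""
    return math.sqrt(
        sum((depth_rr.get(b, 0) - depth_het.get(b, 0)) ** 2 for b in BASES)
    )

# Hypothetical depths (assumption, for illustration only).
# Linked site: RR pool carries only A, Rr pool carries A and G in equal depth.
ed_linked = euclidean_distance({"A": 40}, {"A": 20, "G": 20})
# Unlinked site: both pools show nearly the same base composition.
ed_unlinked = euclidean_distance({"A": 20, "G": 20}, {"A": 21, "G": 19})

print(round(ed_linked, 2), round(ed_unlinked, 2))
```

Under these assumed depths the linked site yields a much larger ED than the unlinked site, which is the contrast that the subsequent Loess fit and threshold are meant to pick up.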

Fine mapping of the RPF1 locus

To narrow down the RPF1 region, Kompetitive Allele Specific PCR (KASP) assay primers were designed for the polymorphic SLAF markers in the predicted region and used to screen recombinants among the susceptible individuals of the F2 population. The codominant marker DM-1 was also converted into a KASP assay (DM-1-43), based on the sequence information of the dominant and recessive bands. All KASP primers were designed by LGC (Shanghai, China). The primer sequences of the 22 markers are shown in Table S1. KASP was performed in a total reaction volume of approximately 5 µL containing 2.5 µL 2 × KASP Master mix, 0.07 µL KASP Assay mix, and 2.5 µL genomic DNA diluted to 20–30 ng/µL. The reactions were run in 384-well PCR plates, and two no-template controls (NTCs) were included on each plate. All amplifications were performed on a Veriti 384 Well Thermal Cycler (Applied Biosystems, Foster City, CA, USA) under the following conditions: 15 min at 94 °C; ten touchdown cycles of 20 s at 94 °C and 60 s at 61 °C (dropping 0.6 °C per cycle), reaching a final annealing temperature of 55 °C; and a further 26 cycles of 20 s at 94 °C and 60 s at 55 °C. Plates were read below 40 °C in a 7900 HT Fast Real-Time PCR System (Applied Biosystems), and the data were analyzed using SDS2.3 software (supplied by Applied Biosystems). When sufficiently defined genotype clusters were not obtained after the initial KASP thermal cycles, three additional cycles of 94 °C for 20 s and 57 °C for 60 s were run, after which the data were assessed again. The additional thermal cycling and plate reading were repeated until defined genotype clusters were obtained.
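The touchdown arithmetic in the KASP program can be checked with a short sketch. The Python function below is hypothetical and assumes that the first touchdown cycle anneals at 61 °C and that each subsequent touchdown cycle is 0.6 °C lower, so that the 26 standard cycles start at 55 °C; the thermal cycler itself may implement the decrement slightly differently.

```python
def touchdown_schedule(start=61.0, drop=0.6, touchdown_cycles=10, final_cycles=26):
    """Return the annealing temperature (deg C) of every cycle, in order."""
    # Touchdown phase: start, start - drop, start - 2 * drop, ...
    temps = [round(start - drop * i, 1) for i in range(touchdown_cycles)]
    # Standard phase: held at the temperature reached after the last drop.
    final_temp = round(start - drop * touchdown_cycles, 1)
    return temps + [final_temp] * final_cycles

schedule = touchdown_schedule()
print(len(schedule), schedule[0], schedule[9], schedule[10])
```

With the default values this gives 36 cycles in total, with the tenth cycle annealing at 55.6 °C and all later cycles at 55.0 °C, consistent with the stated final annealing temperature.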

Analysis of candidate genes

Based on the spinach genome annotations in SpinachBase (http://www.spinachbase.org/cgi-bin/spinach/tool/download.cgi), all R genes in the final mapping regions were selected as candidate genes.

Resequencing data were obtained from five spinach lines, comprising three susceptible lines (10S2, 12S1, and 12S4) and two resistant lines (12S2 and 12S3). Prior to mapping, low-quality reads were filtered according to the method of Cheng et al. (2016). The high-quality reads were mapped to the reference genome using the Burrows-Wheeler Aligner (BWA) software (version 0.7.15; parameter “mem”). SNPs and InDels were detected in the genomes of the five lines using SAMtools software (version 1.3.1) with the following parameters: (1) mpileup -q 20 -Q 30 -ugf ref.fa and (2) bcftools view -p 0.9 -cg. Based on these variations (SNPs and InDels), the sequences of the 14 candidate genes in the resistant and susceptible lines were translated into proteins using Open Reading Frame Finder (ORF Finder, parameter “default”) (https://www.ncbi.nlm.nih.gov/orffinder/). The amino acid sequences and conserved domains of the 14 candidate genes in the resistant and susceptible lines were then compared. Finally, genes exhibiting conserved domain variation were considered the best candidate genes for the RPF1 locus. Alignment of amino acid sequences and conserved domain analysis were performed using MEGA7 software (parameter “ClustalW”) and the online tool SMART (http://smart.embl-heidelberg.de/) (parameter “default”), respectively.

Expression analysis

Relative expression of the candidate genes was quantified by quantitative real-time PCR (qRT-PCR) using a QuantStudio™ 12 K Flex Real-Time PCR System (Applied Biosystems). Leaf material was harvested from two resistant (12S2 and 12S3) and three susceptible (12S1, 12S4, and 10S2) lines inoculated with Pfs9 at the end of the inoculation test, when all of the susceptible plants showed disease symptoms. RNA was extracted using TRIzol reagent (Thermo Fisher Scientific Inc.), and cDNA was synthesized using a TranScript One-Step gDNA Removal and cDNA Synthesis Kit (TransGen Biotech, Beijing, China). The primer sequences used in the qRT-PCR analysis are shown in Table S1. Relative expression was normalized to the reference gene G6PDH (glucose-6-phosphate dehydrogenase). qRT-PCR was performed in a 10-µL reaction volume containing 70 ng template cDNA, 0.2 µM of each primer, 0.2 µM ROX Reference Dye II, 3.4 µL ddH2O, and 5 µL 2 × SYBR Fast qPCR Mix (TaKaRa, www.takara-bio.com/). The reactions were performed in a 384-well plate, and all gene expression analyses were conducted with three independent biological replicates and three technical replicates for each sample. The thermal cycling conditions were as follows: 50 °C for 2 min, 95 °C for 10 min, followed by 35 cycles of 95 °C for 15 s and 60 °C for 1 min. The relative gene expression levels were determined using the \(2^{{ - \Delta \Delta C_{T} }}\) method (Livak and Schmittgen 2001).
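The \(2^{{ - \Delta \Delta C_{T} }}\) calculation can be sketched as follows. This Python example is only illustrative: the function name, the choice of a susceptible line as calibrator, and all \(C_{T}\) values are assumptions and do not correspond to the measurements of this study.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene by the 2^(-ddCT) method.

    ct_target / ct_reference: CT of target and reference gene in the sample.
    ct_target_cal / ct_reference_cal: the same two CT values in the calibrator.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)

# Hypothetical CT values (assumption): target gene vs. G6PDH in a resistant
# sample, with a susceptible line used as the calibrator.
fold_change = relative_expression(
    ct_target=24.0, ct_reference=20.0,
    ct_target_cal=27.0, ct_reference_cal=20.0,
)
print(fold_change)
```

In this assumed case the target gene reaches threshold three cycles earlier in the resistant sample after normalization, which corresponds to an eightfold higher relative expression.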

Results

Phenotypic and genetic analysis

To assess the inheritance of resistance to Pfs in 12S3, crosses were made between 12S3 and 12S4. All the F1 plants were resistant to downy mildew. In the F2 population, 153 of 200 individuals were resistant and 47 were susceptible, and the chi-square test showed that this segregation was consistent with a 3:1 ratio (Table 1). These results indicate that the resistance to Pfs in 12S3 is controlled by a single dominant gene at the locus designated RPF1 in a previous report (Irish et al. 2008).
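The goodness-of-fit test for the F2 counts reported above can be reproduced with a few lines of Python. This is a minimal sketch rather than the SAS procedure used in the study; the function name is hypothetical, and the p value is obtained from the closed form of the chi-square distribution with one degree of freedom.

```python
import math

def chi_square_3_to_1(resistant, susceptible):
    """Chi-square goodness-of-fit test against a 3:1 ratio (1 degree of freedom)."""
    total = resistant + susceptible
    observed = (resistant, susceptible)
    expected = (0.75 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = chi_square_3_to_1(153, 47)
print(round(chi2, 2), round(p_value, 2))
```

With 153 resistant and 47 susceptible plants, the expected counts are 150 and 50, giving a statistic of 0.24, well below the critical value of 3.84 at the 0.05 level for one degree of freedom, so the 3:1 hypothesis is not rejected.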

Table 1 Genetic analysis of resistance to spinach downy mildew in the F2 population

Fine mapping of RPF1

A total of 4080 high-quality polymorphic SLAFs had been obtained to construct a high-density genetic map of spinach (Qian et al. 2017). Association analysis was then performed by Euclidean distance using the sequencing data of two pools (Rr and RR). The association threshold was 0.01, and 21 SLAF markers on LG1, from 0 to 10.8 cM, were significantly correlated with the target gene. After aligning sequences of the 21 SLAF markers and DM-1 to the spinach genome, these SLAF markers and DM-1 were mapped on chromosome 3 from 0.05 to 1.77 Mb. The results of the mapping and association analysis are shown in Table 2 and Fig. 1a, respectively.

Table 2 Position of 22 markers closely associated with the RPF1 locus on the genome
Fig. 1 Mapping analysis of RPF1. a Association analysis to locate the RPF1 locus to a 10.8 cM region on LG1. X-axis = linkage group; Y-axis = ED value; red line indicates the threshold value. b Fine-mapping analysis delimited RPF1 to a 0.89 Mb interval flanked by the SNP markers KM256645 and KM1179193. Twelve recombinants are shown. The black segments represent the resistant 12S3 genotype, while the white segments indicate the susceptible 12S4 genotype. Phenotype scores (R = resistant, S = susceptible) are indicated on the right (color figure online)

To further refine the mapping region, the 22 markers were used to screen recombinants from the susceptible individuals of F2 population. Of these 22 markers, six markers were not available due to lack of segregation (KM3020126, KM1091570, and KM3712903), distorted segregation (KM2531431), and a lack of signal (KM1813691 and KM1591384). Eventually, 12 recombinants were obtained and the RPF1 candidate region was reduced to a 0.89 Mb interval flanked by marker KM256645 (0.34 Mb) and KM1179193 (1.23 Mb) (Fig. 1b).
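The logic of delimiting the interval with susceptible recombinants can be sketched in Python. Because a susceptible F2 plant must be homozygous for the 12S4 allele at RPF1, any marker at which such a plant carries the 12S3 allele is separated from the gene by a crossover. The function, the genotype strings, and the markers named M_a, M_b, and M_c are hypothetical; only the two flanking markers and their positions are taken from the text, and at most one crossover per side is assumed.

```python
def delimit_interval(markers, recombinants):
    """Return the flanking markers of the region where all plants are 'S'.

    markers: ordered list of (name, position_mb).
    recombinants: dict of plant -> call string, one character per marker,
                  'S' = homozygous 12S4 allele, 'R' = carries the 12S3 allele.
    """
    left, right = -1, len(markers)
    for calls in recombinants.values():
        i = 0
        while i < len(calls) and calls[i] == "R":   # leading run of R calls
            i += 1
        if i > 0:
            left = max(left, i - 1)
        j = len(calls)
        while j > 0 and calls[j - 1] == "R":        # trailing run of R calls
            j -= 1
        if j < len(calls):
            right = min(right, j)
    left_marker = markers[left] if left >= 0 else None
    right_marker = markers[right] if right < len(markers) else None
    return left_marker, right_marker

# Hypothetical marker set and calls (assumption, for illustration only).
markers = [("M_a", 0.05), ("KM256645", 0.34), ("M_b", 0.80),
           ("KM1179193", 1.23), ("M_c", 1.77)]
recombinants = {"rec1": "RRSSS", "rec2": "RSSSS", "rec3": "SSSRR", "rec4": "SSSSR"}

left_marker, right_marker = delimit_interval(markers, recombinants)
interval_mb = round(right_marker[1] - left_marker[1], 2)
print(left_marker[0], right_marker[0], interval_mb)
```

In this toy data set the closest excluded markers on either side are KM256645 and KM1179193, and the distance between their positions reproduces the 0.89 Mb interval.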

Analysis of candidate genes in the candidate region

A total of 177 genes were found within the 0.89 Mb interval on chromosome 3 (Table S2). Based on the annotation of the spinach reference genome database in SpinachBase (http://www.spinachbase.org/cgi-bin/spinach/tool/download.cgi), 14 R genes were identified as the candidate genes at the RPF1 locus (Table 3).

Table 3 Candidate genes within the region on chromosome 3 from 0.34 to 1.23 Mb

Resequencing of 12S2, 12S3, 10S2, 12S1, and 12S4 was performed by Biomarker Technologies (Beijing, China). A total of 151 Mb raw paired reads were obtained, and the low-quality reads were filtered out, resulting in 136 Mb clean reads (Table S3). The total SNPs and InDels of the 14 candidate genes were 17,521 and 96, respectively. The number of variations in the exons and introns of 14 candidate genes is shown in Table 4.

Table 4 SNP and InDel variation in the exons and introns of the 14 candidate genes

Screening of candidate genes

After comparative analysis of the amino acid sequences and conserved domains of the 14 candidate genes in the resistant and susceptible lines, three genes of particular interest, namely Spo12784, Spo12903, and Spo12729, were identified. Both Spo12784 and Spo12903 encode NBS-LRR proteins, and Spo12729 encodes a receptor-like protein. The three genes differ substantially between the resistant and susceptible lines (Fig. 2). For Spo12784, the susceptible lines lacked the LRR_8 domain that was present in the resistant lines, which was ascribed to a codon change from ACA to AAG at position − 1700 in the resistant lines, resulting in an amino acid substitution (from C to L) at position + 561 (Fig. S7). Conversely, for Spo12903, the resistant lines lacked the LRR_8 domain that was present in the susceptible lines. This was attributed to a single nucleotide substitution (from A to G) at position − 720 in the resistant lines, resulting in threonine being substituted for isoleucine at position + 1081 (Fig. S8). As for Spo12729, a signal peptide domain was present in the resistant lines, whereas a transmembrane region was present in the susceptible lines. This was due to a codon change from AAT to GGT at position + 529 in the resistant lines, which replaced asparagine with glycine at position + 177 (Fig. S9). Additionally, the number of LRR domains differs between the resistant and susceptible lines, but it remains unclear which SNP determines this variation.
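The relationship between the reported codon changes and amino acid substitutions can be checked against the standard genetic code. The Python sketch below is illustrative only: the helper names are hypothetical, and the reading of a negative position as a reverse-strand coordinate is an assumption that is not stated in the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(codon):
    """One-letter amino acid for a DNA codon."""
    return CODON_TABLE[codon.upper()]

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def codon_number(nt_position):
    """1-based codon index containing a 1-based coding nucleotide position."""
    return (nt_position - 1) // 3 + 1

# Spo12729: AAT -> GGT at coding position 529.
spo12729 = (translate("AAT"), translate("GGT"), codon_number(529))

# Spo12784: ACA -> AAG, read on the reverse strand (assumption).
spo12784 = (translate(reverse_complement("ACA")),
            translate(reverse_complement("AAG")))

print(spo12729, spo12784)
```

For Spo12729 the sketch gives asparagine (N) to glycine (G) in codon 177, matching the text. For Spo12784, translating the reverse complements of ACA and AAG gives C and L, which agrees with the reported C-to-L change only under the reverse-strand assumption.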

Fig. 2 Conserved-domain comparative analysis of the best candidate genes and other downy mildew resistance genes. PI8 is the downy mildew resistance gene from sunflower, and RPP8, RPP13, and RPP27 are Arabidopsis downy mildew resistance genes. PI8, RPP8, RPP13, Spo12784, and Spo12903 encode NBS-LRR proteins, while RPP27 and Spo12729 encode RLPs. The proteins encoded by Spo12784, Spo12903, and Spo12729 in the resistant and susceptible lines differ substantially in domain type and LRR number. The red box with the letter “A” inside indicates the signal peptide domain, and the blue box with the letter “B” inside indicates the transmembrane region (color figure online)

The remaining candidate genes were excluded using the same approach. The three candidate genes Spo12762, Spo12719, and Spo12765 possess identical amino acid sequences between the resistant and susceptible lines (Fig. S10, Fig. S11, and Fig. S12). Thus, the three candidate genes constituted highly conserved proteins and were not the best candidate genes at the RPF1 locus.

The candidate gene Spo12916 encodes an RLK with an extracellular receptor domain, a transmembrane domain, and an S/TK domain. Several amino acid substitutions were observed, but the conserved domains did not differ between the resistant and susceptible lines (Fig. S13). Furthermore, RLKs can be divided into RD and non-RD kinases (R and D indicate arginine and aspartic acid, respectively), based on the presence or absence of a conserved arginine (R) residue preceding the DxxxxN motif (de Oliveira et al. 2014). To date, most non-RD kinases have been found to be involved in plant immune recognition, whereas RD kinases are thought to play an important role in other processes, such as development (Dardick and Ronald 2006; Hurni et al. 2015). As Spo12916 encodes an RD kinase, it was eliminated from the analysis.

The candidate gene Spo12821 encodes a 1275-aa-predicted CC-NBS-LRR protein with homology to a Beta vulgaris putative disease resistance protein RGA4 (GenBank accession No. XP_010690548). However, the resequencing data showed that the candidate gene was a heterozygote in the susceptible inbred line 10S2, suggesting that the gene was not the candidate resistance gene at the RPF1 locus.

The remaining six genes, Spo12773, Spo12730, Spo12736, Spo12793, Spo12905, and Spo12908, encode RLK, RLP, NBS-LRR, S/TK, CC-NBS-LRR, and d1m0za_ domains, respectively. All of these genes possessed several mutations in the coding region, but their conserved domains did not differ substantially between the resistant and susceptible lines (Fig. S13). Therefore, none of these genes was considered a best candidate resistance gene at the RPF1 locus.

Expression analysis of the RPF1 candidate genes

To clarify the transcription patterns of the three candidate genes in the inoculated resistant and susceptible lines, qRT-PCR analysis was performed to measure the transcript levels in the two resistant and three susceptible lines. The results are summarized in Fig. 3. Spo12903 was expressed at significantly higher levels in the two resistant lines, 12S2 and 12S3, than in the three susceptible lines, 12S1, 12S4, and 10S2 (Fig. 3a). Similarly, the relative expression of Spo12784 was 2.6–31.6 times higher in the two resistant lines than in the three susceptible lines (Fig. 3c). Surprisingly, the transcript levels of Spo12729 did not differ significantly between the inoculated resistant and susceptible lines (Fig. 3b).

Fig. 3 Relative expression levels of the three candidate genes in the inoculated resistant and susceptible lines. a–c Relative expression of Spo12903, Spo12729, and Spo12784, respectively. The error bars represent the standard deviation of three biological replicates, and different letters on the top of the bars indicate significant differences (Student’s t test, p < 0.01). Relative expression was calculated using the \(2^{{ - \Delta \Delta C_{T} }}\) method

Discussion

Analysis of candidate genes

Downy mildew constitutes one of the most destructive spinach diseases, resulting in significant reductions in yield and quality, thus posing a severe global threat to spinach production. Therefore, resistance to downy mildew is the major focus of all major spinach breeding programs (Correll et al. 1994). Sixteen races have been identified since 1824, 12 of which have been described in the past 20 years (Correll et al. 2011; Plantum 2016; Kubota et al. 2017). To date, six loci (RPF1-RPF6) are known to be responsible for downy mildew resistance in spinach, but the resistance genes have not been identified or cloned. Recently, a draft genome of spinach was published. The codominant marker DM-1, tightly linked to the RPF1 locus, was mapped on the spinach genome, and five NBS-LRR genes (Spo12736, Spo12784, Spo12903, Spo12905, and Spo12821) close to the marker were identified as potential candidate resistance genes at the RPF1 locus (Xu et al. 2017). This finding provided a reference for the further mapping of RPF1. However, two major deficiencies of the approach have been noted. Firstly, fine mapping of RPF1 was not performed using any mapping population. Secondly, only NBS-LRR genes were selected as the candidate genes. There are five classes of R genes, and only two classes (NBS-LRR and RLP) of R genes have been reported to have resistance against downy mildew. Although the majority of downy mildew resistance genes encode NBS-LRR, there is the possibility that other classes of R genes may be involved in downy mildew resistance. Thus, the other R gene classes may have been overlooked.

In the present study, we mapped RPF1 to an interval of 0.89 Mb on chromosome 3 using 148 BC1 and 200 F2 individuals. A total of 14 candidate genes were identified within this 0.89 Mb region of the spinach genome. To avoid missing candidate genes, chromosomal synteny analysis of the RPF1 region in spinach and sugar beet was performed using MCScanX (Wang et al. 2012). The results showed that the RPF1 region in spinach shares striking colinearity with Bvchr3 of sugar beet (Fig. 4). A total of 10 R genes were identified in the syntenic region of sugar beet, eight of which were colinear with some of the 14 candidate genes in spinach. The results of the synteny analysis are shown in Table S4, and the 10 R genes in the syntenic region of sugar beet are shown in Table S5.

Fig. 4 Synteny analysis of the candidate region of RPF1 in spinach and sugar beet. The red box annotated with “Sochr3” indicates chromosome 3 of spinach. The blue box annotated with “Bvchr3” represents chromosome 3 of sugar beet, while the blue boxes annotated with “Bv3_unsca004” and “Bv3_unsca003” represent the chromosome 3 unlocalized genomic scaffolds 004 and 003 of sugar beet, respectively. The numbers on both sides of the chromosomes represent the physical location (color figure online)

Furthermore, conserved domain analysis of the 14 candidate genes indicated that five genes (Spo12736, Spo12784, Spo12903, Spo12905, and Spo12821) contain the NBS-LRR domain; Spo12729 and Spo12730 encode RLPs; another two genes (Spo12773 and Spo12916) encode RLKs; and Spo12762, Spo12719, Spo12765, and Spo12793 encode a Ser/Thr kinase domain (Table 3). As most downy mildew resistance genes encode NBS-LRR proteins or RLPs, the candidate genes possessing NBS-LRR or RLP domains were regarded as the most important. To discriminate among the candidates, the amino acid sequences and conserved domains of the 14 candidate genes were compared between the resistant and susceptible lines, and genes exhibiting conserved domain variation were regarded as important candidates. Ultimately, Spo12729, Spo12784, and Spo12903 were selected as the best candidate resistance genes at the RPF1 locus.

Additionally, expression levels of the three genes in two inoculated resistant and three inoculated susceptible lines were analyzed by qRT-PCR. The results indicated that Spo12784 and Spo12903 were expressed at higher levels in the inoculated resistant lines than in the susceptible lines, whereas the transcript levels of Spo12729 showed no obvious difference (Fig. 3). However, further analysis is required to confirm the candidate genes, including transformation of the three genes into susceptible plants.

Homology analysis of the best candidate genes and known downy mildew resistance genes

Spo12784 and Spo12903 encode NBS-LRR proteins of 1126 and 1206 aa, respectively, and Spo12729 encodes a 634-aa RLP. In this study, the sequences of eight known downy mildew resistance genes, namely RPP1, RPP4, RPP5, RPP8, RPP13, and RPP27 from Arabidopsis, and Dm3 and PI8 from lettuce and sunflower, respectively, were obtained from the Plant Resistance Genes database (http://prgdb.crg.eu/old.php). To determine the homology between the three candidate genes and these downy mildew resistance genes, their amino acid sequences were compared using the online tool BLASTP (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome). The alignment results showed that among the eight known downy mildew resistance genes, only RPP8, RPP13, and PI8 shared notable homology with Spo12784 and Spo12903. Specifically, Spo12784 shared 28, 27, and 25% identity with RPP8, RPP13, and PI8, respectively, and Spo12903 shared 28, 33, and 28% identity with RPP8, RPP13, and PI8, respectively. In contrast, only RPP27 shared notable homology (27% identity) with Spo12729. Additionally, Spo12903 grouped with Spo12784, and Spo12729 grouped with RPP27, in the phylogenetic tree (Fig. 5). Based on the comparative analysis of the conserved domains in Fig. 2, Spo12784, Spo12903, PI8, RPP8, and RPP13 possess a similar structure containing the NBS-LRR domain, and Spo12729 and RPP27 have a similar structure with a signal peptide and LRR domains.

Fig. 5 Cluster analysis of the three candidate genes and other downy mildew resistance genes. The color indicates the bootstrap value. Red represents the minimum bootstrap value, and green represents the maximum bootstrap value (color figure online)

The NBS-LRR class can be further divided into CNL and TNL subclasses according to the structure of the N-terminal domain. TNL genes are only observed in dicotyledons, such as the N and M genes from tobacco and flax, respectively, while CNL genes are found in both monocotyledons and dicotyledons, such as the Lr1 and Lr10 genes in monocotyledonous wheat, and RPP8 and RPP13 in the model eudicot A. thaliana (She et al. 2017). Spinach is a eudicot, and 139 NBS-LRR genes were identified in the spinach genome. The major NBS-LRR gene groups in spinach are CNL and NL, representing 40 and 36% of the NBS-LRR genes, respectively, and only one TNL gene was observed in the spinach genome (Xu et al. 2017). Spo12784 and Spo12903 encode a CNL and an NL protein, respectively. RPP8, RPP13, and PI8 also encode CNL proteins. Therefore, compared with the TNL (RPP1, RPP4, and RPP5) and RLP (RPP27) class R genes, the CNL class R genes shared the highest homology with Spo12784 and Spo12903. Additionally, both Spo12729 and RPP27 encode RLPs. Thus, RPP27 shared the highest homology with Spo12729, and the two genes clustered together in the phylogenetic tree (Fig. 5).

Based on candidate gene variation, SNP markers were developed for MAS

In recent years, molecular markers have been widely used in many crop breeding programs (Devran et al. 2015). Compared with traditional breeding methods, MAS can improve breeding efficiency and accelerate breeding progress. Therefore, the development of markers tightly linked to the gene of interest is important for breeders. KASP is considered an adaptable, efficient, and effective approach to SNP genotyping and has been widely applied to many plants, including cotton, wheat, and potato (Graves et al. 2016).

In the present study, three crucial amino acid mutations of Spo12784, Spo12903, and Spo12729 were obtained via comparative analysis of the conserved domain in the resistant and susceptible lines. Based on the amino acid mutation, three SNP variants of the candidate genes were identified that can be developed as KASP markers for MAS. This would be valuable in cloning the RPF1 alleles and evaluating how each may function.

Conclusion

In this study, a SLAF-Seq approach combined with BSA was successfully used to locate the RPF1 gene in spinach. A localized region of 10.8 cM with 21 SLAF markers on LG1 was detected to be tightly associated with the RPF1 locus. A 1.72 Mb region on chromosome 3 was obtained by aligning the 21 SLAF markers and the DM-1 marker close to RPF1 in the spinach genome. To narrow down the candidate region, the 22 markers were developed using KASP assay primers to screen recombinants from an F2 population. Ultimately, a 0.89-Mb region with 14 candidate R genes was discovered. To further evaluate the candidate genes, resequencing of two resistant inbred lines (12S2 and 12S3) and three susceptible inbred lines (12S1, 12S4, and 10S2) was performed. The three most likely candidate genes (Spo12729, Spo12784, and Spo12903) were identified from 14 genes via amino acid sequence analysis and conserved domain analysis between the resistant and susceptible inbred lines. These results provide a basis for cloning the resistance gene at the RPF1 locus and determining the functionality of the resistance gene, as well as for developing suitable markers for spinach breeding.

Author contribution statement

QW and XZ designed the study. SH and QW conducted the experiments. SH, QW, and LZ analyzed the data. SH wrote the manuscript. QW, FC, and CJ made the revision of the manuscript. QW, ZH, and XZ prepared and collected the samples. WX and WJ helped conduct experiments.