Introduction

Clubroot, caused by Plasmodiophora brassicae (Wor.), is an important disease of brassica crops worldwide, especially in temperate regions. It can infect most members of the Cruciferae, including cabbage and Chinese cabbage (Dixon 2006; Piao et al. 2009). Clubroot is an increasing problem on canola (Brassica napus L.) on the Canadian Prairies, where the number of fields with confirmed infestations has increased steadily since the disease was first identified in 2003 (Gossen et al. 2015). The disease has also been reported in Saskatchewan and Manitoba (Cao et al. 2009; Dokken-Bouchard et al. 2010). Clubroot is also a serious problem for rapeseed production in China (Hu and Peng 2016).

Identification of clubroot resistance (CR) genes is an important step toward breeding clubroot-resistant Brassica crops. To date, more than 10 genes or QTLs have been mapped to chromosomes of the A-genome in B. rapa. Crr2 and PbBa1.1 were located on A01 (Chen et al. 2013; Suwabe et al. 2003); CRc and Rcr8 on A02 (Sakamoto et al. 2008; Yu et al. 2017); Crr3, CRa, CRb, CRbkato, CRd, CRk, PbBa3.1, PbBa3.2, PbBa3.3, Rcr1, Rcr2, and Rcr4 on A03 (Chen et al. 2013; Chu et al. 2014; Hirai et al. 2004; Huang et al. 2017; Matsumoto et al. 1998; Pang et al. 2018; Piao et al. 2004; Sakamoto et al. 2008; Yu et al. 2016, 2017); Crr4 on A06 (Suwabe et al. 2003); and Crr1, PbBa8.1, and Rcr9 on A08 (Chen et al. 2013; Suwabe et al. 2006; Yu et al. 2017). Three genes, Crr1, CRa, and CRbkato, have been cloned. These genes encode toll-interleukin-1 receptor, nucleotide-binding site, leucine-rich repeat (TIR-NBS-LRR, or TNL) proteins (Hatakeyama et al. 2013, 2017; Ueno et al. 2012).

Genetic resistance is a highly effective approach to manage this disease, but CR germplasm is rare in Brassica napus (canola). Fortunately, sources of resistance are present in vegetable crops of B. rapa and B. oleracea, which are closely related to B. napus (Liu et al. 2018; Peng et al. 2014) and can be used for breeding of CR B. napus cultivars. Identifying resistance through phenotyping can be influenced by temperature, light, pH, and other factors. Use of molecular markers linked to CR genes for selection provides greater sensitivity and consistency (Liu et al. 2018). Kompetitive Allele Specific PCR (KASP), a single nucleotide polymorphism (SNP) genotyping technology, is fast, efficient, and accurate for molecular breeding of crops (Jatayev et al. 2017).

Bulked segregant RNA-sequencing (BSR-Seq) based on next-generation sequencing is a very robust approach for quantifying gene expression and identifying abundant SNP sites, which has been used for transcriptome analysis and gene mapping in crops (Chu et al. 2014; Huang et al. 2017; Liu et al. 2012; Yu et al. 2016). It can be used to determine the position of target genes by identifying large numbers of SNPs, which can be used to develop molecular markers and finely map the causal genes (Dakouri et al. 2018; Liu et al. 2012; Yu et al. 2016). For example, a previous study identified a high percentage of polymorphic variants (PPV) on chromosome A03 of B. rapa adjacent to the clubroot resistance gene Rcr1 (Yu et al. 2016). This observation was used to map a major clubroot resistance gene in B. oleracea (Dakouri et al. 2018) and a major clubroot resistance gene in B. nigra (Chang et al. 2019).

A turnip cultivar, “Purple Top White Globe” (PTWG), was recently identified as resistant to pathotype 3 of P. brassicae (Williams’ system) in Canada (Peng et al. 2014). The current study describes the identification and mapping of a clubroot resistance gene from PTWG using BSR-Seq. SNP markers tightly linked to the gene were developed and the gene was finely mapped.

Materials and methods

Materials

The clubroot-resistant (R) turnip cultivar PTWG (seed purchased online at https://www.westcoastseeds.com/shop/vegetable-seeds/turnip-seeds/purple-top-white-globe/) was crossed with a highly susceptible (S) doubled-haploid, canola-quality line of B. rapa, ACDC (developed by Dr. Kevin Falk, Saskatoon Research and Development Centre, Agriculture and Agri-Food Canada), to produce F1. One F1 plant (male) was crossed with the susceptible (S) line (female) to produce a BC1 population.

Inoculation and resistance test

The parents, F1 plants, and BC1 population were tested for resistance to pathotype 3 of P. brassicae, the most prevalent pathotype on canola in the Canadian Prairie region (Strelkov et al. 2006, 2007). A field collection of the clubroot pathogen was used to inoculate plants for studies on inheritance and genetic mapping. Preparation of inoculum, inoculation, and clubroot severity rating followed the methods described by Chu et al. (2014). Briefly, each plant was inoculated by pouring 5 ml of P. brassicae resting spore suspension (1 × 10⁷ spores/ml) into each seeded well. The inoculated seedlings were maintained in a greenhouse at about 20/18 °C day/night temperature with a 16-h photoperiod. At 5 weeks after inoculation, plants were uprooted and clubroot symptoms on roots were rated using a standard 0 to 3 scale (Fig. 1A), where 0 = no clubbing; 1 = small clubbing only; 2 = moderate clubbing; and 3 = severe clubbing (Strelkov et al. 2006). A rating of 0 was considered R and ratings of 1–3 were considered S. The highly susceptible B. rapa breeding line ACDC was included as a susceptible control to ensure that the inoculation conditions were conducive to infection. Segregation of R and S phenotypes in the BC1 population was analyzed using chi-square (χ²) tests for goodness of fit (Sokal and Rohlf 1995).

Fig. 1 Phenotypes of plants in response to inoculation with pathotype 3 of Plasmodiophora brassicae: (A) range of clubroot symptoms in the BC1 population at 5 weeks after inoculation (0 = no clubs, 3 = large clubs) and (B) phenotypes of resistant cultivar PTWG and susceptible line ACDC

RNA-Seq and sequence alignment

Subsets of the BC1 population were assessed using RNA-Seq. At 15 days post-inoculation, leaf tissues from 30 plants with a rating of 0 were combined to form an R bulk, and leaves from 30 plants with a rating of 3 were combined to form an S bulk. Together, one R and one S bulk formed a single biological replicate, and three biological replicates were assessed. Total RNA was extracted using an RNeasy Plant Mini Kit (Qiagen, Toronto, ON) with on-column deoxyribonuclease (DNase) digestion using a Qiagen RNase-Free DNase Set (Qiagen), following the manufacturer’s instructions.

The RNA concentration was checked using a NanoDrop 2000c spectrophotometer (Thermo Scientific, Waltham, MA), and RNA quality was assessed with an Agilent Bioanalyzer 2100 (Agilent Technologies; Mississauga, ON) to ensure that the RNA integrity number (RIN) was > 8 for each sample. cDNA libraries were prepared following the TruSeq RNA Sample Preparation v2 Guide (Illumina, San Diego, CA). The NanoDrop 2000c spectrophotometer was also used to check cDNA concentration and purity. Quality control and qPCR analysis were used to validate the cDNA libraries. The Experion DNA 1K Analysis Kit (Bio-Rad Laboratories, Inc.) was used to confirm the size and purity of the cDNA libraries, based on a band at approximately 260 bp. The KAPA Library Quantification Kit v4.11 was used to perform qPCR of the cDNA libraries.

RNA-Seq was carried out on samples from each inoculated R and S bulk using the Illumina MiSeq platform at the University of Saskatchewan (Saskatoon, SK, Canada). Short reads from a pool of the three R bulks and a pool of the three S bulks (pooled sample assembly) were assembled against the reference genome (B. rapa, V1.5; http://brassicadb.org/brad) as described by Yu et al. (2016) using SeqMan NGen 13 (DNASTAR, Madison, WI) software. Standard assembly and filtering parameters were used. Variants (SNPs and InDels) relative to the DNA sequence of the B. rapa “Chiifu” reference were identified using SeqMan Pro 13 (DNASTAR, Madison, WI) with Q call ≥ 15 and depth ≥ 5.

SNP discovery and mapping of the causal gene

SNP discovery and mapping of the causal gene were performed using uniquely aligned short reads from the pooled R and S bulks, as described by Yu et al. (2016). Each potential DNA variant was examined for polymorphism, and putative SNPs were identified using the Lasergene Genomics Suite 13 software (DNASTAR). The DNA variants were classified into two types, monomorphic (mono) and polymorphic (poly). The location of the causal gene was determined from the percentage of polymorphic variants (PPV) (Dakouri et al. 2018).
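The PPV calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the DNASTAR workflow used in the study: it assumes each variant has already been classified as mono or poly, and the function names (`ppv_by_group`, `to_windows`), the 1-Mb window size, and the toy data are all hypothetical.

```python
from collections import defaultdict


def ppv_by_group(variants):
    """Percentage of polymorphic variants (PPV) per group.

    `variants` is an iterable of (group, is_polymorphic) pairs, where the
    group is a chromosome name or a window index. The mono/poly call is
    taken as given; how it is made is outside this sketch.
    """
    poly = defaultdict(int)
    total = defaultdict(int)
    for group, is_polymorphic in variants:
        total[group] += 1
        if is_polymorphic:
            poly[group] += 1
    return {g: 100.0 * poly[g] / total[g] for g in total}


def to_windows(positions, window_bp=1_000_000):
    """Convert (position_bp, is_polymorphic) pairs to (window_index, is_polymorphic)."""
    return [(pos // window_bp, is_poly) for pos, is_poly in positions]


# Hypothetical toy data, for illustration only (not taken from the study).
toy_chromosomes = [("A01", True), ("A01", False),
                   ("A03", True), ("A03", True), ("A03", True), ("A03", False)]
toy_a03_positions = [(23_100_000, True), (23_900_000, True),
                     (24_500_000, False), (24_600_000, True)]

chrom_ppv = ppv_by_group(toy_chromosomes)                  # PPV per chromosome
window_ppv = ppv_by_group(to_windows(toy_a03_positions))   # PPV per 1-Mb window
candidate = max(chrom_ppv, key=chrom_ppv.get)              # chromosome with highest PPV
print(chrom_ppv, window_ppv, candidate)
```

The same function serves both steps: first chromosomes are compared to find the one with the highest PPV, then windows along that chromosome are compared to narrow the candidate region.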

SNP genotyping and linkage analysis

Genomic DNA was extracted from young leaves of each BC1 plant (including the BC1 plants used for RNA-Seq) with the CTAB (cetyltrimethylammonium bromide) method (Doyle 1990). The final DNA concentration was adjusted to 50 ng/μl for each sample. SNPs identified in the target region were confirmed using KASP (http://www.lgcgroup.com/), following the manufacturer’s instructions. PCR reactions were performed using a StepOnePlus Real-Time PCR System (Applied Biosystems, Mississauga, ON). A genetic linkage map was developed using JoinMap 4.1 software (Van Ooijen 2006).

Annotation of the genes in the target region

The gene description and gene ontology (GO) of each gene in the target region were assessed using a Blast2GO search (Conesa et al. 2005), which uses BLASTX algorithms to search for matches against the non-redundant protein database from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov). BLAST hits were mapped to the functional information stored in the GO database to retrieve GO terms associated with the hits.

Results

Inheritance of clubroot resistance in PTWG

At 5 weeks after inoculation, 14 plants each of the parental lines and the F1 were evaluated for resistance to pathotype 3 of P. brassicae. All PTWG plants were resistant (rating = 0), all ACDC plants were highly susceptible (rating = 3) (Fig. 1B), and all F1 plants were resistant (rating = 0). The BC1 population segregated for resistance: the numbers of plants with ratings of 0, 1, 2, and 3 were 435, 27, 71, and 291, respectively. The numbers of clubroot-R and -S plants were therefore 435 and 389, respectively, which was consistent with an expected ratio of 1:1 (χ² = 2.56, P = 0.11). These results indicated that resistance in PTWG was associated with a single dominant gene, designated Rcr5 (Resistance to clubroot 5).
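The goodness-of-fit test for the 1:1 segregation can be checked with a short Python sketch. The counts (435 R, 389 S) come from the text; the function name `chi_square_one_to_one` is hypothetical, and the sketch assumes an uncorrected Pearson chi-square with one degree of freedom, which should give values close to those reported.

```python
import math


def chi_square_one_to_one(n_resistant, n_susceptible):
    """Pearson chi-square test of a 1:1 ratio (1 degree of freedom).

    Returns (chi2, p_value). No continuity correction is applied;
    that choice is an assumption of this sketch.
    """
    total = n_resistant + n_susceptible
    expected = total / 2  # 1:1 ratio: half of the plants in each class
    chi2 = ((n_resistant - expected) ** 2
            + (n_susceptible - expected) ** 2) / expected
    # For 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# BC1 counts from the text: rating 0 = R; ratings 1-3 = S (27 + 71 + 291).
chi2, p_value = chi_square_one_to_one(435, 27 + 71 + 291)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 means the observed counts do not differ significantly from the 1:1 ratio expected for a single dominant gene in a backcross.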

Assembly of RNA-Seq short reads into the reference genomes of B. rapa

There were a total of 108.2 M sequences, 2062.2 Mb in length, with 27-fold coverage of the reference A-genome from the three pooled R bulks, and 96.0 M sequences, 6240.7 Mb in length, with 27-fold coverage from the three pooled S bulks (Table 1). The number of sequences assembled to each chromosome was significantly correlated with chromosome length for the R bulks (r = 0.90, P = 0.004) and S bulks (r = 0.91, P = 0.0003).

Table 1 Short reads from the resistant (R) and susceptible (S) bulks, number of polymorphic variants (SNP and InDel) on each chromosome uniquely identified from R sample bulks of the BC1 population, when compared against the Brassica rapa reference genome v1.5 at http://brassicadb.org/brad

Identification of DNA variants and mapping of Rcr5

A total of 78.8 K poly variants (51.9 K SNPs, about 66% of the variants, and 26.9 K InDels, about 34% of the variants) were identified across the 10 chromosomes in the BSR-Seq project (Table 1). The number of poly variants was also significantly correlated with chromosome length (r = 0.74; P = 0.010). The PPV was 40–43% for most chromosomes, whereas the PPV for A03 was higher, reaching 49% (Fig. 2A), indicating that Rcr5 was located on chromosome A03. The PPV on chromosome A03 was then analyzed in more detail. The highest PPV was found in the physical range of approximately 23–31 Mb (Fig. 2B), and no region on the other chromosomes showed such a high PPV (Supplemental Figure 1), indicating that Rcr5 was located in this region of A03. The 824 plants of the BC1 population were genotyped with 15 poly SNP loci (Table 2) spanning the 23–27-Mb region of chromosome A03 using the KASP method (Fig. 3). A total of 1253 unique SNPs from the R bulk were identified in the 23–27-Mb region in the BSR-Seq project (Supplemental Table 1). The 15 SNP loci (Table 2) consisted of 11 of these 1253 SNPs and 4 SNPs (SNP_A03_19, SNP_A03_32, SNP_A03_54, and SNP_A03_62) from the previous mapping of Rcr1 (Yu et al. 2016) and Rcr2 (Huang et al. 2017). A linkage map spanning 3.6 cM was constructed (Fig. 4A), confirming that Rcr5 resides in the region. Rcr5 was flanked by SNP_A03_83 and SNP_A03_84 on one side and by SNP_A03_99 and SNP_A03_100 on the other, in an interval of 0.5 cM (Fig. 4A).

Fig. 2 Distribution of DNA variants (%): (A) monomorphic and polymorphic variants on each chromosome and (B) polymorphic variants on chromosome A03

Table 2 Details of SNP markers linked to Rcr5

Fig. 3 Analysis of Kompetitive Allele Specific PCR (KASP) using SNP_A03_84 in the parental lines PTWG (R) and ACDC (S) and BC1 plants of ACDC × (ACDC × PTWG). The green dots are clubroot-resistant plants and blue dots are susceptible plants. The circled dots are the parental lines

Fig. 4 Mapping of the Rcr5 region in the BC1 population: (A) linkage map of the region surrounding the Rcr5 gene (y) and (B) a partial physical map of chromosome A03 of Brassica rapa showing the homologs of the mapped marker sequences. Dotted lines indicate the relationship between the two maps

DNA variants and gene annotation in the target region

The two closest SNP markers flanking Rcr5, SNP_A03_100 and SNP_A03_83, were developed based on the SNP sites at 23,339,019 and 23,465,030 bp, respectively (Fig. 4B). These markers were located within the genes Bra012595 and Bra012578 on A03, respectively. The physical distance between the two markers was 126,012 bases. This region contains 18 genes, based on the reference genome v1.5 (Table 3). Expression was very low for most of these genes; the exceptions were seven genes (Bra012595, Bra012591, Bra012588, Bra012586, Bra012581, Bra012580, and Bra012578) (Table 3). Poly variants in the coding regions were detected in all seven of these genes (Table 3, Supplemental Table 2), but no TNL genes were identified in the region (Supplemental Table 2).
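The reported interval size follows directly from the two SNP positions if both end positions are counted. The short Python sketch below reproduces the arithmetic; the positions are those given in the text, while the function name and the inclusive-counting convention are assumptions made for illustration.

```python
# SNP positions (bp) on chromosome A03, as given in the text.
SNP_A03_100_BP = 23_339_019
SNP_A03_83_BP = 23_465_030


def interval_length_bp(start_bp, end_bp, inclusive=True):
    """Length of a physical interval between two positions.

    With inclusive=True both end positions are counted, which is the
    convention assumed here to match the reported interval size.
    """
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    return end_bp - start_bp + (1 if inclusive else 0)


length_bp = interval_length_bp(SNP_A03_100_BP, SNP_A03_83_BP)
length_mb = round(length_bp / 1_000_000, 3)
print(length_bp, length_mb)  # prints: 126012 0.126
```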

Table 3 Numbers of polymorphic variants (SNPs and InDels) and the level of gene expression in RPKM (Reads Per Kilobase of transcript per Million mapped reads) among the 18 genes located in the Rcr5 interval that could be identified
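For readers unfamiliar with the RPKM unit used in Table 3, the standard definition can be written as a small Python function. This is a generic sketch of the formula and not the software used in the study; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads * 10^9 / (transcript length in bp * total mapped reads),
    i.e. the read count scaled per kilobase and per million mapped reads.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000-bp transcript in a library
# with 10 million mapped reads.
example_rpkm = rpkm(500, 2_000, 10_000_000)
print(example_rpkm)  # prints: 25.0
```

Normalizing by both transcript length and library size lets expression be compared among genes of different lengths and between the R and S bulks.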

Discussion

Genetic resistance to P. brassicae can provide effective management of clubroot, but sources of resistance in B. napus are very limited, so identification and introgression of CR genes from related species into B. napus are important. In this study, more than ten KASP markers closely linked to Rcr5 were developed, which can be used to detect this locus in B. napus or B. rapa breeding programs.

A previous study identified a high PPV on chromosome A03 of B. rapa adjacent to the clubroot resistance gene Rcr1 (Yu et al. 2016). Dakouri et al. (2018) successfully mapped Rcr7 through identification of PPV. In this study, Rcr5 was assigned to chromosome A03 of B. rapa because A03 carried the highest PPV in the BSR-Seq project. A high PPV was identified within the physical interval 23–31 Mb of chromosome A03, which indicated that Rcr5 was likely located in this region. KASP genotyping and linkage analysis of the BC1 population with SNP markers from this region further narrowed the interval to 0.126 Mb. This supports the previous report that identification of PPV can be used for genetic mapping of genes of interest (Dakouri et al. 2018). We conclude that identification of PPV in combination with KASP analysis is a powerful approach for fine mapping of causal genes that control agronomic traits of crops.

Several genes for clubroot resistance have previously been mapped to chromosome A03 of B. rapa, including Crr3 (Hirai et al. 2004), CRa (Matsumoto et al. 2005; Ueno et al. 2012), CRb (Piao et al. 2004; Zhang et al. 2014), CRbkato (Hatakeyama et al. 2017; Kato et al. 2013), CRk (Sakamoto et al. 2008), Rcr1 (Chu et al. 2014; Yu et al. 2016), Rcr2 (Huang et al. 2017), and Rcr4 (Yu et al. 2017). The two cloned genes CRa and CRbkato are identical (Hatakeyama et al. 2017), while Rcr1, Rcr2, and Rcr4 co-localized with CRa/CRbkato (Huang et al. 2017; Yu et al. 2016, 2017) in the genomic region between 24.35 and 24.39 Mb of chromosome A03 (Ueno et al. 2012; Zhang et al. 2014). CRb was tightly linked to CRa/CRbkato and located between 23.67 and 23.76 Mb, and Crr3 and CRk were located between 15.2 and 15.4 Mb (Zhang et al. 2014). In the current study, two SNP markers in the CRa/CRbkato region (SNP_A03_32, 24.39 Mb; SNP_A03_19, 24.39 Mb) and two SNP markers in the CRb region (SNP_A03_54, 23.71 Mb; SNP_A03_58, 23.72 Mb) were analyzed. These four SNP markers identified several recombinants in the BC1 population. Rcr5 was mapped between 23.33 and 23.46 Mb, which demonstrated that its position differs from those of the CR genes published previously. In addition, the resistance reaction of the resistant parental line, PTWG, to some of the Canadian pathotypes differed from that of lines carrying the other resistance genes (unpublished data).

Only 18 genes lie in the small interval to which Rcr5 was mapped. Poly variants unique to the R samples were found in the transcripts of 7 of the 18 genes, but none of the 18 genes encodes a TNL protein or any other protein known to be a component of plant disease resistance. Previous studies have shown that there are DNA sequence gaps in the Rcr5 target region of the B. rapa reference genome (B. rapa, V1.5; http://brassicadb.org/brad). It is possible that Rcr5 is located in a sequence gap or that Rcr5 encodes a novel protein that differs from those encoded by the previously cloned CR genes. However, these hypotheses can only be resolved by cloning Rcr5, which was beyond the scope of the current study.