Introduction

The purple leaf trait, caused by the accumulation of anthocyanin in plant tissue, is of considerable significance in crop breeding and research. Anthocyanin is reportedly closely related to cold resistance in plants (Christie et al. 1994; Leng et al. 2000; McKown et al. 1996). Anthocyanin may increase the cold resistance of a plant via osmotic control and depression of the freezing point of water in plant tissues (Chalker-Scott 1999). Similar results have been reported from cold resistance studies of rapeseed and transgenic tomato (Ahmed et al. 2015; Gomaa et al. 2012; Jin et al. 2012). Furthermore, anthocyanins play key roles in defense as antimicrobial agents, feeding deterrents and UV-protective compounds (Winkel-Shirley 2001). Purple leaves can also be used as visible markers of plant transformation. As an alternative to chemically selectable markers (Jin et al. 2012; Kim et al. 2010), such a plant-derived selectable marker gene would represent a rapid, antibiotic-free method with good public acceptance (Kortstee et al. 2011; Li et al. 2009). In addition, purple leaves can be used in hybrid seed production to identify and remove false hybrids at the seedling stage, allowing seed purity to be assessed rapidly and improved (Cao et al. 1999).

At present, numerous results regarding the modes of inheritance and linkage relationships of the genes responsible for purple leaves have been reported in Brassica. In B. napus, Lv (2008) reported that the purple leaf trait was controlled by one partially dominant gene, and five amplified fragment-length polymorphism (AFLP) markers linked to the target gene have been identified. Li et al. (2016) first reported the fine mapping of a purple leaf gene, BnaA.PL1, located in a 99-kb region at the end of chromosome A3 of B. napus. Several relevant studies on B. rapa and B. oleracea have been published in recent years, due to the edibility and health benefits of the purple leaves of these plants. In Chinese cabbage, Hayashi et al. (2010) localized Anp, which originated from turnips, to linkage group A07 of B. rapa. The Anp gene is flanked by the simple sequence repeat (SSR) marker BRMS-036 and the cleaved amplified polymorphic sequence (CAPS) marker OPU10C, at distances of 2 and 4 cM, respectively. Wang et al. (2014) mapped a single dominant purple leaf gene, BrPur, to linkage group A03 of B. rapa within a genomic region of 54.87 kb. In cauliflower, Chiu et al. (2010) isolated a unique purple gene (Pr) via a combination of candidate gene analysis and fine mapping. This gene was found to encode an MYB transcription factor that exhibited tissue-specific expression. However, few studies related to the fine mapping and cloning of purple leaf genes in B. juncea have been reported to date.

In the present study, a purple leaf mutant, designated 1280-1, was discovered in the B. juncea line 1280. A previous genetic analysis revealed that the purple leaf trait in 1280-1 is controlled by one dominant gene, designated BjPl1. Ten AFLP markers linked to the BjPl1 purple leaf gene have been identified, with the two flanking markers located 0.7 and 1.3 cM away from the target gene (Zhao and Du 2013). To promote the effective use of this gene, further fine mapping was performed in the present study. This work will aid in cloning the purple leaf gene and will lay the foundation for Brassica breeding and elucidation of the molecular mechanisms underlying anthocyanin accumulation in B. juncea.

Materials and methods

Plant materials and population construction

The B. juncea lines 1280-1 (Fig. 1a) and Duoshi (Fig. 1b) were used as research materials in the present study. Line 1280-1, with the purple leaf trait, was used as the female parent. The fully expanded leaves of 1280-1 are green with dark-purple leaf margins and light-purple leaf veins. The purple color spreads over the entire leaf during leaf development until the final flowering phase. Duoshi, an excellent B. juncea landrace originating from the Qinghai-Tibetan Plateau that possesses green leaves, was used as the male parent. All of these materials were self-pollinated for more than seven generations.

Fig. 1

Phenotypes of the parental lines and the BC1 segregating progeny. a The female parent with purple leaves; b the male parent with green leaves; c the segregating BC1 population at the seedling stage; and d the BC1 population at the flowering stage

A cross was developed between 1280-1 and Duoshi. The resulting F1 generation was subsequently backcrossed with Duoshi to produce a BC1 population (Fig. 1c, d) comprising 794 individuals, for fine mapping of the purple leaf color gene BjPl1. Purple-leaved plants of the BC1 population were further backcrossed to Duoshi for three generations to generate a BC4 population for whole-genome re-sequencing analysis.

To compare the total anthocyanin content between green-leaved plants and purple-leaved plants with different genotypes, a purple-leaved plant randomly selected from the BC4 population was self-pollinated to produce a BC4F2 population comprising individuals of three genotypes (BjPl1BjPl1, BjPl1Bjpl1 and Bjpl1Bjpl1) with a similar genetic background. The leaf color of all individuals was visually scored at the 6-leaf stage.

Analysis of total anthocyanin content

At the seedling stage, the homozygous (BjPl1BjPl1) and heterozygous (BjPl1Bjpl1) genotypes of BC4F2 individuals with purple leaves were distinguished using co-dominant markers and were further verified by examining the phenotype of the BC4F2:3 progeny. The genotypes of individuals with green leaves were identified according to the phenotypes observed in the field.

After the genotype of each individual was confirmed, freeze-dried leaf samples (0.5 g) from each of five plants of each genotype at the seedling stage and full flowering stage were ground to a powder, which was then extracted twice with 25 ml of methanol/water/acetic acid (V:V:V = 85:15:0.5) at 50 °C for 2 h. The extract solution was collected and filtered through a 0.22 µm filter, and the absorbance of the extracts at 510 and 700 nm was measured on an Eppendorf BioSpectrometer fluorescence instrument (Eppendorf, Hamburg, Germany). The total anthocyanin content (TAC) was estimated following the modified pH differential method described by Wang et al. (2014). The data were analyzed via one-way analysis of variance using SAS 8.1 (Littell et al. 2006).
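The pH differential calculation can be illustrated with a minimal sketch. It assumes the conventional form of the method, in which absorbance is read at pH 1.0 and pH 4.5 and the result is expressed as cyanidin-3-glucoside equivalents; the constants, function names and example readings are illustrative assumptions, and the modified procedure of Wang et al. (2014) may differ in detail.

```python
# Conventional cyanidin-3-glucoside (C3G) constants for the pH differential
# method. These are assumptions for illustration, not values from this study.
MW_C3G = 449.2         # g/mol, molecular weight of C3G
EPSILON_C3G = 26900.0  # L/(mol*cm), molar extinction coefficient of C3G
PATH_LENGTH_CM = 1.0   # cuvette path length


def ph_differential_absorbance(a510_ph1, a700_ph1, a510_ph45, a700_ph45):
    """A = (A510 - A700) at pH 1.0 minus (A510 - A700) at pH 4.5."""
    return (a510_ph1 - a700_ph1) - (a510_ph45 - a700_ph45)


def tac_mg_per_litre(absorbance, dilution_factor=1.0):
    """Anthocyanin concentration of the extract in mg/L (C3G equivalents)."""
    return (absorbance * MW_C3G * dilution_factor * 1000.0
            / (EPSILON_C3G * PATH_LENGTH_CM))


def tac_mg_per_gram(conc_mg_per_l, extract_volume_ml, sample_mass_g):
    """Convert extract concentration to mg anthocyanin per g of dry sample."""
    return conc_mg_per_l * (extract_volume_ml / 1000.0) / sample_mass_g


# Hypothetical readings; 50 ml assumes the two 25-ml extracts were pooled.
a = ph_differential_absorbance(0.80, 0.05, 0.20, 0.05)
print(tac_mg_per_gram(tac_mg_per_litre(a), extract_volume_ml=50.0, sample_mass_g=0.5))
```

Subtracting the 700 nm reading corrects for haze, and subtracting the pH 4.5 value removes absorbance that does not come from pH-sensitive monomeric anthocyanins.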

Linkage group identification

Genomic DNA was extracted according to Doyle (1990). The DNA concentration of each sample was adjusted to 50 ng/μl. Two purple leaf gene pools (pl) were generated from the DNA of 12 purple-leaved plants randomly selected from the BC1 population, and two green leaf gene pools (gl) were similarly generated. Two pairs of bulks were subsequently used for bulk segregant analysis (BSA) to identify molecular markers linked to the BjPl1 gene.

To assign the BjPl1 gene to a specific linkage group, SSR and intron polymorphism (IP) primers were first obtained from the following public sources: http://brassicadb.org/brad/, http://www.brassica.info/ and public reference maps of B. juncea and B. carinata for the A and B genomes (Guo et al. 2012; Panjabi et al. 2008; Ramchiary et al. 2007; Yadava et al. 2012). Markers that could detect loci that were well distributed across all linkage groups were employed as anchor markers. Polymorphism screening was first performed in two pl and two gl bulks and subsequently verified in the corresponding individuals comprising the bulks. SSR and IP amplification was performed as described by Lowe et al. (2002) and Panjabi et al. (2008), respectively. The amplified products were separated in a 6% denaturing polyacrylamide gel.

Marker development and genetic mapping

After one marker was assigned to a specific linkage group in the reference map, further markers from that linkage group were selected for polymorphism screening in our mapping population. All markers displaying polymorphisms were then used to screen 190 individuals for primary mapping.

Due to the absence of genomic information for B. juncea in the early phase of this study, the closely related sequenced B. rapa Chiifu genome was used as a proxy reference genome to develop markers that were tightly linked to the BjPl1 gene, based on a comparative mapping strategy. After identifying the homologous region on A2 of B. rapa, SSR primers were designed for further fine mapping of BjPl1 using sequence information for the B. rapa genome. Microsatellite screening was conducted with the software SSR Hunter 1.3 (Li and Wan 2005). Specific PCR primers were designed using Primer 3 software (http://frodo.wi.mit.edu/primer3).
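Microsatellite screening of a reference sequence can be sketched as a search for perfect tandem repeats. The example below is a simplified illustration and not the algorithm implemented in SSR Hunter; the repeat thresholds and the function names are assumptions chosen for the example.

```python
import re

# Minimum copy number per motif length; illustrative thresholds only.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AGAG')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return (start, end, motif, copies) for perfect SSRs (0-based, half-open)."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_compound(motif):
                continue  # already reported under the shorter motif
            copies = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), motif, copies))
    return sorted(hits)


print(find_ssrs("GGC" + "AG" * 8 + "TTC"))
```

Primers would then be designed in the sequence flanking each reported repeat, for example with Primer 3.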

After the target gene was delimited to a physical region of B2 based on recently released information on the B. juncea genome (https://www.ncbi.nlm.nih.gov/nuccore/LFQT00000000), a new set of SSR markers was developed to further narrow the physical region covering the BjPl1 gene. Finally, all of the markers were used to screen 794 individuals for fine mapping.

The phenotype data and the data from the AFLP (Zhao and Du 2013), IP and SSR analyses were combined to generate a linkage map of the region covering the BjPl1 gene. Mapping distances (cM) were estimated using the Kosambi function (Kosambi 1943), and the linkage map was drawn with JoinMap 3.0 (Van Ooijen and Voorrips 2001).
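For reference, the Kosambi function converts a recombination fraction r into a map distance of d = 25 ln[(1 + 2r)/(1 − 2r)] cM. The sketch below applies it to a single marker pair in a backcross population; it is a two-point illustration only, not JoinMap's multi-locus procedure, and the function names and the example count are hypothetical.

```python
import math


def recombination_fraction(n_recombinants, n_individuals):
    """Two-point recombination fraction in a backcross population."""
    if n_individuals <= 0:
        raise ValueError("population size must be positive")
    return n_recombinants / n_individuals


def kosambi_cm(r):
    """Kosambi map distance in centimorgans for recombination fraction r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical example: 2 recombinants among 794 BC1 individuals.
r = recombination_fraction(2, 794)
print(round(kosambi_cm(r), 3))
```

For tightly linked markers the Kosambi distance is almost identical to 100 × r, so the correction for double crossovers matters mainly for the more distant flanking markers.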

Marker sequencing and identification of the homologous region

Specific fragments of the AFLP (Zhao and Du 2013), SSR and IP markers were isolated from dried polyacrylamide gels. DNA was subsequently purified following the methods described by Yi et al. (2006). The purified products were ligated to the plasmid vector pGEM-T (Promega, Beijing, China). The transformed clones were screened via PCR with the commercially available M13 primer (Sangon, Shanghai, China). For each fragment, three positive clones were sequenced by the Shanghai Sangon Biotechnology Corp. (Shanghai, China).

To identify a putative syntenic region around BjPl1 in B. rapa, B. juncea and Arabidopsis, sequence information for the specific fragments was submitted to the Brassica Database (BRAD) (http://brassicadb.org/brad/), the B. juncea genome (https://www.ncbi.nlm.nih.gov/nuccore/LFQT00000000) and The Arabidopsis Information Resource (TAIR) (https://www.arabidopsis.org/) for Blast analysis.

Detection of the location of BjPl1 via whole-genome re-sequencing

Whole-genome re-sequencing was conducted on the parents and two pooled DNA samples. The two pooled samples were prepared by separately mixing the DNA of 20 purple-leaved (P-pool) and 20 green-leaved (G-pool) plants from the BC4 population. Sequencing libraries were sequenced on the Illumina HiSeq 4000 platform, and 150-bp paired-end reads were generated, with an insert size of approximately 350 bp. The B. juncea genome was downloaded from NCBI (https://www.ncbi.nlm.nih.gov/nuccore/LFQT00000000). After quality control, sequence data from the four samples were aligned against the reference genome with the Burrows-Wheeler Aligner, and repetitive sequences were removed with SAMTOOLS (Li and Durbin 2009). SNP calling was performed using the Unified Genotyper function in GATK software (McKenna et al. 2010). The SNP indexes of the two pools were calculated using one parent (Duoshi) as a reference. The ΔSNP index was calculated as the difference between the SNP indexes of the two pools. Re-sequencing was carried out by the Novogene Company (Beijing, China).
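The SNP index calculation can be sketched as follows. The example assumes the usual definition, in which the SNP index at a site is the proportion of reads in a pool carrying the allele that differs from the reference parent (Duoshi), and it assumes that the ΔSNP index is taken as P-pool minus G-pool; the direction of subtraction, the function names and the read counts are assumptions for illustration.

```python
def snp_index(non_ref_parent_reads, total_reads):
    """Fraction of reads at a site that differ from the reference parent allele."""
    if total_reads <= 0:
        raise ValueError("site has no read coverage")
    if not 0 <= non_ref_parent_reads <= total_reads:
        raise ValueError("allele depth must lie between 0 and total depth")
    return non_ref_parent_reads / total_reads


def delta_snp_index(p_alt, p_total, g_alt, g_total):
    """Delta SNP index, assumed here to be SNP index(P-pool) - SNP index(G-pool)."""
    return snp_index(p_alt, p_total) - snp_index(g_alt, g_total)


# Hypothetical site: 10 of 20 P-pool reads and 0 of 20 G-pool reads carry
# the 1280-1 allele.
print(delta_snp_index(10, 20, 0, 20))
```

In a backcross-derived population, purple-leaved plants are expected to be heterozygous near the target gene while green-leaved plants carry only the recurrent parent allele, so the ΔSNP index should peak around the locus and fall toward zero at unlinked sites.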

Results and discussion

Significant difference in the total anthocyanin content (TAC) of two phenotypes

As expected, the purple-leaved and green-leaved plants in the BC4F2 population segregated in a 3:1 ratio. The purple-leaved plants from the BC4F2 population with different genotypes were distinguished using co-dominant markers (Supplementary Fig. 1) and were further verified through self-pollination. The statistical analysis revealed no significant difference between the TAC of the homozygous (BjPl1BjPl1) genotype and that of the heterozygous (BjPl1Bjpl1) genotype at either the seedling stage or the full flowering stage. However, the TAC of the purple leaves was significantly higher than that of the green leaves at the same growth stage (Supplementary Tables 1, 2). These results indicate that the purple leaf trait is controlled by a completely dominant gene and is correlated with the anthocyanin content of the leaves, which is consistent with the results of Wang et al. (2014) for B. rapa. In contrast, a study in B. napus revealed that the purple leaf trait is controlled by an incompletely dominant gene (Li et al. 2016), and another study in zicaitai (Brassica rapa L. ssp. chinensis var. purpurea) indicated that multiple genes are involved in controlling the purple leaf trait (Guo et al. 2015). Together, these results illustrate that the mode of inheritance of leaf color differs among materials.
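The fit of observed counts to a 3:1 ratio is commonly checked with a chi-square goodness-of-fit test. The following minimal sketch shows the calculation; the test is not stated in the text above, and the function name and the example counts are hypothetical rather than data from this study.

```python
# Critical value of the chi-square distribution for 1 degree of freedom
# at the 0.05 significance level.
CHI2_CRITICAL_1DF_005 = 3.841


def chi_square_3_to_1(n_purple, n_green):
    """Chi-square statistic for a 3 purple : 1 green expected ratio."""
    total = n_purple + n_green
    if total <= 0:
        raise ValueError("at least one plant is required")
    expected_purple = 0.75 * total
    expected_green = 0.25 * total
    return ((n_purple - expected_purple) ** 2 / expected_purple
            + (n_green - expected_green) ** 2 / expected_green)


# Hypothetical counts: 80 purple-leaved and 20 green-leaved plants.
stat = chi_square_3_to_1(80, 20)
print(stat, stat < CHI2_CRITICAL_1DF_005)
```

A statistic below the critical value means that the observed counts do not deviate significantly from the 3:1 ratio expected for a single dominant gene.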

Considering that the purple leaf trait of 1280-1 is consistently observed from the seedling stage to the end of the flowering stage and is controlled by a completely dominant gene, 1280-1 can be used as novel germplasm for the transfer of the purple leaf trait. Successful introduction of the purple leaf trait of 1280-1 into mustard (B. juncea) or other Brassica crops consumed as vegetables would greatly improve the edible value of these crops. In addition, it is worth noting that the purple leaf color appeared lighter in the flowering stage than in the seedling stage. The reason for this difference could be a lower TAC in the flowering stage, possibly due to molecular regulatory mechanisms influenced by environmental factors, such as the light intensity and temperature experienced at different developmental stages (Wang et al. 2014).

Mapping BjPl1 to the B2 linkage group of B. juncea

A total of 426 PCR-based primer pairs, distributed evenly across the A and B genomes, were used for polymorphism screening. Only one IP marker (At5g21070) and one SSR marker (BrMS046) showed polymorphism between the bulks and the corresponding individuals (Supplementary Table 3). The IP marker (At5g21070) was located in linkage group B2 in the reference map of B. juncea (Panjabi et al. 2008; Yadava et al. 2012), whereas the SSR marker (BrMS046) was located in linkage group B2 of B. carinata (Guo et al. 2012) and linkage group A2 of B. rapa (http://brassicadb.org/brad/).

To further determine the position of the BjPl1 gene in the published reference genetic map, all of the markers from the B2 and A2 linkage groups in the corresponding reference maps mentioned above were subsequently used to detect polymorphisms in our mapping population. This screen identified one additional IP marker (At5g07910) and two additional SSR markers (Ni2-D10 and Ni4-G10), which were all derived from the B2 linkage group of the B. juncea reference map (Yadava et al. 2012) (Supplementary Table 3). At this point, five PCR-based markers had been identified as being linked to the BjPl1 gene. Four of these markers (At5g21070, At5g07910, Ni2-D10 and Ni4-G10) were derived from linkage group B2 of B. juncea, while one marker (BrMS046) was found in both A2 of B. rapa and B2 in the reference map of B. carinata. Therefore, we concluded that the BjPl1 gene is located in linkage group B2 in B. juncea (Fig. 2).

Fig. 2

Comparative mapping between B. rapa, A. thaliana and B. juncea around the BjPl1 gene (a) Linkage group of B2, as reported by Yadava et al. (2012); (b) linkage map of the region surrounding the BjPl1 gene; (c) a partial physical map of A2 in B. rapa showing the homologs of the markers; and (d) R and E blocks corresponding to different Arabidopsis chromosome fragments proposed by Schranz et al. (2006). Markers presented in different colors can be individually aligned to blocks of the same color. Dotted lines indicate the relationships among these linkage maps

To our knowledge, mapping of the purple leaf trait in B. juncea has not been reported previously. Our mapping results differ from those for the other purple leaf genes reported in Brassica, which were localized to the Brassica A genome (Chiu et al. 2010; Guo et al. 2015; Hayashi et al. 2010; Wang et al. 2014). Physical mapping of BjPl1 will therefore facilitate the cloning of this gene and lay a foundation for clarifying its function in the anthocyanin biosynthesis pathway in Brassica.

Localization of BjPl1 in a 0.7-cM genetic interval in the B2 linkage group based on comparative mapping

According to the definition of conserved genomic blocks from the ancestral karyotype reported by Schranz et al. (2006), together with the comparative map of B. juncea constructed by Lagercrantz and Lydiate (1996) and Panjabi et al. (2008), the B2 and A2 linkage groups of Brassica species show homoeology for the block motifs, including E, P, O, W and R blocks. In the present study, the results of primary mapping of the BjPl1 gene indicated that the Ni4-G10 marker was located on one side of BjPl1, positioned at the E block of B2, whereas the At5g21070 and At5g07910 markers were located on the other side of the target gene, positioned at the R block of B2 (Fig. 2). Thus, to develop markers tightly linked to BjPl1, primers from all five blocks were used for polymorphism screening based on homoeologous blocks on linkage groups A2 and B2 in Brassica. As a result, another IP primer (At1g72890) at the E block was identified in our mapping population (Supplementary Table 3).

These polymorphic markers, including 7 AFLP, 3 IP and 3 SSR markers, were subsequently cloned and sequenced. Because no information on the B. juncea genome was available in the early stage of this research, these markers were used for a Blast analysis against the BRAD database to identify the collinear region of B. rapa. The results indicated that 5 markers showed linear homology with A2 of B. rapa and revealed synteny between a physical region from 1.1 to 11.9 Mb in B. rapa and the genetic region around BjPl1 in the linkage map (Supplementary Table 4, Fig. 2). Subsequently, the homologous region in B. rapa delimited by the closer markers At5g21070, positioned at 4.2 Mb, and At1g72890, at 11.9 Mb, was utilized for marker development. A total of 80 pairs of SSR primers, positioned approximately 100 kb apart within the homologous region, were designed for polymorphism detection. This screen identified five markers (PLC815, PLC837, PLC871, PLC875 and PLC879) tightly linked to the BjPl1 gene (Supplementary Table 3).

To more accurately determine the map location of the BjPl1 gene, the BC1 population comprising 794 plants was used to detect recombinants. The flanking SSR markers (BrMS046 and Ni4-G10) farthest away from the BjPl1 gene detected 101 and 17 recombinants, respectively. These 118 recombinants were subjected to genotyping for all of the markers between the two flanking markers to evaluate the genetic distance from BjPl1. Based on this analysis, the BjPl1 gene was delimited to a 0.7-cM interval between the markers PLC871 (cosegregating with EA04MC03 and EA07MC16) and PLC875 and an approximately 222-kb interval of A2 corresponding to the physical region from 9.2 to 9.4 Mb (Fig. 2).

Comparative mapping studies using common molecular markers have revealed the existence of conserved blocks between Brassica and Arabidopsis. Moreover, comparative genomics data for Brassica have paved the way for a unified comparative genomics framework based on the block system (Panjabi et al. 2008; Parkin et al. 2005; Schranz et al. 2006, 2007). This block system can be used to visualize the comparative genomic structures of crucifer species as an additional type of genetic mapping (Schranz et al. 2006). Chromosome synteny based on the block system that exists between the sequenced genomes and other Brassica plants can be effectively used for fine mapping or cloning genes in unsequenced species (Xie et al. 2012). In this study, the development of BjPl1-linked markers from the block motifs shared between B2 and A2 in Brassica, together with sequence information for the homologous region on A2 in B. rapa, again illustrates the effectiveness of comparative analyses for gene mapping in species without a reference genome sequence. However, the B. rapa genome sequence must be used with care because some genome rearrangements have been detected, such as an inversion between B2 and A2 in Brassica (Panjabi et al. 2008).

Narrowing the BjPl1 gene to a 225-kb interval using information on the B. juncea genome

Due to the recent release of genome data for B. juncea, the sequence information for all of the markers linked to BjPl1 was subjected to Blast searches against B. juncea Tumida cultivar T84-66. The results indicated that most of the markers presented homology with the B2 chromosome of B. juncea. The closest flanking markers, PLC875 and PLC871, delimited the target gene to an approximately 434-kb interval of B2 corresponding to the physical region from 17.52 to 17.97 Mb (Fig. 3, Supplementary Table 3).

Fig. 3

Analysis of candidate intervals for the BjPl1 gene on chromosome B2 of B. juncea through a combination of whole-genome re-sequencing and genetic linkage mapping (a) ΔSNP index graph for chromosome B2. Two parallel black lines indicate the interval from 17.74 to 17.97 Mb. Gray lines represent the threshold value. Black dotted lines delineate two candidate intervals above the threshold value. (b) A partial genetic linkage map around the BjPl1 gene. (c) A partial physical map of linkage markers around the BjPl1 gene. The black region indicates two candidate intervals, corresponding to 17.74–17.78 Mb and 17.93–17.96 Mb, identified through whole-genome re-sequencing. (d) Results of a Blast analysis using sequences from two candidate intervals against the Arabidopsis genome. The rectangles contain homologous Arabidopsis genes with a cut off E-value ≤1E−30 in two regions

According to the sequence information for the 17.52–17.97 Mb genomic region of chromosome B2 of B. juncea, 28 SSR primers were designed for polymorphism screening. A total of five SSR markers (B2-7, B2-9, B2-17, B2-23 and B2-24) showed polymorphism in our mapping population (Supplementary Table 3). Subsequently, the 118 recombinants were subjected to genotyping for the five markers for evaluation of genetic distances. Two recombinants were detected between BjPl1 and markers B2-7 and B2-9, whereas no recombinants were detected between the target gene and markers B2-17, B2-23, and B2-24. Thus, the BjPl1 gene was further delimited to a 0.4-cM genetic region between markers PLC871 (cosegregating with EA04MC03 and EA07MC16) and B2-9 (cosegregating with B2-7), corresponding to an interval of approximately 225 kb, from 17.74 to 17.97 Mb, on B2 (Fig. 3).

Two candidate intervals identified through whole-genome re-sequencing analysis

Previous reports have indicated that whole-genome re-sequencing analysis can be used to further verify the results of molecular marker mapping, leading to greater accuracy (Lu et al. 2014; Wang et al. 2016). Therefore, re-sequencing analysis was conducted for further fine mapping of the BjPl1 gene. In total, 59.19 G of clean data were obtained after filtering 59.58 G of raw data. The GC content ranged between 39.05 and 42.09%. The quantity and quality (Q20 ≥ 94.75% and Q30 ≥ 89.06%) of the data fulfilled the standards for deeper analysis. The clean reads were aligned to the reference genome (https://www.ncbi.nlm.nih.gov/nuccore/LFQT00000000) using BWA (Burrows-Wheeler Aligner) software (Li et al. 2009). The average read depth was > 15× for each sample. A total of 1,594,798 SNPs were identified between 1280-1 and Duoshi, and the 559,437 SNPs that were homozygous between the two parents were subsequently used to calculate the SNP index in the two descendant pools (P-pool and G-pool). The ΔSNP index was calculated and plotted against genomic position (Fig. 4). Among the intervals identified at the 99% significance level, two candidate intervals on the B2 chromosome, from 17.74 to 17.78 Mb and from 17.93 to 17.96 Mb, fell within the target region (17.74–17.97 Mb on B2) identified through molecular marker mapping. Therefore, the target gene was further delimited to two regions of approximately 64 kb in total on B2 (Fig. 3).

Fig. 4

SNP index and ΔSNP index Manhattan plot graphs a SNP index Manhattan plot graphs of BC4-P pools; b SNP index Manhattan plot graphs of BC4-G pools; and c ΔSNP index Manhattan plot graph. The blue line indicates the threshold value. The nomenclature for 18 chromosomes (CM007185.1–CM007194.1 and CM007195.1–CM007202.1) from the NCBI website is equivalent to the names of the A genome (A1–A10) and B genome (B1–B8) in B. juncea
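The correspondence between the NCBI accessions and the chromosome names given in the caption can be written as a small lookup. The sketch below assumes that the accessions within each range are assigned in sequential chromosome order, which the caption implies but does not state explicitly; the function name is hypothetical.

```python
def accession_to_chromosome():
    """Map CM007185.1-CM007202.1 to A1-A10 and B1-B8, assuming sequential order."""
    names = [f"A{i}" for i in range(1, 11)] + [f"B{i}" for i in range(1, 9)]
    first_accession_number = 7185  # CM007185.1 is taken to be A1
    return {
        f"CM{first_accession_number + offset:06d}.1": name
        for offset, name in enumerate(names)
    }


lookup = accession_to_chromosome()
print(lookup["CM007196.1"])
```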

Dissection of the target region of BjPl1

The sequences of the two candidate intervals were subsequently submitted to the TAIR (http://www.arabidopsis.org/) and GenomeNet (http://www.genome.jp/tools/blast/) databases for Blast analysis. The results showed that the candidate region sequences were similar to a region on Arabidopsis chromosome 1 that includes 12 putative homologous Arabidopsis genes (Supplementary Table 5). According to the annotation of these genes in the TAIR database, four genes (At1g56650, At1g66390, At1g66370 and At1g66380) were related to anthocyanin biosynthesis, all of which were within the candidate region of 17.93–17.96 Mb on B2. These four genes are positive regulatory genes encoding R2R3-MYB transcription factors (AtPAP1, AtPAP2, AtMYB113 and AtMYB114) that are involved in the later steps of the anthocyanin biosynthetic pathway by forming ternary complexes with basic helix-loop-helix (bHLH) and WD40 proteins (Guo et al. 2014; Nemie-Feyissa et al. 2015). Moreover, considering the arrangement of the markers in the linkage map, the PLC871 marker, with a physical position of approximately 17.97 Mb, is very close to the genetic position of the BjPl1 gene. We therefore speculate that the physical region of 17.93–17.96 Mb might harbor the target gene and that the homolog of one of the four Arabidopsis genes might be a candidate for BjPl1. It is noteworthy that these four genes exhibit high sequence similarity and that three of the genes (At1g66370, At1g66380 and At1g66390) occur as a three-gene tandem array, which makes it difficult to predict and isolate a single candidate gene. Therefore, the isolation and verification of the candidate gene will require further in-depth analyses.