Introduction

Anthocyanins are a large group of secondary metabolites belonging to the flavonoids that are widely found in higher plants (Springob et al. 2003; Lepiniec et al. 2006). These metabolites provide plants with red to purple pigments that play important roles in growth and development, attracting pollinators and seed carriers, protecting plants against reactive oxygen species (ROS) and UV-B radiation (Solovchenko and Schmitz-Eiberger 2003; Landi et al. 2014), and mediating plant–microbe interactions (Gould 2004). Additionally, the antioxidant properties of anthocyanins promote human health by protecting against diabetes, cardiovascular disease and cancer (Winkel-Shirley 2001; Nhukarume et al. 2010).

Most anthocyanin genes have been well characterized in the model plant Arabidopsis thaliana. Anthocyanins are synthesized through the phenylpropanoid pathway, starting with the conversion of phenylalanine to p-coumaroyl-CoA by phenylalanine ammonia lyase (PAL), cinnamate 4-hydroxylase (C4H) and 4-coumarate:CoA ligase (4CL) (Zhang et al. 2014) (Fig. 1). The key enzymes chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H) and flavonoid 3′-hydroxylase (F3′H) convert malonyl-CoA and p-coumaroyl-CoA to dihydroflavonols (Holton and Cornish 1995). These intermediates are either oxidized to flavonols by flavonol synthase (FLS) or converted by dihydroflavonol 4-reductase (DFR) and anthocyanidin synthase (ANS) to various anthocyanidins (Harborne and Williams 2001; Winkel-Shirley 2002). The anthocyanidins undergo modification via glycosylation, acylation and/or methylation to form stable anthocyanins. In A. thaliana, eight characterized genes are associated with these modifications. UGT78D2, UGT75C1, UGT79B1 and UGT84A2 are responsible for glycosylation, and A5G6′′′MaT, A3G6′′p-CouT1, A3G6′′p-CouT2 and SCPL10 are responsible for acylation (Shi and Xie 2014). However, no genes responsible for methylation have been characterized in A. thaliana. Anthocyanin biosynthesis is largely regulated at the transcriptional level (Fig. 1). R2R3-MYBs, including MYB11, MYB12, and MYB111, independently and positively regulate the phenylpropanoid pathway and early anthocyanin biosynthetic genes (Stracke et al. 2010). Late anthocyanin biosynthetic genes are activated by the MYB-bHLH-WDR (MBW) complex, which comprises one of the R2R3-MYB proteins (PAP1, PAP2, MYB113, or MYB114), one of the bHLH proteins (GL3, EGL3, or TT8), and a WD repeat protein (Zhao et al. 2013). The R3-MYB proteins CPC and MYBL2 are characterized as negative regulators that interact with MBW complexes (Shi and Xie 2014).
The other necessary step is the transport of anthocyanins from the cytosol to the vacuole. The important transport genes include TT12, TT19 and AHA10 (TT13) (Zhao and Dixon 2010).

Fig. 1 The biosynthetic pathway of anthocyanin

Brassica oleracea comprises multiple important vegetable crops that have contributed to human health and nutrition for hundreds of years (Parkin et al. 2014). The species exhibits diversity in anthocyanin accumulation. Most varieties accumulate anthocyanins at low levels, whereas several cultivars of cauliflower, red cabbage, kale and kohlrabi are extremely enriched in anthocyanins (Scalzo et al. 2008). However, the mechanisms underlying anthocyanin accumulation in B. oleracea are poorly understood. To date, only one gene has been characterized: an R2R3-MYB transcription factor, MYB2 (PAP2-like), responsible for the purple cauliflower mutant (Chiu et al. 2010). Other studies suggested that MYB2 is a possible key regulator in red cabbage, kale and kohlrabi, and that MYBL2 is a key negative regulator in red cabbage, but these hypotheses have not been verified (Song et al. 2018; Yan et al. 2019).

The available B. oleracea reference genomes make it possible to perform genome-wide searches for anthocyanin biosynthetic genes based on their orthologs in A. thaliana. In the present study, we provide comprehensive information on anthocyanin biosynthetic genes in B. oleracea by performing comparative genomic analysis between B. oleracea and A. thaliana. We also profile their expression patterns in tissues of green and purple plants and further explore the functions of the positive regulator BoMYB113.1 and the negative regulator BoMYBL2.1. This study will improve our understanding of the anthocyanin biosynthetic genes in B. oleracea.

Materials and methods

Identification and analysis of ABGs in the B. oleracea genome

All the sequences of anthocyanin biosynthetic genes (ABGs) of A. thaliana referred to in this study were downloaded from the Col-0 Arabidopsis reference genome in TAIR (http://www.arabidopsis.org/). The 02-12 cabbage reference genome sequence (version 1.0) (Liu et al. 2014) and annotated gene sequences from Bolbase (http://www.ocri-genomics.org/bolbase/index.html) were used to identify the ABGs in B. oleracea.

To survey ABGs in B. oleracea, we performed a comparative genomic analysis between B. oleracea and A. thaliana using the genome sequences and corresponding annotation information of the two species. The anthocyanin biosynthetic gene and protein sequences of A. thaliana were aligned with the genome and protein sequences of B. oleracea using BLASTN and BLASTP with an E-value cutoff ≤ 1E−10 and coverage ≥ 0.75. Syntenic orthologs between A. thaliana and B. oleracea were taken from the BRAD, where they are determined by sequence similarity (E ≤ 1E−20) and the collinearity of flanking genes; non-syntenic orthologs were required to have a sequence identity > 70% and coverage > 60% (Cheng et al. 2012a, b). Gene clusters were determined according to the description by Alamery et al. (2018), with the additional requirement that the physical distance between neighboring ABGs be ≤ 200 kb.
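The search and non-syntenic thresholds stated above can be sketched as simple filters over BLAST hits. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the `Hit` record and its field names are hypothetical, and coverage is assumed to mean the aligned fraction of the query sequence, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit. Field names are hypothetical, not a BLAST output format."""
    query: str        # A. thaliana ABG identifier
    subject: str      # B. oleracea gene identifier
    evalue: float     # BLAST E-value
    identity: float   # percent identity, 0-100
    align_len: int    # aligned length on the query
    query_len: int    # full length of the query

    @property
    def coverage(self) -> float:
        # Assumed definition: aligned fraction of the query sequence.
        return self.align_len / self.query_len


def passes_search(hit: Hit, max_evalue: float = 1e-10,
                  min_coverage: float = 0.75) -> bool:
    """Initial BLASTN/BLASTP filter: E-value <= 1E-10 and coverage >= 0.75."""
    return hit.evalue <= max_evalue and hit.coverage >= min_coverage


def passes_non_syntenic(hit: Hit, min_identity: float = 70.0,
                        min_coverage: float = 0.60) -> bool:
    """Non-syntenic ortholog filter: identity > 70% and coverage > 60%."""
    return hit.identity > min_identity and hit.coverage > min_coverage
```

The collinearity test for syntenic orthologs is not reproduced here, because it depends on the flanking-gene annotation held in the BRAD.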

Availability of all B. oleracea genes in this work

The gene IDs, names, chromosome locations, coding sequences and protein sequences of all B. oleracea anthocyanin pathway genes are shown in Supplementary Table S1.

Chromosome distribution analysis

The physical position information of ABGs was downloaded from the B. oleracea database. The distribution of genes on the chromosomes was subsequently determined using MapChart software (Voorrips 2002). BoABGs on unanchored scaffolds were displayed independently.

Expression analysis of ABGs in B. oleracea

The expression patterns of ABGs in B. oleracea were measured using RNA-Seq data. RNA-seq data for seven tissues, namely root, callus, bud, silique, flower, stem and leaf tissues, of B. oleracea accession 02-12 were obtained from the Gene Expression Omnibus (GEO) database with the accession number GSE42891.

Total RNA was extracted from leaves of 8-week-old plants of red cabbage line 15Z_P, white cabbage line 120QY_G, and ornamental kale line S2309, and was used for cDNA library construction and RNA sequencing following a previously described protocol (Han et al. 2018). FPKM values are available in Supplementary Table S2. Quantitative RT-PCR (qRT-PCR) was performed to validate the expression levels of BoABGs in 15Z_P and 120QY_G, with the B. oleracea actin gene (AF044573.1) as an internal control. The qRT-PCR mixture was prepared using SYBR Premix Ex Taq II (Tli RNase H Plus; Takara, Dalian, China), and reactions were conducted on a CFX96 Touch Real-Time PCR Detection System (Bio-Rad, Hercules, CA, USA). All samples were assayed in triplicate. The relative expression levels were calculated using the 2^−ΔΔCT method (Livak and Schmittgen 2001). The primers used in this study are listed in Supplementary Table S3.
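The 2^−ΔΔCT calculation can be sketched as follows. This is a minimal illustration assuming that technical triplicates are averaged before the subtraction and that the actin gene is the reference; the function name and the Ct values in the usage note are made up for illustration and are not data from this study.

```python
from statistics import fmean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    Each argument is a sequence of Ct values from technical replicates.
    'sample' and 'control' are the two lines being compared, 'target' is
    the gene of interest and 'ref' the internal control gene.
    """
    # dCt: target Ct normalized to the reference gene within each line.
    d_ct_sample = fmean(ct_target_sample) - fmean(ct_ref_sample)
    d_ct_control = fmean(ct_target_control) - fmean(ct_ref_control)
    # ddCt: normalized sample relative to normalized control.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)
```

For example, with hypothetical mean Ct values of 22 (target) and 18 (reference) in one line and 25 (target) and 18 (reference) in the other, ΔΔCT = (22 − 18) − (25 − 18) = −3, so the target is 2^3 = 8-fold more abundant in the first line.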

Agrobacterium tumefaciens-mediated Arabidopsis transformation

The coding sequences of BoMYB113.1 (amplified from red cabbage line 15Z_P) and BoMYBL2.1 (amplified from white cabbage line 120QY_G) were sub-cloned into a modified binary vector, pBWA(V)BS (reconstructed from pCAMBIA1301) (Han et al. 2019). The constructs were introduced into A. thaliana Col-0 plants using the floral dip method (Clough and Bent 1998). Overexpressing transgenic lines were obtained by screening the T0 seeds on MS (Murashige and Skoog) medium containing 30 mg/L hygromycin and were further confirmed by RT-PCR analysis. Anthocyanin content in the transgenic plants was measured following previously reported methods (Wang et al. 2016).

Virus-induced gene silencing (VIGS) of BoMYB113.1 and BoMYBL2.1 in cabbage

VIGS using a cabbage leaf curl virus (CaLCuV)-based vector was conducted following the methods described by Xiao et al. (2020). In brief, conserved sequences of BoMYB113.1 (34–533 nt in the coding sequence) and BoMYBL2.1 (51–550 nt in the coding sequence) were selected, amplified, and inserted into the PCVA vector. Empty PCVA, PCVA-BoMYB113.1, PCVA-BoMYBL2.1 and PCVB were independently transformed into Agrobacterium GV3101 cells. Agrobacterium culture suspensions were mixed at a 1:1 ratio for inoculation: PCVB with PCVA-BoMYB113.1 for red cabbage 15Z_P, PCVB with PCVA-BoMYBL2.1 for white cabbage 120QY_G, and, as controls, PCVB with empty PCVA for both 15Z_P and 120QY_G. The phenotypes were evaluated 2–3 weeks after sowing in compost.

Statistical analysis

Statistical analysis was performed using Student’s t test. Differences were considered significant at two levels, P < 0.05 and P < 0.01. All measurements were performed with three independent biological replicates.
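The comparison described above can be sketched with the Python standard library. This is a minimal illustration under assumptions the text does not state: a two-sided, equal-variance (pooled) Student's t test, and exactly three replicates per group, which gives 4 degrees of freedom and the standard critical values hard-coded below. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided critical values of Student's t for 4 degrees of freedom
# (two groups of three replicates); taken from standard t tables.
T_CRIT_P05 = 2.776
T_CRIT_P01 = 4.604


def student_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def significance_mark(a, b):
    """Return '**' (P < 0.01), '*' (P < 0.05) or '' for 3-vs-3 replicates."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("critical values are hard-coded for 3 vs 3 replicates")
    t = abs(student_t(a, b))
    if t > T_CRIT_P01:
        return "**"
    if t > T_CRIT_P05:
        return "*"
    return ""
```

The sketch raises a division error when both groups have zero variance; a statistics package that reports exact P values would be preferable for real data.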

Results

Identification of anthocyanin biosynthetic genes in B. oleracea

In A. thaliana, 51 anthocyanin biosynthetic genes (AtABGs) have been reported, including 29 structural genes encoding anthocyanin biosynthetic enzymes, 19 regulatory genes encoding transcription factors, and 3 transport genes that are required for anthocyanin transport from the cytosol to the vacuole (Table 1; Supplementary Table S1). Using the AtABGs as queries, we searched for their syntenic and non-syntenic orthologs in B. oleracea. A total of 88 B. oleracea anthocyanin biosynthetic genes (BoABGs) were identified, representing homologs of 46 of the 51 AtABGs (Table 1; Fig. 2; Supplementary Table S1). The other 5 AtABGs, namely FLS6, A3G6''p-CouT1, A3G6''p-CouT2, SCPL10 and MYB11, had no B. oleracea orthologs. The BoABG members were named after their counterparts in the A. thaliana anthocyanin biosynthetic pathway. However, some tandemly duplicated AtABGs have very similar sequences, so their B. oleracea orthologs could not be assigned to individual copies and were named after one of the tandemly duplicated AtABGs. For instance, the B. oleracea orthologs of FLS2, FLS3, FLS4 and FLS5 were named BoFLS2.1, BoFLS2.2, BoFLS2.3, BoFLS2.4 and BoFLS2.5, and the orthologs of MYB113, MYB114 and PAP2 were named BoMYB113.1, BoMYB113.2 and BoMYB113.3. These BoABGs comprise 52 structural genes, 32 transcription factor genes and 4 transport genes. Based on the orthology analysis, 79 (89.8%) BoABGs were syntenic orthologs of 46 AtABGs, and 9 (10.2%) BoABGs had no syntenic relationships.

Table 1 Information of the Brassica oleracea anthocyanin biosynthetic genes (BoABGs)
Fig. 2 Circos diagram of syntenic anthocyanin biosynthetic genes in Arabidopsis thaliana and Brassica oleracea. Chr1 to Chr5 indicate A. thaliana chromosomes, and C01 to C09 indicate B. oleracea chromosomes

Genomic distributions of BoABGs

Genome chromosomal location analysis revealed that the BoABGs were distributed on all 9 chromosomes and all three sub-genomes (Fig. 3). Of the 88 BoABGs, 61 were unevenly distributed on the 9 chromosomes, with 4, 6, 9, 8, 6, 8, 5, 8 and 7 ABGs mapped on chromosomes C01-09 in the B. oleracea genome, respectively. The remaining 27 genes, representing 30.68% of the total BoABGs, were positioned on different scaffolds, which were not anchored to any chromosome.

Fig. 3 Genomic localization of the BoABGs on the 9 chromosomes of B. oleracea. C01 to C09 indicate B. oleracea chromosomes

Genome rearrangement after WGT in B. oleracea led to the following three sub-genomes, with gene densities from high to low: LF, MF1, and MF2 (Liu et al. 2014). With this sub-genomic information, we then assigned BoABGs to the three sub-genomes (Table 1). Of the 79 syntenic orthologs, 33, 19 and 27 were located in LF, MF1 and MF2, respectively.

Duplication of BoABGs

The BoABGs in the genome of B. oleracea have expanded mainly due to whole-genome triplication (WGT) since the divergence of the species from A. thaliana (Liu et al. 2014). Most of the ABGs syntenic to AtABGs were present in multiple copies in B. oleracea. There were thirteen ABGs with ≥ 3 copies in B. oleracea, whereas the remaining 33 ABGs had 1–2 syntenic orthologs due to gene fractionation after the triplication event (Table 1).

In addition to the WGT-derived BoABGs, some tandem duplications were identified. Using the gene cluster definition of Alamery et al. (2018), we identified 21 tandemly duplicated BoABGs located in 10 gene clusters, with 2 clusters each on C03 and C05, one each on C04, C06 and C08, and three on scaffolds (Scaffold000133, Scaffold000121_P2 and Scaffold000173), whereas the remaining 67 genes were found as singletons (Fig. 3). The number of genes per cluster ranged from two to three in B. oleracea. A. thaliana and B. oleracea shared some common tandemly duplicated loci, e.g., MYB113, MYB114 and PAP1, indicating that these tandem duplications occurred before the divergence of the two species. We also identified tandem duplications solely present in B. oleracea; for instance, C4H, 4CL5, CHI, F3H and UGT79B1 produced tandemly duplicated copies in different sub-genomes. Bo4CL2.1 and Bo4CL2.2 were not considered tandem duplications, because they possibly belong to a single gene that was incorrectly predicted as two genes in the 02-12 reference genome. All the tandem duplications were biosynthetic structural genes.
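The distance rule from the Methods (neighboring ABGs no more than 200 kb apart) can be sketched as a single pass over genes sorted by position. This is only an illustration of that one rule: the full cluster definition of Alamery et al. (2018) is not reproduced, no check is made that cluster members are homologous, and the function name, tuple layout and coordinates in the usage note are hypothetical.

```python
from collections import defaultdict

MAX_GAP_BP = 200_000  # maximum distance between neighboring ABGs (200 kb)


def group_clusters(genes, max_gap=MAX_GAP_BP):
    """Group genes into clusters of neighbors at most max_gap apart.

    genes: iterable of (name, chromosome, start, end) tuples.
    Returns a list of clusters, each a list of two or more gene names.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in genes:
        by_chrom[chrom].append((start, end, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current, prev_end = [], 0
        for start, end, name in sorted(by_chrom[chrom]):
            if current and start - prev_end <= max_gap:
                # Close enough to the previous gene: extend the cluster.
                current.append(name)
                prev_end = max(prev_end, end)
            else:
                # Gap too large: keep the finished cluster if it has >= 2 genes.
                if len(current) >= 2:
                    clusters.append(current)
                current, prev_end = [name], end
        if len(current) >= 2:
            clusters.append(current)
    return clusters
```

With made-up coordinates, two genes 148 kb apart on C03 form one cluster, while a third gene 749 kb further along and a lone gene on C05 remain singletons.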

Expression profiles of BoABGs in various tissues

Most B. oleracea varieties accumulate low levels of anthocyanins, so we chose a white cabbage variety ’02-12’ with a low level of anthocyanins to analyze the expression profiles of BoABGs.

Based on RNA-seq data of cabbage variety ‘02-12’ from the GEO database (GSE42891), the expression levels of all 88 BoABGs in different tissues, including root, callus, bud, silique, flower, stem and leaf tissues, were analyzed. A hierarchical map was constructed to show the abundance of the BoABGs in the seven tissues (Fig. 4; Supplementary Table S2). According to the RNA-seq data, the expression of BoABG members was highly variable among tissues, supporting the diversification of functions for the ABGs during B. oleracea development. Eleven genes showed no expression in any of the seven tissues, which suggests that they are nonfunctional or inactive in this variety. Forty BoABGs were expressed in all seven tissues (FPKM > 0), and 18 genes showed constitutive expression (FPKM > 2 in all tissues), suggesting that BoABGs play roles at multiple developmental stages. Some genes, such as BoF3H.4, BoMYB111.2, BoCPC.2 and BoCPC.3, had very low expression levels (FPKM < 1), suggesting that they are less important for growth and development in B. oleracea. The root had the highest overall expression level, and the silique had the lowest, although it was only slightly lower than that in the leaf. Interestingly, the root also had the maximum number of unexpressed genes (39), together with 20 lowly expressed genes (FPKM < 1), which included most of the early biosynthetic genes, late biosynthetic genes, transport genes and positive regulatory genes. However, some genes, such as BoPAL2.1, BoPAL1.3, BoPAL2.2, Bo4CL1.1, BoC4H.1, BoC4H.5, BoPAL1.1, BoPAL1.2, and BoC4H.3, were much more highly expressed in the root and callus than in the other tissues. Overall, the expression patterns across tissues indicated that members of the BoABGs might participate in different biological processes in B. oleracea.
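The FPKM thresholds used in this paragraph can be sketched as a small classifier. This is a minimal illustration, assuming one FPKM value per gene per tissue; the function name, the category labels (in particular "tissue-restricted") and the values in the test are hypothetical, and a gene is assigned to the most specific category it satisfies.

```python
TISSUES = ("root", "callus", "bud", "silique", "flower", "stem", "leaf")


def classify_expression(fpkm_by_tissue):
    """Classify one gene from its FPKM values in the seven tissues.

    fpkm_by_tissue: dict mapping each tissue name in TISSUES to an FPKM value.
    """
    values = [fpkm_by_tissue[tissue] for tissue in TISSUES]
    if all(v == 0 for v in values):
        return "not expressed"             # no expression in any tissue
    if all(v > 2 for v in values):
        return "constitutive"              # FPKM > 2 in all tissues
    if all(v > 0 for v in values):
        return "expressed in all tissues"  # FPKM > 0 in all tissues
    return "tissue-restricted"             # expressed in only some tissues
```

Because every constitutive gene is also expressed in all tissues, counts obtained this way would need the two categories summed to compare with the number of genes reported as expressed in all seven tissues.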

Fig. 4 Expression profiles of BoABGs in seven organs of cabbage ‘02-12’. The expression level is indicated by a color scale representing FPKM-normalized log2-transformed counts. Red indicates high expression level, and blue indicates low level

In different processes of the biosynthetic pathway, most phenylpropanoid pathway genes and early biosynthetic genes showed high expression levels, which may indicate their activity in other branches of the flavonoid pathway. Most early biosynthetic genes, late biosynthetic genes, transport genes and positive regulatory genes showed low expression levels, whereas some negative regulatory genes (BoMYBL2.1, BoSPL9.1, BoMYB4.2 and BoLBD37.2) had high expression levels.

We also analyzed the expression variance among duplicated ABGs in B. oleracea, selecting the BoABGs that had more than one copy (Fig. 5; Supplementary Table S2). RNA-Seq data generated from different B. oleracea tissues suggested expression differentiation among BoABG duplicates. In some duplicated BoABG groups, such as those of BoPAL3, Bo4CL2, BoCHS, BoUGT79B1, BoMYB12, BoMYB113, BoEGL3, BoMYBL2, BoCPC3, BoSPL9 and BoLBD38, the differences in expression between the copies were not significant, and most copies had low expression levels. For other genes, such as BoPAL1, BoPAL2, BoC4H, Bo4CL5, BoCHI, BoF3H, BoFLS, BoANS, BoUGT84A2, BoMYB111, BoMYB4, BoLBD37 and BoTT19, the expression levels differed significantly between the copies. For the duplicated ABGs residing in the same cluster, such as BoC4H.2 and BoC4H.3, BoC4H.4 and BoC4H.5, Bo4CL5.1 and Bo4CL5.2, Bo4CL5.3 and Bo4CL5.4, BoCHI.1 and BoCHI.2, and BoFLS2.3 and BoFLS2.4, one copy was much more highly expressed than the other.

Fig. 5 Expression levels of duplicated anthocyanin biosynthetic genes in B. oleracea

Expression analysis of BoABGs in cultivars of red cabbage and ornamental kale with high levels of anthocyanins

Some B. oleracea varieties accumulate high levels of anthocyanins, such as red cabbage and some cultivars of ornamental kale. In contrast to the accumulation of anthocyanins in purple cauliflower, which is controlled by a semidominant locus (Chiu et al. 2010), that in red cabbage is believed to be controlled by multiple loci. We analyzed transcriptome data from the leaves of the red cabbage line 15Z-P (high anthocyanin accumulation) and compared them with transcriptome data from the white cabbage line 120QY-G (low anthocyanin accumulation) (Fig. 6; Supplementary Table S2). Almost all the BoABGs involved in the phenylpropanoid pathway showed lower expression levels in 15Z-P than in 120QY-G. However, the early biosynthetic genes, late biosynthetic genes and transport genes (BoTT19.1 and BoTT19.2) were largely upregulated, consistent with the upregulation of positive regulators, especially BoMYB113.1 and BoTT8, and the downregulation of the negative regulators BoMYBL2.1 and BoMYBL2.2. We selected several biosynthetic, transport and regulatory genes for qRT-PCR validation, which showed that the expression levels of most genes were in accordance with the transcriptome data (Fig. 7). The results indicated that regulatory genes may be key genes responsible for the high level of anthocyanins in red cabbage.

Fig. 6 Expression levels of BoABGs in red cabbage and ornamental kale. a Phenotype of the white cabbage cultivar 120QY-G. b Phenotype of the red cabbage cultivar 15Z-P. c Phenotype of the ornamental kale cultivar S2309. d The biosynthetic pathway of anthocyanin and expression levels of BoABGs in leaves of 120QY-G, 15Z-P and S2309. S2309-P indicates the inner purple leaf; S2309-W indicates the middle white leaf; S2309-G indicates the outer green leaf

Fig. 7 Quantitative RT-PCR analysis of BoABG expression in leaves of red cabbage 15Z-P and white cabbage 120QY-G. a Expression levels of structural BoABGs in 15Z-P and 120QY-G. b Expression levels of regulator and transport genes in 15Z-P and 120QY-G. Significant difference at the P < 0.05 level is marked by *, and difference at the P < 0.01 level is marked by **

We selected another plant material, an ornamental kale cultivar with triple-type (green, white, and purple from outer to inner) leaves (Fig. 6). All the BoABGs involved in the phenylpropanoid pathway were upregulated in the purple leaves, in accordance with the upregulation of their positive regulators BoMYB12.1 and BoMYB111s. Among the early biosynthetic genes, BoC4H.3, Bo4CL3, BoCHSs, BoCHIs, BoF3H.1, BoF3'H, BoFLS1, and BoFLS2.3 were upregulated. Among the late biosynthetic genes, BoDFR, BoANSs, BoUGT75C1, and BoUGT84A2s and the transport gene BoTT19s were upregulated, in accordance with the upregulation of the positive regulators BoMYB113.1 and BoTT8 and downregulation of the negative regulators BoMYBL2.2 and BoLBD37s.

These results indicated that the red cabbage and ornamental kale varieties differed in the regulation of their phenylpropanoid pathways. However, in both, the later biosynthetic processes were activated in association with upregulation of the positive regulators BoMYB113.1 and BoTT8 and downregulation of the negative regulator BoMYBL2.

Functional characterization of BoMYB113.1 and BoMYBL2.1

We introduced BoMYB113.1 and BoMYBL2.1, each driven by the CaMV35S promoter, into A. thaliana Col-0 wild-type plants. A total of 16 independent BoMYB113.1-OE lines and 18 BoMYBL2.1-OE lines were generated. In comparison with the brown seeds of wild-type plants, T2 transgenic lines of BoMYB113.1-OE had dark brown seeds and those of BoMYBL2.1-OE had light yellow seeds (Fig. 8a). Removing carotenoid and chlorophyll pigments from seedlings with Norflurazon (4-chloro-5-(methylamino)-2-[3-(trifluoromethyl)phenyl]-3(2H)-pyridazinone) enables better visualization of anthocyanins. After treatment with Norflurazon, the BoMYB113.1-OE seedlings displayed an increased level of purple anthocyanin, whereas the BoMYBL2.1-OE seedlings displayed a reduced level (Fig. 8a). BoMYB113.1-OE seedlings without Norflurazon treatment also showed obvious over-accumulation of purple anthocyanins on leaves, petioles, veins and hypocotyls. In BoMYBL2.1-OE seedlings, however, the reduction in purple coloration could be distinguished only on hypocotyls. We selected BoMYB113.1-OE#4, with the highest abundance of BoMYB113.1, and BoMYBL2.1-OE#12, with the highest abundance of BoMYBL2.1, for measurement of anthocyanin contents. Direct quantification of anthocyanin levels confirmed the higher anthocyanin levels in BoMYB113.1-OE#4 and lower anthocyanin levels in BoMYBL2.1-OE#12 (Fig. 8b). These results indicated that BoMYB113.1 is a functional positive regulator and BoMYBL2.1 is a functional negative regulator of anthocyanin biosynthesis.

Fig. 8 Functional characterization of BoMYB113.1 and BoMYBL2.1 in A. thaliana and cabbage. a A. thaliana seeds and seedlings showing over-accumulation of purple anthocyanins in BoMYB113.1-OE lines and reduced accumulation of purple anthocyanins in BoMYBL2.1-OE lines compared with WT; the non-green seedlings were treated with norflurazon to remove carotenoid and chlorophyll pigments; the green seedlings were grown on normal MS plates. b Measurement of anthocyanins in A. thaliana WT and transgenic seedlings. Values are means ± SD of three biological replicates. The three groups a, b and c differ significantly from each other (P < 0.01). c Expression of BoMYB113.1 in control and BoMYB113.1-silenced cabbage plants. Significant difference at the P < 0.01 level is marked by **. d The BoMYB113.1-silenced cabbage showed reduced accumulation of purple anthocyanins; the BoMYBL2.1-silenced cabbage showed no distinguishable differences compared with the control

BoMYB113.1 is a positive regulator of anthocyanin biosynthesis with a high expression level in red cabbage, and BoMYBL2.1 is a negative regulator with a high expression level in white cabbage. We therefore conducted VIGS of BoMYB113.1 in red cabbage 15Z_P and VIGS of BoMYBL2.1 in white cabbage 120QY_G using the CaLCuV-based VIGS system described previously (Xiao et al. 2020). The control 15Z_P plants showed obvious purple coloration on hypocotyls, leaf veins and young leaves, whereas some BoMYB113.1-silenced plants showed less purple coloration on true leaves and leaf veins (Fig. 8d). To investigate whether this phenotype correlated with reduced expression of BoMYB113.1, the expression levels of BoMYB113.1 were measured by qRT-PCR, which showed that BoMYB113.1 was significantly downregulated in BoMYB113.1-silenced plants compared with the control plants (Fig. 8c). However, the BoMYBL2.1-silenced plants showed no distinguishable differences from the control plants (Fig. 8d). We speculate that BoMYBL2.1 has a minor function in anthocyanin accumulation in red/white cabbage, or that it functions together with other genes, given that the red phenotype in cabbage is a quantitative trait controlled by multiple genes.

Discussion

Brassica oleracea contains various health-promoting phytochemicals, including anthocyanins, glucosinolates, carotenoids, and vitamins. However, little is known about the genetic basis of anthocyanin biosynthesis in B. oleracea. In this study, 88 anthocyanin biosynthetic genes in B. oleracea were identified by comparative genomic analyses between A. thaliana and the cabbage variety 02-12. Most of the AtABGs in A. thaliana have orthologous BoABGs in B. oleracea, while five AtABGs, namely, FLS6, A3G6''p-CouT1, A3G6''p-CouT2, SCPL10, and MYB11, had no B. oleracea orthologs, suggesting that these genes may not be necessary for the biosynthesis of anthocyanins in B. oleracea.

The number of gene copies may increase through WGD, tandem duplication, segmental duplication, and gene transposition duplication (Sémon and Wolfe 2007). The ABGs in cabbage outnumbered those in Arabidopsis. A. thaliana and Brassica species are derived from a common ancestor. After divergence from A. thaliana, Brassica genomes experienced a unique whole-genome triplication event (Wang et al. 2011), producing many WGT-derived BoABGs. However, polyploidization is usually accompanied by gene fractionation and loss (Freeling and Thomas 2006), so most ABGs retained fewer than three WGT-derived copies in the cabbage variety 02-12. Interestingly, the negative regulators among the BoABGs retained more WGT-derived copies.

In addition to WGDs, genes may also be duplicated through tandem duplication. Some of these genes remain clustered, which may contribute to the expansion of gene families and facilitate the evolutionary process (Cheng et al. 2012a, b). We identified 10 tandem gene clusters comprising 21 tandemly duplicated BoABGs. Tandem duplications of only five genes, namely C4H, 4CL5, CHI, F3H and UGT79B1, all of which are biosynthetic structural genes, arose after the divergence from Arabidopsis.

Anthocyanins are derived from branches of the flavonoid pathway, which starts with phenylalanine via the general phenylpropanoid pathway. The biosynthetic genes can be divided into phenylpropanoid pathway genes, early biosynthetic genes, late biosynthetic genes, transport genes and (positive and negative) regulatory genes (Shi and Xie 2014). Transcriptome analyses of the BoABGs using the cabbage variety 02-12 as the material revealed diverse expression patterns of the BoABGs. We found that before the side branch of anthocyanins was activated, the phenylpropanoid pathway genes and early biosynthetic genes had high expression levels. The phenylpropanoid pathway genes BoPAL2.1, BoPAL1.3, BoPAL2.2, Bo4CL1.1, BoC4H.1, BoC4H.5, BoPAL1.1, BoPAL1.2, and BoC4H.3 were abundant in all tissues and had high expression levels in the root and callus. The early biosynthetic genes BoCHS.3, BoCHI.3, BoF3H.1, BoFLS1, BoCHS.1, BoCHS.2, and BoF3'H had high expression levels in the bud and very low expression levels in the root. However, most of the late biosynthetic genes and positive regulators were expressed at low levels in all tissues, which may be the reason for the low anthocyanin levels in white cabbage. We also found that some duplicated BoABG genes showed significantly different expression patterns, an indicator of non-functionalization or neo-functionalization after gene duplication (Ganko et al. 2007).

Several cultivars of B. oleracea are rich in anthocyanins (Scalzo et al. 2008). Chiu et al. (2010) demonstrated that the activation of MYB2 (BoMYB113.1 in this study) is responsible for the formation of purple cauliflower. Yan et al. (2019) reported that this gene was also responsible for the formation of purple kale, kohlrabi, and cabbage, but this role was not functionally verified. Song et al. (2018) reported that BoMYBL2-1 (BoMYBL2.1 in this study) was a negative regulator in purple B. oleracea var. capitata. We used transcriptome data from red cabbage and a triple-leaf-type (green, white, and purple leaves from outside to inside) ornamental kale to reveal the expression of BoABGs in these accessions. The results revealed that the positive regulators BoMYB113.1 and BoTT8 and the negative regulator BoMYBL2.1 may be key genes controlling anthocyanin accumulation through the regulation of early biosynthetic genes, late biosynthetic genes and transport genes. The functions of BoMYB113.1 and BoMYBL2.1 were further confirmed by ectopic expression in Arabidopsis. In ornamental kale, BoMYB12.1 and BoMYB111s may also be key regulators that activate the phenylpropanoid pathway genes. Additionally, although SPL9 was reported as a negative regulator, BoSPL9s showed higher expression levels in red tissues than in non-red tissues of red cabbage and ornamental kale, suggesting that this gene may not play a major role in anthocyanin accumulation in B. oleracea.

Conclusion

Based on whole-genome comparative analysis between A. thaliana and B. oleracea, 88 anthocyanin biosynthetic genes were identified in B. oleracea. Most BoABGs presented multiple copies originating via WGD, tandem duplication and gene loss after the WGT event. We identified 21 tandemly duplicated BoABGs located in 10 gene clusters, most of which formed before the divergence of A. thaliana and B. oleracea.

Expression analysis revealed diverse expression patterns of BoABGs in different tissues and the expression differentiation of BoABG duplications, indicating non-functionalization or neo-functionalization of the duplications after polyploidization. Expression analysis of BoABGs in red cabbage and ornamental kale revealed that the positive regulators BoMYB113.1, BoTT8, BoMYB12.1 and BoMYB111s and the negative regulator BoMYBL2 may be key genes controlling anthocyanin accumulation through the regulation of phenylpropanoid pathway genes, early biosynthetic genes, late biosynthetic genes and transport genes. Functional characterization by ectopic expression in A. thaliana confirmed the positive role of BoMYB113.1 and the negative role of BoMYBL2.1 in anthocyanin accumulation, and VIGS in cabbage further confirmed the positive role of BoMYB113.1. The results of this study should improve our understanding of the anthocyanin biosynthetic genes in B. oleracea.

Author contribution statement

HL and ZF conceived and designed the work. FH and XZ performed the experiments and analyzed the data. FH and XZ wrote and revised the manuscript. YZ, YL, ZL, LY, MZ, and YW analyzed the data and revised the manuscript. All authors have read and approved the final manuscript.