Introduction

Ornamental kale (Brassica oleracea L. var. acephala) is a popular ornamental plant owing to its colorful leaves of varied shapes. The complete genome sequence of B. oleracea has been released to the Brassica database (BRAD; http://brassicadb.org/brad/), enabling functional genomics research in B. oleracea crops such as cabbage (B. oleracea L. var. capitata), red cabbage (B. oleracea L. var. capitata rubra), savoy cabbage (B. oleracea L. var. sabauda), ornamental kale (B. oleracea L. var. acephala), collard (B. oleracea L. var. viridis), kohlrabi (B. oleracea L. var. gongylodes), Brussels sprouts (B. oleracea L. var. gemmifera), kai-lan (B. oleracea L. var. alboglabra), broccoli (B. oleracea L. var. italica), and cauliflower (B. oleracea L. var. botrytis) (Liu et al. 2014). The genome information is also available from two other websites, Bolbase (http://www.ocri-genomics.org/bolbase/, v1.0) and EnsemblPlants (http://plants.ensembl.org/Brassica_oleracea/Info/Index).

The leaf is an important plant organ that plays a role in the photosynthetic supply of nutrients, gas exchange, light energy absorption, and water transport. These functions are strongly affected by leaf shape and size (Tsukaya 2005). Leaves can be divided into two categories, simple (single lamina) or compound, and can be distinguished by their margins, which can be entire, serrated, or lobed (Sinha 2004). Lobed leaves are adapted to environmental factors such as strong winds, water stress, and the demands of light energy absorption (Semchenko and Zobel 2007; Vogel 2009; Peppe et al. 2011). Furthermore, lobed leaves are an important decorative trait for ornamental crops such as kale (Zhang and Shen 2007). In ornamental kale, lobed-leaf double-haploid (DH) lines can be distinguished from entire-leaf DH lines as soon as the first true leaf appears and presents deep-lobed features. As the plant develops, these differences become more obvious and complex, with leaves exhibiting two to three levels of cleavage. Whereas entire leaves present a complex network of veins, the third-level veins in lobed leaves are few, poorly developed, and disordered (Fig. 1).

Fig. 1

Phenotypes of the parental ornamental kale lines and F2 populations. a Y005-8. b JC007-2. c–e Different phenotypes of the individuals in F2 populations: c lobed-leaf kale, d serrated-leaf kale, e entire-leaf kale

Key genes regulating leaf margins in plants have been identified using mutants and by reverse genetics. These genes belong to several families, namely the KNOTTED1-like homeobox transcription factor gene family, which includes KNAT1(BP), KNAT2, KNAT6, and STM (Hay and Tsiantis 2006; Piazza et al. 2010); the BEL1-like TALE family, which includes SAW1 and SAW2 (Kumar et al. 2007); the MYB transcription factor family, which includes AS1 (Byrne et al. 2002; Guo et al. 2008; Ikezaki et al. 2010; Takahashi et al. 2013); the LOB domain family, which includes AS2 and AT5G63090 (Ikezaki et al. 2010; Bell et al. 2012; Ishibashi et al. 2012; Takahashi et al. 2013); the LATERAL ORGAN BOUNDARY DOMAIN family, which includes JAGGED LATERAL ORGAN (JLO) (Borghi et al. 2007); the NO APICAL MERISTEM (NAC) transcription factor family, which includes CUC1, CUC2, and CUC3 (Hibara et al. 2006; Kamiuchi et al. 2014); the C2H2 transcription factor family, which includes JAGGED (Dinneny et al. 2004; Ohno et al. 2004); the TCP transcription factor family, which includes TCP3 (Koyama et al. 2010); the BTB/POZ family, which includes BOP1 and BOP2 (Ha et al. 2003; Norberg et al. 2005); and a homeodomain leucine zipper class I (HD-Zip I) meristem identity regulator, LATE MERISTEM IDENTITY1 (LMI1) (Saddic et al. 2006). The genes PIN1, which is involved in auxin hormone transport, and GA20ox and GA2ox, which are involved in gibberellin (GA) biosynthesis and degradation, respectively, have also been reported to play a role in the regulation of plant leaf margins (Sakamoto et al. 2001; Barkoulas et al. 2008).

Although some studies have conducted genetic analyses of lobed leaves from ornamental kale, their conclusions on lobed-leaf inheritance were inconsistent. Using F2 segregation populations, Gu et al. (2002) found that the lobed-leaf trait had incomplete dominance over the entire-leaf trait. Xie (2003) crossed a lobed-leaf kale with an entire-leaf kale and found that the F1 phenotypes tended to be intermediate but in favor of the female parent. Using reciprocal cross experiments, Zhang and Shen (2007) reported that a nuclear gene controlled incomplete dominance in the lobed-leaf trait. Zhu et al. (2016a) constructed an F2 segregation population by crossing a flat-leaved ornamental kale inbred line, 0835, and a feathered-leaved inbred line, 0819, and checked the leaf length, width, index, and number of serrated leaf margins in rosette leaves. The results indicated that more than one gene controlled the lobed-leaf trait. Feng (2016) further mapped the gene BoFeL on chromosome 9 with five linked markers. So far, one candidate gene, i.e., BrGA20OX3, related to leaf margin has been cloned in Chinese cabbage (Deng et al. 2012). The KNOX protein binds to the first intron of GA20OX3 and represses the gene expression. Further, there are some indels/deletions in the first intron of BrGA20OX3 in serrated leaf. Ni et al. (2015) fine-mapped a lobed-leaf gene (BnLl) in rapeseed, and an AT5G03790 ortholog, Bra009510, was considered a candidate gene. Other studies have mainly concentrated on mapping the gene (Hui 2011; Yan 2011; Tu et al. 2013).

In this study, we isolated a lobed-leaf DH line, and genetic analysis showed that the lobed-leaf trait in ornamental kale was quantitatively inherited. Quantitative trait loci-sequencing (QTL-seq) and linkage analysis showed that the gene is located on chromosome 9 (C09) and delimited to a 75.3-kb interval (BRAD, chromosome v1.0) or a 79.6-kb interval (EnsemblPlants). Whole-genome resequencing and sequence comparison analysis showed no variations in the promoter and gene sequences of two possible candidate genes (Bol010029 and Bol010030/Bo9g181720), whereas Bol010025 and Bol010031/Bo9g181730 were co-dominant in all the recombinant individuals. Three possible hypotheses were proposed to explain the formation of lobed leaves. The aims of the present study were to map the gene(s) controlling the lobed-leaf trait in ornamental kale and to lay a foundation for studying the mechanism of lobed-leaf formation in ornamental kale.

Materials and methods

Plants

The DH lines Y005-8 and JC007-2 were obtained from microspore cultures of ornamental kale. These double-haploid lines exhibit stable inheritance after multiple-generation breeding. The leaves of Y005-8 are red and lobed, whereas those of JC007-2 are white and entire (Fig. 1). Plants from both lines were cultivated in a greenhouse at Shenyang Agricultural University, Shenyang, China. The F1 plants were generated using Y005-8 as the female and JC007-2 as the male parent, and F2 plants were generated by self-pollination of F1 plants. The lobed-leaf phenotype was assessed at the six-leaf stage. Plant genotypes were determined by statistically analyzing the numbers of plants with lobed leaves versus those with entire leaves in the F1, F2, and four BC1 populations [i.e., (Y005-8 × JC007-2) × Y005-8, (Y005-8 × JC007-2) × JC007-2, (JC007-2 × Y005-8) × Y005-8, and (JC007-2 × Y005-8) × JC007-2]. Fifty lobed-leaf individuals and 50 entire-leaf F2 individuals together with Y005-8 and JC007-2 were selected for QTL-seq, whereas 903 F2 individuals with entire leaves together with Y005-8 and JC007-2 were used for verifying the results of QTL-seq and fine mapping.

Whole genome resequencing and QTL-seq analysis

Two bulks of individuals from the F2 mapping population, each containing 50 individuals presenting either the extreme-lobed or the entire-leaf phenotype, were assembled. Genomic DNA was isolated from the individuals within the two bulks and from the parental lines using the Plant Genomic DNA Kit (Tiangen, Beijing, China) and then purified using the Zymo Research DNA Clean & Concentrator (Zymo Research, Shanghai, China), in both cases following the manufacturer’s instructions. The quantity and quality of the DNA were checked using spectrophotometric analysis and agarose gel electrophoresis. High-quality genomic DNA from each of the four samples was used to construct DNA libraries with the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA), following the manufacturer’s instructions, with 130 mean coverage. Finally, 150-bp paired-end reads were generated. The raw reads were filtered using Trimmomatic v0.32 (Bolger et al. 2014), which removed adaptor sequences, low-quality bases (Q < 20), four-base windows with an average quality < 20, and reads shorter than 36 nucleotides. The resulting clean reads were mapped to the reference genome (BRAD, Brassica oleracea, v1.0) using Bowtie 2 v2.1.0 (Langmead and Salzberg 2012), and single nucleotide polymorphism (SNP) calling was performed in GATK (The Genome Analysis Toolkit; http://www.broadinstitute.org/gatk/). All positions with at least 25% of the bases different from the majority call were identified as heterozygous SNPs, whereas all positions with ≥ 80% of bases different from the reference and supported by at least three nonrepetitive reads were identified as homozygous SNPs. The SNP-index and Δ(SNP-index) of the QTL-seq assay were used to define the target QTL genomic interval, with a 95% confidence interval, as described previously (Takagi et al. 2013).
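The SNP-index statistics referred to above can be illustrated with a minimal Python sketch. It assumes the usual QTL-seq definition, in which the SNP-index at a position is the fraction of reads carrying the non-reference allele and Δ(SNP-index) is the difference between the two bulks. The function names, the order of subtraction (lobed minus entire), and the read counts are illustrative assumptions and are not taken from the pipeline used in this study.

```python
def snp_index(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at one position that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def delta_snp_index(lobed_bulk: tuple, entire_bulk: tuple) -> float:
    """Delta(SNP-index) for one position.

    Each argument is (alt_reads, total_reads) for one bulk. The order
    lobed - entire is an assumption made for this illustration.
    """
    return snp_index(*lobed_bulk) - snp_index(*entire_bulk)


def window_mean(values: list) -> float:
    """Average Delta(SNP-index) over the SNPs falling in one sliding window."""
    if not values:
        raise ValueError("window contains no SNPs")
    return sum(values) / len(values)


# Hypothetical read counts for three SNPs in one window: (alt, total) per bulk.
lobed = [(45, 50), (40, 50), (48, 60)]
entire = [(5, 50), (10, 50), (6, 60)]
deltas = [delta_snp_index(l, e) for l, e in zip(lobed, entire)]
window_delta = window_mean(deltas)
```

Positions unlinked to the trait are expected to give Δ(SNP-index) values near zero, whereas a window whose mean lies outside the confidence interval points to a candidate QTL region.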
Fisher’s exact test was also used to verify the results according to the following formula (https://en.wikipedia.org/wiki/Fisher%27s_exact_test):

$$ p=\frac{\left(a+b\right)!\left(c+d\right)!\left(a+c\right)!\left(b+d\right)!}{a!b!c!d!n!} $$

Here, a, b, c, and d represent the SNP read counts of the different phenotypic samples, and n is the sum of a, b, c, and d. Indels and SNPs that differed between the two parents were also identified from the parental resequencing data.
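As a worked illustration of the formula above, the following Python sketch computes the probability of a single 2 × 2 table and, from it, a two-sided p value obtained by summing the probabilities of all tables with the same margins that are no more probable than the observed one. The layout of the table (which read counts go in a, b, c, and d) and the example counts are assumptions made for illustration only.

```python
from math import comb


def table_probability(a: int, b: int, c: int, d: int) -> float:
    """Probability of one 2 x 2 table with fixed margins.

    Equivalent to (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! n!),
    written with binomial coefficients to avoid huge factorials.
    """
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided p value: sum over all tables at most as probable as observed."""
    row1, row2, col1 = a + b, c + d, a + c
    p_obs = table_probability(a, b, c, d)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        # Small relative tolerance so that ties with p_obs are included.
        if p <= p_obs * (1 + 1e-9):
            total += p
    return total


# Hypothetical read counts at one SNP:
# a, b = alternative and reference reads in one bulk; c, d = the same in the other.
p_point = table_probability(1, 9, 11, 3)
p_value = fisher_two_sided(1, 9, 11, 3)
```

In a QTL-seq setting, such a p value would be computed for each SNP and plotted as –log10(P value) along the chromosome.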

DNA isolation and amplification

Genomic DNA was extracted from fresh leaves of parent and F2 individuals at the six-leaf stage using cetyltrimethylammonium bromide (CTAB) and the method of Murray and Thompson (1980), with minor modifications for verifying the QTL-seq analysis. Genomic DNA concentrations were estimated using agarose gel electrophoresis. Polymerase chain reaction (PCR) amplification was carried out in a 10-μL reaction solution containing 25 ng template DNA, 0.8 μL 2.5 mM dNTPs, 1.0 μL Taq PCR buffer with Mg2+, 1.0 μL of 0.4 μM forward and reverse primers, and 0.25 U Taq polymerase. The PCR program comprised 5 min at 95 °C followed by 30 cycles of 95 °C for 30 s, 56 °C for 30 s, 72 °C for 1 min, and a final elongation period of 5 min at 72 °C. The PCR products were separated on a 5% denaturing polyacrylamide gel in 1× Tris/Boric Acid/EDTA buffer and visualized with silver staining (Sanguinetti et al. 1994).

SSR and indel marker development and data analysis

The candidate region and other possible regions for BoLl identified by QTL-seq were verified using 30 SSR markers distributed on chromosome 9, which had previously been designed in our laboratory based on the genome sequence from BRAD, v1.0. Forty-three Indel markers were designed based on the resequencing data, using Primer Premier 5.0 (http://www.premierbiosoft.com/primerdesign/), to narrow down the interval. Polymorphism analysis of the parental lines was performed using the SSR and Indel markers. Six SSR markers and six Indel markers that were polymorphic between the two parental lines were used to survey 30 recessive homozygous individuals from the F2 population. The linked markers were then applied to the remaining 873 recessive homozygous F2 individuals (Table S1). Map distances were calculated using Kosambi’s (1943) mapping function. Segregation data for the polymorphic markers and leaf traits were used to construct a linkage map of the F2 population in MapChart (Voorrips 2002).
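Kosambi’s mapping function converts a recombination fraction r into a map distance, d = 25 ln[(1 + 2r)/(1 − 2r)] cM. A minimal Python sketch is given below. Estimating r as the number of recombinant gametes divided by twice the number of recessive homozygous F2 individuals is an assumption of this illustration, as are the function names and the example counts.

```python
from math import log


def recombination_fraction(recombinant_gametes: int, individuals: int) -> float:
    """Estimate r from recessive homozygous F2 plants (two gametes per plant).

    This simple counting estimator is an assumption of the sketch.
    """
    if individuals <= 0:
        raise ValueError("individuals must be positive")
    return recombinant_gametes / (2 * individuals)


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in centimorgans for recombination fraction r."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * log((1 + 2 * r) / (1 - 2 * r))


# Hypothetical example: 20 recombinant gametes among 500 scored plants.
r = recombination_fraction(20, 500)
distance_cm = kosambi_cm(r)
```

For small r the Kosambi distance is close to 100r cM, so tightly linked markers give nearly the same value under any common mapping function.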

Identification of genes influencing leaf morphogenesis in B. oleracea

Sequences of genes and proteins that influence leaf morphogenesis in Arabidopsis thaliana were acquired from the Arabidopsis Information Resource (TAIR) database (http://www.arabidopsis.org/) and the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov). Sequences of genes influencing leaf morphogenesis in B. oleracea were obtained from BRAD (chromosome v1.0). The A. thaliana gene and protein sequences were aligned with the genome and protein sequences of B. oleracea using the basic local alignment search tools BLASTN and BLASTP, with a cutoff E-value ≤ 1 × 10^−10 and coverage ≥ 0.75. Syntenic orthologs between A. thaliana and B. oleracea were identified according to sequence similarity (cutoff E ≤ 1 × 10^−20) and the collinearity of flanking genes.
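The hit-filtering step can be sketched in Python as follows. The sketch assumes that each BLAST hit has already been parsed into a dictionary and that coverage means alignment length divided by query length; the text does not define coverage, so this definition, the field names, and the example records are all illustrative assumptions.

```python
def filter_hits(hits, max_evalue=1e-10, min_coverage=0.75):
    """Keep hits that pass the E-value and coverage thresholds.

    Each hit is a dict with the hypothetical keys 'query', 'subject',
    'evalue', 'aln_length', and 'query_length'. Coverage is taken as
    aln_length / query_length, which is an assumption of this sketch.
    """
    kept = []
    for hit in hits:
        coverage = hit["aln_length"] / hit["query_length"]
        if hit["evalue"] <= max_evalue and coverage >= min_coverage:
            kept.append(hit)
    return kept


# Hypothetical records; the identifiers are placeholders, not real results.
example_hits = [
    {"query": "geneA", "subject": "hit1", "evalue": 1e-50,
     "aln_length": 900, "query_length": 1000},
    {"query": "geneA", "subject": "hit2", "evalue": 1e-5,
     "aln_length": 950, "query_length": 1000},
    {"query": "geneA", "subject": "hit3", "evalue": 1e-40,
     "aln_length": 400, "query_length": 1000},
]
passing = filter_hits(example_hits)
stringent = filter_hits(example_hits, max_evalue=1e-20)
```

Calling the same function with `max_evalue=1e-20` mirrors the stricter cutoff used for ortholog assignment; the collinearity check on flanking genes is a separate step not covered here.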

Candidate gene prediction

All genes within the targeted interval were identified using annotations from BRAD. Differences between the genes from the two parental lines were assessed using resequencing data. All SNPs and indels were verified by cloning. The PCR primers designed to amplify the full-length sequences of the candidate genes and promoters from genomic DNA of Y005-8 and JC007-2 are indicated in Table S2. The PCR products were purified using the Gel Extraction Kit (CWBIO, Beijing, China), introduced into the PMD 18-T Vector (Takara, Dalian, China), and transformed into TOP10 competent cells (CWBIO). The recombinant plasmids were sequenced by Genewiz (Tianjin, China) and sequences were aligned using DNAMAN 6 (https://www.lynnon.com/).

RNA isolation and transcript analysis

Total RNA was extracted from the inner leaves of two mature parent plants using the Total RNA Purification kit (LC Science, Houston, TX, USA; TRK1001), according to the manufacturer’s protocol. RNA quantity and purity were monitored using the Bioanalyzer 2100 and the RNA 6000 Nano LabChip kit (Agilent Technologies, Santa Clara, CA, USA), according to the manufacturer’s protocol. Poly(A) mRNA was purified from total RNA using oligo (dT) magnetic beads, and fragmented into small pieces by the addition of a fragmentation buffer. First-strand cDNA was synthesized using random hexamer primers, and second-strand cDNA was synthesized using buffer, dNTPs, RNaseH, and DNA polymerase I. The library products were then paired-end sequenced using the Illumina HiSeq 4000 platform (Illumina) at LC Science (Hangzhou, China), according to the manufacturer’s protocol. The clean reads were mapped to the reference genome (http://brassicadb.org/brad/index.php, v1.0) using HISAT (Kim et al. 2015), with a maximum of two mismatches and up to 20 multiple alignments per read. Differentially expressed genes (DEGs) were identified using the R package Ballgown (https://www.r-project.org/) with log2 (fold change) > 1 and p value < 0.05 as the significance thresholds.
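The DEG thresholds can be illustrated with a short Python sketch. It is not the Ballgown procedure itself: it only applies the two cutoffs to values that are assumed to have been computed already. Treating the fold-change cutoff as an absolute value (so that both up- and downregulated genes are retained) is an assumption of this sketch, and the gene names and numbers are placeholders.

```python
def is_deg(log2_fold_change: float, p_value: float,
           lfc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> bool:
    """Apply the fold-change and p value thresholds to one gene.

    The absolute value of log2(fold change) is used so that both up- and
    downregulated genes can pass; this is an assumption of the sketch.
    """
    return abs(log2_fold_change) > lfc_cutoff and p_value < p_cutoff


# Hypothetical per-gene results: (log2 fold change, p value).
results = {
    "gene_up": (2.3, 0.01),
    "gene_down": (-1.8, 0.03),
    "gene_small_change": (0.4, 0.001),
    "gene_not_significant": (3.0, 0.20),
}
degs = sorted(name for name, (lfc, p) in results.items() if is_deg(lfc, p))
```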

Analysis of gene expression patterns

Total RNA was extracted from parental plant leaves during the early and ornamental stages using TRIzol (Invitrogen, Carlsbad, CA, USA), according to the manufacturer’s protocol. RNA integrity was monitored using 1% agarose gel electrophoresis. Aliquots of RNA (1 μg) were reverse-transcribed using the HiScript® II 1st Strand cDNA Synthesis Kit (Vazyme, Nanjing, China) according to the manufacturer’s instructions, and the resulting cDNA was diluted 50-fold. Quantitative real-time PCR (qRT-PCR) was carried out in 50-μL reactions comprising 1 μL diluted cDNA, 10 μL SYBR Green PCR mix (CWBIO), and 0.5 μL of each primer (100 nM final concentration). The candidate gene-specific primers used were Bol010029-L 5′-CGAGATGATAAAGAAGAAGCAGAAAC-3′, Bol010029-R 5′-CAAACAGCTACCTGACGTGGC-3′, Bol010030-L 5′-AAAGAAGAAGAGACTAACGAGTGGACA-3′, and Bol010030-R 5′-GCGATCTGACGTGGCTGC-3′. Actin was used as the internal reference gene, and it was amplified by qRT-PCR using primers Actin-L (5′-GGTCGTGACCTTACTGATTACCTCA-3′) and Actin-R (5′-GAAGTCTCCATCTCCTGCTCGT-3′). The qRT-PCR conditions were as follows: 95 °C for 60 s, followed by 40 cycles of 95 °C for 10 s and 60 °C for 20 s. Melt curve analysis (55–95 °C) was performed to confirm the specificity of the PCR amplification. The qRT-PCR experiments were performed in QuantStudio 6 (Applied Biosystems, Foster City, CA, USA). Three biological and three technical replicates were included for each experiment. Relative expression was calculated using the 2^−ΔΔCt method (Livak and Schmittgen 2001).
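The 2^−ΔΔCt calculation can be written out as a short Python sketch. ΔCt is the Ct of the target gene minus the Ct of the reference gene (Actin here), and ΔΔCt is the ΔCt of the sample minus the ΔCt of the calibrator. Which parental line serves as the calibrator is not stated in the text, so the choice below, like the Ct values, is purely illustrative.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts) -> float:
    """Mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(sample_dct: float, calibrator_dct: float) -> float:
    """Relative expression by the 2^-ddCt method (Livak and Schmittgen 2001)."""
    ddct = sample_dct - calibrator_dct
    return 2.0 ** (-ddct)


# Hypothetical technical-replicate Ct values.
sample_dct = delta_ct([24.0, 24.1, 23.9], [18.0, 18.0, 18.0])      # e.g. lobed leaf
calibrator_dct = delta_ct([26.0, 26.1, 25.9], [18.0, 18.0, 18.0])  # e.g. entire leaf
fold_change = relative_expression(sample_dct, calibrator_dct)
```

With these example values the target gene amplifies two cycles earlier in the sample than in the calibrator, which corresponds to an approximately fourfold higher relative expression.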

Bioinformatics analysis of the genes

The protein sequences of Bol010025, Bol010029, Bol010030, and Bol010031 were analyzed using HMMER (http://plants.ensembl.org/hmmer/index.html/), and all gene and protein sequences were aligned using DNAMAN 6. The sequences of AT5G03770, Bol010031, and BnaCnng025800 from Arabidopsis, Brassica oleracea, and Brassica napus, respectively, were downloaded from BRAD. The secondary structures of the proteins from the two parental lines were predicted using PredictProtein (https://open.predictprotein.org/). Cis-acting regulatory elements were predicted using PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/).

Results

Genetic analysis

Plants from the F1, F2, and BC1 populations at the six-leaf stage were used to investigate the inheritance pattern of the lobed-leaf trait. Leaf shape in F1 plants from reciprocal crosses between Y005-8 and JC007-2 was intermediate between lobed and entire. The F2 population contained lobed-, serrated-, and entire-leaf plants, and the phenotypes showed continuous variation. These results suggested that the lobed-leaf trait in ornamental kale was quantitatively inherited (Fig. 1).

Candidate QTL for lobed leaf identified by QTL-seq

Overall, 1276.4 million reads were obtained from the two parental and two bulk populations, of which 1048.1 million reads were mapped (Table S3), and 752,678 SNPs were identified between the two parental populations. The SNP index and Δ(SNP index) of the two bulk populations were calculated and plotted against genome position (Fig. 2). The highest peak region, which was considered the candidate interval for the QTL, was located on chromosome 9 (C09) and spanned 38.82–40.12 Mb (BRAD, v1.0).

Fig. 2

ΔSNP-index and –log10(P value) plots used for locating the candidate interval for the major QTL. a Δ(SNP-index) plot from QTL-seq analysis. b –Log10(P value) plot from QTL-seq analysis. c Candidate QTL region identified in B. oleracea chromosome C09 (38.82–40.12-Mb interval). Arrows point out the major QTL

Validation of the identified QTL by SSR and Indel markers

A traditional map-based cloning strategy with minor revisions was used to validate the QTL region identified by QTL-seq analysis. Thirty SSR markers distributed on C09 were screened for polymorphism between the two parental lines, and six of these markers were polymorphic. Thirty entire-leaf individuals were used to further confirm whether the markers were linked to the candidate gene BoLl; marker RJ0927 was linked to this gene. The markers located around other peaks were not linked to the gene. Another 873 recessive homozygous individuals were used for gene mapping, which showed that RJ0927 was linked to BoLl at a genetic distance of 1.33 cM. Primers for Indel markers close to the QTL were then developed for PCR amplification-based detection, and one Indel marker, LYIn30, was found to be linked to BoLl. These results indicated that BoLl is located between markers LYIn30 and RJ0927, at genetic distances of 1.38 cM and 1.33 cM, respectively (Fig. 3), and confirmed that the interval found by QTL-seq analysis was indeed the QTL region.

Fig. 3

Genetic and physical maps of the BoLl gene locus, and candidate gene analysis. a Map of the BoLobed-Leaf (BoLl) locus on chromosome 9 built using five markers. b Genetic and physical maps of the candidate genes in the targeted interval

Fine-mapping of BoLl

To further shorten the mapping interval and identify the candidate genes for BoLl, 23 indel markers were developed within the targeted gene region, according to the B. oleracea genome sequence from BRAD (chromosome v1.0). Five of these markers (LYIn35, LYIn39, LYIn29, LYIn26, and LYIn40) were polymorphic between the two parental lines and linked to BoLl. While LYIn35 and LYIn39 were on the same side as LYIn30, at a distance of 0.50 cM and 0.17 cM from BoLl, respectively, LYIn40 and RJ0927 were on the opposite side at a distance of 0.11 cM from BoLl (Fig. 3). The physical distance between the two most closely linked markers, LYIn39 and LYIn40, was estimated as 75.3 kb. LYIn29 and LYIn26 showed co-segregation with BoLl in all F2 individuals (Fig. 3).

Gene expression levels in the two parental lines

Twenty-three different gene families known to influence leaf morphogenesis have been described in A. thaliana, and 112 homologs were found in B. oleracea (Table S4), of which 10 (Bol011529, Bol011530, Bol011402, Bol010029, Bol010030, Bol043862, Bol038831, Bol038837, Bol038832, and Bol038834) were mapped to C09 in B. oleracea. Furthermore, 20 genes were assigned to different scaffolds that have not been anchored to any chromosome of the B. oleracea genome v1.0. Genomic data from BRAD indicated that the 75.3-kb mapped region contained eight genes (Table 1). Of the ten genes located on chromosome 9, only two genes (Bol010029 and Bol010030) were located in the interval. Bol010029 and Bol010030 were syntenic with the known leaf shape modification gene LMI1/ATHB51, and Bol010029 and Bol010030 also formed a tandem array. LMI1 is a meristem identity regulator that plays an important role in controlling leaf shape and size in Arabidopsis. Based on the functional annotation of these genes in the target QTL interval, we hypothesized that either Bol010029 or Bol010030 or both were the candidate genes on the BoLl locus.

Table 1 Annotation of the genes included in the mapped interval within chromosome 9

We also analyzed the expression patterns of the 112 genes influencing leaf morphogenesis in ornamental leaves. The fragments per kilobase of transcript per million mapped reads (FPKM) values obtained for 30 of the 112 genes analyzed were zero, and these genes may not be expressed in older leaves. The expression levels of half of the 112 genes were similar between the two parental lines. The expression levels of another 30 genes, including KNOX, NAC transcription factor, and LOB DOMAIN genes, among others, varied greatly between lobed and entire leaves. However, only four DEGs were identified: two of these genes (Bol028963 and Bol024911) were upregulated and the other two (Bol040876 and Bol038837) were downregulated (Fig. 4a, Table S4). We also found that the expression level of Bol010029 in lobed leaves was significantly higher than that in entire leaves, although the FPKM values of lobed and entire leaves were only 0.09 and 0, respectively.

Fig. 4

Gene expression patterns in ornamental kale. a Expression patterns of genes related to the lobed-leaf trait. b Expression patterns of genes in the target QTL interval. c Expression levels of genes in young leaves from two parents

The expression patterns of the eight candidate genes were also investigated, but only one DEG, Bol010024, was found. The FPKM values for Bol010025 and Bol010026 from the two parental lines were zero (Fig. 4b), while the expression level of Bol010031 in leaves was similar in the two parental lines. Furthermore, Bol010025 and Bol010026 might not be expressed at the time point.

A previous study showed that LMI was highly expressed in young leaves in Arabidopsis (Saddic et al. 2006), and therefore, we detected the expression level of these two genes in young kale leaves. The expression levels of the two putative LMI genes (i.e., Bol010029 and Bol010030) in lobed leaves were significantly higher than that in entire leaves (Fig. 4c).

Sequence analysis

The Bol010029 and Bol010030 genes and promoter sequences from Y005-8 and JC007-2 were cloned using six primer pairs (Table S2). Two percent agarose gel electrophoresis showed that no polymorphisms were detected by these primer pairs. The PCR products of all primer pairs were sequenced with three replications and all sequences were compared, but showed no differences between the parental lines. We then compared the gene and promoter sequences of the remaining six genes based on resequencing data, which showed the presence of some SNPs and indels in the parental lines (Table S5). Cloning results revealed variations in three genes (Bol010024, Bol010025, and Bol010031) (Fig. S1). The sequence of Bol010026 was almost identical between Y005-8 and JC007-2, and according to the resequencing data, nine single base differences were located in the promoter region; however, no SNP was detected in the promoter region by cloning and sequencing. We then cloned Bol010024, Bol010025, and Bol010031 from the five recombinant individuals between the most closely linked markers (LYIn39 and LYIn40), and Bol010025 and Bol010031 sequences showed co-dominance in all recombinant individuals (Fig. S2).

As some genes related to lobed leaf were not aligned to any chromosome, we also aligned the sequences of the two most closely linked markers, LYIn39 and LYIn40, to the B. oleracea genome on EnsemblPlants, and the physical distance between these two markers was estimated as 79.6 kb. Eleven genes were located in the interval, namely, Bo9g181640, Bo9g181650, Bo9g181660, Bo9g181670, Bo9g181680, Bo9g181690, Bo9g181700, Bo9g181710, Bo9g181720, Bo9g181730, and Bo9g181740. The sequences of Bo9g181640 and Bol010024, Bo9g181670 and Bol010027, Bo9g181720 and Bol010030, and Bo9g181730 and Bol010031 showed high similarity. Only parts of the sequences of Bo9g181660 and Bol010026, Bo9g181700 and Bol010028, and Bo9g181710 and Bol010029 matched each other. The sequences of Bo9g181650, Bo9g181680, Bo9g181690, and Bo9g181740 did not match any of the eight genes. We then cloned Bo9g181660, Bo9g181700, Bo9g181710, Bo9g181650, Bo9g181680, Bo9g181690, and Bo9g181740 from the two parents. No differences in the Bo9g181660, Bo9g181700, Bo9g181710, and Bo9g181650 gene sequences were found between the parental lines. Only a part of the sequences of Bo9g181690 and Bo9g181740 was identified in the parental lines, and we inferred that these two genes did not exist in the parental lines (Fig. S1). Furthermore, the initiation codon (ATG) of Bo9g181680 was not found in the parental lines, and we inferred that this gene did not exist in the parental lines either (Fig. S1).

In summary, we inferred that one or more of Bol010029, Bol010030/Bo9g181720, Bol010025, and Bol010031/Bo9g181730 may be related to the formation of lobed leaf.

Bioinformatics analysis of the candidate genes

Among the four candidate genes, only three, Bol010029/Bo9g181710, Bol010030/Bo9g181720, and Bol010031/Bo9g181730, contained conserved domains, namely the homeobox, homeobox, and Glycos_transf_N domains, respectively. Because Bol010031/Bo9g181730 differed between lobed and entire leaves, we analyzed the gene and promoter sequences of the two parental lines. The Bol010031/Bo9g181730 gene, located on C09, contains 2805 bp, 11 exons, and 10 introns. The full coding region length is 1362 bp, and the protein sequence comprises 453 amino acids. Compared with the JC007-2 gene sequence, the Y005-8 sequence has a 6-bp insertion in exon 1, a 2-bp change in exon 1, and a single-bp change in exon 4 (Fig. S1E). These polymorphisms result in two amino acid insertions (Ala and Pro, 9 aa and 10 aa, respectively), and two amino acid substitutions (Thr to Cys and Cys to Thr; Fig. S1F). The gene contains a glycosyl transferase N domain, spanning amino acids 55 to 236, and one of the substituted amino acids in the lobed-leaf phenotype (Met) is located in this domain. We then compared 3-deoxy-D-manno-octulosonic acid transferase-related genes among Cruciferae species. In the lobed-leaf kale, the substituted amino acid was Met, while in other species, including the entire-leaf kale, this amino acid position harbored a Leu (Fig. S3).

We also compared the secondary structure of Bol010031/Bo9g181730 in Y005-8 and JC007-2. We found changes in three binding regions and one disordered region in the lobed-leaf line (Fig. S4). As there were insertions in the promoter region of Bol010031/Bo9g181730 in Y005-8, cis-acting regulatory elements were also investigated. The results showed that the insertions led to specific motifs in the lobed-leaf kale, including ACE, Box-W1, GNC4, P-box, W-box, and two other unnamed motifs. Furthermore, some motif sequences were disrupted, such as the 5′ untranslated region (UTR) Py-rich stretch, Box II, CATT motif, GAG motif, GATA motif, Gap-box, and an O2-site (Table S6).

Discussion

The leaf shape of ornamental kale has been examined previously, with some studies reporting that the lobed-leaf trait in ornamental kale is controlled by a single dominant gene, while others have shown that the lobed-leaf trait is a quantitative phenotype controlled by several major QTLs (Gu et al. 2002; Xie 2003; Zhang and Shen 2007; Zhu et al. 2016a). In the present study, we investigated the leaf shape of F1, F2, and BC1 populations of ornamental kale. In F2, lobed-, serrated-, and entire-leaf phenotypes were found, suggesting that the mechanism underlying lobed-leaf formation in kale is complex.

LMI is a member of the HD-Zip transcription factor family. To date, LMI1-like genes have been found in several plants and are reportedly involved in leaf shape formation (Saddic et al. 2006; Andres et al. 2014; Sicard et al. 2014; Vlad et al. 2014; Vuolo et al. 2016; Andres et al. 2017). An AT5G03790 ortholog, Bra009510, was considered the candidate gene in lobed-leaf rapeseed (Ni et al. 2015). Saddic et al. (2006) reported that LMI plays an important role in simple serrate leaves and that it is strongly expressed in the margin of young leaves rather than in old leaves. The loss of LMI leads to a higher expression level of the KNOX1 transcription factor (KNAT1), and LMI has been reported to negatively regulate BLADE-ON-PETIOLE (BOP) gene expression and further influence the expression of JAGGED to suppress leaflet and bract formation. One LMI1 homolog, REDUCED COMPLEXITY (RCO), has been found to play an important role in leaflets, and it is considered to have arisen from the duplication of LMI1-type sequences within Brassicaceae (Vlad et al. 2014). Furthermore, RCO is reported to have a key role in complex leaf shape formation, and the simple leaf shape in A. thaliana is due to the loss of RCO. After analyzing the well-established role of auxin activity in leaf shape formation, Vlad et al. (2014) proposed that RCO acts parallel to or downstream of auxin activity. Vuolo et al. (2016) also found that the introduction of RCO and ChKMI1 into Col-0 A. thaliana decreased leaf area. The LMI1 gene has also been considered an evolutionary hotspot for leaf shape diversity in model plants (Andres et al. 2017).

Up to now, three class I HD-ZIP genes have been found in Brassicaceae, including the RCO locus, and the function of these three genes has also been identified. The locus was formed through two independent gene duplication events, and the three gene copies were denoted as positions 1, 2, and 3. The position 1 gene, also known as LMI1, was identified by Saddic et al. (2006), and it influences leaf serration in A. thaliana. The position 2 gene was found by Vlad et al. (2014), and it promotes the formation of leaflets in the compound leaf of Cardamine hirsuta. The position 3 gene controls leaf shape variation in Capsella sp. (Sicard et al. 2014). The loss of the position 2 and position 3 genes leads to the formation of simple leaves. Streubel et al. (2018) found that the position 2 and 3 genes acted independently and that these three genes played different roles in different parts of the leaf.

Other studies have also identified LMI as the candidate gene for lobed leaf. Andres et al. (2014) mapped L20 to a 5.4-cM interval using genetic markers and identified two LMI1-like candidate genes for the lobed-leaf trait in cotton. The STS marker 13-LS-195 obtained from the candidate gene Gorai.002G244000 co-segregated with the entire-leaf phenotype in F2 cotton populations, and this gene was thus proposed as a candidate gene for the lobed-leaf trait in cotton. Ni et al. (2015) considered Bra009510, an ortholog of AtLMI1, as the candidate gene for the lobed-leaf gene in rapeseed (Brassica napus L.). Andres et al. (2017) identified an LMI1-like gene as responsible for okra leaf shape formation in Upland cotton (Gossypium hirsutum L.), and virus-induced gene silencing of the LMI1-like gene recovered the normal shape of the leaf. Zhu et al. (2016b) identified GhOKRA as the candidate LMI1-like gene for okra leaf formation. Although few studies have focused on the KDTA gene, Seveno et al. (2010) identified the AtKDTA gene and suggested that it might be involved in the synthesis of a non-identified mitochondrial lipid A-like molecule. However, cessation of the expression of this gene did not lead to phenotypic changes. Li et al. (2011) also found that the AtKDTA product was located in mitochondria.

In the present study, we found two LMI1-like genes in the mapped QTL interval. Although no differences were found in promoter or gene sequences, the expression levels of these two genes were significantly higher in young lobed leaves than in young entire leaves, which is consistent with the results of Saddic et al. (2006). Only Bol010025 and Bol010031/Bo9g181730 showed co-dominance in the recombinant individuals. Bol010031/Bo9g181730 was identified as a 3-deoxy-D-manno-octulosonic acid transferase-related gene (KDTA) involved in the biosynthesis of lipid A, a phosphorylated glycolipid, which serves as a structural component of the outer membrane of mitochondria and chloroplasts in plants, and may be involved in signal transduction and plant defense responses. However, there is no evidence that the 3-deoxy-D-manno-octulosonic acid transferase-related gene is involved in leaf margin formation. Because there were un-annotated genomic sequences in the interval between Bol010029/Bo9g181710 and Bol010030/Bo9g181720, three hypotheses were proposed to explain the formation of the lobed leaf. The first is that the 3-deoxy-D-manno-octulosonic acid transferase-related gene is involved in lobed-leaf formation. The second is that there is an un-annotated gene in the interval that is related to leaf shape formation or that regulates the expression of the LMI-like gene in the lobed leaf. The third hypothesis is that another HD-ZIP gene is located between Bol010029/Bo9g181710 and Bol010030/Bo9g181720, and that a variation in this gene leads to lobed-leaf formation. In future studies, we will use a larger mapping population to narrow the target QTL interval, and transgenic methods will be applied to further confirm the model of lobed-leaf formation in ornamental kale.