Introduction

In the seeds of oil crops, triacylglycerols (TAGs) stored in lipid droplets (LDs) represent the major energy and carbon reserve for seed germination and subsequent seedling growth (Huang 2018; Guzha et al. 2023). LDs are enclosed by a layer of phospholipids and several structural proteins (Hsieh and Huang 2004). Among these proteins, oleosins (OLEs) are the most abundant and feature a long central hydrophobic hairpin of approximately 72 residues (X30PX5SPX3PX30), which penetrates the TAG core and functions to stabilize the LD (Abell et al. 1997; Shimada et al. 2008; Brocard et al. 2017). In contrast, the sequences at the N- and C-termini, which lie on the phospholipid surface, are usually amphipathic and variable (Huang 2018). Generally, oleosins possess a small molecular weight of 14–30 kilodalton (kDa) (Huang and Huang 2015; Zou et al. 2022a). Since its first appearance in green algae, the oleosin family has expanded and diverged considerably in terrestrial plants, and 2–53 members representing five clades, i.e., U (universal), SL (seed low-molecular-weight), SH (seed high-molecular-weight), T (tapetum in Brassicaceae), and M (mesocarp in Lauraceae), have been described (Schein et al. 2004; Huang and Huang 2015; Jia et al. 2022; Zou et al. 2022a). Interestingly, accumulation of oleosins in seeds not only determines the LD size but also positively relates to the oil content (Siloto et al. 2006; Zhang et al. 2019). Nevertheless, despite their high abundance in seeds as well as pollen and tapetum cells, oleosins have rarely been found in the LDs of fruits and leaves (Hsieh and Huang 2004; Horn et al. 2013; Kretzschmar et al. 2020).

Cyperus esculentus L., commonly known as yellow nutsedge or tigernut, is an herbaceous perennial C4 plant of the Cyperaceae (sedge) family within Poales (De Castro et al. 2015; Xiao et al. 2022; Zou et al. 2021, 2023a, b). Unlike most Poaceae species predominately accumulating starch in their seeds, tigernut is the only known species that produces high amounts of oil (up to 35%) in underground tubers, which is also different from its close relative purple nutsedge (Cyperus rotundus, Cyperaceae) as well as oil palm (Elaeis guineensis), an Arecaceae plant bearing oil in the fruit mesocarp instead (Stoller and Weber 1975; Turesson et al. 2010; Singh et al. 2013; Codina-Torrella et al. 2015; Zou et al. 2021; Xu et al. 2022). Oil extracted from tigernut tubers, which is mainly composed of oleic acid (up to 80%), can be consumed as healthy edible oil or as an alternative resource of biodiesel fuel (Zhang et al. 1996; Barminas et al. 2001; Makareviciene et al. 2013; Maduka and Ire 2018). Tigernut is a promising oil crop for its high yield and wide regional adaptability, including tropical, subtropical, temperate as well as cold zones (Stoller and Sweet 1987; De Castro et al. 2015; Zou et al. 2022b). It has been reported that tigernut can produce 4.5–12 t tubers per hectare, which means that its oil yield could reach up to approximately 90–240 kg per acre, lower than oil palm but much higher than most oilseed crops such as maize (Zea mays), rapeseed (Brassica napus), and soybean (Glycine max) (Makareviciene et al. 2013; Maduka and Ire 2018). Compared with oilseed crops, the mechanism of oil accumulation in tigernut tubers is poorly understood.

According to our previous study, rapid accumulation of oil during tuber development is accompanied by an increasing number of LDs (Jin et al. 2010). More recently, proteome analysis of LD-enriched fractions of tigernut tubers revealed the high abundance of oleosins (Niemeyer et al. 2022). Given the essential roles of oleosins in LD formation and stabilization (Siloto et al. 2006; Shimada et al. 2008; Zhang et al. 2019; Ojha et al. 2021), in this study, we conducted a genome-wide characterization of the oleosin family in tigernut, including gene localizations, gene structures, sequence characteristics, and evolutionary patterns. Moreover, the correlation of CeOLE mRNA/protein abundance with oil accumulation during tuber development, subcellular localizations, protein interaction patterns, and the effect of overexpression in tobacco (Nicotiana benthamiana) were also investigated. These results provide valuable information for further functional analysis and genetic improvement in this special species.

Materials and methods

Plant materials

In this experiment, a tigernut variety named Reyan3 (Zou et al. 2021) was used, which was cultivated in sandy soil in Wenchang city, Hainan province, China. At approximately 7 days after sowing (DAS), shoots started to emerge above the soil from seed tubers. At about 30 DAS, new tubers started to develop at the apex of stolons. To determine the dry mass and oil content, tubers were periodically collected from the onset of tuber initiation until maturation, spanning about 35 d as described before (Zou et al. 2022b). Since tubers appear throughout the growth cycle, this study adopted the term DAI (days after tuber initiation) instead of DAS used by several researchers (Jin et al. 2010; Bai et al. 2021; Wang et al. 2021). For gene cloning, gene expression analysis, and protein quantification, fresh tubers of representative stages (i.e. 1, 3, 5, 10, 15, 20, 25, and 35 DAI) were quickly frozen in liquid nitrogen and stored at −80 °C. Additionally, young leaves were collected for genomic DNA extraction. For analyses of bimolecular fluorescence complementation (BiFC), subcellular localization, and oil regulation, tobacco plants were grown as previously described (Qiao et al. 2022a; Xu et al. 2022).

Identification of oleosin family genes by using public datasets

As shown in Supplementary Table S1, oleosin genes described in rice (Oryza sativa), arabidopsis (Arabidopsis thaliana), and Amborella trichopoda (Huang and Huang 2015; Zou et al. 2022a) were obtained from RGAP7 (http://rice.plantbiology.msu.edu/), TAIR11 (https://www.arabidopsis.org/), and Phytozome v13 (https://phytozome.jgi.doe.gov/pz/portal.html), respectively. Genomic and transcriptome data of tigernut, purple nutsedge, Carex littledalei, C. myosuroides, Rhynchospora breviuscula, and Juncus effusus were accessed from CNGBdb (https://db.cngb.org/search/assembly/CNA0051961/) and NCBI (https://www.ncbi.nlm.nih.gov/). RNA sequencing (RNA-seq) reads (NCBI Project accession no. PRJNA671562, 150 bp paired-ends) of purple nutsedge tubers were de novo assembled using Trinity (v2.13.2, https://github.com/trinityrnaseq/trinityrnaseq/wiki), which resulted in 31,192 unigenes for the construction of an in-house library. To identify oleosin family genes, HMMER (v3.3, http://hmmer.janelia.org/) searches were performed using the Pfam profile PF01277 (v35.0, https://pfam.xfam.org/). Gene models of candidates were further curated with mRNAs when available, and presence of the conserved oleosin domain in deduced proteins was confirmed by Pfam Search.

Sequence alignment and phylogenetic analysis

Nucleotide and protein multiple sequence alignments were conducted using ClustalW and MUSCLE implemented in MEGA6 (Tamura et al. 2013), respectively. The result of protein sequence alignment was displayed using Boxshade (https://embnet.vital-it.ch/software/BOX_form.html), and phylogenetic tree construction was performed using MEGA6 with the maximum likelihood method and bootstrap of 1000 replicates. Systematic names of oleosin family genes were assigned with two italic letters denoting the source organism and a progressive number based on sequence similarity. To distinguish them from the oleosin genes in arabidopsis, three italic letters were used in A. trichopoda (i.e. AtrOLE1–5).

Synteny analysis, definition of orthogroups (OGs), and calculation of evolutionary rate

Synteny analysis was carried out as previously described (Zou et al. 2019), where duplicate pairs were identified using the all-to-all BLASTp method and syntenic blocks were inferred using MCScanX (E-value, 1e-10; BLAST hits, 5) (Wang et al. 2012). Different modes of gene duplication were identified using the DupGen_finder pipeline (Qiao et al. 2019), and Ks (synonymous substitution rate) and Ka (nonsynonymous substitution rate) of duplicate pairs were calculated using TBtools (Chen et al. 2020). Orthologs between different species were identified using the best-reciprocal-hit (BRH) method, and orthogroups were assigned only when they are present in at least two species examined.
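The best-reciprocal-hit criterion mentioned above can be illustrated with a minimal sketch. It assumes that pairwise alignment scores (for example BLASTp bit scores) are already available as dictionaries. The gene names and scores below are hypothetical placeholders, and the function names are illustrative rather than part of any pipeline used in this study.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict {(query, subject): bit_score}
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical bit scores between species A and species B (not real data).
a_vs_b = {
    ("geneA1", "geneB1"): 250.0,
    ("geneA1", "geneB2"): 120.0,
    ("geneA2", "geneB2"): 200.0,
    ("geneA3", "geneB1"): 90.0,
}
b_vs_a = {
    ("geneB1", "geneA1"): 248.0,
    ("geneB2", "geneA2"): 198.0,
    ("geneB2", "geneA1"): 118.0,
}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, geneA3 has geneB1 as its best hit, but geneB1 points back to geneA1, so geneA3 is not assigned an ortholog.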

Protein properties and conserved motif analysis

Physicochemical parameters such as theoretical molecular weight (MW), isoelectric point (pI), aliphatic index (AI), and grand average of hydropathicity (GRAVY) of oleosins were calculated using ProtParam (http://web.expasy.org/protparam/). Transmembrane helices (TMHs) and Kyte–Doolittle hydrophobicity plots were predicted using TMHMM (v2.0, https://services.healthtech.dtu.dk/service.php?TMHMM-2.0) and ProtScale (https://web.expasy.org/protscale/), respectively. Conserved motifs were identified using MEME (v5.4.1, https://meme-suite.org/tools/meme) with the following optimized parameters: any number of repetitions, a maximum of 15 motifs, and a motif width of 6 to 120 residues.
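For readers unfamiliar with the two hydrophobicity-related indices, the following sketch shows how GRAVY and the aliphatic index are commonly defined: GRAVY as the mean Kyte–Doolittle hydropathy value per residue, and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·[X(Ile) + X(Leu)], with X in mole percent. It is an illustration of the definitions only, not the ProtParam implementation, and the peptide used is a hypothetical toy sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    sequence = sequence.upper()
    n = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical toy peptide, used only to exercise the two functions.
toy_peptide = "AVIL"
print(gravy(toy_peptide), aliphatic_index(toy_peptide))
```

A positive GRAVY value indicates that hydrophobic residues dominate on average, which is the sense in which the values are interpreted in the Results.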

Genomic DNA extraction and gene cloning

Genomic DNA extraction was conducted using a modified CTAB method as previously described (Zhao and Wei 2011). Primers used for gene isolation are shown in Supplementary Table S2, and PCR amplification was performed with the following reaction conditions: pre-denaturation at 95 °C for 3 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s, and extension at 72 °C for 40 s, and a final extension at 72 °C for 5 min. PCR products were cloned into the pMD-18 T vector (TaKaRa Bio Inc., Dalian) for subsequent Sanger sequencing.

Gene expression analysis based on RNA-seq

Global expression profiles of oleosin genes in tigernut and purple nutsedge were investigated using transcriptome datasets as shown in Supplementary Table S3. Raw sequence reads in the FASTQ format were obtained using fastq-dump, and quality control was performed using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Read mapping was performed using Bowtie 2 (Langmead and Salzberg 2012), and the relative expression level of each transcript was presented as FPKM (fragments per kilobase of exon per million fragments mapped) (Mortazavi et al. 2008). Expression profile data of oleosin genes in rice were accessed from BAR (http://bar.utoronto.ca/).
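The normalization named above (fragments per kilobase of exon per million fragments mapped) can be written out explicitly. The sketch below assumes that fragment counts per transcript and the total number of mapped fragments are already known. The function name and the numbers in the usage example are hypothetical and do not come from the datasets analyzed here.

```python
def fragments_per_kb_per_million(fragments, transcript_length_bp,
                                 total_mapped_fragments):
    """Normalize a fragment count by transcript length and library size.

    value = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 1,000-bp transcript
# in a library of 20 million mapped fragments.
example_value = fragments_per_kb_per_million(500, 1000, 20_000_000)
print(example_value)
```

Dividing by both length and library size is what makes values comparable between transcripts of different lengths and between samples sequenced to different depths.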

Gene expression analysis based on qRT-PCR

Total RNA extraction, synthesis of the first-strand cDNA, and qRT-PCR analysis were performed as described before (Zou et al. 2022b). All qRT-PCR assays were conducted in triplicate for each biological sample with two reference genes, CeTUB4 and CeUCE2, as reported before (Bai et al. 2021). Relative gene abundance was estimated with the 2^(−ΔΔCt) method, and statistical analysis was performed using SPSS Statistics 20 as previously described (Zou et al. 2015).
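As an illustration of the relative quantification step, the following sketch computes a fold change from Ct values. It rests on two assumptions that are not stated in the text: that the two reference genes are combined by averaging their Ct values (equivalent to a geometric mean of their quantities), and that amplification efficiency is close to 100%. All Ct values shown are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Fold change of a target gene relative to a calibrator sample.

    ct_refs and ct_refs_cal are lists of reference-gene Ct values for the
    sample and the calibrator; they are averaged (an assumption here).
    """
    delta_sample = ct_target - mean(ct_refs)
    delta_calibrator = ct_target_cal - mean(ct_refs_cal)
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies four cycles earlier
# (relative to the references) in the sample than in the calibrator.
fold_change = relative_expression(
    ct_target=22.0, ct_refs=[18.0, 20.0],
    ct_target_cal=26.0, ct_refs_cal=[18.0, 20.0],
)
print(fold_change)
```

Because each cycle corresponds to a doubling under the efficiency assumption, a difference of four cycles translates into a 16-fold difference in relative abundance.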

4D-parallel reaction monitoring (4D-PRM)-based protein quantification

Proteomic data of tigernut and purple nutsedge, which were accessed from ProteomeXchange/PRIDE (PXD021894, PXD031123, and PXD035931), were analyzed using MaxQuant (v1.6.15.0) as previously described (Wang et al. 2020). Based on their abundance, two dominant isoforms, i.e., CeOLE2 and -5, were selected for 4D-PRM analysis, and related unique peptides are shown in Supplementary Table S4. Protein extraction, trypsin digestion, and LC–MS/MS analysis were conducted as described before (Wang et al. 2020).

BiFC, subcellular localization, and functional analysis on oil accumulation

Plasmid construction and Agrobacterium tumefaciens-mediated transformation were carried out as previously described (Qiao et al. 2022a; Xu et al. 2022); the primers used are shown in Supplementary Table S2. Briefly, the coding sequence (CDS) without the termination codon was cloned into pNC-BiFC-Ecn, pNC-BiFC-Enn, pNC-Cam1304-SubC, and pCAMBIA1301 by Nimble Cloning (Yan et al. 2023), and the resulting recombinant plasmids were introduced into GV3101 with the helper plasmid pSoup-P19. Agrobacterium solution with an OD600 of 0.8 was collected and resuspended in infiltration solution containing 10 mmol/L MgCl2, 10 mmol/L MES, and 150 μmol/L acetosyringone (pH = 5.6). Leaves of approximately 4-week-old tobacco plants were infiltrated and later processed for confocal laser scanning microscopy imaging (Zeiss LSM880, Germany). All assays were performed across three different leaves at the same position of independent plants. For subcellular localization analysis, the endoplasmic reticulum (ER) marker RFP-HDEL (Gong et al. 2021) was employed. The wavelength of laser-1 was set as 610 nm for RFP observation, where the fluorescence was excited at 587 nm and detected at a band width of 565–700 nm. The wavelength of laser-2 was set as 517 nm for EGFP observation, where the fluorescence was excited at 493 nm and detected at a band width of 400–580 nm. The wavelength of laser-3 was set as 470 nm for chlorophyll autofluorescence observation, where the fluorescence was excited at 633 nm and detected at a band width of 650–690 nm. For oil regulation, transformed leaves were collected at one, three, and five days after infiltration and subjected to qRT-PCR analysis and TAG determination as described before (Jin et al. 2010; Xu et al. 2022).

Results

Identification, gene localization, and synteny analysis of oleosin genes in tigernut

As shown in Fig. 1A and Supplementary Table S1, mining the tigernut genome resulted in six oleosin genes from six scaffolds (Scf). The family numbers are equal to that present in rice, one more than five reported in A. trichopoda, and considerably less than 17 described in arabidopsis (Supplementary Table S1). The CDS length of CeOLE genes varies from 423 to 483 base pairs (bp), putatively encoding 140–160 amino acids (AA) with the MW of 14.33–16.25 kDa; the theoretical pI values are all above 7 (9.46–10.19), indicating their alkaline characteristic. As expected, all of them possess similar Kyte–Doolittle hydrophobicity plots (Supplementary Fig. S1), and exhibit the amphipathic property with high AI values (94.57–106.90) and GRAVY values of more than 0 (0.344–0.577) (Supplementary Table S1). Besides harboring one 113-AA oleosin domain that includes the highly conserved proline knot motif (PX5SPX3P) (Fig. 2A), CeOLE proteins were shown to contain two to three TMHs (Supplementary Table S1).
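The relation between CDS length and protein length, and the proline knot pattern PX5SPX3P, can both be checked with a few lines of code. The sketch below assumes that the reported CDS lengths include the stop codon, which is consistent with 423 bp and 483 bp corresponding to 140 and 160 residues. The peptide in the usage example is a hypothetical toy sequence, not a CeOLE protein.

```python
import re

# PX5SPX3P: Pro, any five residues, Ser, Pro, any three residues, Pro.
PROLINE_KNOT = re.compile(r"P.{5}SP.{3}P")


def cds_to_protein_length(cds_bp):
    """Number of residues encoded by a CDS that includes the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return cds_bp // 3 - 1


def find_proline_knot(protein):
    """Return (0-based start, matched string) of the first knot, or None."""
    match = PROLINE_KNOT.search(protein.upper())
    if match is None:
        return None
    return match.start(), match.group()


# Lengths taken from the text above.
shortest = cds_to_protein_length(423)
longest = cds_to_protein_length(483)

# Hypothetical toy peptide containing one PX5SPX3P pattern.
knot = find_proline_knot("GGPAAAAASPAAAPGG")
print(shortest, longest, knot)
```

A pattern search of this kind only confirms the presence of the knot signature. It does not replace the profile-based domain search used for family identification.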

Fig. 1

Duplication events of CeOLE genes and synteny analysis within and between tigernut and representative species. A Duplication events detected in tigernut. Serial numbers are indicated at the top of each scaffold, and the scale is in Mb. Duplicate pairs identified in this study are connected using lines in different colors, i.e., transposed (green), dispersed (purple), and WGD (gold). B Synteny analysis within and between tigernut, rice, arabidopsis, and A. trichopoda. C Synteny analysis within and between tigernut, C. littledalei, R. breviuscula, and J. effusus. Shown are oleosin-encoding chromosomes/scaffolds and only syntenic blocks that contain oleosin genes are marked. (At, A. thaliana; Atr, A. trichopoda; Ce, C. esculentus; Chr, chromosome; Cl, C. littledalei; Je, J. effusus; Mb, megabase; OLE, oleosin; Rb, R. breviuscula; Os, O. sativa; Scf, scaffold; WGD, whole-genome duplication) (color figure online)

Fig. 2

Multiple sequence alignment and phylogenetic analysis of oleosins. A Sequence alignment and structural features of six CeOLE proteins. Sequence alignment and display were conducted using MUSCLE and Boxshade, respectively. Identical and similar amino acids are highlighted in black or dark grey, respectively. The SeqLogo of the 72-residue proline knot motif is shown above the alignment, and the PX5SPX3P pattern is underlined. The C-terminal AAPGA of Clade U and the putative C-terminal insertion of Clade SH are boxed. B Phylogenetic analysis of CeOLEs with homologs from other species. Shown is an unrooted phylogenetic tree resulting from full-length oleosins with MEGA6 (maximum likelihood method and bootstrap of 1,000 replicates), where the distance scale denotes the number of amino acid substitutions per site and the name of each clade is indicated next to the corresponding group. (At, A. thaliana; Atr, A. trichopoda; Ce, C. esculentus; Cl, C. littledalei; Cm, C. myosuroides; Cr, C. rotundus; Je, J. effusus; OLE, oleosin; Rb, R. breviuscula; Os, O. sativa; SH, seed high-molecular-weight; SL, seed low-molecular-weight; T, tapetum; U, universal)

Protein sequence similarities within the CeOLE gene family vary from 37.0% to 72.5%, and relatively high similarities were observed between two pairs, i.e., 68.5% and 72.5% for CeOLE2/-3 and CeOLE4/-5, respectively (Supplementary Table S5), implying their recent origin. Indeed, intraspecies synteny analysis showed that CeOLE2 and -3 are located within syntenic blocks of tigernut scaffolds (Fig. 1B). Moreover, interspecies synteny analysis revealed that both CeOLE2 and -3 are syntelogs of OsOLE3 and AtrOLE3, though only CeOLE3 was identified as the syntelog of At-T5 (Fig. 1B), implying their appearance sometime after tigernut-rice divergence. As for CeOLE4 and -5 that are not located within syntenic blocks, only CeOLE4 was characterized as the syntelog of OsOLE2 (Fig. 1B), thereby, they were defined as dispersed repeats. Additionally, CeOLE6 was also defined as a dispersed repeat of CeOLE4, whereas CeOLE1 and -2 were characterized as transposed repeats, which is consistent with OsOLE3 that was identified as the transposed repeat of OsOLE4 (Supplementary Table S1).

Characterization of oleosin genes in representative species and insight into lineage-specific family evolution in Cyperaceae

Since the origin of CeOLE5 and -6 was not well resolved by the synteny analysis described above, we took advantage of available genome and transcriptome data to identify homologs from representative species of the Cyperaceae family (i.e. purple nutsedge, C. littledalei, C. myosuroides, and R. breviuscula) as well as its close family Juncaceae (i.e. J. effusus) (Can et al. 2020; Hofstatter et al. 2022; Ning et al. 2023). As a result, five to six oleosin family genes were identified in each species, which were shown to be distributed across four to five chromosomes (Chr). It is worth noting that, in contrast to RbOLE1 and -6 that are co-located on Chr4, ClOLE1 and CmOLE1 are, respectively, co-located with ClOLE2 and CmOLE2 on Chr3 (Supplementary Table S1), implying species or genus-specific chromosomal rearrangement.

To uncover their evolutionary relationships, an unrooted phylogenetic tree was constructed using full-length oleosins present in tigernut, purple nutsedge, C. littledalei, C. myosuroides, R. breviuscula, J. effusus, rice, arabidopsis, and A. trichopoda. As shown in Fig. 2B, these oleosins were grouped into four clades, i.e., U, SL, SH, and T, where T is arabidopsis-specific. Clades SL and SH in Cyperaceae could be further divided into two and three groups that were denoted as SL1, SL2, and SH1–3, respectively, which is consistent with BRH-based homologous analysis as shown in Table 1. Two main differences are that AtrOLE4 was assigned into SH1, whereas AtrOLE5 and OsOLE5 were shown to form one additional OG named SH4 (Table 1). Despite frequent occurrence of chromosome fission and fusion in Cyperaceae species (Can et al. 2020; Hofstatter et al. 2022; Ning et al. 2023; Zhao et al. 2023), a high level of collinearity was observed between oleosin-encoding regions, though SH3 is absent from C. littledalei owing to gene fragmentation (one gene fragment was identified on Chr11) and ClOLE5 is no longer located within syntenic blocks (Fig. 1C). Moreover, presence of SL2 (i.e. JeOLE3) and SH3 (i.e. JeOLE5) homologs in J. effusus but not rice (Table 1) implies their appearance sometime before Cyperaceae-Juncaceae divergence but after Cyperaceae-Poaceae split. Interestingly, as observed in tigernut, SL1 and -2 homologs are also located within syntenic blocks of C. littledalei and R. breviuscula (Fig. 1C), suggesting that they may have been generated by one whole-genome duplication (WGD) event shared by Cyperaceae and Juncaceae after the split with Poaceae. Although a relatively high similarity of 75.3% was observed between JeOLE2 and -3 (Supplementary Table S5), transposition or chromosome rearrangement may have occurred in the JeOLE3-encoding region, in contrast to the conserved evolution of that of JeOLE2 (Fig. 1C). The absence of a SH2 homolog in J. effusus as well as rice indicates that this group is more likely to be Cyperaceae-specific, generated via an unknown mechanism sometime after Cyperaceae-Juncaceae divergence, which is consistent with relatively higher sequence similarities (71.9–75.5% vs 63.9–71.1%) and smaller Ks values (5.2337–10.5926 vs 8.4319–51.8549) between SH1 and -2 than those between SL1 and -2 (Table 2). Moreover, variable Ks values also imply different evolutionary rates between these species. Nevertheless, low Ka/Ks ratios from 0.0045 to 0.2750 (Table 2) indicate that paralogs were stabilized mainly by purifying selection.
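The inference of purifying selection from Ka/Ks follows a simple convention: a ratio below 1 indicates that nonsynonymous changes are removed faster than synonymous ones accumulate, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. The sketch below encodes this convention. The function names are illustrative, and the exact cutoff of 1 with no tolerance band is a simplifying assumption.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def selection_mode(ratio):
    """Classify a Ka/Ks ratio using the conventional cutoff of 1."""
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# The two extremes of the Ka/Ks range reported in the text above.
modes = [selection_mode(r) for r in (0.0045, 0.2750)]
print(modes)
```

Both ends of the reported range fall well below 1, which is the basis for the interpretation given above.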

Table 1 Seven orthogroups identified on the basis of analyzing nine representative species
Table 2 Evolutionary rate of oleosin homologs in tigernut, purple nutsedge, C. littledalei, C. myosuroides, R. breviuscula, and J. effusus

Although most CeOLE genes were shown to be intronless, CeOLE5 possesses one intron immediately after the sequence encoding the hydrophobic hairpin stretch (Fig. 3B), which is confirmed by gene cloning using tigernut genomic DNA as template. Interestingly, the intron was shown to be highly conserved in SH2 homologs of Cyperaceae species, varying from 187 to 679 bp. Additionally, JeOLE2 in SL1 also contains one intron (133 bp) and the difference is that this intron is located before the hairpin-coding region (Fig. 3B). Since all oleosin genes present in rice and A. trichopoda are intronless (Supplementary Table S1), species or lineage-specific gain of these introns could be speculated.

Fig. 3

Structural and phylogenetic analysis of the oleosin family in tigernut, purple nutsedge, C. littledalei, C. myosuroides, R. breviuscula, and J. effusus. A Shown is an unrooted phylogenetic tree resulting from full-length oleosins with MEGA6 (maximum likelihood method and bootstrap of 1,000 replicates), where the distance scale denotes the number of amino acid substitutions per site and the name of each clade is indicated next to the corresponding group. B Shown are the exon–intron structures. “?” represents the unknown length. C Shown is the distribution of conserved motifs among oleosins, where different motifs are represented by different color blocks as indicated and the same color block in different proteins indicates a certain motif. (Ce, C. esculentus; Cl, C. littledalei; Cm, C. myosuroides; Cr, C. rotundus; Je, J. effusus; OLE, oleosin; Rb, R. breviuscula; SH, seed high-molecular-weight; SL, seed low-molecular-weight; T, tapetum; U, universal)

Structural divergence among Cyperaceae and Juncaceae oleosins was also uncovered via analyzing conserved motifs. Among 15 motifs identified using MEME, Motifs 1, 2, 3, 5, 6, and 8 belong to the oleosin domain: Motif 1 is broadly distributed; Motif 2 is present in most orthogroups with the exception of SH3, where it is replaced by Motifs 5 and 8; and Motif 3 is present in most orthogroups with the exception of JeOLE3 and several members of SL1 (CrOLE2, ClOLE2, CmOLE2, and RbOLE2), where it is replaced by Motif 6. Motif 9 includes the partial sequence of the C-terminal AAPGA, which was characterized as the hallmark for Clade U (Huang and Huang 2015; Zou et al. 2022a). In contrast, little is known about other motifs: Motif 4 is widely present in SH1, SH2, and SH3, though it is replaced by Motif 7 in CeOLE6 and CrOLE6; Motif 7 is widely present in SL1 as well as two SL2 members, i.e., CeOLE3 and CrOLE3; Motif 14 is only present in CeOLE5 and CrOLE5; Motif 15 is only present in CeOLE1 and CrOLE1; Motif 11 is U-specific, though it is absent from JeOLE1; Motif 12 is SH1-specific, whereas Motif 10 is SH2-specific; Motif 13 is only present in CeOLE4 and CrOLE4 (Fig. 3C).

Global expression profiling revealed a tuber-predominant expression pattern of CeOLE genes

To uncover global expression profiles of CeOLE genes, nine tissues/developmental stages were investigated, i.e., two stages of developmental leaf (i.e. young and mature), sheath, root, rhizome, shoot apex, and three stages of developmental tuber (i.e. 40, 80, and 120 DAS). As shown in Fig. 4A, the total CeOLE transcripts were most abundant in tuber, peaking at 80 DAS, which is 2.44 and 1.37 folds more than 40 and 120 DAS, respectively. The average of total transcripts in three stages of developmental tuber is 31.54, 54.88, 96.65, 188.55, 195.45, and 690.70 folds more than that in root, sheath, young leaf, rhizome, shoot apex, and mature leaf, respectively, in striking contrast to seed-preferential expression of most oleosin genes in rice (Supplementary Fig. S2). In most tissues, CeOLE2 in the SL1 group and CeOLE5 in the SH2 group represent two major isoforms, though the transcripts of CeOLE1 in the U1 group were also abundant in non-tuber tissues. Compared with shoot apex, the expression of CeOLE5 was significantly up-regulated in young leaf, whereas the majority of CeOLE genes were down-regulated in mature leaf relative to young leaf. Interestingly, significantly higher abundance of CeOLE5 was observed in sheath than leaf. Despite the low expression of CeOLE3, its pattern is similar to CeOLE6, which increased along with tuber development. CeOLE4 transcripts also peaked at 120 DAS, but a remarkable drop was observed at 80 DAS; by contrast, CeOLE2 and -5 transcripts peaked at 80 DAS, and more transcripts were found at 120 DAS relative to 40 DAS (Fig. 4A and Supplementary Table S6).

Fig. 4

Expression profiles of oleosin genes in tigernut and purple nutsedge. A Tissue-specific expression profiles of CeOLE genes. Tuber40, Tuber80, and Tuber120 represent tubers of 40, 80 and 120 DAS. B Expression profiles of CeOLE and CrOLE genes during tuber development. Color scale represents FPKM normalized log10 transformed counts where red indicates low expression and blue indicates high expression. (Ce, C. esculentus; Cr, C. rotundus; DAI, days after tuber initiation; DAS, days after sowing; FPKM, fragments per kilobase of exon per million fragments mapped; OLE, oleosin)

CeOLE transcripts in tubers were considerably more abundant than that of purple nutsedge

Despite sharing a very close genetic relationship with tigernut, purple nutsedge accumulates little oil in its tubers, i.e. 2.5% vs 25.8% (Ji et al. 2021; Niemeyer et al. 2022), prompting us to compare the expression patterns of oleosin genes during tuber development of these two contrasting species. As shown in Fig. 4B, among three stages profiled, i.e. 20, 50, and 90 DAI, CeOLE transcripts were considerably more abundant than those of CrOLE genes, with differences ranging from tens to thousands of fold. In contrast to the predominant expression of two CeOLE genes (i.e. CeOLE2 and -5), major transcripts in purple nutsedge were contributed by three isoforms, i.e., CrOLE2, CrOLE1, and CrOLE6 in descending order (together accounting for about 90.0%). Moreover, their expression was not strongly correlated with tuber development, except for CrOLE2 and CrOLE5, which were visibly up-regulated at 50 DAI. In contrast, transcripts of most CeOLE genes gradually increased along with tuber development. Whereas total CrOLE transcripts peaked at 50 DAI, CeOLE transcripts peaked at 90 DAI, and relatively high-level expression had started even at 20 DAI (Fig. 4B and Supplementary Table S7).

TAG accumulation and CeOLE mRNA/protein profiles during tuber development

In tigernut, it has been well established that TAGs gradually increase along with tuber development, peaking at the maturity (Jin et al. 2010; Turesson et al. 2010; Ji et al. 2021; Wang et al. 2021). In the present study, the whole growth period observed was about 85 d, and new tubers began to appear from approximately four weeks after shoot emergence. Visible tubers collected in this study were defined as the first stage, i.e. 1 DAI, and the mean dry mass per tuber was about 2.8 mg. In later stages of development, the dry matter increased gradually, peaking at the maturity of 35 DAI with approximately 1264.4 mg/tuber (Fig. 5A). A total of six swelling stages were collected, i.e. 3, 5, 10, 15, 20, and 25 DAI, where the former five stages were characterized as white in appearance. At 25 DAI, tubers began to turn light brown and their size had reached the maximum. In contrast, tubers at 35 DAI had totally matured and were characterized as hard and dark brown in appearance. The TAG content at above eight stages was also measured, and results showed that TAGs were not detected until 5 DAI, i.e., 5.1% of dry weight. Later, the TAG content slowly increased to 8.4% at 20 DAI, followed by the fast increase to 20.6% and 34.3% at 25 and 35 DAI, respectively (Fig. 5B).
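The contrast between the slow and fast phases of TAG accumulation can be made explicit by computing the average rate of increase between the stages for which values are given in the text. The sketch below uses only those four reported values (5.1%, 8.4%, 20.6%, and 34.3% of dry weight at 5, 20, 25, and 35 DAI). Intermediate stages are not included because their values are not stated here, so the rates are averages over each interval.

```python
# (DAI, TAG content as % of dry weight), as reported in the text above.
tag_percent = [(5, 5.1), (20, 8.4), (25, 20.6), (35, 34.3)]


def interval_rates(points):
    """Average change per day between consecutive (day, value) points."""
    rates = []
    for (day0, value0), (day1, value1) in zip(points, points[1:]):
        rates.append(((day0, day1), (value1 - value0) / (day1 - day0)))
    return rates


rates = interval_rates(tag_percent)
for (start, end), rate in rates:
    print(f"{start}-{end} DAI: {rate:.2f} percentage points per day")
```

Under this simple calculation, the average rate between 20 and 25 DAI is roughly tenfold higher than that between 5 and 20 DAI.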

Fig. 5

Dry mass (A) and TAG (B) accumulation in developing tigernut tubers from 1 to 35 DAI. Bars indicate SD (N ≥ 3) and lowercase letters indicate difference significance tested following Duncan's one-way multiple-range post hoc ANOVA (P < 0.05). (DAI, days after tuber initiation; TAG, triacylglycerol)

To reveal the correlation between gene expression and TAG accumulation, the transcript levels of CeOLE genes in different developmental tubers were further checked using qRT-PCR. As shown in Fig. 6, the expression of most CeOLE genes is positively correlated with TAG accumulation during tuber development. At 1 DAI, CeOLE genes were usually lowly expressed with the exception of CeOLE1 whose transcripts were more abundant than that at 3 and 5 DAI. Later, most genes were significantly up-regulated, especially at two late stages, i.e., 25 and 35 DAI. Consistent with transcriptional profiling as described above, CeOLE2 and -5 represent two major isoforms with considerably lower Ct values, supporting expression divergence with their recent paralogs CeOLE3 and CeOLE4, respectively. In contrast, despite the relatively low expression, CeOLE6 and CeOLE4 were shown to be two most up-regulated genes during tuber maturity. It is worth noting that, a sudden drop of gene expression was observed for CeOLE6 and CeOLE1 at 15 DAI, when a drop of TAG content was also observed. The reason for this needs to be further studied.

Fig. 6

Expression profiles of CeOLE genes during tuber development. Bars indicate SD (N = 3) and lowercase letters indicate difference significance tested following Duncan's one-way multiple-range post hoc ANOVA (P < 0.05). (Ce, C. esculentus; DAI, days after tuber initiation; OLE, oleosin)

Considering that protein abundance is not always in agreement with the transcript level and that oleosin genes function at the protein level, we first took advantage of available proteomic data to identify and quantify oleosins in both tigernut and purple nutsedge. As shown in Fig. 7A, all six CeOLE proteins were identified in LD-enriched fractions, though their abundances were highly variable. In total protein extracts, however, only four members, i.e., CeOLE1, CeOLE2, CeOLE5, and CeOLE6, were identified in freshly harvested, dried, rehydrated for 48 h, and sprouted tubers, in contrast to none in roots and leaves. As expected, protein abundance in total protein extracts was considerably lower than that in LD-enriched fractions. Notably, despite the presence of six CrOLE genes, only the protein of CrOLE1 was identified in total proteins of freshly harvested tubers, which may be due to the limited sensitivity of proteomic techniques and may correspond to the relatively low levels of their transcripts in tubers as described above (Fig. 4B). To further uncover the protein dynamics during tuber development, two dominant isoforms (i.e. CeOLE2 and -5) were profiled in three representative stages, i.e. 5 DAI, 20 DAI, and 35 DAI, where 5 DAI was selected as the first stage for its clear TAG accumulation as described above (Fig. 5B). As shown in Fig. 7B, three- to six-fold and two- to four-fold increases were observed for CeOLE2 and -5, respectively. The magnitude of their increase is more in accordance with TAG accumulation (Fig. 5B), but far smaller than that of the transcripts (Fig. 6).

Fig. 7

Relative abundance of CeOLE proteins in different tissues (A) and stages of tuber development (B). Color scale represents intensity-normalized, log2-transformed relative quantitative values, where red indicates low abundance and blue indicates high abundance. (Ce, C. esculentus; DAI, days after tuber initiation; DT, dried tubers; FT, freshly harvested tubers; LD, lipid droplet; OLE, oleosin; RT, rehydrated tubers for 48 h; ST, sprouted tubers)

Protein interaction and subcellular localization of CeOLE2 and -5

As described above, at both transcript and protein levels, CeOLE2 and -5 represent two dominant isoforms. Their co-expression and high abundance in small LDs suggest that they may interact with each other. To test this, BiFC analysis was carried out using A. tumefaciens-mediated infiltration of tobacco leaves (Qiao et al. 2022a). As shown in Fig. 8, our results indeed supported protein interaction between CeOLE2 and -5. Moreover, homologous interaction was also observed, implying that oleosins are likely to function in homo- and heteromultimers. Despite the well-known LD-localization of oleosins (Yee et al. 2021; Niemeyer et al. 2022), to our surprise, CeOLE2 and -5 were shown to localize not only to LDs (highly similar to that reported by Yee et al. 2021) but also to the ER when transiently overexpressed in tobacco leaves in the BiFC experiment (Fig. 8). To confirm the result, the ER marker RFP-HDEL described before (Gong et al. 2021) was also used for subcellular localization analysis. As shown in Fig. 9, in the ER, fluorescence signals of CeOLE2-EGFP and CeOLE5-EGFP coincided closely with those of RFP-HDEL, supporting their ER-localization.

Fig. 8

BiFC-based protein interaction of CeOLE2 and CeOLE5 in tobacco leaves. (Ce, C. esculentus; OLE, oleosin)

Fig. 9

Subcellular localization analysis of CeOLE2 and CeOLE5 in tobacco leaves. (Ce, C. esculentus; OLE, oleosin)

Overexpressing CeOLE2 and -5 in tobacco leaves

Due to the lack of an efficient genetic transformation platform for tigernut, the roles of CeOLE2 and -5 in oil accumulation of vegetative tissues were investigated via transient overexpression in tobacco leaves, an approach that has been widely employed to study genes involved in oil regulation (Grimberg et al. 2015; Yee et al. 2021; Xu et al. 2022; Kim et al. 2023). As shown in Fig. 10A, transcripts of CeOLE2 and -5 were detected as early as 1 d after infiltration, and then steadily increased by 2.74–3.17-fold and 28.71–50.33-fold at the two later time points, i.e., 3 d and 5 d, respectively. The TAG contents of transformed leaves were also examined. In controls transformed with the empty vector, the TAG content was about 3.57 mg/g. In contrast, the TAG contents of transgenic leaves increased 1.52–2.33-fold and 1.52–2.15-fold for CeOLE2 and -5, respectively, at the three time points examined (1 d, 3 d, and 5 d). Interestingly, both were shown to peak at 3 d after infiltration (Fig. 10B). These results support a role of the two genes in oil accumulation of vegetative tissues.
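As a quick arithmetic check, the reported fold values can be converted to approximate absolute TAG contents. The sketch below assumes that each fold value is the ratio of the TAG content in transgenic leaves to that in the empty-vector control (about 3.57 mg/g); the function and variable names are hypothetical, and the outputs are back-of-the-envelope estimates rather than measured data.

```python
CONTROL_TAG_MG_PER_G = 3.57  # empty-vector control, as reported in the text


def tag_content_from_fold(fold, control=CONTROL_TAG_MG_PER_G):
    """Estimate TAG content (mg/g), assuming `fold` is a ratio to the control."""
    return control * fold


# Lowest and highest fold values reported across the three time points.
fold_ranges = {"CeOLE2": (1.52, 2.33), "CeOLE5": (1.52, 2.15)}

estimated = {
    gene: tuple(round(tag_content_from_fold(f), 2) for f in folds)
    for gene, folds in fold_ranges.items()
}

for gene, (low, high) in estimated.items():
    print(f"{gene}: roughly {low}-{high} mg/g TAG")
```

Under this assumption, the reported ranges correspond to roughly 5.4–8.3 mg/g for CeOLE2 and 5.4–7.7 mg/g for CeOLE5.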

Fig. 10

Transcript (A) and TAG (B) increase in CeOLE2- and CeOLE5-overexpressing tobacco leaves. Bars indicate SD (N = 3) and “**” indicates significant differences according to one-way ANOVA followed by Duncan's multiple-range post hoc test (P < 0.01). (Ce, C. esculentus; OLE, oleosin; TAG, triacylglycerol)

Discussion

With the rapid increase in world population and vegetable oil consumption, the huge demand for edible oil has put pressure on the supply from traditional oil crops, which predominantly accumulate high amounts of TAGs in their seeds (Board et al. 2022). Although plant vegetative tissues usually do not produce significant levels of TAGs, they have the capacity for their synthesis, storage, and metabolism, providing an opportunity to create alternative resources for increasing overall plant oil production (Xu and Shanklin 2016). Nevertheless, our knowledge of oil metabolism and regulation is mainly derived from arabidopsis and related oilseed crops (Bates et al. 2013; Xu and Shanklin 2016). Exploring new resources and characterizing key genes associated with oil accumulation in vegetative tissues are therefore of particular interest.

Expansion of the oleosin family in tigernut was contributed by WGD and dispersed duplication, which were shown to be lineage-specific

As a rare example of a plant accumulating high levels of TAGs in underground tubers, tigernut has been emerging as a novel oil crop and an ideal model to study the mechanism of oil accumulation in vegetative tissues (Jin et al. 2010; Turesson et al. 2010; Xu et al. 2022; Zhao et al. 2023). As in seeds of oil crops, proteome analysis of LD-enriched fractions indicated that LDs of tigernut tubers are largely coated by oleosins (Niemeyer et al. 2022), in striking contrast to LDs in leaves, roots, and the mesocarp of avocado (Persea americana), which are mainly enclosed by lipid droplet-associated proteins (LDAPs) or small rubber particle proteins (SRPPs) (Horn et al. 2013; Kretzschmar et al. 2020; Guzha et al. 2023).

To improve our knowledge of the roles of oleosins in vegetative tissues, a comprehensive characterization of the oleosin family was conducted in tigernut in this study. The six members identified represent three out of six previously described clades, i.e., U (1), SL (2), and SH (3), which is consistent with reports in other monocots (Liu et al. 2012; Huang and Huang 2015; Ojha et al. 2021). Nevertheless, despite having the same number of members as most Poaceae species (e.g. six in rice), the family composition and evolution pattern were shown to be different. In rice, which has experienced three rounds of WGD (named τ, σ, and ρ) after the monocot-eudicot split (Jiao et al. 2014), the oleosin family is composed of two members for each clade: two U members, i.e., OsOLE1 and OsOLE4, were derived from the Poaceae-specific ρ WGD; two SL members, i.e., OsOLE3 and OsOLE6, resulted from dispersed duplication (Supplementary Table S1); and they constitute four out of seven orthogroups proposed in this study, i.e., U, SL1, SH1, and SH4 (Table 1). Since these four orthogroups are also present in A. trichopoda, a basal angiosperm (Amborella Genome Project 2013), their early divergence before the monocot-eudicot split and lineage-specific contraction in Cyperaceae and Juncaceae can be speculated. In contrast, CeOLE genes constitute six orthogroups, i.e., U, SL1, SL2, SH1, SH2, and SH3, three of which were shown to be absent from rice, arabidopsis, and A. trichopoda, implying their recent origin or lineage-specific contraction in rice. To address this issue, genome-wide identification of oleosin genes was also performed in representative species of Cyperaceae and Juncaceae. The wide presence of SL2, SH2, and SH3 in the tested Cyperaceae species, i.e., purple nutsedge, C. myosuroides, and R. breviuscula, implies their divergence before the radiation of Cyperaceae. Since both SL2 and SH3 homologs are already present in J. effusus, their origin can be dated back to sometime before the Cyperaceae-Juncaceae divergence. Moreover, the location of SL1 and SL2 in syntenic blocks supports their derivation from WGD, which is more likely to be either the WGD shared by Cyperaceae and Juncaceae or the recent WGD described in C. littledalei (Can et al. 2020). In contrast, SH3 and SH2 were characterized as dispersed repeats of SH1, which may have been generated sometime before and after the recent WGD, respectively, via as yet unknown mechanisms. The evolution pattern also differs from that of arabidopsis, which experienced three WGDs (named γ, β, and α) after the monocot-eudicot split (Bowers et al. 2003) and whose family expansion (in all orthogroups) was contributed by WGDs (β, 1; α, 3) as well as tandem (5), proximal (1), and transposed (1) duplications (Supplementary Table S1).

Evolution of CeOLE genes was driven by divergence of gene structures and conserved motifs

Although three clades of oleosins have been established in monocots (Liu et al. 2012; Huang and Huang 2015), their subclassification is yet to be resolved. In this study, classification of the six CeOLE genes into six groups is supported not only by exon–intron structures but also by conserved motifs. Generally, oleosin genes are intronless. However, gain of certain introns has been reported in some lineages, e.g., Euphorbiaceae, Brassicaceae, Cyperaceae, and Juncaceae (Huang and Huang 2015; Jia et al. 2022; Zou et al. 2022a; this study). In arabidopsis, 88.24% of oleosin genes were shown to possess one to two introns, which appear in different positions (Zou et al. 2022a). The position of the intron found in Cyperaceae species is similar to that reported in Euphorbiaceae species (Zou et al. 2022a), but different from that found in JeOLE2, an SL1 member. Thus, SH2 may differ from SH1 by the presence of one intron.

Different groups may also harbor group-specific conserved motifs. In this study, without exception, the U clade was shown to include Motif 9, i.e., the conserved AAPGA motif, which was previously characterized as the hallmark of this clade (Huang and Huang 2015; Zou et al. 2022a). SH1 and SH2 differ from other groups by the presence of a putative C-terminal insertion of 18 AA, as widely described in the SH clade (Tzen et al. 1990; Huang and Huang 2015; Jia et al. 2022; Zou et al. 2022a). Besides the presence of one conserved intron, SH2 also differs from SH1 by the substitution of Motif 12 with Motif 10 at the C terminus. SH3 differs from other groups by the substitution of Motif 2 with Motifs 5 and 8. These results imply possible functional divergence of different groups.

CeOLE genes have evolved to predominantly express in oil-rich tubers and exhibit seed-like accumulation with TAGs during tuber development

Generally, oleosin genes are preferentially expressed in seeds of oil crops, especially in maturing seeds with high levels of TAGs and abundant LDs (Kim et al. 2002; Huang and Huang 2015; Zhang et al. 2019; Jia et al. 2022; Zou et al. 2022a). In arabidopsis, rapeseed, and soybean, oleosin abundances were shown to be negatively correlated with the LD size and positively related to the oil content of seeds (Siloto et al. 2006; Hu et al. 2009; Zhang et al. 2019). Moreover, overexpression of oleosin genes could increase the seed oil content in arabidopsis, rice, and soybean (Liu et al. 2013; Zhang et al. 2019; Ojha et al. 2021; Yuan et al. 2021). Since tigernut rarely sets seeds, we were not able to examine TAG accumulation and expression profiles of CeOLE genes during seed development. Instead, transcriptional profiling of several main tissues (i.e., leaf, sheath, root, rhizome, shoot apex, and tuber) showed that most CeOLE genes were predominantly expressed in the tuber, coinciding with the high amounts of TAGs produced by this special tissue (Jin et al. 2010; Turesson et al. 2010; Codina-Torrella et al. 2015; this study). In fact, tuber-predominant expression of CeOLE genes appears to be tigernut-specific, co-opted with LD/TAG accumulation rather than with the tuber tissue itself. The evidence is as follows: firstly, as a close relative of tigernut that may have diverged as recently as 0.3–12 million years ago (MYA) (Niemeyer et al. 2022; Xiao et al. 2022; Zou et al. 2023a), purple nutsedge produces tubers with high amounts of starch instead of TAGs (Stoller and Weber 1975; Ji et al. 2021); secondly, despite the presence of six CrOLE genes, their transcripts in tubers, especially at late stages, were considerably less abundant than those of CeOLE genes; thirdly, despite the detectability of all six CeOLE proteins in LD proteomes (Niemeyer et al. 2022), four of them (i.e. CeOLE1, CeOLE2, CeOLE5, and CeOLE6) were also identified in total proteins of freshly harvested tubers, in contrast to a single one (i.e. CrOLE1) detected in freshly harvested tubers of purple nutsedge; fourthly, CrOLE1 belongs to the U clade, whose homologs such as CeOLE1 were universally expressed in most tissues, including leaf, sheath, root, rhizome, and shoot apex examined in this study; finally, CeOLE genes were lowly expressed at early stages of tuber development and their transcripts/proteins gradually increased along with TAG accumulation during later development, which is consistent with the accumulation dynamics of LDs reported in tigernut (Jin et al. 2010; Turesson et al. 2010; Wang et al. 2021). Nevertheless, the underlying mechanism remains to be addressed. One possible cause may be the co-opted master regulator WRINKLED1 (WRI1) encoded by CeWRI1, whose transcripts were also shown to positively correlate with TAG accumulation during tuber development (Xu et al. 2022). However, genes encoding other master regulators homologous to FUSCA3 (FUS3), LEAFY COTYLEDON1 (LEC1), LEAFY COTYLEDON2 (LEC2), and ABSCISIC ACID INSENSITIVE3 (ABI3) are rarely expressed in tubers (Zou et al. 2021), implying tuber-specific regulation of oil accumulation in tigernut.

Oleosins may function in homo- and heteromultimers

Among the six clades (i.e. P, U, SL, SH, T, and M) of oleosins described, SL and SH represent two clades that are highly abundant in seeds of angiosperms (Huang and Huang 2015). Despite the wide occurrence of species- or lineage-specific expansion of these two clades, generally only one member from each clade is active and the two are usually present in an approximately 1:1 ratio (Tzen et al. 1990; Huang and Huang 2015; Zou et al. 2022a), implying possible interaction between SL and SH oleosins. In tigernut, despite the recent origin of SL and SH paralogs, i.e. CeOLE2/-3 and CeOLE4/-5, apparent expression divergence was observed and CeOLE2/-5 have evolved to be the two dominant isoforms in oil-rich tubers. As expected, according to our BiFC analysis, CeOLE2 and -5 could indeed interact with each other. Moreover, strong homologous interaction could also be detected for CeOLE2 and -5, supporting the idea that oleosins may function in homo- and heteromultimers. To the best of our knowledge, this is the first experimental evidence for oleosin multimerization. Additionally, both BiFC and subcellular localization analyses suggest that CeOLE2 and -5 are located not only in LDs but also in the ER when transiently overexpressed in tobacco leaves. Their LD-localization is in accordance with proteome analysis of LD-enriched fractions (Niemeyer et al. 2022) as well as subcellular localization analyses performed in other species (Abell et al. 1997; Huang and Huang 2017; Yee et al. 2021). In contrast, their retention in the ER is more likely due to overexpression, which results in mass synthesis of oleosins in the ER while only limited LDs are present in the leaf tissue (Abell et al. 1997; Huang and Huang 2017).

Overexpression of CeOLE genes could improve oil accumulation in leaves

As discussed above, tuber-specific activation of CeOLE genes is likely to be a key factor determining high levels of oil accumulation in tigernut tubers, in accordance with artificial selection of soybean OLEO1 that resulted in a high level of transcription and increased seed oil accumulation (Zhang et al. 2019). Nevertheless, direct functional evidence is lacking for CeOLE genes. Since a genetic transformation system has not been well established in tigernut, transient overexpression in tobacco leaves was used for this purpose. As the main tissue for photosynthesis, leaves rarely accumulate LDs and oleosins under normal conditions (Kretzschmar et al. 2020; Niemeyer et al. 2022). However, they have the capacity for oil biosynthesis, and overexpressing oil-related genes such as WRI1 could increase the TAG content in transgenic leaves, just like that observed in transgenic seeds of arabidopsis, rice, and soybean (Liu et al. 2013; Grimberg et al. 2015; Zhang et al. 2019; Ojha et al. 2021; Yuan et al. 2021; Xu et al. 2022). As expected, transiently overexpressing the two dominant isoforms (i.e., CeOLE2 and -5) in tobacco leaves could significantly increase the TAG content by 1.52–2.33-fold. The fold increase is comparable to that of CeWRI1-overexpressing leaves; however, the trend is different: for both CeOLE2 and -5, the TAG content peaked at 3 d after infiltration, in contrast to 5 d for CeWRI1 (Xu et al. 2022). The underlying mechanism still needs to be studied, and one possible reason lies in their inherent characteristics: oleosins are structural proteins that mainly function in LD formation and stabilization, whereas WRI1 is a master regulator of genes involved in oil biosynthesis (Grimberg et al. 2015; Kong et al. 2019; Qiao et al. 2022b; Xu et al. 2022).

Conclusions

To our knowledge, this is the first genome-wide characterization of the oleosin family in tigernut, a unique Cyperaceae plant producing TAGs in its underground tubers. The six members identified represent three clades (i.e. U, SL, and SH) or six out of seven orthogroups proposed in this study, i.e., U, SL1, SL2, and SH1–3. Expansion of Clades SL and SH appears to be lineage-specific, contributed by WGD and dispersed duplication, respectively. Evolution of CeOLE genes was driven by divergence of gene structures, conserved motifs, and expression profiles. CeOLE genes have evolved to predominantly express in oil-rich tubers and exhibit seed-like accumulation (at both mRNA and protein levels) with TAGs during tuber development. Roles of CeOLE genes in oil accumulation of vegetative tissues were supported by overexpression in tobacco leaves. Moreover, homologous and heterologous interactions of CeOLE proteins are reported here for the first time. These findings provide insights into lineage-specific family evolution and putative roles of oleosin genes in oil accumulation of underground tubers, which will facilitate further genetic improvement of tigernut.