INTRODUCTION

Today, plants of the family Brassicaceae are among the most important oil-bearing crops in the world, above all rapeseed (Brassica napus), turnip rape (B. rapa ssp. oleifera, syn. B. campestris), and brown mustard (B. juncea) [1–3]. However, a growing number of species now attract attention as prospective sources of high-quality oil for diesel and jet biofuel production. Among the most promising candidates are Ethiopian mustard (B. carinata) [4–6] and two species that were earlier neglected as weeds: camelina (Camelina sativa) [7, 8] and field pennycress (Thlaspi arvense) [9, 10].

Camelina attracted the interest of researchers primarily because of its considerable capacity to adapt to a broad spectrum of biotic and abiotic stresses [11]. Camelina is well adapted to cultivation in temperate climates, which may be explained by its origin in Eastern Europe. It was a widespread oilseed crop across this territory until the early 1900s, when sunflower and soybean replaced it [12–15]. Camelina compares favorably with other prospective oil-bearing crops in its adaptability. For example, C. sativa seeds can germinate at +1°C and tolerate frosts of –8 to –10°C [13, 14, 16]. Moreover, camelina can grow on depleted soils and is resistant to fungal pathogens such as Leptosphaeria maculans and Alternaria brassicae [15].

Camelina is valued as one of the most promising oil-bearing crops for cultivation as a biofuel plant in the climate zones of Europe and North America. The prospective use of different Cruciferae members for biodiesel production has been widely discussed in recent years [17–20], but camelina is now regarded as the most desirable oil-bearing crop for this purpose [21]. Camelina oil was taken as a model to demonstrate the feasibility of jet biofuel production [22, 23], and this crop was selected by the International Air Transport Association (IATA) as a candidate for jet biofuel [24].

Camelina also has broad potential for improvement by both breeding and biotechnology. C. sativa is one of the closest relatives of Arabidopsis thaliana, the best-studied model plant. Therefore, camelina is considered an exceptionally attractive object for genetic engineering [25] and genome editing, although the latter is somewhat limited by the hexaploid structure of its genome [26]. Methods for effective in vitro transformation and regeneration have now been developed for camelina [25, 27], which significantly facilitates further manipulations with the plant.

Despite all of the above, improving camelina by classical breeding methods remains no less important [11]. However, this approach requires assessing the genetic polymorphism of the existing cultivars in the world's collections. For this purpose, the first C. sativa genetic map was built in 2006 using 157 AFLP markers and three SSR markers [28]. Other researchers have also studied the genetic diversity of camelina using a series of molecular markers, such as RAPD, AFLP, hTBP, SSR, and SNP [29–35].

Some of these studies have confirmed a low level of genetic diversity and polyploidy in camelina [28, 33, 36]. The sequencing of the C. sativa genome has additionally confirmed allohexaploidy in this plant [37], which may be correlated with its low genetic diversity. The possible participation of two or three wild camelina parent species in the formation of the polyploid C. sativa genome is also under discussion [38, 39]. Thus, although a sufficient number of camelina genotypes have been studied to date [39], the genetic diversity of the species is considered low. Moreover, the genetic resources of Eastern European C. sativa remain almost completely unstudied, although this region is thought to be the center of origin of this species.

One recent study examined the genetic distances between camelina cultivars of Polish and Ukrainian breeding [40]. The Ukrainian cultivars described in that study were predominantly products of Soviet-era breeding and did not reflect the existing genetic resources of C. sativa in Ukraine. According to the National Register of Cultivars in Ukraine (http://agro.me.gov.ua/ua/file-storage/reyestr-sortiv-roslin-ukrayini), as of 2020, nine cultivars of spring camelina have been registered (Table 1), and eight breeding lines are still in the final breeding stages at the Hryshko National Botanic Garden, National Academy of Sciences of Ukraine [41–44].

Table 1.   Spring camelina cultivars registered in Ukraine

In our earlier studies, we showed that some of these camelina genotypes can be used as an oil feedstock for producing high-quality diesel fuel [45]. We have already started to assess the genetic diversity of Ukrainian camelina samples using ISSR markers [46]; however, the results demonstrated the need to involve additional molecular markers and make this genetic analysis more detailed. Therefore, the objective of this study was to give an integrated molecular-genetic, biochemical, and morphometric assessment of the present C. sativa genetic resources of Ukrainian breeding and to evaluate the prospects for the further improvement of camelina cultivars.

MATERIALS AND METHODS

Plant material. For this study, we used spring camelina (C. sativa f. annua) samples from the collection of the Hryshko National Botanic Garden, National Academy of Sciences of Ukraine, including four cultivars (Mirazh, Klondike, Peremoha, Evro-12) and eight breeding lines (FEORZhYaF-1, FEORZhYaF-2, FEORZhYaF-3, FEORZhYaF-4, FEORZhYaF-5, FEORZhYaFD, FEORZhYaFCh, and FEORZhYaFChP).

Chromatographic analysis of fatty acid composition of seed lipids. A 10-mL test tube was filled with 60 mg of oil previously extracted from a given camelina sample, 4 mL of isooctane, and 200 μL of 2 M potassium hydroxide in methanol. The tube was shaken until a transparent solution was obtained, i.e., until completion of the transesterification reaction. To neutralize potassium hydroxide, 1 g of potassium sulfate monohydrate was added to the test tube. After sedimentation of the formed salts, the upper layer was removed and transferred into new 4-mL tubes. The obtained samples contained approximately 15 mg/mL of fatty acid methyl esters dissolved in isooctane.

Fatty acid methyl esters were further separated and identified by gas chromatography with a flame ionization detector (FID) using a GC-MS Agilent 7890B-7697 gas chromatograph (Agilent, United States) equipped with a Zebron ZB-FAME capillary column (Phenomenex Inc., United States) (60 m × 0.25 mm × 0.20 μm). Nitrogen was used as the carrier gas, while hydrogen was used to maintain the FID. The injector temperature was 250°C, and the column temperature was raised to 185°C. The injection volume was 1 μL, and the total duration of analysis was 40 min per sample. The separated fatty acids were identified by comparison with the chromatogram of a standard fatty acid solution (C10-C24, Supelco) obtained under the same conditions. The percentage distribution of peak areas on the chromatogram was recalculated relative to the molecular mass of each particular fatty acid. As a result, we obtained the content of each particular fatty acid in % mol.
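The recalculation from peak-area percentages to % mol can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that FID peak-area percentages approximate mass percentages and that the molar masses of the free fatty acids are used (the methyl esters are heavier by about 14.03 g/mol each). The input values in `example` are hypothetical and are not data from this study.

```python
# Molar masses (g/mol) of free fatty acids; standard reference values.
MOLAR_MASS = {
    "C16:0": 256.42,  # palmitic
    "C18:0": 284.48,  # stearic
    "C18:1": 282.46,  # oleic
    "C18:2": 280.45,  # linoleic
    "C18:3": 278.43,  # linolenic
    "C20:1": 310.51,  # gondoic (eicosenoic)
    "C22:1": 338.57,  # erucic
}


def area_to_mol_percent(area_percent):
    """Convert peak-area % to mol %, treating area % as mass % (assumption)."""
    # Relative number of moles of each acid: mass share divided by molar mass.
    moles = {fa: pct / MOLAR_MASS[fa] for fa, pct in area_percent.items()}
    total = sum(moles.values())
    # Normalize so that the shares sum to 100 mol %.
    return {fa: 100.0 * n / total for fa, n in moles.items()}


# Hypothetical peak-area percentages, for illustration only.
example = {
    "C16:0": 6.0, "C18:0": 3.0, "C18:1": 16.0, "C18:2": 22.0,
    "C18:3": 35.0, "C20:1": 15.0, "C22:1": 3.0,
}
mol = area_to_mol_percent(example)
for fa, value in mol.items():
    print(f"{fa}: {value:.2f} mol %")
```

Under these assumptions, acids lighter than the mixture average (such as C18:3) gain a slightly larger share in % mol, while heavier ones (such as C22:1) lose some.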

Isolation of DNA and polymerase chain reaction (PCR) conditions. Genomic DNA was isolated from seedlings (approximately 10 plants for each genotype) using the cetyltrimethylammonium bromide (CTAB) method [47]. For the SSR analysis, DNA was isolated from seven individual plants of each genotype, whereas the genomic DNA described above was used for all other analyses. The quality and amount of DNA were determined by electrophoresis in a 1.5% agarose gel and with an Eppendorf spectrophotometer. DNA samples were stored at –20°C. A 50-ng portion of DNA from each sample was used for PCR. Each PCR was repeated at least twice and included a negative control so that nonspecific amplification products could be detected in the subsequent electrophoretic analysis. PCR was conducted in 250 μL microtubes in a Thermal Cycler 2720 amplifier (Applied Biosystems, United States). The 10-μL reaction mixture contained 5× PCR buffer with ammonium sulfate, 2.5 mM MgCl2, 50 ng plant DNA, 1 μM of each primer, 0.2 mM of each deoxynucleotide triphosphate (dNTP), and 0.5 U Taq polymerase (Fermentas, Lithuania).

Genotyping samples using ISSR markers. The genetic diversity of the above-mentioned spring camelina samples was studied using ISSR markers by the methodology described earlier [46]. As in the previously published study, seven primers (Table 2) were used, some of which were first described as being based on microsatellite sequences of Triticum aestivum but are capable of working with a broad spectrum of species, including Chlorella vulgaris and C. pyrenoidosa [48].

Table 2. Sequences of ISSR primers used in the study

DNA was amplified using ISSR markers according to the following protocol: initial denaturing at 95°C for 5 min; 45 cycles of amplification (denaturing at 95°C for 1 min, primer annealing at the optimal temperature for 1 min, elongation at 72°C for 2 min); final elongation at 72°C for 7 min; and holding at 15°C [46]. The amplification products were separated by electrophoresis in a 1.5% agarose gel with ethidium bromide in 1× Tris/borate/EDTA (TBE) buffer. The O'GeneRulerTM 100 bp DNA Ladder (Fermentas, Lithuania) was used to determine fragment lengths. The fragments were visualized under UV light.

Genotyping samples using simple sequence repeat (SSR) markers. Polymorphism in the alleles of microsatellite sequences was studied using primers specific to seven different C. sativa SSR loci given in Table 3 [31, 33]. As has been shown earlier, the size of amplified fragments at these loci varied from 60 to 320 bp [40]. The amplification for all SSR markers was run according to the following protocol: initial denaturing at 94°C for 5 min; 40 cycles of amplification (denaturing at 94°C for 30 s, primer annealing at the optimal temperature for the given primer for 30 s, elongation at 72°C for 30 s); and final elongation at 72°C for 30 min [33].

Table 3.   Sequences of primers for SSR loci of C. sativa used in the study

PCR products were separated by electrophoresis in a 6% polyacrylamide gel under denaturing conditions in 1× TBE buffer [47]. The amplified fragments were separated for 1.5 h at a voltage of 350 V. The length of the target fragments was determined using the O'GeneRulerTM 50 bp DNA Ladder marker (Fermentas, Lithuania). The fragments were visualized by silver nitrate staining [49].

Genotyping samples using the β-tubulin intron length polymorphism method (TBP analysis). The TBP analysis was conducted according to the methodology described in [50]. We used pairs of degenerate primers that flank β-tubulin intron I (TBP analysis) and intron II (cTBP analysis), i.e., for TBP analysis (5' → 3'), F: AACTGGGCBAARGGNCAYTAYAC; R: ACCATRCAYTCRTCDGCRTTYTC [51], and for cTBP analysis (5' → 3'), F: GARAAYGCHGAYGARTGYATG; R: CRAAVCCBACCATGAARAARTG [52].

DNA for TBP and cTBP analysis was amplified under the following protocol: initial denaturing at 94°C for 3 min; 35 cycles of amplification (denaturing at 94°C for 30 s, primer annealing at 55°C for 40 s, elongation at 72°C for 1.5 min); final elongation at 72°C for 8 min; and holding at 15°C [50]. The conditions of electrophoretic separation of the obtained amplicons were the same as those used for the SSR analysis, except for the duration, which was 2–3 h in this case. The length of the clearest fragments was determined using the O'GeneRulerTM 100 bp Plus DNA Ladder marker (Fermentas, Lithuania).

Genotyping samples using the actin intron length polymorphism method (ILP analysis). To assess the polymorphism of actin gene introns, we used the previously developed degenerate primers specific to intron II of plant actin genes: ActIn_F: 5'-TGGCATCAYACNTTYTACAAYGA-3'; ActIn_R: 5'-CCMCCACCTDAGVACRATGTT-3' [53–55]. DNA was amplified under the following protocol: initial denaturing at 95°C for 3 min; 40 cycles of amplification (denaturing at 95°C for 45 s, primer annealing at 59°C for 45 s, elongation at 72°C for 1 min); final elongation at 72°C for 7 min; and holding at 15°C [53–55]. The conditions of electrophoretic separation of the obtained amplicons were the same as those used for the TBP and cTBP analyses, including the duration of separation and the molecular mass marker.

Statistical processing of data. The chromatographic data for the camelina lipid fatty acid composition were statistically treated using p-criterion. The coefficients for the relative weight of metabolic pathways for fatty acid synthesis were estimated according to [56, 57] with minor modifications, namely, the addition of two new coefficients proposed by analogy with the existing ones: the oleic elongation ratio (OER) and the gondoic elongation ratio (GER). The values of elongation ratio (ER), desaturation ratio (DR), oleic desaturation ratio (ODR), oleic elongation ratio (OER), linoleic desaturation ratio (LDR), and GER were estimated according to the following formulas:

$$\begin{gathered}
\text{ER} = \frac{\%\text{C20:1} + \%\text{C22:1}}{\%\text{C20:1} + \%\text{C22:1} + \%\text{C18:1} + \%\text{C18:2} + \%\text{C18:3}}, \\
\text{DR} = \frac{\%\text{C18:2} + \%\text{C18:3}}{\%\text{C20:1} + \%\text{C22:1} + \%\text{C18:1} + \%\text{C18:2} + \%\text{C18:3}}, \\
\text{ODR} = \frac{\%\text{C18:2} + \%\text{C18:3}}{\%\text{C18:1} + \%\text{C18:2} + \%\text{C18:3}}, \\
\text{OER} = \frac{\%\text{C20:1} + \%\text{C22:1}}{\%\text{C18:1} + \%\text{C20:1} + \%\text{C22:1}}, \\
\text{LDR} = \frac{\%\text{C18:3}}{\%\text{C18:2} + \%\text{C18:3}}, \\
\text{GER} = \frac{\%\text{C22:1}}{\%\text{C20:1} + \%\text{C22:1}},
\end{gathered}$$

where %C18:1; %C18:2; %C18:3; %C20:1; and %C22:1 are the values for the content of the corresponding fatty acid in % mol.
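The six coefficients can be computed directly from the % mol values, as in the following minimal sketch. The function name and the input values are hypothetical and chosen only to illustrate the arithmetic; they are not data from this study.

```python
def fatty_acid_coefficients(c18_1, c18_2, c18_3, c20_1, c22_1):
    """Return ER, DR, ODR, OER, LDR, and GER from fatty acid contents in % mol."""
    elongated = c20_1 + c22_1            # products of oleic acid elongation
    desaturated = c18_2 + c18_3          # products of oleic acid desaturation
    total = c18_1 + elongated + desaturated
    return {
        "ER": elongated / total,
        "DR": desaturated / total,
        "ODR": desaturated / (c18_1 + desaturated),
        "OER": elongated / (c18_1 + elongated),
        "LDR": c18_3 / desaturated,
        "GER": c22_1 / elongated,
    }


# Hypothetical contents in % mol, for illustration only.
coeffs = fatty_acid_coefficients(
    c18_1=16.0, c18_2=22.0, c18_3=36.0, c20_1=14.0, c22_1=2.0
)
for name, value in coeffs.items():
    print(f"{name} = {value:.4f}")
```

Because ER and DR share the same denominator, ER + DR plus the share of unconverted oleic acid always equals 1, which gives a quick consistency check on tabulated values.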

Based on the values obtained for fatty acid coefficients, we performed a hierarchical cluster analysis using the Origin 9.1 software. Clustering in building a dendrogram was performed using the unweighted pair group method with arithmetic averages (UPGMA).

To estimate and record the molecular weights of amplicons (for the ISSR, SSR, and TBP/cTBP markers used in this study), digital photographs of gels were analyzed using the GelAnalyzer software (http://www.gelanalyzer.com/). The amplicon bands were scored in a binary system, in which 1 denotes the presence of a band and 0 denotes its absence. The genetic distances between pairs of genotypes were determined using the Free Tree software [58] from the presence/absence matrix of amplified bands, based on the Nei and Li coefficient [59], which was estimated by the following equation:

$$S_{ij} = \frac{2N_{ij}}{N_{i} + N_{j}},$$

where Nij is the number of alleles present in both the i- and j-genotypes; Ni is the number of alleles present in the i-genotype; Nj is the number of alleles present in the j-genotype; i, j = 1, 2, … [60].
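The binary scoring and the similarity coefficient can be illustrated with a short sketch. Here the Nei and Li coefficient is taken as twice the number of bands shared by two genotypes divided by the sum of the bands in each genotype; the genotype names and band profiles are hypothetical, and this is not the Free Tree implementation.

```python
def nei_li_similarity(profile_i, profile_j):
    """Nei and Li similarity between two 0/1 band profiles of equal length."""
    if len(profile_i) != len(profile_j):
        raise ValueError("profiles must score the same set of band positions")
    n_i = sum(profile_i)                 # bands present in genotype i
    n_j = sum(profile_j)                 # bands present in genotype j
    n_ij = sum(1 for a, b in zip(profile_i, profile_j) if a == 1 and b == 1)
    if n_i + n_j == 0:
        return 0.0                       # convention chosen here for empty profiles
    return 2.0 * n_ij / (n_i + n_j)


def similarity_matrix(profiles):
    """Pairwise similarity for a dict mapping genotype name to band profile."""
    names = list(profiles)
    return {a: {b: nei_li_similarity(profiles[a], profiles[b]) for b in names}
            for a in names}


# Hypothetical band profiles (1 = band present, 0 = band absent).
profiles = {
    "line_A": [1, 1, 0, 1, 0, 1],
    "line_B": [1, 1, 0, 1, 1, 0],
    "line_C": [0, 1, 1, 0, 1, 0],
}
S = similarity_matrix(profiles)
print(S["line_A"]["line_B"])  # 3 shared bands out of 4 + 4, i.e. 0.75
```

A genetic distance for clustering can then be taken as 1 minus the similarity, which is one common choice rather than the only one.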

The estimated genetic similarity coefficients were used to perform a cluster analysis and build a dendrogram by the UPGMA method. The clustering results were assessed based on 1000 bootstrap replicates using the software mentioned above. The final dendrograms were visualized using the FigTree v1.4.2 software (http://tree.bio.ed.ac.uk/software/figtree).
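For readers unfamiliar with UPGMA, the following pure-Python sketch shows the average-linkage merging step on a small distance matrix. It is an illustration only: the function, the genotype labels, and the distances are hypothetical, and the bootstrap assessment used in the study is not reproduced here.

```python
def upgma(names, dist):
    """Cluster names by UPGMA.

    dist maps unordered name pairs, given as (a, b) tuples, to distances.
    Returns the tree as nested tuples and a list of merges with node heights.
    """
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (tree, size)
    d = {}
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i < j:
                d[(i, j)] = dist[(a, b)] if (a, b) in dist else dist[(b, a)]
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (i, j), dij = min(d.items(), key=lambda kv: kv[1])
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # Distance to the merged cluster is the size-weighted mean, which
        # equals the arithmetic mean over all original member pairs.
        new_d = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            new_d[k] = (size_i * dik + size_j * djk) / (size_i + size_j)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        for k, v in new_d.items():
            d[(k, next_id)] = v          # existing ids are always below next_id
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        merges.append((tree_i, tree_j, dij / 2.0))  # node height is half the distance
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# Hypothetical pairwise distances, for illustration only.
names = ["A", "B", "C"]
dist = {("A", "B"): 0.2, ("A", "C"): 0.6, ("B", "C"): 0.8}
tree, merges = upgma(names, dist)
print(tree)
```

In this toy example, A and B merge first, and C joins them at the mean of its distances to A and B.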

The heatmap presented in the study was calculated as a pairwise comparison matrix based on the morphological parameters of the above-mentioned camelina genotypes reported by us earlier [42]. The calculation was based on the following parameters: the height of plants in the flowering phase (cm), the number of lateral shoots on the stem in the flowering phase, the diameter of the stem in the flowering phase (mm), the number of leaves on the stem, the leaf length (cm), the leaf width (cm), the number of inflorescences on the main stem, the length of inflorescences (cm), the height of plants in the seed-formation phase (cm), the taproot length (cm), the number of lateral shoots, the number of pods on the main stem, the number of pods on the lateral shoots, the length of the fruit (mm), the width of the fruit (mm), the thickness of the fruit (mm), the length of the seed (mm), the width of the seed (mm), and the mass of 1000 seeds (g). The heatmap was visualized in the two-color variant using the online Heatmapper software (http://heatmapper.ca/pairwise/) [61].

RESULTS AND DISCUSSION

An extremely important factor determining the value of any oil-bearing crop is oil quality, which depends on its fatty acid composition. Therefore, we performed a chromatographic analysis of the lipid content and fatty acid composition of seeds from the studied camelina samples (Fig. 1). The content of linolenic (18:3) acid was 31.95–38.89% (the highest content was shown by the FEORZhYaFD breeding line), and that of linoleic (18:2) acid ranged from 20.04 to 24.89% (the highest in the Klondike cultivar). The highest content of oleic (18:1) acid was identified in the FEORZhYaF-2 breeding line (18.57%), whereas the content of erucic (22:1) acid varied from 1.08 to 2.01%. These data agree with the results of our previous studies on the fatty acid composition of oil from seeds of these genotypes [43–45]. Camelina seed oil, in general, is characterized by a high content of polyunsaturated fatty acids and a low level of erucic acid compared with other Cruciferae [11, 16]. The genotypes studied here are no exception.

Fig. 1. Fatty acid composition of oils from different camelina genotypes used in this study (%, p < 0.05)

In addition, we have estimated the content of the main fatty acid groups (saturated, polyunsaturated, and monounsaturated fatty acids, as well as fatty acids with the carbon chain length ≤C18 or >C18) in the oil of the studied camelina breeding lines and cultivars (Fig. 2). In general, the content of particular fatty acid groups was not significantly different among genotypes. The content of saturated fatty acids was the lowest in FEORZhYaF-2 (12.85%), whereas the highest level was recorded in FEORZhYaF-4 (16.36%). The highest content of polyunsaturated fatty acids was recorded in FEORZhYaFD (61.58%), while monounsaturated fatty acids had their highest level in FEORZhYaF-2 (31.71%). The highest content of short-chain (≤C18) fatty acids was recorded in FEORZhYaFD (87.01%). The samples FEORZhYaF-1, FEORZhYaF-3, FEORZhYaF-5, FEORZhYaFCh, FEORZhYaFChP, Mirazh, Klondike, Peremoha, and Evro-12 did not differ strongly from one another in these indicators. In our previous studies, the FEORZhYaFD breeding line was already proposed as the most promising genotype for use as a feedstock for biodiesel production [43–45]. The FEORZhYaF-2 breeding line differs from other genotypes by its high content of oleic (18:1) acid as well as by the lowest level of saturated acids.

Fig. 2. Distribution of fatty acid groups in the camelina oils: SFA, saturated fatty acids; PUFA, polyunsaturated fatty acids; MUFA, monounsaturated fatty acids; ≤C18, fatty acids with the carbon chain length 18 and less; >C18, fatty acids with the carbon chain length over 18 (%, p < 0.05); (1) FEORZhYaF-1; (2) FEORZhYaF-2; (3) FEORZhYaF-3; (4) FEORZhYaF-4; (5) FEORZhYaF-5; (6) FEORZhYaFD; (7) FEORZhYaFCh; (8) FEORZhYaFChP; (9) Mirazh; (10) Klondike; (11) Peremoha; (12) Evro-12.

At the next stage of our studies, we estimated fatty acid coefficients for each of the investigated camelina genotypes (Table 4). The values of these indicators allow us to assess the so-called "relative weight of a particular region in a metabolic pathway" and approximately determine the relative activity of the enzyme cascade acting at a particular stage of elongation and desaturation. ER describes the elongation of the oleic acid chain (C18:1) to eicosenoic (C20:1) and erucic (C22:1) acids, whereas DR describes the desaturation of oleic acid to linoleic (C18:2) and linolenic (C18:3) acids, ODR describes the desaturation from oleic to linoleic acid, and LDR describes the desaturation from linoleic to linolenic acid. ER and DR were estimated according to [57], while ODR and LDR were estimated according to [56]. We also proposed additional coefficients to describe the elongation processes in more detail: OER describes the relative weight of the elongation pathway from oleic (C18:1) acid to gondoic (C20:1) acid, and GER describes the elongation of gondoic acid to erucic (C22:1) acid. In summary, these coefficients indicate the % mol ratio between the content of particular fatty acids and the content of the acids that are the products of their conversion through the desaturation or elongation pathways.

Thus, according to Table 4, the FEORZhYaFD breeding line characterized by the highest content of short-chain (≤C18) fatty acids has the lowest ER value—0.1174. At the same time, DR is the highest for the FEORZhYaFD genotype—0.7128—since its seed oil composition is characterized by the highest content of polyunsaturated fatty acids compared with other genotypes.

Table 4.   Estimation data on relative weight coefficients for metabolic pathways of fatty acid synthesis

All camelina breeding lines and cultivars have a rather low GER coefficient value, varying within 0.0953–0.1603, which is consistent with the low content of erucic acid in these genotypes (2% and less). Other coefficients also change according to the same principle. For example, a low LDR value corresponds to a low content of linolenic acid, etc. The values of these coefficients allow us to assess not only the content of a particular acid but also the overall differences in the functioning of metabolic pathways for fatty acid synthesis, which can further serve as a guide for creating camelina lines with increased content of particular fatty acids.

To give an overall assessment of differences in the biochemical composition of lipids in the C. sativa seeds of Ukrainian breeding, we have built a dendrogram (Fig. 3) based on the values of the estimated fatty acid coefficients (Table 4), on which three main groups can be distinguished. Group C includes two genotypes, Klondike and FEORZhYaFChP, characterized by rather low LDR and GER values. Group A contains five genotypes: FEORZhYaF-1, Mirazh, FEORZhYaF-2, Peremoha, and FEORZhYaFD. Evro-12 and the FEORZhYaF-3, FEORZhYaF-4, FEORZhYaF-5, and FEORZhYaFChP breeding lines belong to group B. Thus, judging by the fatty acid composition, group A, which contains prospective breeding lines such as FEORZhYaF-2 and FEORZhYaFD, is the most interesting because of their special fatty acid composition. According to the results of our previous studies, it should also be noted that all group A members, with the exception of the Mirazh cultivar, were characterized by an especially high seed lipid content exceeding 42% [43–45].

Fig. 3. Dendrogram based on the hierarchical clustering analysis of fatty-acid coefficient values shown by different camelina genotypes.

According to different sources, the yield of spring camelina types may reach 2500 kg/ha [21] or 3320 kg/ha [14]. At the same time, the spring camelina cultivars in southern Poland can produce almost 2400 kg/ha under similar conditions, and the winter cultivars can produce yields reaching 2800 kg/ha (data of 2012, the most favorable year for cultivation) [40]. According to our data, the productivity of the studied genotypes reached 4111 kg/ha for the Evro-12 cultivar [45], which corresponds to a lipid yield of up to 1330 kg/ha [41]. As has previously been shown, the oil content in seeds of the cultivars studied by us varied within 36.04–43.89% [41, 43–45], whereas, according to different reports, the oil content of camelina seeds may reach 49% [11]. Some current approaches make it possible to increase the total fatty acid content in camelina seeds. For example, overexpression of the WRI1 gene may increase the camelina seed oil content by as much as 14% compared with untransformed plants [62].

To establish the level of polymorphism and genetic differences among the Ukrainian C. sativa breeding lines and cultivars, we have performed DNA profiling with seven ISSR primers (ISSR-3, ISSR-4, ISSR-5, ISSR-16, ISSR-18, ISSR-25, ISSR-62), which were used by us previously for camelina fingerprinting [46]. The electrophoregrams with the separation of amplicons obtained with ISSR-25 and ISSR-18 primers are presented in Fig. 4. In total, 81 loci have been detected, including 68 polymorphic ones with the level of polymorphism reaching 83.95%.
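The reported polymorphism level follows directly from the locus counts, as the short sketch below shows. The helper that counts polymorphic loci from a presence/absence matrix treats a locus as polymorphic when its band is present in some genotypes and absent in others; this is a common convention assumed here, and the small matrix is hypothetical.

```python
def polymorphism_percent(polymorphic_loci, total_loci):
    """Share of polymorphic loci among all detected loci, in percent."""
    if total_loci <= 0:
        raise ValueError("total_loci must be positive")
    return 100.0 * polymorphic_loci / total_loci


def count_polymorphic(band_matrix):
    """Count loci (columns) whose band is present in some rows but not all."""
    return sum(1 for column in zip(*band_matrix) if 0 < sum(column) < len(column))


# Counts reported in the text: 68 polymorphic loci out of 81.
level = polymorphism_percent(68, 81)
print(f"{level:.2f}%")  # prints 83.95%

# Hypothetical 3-genotype x 4-locus matrix (rows are genotypes).
toy = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
n_poly = count_polymorphic(toy)  # loci 2 and 4 vary among genotypes
```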

Fig. 4. Electrophoregrams of amplicons obtained for different camelina genotypes using ISSR markers: (a) ISSR-25: (1) Peremoha; (2) FEORZhYaF-5; (3) FEORZhYaFChP; (4) FEORZhYaFD; (5) FEORZhYaF-4; (6) FEORZhYaF-2; (7) FEORZhYaF-1; (8) FEORZhYaF-3; (9) FEORZhYaFCh; (10) Klondike; (11) Mirazh; (12) Evro-12; (b) ISSR-18: (1) FEORZhYaF-1; (2) FEORZhYaF-2; (3) FEORZhYaF-3; (4) FEORZhYaF-4; (5) FEORZhYaF-5; (6) FEORZhYaFD; (7) FEORZhYaFCh; (8) FEORZhYaFChP; (9) Peremoha; (10) Evro-12; (11) Mirazh; (12) Klondike. M, molecular weight marker.

These data differ substantially from the earlier results [46], mainly because a higher number of loci were detected in this study with the markers ISSR-3 (nine loci in total), ISSR-4 (11 loci), ISSR-16 (10 loci), and ISSR-18 (12 loci), which increased the total number of detected loci. The proportion of polymorphic loci also proved to be larger (83.95%) than in the previous results (76.5%), primarily because of higher levels of polymorphism associated with the ISSR-3 (78%) and ISSR-16 (90%) primers compared with the previous studies of these camelina genotypes [46]. No significant differences have been shown for the ISSR-5, ISSR-25, and ISSR-62 primers in either the number of loci or the proportion of polymorphic ones, which varied from 77 to 93%.

These differences in the results of molecular-genetic profiling with ISSR markers are primarily associated with the method itself and its specific features. The ISSR analysis is a variation of the Random Amplification of Polymorphic DNA (RAPD) method with partial specificity to microsatellite repeats [63]. Since ISSR primers select sequences for amplification in PCR in an essentially random manner, the number and length of the obtained fragments may vary, although the reproducibility of the method is much better than that of RAPD [63]. This is also one of the reasons why the same ISSR primers can work with a broad range of organisms (from monocotyledonous and dicotyledonous plants to microalgae) [46, 48]. Therefore, to verify the obtained results, we also used other DNA profiling methods based on evaluating the polymorphism of both the SSR loci and the lengths of introns in genes of cytoskeletal proteins (actin and β-tubulin). Nevertheless, we estimated the coefficients of genetic distances on the basis of the molecular-genetic profiles obtained with different ISSR markers and built a dendrogram (Fig. 5).

Fig. 5. Dendrogram based on the polymorphism data for the amplified fragments obtained using ISSR markers in different camelina genotypes.

As is clearly seen from Fig. 5, the dendrogram is subdivided into two main branches designated A and B. All cultivars (Peremoha, Klondike, Evro-12, and Mirazh) belong to branch A, as does the FEORZhYaF-5 breeding line (A1), whereas FEORZhYaFCh (A3) falls out from the main group. The FEORZhYaF-1, FEORZhYaF-2, FEORZhYaF-3, and FEORZhYaFChP breeding lines are grouped at some distance from the B2 subclade. This distribution hardly correlates with the dendrogram based on the fatty acid coefficients (Fig. 3). The exceptions include the FEORZhYaF-1 and FEORZhYaF-2 breeding lines, which fall under a common clade. The tree almost completely coincides with the previously obtained data [46], which confirms that the results are reproducible for these genotypes despite the specific features of the ISSR analysis method. There are no other published studies so far in which ISSR markers were used for assessing C. sativa polymorphism. Although methods based on the amplification of random regions are not so widely used today, it has recently been shown that the application of RAPD primers together with SSR markers helps to discriminate between cultivars of Polish and Ukrainian-Soviet breeding [40].

In addition, we analyzed seven plants of each genotype with seven pairs of SSR primers (the P3C3, P3H4, P4B3, P4C2, P4E6, P6E4, and LiB19 loci). The number of alleles detected per locus varied from one (the monomorphic P6E4 locus) to six (the P4C2 locus). We detected two alleles at the LiB19 locus and two alleles at the P4B3 locus. Three allelic variants were detected at each of the P3H4, P4E6, and P3C3 loci. Figure 6 shows electrophoregrams with the amplified SSR loci of the Peremoha cultivar.

Fig. 6. Fragments of the electrophoregram with amplified SSR loci in different plants of the Peremoha cultivar.

In general, the cultivars and breeding lines studied by us were characterized by a fairly low level of heterozygosity; in particular, no heterozygous plants were found at the LiB19 and P6E4 loci. The largest number of heterozygotes was detected at the P4E6 locus, for example in the FEORZhYaF-3 line (one plant) and in the Evro-12 cultivar (two plants). Other studies of the genetic diversity of camelina that used these SSR markers also reported a high level of homozygosity in plants of this species [33]. Based on the molecular genetic profiles obtained at different microsatellite loci, we have estimated the coefficients of genetic distances and built a dendrogram (Fig. 7). As is clearly seen, the genotypes are distributed in two large branches (A and B).

Fig. 7. Dendrogram based on the polymorphism data on SSR loci in different camelina genotypes.

We should note that the majority of breeding lines are placed in branch A, at some distance from the Mirazh, Peremoha, and Evro-12 cultivars located in branch B. It is interesting that branch B is itself subdivided into subclades coinciding with similar subclades of the dendrogram based on the polymorphism of ISSR markers (Fig. 5). Thus, the B1 subclade of the SSR tree is identical to the A1 subclade of the ISSR tree and contains the FEORZhYaF-5 breeding line and the Peremoha cultivar. The same may be stated about the Mirazh and Evro-12 cultivars, which belong to both the B2 subclade (the SSR tree, Fig. 7) and the A2 subclade (the ISSR tree, Fig. 5).

A similar situation is observed for the FEORZhYaF-1, FEORZhYaF-2, and FEORZhYaF-3 breeding lines, which fall into the same subclade in both dendrograms. In both cases, FEORZhYaF-1 and FEORZhYaF-2 also share a common branch of the lowest order, which demonstrates their high genetic similarity. The FEORZhYaFD breeding line is consistently placed in both cases in a separate subclade, far from the above-mentioned samples and in the middle of the dendrogram. This indicates a sizeable genetic distance from the other breeding lines, although not one large enough for FEORZhYaFD to fall out of the main groups into another clade. It may also indicate that FEORZhYaFD is genetically closer to the members of the B clade (Fig. 7) than the other representatives of the A clade are.

As an additional tool to assess genetic polymorphism in the studied camelina breeding lines and cultivars, we used molecular genetic markers that detect intron length polymorphism (ILP), among them markers based on the length polymorphism of intron II of the actin gene. This approach was earlier used for different plant species, including flax (Linum usitatissimum) [53], as well as for Solanaceae and Poaceae representatives [54]. We employed it to assess the genetic diversity of Ukrainian camelina cultivars and breeding lines. The results of electrophoretic separation of the actin intron II amplicons in different C. sativa genotypes are given in Fig. 8. The DNA fragments ranged in length from 550 bp to approximately 3000 bp. The majority of the fragments did not differ between camelina genotypes; however, slight polymorphism was detected in the 1000–1100 bp zone. These fragments were poorly resolved in all repeats but were reproducible, which allowed us to conclude that these amplicons were not nonspecific PCR products.

Fig. 8. Electrophoregrams with actin intron amplicons in different camelina genotypes. The figure shows the molecular mass range in which polymorphic fragments were identified: (1) Peremoha; (2) FEORZhYaF-5; (3) FEORZhYaFChP; (4) FEORZhYaFD; (5) FEORZhYaF-4; (6) FEORZhYaF-2; (7) Evro-12; (8) FEORZhYaF-1; (9) FEORZhYaF-3; (10) FEORZhYaFCh; (11) Klondike; (12) Mirazh; (M) molecular weight marker.

Although these fragments were not resolved well enough to measure their lengths precisely, their sizes appeared similar in the FEORZhYaF-2, FEORZhYaF-3, FEORZhYaFD, Peremoha, and Klondike genotypes, for which the amplicon length was close to 1000 bp. The FEORZhYaF-1, FEORZhYaF-4, FEORZhYaF-5, FEORZhYaFCh, and FEORZhYaFChP breeding lines and the Evro-12 and Mirazh cultivars were likewise characterized by fragments of similar size. Because the detected polymorphism was very low, these molecular genetic profiles were not used further for building dendrograms. In general, the actin intron length polymorphism method may be more suitable for detecting polymorphism among higher taxonomic units (at the level of different species and genera), for assessing geographically distant populations, or for studying interspecies hybrids and identifying their parent forms [53–55].

We also performed a TBP analysis to assess the level of polymorphism in the length of intron I of the β-tubulin gene in the studied C. sativa genotypes. The electrophoregram (Fig. 9a) shows that the amplicon lengths ranged from 295 bp to 3200 bp. A large number of amplification products (nearly 50) was observed in all studied camelina samples, which is explained by the allohexaploid structure of this plant’s genome and by the large number of β-tubulin isotypes in C. sativa (nearly 20) [32].

Fig. 9. Electrophoregram with amplicons of introns (a) I and (b) II of the β-tubulin gene in different C. sativa genotypes. The arrows indicate the most polymorphic fragments: (1) Mirazh; (2) FEORZhYaF-2; (3) FEORZhYaF-3; (4) Evro-12; (4) FEORZhYaF-4; (5) FEORZhYaF-5; (6) Peremoha; (7) Klondike; (8) FEORZhYaFCh; (9) FEORZhYaF-1; (10) FEORZhYaF-5; (11) FEORZhYaFChP; (12) FEORZhYaFD; (M) molecular weight marker.

Most bands on the electrophoregram are monomorphic (Fig. 9a). In total, seven clear polymorphic bands can be distinguished in the following ranges: 295–300 bp (band 1'), 350–400 bp (bands 1, 2), 600–700 bp (bands 3, 4), and 1000–1100 bp (bands 5–7). However, band 1' (295 bp) is probably a nonspecific amplification product, since it is clear in some samples but faint or absent in others. The FEORZhYaF-5 and FEORZhYaFChP samples are characterized by 370 bp amplicons, whereas the corresponding amplicons in the other samples have a different size (375 bp). The 660 bp band is observed in four of the 12 studied genotypes (Mirazh, FEORZhYaF-1, FEORZhYaF-5, and FEORZhYaFChP), whereas the remaining eight genotypes are characterized by a 650 bp band. The 1085 bp band characterizes the Mirazh cultivar alone. In addition, in the majority of repeats, the molecular profile of Mirazh contained one amplicon approximately 1150 bp long, which was not encountered in other genotypes.
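Band patterns of this kind are usually converted into a presence/absence table before any similarity is computed. The following Python sketch shows one way to do so, under stated assumptions: the four profiles include only the 370/375, 650/660, and 1085 bp size classes named in the paragraph above, the 500 bp band shared by all samples is a hypothetical stand-in for the many monomorphic bands, and the helper names `score_bands` and `polymorphic_bands` are introduced here for illustration only.

```python
def score_bands(profiles):
    """Turn {genotype: set of band sizes} into a 0/1 presence matrix.

    Returns (bands, matrix): bands is the sorted list of all observed sizes,
    and matrix maps each genotype to a list of 0/1 flags aligned with bands.
    """
    bands = sorted(set().union(*profiles.values()))
    matrix = {
        genotype: [int(band in present) for band in bands]
        for genotype, present in profiles.items()
    }
    return bands, matrix


def polymorphic_bands(bands, matrix):
    """Return the bands present in some, but not all, genotypes."""
    n_genotypes = len(matrix)
    result = []
    for i, band in enumerate(bands):
        count = sum(row[i] for row in matrix.values())
        if 0 < count < n_genotypes:
            result.append(band)
    return result


# Partial profiles (bp); the 500 bp band is a hypothetical monomorphic band.
profiles = {
    "Mirazh": {375, 500, 660, 1085},
    "FEORZhYaF-1": {375, 500, 660},
    "FEORZhYaF-5": {370, 500, 660},
    "Klondike": {375, 500, 650},
}
bands, matrix = score_bands(profiles)
variable = polymorphic_bands(bands, matrix)
```

Only the bands returned by `polymorphic_bands` carry information for discriminating genotypes; bands shared by every sample contribute equally to all profiles.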

Thus, by assessing the length polymorphism of intron I of the β-tubulin gene, we successfully discriminated between different samples of the C. sativa cultivars and breeding lines. The majority of amplicons, however, were shared across the genetic profiles of all cultivars and breeding lines, the exceptions lying in the molecular weight ranges mentioned above. Moreover, approximately half of the samples have a very similar or even identical genetic profile, which agrees with reports of a low level of genetic diversity among camelina cultivars [28, 33]. In addition, in our case, the majority of the studied C. sativa genotypes have a common breeding origin.

Therefore, to refine the obtained information and differentiate the samples more accurately, we additionally performed the cTBP analysis, i.e., studied the length polymorphism of intron II of the β-tubulin gene. Electrophoretic separation of the amplified intron II fragments also revealed a large number of fragments, the majority of which were again monomorphic across all samples (Fig. 9b).

DNA fragments were detected within the range of approximately 350–2990 bp, but clear and reproducible fragments were present only from 350 bp to 1190 bp; therefore, only the latter were scored. In total, seven polymorphic fragments were detected, with approximate sizes of 515 bp (band 1*), 520 bp (band 2*), 525 bp (band 3*), 550 bp (band 4*), 570 bp (band 5*), 795 bp (band 6*), and 1015 bp (band 7*). The studied genotypes also differed rather strongly in the presence or absence of particular amplicons. For example, band 4* was present only in FEORZhYaF-1, FEORZhYaF-2, FEORZhYaF-3, and the Peremoha cultivar, whereas band 5* was characteristic of all other genotypes. Bands 4* and 5* most likely correspond to the same region, which differs in length between genotypes. As for bands 1*, 2*, and 3*, two bands from this range are present in each of the FEORZhYaF-3, FEORZhYaF-5, Mirazh, and Evro-12 genotypes, which may indicate a high level of heterozygosity in these genotypes. Most probably, these bands are also amplification products of homologous regions differing in length. The same applies to the duplicate band 7* in FEORZhYaF-4. These nuances were detected only thanks to the cTBP analysis, whereas, in the analysis of intron I length polymorphism, all profiles of the studied genotypes had an equal number of bands.

Based on the molecular genetic profiles of the studied camelina genotypes, we estimated the Nei and Li similarity coefficient. According to the TBP analysis, the coefficient values vary from 0 (for the FEORZhYaF-5–Mirazh and FEORZhYaFChP–Mirazh pairs) to 1 (for the majority of pairs). The Nei genetic distance ranged from 0 to 1.242. For the cTBP profiles, the Nei and Li similarity coefficient varies from 0 to 1, and the Nei genetic distance from 0 to 1.099. Based on the estimated Nei and Li coefficients, we built a dendrogram (Fig. 10) that shows the distribution of the studied camelina samples by their TBP/cTBP genetic profiles.

Fig. 10. Dendrogram based on the polymorphism data on β-tubulin intron lengths (by I (TBP) and II (cTBP) introns) in different camelina genotypes.
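For reference, the Nei and Li similarity coefficient between two band profiles is commonly written as S = 2·n_ab / (n_a + n_b), where n_ab is the number of shared bands and n_a and n_b are the numbers of bands in each profile. The Python sketch below implements this standard formula for 0/1 profiles. The profile names G1–G4 and their values are hypothetical, the convention of returning 1.0 for two empty profiles is an assumption made here, and the software actually used to build the dendrogram is not stated in this section.

```python
from itertools import combinations


def nei_li(a, b):
    """Nei-Li (Dice) similarity of two equal-length 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must be scored on the same set of bands")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Assumption: two profiles without any band are treated as identical.
    return 2 * shared / total if total else 1.0


def pairwise_similarity(profiles):
    """Return {frozenset({name_a, name_b}): similarity} for all pairs."""
    return {
        frozenset((a, b)): nei_li(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical presence/absence profiles over four polymorphic bands.
profiles = {
    "G1": [1, 0, 1, 0],
    "G2": [1, 0, 1, 0],
    "G3": [0, 1, 0, 1],
    "G4": [1, 1, 0, 0],
}
similarity = pairwise_similarity(profiles)
```

Identical profiles give a value of 1, and profiles with no shared band give 0, which matches the range of values reported above.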

Three major branches (A, B, C) can be seen in the dendrogram. The Mirazh cultivar (Institute of Oilseed Crops, Ukrainian Academy of Agrarian Sciences (UAAS)) falls out of all groups, which indicates its significant genetic distance from the genotypes from the Hryshko National Botanic Garden and from the Klondike cultivar (Institute of Agriculture, UAAS). Group C includes the FEORZhYaF-5 and FEORZhYaFChP breeding lines, whereas the FEORZhYaF-3 breeding line and the Evro-12 cultivar belong to group B. In the previous analyses, these samples were not grouped in a similar way in any case (Figs. 5, 7), with the exception of the dendrogram based on fatty acid coefficients (Fig. 3), in which FEORZhYaF-3 and Evro-12 both fell into the B group, although in different branches. As with the SSR and ISSR markers, the FEORZhYaF-1, FEORZhYaF-2, and FEORZhYaF-4 breeding lines belong to the A group (Fig. 10). Interestingly, the Klondike and Peremoha cultivars are located in the same branch of the lowest order, as was also observed in the ISSR dendrogram, in which these genotypes belonged to the same branch. The genetic closeness of the FEORZhYaFD and FEORZhYaF-4 breeding lines is also confirmed by their location in adjacent branches in Fig. 10 and by their placement in the same branch of the ISSR tree. FEORZhYaFD and FEORZhYaF-4 were also fairly close according to the SSR markers: both belonged to the A group, albeit to different subgroups (Fig. 7).

To compare the studied camelina genotypes by morphometric indicators, we used data from our previous studies [42]. The heatmap reflects the level of morphological diversity among the genotypes (Fig. 11). The lowest similarity to other samples was shown by the Evro-12 cultivar, which differs by significantly higher productivity [45], larger fruits and seeds, and, in general, larger plants compared to other genotypes [42]. The Peremoha cultivar differed slightly less but still substantially. The highest similarity was shown by the following pairs of genotypes: FEORZhYaFD–FEORZhYaF-1, FEORZhYaFChP–FEORZhYaF-5, Mirazh–FEORZhYaF-5, and Klondike–FEORZhYaFD. All other genotypes differed moderately from one another.

Fig. 11. Heatmap of pairwise comparison showing the level of difference between genotypes in morphological parameters of plants (black, completely identical; white, most different): (1) FEORZhYaF-1; (2) FEORZhYaF-2; (3) FEORZhYaF-3; (4) FEORZhYaF-4; (5) FEORZhYaF-5; (6) FEORZhYaFD; (7) FEORZhYaFCh; (8) FEORZhYaFChP; (9) Mirazh; (10) Klondike; (11) Peremoha; (12) Evro-12.

If we compare these results with those of the molecular genetic analysis with different markers, there are no clear coincidences in the distribution of samples, but some similarities are present. For example, the FEORZhYaFChP and FEORZhYaF-5 breeding lines belong to the same branch in the TBP dendrogram (Fig. 10) but to entirely different groups in the other cases (Figs. 5, 7). Mirazh and FEORZhYaF-5 belong to a common group according to the ISSR and SSR analyses, although they fall into different subclades within these groups. For the Klondike–FEORZhYaFD pair, there is agreement with the dendrograms in Figs. 7 and 10, where these genotypes belong to the same branch within the same group. No such correspondence was observed in other cases.

If we compare the analysis of differences in fatty acid composition (Fig. 3) with the analysis of genetic polymorphism (Figs. 5, 7, 10), the pictures differ substantially. It is known that, when genetic distances are determined from noncoding regions of the genome, the results may differ substantially from the similarity of genotypes established by morphological or biochemical traits [64]. Therefore, the correlation between morphology, fatty acid composition, and the polymorphism of molecular genetic markers remains unclear.

To obtain camelina genotypes with increased contents of particular fatty acids, it would theoretically be expedient to select breeding pairs in which the desired trait is most strongly expressed. At the same time, these breeding lines (or cultivars) should be as genetically distant from each other as possible. The possibility of obtaining heterozygous lines has been shown for different species of Cruciferae: rape (B. napus) [65, 66], Indian mustard (B. juncea) [67], white cabbage (B. oleracea var. capitata) [68], Chinese leafy or bok choy cabbage (B. rapa ssp. chinensis) [69], and camelina [70]. Moreover, the use of molecular markers not only for identifying genetic distances between cultivars but also for confirming their homozygosity and selecting parent genotypes to obtain heterozygous first-generation plants was successfully demonstrated in Indian mustard using RAPD markers [71] and in Pekinese cabbage (B. rapa ssp. pekinensis) [71] using SSR markers and sequencing [72].

The polyploidy of the C. sativa genome provides a high diversity of allelic variants of particular genes, which may facilitate obtaining hybrids with a high level of heterozygosity and, accordingly, the appearance of novel phenotypic variants with a pronounced heterosis effect [37]. Camelina is also a primarily self-pollinating plant, which leads to the natural formation of inbred lines with a high level of homozygosity [33, 70]. Our SSR analysis confirmed this: the number of homozygous plants within a cultivar or breeding line significantly exceeded the number of heterozygotes.

Within the context of this study, we can speak of selecting camelina genotypes that have an attractive fatty acid composition (such as the FEORZhYaF-2 and FEORZhYaFD genotypes) and demonstrate a sufficiently high level of seed productivity and oil content. This applies primarily to genotypes of the A group (Fig. 3), including FEORZhYaF-1, Mirazh, FEORZhYaF-2, Peremoha, and FEORZhYaFD. As mentioned above, these breeding lines and cultivars, except FEORZhYaF-1, are valued for higher-than-average productivity and a high seed lipid content (more than 42%) [43, 45], which provides an oil yield exceeding 1200 kg/ha, whereas this value for the studied genotypes varies from 1058 kg/ha (FEORZhYaF-1) to 1330 kg/ha (Evro-12) [41]. At the same time, the morphometric indicators of these genotypes, such as productivity, are close to the sample averages, which may additionally contribute to heterosis, since, as shown earlier, this phenomenon is more strongly manifested when crossing camelina plants with moderate values of quantitative traits [70]. In view of this, crossing pairs such as Peremoha–FEORZhYaF-2 and Peremoha–FEORZhYaFD, as well as Mirazh–FEORZhYaF-2 and Mirazh–FEORZhYaFD, are more promising in terms of the probability of a heterosis effect, since in the majority of cases these pairs are sufficiently distant in the dendrograms based on the polymorphism of molecular genetic markers (Figs. 5, 7, 10). Moreover, according to the analysis of β-tubulin intron length polymorphism, the Mirazh cultivar falls completely out of the main group, which indicates its genetic remoteness.
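The relation between seed lipid content and oil yield mentioned above can be illustrated with simple arithmetic. The Python sketch below assumes that oil yield per hectare equals seed yield multiplied by the oil mass fraction, ignoring extraction losses and seed moisture; this simplification and both function names are introduced here for illustration, and the resulting seed yield is a derived figure, not a value reported in the study.

```python
def oil_yield_kg_ha(seed_yield_kg_ha, oil_fraction):
    """Oil yield (kg/ha), assuming oil = seed yield x oil mass fraction."""
    if not 0 <= oil_fraction <= 1:
        raise ValueError("oil_fraction must be between 0 and 1")
    return seed_yield_kg_ha * oil_fraction


def seed_yield_needed(oil_target_kg_ha, oil_fraction):
    """Seed yield (kg/ha) needed to reach a target oil yield."""
    if not 0 < oil_fraction <= 1:
        raise ValueError("oil_fraction must be in (0, 1]")
    return oil_target_kg_ha / oil_fraction


# At a 42% seed lipid content, 1200 kg/ha of oil corresponds to roughly
# 2857 kg/ha of seed under the simplifying assumption above.
needed = seed_yield_needed(1200, 0.42)
print(round(needed))
```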

The FEORZhYaF-2–FEORZhYaFD combination may also be of interest in view of the specific fatty acid composition of these lines. This pair of genotypes is not as genetically distant as the previous ones, since in the majority of cases the plants are grouped within the same clade, although in rather distant branches. The same applies to the Mirazh–Peremoha pair, whose components are assigned to the same group in all cases except Fig. 10. The Peremoha–FEORZhYaF-1 and Mirazh–FEORZhYaF-1 pairs are insufficiently attractive because of the low productivity of the FEORZhYaF-1 line. The FEORZhYaF-1–FEORZhYaF-2 combination is the least preferable, since in all cases these breeding lines show the highest level of genetic similarity among all samples.
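The reasoning used above, in which candidates are first shortlisted by trait and then pairs that are genetically most distant are preferred, can be sketched as a simple ranking. In the Python example below, the names `line_A`, `line_B`, and `cultivar_C`, the distance values, and the function `rank_pairs` are all hypothetical placeholders; no pairwise distances from this study are reproduced.

```python
from itertools import combinations


def rank_pairs(candidates, distance):
    """Rank all pairs of candidates from most to least genetically distant.

    distance maps frozenset({a, b}) to a genetic distance; ties are broken
    alphabetically so that the ordering is deterministic.
    """
    pairs = []
    for a, b in combinations(sorted(candidates), 2):
        pairs.append((a, b, distance[frozenset((a, b))]))
    pairs.sort(key=lambda pair: (-pair[2], pair[0], pair[1]))
    return pairs


# Hypothetical genotypes already shortlisted for a desirable trait.
candidates = ["line_A", "line_B", "cultivar_C"]
# Hypothetical pairwise genetic distances, invented for illustration.
distance = {
    frozenset(("line_A", "line_B")): 0.1,
    frozenset(("line_A", "cultivar_C")): 0.8,
    frozenset(("line_B", "cultivar_C")): 0.6,
}
ranking = rank_pairs(candidates, distance)
```

The first entries of `ranking` are the most distant pairs among the shortlisted genotypes. Whether such pairs actually show heterosis remains to be tested by crossing.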

It is also possible to form other combinations from the genetically most distant camelina samples, but the resulting hybrids may be less promising as a feedstock for biofuel production in view of their productivity parameters and oil quality. This is supported by the fact that, although all the studied genotypes came from the same geographical region, they show rather significant variability among themselves. Although crossing combinations of camelina from geographically distant populations are known to be potentially the most advantageous, a heterosis effect is theoretically also possible among the studied genotypes, which, of course, must be verified in practice.