Introduction

Flaxseed/linseed (Linum usitatissimum L.; 2n = 2x = 30), a self-pollinated annual crop, has been known to mankind since the beginning of civilization and is considered one of the earliest crops to have been domesticated (Zohary and Hopf 2000). Northwest India (Central Asiatic Centre) is one of the centres of origin of linseed, the others being the Near-Eastern Centre, the Mediterranean Centre and the Abyssinian Centre (Vavilov 1951). True to its species name ‘usitatissimum’ (most useful in Latin), linseed has multiple uses but is mainly grown for either fiber or seed oil. This preferential cultivation resulted in the development of two distinct morphotypes, the ‘flax type’ (80–120 cm tall, less branched) and the ‘linseed type’ (40–60 cm tall, highly branched) (Zuk et al. 2015; Kaur et al. 2017). Linseed oil has a unique drying property owing to its high linolenic acid content (45–60%), which makes it suitable for several industrial applications such as the manufacture of paints, inks, varnishes and linoleum flooring (Kurt and Bozkurt 2006). Linseed oil has also been used traditionally for culinary purposes in certain pockets of India, especially in the states of Chhattisgarh and Maharashtra. Furthermore, varieties with seed oil suitable for culinary use are also available (Hosseinian et al. 2004; Kaur et al. 2017). Recently, interest in linseed has increased because of its nutraceutical applications and its use as a functional food ingredient, largely owing to its high content of α-linolenic acid (ALA, an omega-3 fatty acid) and of lignans, which are known to have anti-cancer properties (Goyal et al. 2014; Bassett et al. 2009).

In India, linseed is the second most important rabi oilseed crop; however, it is grown mostly as a rainfed crop by farmers with limited resources. India is the sixth largest producer of linseed, with a production of 174,000 tonnes from 320,000 hectares, but its average productivity (543 kg/ha) is far lower than the world average (1053 kg/ha) (FAOSTAT 2018). Among the contributing factors, drought and high temperature are known to reduce linseed yield as well as oil quality (Dillman and Hopper 1943; Painter et al. 1944; Ford and Zimmerman 1964). At warmer temperatures, reduced seed set and partial to complete necrosis of ovules have also been reported (Kraft et al. 1963). High temperature during flowering also has negative effects on boll set (Ford and Zimmerman 1964). Therefore, early maturity is a desirable trait in linseed as it protects the crop from heat, drought, frost and disease (Hall et al. 2016). In the Indian context, early maturity is particularly important because linseed is largely grown under rainfed conditions and in the utera cropping system in rice fallows (Singh and Satapathy 2019), where the use of available moisture to raise the crop is a crucial determinant of yield. Because linseed is a facultative long-day plant (Domantovich et al. 2012), flowering initiation under short days is a prerequisite for plants to enter the reproductive stage and mature early. Interestingly, genotypic differences have been reported for the photoperiod response (Zhang 2013). Therefore, it appears sensible to look for variability in the available plant genetic resources of linseed and to identify genetic resources for early flowering, maturity and important agro-morphological traits.

Phenotypic characterization and evaluation of crop germplasm accessions using standard descriptors can help identify trait-specific superior accessions and thereby improve the utilization of germplasm (Diederichsen and Richards 2003; Bioversity International 2007). This approach has been used by researchers in linseed (Diederichsen 2001; Diederichsen et al. 2006; Dikshit and Sivaraj 2015; Worku et al. 2015; Kaur et al. 2018). For genetic diversity studies, SSRs have been the markers of choice in different crop plants (Pandey et al. 2013; Ganesan et al. 2014; Tiwari et al. 2015; Singh et al. 2017; Kumari et al. 2019). The genetic diversity of linseed accessions has also been assessed using SSR markers (Soto-Cerda et al. 2012; Choudhary et al. 2017; Chandrawati et al. 2017).

From an agricultural perspective, the transition from vegetative to reproductive growth is a crucial determinant that enables flowering and seed development in plants. Over the course of time, farmers and plant breeders have selected variants for flowering time, seeking advantages in yield, agricultural flexibility and the eco-geographical range of crops (Cockram et al. 2007). This offers opportunities, even in well-domesticated crop plants, to tap variability for different traits, including flowering and maturity time, by making use of plant genetic resources. Flowering time is a complex trait, and studies on Arabidopsis thaliana have shed light on the complex genetic network that detects environmental and internal signals (Roux et al. 2006). Among this suite of genes, AP2 is considered one of the most important regulators of the network of genes controlling floral gene expression and flower and seed development (Jofuku et al. 1994).

In the present study, 220 linseed germplasm accessions were characterized for agro-morphological traits, including flowering and maturity related traits, over 2 years. Based on field evaluation, a panel of accessions with early and late flowering and maturity was selected for studying genetic diversity using SSR markers. Further, a gene encoding an AP2 domain transcription factor (a putative floral homeotic gene) was sequenced from two early and two late accessions to study trait-specific allelic variants.

Material and methods

Plant materials and field evaluation

The study comprised a total of 220 linseed germplasm accessions obtained from the National Gene Bank, ICAR-National Bureau of Plant Genetic Resources (ICAR-NBPGR), New Delhi (28° 38′ 53.7′′ N 77° 09′ 05.4′′ E). The 220 accessions consisted of 53 exotic collections (EC) (from Argentina, Australia, USA, Hungary and Holland), 165 indigenous collections (IC) (from 15 different states of India, especially Maharashtra, Uttar Pradesh, Orissa, Himachal Pradesh and Madhya Pradesh) and two released varieties, namely Binwa (IC0420772) and Kota Barani (IC0611328) (Online Resource 1). The accessions were evaluated for agro-morphological traits along with three check varieties (T-397, Shekhar and Kartika) for two consecutive rabi seasons (2017–2018 and 2018–2019) at the ICAR-NBPGR research farm, New Delhi, in an Augmented Block Design (ABD) with six blocks and the three checks replicated twice in each block. Each accession was grown in a single row of 3 m length, with a between-row spacing of 45 cm and a within-row spacing of 10 cm. The accessions identified for specific traits were further validated during 2019–2020 at two locations, ICAR-NBPGR, New Delhi and the ICAR-NBPGR Regional Station, Akola (20° 42′ 03.2′′ N 77° 01′ 53.6′′ E). Recommended agronomic practices were followed during the various stages of crop growth.
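As a rough check on the field layout, the number of plots per season can be worked out from the figures given above. The sketch below is illustrative only: it assumes that the three checks each occupy two plots in every block and that plants sit at uniform spacing along a row. The function names are hypothetical and are not part of any software used in the study.

```python
# Field layout arithmetic for one season of the augmented block design.
N_ACCESSIONS = 220         # unreplicated test entries (from the text)
N_BLOCKS = 6               # blocks (from the text)
N_CHECKS = 3               # T-397, Shekhar, Kartika (from the text)
CHECK_REPS_PER_BLOCK = 2   # assumed reading of "replicating twice in each block"

ROW_LENGTH_CM = 300        # 3 m row
PLANT_SPACING_CM = 10      # within-row spacing


def plots_per_season(n_entries, n_blocks, n_checks, reps_per_block):
    """Total plots = unreplicated entries + replicated check plots."""
    check_plots = n_blocks * n_checks * reps_per_block
    return n_entries + check_plots


def plants_per_row(row_length_cm, spacing_cm):
    """Approximate number of plant positions in a row at uniform spacing."""
    return row_length_cm // spacing_cm


total_plots = plots_per_season(
    N_ACCESSIONS, N_BLOCKS, N_CHECKS, CHECK_REPS_PER_BLOCK
)
row_plants = plants_per_row(ROW_LENGTH_CM, PLANT_SPACING_CM)
print(total_plots, row_plants)
```

Under these assumptions the checks account for 36 of the plots in a season, which is what allows block effects to be estimated for the unreplicated accessions.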

Phenotyping for agro-morphological traits and estimation of oil content

Observations were recorded on 25 agro-morphological traits, comprising 18 quantitative and 7 qualitative traits. The quantitative traits were days to 5% flowering (DF5), days to 50% flowering (DF50), days to 95% flowering (DF95), days to 80% maturity (DM), flower diameter (FD; mm), plant height (PH; cm), number of primary branches (PB), number of capsules per plant (CPP), number of seeds per capsule (SPC), seed weight per capsule (SWPC; mg), capsule diameter (CD; mm), capsule area (CA; mm²), seed length (SL; mm), seed breadth (SB; mm), seed area (SA; mm²), seed yield per plant (YPP; g), thousand seed weight (TSW; g) and oil content (OC; %). The qualitative traits were plant growth habit (PGH), flower aestivation (FA), flower color (FC), flower shape (FS), stamen color (SC), anther color (AC) and seed color (SDC). Data were recorded from three randomly selected plants from the middle of each row following NBPGR descriptors (Mahajan et al. 2000). For PGH, SDC and the flowering and maturity time traits, the entire plot/row was considered for recording the observations. Grain morphometry was done using scanner-based Grain Analysis software (version 1.3). Oil content was estimated using petroleum ether as the solvent in a Soxhlet extraction system as per AOAC method number 920.39, using three replicates per accession. Oil estimation was done for only one season (2017–2018).

Statistical analysis

For each year, the data for each quantitative trait were analysed using a statistical tool for ABD (Rathore et al. 2005). The homogeneity of variances was tested using the F-test. For traits with homogeneous variances, the combined analysis was performed on the adjusted means; for traits with heterogeneous variances (DF5, DF50, DF95 and DM), the data were transformed following Aitken’s transformation and further analysis was performed on the combined transformed data using the augmentedRCBD package of ‘R’ (Aravind et al. 2019). Adjusted means were used for calculating descriptive statistics, pairwise correlations, the phylogenetic tree and Principal Component Analysis (PCA) using PAST software v3.26b (Hammer et al. 2001). Pearson’s linear correlation was used to establish pairwise correlations between the traits. A correlation was considered significant only if the p-value was less than or equal to 0.05. A phylogenetic tree of the 220 accessions was constructed based on the 18 quantitative traits using the UPGMA method and exported to MEGA X (Kumar et al. 2018) for visualisation.
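The correlation step can be illustrated with a minimal sketch. The study used PAST for this analysis; the pure-Python functions below are a stand-in written only for illustration, with hypothetical names and made-up trait values. They stop at the t statistic, because converting it into a p-value needs a t-distribution routine from a statistics library.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up adjusted means for two traits, for illustration only.
capsules_per_plant = [120.0, 150.0, 180.0, 210.0, 260.0]
yield_per_plant = [3.1, 3.9, 4.2, 5.5, 6.0]

r = pearson_r(capsules_per_plant, yield_per_plant)
t = t_statistic(r, len(capsules_per_plant))
print(round(r, 3), round(t, 3))
```

The resulting t value would then be compared with the two-sided critical value at the 0.05 level for n − 2 degrees of freedom, which mirrors the significance threshold described above.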

Genetic diversity using SSR markers

Seeds of single-plant progeny from a panel of 34 accessions selected for flowering-maturity time were grown under laboratory conditions, and 2-week-old seedlings were used for DNA isolation. Genomic DNA was extracted from fresh leaf tissue using a plant DNA isolation kit (Qiagen) as per the manufacturer’s instructions. The purified DNA was checked by gel electrophoresis on 0.8% (w/v) agarose (G-Biosciences) in 1X TAE buffer, and its quality was checked on a Nanodrop (DS-11 spectrophotometer). SSR primers reported by Cloutier et al. (2012) were used for this study. After preliminary screening of 31 SSR primers, 29 primers were selected for PCR amplification (Online Resource 2). PCR was carried out in a 20 μL reaction mixture containing 50 ng genomic DNA as template, 1X PCR buffer, 2.5 mM MgCl2, 0.2 mM of each dNTP, 1 µM forward and reverse primers and 1 U Taq DNA polymerase. PCR was performed in a thermal cycler (Takara) with the following programme: initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 48–52 °C for 30 s and extension at 72 °C for 1 min, followed by a final extension at 72 °C for 7 min. The PCR-amplified products were resolved on a 3% MetaPhor agarose gel (Lonza Rockland, USA) stained with ethidium bromide, in 1X TAE buffer at a constant 120 V for ~ 4 h or more (depending on the separation of bands). A 100 bp DNA ladder (G-Biosciences) was used as the molecular weight marker for estimating fragment sizes. The gels were then visualized in a gel documentation system (G-Box, Syngene). Band/allele size was scored using Alpha Imager software (AlphaView SA, version 3.4.0.0) with manual curation.
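The reaction mixture above is stated in final concentrations, so pipetting volumes follow from the usual dilution relation C1·V1 = C2·V2. The sketch below is a worked example under stated assumptions: every stock concentration (10X buffer, 25 mM MgCl2, 10 mM dNTPs, 10 µM primers, 5 U/µL Taq, 50 ng/µL DNA) is an assumed value that does not come from the text, the buffer is assumed to contain no MgCl2, and 1 µM is read as the concentration of each primer.

```python
FINAL_VOLUME_UL = 20.0  # reaction volume from the text

# name: (assumed stock concentration, final concentration from the text)
# Both values of a pair are in the same unit, given in the name.
COMPONENTS = {
    "PCR buffer (X)": (10.0, 1.0),
    "MgCl2 (mM)": (25.0, 2.5),
    "dNTPs, each (mM)": (10.0, 0.2),
    "forward primer (uM)": (10.0, 1.0),
    "reverse primer (uM)": (10.0, 1.0),
}


def volume_needed(stock, final, total_volume):
    """Volume of stock to add, from C1 * V1 = C2 * V2."""
    return final * total_volume / stock


def reaction_mix(total_volume=FINAL_VOLUME_UL, taq_units=1.0,
                 taq_stock_u_per_ul=5.0, dna_ng=50.0,
                 dna_stock_ng_per_ul=50.0):
    """Return a dict of component volumes (uL), with water making up the rest."""
    volumes = {
        name: volume_needed(stock, final, total_volume)
        for name, (stock, final) in COMPONENTS.items()
    }
    volumes["Taq polymerase"] = taq_units / taq_stock_u_per_ul
    volumes["template DNA"] = dna_ng / dna_stock_ng_per_ul
    water = total_volume - sum(volumes.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    volumes["water"] = water
    return volumes


mix = reaction_mix()
for name, volume in mix.items():
    print(f"{name}: {volume:.2f} uL")
```

With different stock solutions only the assumed values in `COMPONENTS` and the keyword arguments need to change; the final concentrations stay as reported.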

GenAlEx v6.5 (Peakall and Smouse 2012) was used to calculate the molecular distance matrix from the SSR data, which was used to construct an unweighted neighbour-joining tree using MEGA X (Kumar et al. 2018). GenAlEx was also used for the analysis of molecular variance (AMOVA) and to calculate the number of effective alleles, Shannon’s information index, and observed and expected heterozygosity. Polymorphism Information Content (PIC) was calculated using the polysat package (Clark and Jasieniuk 2011; Clark and Schreier 2017) in R. Model-based population structure was studied using STRUCTURE v2.3.4 (Pritchard et al. 2000). The analysis was performed for K = 1 to 10 with 5 iterations at each K, following the admixture model with a burn-in period of 50,000 steps followed by 50,000 MCMC repetitions. The online programme “Structure Harvester” (http://taylor0.biology.ucla.edu/structureHarvester/) was used to calculate the ΔK value. The proportion of the genome of each individual belonging to each inferred population (admixture) was also estimated.
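The per-locus diversity statistics named above can all be written as simple functions of allele frequencies. The sketch below uses the textbook definitions (Ne = 1/Σp², He = 1 − Σp², I = −Σp·ln p, and the Botstein-type PIC); it is meant as an illustration of what the indices measure and is not a reproduction of the internals of GenAlEx or polysat. The function names and the example frequencies are hypothetical.

```python
import math
from itertools import combinations


def _check(freqs):
    """Allele frequencies at one locus must be positive and sum to 1."""
    if any(p <= 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must be positive and sum to 1")


def effective_alleles(freqs):
    """Effective number of alleles, Ne = 1 / sum(p_i^2)."""
    _check(freqs)
    return 1.0 / sum(p * p for p in freqs)


def expected_heterozygosity(freqs):
    """Expected heterozygosity (gene diversity), He = 1 - sum(p_i^2)."""
    _check(freqs)
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon's information index, I = -sum(p_i * ln p_i)."""
    _check(freqs)
    return -sum(p * math.log(p) for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    _check(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies at a three-allele SSR locus.
example = [0.5, 0.3, 0.2]
print(round(effective_alleles(example), 3),
      round(expected_heterozygosity(example), 3),
      round(shannon_index(example), 3),
      round(pic(example), 3))
```

Population-level values such as those reported per population are then averages of these per-locus quantities across loci.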

Gene amplification, sequencing and analysis

For PCR amplification of the AP2 floral homeotic gene, specific primers were designed using the gene sequence of locus Lus10015055. PCR amplification was performed using the sequence-specific primers (Online Resource 2), and the amplicon was electrophoresed on an agarose gel. The desired product was gel-eluted using a DNA gel elution kit (Qiagen). The purified amplicon was subjected to DNA sequencing. For phylogenetic analysis, orthologs of the linseed floral homeotic AP2 gene (Lus10015055) from 23 plant species were used. Protein sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/genome/), except for Linum usitatissimum, obtained from Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html), and Hordeum vulgare, obtained from Ensembl Plants (https://plants.ensembl.org/index.html). Multiple sequence alignment (MSA) was done using the standalone version of T-Coffee (Notredame et al. 2000). The phylogenetic tree was constructed using MEGA X (Kumar et al. 2018) following the Maximum Likelihood method with default settings. Gene structure was visualized using Gene Structure Display Server 2.0 (Hu et al. 2015).

Results

Phenotypic and genotypic variability

A wide range of variability was observed for most of the 18 quantitative traits studied. The estimates for the 18 quantitative traits are given in Table 1, and the frequency distributions of important traits are shown in Online Resource 3. Considerable variation was recorded for TSW (2.69–12.52 g), CPP (51.14–438.87), PB (3.74–18.32), SA (6.15–14.87 mm²), CA (25.88–53.04 mm²) and SWPC (16.77–91.28 mg) (Table 1). Significant variation in most of the quantitative traits was also evident from the analysis of variance (Online Resource 4).

Table 1 Descriptive statistics and genetic variability analysis of important quantitative traits in 220 linseed accessions

The genetic coefficient of variation (GCV) was highest for CPP, followed by TSW and PB. As expected, broad-sense heritability (hBS) was highest for OC, the seed traits (SA, SB, SL) and DF95. Interestingly, flowering initiation (DF5) showed lower hBS than completion of flowering (DF95). The lowest heritability was observed for YPP and SWPC. Genetic advance was highest for TSW, followed by SA, PB and FD, and lowest for YPP, followed by CPP (Table 1).
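The text does not state which formulas were used for GCV, heritability and genetic advance. A minimal sketch of the commonly used definitions is given below, assuming that genotypic and phenotypic variances are available from the analysis of variance and that genetic advance is computed at 5% selection intensity (k = 2.06). The function names are hypothetical and the numbers in the usage lines are made up; they are not results from this study.

```python
import math

# Standardized selection differential at 5% selection intensity (assumed).
K_FIVE_PERCENT = 2.06


def gcv(genotypic_variance, trait_mean):
    """Genetic coefficient of variation, in percent of the trait mean."""
    return 100.0 * math.sqrt(genotypic_variance) / trait_mean


def broad_sense_heritability(genotypic_variance, phenotypic_variance):
    """Broad-sense heritability as a proportion (multiply by 100 for percent)."""
    return genotypic_variance / phenotypic_variance


def genetic_advance(genotypic_variance, phenotypic_variance, k=K_FIVE_PERCENT):
    """Expected genetic advance, GA = k * sqrt(Vp) * h2, in trait units."""
    h2 = broad_sense_heritability(genotypic_variance, phenotypic_variance)
    return k * math.sqrt(phenotypic_variance) * h2


def genetic_advance_percent_of_mean(genotypic_variance, phenotypic_variance,
                                    trait_mean, k=K_FIVE_PERCENT):
    """Genetic advance expressed as a percentage of the trait mean."""
    advance = genetic_advance(genotypic_variance, phenotypic_variance, k)
    return 100.0 * advance / trait_mean


# Made-up variance components for a single trait, for illustration only.
vg, vp, mean_value = 4.0, 5.0, 20.0
print(round(gcv(vg, mean_value), 2),
      round(100.0 * broad_sense_heritability(vg, vp), 2),
      round(genetic_advance(vg, vp), 3))
```

Because genetic advance depends on both heritability and the phenotypic standard deviation, a trait can combine high heritability with a modest expected advance when its overall variation is small.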

The superior trait-specific accessions identified in the 2-year evaluation were further validated during 2019–2020 at two locations, New Delhi and Akola. The means over the four environments for the selected accessions and respective traits are given here. For early flowering, four accessions were identified with DF50 of about 60 days or less (IC0096496: 60.28 days, IC0096539: 59.95 days, IC0523807: 58.44 days and IC0525939: 57.28 days), whereas the best check, T-397, recorded 76.56 days for DF50. Two accessions, IC0523807 and IC0525939, were early maturing, with 118.26 and 116.76 days to maturity, respectively, whereas the best check, T-397, took 128.13 days to attain physiological maturity. Three accessions had a high number of capsules per plant (IC0053278: 267.52, IC0280320: 246.32 and IC0384578: 280.26), while the best check, Kartika, recorded 242.28 CPP. With respect to seed size, accessions EC0041469 and EC0041700 showed seed areas of 14.04 and 14.63 mm², respectively, against 11.72 mm² for the best check, Shekhar. Interestingly, EC0041700 also showed the largest capsule size, with a capsule area of 52.58 mm² compared with 40.18 mm² for the check variety Shekhar. Accessions EC0041469, EC0041700 and EC0041720 showed high thousand seed weight (9.51 g, 10.93 g and 9.49 g, respectively) compared with the best check variety, Shekhar (TSW 8.13 g).

Frequency distribution of the seven qualitative traits (plant growth habit, flower color, flower shape, flower aestivation, anther color, stamen color and seed color) showed fair representation of each category (Online Resource 5). For plant growth habit (PGH), bushy, erect and semi-erect accessions accounted for 28.7, 31.4 and 39.9%, respectively. For the qualitative traits, the most common type of linseed germplasm accession was characterized by semi-erect growth habit (39.91%), funnel-shaped flowers (54.26%), blue corolla (64.12%) and valvate aestivation (45.29%).

Clustering of accessions based on agro-morphological traits

Based on the 18 quantitative traits, the accessions were grouped into 10 clusters (Fig. 1). The smallest cluster had 2 accessions and the largest comprised 40 accessions. Although there was no clear geographic pattern in the clustering, as exotic and indigenous accessions were mixed and found in all but one cluster, most of the clusters showed specific trait features. Cluster-I comprised 4 late-flowering accessions (DF50, DF95) with the highest values for CPP and YPP and the lowest values for seed traits (SA, SB, SL, SWPC and TSW) and capsule traits (CA, CD). Cluster-II comprised 2 accessions that showed late flowering initiation (DF5) and late maturity (DM), with the highest plant height. Cluster-V, with 31 accessions, showed the lowest average plant height. Cluster-VIII and Cluster-IX showed the lowest averages among the clusters for maturity (DM) and flowering (DF5, DF50 and DF95) traits. Cluster-IX accessions also showed the highest average TSW (Online Resource 6).
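The UPGMA procedure behind such a clustering can be sketched in a few lines. The example below is a simplified illustration, not the PAST implementation: it assumes Euclidean distance between trait vectors (the distance measure is not stated in the text, and traits would normally be standardized first), and the accession names and trait values are made up.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two trait vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def upgma(trait_vectors):
    """Return the merge history as (cluster_a, cluster_b, height) tuples."""
    sizes = {name: 1 for name in trait_vectors}
    dist = {
        frozenset((a, b)): euclidean(trait_vectors[a], trait_vectors[b])
        for a, b in combinations(trait_vectors, 2)
    }
    merges = []
    while len(sizes) > 1:
        # Join the two clusters that are currently closest.
        a, b = min(combinations(sizes, 2),
                   key=lambda pair: dist[frozenset(pair)])
        merged = (a, b)
        merged_size = sizes[a] + sizes[b]
        # Distance to the new cluster is the size-weighted average.
        for other in sizes:
            if other == a or other == b:
                continue
            dist[frozenset((merged, other))] = (
                sizes[a] * dist[frozenset((a, other))]
                + sizes[b] * dist[frozenset((b, other))]
            ) / merged_size
        merges.append((a, b, dist[frozenset((a, b))] / 2.0))
        del sizes[a], sizes[b]
        sizes[merged] = merged_size
    return merges


# Made-up two-trait vectors for four hypothetical accessions.
toy = {
    "acc1": (0.0, 0.0),
    "acc2": (0.0, 2.0),
    "acc3": (10.0, 0.0),
    "acc4": (10.0, 4.0),
}
merges = upgma(toy)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Cutting the resulting tree at a chosen height yields discrete groups, which is how a dendrogram of accessions can be summarized as a fixed number of clusters.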

Fig. 1 Clustering of 220 linseed accessions based on 18 quantitative traits

Correlation between agro-morphological traits

Pairwise correlations between the selected traits were largely as expected. Yield per plant (YPP) showed its strongest positive correlation with capsules per plant (CPP) (r = 0.77, p ≤ 0.05), followed by seed weight per capsule (SWPC) and thousand seed weight (TSW) (Fig. 2). Thousand seed weight (TSW) showed strong positive correlations with seed area (SA), capsule area (CA) and seed weight per capsule (SWPC). Apart from the other flowering traits, days to 50% flowering (DF50) showed positive, albeit moderate, correlations with days to maturity (DM), number of primary branches (PB) and plant height (PH). However, DF50 was negatively correlated with yield-related traits such as seed area (SA), capsule area (CA) and thousand seed weight (TSW). Days to maturity showed positive correlations with all three flowering traits (DF5, DF50 and DF95) and with plant height (PH).

Fig. 2 Pairwise correlation between quantitative traits. The color gradient of the circles from blue to red denotes positive to negative correlation values, respectively. The size of the circle indicates the strength of the p value. Correlations were considered significant, and displayed in the figure, only if the p-value was less than or equal to 0.05. The parts above and below the diagonal (running from left to right) are mirror images

Principal component analysis

PCA of the 18 agro-morphological traits showed that around 84% of the variation was contributed by the first five components, with PC1 and PC2 contributing 45.31% and 15.48%, respectively (Table 2). For PC1, the largest contribution came from seed area (SA), followed by seed length (SL), thousand seed weight (TSW) and seed breadth (SB). For PC2, the largest contribution came from days to maturity (DM), followed by plant height (PH), flower diameter (FD), and days to 50% (DF50) and 95% flowering (DF95).

Table 2 Principal component analysis (PCA) of 18 quantitative traits of linseed

Variability in flowering and maturity traits

About 50% of the accessions flowered (DF50) before 80 days and earlier than all three check varieties, whereas only about 10% of the accessions were very late (Online Resource 3). With respect to maturity (DM), only 45% of the accessions matured before 140 days. A few accessions (EC0041735, EC0041753, IC0096539 and IC0096523) also showed flowering initiation under short days in Delhi conditions. Based on the 2-year evaluation of flowering and maturity related traits, a panel of 34 accessions, comprising 19 early and 15 late flowering/maturing accessions, was prepared (Table 3) to study genetic diversity at the molecular level using SSR markers. The early phenotype of two representative accessions (IC0523807 and IC0525939) under field conditions is shown in Fig. 3.

Table 3 List of 34 early and late germplasm accessions for flowering and maturity time traits

Fig. 3 Variation in flowering time of early accessions, IC0523807 and IC0525939 along with check variety Kartika

Genetic diversity and population structure based on SSR markers

A total of 29 SSR markers were used to study the genetic diversity of the panel of 34 linseed accessions. Representative PCR amplifications of four SSRs are shown in Fig. 4A. A total of 89 alleles were observed across the 29 loci, with a mean PIC value of 0.42 (range 0.2–0.65) (Table 4). Cluster analysis based on the SSR dissimilarity matrix showed 2 clusters. Cluster I had two sub-clusters, I-a and I-b, with 19 and 4 accessions, respectively. Interestingly, all the accessions in sub-cluster I-a were from the early flowering-maturing category, including three exotic accessions, whereas all four accessions in sub-cluster I-b were exotic and late flowering-maturing. Cluster II comprised 11 accessions, all of the late flowering-maturing phenotype, of which seven were exotic and four indigenous (Fig. 4B).

Fig. 4 SSR marker-based molecular diversity among selected linseed accessions. A panel of 34 accessions was selected based on extreme phenotypes for flowering and maturity traits. A Representative SSR profiles of the 34 shortlisted linseed accessions using four SSR primer pairs. Names of the SSR markers are given at the bottom of each gel. B Clustering of the 34 linseed accessions (19 early and 15 late flowering/maturing) based on SSR marker data. Filled circles and square boxes indicate early and late flowering-maturing accessions, respectively

Table 4 Summary statistics of SSR markers in two populations of linseed germplasm

To understand the genetic relatedness of the 34 accessions, population structure based on the SSR data of 29 loci was studied using the model-based programme STRUCTURE. The number of distinct populations was determined from the maximum Delta K value plotted against K (K = 1 to K = 10) using the ‘Structure Harvester’ programme. The maximum Delta K value was observed for K = 2 (Fig. 5A); therefore, two distinct populations were considered for the 34 accessions. The bar plot (Fig. 5B) clearly shows two populations, with 19 accessions in population-I and 15 accessions in population-II. The majority of accessions in population-I were exotic (12 of 19), whereas all but two accessions in population-II were indigenous. Most of the accessions in both populations were pure (score > 0.9), with the exception of four accessions (IC0345397, EC0041619, IC0319846 and EC0041753) from population-I, which showed some admixture (Fig. 5B). Population-I and population-II were mostly represented by late and early flowering-maturing accessions, respectively.
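The Delta K criterion used to choose K can be illustrated with a short sketch. It follows one common way of computing the statistic: the absolute second-order difference of the mean log-likelihood across runs, divided by the standard deviation of the log-likelihood at that K. The log-likelihood values below are invented for illustration (three runs per K rather than the five used in the study), and the function name is hypothetical; this is not the Structure Harvester code.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Delta K per K from a dict mapping K to a list of ln P(D), one per run.

    Each K needs at least two runs so that a standard deviation exists.
    """
    means = {k: mean(values) for k, values in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at both ends of the tested K range
        spread = stdev(log_likelihoods[k])
        if spread == 0:
            continue  # avoid division by zero when all runs agree exactly
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / spread
    return result


# Invented ln P(D) values, for illustration only.
toy_runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -892, -888],
    4: [-885, -880, -890],
}
toy_delta_k = delta_k(toy_runs)
best_k = max(toy_delta_k, key=toy_delta_k.get)
print(toy_delta_k, best_k)
```

Because the statistic needs both neighbours of K, it cannot be evaluated at the smallest or largest K tested, which is why a range such as K = 1 to 10 yields Delta K values only for K = 2 to 9.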

Fig. 5 Model-based clustering of 34 linseed accessions based on SSR data in STRUCTURE for K = 2 populations (A, B)

Genetic diversity within each population was studied by calculating the effective number of alleles (Ne), Shannon’s information index (I) and expected heterozygosity (He). The average effective number of alleles (Ne) for Population-I and Population-II was 1.88 and 2.05, respectively. The average gene diversity (He) for Population-I (0.41) was lower than that of Population-II (0.47). Shannon’s information index (I) for Population-I and Population-II was 0.67 and 0.78, respectively (Table 4). Analysis of molecular variance (AMOVA) based on the SSR data showed that most of the molecular variance occurred within individuals (54%), followed by among individuals (30%) and among populations (16%) (Fig. 6).

Fig. 6 Distribution of molecular variance as per AMOVA of SSR data from 34 linseed accessions

Sequencing of putative floral homeotic AP2 gene

A gene encoding an AP2 domain transcription factor from L. usitatissimum (Lus10015055) was targeted for PCR amplification using specific primers. A full-length gene of around 2.3 kb was PCR-amplified from two early (IC0523807, IC0525939) and two late (EC0115148, EC0718827) accessions of linseed. The PCR amplicons were sequenced, and the verified sequences from these accessions have been submitted to NCBI GenBank (GenBank accession numbers: MT001915, MT001916, MT001917, MT001918). As in Lus10015055 from CDC Bethune, the gene comprises eight introns ranging in size from 72 to 218 bp (Fig. 7A). Sequence analysis and alignment of the gene sequences from all four accessions revealed a total of 11 single nucleotide polymorphisms (SNPs) and one indel. Five SNPs were in the coding region, of which four were non-synonymous (nsSNPs), and six SNPs were in the intronic regions of the gene. Three of the five coding-region SNPs (positions 21, 106 and 217) and three intronic SNPs (positions 691, 958 and 960) showed one variant in both late accessions and the alternate variant in both early accessions (Fig. 7B). Further, among the non-synonymous SNPs, those at positions 106 and 217 changed the amino acid tyrosine (Y) at positions 36 and 73 in the late accessions to asparagine (N) in the early accessions (Fig. 7B, C). These SNPs are important because they distinguish the early from the late accessions. However, the other two nsSNPs could not distinguish between early and late accessions. Phylogenetic analysis of the linseed AP2 floral homeotic gene with 23 plant orthologs showed that the linseed gene is closest to those of Populus trichocarpa Torr. & A. Gray, Gossypium hirsutum L. and Gossypium raimondii Ulbr. (Fig. 8). The nearest neighbouring clade consisted of Glycine max (L.) Merr., Glycine soja (L.) Merr., Vigna radiata (L.) R. Wilczek, Vigna angularis (Willd.) Ohwi & H. Ohashi and Arachis hypogaea L.
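The link between the SNP positions and the affected amino acid positions can be checked with simple codon arithmetic. The sketch below assumes that positions 21, 106 and 217 are counted from the first base of the start codon and lie before the first intron, so that they can be read directly as coding-sequence positions. The text does not give the codons involved: TAT is used purely as an example of a tyrosine codon, and the function names are hypothetical.

```python
# Only the codons needed for this illustration (standard genetic code).
CODON_TABLE_SUBSET = {"TAT": "Y", "TAC": "Y", "AAT": "N", "AAC": "N"}


def codon_position(cds_position):
    """Map a 1-based coding-sequence position to (codon number, base in codon)."""
    if cds_position < 1:
        raise ValueError("positions are 1-based")
    codon_number = (cds_position - 1) // 3 + 1
    base_in_codon = (cds_position - 1) % 3 + 1
    return codon_number, base_in_codon


def substitute(codon, base_in_codon, new_base):
    """Return the codon with one base (1-based index) replaced."""
    index = base_in_codon - 1
    return codon[:index] + new_base + codon[index + 1:]


for position in (21, 106, 217):
    print(position, codon_position(position))

# Assumed example: a T-to-A change at the first base of a tyrosine codon.
late_codon = "TAT"
early_codon = substitute(late_codon, 1, "A")
print(CODON_TABLE_SUBSET[late_codon], "->", CODON_TABLE_SUBSET[early_codon])
```

Under these assumptions, positions 106 and 217 fall on the first base of codons 36 and 73, which agrees with the reported amino acid positions; a first-base change is also what a tyrosine-to-asparagine substitution requires in the standard genetic code.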

Fig. 7 A Genomic structure of the gene encoding the putative floral homeotic APETALA 2 gene in linseed. Filled boxes and lines represent exons and introns, respectively. B SNP haplotypes of the putative floral homeotic APETALA 2 (AP2) gene in early and late flowering-maturing germplasm accessions of linseed. Nucleotide letters in green font are from exon regions and those in red font are from intron regions. NCBI GenBank accession numbers of the putative floral homeotic AP2 gene sequence variants for the respective early and late linseed germplasm accessions are shown in brackets. C Multiple sequence alignment of the translated amino acid sequences of the putative floral homeotic APETALA 2 gene from 2 early accessions, IC0523807 and IC0525939, and 2 late flowering-maturing accessions, EC0115148, along with CDC Bethune (Lus10015055). The two AP2 domains in the floral homeotic protein sequences are underlined. Unfilled boxes indicate amino acids altered as a result of non-synonymous SNPs in the gene sequences

Fig. 8 Phylogenetic analysis of putative linseed floral homeotic AP2 gene along with orthologs from 23 other plant species. Amino acid sequence was used for phylogenetic analysis using MEGA X

Discussion

High genetic variability in linseed germplasm accessions for important agro-morphological traits

The present study revealed high genetic variability for key agro-morphological traits such as YPP, CPP, TSW, SWPC and PB, in congruence with earlier studies on linseed genetic diversity (Diederichsen et al. 2013; Chandrawati et al. 2017; Kaur et al. 2018; You et al. 2017). For DF50, DM and OC, relatively less variability was observed, in agreement with You et al. (2017), Dikshit and Sivaraj (2015) and Nizar and Mulani (2015). For most of the quantitative traits, the magnitude of GCV in the present study was slightly lower than in recent reports (Chandrawati et al. 2017; Kaur et al. 2018).

It is interesting to note that several trait-specific accessions, not reported earlier, have been identified in this study for important agro-morphological traits such as early flowering, early maturity, high number of capsules per plant, large capsule size, high thousand seed weight and bold seed (Table 2). These accessions are expected to harbor potential alleles for the respective traits and could therefore be included in linseed improvement programmes. Broad-sense heritability (hBS) of a trait gives an account of experimental or environmental error and is therefore a measure of the accuracy of the experiment (You et al. 2016, 2017). With the exception of YPP, CPP, PB, PH, SWPC and DM, hBS was more than 75% for all the traits, suggesting accuracy and repeatability of the experiment. Interestingly, high broad-sense heritability was observed for the flowering time traits DF5 (81.25), DF50 (90.54) and DF95 (94.64), suggesting good potential for genetic enhancement (Table 1). High estimates of hBS for flowering time traits in linseed were also reported earlier (Chandrawati et al. 2017; You et al. 2017; Kaur et al. 2018). Further, a significant, albeit moderate, positive correlation of maturity (DM) with flowering traits (DF95 and DF50) and plant height (PH) (Fig. 2) suggested that selection for early flowering and short plant height may aid breeding for early-maturing linseed. A similar observation was reported for Canadian spring wheat cultivars (Iqbal et al. 2006).

As linseed is grown in India largely under rainfed and utera cultivation, utilization of residual moisture and avoidance of terminal heat and drought are crucial for seed yield (Singh and Satapathy 2019; Kaur et al. 2017). From this perspective, early flowering is critical for the plant to enter the reproductive stage. Linseed is considered a facultative long-day plant, and a short-day photoperiod delays flowering and the transition from the vegetative to the reproductive stage (Domantovich et al. 2012; Sun et al. 2019). However, linseed genotypes differ in their photoperiod response (Zhang 2013). Therefore, it is imperative to search for genetic variability in flowering and maturity traits. By evaluating 220 linseed accessions over two consecutive years, a few early and late flowering as well as maturing accessions were identified. Accordingly, a panel of 34 accessions (19 early and 15 late flowering-maturing accessions) was prepared (Table 3) to facilitate more detailed study at the molecular level. Since some early flowering accessions from earlier reports were deliberately included in the present study, a comparison with those studies could be drawn. Early flowering accessions reported by Dikshit and Sivaraj (2015) (IC0345409 and IC0345397) and Kaur et al. (2018) (IC0096539, IC0096496) were also found among the early flowering accessions in our study (Table 4). However, days to 50% flowering and maturity for the same accessions differed between those reports (Dikshit and Sivaraj 2015; Kaur et al. 2018) and our study. This is expected, as regions with different temperatures show slightly different flowering times; colder regions show delayed flowering and maturity compared with hotter regions. It has been shown earlier that flowering as well as growth and development in linseed are significantly influenced by photoperiod and temperature (Plessers 1961; Davidson and Yermanos 1965; Sirohi and Wasnik 2018).

A few accessions (EC0041735, EC0041753, IC0096539 and IC0096523) showed flowering initiation at New Delhi (28° 38′ 23′′ N) under a short-day photoperiod, suggesting different degrees of photo-insensitivity among linseed germplasm accessions. Differential responses to photoperiod among linseed varieties have also been reported earlier (Sirohi and Wasnik 2018; Sun et al. 2019).

SSR based diversity shows clustering similar to flowering-maturity trait groups

The 29 SSRs amplified in the panel of 34 accessions yielded a total of 89 alleles, with an average of 3.069 alleles per locus. This was lower than the values observed by Chandrawati et al. (2017) (6.74) and Wiesner et al. (2001) (3.7), but slightly higher than those in other studies: Cloutier et al. (2009) (2.3), Soto-Cerda et al. (2011) (2.3) and Cloutier et al. (2012) (2.4 for EST based SSR and 2.8 for BAC-end sequences based SSR). Soto-Cerda et al. (2011) reported a mean number of alleles per SSR similar to that observed in the present study. The mean PIC value of the SSRs in the present study (0.49) was slightly higher than those of Chandrawati et al. (2017) (0.47), Bickel et al. (2011) (0.47), Rachinskaia et al. (2011) (0.42), Bickel et al. (2011) (3.3) and Choudhary et al. (2017) (0.32), but lower than those of Wiesner et al. (2001) (0.6) and Deng et al. (2010) (0.6). Such differences in polymorphism among studies are expected, as they could arise from the different SSR loci used for screening, the nature of the genetic material and the number of genotypes studied. It is important to note that clustering using the 29 SSR markers could distinguish accessions according to trait group (Fig. 4). In congruence with the UPGMA based dendrogram, the model based population structure showed two populations nearly matching the trait groupings (Fig. 6). Further, AMOVA showed that 16% of the genetic variation lay between the two populations, which is slightly less than that reported by Chandrawati et al. (2017) for four subpopulations of 168 linseed accessions. Population specific allele patterns were observed for the two populations (Table 4). For Population-II, a higher average number of effective alleles (Ne), Shannon’s information index (I) and expected heterozygosity (He) were observed, suggesting higher diversity.
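For readers who wish to see how such per-locus diversity statistics relate to allele frequencies, the following is a minimal sketch in Python. It assumes the conventional definitions (Ne as the reciprocal of the sum of squared allele frequencies, Shannon's index with natural logarithms, He as one minus the sum of squared frequencies without small-sample correction, and the commonly used PIC formula); the software and exact estimators used in the present study may differ. The function name locus_stats and the example frequencies are hypothetical and are not data from this study.

```python
import math


def locus_stats(counts):
    """Diversity statistics for one locus from allele counts or frequencies.

    Assumed (conventional) definitions, not necessarily those of the study:
      Ne  = 1 / sum(p_i^2)
      I   = -sum(p_i * ln p_i)
      He  = 1 - sum(p_i^2)            (no small-sample correction)
      PIC = He - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    total = float(sum(counts))
    # Normalise to frequencies and drop alleles that were never observed.
    p = [c / total for c in counts if c > 0]
    sum_sq = sum(x * x for x in p)
    he = 1.0 - sum_sq
    ne = 1.0 / sum_sq
    shannon = -sum(x * math.log(x) for x in p)
    cross = sum(
        2.0 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return {"Na": len(p), "Ne": ne, "I": shannon, "He": he, "PIC": he - cross}


# Mean alleles per locus from the totals given in the text: 89 alleles, 29 SSRs.
mean_alleles_per_locus = round(89 / 29, 3)  # 3.069

# Hypothetical locus with three alleles observed 17, 11 and 6 times.
example = locus_stats([17, 11, 6])
```

Averaging these per-locus values over all loci within each population would give population-level means of the kind compared above.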

The flowering-maturity trait specific panel of accessions and the SSR based genetic diversity between and within the populations would help linseed breeders in designing breeding programmes for early maturing varieties.

SNP haplotype of putative floral homeotic APETALA 2 (AP2) gene

The role of AP2 domain transcription factors in floral identity, transition and development is known in other plants (Yant et al. 2010; Bowman et al. 1989). AP2 transcription factors are also known to regulate flowering time as well as seed development in Arabidopsis (Jofuku et al. 1994, 2005; Ohto et al. 2009; Yant et al. 2010). In the present study, a putative floral homeotic APETALA 2 (AP2) gene was sequenced from two early and two late flowering-maturing linseed accessions to study allelic variation (Fig. 7). Of the 11 SNPs and one indel identified, 8 SNPs and the indel variant differed between early and late accessions while being identical within each category. These 8 SNPs and one indel variant could therefore distinguish early from late flowering-maturing accessions. This indicates a possible association of these SNPs with the early and late flowering-maturity trait; however, it needs to be verified on a large set of germplasm accessions to establish a more conclusive association. SNPs in AP2 genes and their association with flowering time traits have also been reported in other plants. In radish, SNPs associated with flowering related transcription factors, AP2/EREB among others, have been identified in early, middle and late flowering accessions (Kim et al. 2019). Similarly, a comprehensive study in maize using 8000 lines identified nearly 1000 flowering time associated SNPs distributed around 220 candidate genes. Interestingly, a total of 20 associated SNPs were found in the genic regions of candidates including RAP2.7, which encodes an AP2-like transcription factor (Li et al. 2016; Salvi et al. 2007).
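The logic of calling a variant "distinct between, but the same within" the two categories can be sketched as follows. This is only an illustration in Python under stated assumptions: the sequences are taken to be already aligned and of equal length, an indel is represented by the gap character "-", and the function names and the toy sequences are hypothetical rather than the AP2 sequences obtained in this study.

```python
def variable_sites(early, late):
    """0-based positions where at least two aligned sequences differ."""
    seqs = list(early) + list(late)
    return [
        pos
        for pos in range(len(seqs[0]))
        if len({s[pos] for s in seqs}) > 1
    ]


def diagnostic_sites(early, late):
    """Positions fixed within each group but different between groups.

    Returns (position, early_allele, late_allele) tuples, 0-based.
    """
    sites = []
    for pos in range(len(early[0])):
        early_alleles = {s[pos] for s in early}
        late_alleles = {s[pos] for s in late}
        # Diagnostic: one allele per group, and the two groups disagree.
        if (
            len(early_alleles) == 1
            and len(late_alleles) == 1
            and early_alleles != late_alleles
        ):
            sites.append((pos, next(iter(early_alleles)), next(iter(late_alleles))))
    return sites


# Hypothetical aligned fragments: two early and two late accessions.
early_seqs = ["ACGTA-", "ACGTA-"]
late_seqs = ["ATGTCG", "ATGCCG"]

all_variants = variable_sites(early_seqs, late_seqs)    # [1, 3, 4, 5]
group_specific = diagnostic_sites(early_seqs, late_seqs)
# [(1, 'C', 'T'), (4, 'A', 'C'), (5, '-', 'G')]
```

In this toy alignment, four sites vary but only three separate the groups, mirroring the distinction drawn above between all identified variants and those that discriminate early from late accessions. With only two accessions per group, such sites are candidates rather than established markers.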

Conclusion

The study revealed high genetic variability for important agro-morphological traits that contribute to high seed yield in linseed. Several trait specific accessions for economically important traits have been identified; these are expected to harbor hitherto unknown alleles for the traits in question and have the potential to contribute to linseed genetic improvement programmes. The SSR based genetic diversity between and within the populations in the flowering-maturity panel of accessions provides vital information for designing breeding programmes for early maturing linseed varieties. The SNPs identified in the putative floral homeotic AP2 gene, which could distinguish the two early from the two late flowering-maturing accessions, indicate a possible association with the trait; however, this needs further verification on a large set of accessions. The genetic diversity pattern identified in the present study using morphological traits as well as SSR markers, together with the genetic and genomic resources identified, is expected to aid trait specific breeding programmes, especially for flowering and maturity traits in linseed.