Introduction

Seed oils, mainly composed of triacylglycerols (TAGs), are used for both nutritional and industrial purposes. The quality of the seed oil is determined by the constituent fatty acids (FAs) found in the TAGs. The different FA types and their relative proportions in TAGs vary widely both inter- and intra-specifically (Eckey 1954; Hilditch and Williams 1964; Gurr 1980; Harwood 1980; Welch et al. 2000; Marsalkiene et al. 2009). Further, seed oil composition appears to have been under strong selection on a latitudinal cline at both the macro- and micro-evolutionary scales (Linder 2000). The composition of seed oils influences their melting points and differs according to temperature regimens. At lower temperature or higher latitude, the melting point is lower because of greater proportions of unsaturated FAs (Linder 2000). Thus, identifying genomic regions affected by local adaptation and characterizing the traits and genes regulating seed oil composition under different environmental conditions will be useful for breeding oilseed crops and understanding seed oil evolution.

In recent years, our understanding of seed TAG biosynthetic pathways has progressed rapidly (Browse and Somerville 1991; Ohlrogge and Browse 1995; Harwood 1996; Broun et al. 1999; Lung and Weselake 2006; Cahoon et al. 1997; Barker et al. 2007). However, the understanding of the genetics underlying the regulation of FA and TAG synthesis is still limited (Ohlrogge and Jaworski 1997; Sharma et al. 2008; Baud and Lepiniec 2009). The proportions of some FAs (i.e., the saturated FAs in seed oils) are nearly constant under a wide variety of environmental conditions and are thus thought to be under strict genetic control (Hilditch 1956; Canvin 1965; Harris et al. 1978; Djemel et al. 2005).

Previous studies have shown considerable variation in the seed oil composition of Arabidopsis (Millar and Kunst 1999; O’Neill et al. 2003) and suggested that populations derived from selected crosses between parental lines differing in oil composition will be useful for investigating these traits. Hobbs et al. (2004) identified QTL responsible for regulating the seed oil composition of A. thaliana in the Ler-2 × Cvi recombinant inbred line (RIL) population. However, since QTL are often population-specific (Lynch and Walsh 1998), an analysis based on a single population is likely to describe only a fraction of the “global genetic architecture” of complex traits (Symonds et al. 2005). To more fully elucidate the genetic variation responsible for quantitative trait variation, many experimental segregating populations, derived from a diverse array of accessions, are needed (Tonsor et al. 2005).

The aim of this study is to map QTL for FA traits in four RIL populations (Ler-0 × Sha, Ler-0 × Col-4, Ler-2 × Cvi, Ler-0 × No-0) for capturing the range of genetic variation in seed oil composition. QTL for FA traits, which influence the seed oil composition and melting point (MP), can be very useful in understanding adaptation to temperature and for selecting suitable genotypes for breeding better quality oil in the Brassica and other crop species. A further objective is to identify promising candidate genes (CGs) in seed oil biosynthesis giving additional insight into the aforementioned purposes of breeding and seed oil evolution.

Materials and methods

Mapping populations

Four recombinant inbred line (RIL) mapping populations (Table 1) were used: Ler-0 × Sha (El-Lithy et al. 2004), Ler-0 × Col-4 (Lister and Dean 1993), Ler-2 × Cvi (Alonso-Blanco et al. 1998), and Ler-0 × No-0 (Magliano et al. 2005). All were obtained from the Arabidopsis Biological Resource Center (http://www.arabidopsis.org/abrc/). All four populations were previously genotyped in the F8 generation or later by their creators. Genotyping details for each population can be found in their original descriptions and at the Arabidopsis Information Service (http://www.arabidopsis.org) and the Nottingham Arabidopsis Stock Center (http://nasc.nott.ac.uk).

Table 1 Four RIL populations used for QTL mapping

Determination of fatty acid composition of triacylglycerols

Triacylglycerols (TAGs) were extracted twice from 15–20 seeds of each RIL line (unreplicated) and all of the parental lines (two replicates) for all four RIL populations. The unreplicated design was chosen to optimize the power of QTL detection given the number of phenotypical observations, which are very labor intensive (Capelle et al. 2010). All the extracted TAGs from the RIL and parental lines were converted to fatty acid methyl esters (FAMEs) (Metcalfe and Wang 1981). The FAMEs were resolved on a Hewlett-Packard 5890A gas chromatograph using a 30 m 70% Cyanopropyl polysilphenylene-siloxane capillary column (BPx70, SGE, Inc.) and detected by flame ionization. For all runs, an initial oven temperature of 180°C was maintained for 5.5 min and then ramped to 240°C at a rate of 7.5°C/min. Ramping was followed by a final time of 0.50 min at 240°C. Injection volume was 2 μl with a 1:100 split ratio. The results for each run were compiled and analyzed by HP Chemstation software (version A.04.02) with the proportion of each fatty acid estimated by the area under the curve for each peak as a proportion of the total area under the peaks. We measured nine of the eleven fatty acids found in A. thaliana seeds (16:0, 18:0, 18:1, 18:2, 18:3, 20:0, 20:1, 20:2 and 22:1). The other two fatty acids (20:3 and 22:0) occurred in minor amounts (each <1% of the total) that could not always be accurately measured. The chromatography peaks were identified using three size standards: RM-6, NHI-F and 189-19 (Supelco).
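The two calculations described above, the relative proportion of each FA from its peak area and the total length of the oven program, can be sketched as follows. This is a minimal illustration only: the function names and the peak areas are hypothetical, and only the oven settings come from the text.

```python
def relative_proportions(peak_areas):
    """Express each FA peak area as a share of the total area under all peaks."""
    total = sum(peak_areas.values())
    return {fa: area / total for fa, area in peak_areas.items()}


def oven_run_minutes(start_c, hold_min, end_c, ramp_c_per_min, final_hold_min):
    """Total run time: initial hold + linear temperature ramp + final hold."""
    return hold_min + (end_c - start_c) / ramp_c_per_min + final_hold_min


# Hypothetical peak areas (arbitrary detector units), for illustration only.
areas = {"16:0": 80.0, "18:1": 150.0, "18:2": 290.0, "18:3": 180.0, "20:1": 300.0}
props = relative_proportions(areas)

# Oven program from the text: 180°C for 5.5 min, 7.5°C/min to 240°C, 0.5 min hold.
run_time = oven_run_minutes(180, 5.5, 240, 7.5, 0.5)
```

With the stated program, the ramp from 180 to 240°C takes 8 min, so each run lasts 14 min in total.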

Statistical analyses

QTL analysis

The relative proportions of individual FAs, total saturated FAs (16:0, 18:0 and 20:0), the melting point of TAGs and FAs synthesized in plastids (≤18:1) were used as traits for QTL mapping (Windows QTL Cartographer v2.5, Wang et al. 2007; R/qtl, Broman et al. 2003). Prior to the QTL analysis, phenotypic data for seed oil composition for the four RIL populations were analyzed with Systat 12, and the variation within each trait was assessed using a one-way analysis of variance to compare the means of the four populations. Interval mapping (IM) based on the LOD-score test (Lander and Botstein 1989) and composite interval mapping (CIM) with model 6 activated in the Zmapqtl module were performed. Default parameters (5 co-factors and a window size of 10) were used for QTL detection. Statistically significant QTL (p = 0.05) were determined using permutation-derived empirical thresholds based upon 1,000 permutations for each trait mapped (Churchill and Doerge 1994). The estimated additive effect and the percentage of variance explained by each major effect QTL were obtained from the software using the CIM model.
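The idea behind a permutation-derived threshold can be illustrated with a much simpler scan than the one used here. The sketch below assumes a single-marker test (not CIM as implemented in QTL Cartographer), simulated genotypes coded 0/1, and a simulated trait; all names and data are hypothetical. Phenotypes are shuffled relative to genotypes, the genome-wide maximum LOD is recorded for each shuffle, and the 95th percentile of those maxima serves as the threshold.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD score for a two-class marker: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            group_mean = sum(group) / len(group)
            rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_null == 0.0:
        return 0.0
    if rss_marker == 0.0:
        return float("inf")
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD under shuffling."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max(marker_lod(m, shuffled) for m in markers))
    maxima.sort()
    return maxima[math.ceil((1 - alpha) * n_perm) - 1]


# Simulated data (illustration only): 100 lines, 20 markers, true effect at marker 3.
rng = random.Random(1)
n_lines, n_markers = 100, 20
markers = [[rng.randint(0, 1) for _ in range(n_lines)] for _ in range(n_markers)]
phenotypes = [rng.gauss(0.0, 1.0) + 2.0 * markers[3][i] for i in range(n_lines)]

threshold = permutation_threshold(markers, phenotypes, n_perm=200)
observed = marker_lod(markers[3], phenotypes)
```

The example uses 200 permutations to keep it fast; the analysis in the text used 1,000 per trait.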

The seed oil composition traits analyzed were the nine individual FAs, the total saturated FAs, the FAs synthesized in plastids and the melting point of the triacylglycerols. QTL analysis of fatty acids synthesized in plastids was done to examine maternal or plastid-specific effects. QTL analysis of the total saturated fats and the estimated melting point of the triacylglycerols was done because evidence suggests these traits may be under selection (Linder 2000). The melting point of each RIL’s triacylglycerols was estimated using the relative proportion of each constituent FA to weight the melting point of the FA. Therefore, the estimated melting point of the triacylglycerols of each RIL was the proportion-weighted average of the melting points of the constituent fatty acids (Malkin 1954). This estimate of the TAG melting point may not precisely match the empirically determined melting point (Stumpf 1980), but this does not influence the relative ordering and values of the melting points. Thus, our estimated melting points of the TAGs in a RIL are useful even if they deviate slightly from the absolute melting point.
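The melting point estimate described above is a proportion-weighted average, which can be sketched as follows. The FA melting points and the example composition below are rounded placeholder values chosen for illustration; they are assumptions, not the values used in this study.

```python
# Placeholder FA melting points in °C (assumed, rounded, for illustration only).
FA_MELTING_POINT_C = {
    "16:0": 63.0,
    "18:0": 70.0,
    "18:1": 13.0,
    "18:2": -5.0,
    "18:3": -11.0,
}


def estimated_tag_melting_point(proportions, melting_points):
    """Average of FA melting points, weighted by each FA's relative proportion."""
    total = sum(proportions.values())
    return sum(p * melting_points[fa] for fa, p in proportions.items()) / total


# Hypothetical composition of one RIL (proportions sum to 1).
ril = {"16:0": 0.10, "18:0": 0.05, "18:1": 0.25, "18:2": 0.40, "18:3": 0.20}
mp = estimated_tag_melting_point(ril, FA_MELTING_POINT_C)

# Shifting 5% of the oil from 16:0 to 18:3 lowers the estimate.
more_unsaturated = {"16:0": 0.05, "18:0": 0.05, "18:1": 0.25, "18:2": 0.40, "18:3": 0.25}
mp_unsat = estimated_tag_melting_point(more_unsaturated, FA_MELTING_POINT_C)
```

Because the same weights are applied to every line, errors in the absolute FA melting points shift all estimates together, which is why the ordering of lines is the quantity of interest.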

Epistasis

The four RIL populations were also analyzed for epistatic interactions between QTL by building a stepwise QTL model using R/qtl (Broman et al. 2003). ANOVAs and t tests were performed using Systat 12 (Systat 12 2007) to determine the effects of the genotypes for each trait in the four RIL populations. One-way ANOVAs were performed in which, for each FA, the relative proportion of the FA was the dependent variable and the genotypes were the independent factors.

Cytoplasmic inheritance in the Ler-2 × Cvi RIL population

The Ler-2 × Cvi RIL population was reciprocally crossed, so we tested for cytoplasmic interactions for all major QTL identified in the Ler-2 × Cvi RIL population, using the model Y = Q1 + ··· + Qn + cytoplasm + Q1 × cytoplasm + ··· + Qn × cytoplasm, with Y as the trait, Q1 to Qn as the identified QTL and cytoplasm as the covariate. The tests were performed in R/qtl (Broman et al. 2003). We also tested for nuclear-cytoplasmic interactions for all QTL identified in the Ler-2 × Cvi RIL population by performing two-way ANOVAs in Systat 12 where, for each test, an individual FA was the dependent variable, and the cytoplasm and markers were the independent factors.
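To make the structure of this model concrete, the sketch below builds the predictor row for one line: an intercept, one main-effect term per QTL, the cytoplasm covariate, and one QTL-by-cytoplasm interaction term per QTL. The 0/1 coding of genotypes and cytoplasm and the function names are assumptions made for illustration; the actual fitting was done in R/qtl.

```python
def design_row(qtl_genotypes, cytoplasm):
    """Return [intercept, Q1..Qn, cytoplasm, Q1*cytoplasm..Qn*cytoplasm] for one line."""
    main_effects = list(qtl_genotypes)
    interactions = [q * cytoplasm for q in qtl_genotypes]
    return [1] + main_effects + [cytoplasm] + interactions


def column_names(n_qtl):
    """Labels matching the order of the terms produced by design_row."""
    qtl = [f"Q{i}" for i in range(1, n_qtl + 1)]
    return ["intercept"] + qtl + ["cytoplasm"] + [f"{q}*cytoplasm" for q in qtl]


# Hypothetical line: allele coded 1 at QTL 1 and 3, 0 at QTL 2; cytoplasm coded 1.
row = design_row([1, 0, 1], cytoplasm=1)
names = column_names(3)
```

A nuclear-cytoplasmic interaction corresponds to a non-zero coefficient on one of the product terms, that is, a QTL effect that differs between the two cytoplasms.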

Genetic correlations among traits

We estimated genetic correlations among traits for each RIL population using the Pearson product-moment pairwise correlations among the 12 traits within each of the four RIL populations (Systat 12) to test whether there were significant correlations between traits (Tables S1–S4). P values were adjusted for multiple-hypothesis testing in R (version 2.9.0 alpha) using the q value false discovery rate (FDR) plug-in (Storey and Tibshirani 2003; Storey et al. 2004) and the Benjamini and Hochberg method (BH-FDR) (Benjamini and Hochberg 1995) in the built-in function p.adjust. Default parameters were used for the q value module except that the bootstrap method was used wherever possible.
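For readers unfamiliar with the BH-FDR adjustment, the following is a minimal sketch of the Benjamini and Hochberg step-up procedure, written to give the same kind of adjusted p values as the BH method of R's p.adjust. The function name and the example p values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    # Visit p values from largest to smallest so a running minimum can be kept.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank of this p value in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values from four pairwise correlation tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
```

Each raw p value is scaled by m/rank, and the running minimum ensures that the adjusted values never decrease as the raw p values increase.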

Melting point indices of fatty acids

To assess how seed oil melting point (MP) might be affected by genetic correlations between fatty acid QTL, we developed an index for comparing the relative effects of significant genetic correlations among FAs on the melting points of TAGs (Sanyal and Linder, manuscript submitted). For a pair of significantly correlated FAs, the difference between each FA’s melting point (MP_FA1, MP_FA2) and the average melting point of the TAGs (MP_TAG) was calculated. The product of these differences was then multiplied by the FAs’ genetic correlation (corr_FA1,FA2) and the two FAs’ average proportions (p_FA1, p_FA2) in a set of RILs. The differences between the FAs’ melting points and the melting point of the TAG account for the degree to which a change in one FA would affect the TAG melting point through a change in the correlated FA; the correlation coefficient determines the degree to which selection on one FA would affect the other FA; and the relative proportions of the two FAs account for the relative weight of each FA in determining the melting point of the TAG prior to selection. The index is given by:

$$ \left( \text{MP}_{\text{FA1}} - \text{MP}_{\text{TAG}} \right) \times \left( \text{MP}_{\text{FA2}} - \text{MP}_{\text{TAG}} \right) \times \text{corr}_{\text{FA1},\text{FA2}} \times p_{\text{FA1}} \times p_{\text{FA2}} $$

For each RIL set, the indices for the pairs of FAs were standardized by dividing each index by the maximum index value of the set of indices calculated. Therefore, the largest scaled index has a value of 1 with all other indices being between −1 and 1.
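The index and its standardization can be sketched as below. All numerical inputs are hypothetical. One assumption should be noted: the sketch divides by the largest absolute index value, which guarantees the stated range of −1 to 1; this coincides with dividing by the maximum index whenever the largest-magnitude index is positive.

```python
def mp_index(mp_fa1, mp_fa2, mp_tag, corr, p_fa1, p_fa2):
    """(MP_FA1 - MP_TAG) * (MP_FA2 - MP_TAG) * corr * p_FA1 * p_FA2."""
    return (mp_fa1 - mp_tag) * (mp_fa2 - mp_tag) * corr * p_fa1 * p_fa2


def standardize(indices):
    """Scale indices by the largest absolute value (assumption; see lead-in)."""
    largest = max(abs(v) for v in indices.values())
    return {pair: v / largest for pair, v in indices.items()}


# Hypothetical inputs: melting points in °C, correlations, mean proportions.
MP_TAG = 10.0
raw_indices = {
    ("16:0", "18:0"): mp_index(63.0, 70.0, MP_TAG, 0.5, 0.10, 0.05),
    ("18:1", "18:2"): mp_index(13.0, -5.0, MP_TAG, -0.6, 0.25, 0.40),
    ("16:0", "18:2"): mp_index(63.0, -5.0, MP_TAG, -0.4, 0.10, 0.40),
}
scaled = standardize(raw_indices)
```

In this example, a negative correlation between a high-melting and a low-melting FA yields a positive index, as does a positive correlation between two high-melting FAs.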

Candidate gene and QTL co-location analysis

To determine whether CGs co-localized with QTL more often than expected by chance, the QTL confidence intervals (CIs) were randomized across the linkage maps 1,000 times, and the distribution of the proportion of QTL co-locating with CGs was determined for each RIL population. The number of QTL placed matched the number of QTL found for each RIL population, and the number of possible locations matched the number of markers used for each RIL set. A p value was then calculated as the proportion of randomizations in which QTL co-located with CGs at a rate equal to or greater than the observed rate. If the proportion of randomized QTL co-locating with CGs was lower than the observed proportion in at least 95% of the randomizations, the co-location was deemed significant (Pelgas et al. 2011).
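The randomization test described above can be sketched as follows. This is a simplified illustration under stated assumptions: a single linkage group, each QTL CI re-centred on a randomly chosen marker while keeping its width, and hypothetical marker, gene and QTL positions. It is not the exact procedure or code used in the study.

```python
import random


def colocation_fraction(intervals, gene_positions):
    """Fraction of QTL intervals that contain at least one candidate gene."""
    hits = sum(any(lo <= g <= hi for g in gene_positions) for lo, hi in intervals)
    return hits / len(intervals)


def colocation_p_value(qtl_intervals, marker_positions, gene_positions,
                       n_iter=1000, seed=0):
    """Share of randomizations with co-location at least as high as observed."""
    rng = random.Random(seed)
    observed = colocation_fraction(qtl_intervals, gene_positions)
    half_widths = [(hi - lo) / 2 for lo, hi in qtl_intervals]
    at_least_observed = 0
    for _ in range(n_iter):
        placed = []
        for hw in half_widths:
            centre = rng.choice(marker_positions)  # one possible location per marker
            placed.append((centre - hw, centre + hw))
        if colocation_fraction(placed, gene_positions) >= observed:
            at_least_observed += 1
    return at_least_observed / n_iter


# Hypothetical map (cM): markers every 5 cM, two candidate genes, three QTL CIs.
marker_positions = [float(x) for x in range(0, 101, 5)]
gene_positions = [12.0, 80.0]
qtl_intervals = [(10.0, 14.0), (78.0, 82.0), (11.0, 13.0)]
p_value = colocation_p_value(qtl_intervals, marker_positions, gene_positions)
```

Here all three hypothetical QTL contain a candidate gene, and randomly placed intervals rarely do as well, so the resulting p value is small.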

Map alignments

The total length of the linkage maps varied considerably among the four populations, ranging from 386.4 to 490 cM (Table 1). Hence, the linkage positions of markers in each population were aligned with their corresponding physical positions on the Col physical map (Arabidopsis Genome Initiative 2000). QTL physical positions were determined from the known flanking marker positions. Further, the physical positions of the QTL were estimated by using a least-squares regression approach based on the coarse linear relationship between the physical and linkage map position of each marker. The QTL in the Ler-0 × Col-4 population were mapped directly on the physical map. QTL mapped in different populations were considered to be the same locus when the physical positions of markers within their 2-LOD support intervals overlapped (Symonds et al. 2005). In addition, the co-location of the QTL with CGs further confirmed the overlap of QTL in the Ler-0 × Sha and Ler-0 × Col-4 populations.
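The regression step and the overlap criterion can be sketched as follows. The marker positions are invented and perfectly linear for clarity, and the function names are hypothetical; real maps show only a coarse linear relationship, as noted above.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intervals_overlap(a, b):
    """True when two (start, end) physical intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical markers on one chromosome: linkage position (cM) and physical position (Mb).
marker_cm = [0.0, 10.0, 20.0, 40.0]
marker_mb = [0.5, 3.0, 5.5, 10.5]
slope, intercept = fit_line(marker_cm, marker_mb)

# Estimated physical position of a QTL peak at 30 cM.
qtl_mb = slope * 30.0 + intercept

# Support intervals (Mb) of QTL from two populations, compared on the physical map.
same_locus = intervals_overlap((4.0, 7.5), (6.8, 9.0))
```

Placing all populations on one physical scale is what allows support intervals from maps of different total length to be compared directly.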

Results

All 12 FA traits in each of the four RIL populations showed continuous phenotypic variation (Table 3; Figs. S1–S4) and transgressive segregation, with some RILs exhibiting higher values than the high-value parent and others lower values than the low-value parent. This suggests that multiple genes act on the traits and that selection could act on genetically diverse populations to rapidly change seed oil composition to suit different conditions. ANOVA showed significant variation among RILs in the four RIL populations for the relative proportions of all 12 FA traits except 20:2 (p ≤ 0.005), indicating that a strong genetic component underlies the FA traits.

The Ler-0 × Sha population had the largest number of QTL, 36 for all 12 FA traits, scattered across all five chromosomes. The Ler-0 × Col-4 population had 23 QTL mapped on all five chromosomes for all traits except 20:0 and 20:1. The Ler-2 × Cvi population had seven QTL mapped on all chromosomes except chromosome 3, and the Ler-0 × No-0 population had six QTL mapped on all five chromosomes. All four populations had one or more QTL co-locating with other QTL (Tables 4, 5, 6, 7; Figs. 2, 3, 4, 5).

Mapping of QTL for fatty acid composition

A total of 72 QTL were mapped, with individual effects that explained 3.2 to 79.8% of within-population variation. The QTL for 16:0, 18:1, 18:2, 22:1, and FAs present in plastids were observed in multiple populations (Tables 4, 5, 6, 7). Models including gene-by-gene interactions showed epistasis only for 20:1 and 20:2 in the Ler-0 × Sha population and for 18:1 in the Ler-0 × Col-4 population.

In the Ler-0 × Sha population, three QTL hot spots for FA traits were identified where seven to nine QTL co-located on chromosomes 1, 3, and 4 (Table 4; Fig. 2). Interestingly, all nine QTL on chromosome 4 were in the same location. Overlapping QTL were also observed, to a lesser extent, on chromosomes 2 and 5. Every QTL co-located with at least one other QTL. Several of these QTL have their highest LOD peak at the same location, suggesting that the same QTL is influencing several traits (Table 4; Fig. 2). Importantly, the four MP QTL co-locate with one or more FA QTL whose correlations could affect the melting point (Table S1). The largest QTL for 20:1, which explained 79.8% of the phenotypic variance, was identified on chromosome 4. The one to four QTL identified for each of the 12 FA traits across all chromosomes together explained 29.1–94.6% of the phenotypic variance; the four QTL for 20:1 explained 94.6%. QTL effects were unidirectional for 18:0, 18:2, MP and saturated FA traits: the ‘Ler’ allele was the high allele for all of them except 18:2, for which it was the low allele. For the other eight FA traits, QTL effects were in opposite directions, with the ‘Ler’ allele being the high allele at some loci and the low allele at others for the same trait, suggesting considerable genetic variation (Table 4).

In the Ler-0 × Col-4 population, the 23 QTL identified for 10 FA traits individually explained 7–41.3% of the phenotypic variance (Table 5; Fig. 3). The one to four QTL identified for each of the FA traits together explained 7.9–58.6% of the phenotypic variance. The largest QTL, for 18:1 on chromosome 3, explained 41.3% of the phenotypic variance. In addition, QTL co-located on chromosomes 1, 2 and 3, and their LOD peaks were often in the same position, suggesting that the same QTL could be controlling multiple traits. The QTL effects were unidirectional for seven of the FA traits: the ‘Ler’ allele was the high allele for 16:0, 18:2, 20:2 and saturated FAs, and the low allele for 18:1, 18:3 and 22:1. The QTL effects were in opposite directions at different loci for 18:0, MP, and FAs present in plastids, suggesting considerable genetic variation (Table 5). The MP QTL co-locates with a QTL for 22:1 on chromosome 1, which may not affect the MP, as the correlation is not significant (Table S2).

Seven QTL for five FA traits were identified in the Ler-2 × Cvi population, which explained 13.7–47.8% of the phenotypic variance. The three QTL for 16:0 explained 47.8% of the phenotypic variance for that trait. The QTL mapped on chromosomes 1 and 5 co-located with each other (Table 6).

The six QTL identified in the Ler-0 × No-0 population explained 13.9–23.4% of the phenotypic variance. Two QTL for 16:0 explained 37.3% of the phenotypic variance. Furthermore, the QTL for 16:0 co-located with a QTL for saturated FAs on chromosome 3 (Table 7).

Co-location of QTL and candidate genes

In the Ler-0 × Sha population, 22 of the 36 QTL co-located with one or more CGs, and 9 of the 15 a priori identified CGs co-located with at least one QTL. Remarkably, all the QTL on chromosomes 3 and 4 co-located with CGs of relevant function. Two QTL each on chromosomes 2 and 5 also co-located with CGs. None of the seven co-locating QTL on chromosome 1 co-located with a CG. The QTL for 16:0, 18:1, 18:2, 18:3, 20:0, 20:1, 20:2, and 22:1 on chromosomes 3 and 4 co-located with the FAE1.1 (AT3G10280), FAE1.2 (AT3G52160), FAE1.3 (AT4G34250), FAE1.4 (AT4G34520) and FAD2 (AT2G29980) genes (Table 4). Seven of these FAs, all except 16:0 on chromosome 3, are either reactants or products of the four FAE genes (Table 2; Fig. 1). The FAE genes elongate 18:1 and 18:2 to 20:1 and 20:2, respectively, 20:1 to 22:1, and also 20:0 to 22:0. FAD2 also converts 18:1 to 18:2, which are precursors for 20:1 and 20:2, and thus can affect the proportions of those four FAs. The co-location of SAD1 (AT5G16240) and SAD2 (AT5G16230) with a QTL for 20:0 on chromosome 5 is relevant, as 18:0 is a substrate for both enzymes, which could therefore affect the proportion of 20:0, which is formed from 18:0. The relevance of the co-location of DAGAT (AT2G19450) with a QTL for 20:0 on chromosome 2 is not clear, although a preference of DAGAT for certain FAs over others could affect the proportion of 20:0 (Table 2). Furthermore, on chromosomes 2, 3 and 4, the CGs were located within 2 cM of the QTL peak. On chromosome 2, DAGAT was within 0.2 cM of the QTL peak for 20:0, while on chromosome 3, FAD2 and FAE1.1 were within 1–2 cM of the QTL peaks for 18:1, 18:2 and 20:2, and FAE1.2 was at the QTL peak for 18:3. Similarly, FAE1.3 and FAE1.4 were within 0.4 cM of the QTL peak for 20:2 (Table 4).

In the Ler-0 × Col-4 population, 11 of the 23 QTL co-located with CGs on all five chromosomes. The co-location of a QTL for 16:0 with KASII on chromosome 1 is relevant, as KASII regulates the proportions of 16:0 and 18:0. Similarly, the co-location of FAD3 with QTL for 18:2 and 20:2 on chromosome 2 is consistent with the desaturase activity of FAD3, which converts 18:2 to 18:3; because 18:2 is also elongated to 20:2, the proportion of 20:2 could also be affected. As in the Ler-0 × Sha population, FAE1.1, FAE1.3, FAE1.4 and FAD2 co-located with QTL for 18:1, 18:2 and 22:1 on chromosomes 3 and 4, and SAD1 and SAD2 co-located with a QTL for 18:0 on chromosome 5. In all these cases, the co-locations are functionally relevant (Tables 2, 5). The co-location of a QTL for 22:1 with FATB (AT1G08510) cannot be explained.

None of the QTL in the Ler-2 × Cvi and Ler-0 × No-0 populations co-located with CGs. However, wherever co-location with CGs was not observed, other genes with putative functions related to lipid synthesis were present (Tables 6, 7).

Table 2 Candidate fatty acid synthesis genes in A. thaliana and their functions
Fig. 1

Relevant FA and TAG synthesis pathways in A. thaliana. FAs are boxed with saturated fatty acids having rounded corners. The distinction between the plastids and the ER is to designate where the reactions occur. 16:0, 18:0 and 18:1 are all exported to the ER and are incorporated into TAGs. Some 18:2 and 18:3 are synthesized in the plastids but these are not exported to the ER

Genetic correlation between FA traits

Associations among FAs are subject to autocorrelation because the FAs are expressed as percentages of total identified FAs. Correlations were also greater when one FA was included in a composite trait; for example, r was 0.935 between 16:0 and saturated FAs, which comprise 16:0, 18:0 and 20:0. In the biosynthetic pathway, the 14C, 16C and 18C saturated FAs are converted to 18C, 20C, and 22C unsaturated FAs, so the two types of FAs would be expected to be negatively correlated. As expected, the results indicate significant positive correlations among saturated FAs (16:0, 18:0 and 20:0, Fig. 1; Tables S1–S4) and negative correlations between saturated and unsaturated FAs (e.g., between 18:0 and 18:2, Tables S1–S4). Negative correlations between unsaturated FAs were also observed, as expected, when one FA is the substrate and the other the product of an enzyme reaction in a metabolic pathway (e.g., 18:2 and 18:3, Fig. 1; Table S1). In some cases, the correlations are hard to interpret because they could depend on whether the gene involved is up-regulated or down-regulated. Further, in the Ler-0 × Sha population, the four MP QTL on chromosomes 1, 3, 4, and 5 co-locate with two to eight FA QTL for all nine individual FAs. Significant correlations between MP and all individual FAs except 22:1 were observed, indicating that these correlations with the individual FAs will affect the MP in the Ler-0 × Sha population only. In the Ler-0 × Col-4 population, only a QTL for 22:1 co-locates with the MP QTL, but the correlation was not significant and will not affect the MP.

QTL overlap between populations

Three QTL for 18:1, 18:2 and FAs synthesized in plastids on chromosome 3 and one QTL for 22:1 on chromosome 4 (Tables 4, 5; Figs. 2, 3) were identified in both the Ler-0 × Sha and Ler-0 × Col-4 populations. For all four QTL, the effects of the ‘Ler’ allele in the two populations were in opposite directions, suggesting considerable variation at these loci. The Sha and Col-4 alleles are quite different: the relative proportions of the FAs in the two ecotypes differ by 1.6 to 5.4% (Table 3). All three QTL on chromosome 3 co-located in both populations with two relevant CGs, FAE1.1 and FAD2, and the QTL on chromosome 4 co-located with two relevant CGs, FAE1.3 and FAE1.4, as noted above (Tables 4, 5). In addition, a QTL for 16:0 was detected on chromosome 3 in both the Ler-0 × Sha and Ler-0 × No-0 RIL populations. In this case, the Ler allele was the “high” allele in both populations, and the QTL did not co-locate with any CG.

Fig. 2

Major QTL for the 12 seed oil composition traits in the Ler-0 × Sha RIL population. C1 to C5 indicate chromosome numbers. On the left, only the markers that overlap with the 95% confidence interval (CI) of one or more QTL peaks are listed. The first line gives the name of the marker and the second line its position on the chromosome in cM. On the right side of the chromosome, the vertical lines indicate the 95% CI of the major QTL. The horizontal lines indicate the position of the maximum height of the QTL peak, with the associated trait shown on the right. Horizontal and vertical bar patterns match each other except when multiple peaks share the same peak position

Fig. 3

Major QTL for the 12 seed oil composition traits in the Ler-0 × Col-4 RIL population. To interpret this figure see the legend for Fig. 2

Table 3 Fatty acid composition of seed oils of the four RIL populations
Table 4 Quantitative trait loci detected for the 12 seed oil composition traits in the Ler-0 × Sha RIL population
Table 5 Quantitative trait loci detected for 12 seed oil composition traits in the Ler-0 × Col-4 RIL population

Cytoplasmic effects and nuclear-cytoplasmic interactions

In the Ler-2 × Cvi population, we found a significant cytoplasmic effect and a nuclear-cytoplasmic interaction for only one QTL, LCv4.16b, for 16:0 on chromosome 4 (Table 6). A cytoplasmic effect was detected for the Ler cytoplasm (p = 0.027), which is interesting because 16:0 is synthesized in the plastid. The nuclear-cytoplasmic interaction (p = 0.007) might arise because 16:0 is synthesized in the plastid but the enzymes for its synthesis are nuclear encoded. However, it is not clear why other plastid-synthesized FAs (18:0, 18:1) do not show nuclear-cytoplasmic interactions.

Table 6 Quantitative trait loci detected for the 12 seed oil composition traits in the Ler-2 × Cvi RIL population

Epistasis

Three significant epistatic interactions were detected in the Ler-0 × Sha and Ler-0 × Col-4 populations, suggesting that epistasis is not generally common in TAG synthesis. In Ler-0 × Sha, epistatic interactions were observed between two QTL for 20:1 and between two QTL for 20:2 on chromosomes 3 and 4 (Table 4). The ‘Ler’ allele was the high allele for both QTL for 20:1, suggesting that the same regulatory element could be controlling both. For 20:2, the ‘Ler’ allele was the low allele for the QTL on chromosome 3 and the high allele for the QTL on chromosome 4, suggesting that there could be some regulation of the total amount of the FA. In the Ler-0 × Col-4 population, an epistatic interaction was observed between two QTL for 18:1 on chromosomes 1 and 5 (Table 5). The ‘Ler’ allele was the low allele at both QTL.

Discussion

The mapping of 72 QTL for FA composition in seed TAGs in the four RIL populations identified and characterized several new genomic regions that contribute to regulating TAG FA proportions in A. thaliana. The most remarkable was the co-location of nine FA QTL at the bottom of chromosome 4 between 38 and 71.8 cM and the co-location of seven individual FA QTL at the top of chromosome 3 between 0 and 22.8 cM in the Ler-0 × Sha RIL population.

Comparison with previous work

Prior work (Hobbs et al. 2004) identified seven QTL in the Ler-2 × Cvi RIL population. Interestingly, most of the seven QTL detected in our study for the same RIL population differed from those of Hobbs et al. (2004). Only two of the seven QTL, one for 18:0 on chromosome 1 and another for 18:3 on chromosome 2, correspond to the same locations in both studies, although with small effect in this study. Some of the differences may simply be due to the strength of the QTL peaks in the two studies. However, since Hobbs et al. did not report QTL of small effect, it is unknown whether any of our QTL with major effect co-locate with QTL of small effect in their study.

The differences in QTL presence and absence in the two studies, and the differences in whether QTL were of major or minor effect, could have several causes. Genotype × environment interactions (G × E) could have created different phenotypes due to differences in both the growing environment and the genotypes used in the two studies (Ungerer and Rieseberg 2003). Only nine lines were common to both studies. The differences in QTL detected might also be due, in part, to cytoplasmic effects, because all 80 lines used in our study had the Ler cytoplasm, whereas in Hobbs et al. (2004), only 25 of the 50 lines had the Ler cytoplasm.

Shared QTL between populations

The map positions of three QTL for 18:1, 18:2, and FAs present in plastids on chromosome 3 and one QTL for 22:1 on chromosome 4 showed strong overlap in the two genomic regions in the Ler-0 × Sha and Ler-0 × Col-4 populations. In addition, the co-location of the same relevant CGs with the QTL in both populations (FAE1.1 and FAD2 on chromosome 3 and FAE1.3 and FAE1.4 on chromosome 4, Tables 4, 5) is significant. Two QTL for 16:0 on chromosome 3 in the Ler-0 × Sha and Ler-0 × No-0 populations overlap but do not co-locate with CGs (Tables 4, 7). However, several other FA lipid synthesis genes and transcription factors also co-locate with these QTL peaks. Hence, the QTL may have been mapped due to variation at the same or different loci. It is possible that at each locus there might be as many as three or as few as two alleles segregating among the parental accessions. One explanation could be that the Ler-0 allele significantly differs in effect from the Sha, Col-4, and No-0 alleles.

Population-specific QTL

The mapping of different QTL in different populations (Tables 4, 5, 6, 7; Figs. 2, 3, 4, 5) in this study is likely due to accession-specific alleles; otherwise, the QTL would have been identified in all four populations, since all of them share the Ler accession as a parent. These four populations mimic a set of test crosses. Therefore, a QTL identified in only one population indicates that the Sha, Col-4, Cvi, or No-0 allele at that locus differs markedly from the Ler allele, and that the distinctive allele came from the “non-Ler” parent of the population in which the QTL was mapped. Thus, this design proved very useful for identifying novel alleles.

Table 7 Quantitative trait loci detected for the 12 seed oil composition traits in the Ler-0 × No-0 RIL population
Fig. 4

Major QTL for the 12 seed oil composition traits in the Ler-2 × Cvi RIL population. To interpret this figure see the legend for Fig. 2

Fig. 5

Major QTL for the 12 seed oil composition traits in the Ler-0 × No-0 RIL population. To interpret this figure see the legend for Fig. 2

Differences in the QTL identified for the FA traits in the four RIL populations could be due to segregation of meaningful genetic variation among natural lineages as there was no environmental variation between populations. It was also seen that the majority of genetic variation in TAGs in A. thaliana is additive in nature.

Candidate genes and FA regulation

The complete genome sequence of A. thaliana and knowledge of its seed oil biosynthetic pathway helped identify CGs near the QTL. A total of 22 of the 36 QTL co-located with CGs in the Ler-0 × Sha RIL population, and 11 of the 23 QTL in the Ler-0 × Col-4 RIL population co-located with CGs. Most of the co-locations with QTL for individual FAs were functionally relevant, with the exception of the co-location of FATB with a QTL for 22:1 on chromosome 1 in the Ler-0 × Col-4 RIL population and the co-location of FAE1.1 and FAD2 with a QTL for 16:0 in the Ler-0 × Sha RIL population. We observed six functionally relevant CGs (FAE1.1, FAE1.3, FAE1.4, FAD2, SAD1, and SAD2) that co-located with the same QTL in the two populations (Tables 4, 5). We also observed that in the Ler-0 × Sha RIL population, the CGs FAE1.1, FAE1.2, FAE1.3, FAE1.4, FAD2 and DAGAT lie within 1–2 cM of some of the associated peaks. In the Ler-0 × Col-4 RIL population, KASII lies at the QTL peak for 16:0 on chromosome 1. Previous studies have shown that QTL might be accurate to within 2 cM or less (Price 2006), which makes these very promising candidate genes to study. In addition, all other QTL in the four populations that did not co-locate with CGs co-located with one or more lipid synthesis genes and transcription factors, with the exception of QTL LC5.181 in the Ler-0 × Col-4 RIL population. Several studies indicate that mutations in regulatory regions and protein-coding changes in transcription factors often contribute to quantitative trait variation (King and Wilson 1975; Mackay 2001; Korstanje and Paigen 2002; Hsia and McGinnis 2003). Determining the source of genetic variation in seed oil composition will require the molecular characterization of the identified QTL and the underlying CGs.

Implications for evolutionary studies

Natural variation in seed oil composition in A. thaliana is extensive (O’Neill et al. 2003). While all A. thaliana populations studied to date have the same set of FAs in their seed oils, the relative proportions of those FAs vary widely. For example, 360 accessions surveyed for very long chain fatty acids (VLCFAs) and polyunsaturated fatty acids (PUFAs) in their seeds (O’Neill et al. 2003) had relative proportions of VLCFAs ranging from 13 to 21.2% of total fatty acids and relative proportions of PUFAs ranging from 53.3 to 66.1%. Prior work with 84 accessions of A. thaliana has also shown extensive variation in the proportions of 18:1 (5–17.8%), 18:2 (23–37.8%), 18:3 (15–27%), 20:1 (16.4–27.5%) and 16:0 (5.5–10.2%), while the variation in the proportions of 18:0, 20:0, 20:2 and 22:1 was <5% (Sanyal and Linder, manuscript submitted). A greater proportion of unsaturated FAs is found at higher latitudes and cooler temperatures, so that seeds with more unsaturated oils germinate earlier and gain a competitive advantage (Linder 2000). In 84 European accessions of A. thaliana, the total proportion of saturated FAs (a composite measure that includes most of the high melting point FAs), the MPs of the seed oils and the individual FA 22:1 follow a strong latitudinal cline (Sanyal and Linder, manuscript submitted), similar to what has been demonstrated in other species and taxonomic groups (Linder 2000). The latitudinal cline in seed oil MPs and total saturated FAs in A. thaliana was mostly due to combinations of FAs rather than one or two dominant FAs. The discovery of QTL affecting the MP of seed TAGs in three of the four RIL populations has important evolutionary implications. There is evidence that the MP of seed TAGs has been under selection (Linder 2000), so identification of QTL that affect TAG MPs is a first step towards the discovery of these loci in the wild.
The MP QTL in the Ler-0 × Sha RIL population are particularly important because they co-locate with several FA QTL, and their significant correlations could be used to explore how selection might operate on those FA loci and affect MP (Table 8). The genetic correlations between FAs whose QTL co-locate with MP QTL appear generally to enhance selection for altered TAG melting point (Table S1, Table 8). Further, since we did not detect any significant epistatic interactions among non-overlapping QTL, cytoplasmic effects or nuclear-cytoplasmic interactions, there should not be strong constraints or complicating factors on selection at particular loci from other QTL that we detected. Nevertheless, the presence of multiple overlapping FA QTL could place some constraints on the ability of selection to alter TAG melting point. Only one of the four MP QTL in the Ler-0 × Col-4 population co-located with a QTL for 22:1, a minor component of A. thaliana seed oil, and the correlation was not significant (Tables S1–S2). The QTL for MP in the Ler-2 × Cvi population did not co-locate with any FA QTL. Ler-0 × Sha is therefore a particularly rich source for future work on the evolution of seed oil composition.

Table 8 Net effect of significant genetic correlations between FAs on selection for altered TAG melting points

Implications for breeding seed oil crops

The identification of these FA synthesis QTL in the four RIL populations will enable marker-assisted selection to move beneficial QTL alleles into elite agricultural genotypes in seed oil breeding programs, especially in the Brassica oilseed crops, to produce oil for the food industry and for industrial products. The detailed information available for the Arabidopsis genome has already helped identify possible CGs involved in the glucosinolate pathway, flowering time and FA desaturases in crop brassicas (Hasan et al. 2008; Lagercrantz et al. 1996; Mekhedov et al. 2000; Qiu et al. 2006), and will therefore also be valuable in elucidating the seed oil biosynthetic pathway.

Conclusions

Our results show that natural variation for seed oil among five ecotypes (Ler, Col-4, Sha, Cvi and No-0) can be used to investigate the genetic control of seed oil composition as an alternative to screening mutants. The seventy-two QTL detected for seed oil composition provide new targets for the control of these traits. Characterization of the five QTL observed in multiple populations and of the QTL clusters identified, especially in the Ler-0 × Sha and Ler-0 × Col-4 RIL populations, would be useful in elucidating the seed oil biosynthetic pathway. The identification of QTL for seed oil MP suggests that we can select for genotypes adapted to different temperature conditions. Further, seed oil MP QTL co-locate with QTL for individual FAs, suggesting that these QTL could be very useful for studying local adaptation and hence for breeding purposes. Several of these QTL map to genomic regions containing candidate genes supported by strong functional arguments, providing opportunities for further study.