1 Introduction

Blackcurrants are economically significant horticultural products in Russia, Poland, Germany, China, Scandinavia, Great Britain, New Zealand and many other eastern European countries (Brennan 1996). The estimated annual world production approaches 600,000 tons (https://www.fao.org/faostat/). Blackcurrants are primarily grown for juice production but are also processed for purees, jams, and other food products. Varieties for the fresh market (e.g. Big Ben) have been developed to expand this sector. Blackcurrants are valued for their high vitamin C content (Hummer and Barney 2002), but the quality and consumer acceptability of their juices and purees are greatly influenced by their colour. The rich purple-black colour of blackcurrants is dependent on their high content of anthocyanins, in particular their characteristic anthocyanin profile based on 3-O-glucosides and 3-O-rutinosides of cyanidin and delphinidin (Macheix et al. 1990; Slimestad and Solheim 2002). Cultivars of blackcurrant show considerable variability in anthocyanin levels, and the selection of higher levels of anthocyanin represents a key objective for many breeders (Brennan 1996) to improve the colour of berries and their products. A single portion of blackcurrants can provide 100–300 mg of anthocyanins (Deighton et al. 2000; Kahkonen et al. 2003), and anthocyanins have been associated with a range of human health benefits such as reduction of the risk of cardiovascular diseases (Ghosh & Scheepens 2009), cancers (Guo et al. 2009), neurodegenerative diseases (Vepsäläinen et al. 2013; Ramassamy 2006), and diabetes (Torronen et al. 2012). However, although dominated by anthocyanins, blackcurrants also contain a range of other polyphenolic components (e.g. flavonols and hydroxycinnamic acid derivatives; McDougall et al. 2005; Mattila et al. 2011) which may contribute to health benefits in their own right and/or influence taste or flavour. Therefore, the inheritance of these potentially health-beneficial components is important for the selection of future blackcurrant cultivars.

A key advance in berry fruit breeding has been the use of molecular markers for the selection of key traits, particularly for traits that either cannot be evaluated until the plant is mature, such as fruit quality, or that require complex phenotyping procedures, e.g. pest and disease resistances. The James Hutton Institute (JHI) has a long history of blackcurrant breeding, and significant investment in marker and linkage map development in recent years is now leading to considerable advances in breeding efficiency as marker-assisted selection strategies are employed. The first genetic linkage map of blackcurrant, using AFLP, SSR and SNP markers, was constructed in a small mapping population of 82 offspring developed from two lines from the JHI (then SCRI) breeding programme (Brennan et al. 2008). A major gene, Ce, for gall mite (Cecidophyopsis ribis) resistance was mapped to linkage group (LG) 2, and this has led to the development of a marker-based diagnostic for gall mite resistant germplasm (Brennan et al. 2009). QTLs for titratable juice acidity, pH, ascorbic acid, specific gravity, hundred berry weight (HBW) and phenological traits were also detected by Brennan et al. (2008). No QTLs for anthocyanin content were detected in that study, but some were noted in a subsequent QTL × year analysis in an extended population from the same cross (Hackett et al. 2010).

A more detailed SNP and SSR map was constructed using 454 sequencing and 311 offspring from the same cross (Russell et al. 2011). This was extended further using genotyping-by-sequencing (GbS) by Russell et al. (2014), giving a high-quality map with up to 204 SNPs on each linkage group which allowed location of QTLs for hundred berry weight (HBW) and °Brix.

Previous work has been undertaken to assess the impact of environment and genotype on key blackcurrant quality traits such as sugar and acid content (Heiberg et al. 1992; Woznicki et al. 2017; Nwankno et al. 2012; Zheng et al. 2009), vitamin C (Hancock et al. 2007; Zheng et al. 2009; Walker et al. 2010; Vagiri et al. 2013), and polyphenol and anthocyanin content (Nour et al. 2011; Vagiri et al. 2013; Tian et al. 2019). However, very little is known regarding the genetic determinants of these key quality traits.

In this paper, we take advantage of a recently developed high-density GbS SNP linkage map (Russell et al. 2014) to map loci associated with levels of vitamin C, citrate, malate, succinate, total organic acids, total juice anthocyanins and total juice phenolics together with 46 specific polyphenol components in a blackcurrant biparental mapping population over two growing seasons.

2 Materials and methods

2.1 Plant material

Fruits were harvested from the reference blackcurrant mapping population SCRI 9328 (Brennan et al. 2008) when commercially ripe, then stored at -20 °C until processed. This population is an F1 progeny from a cross between two diverse parents grown at the James Hutton Institute, designated S36 (female) and S10 (male). The S36 parent was a selection from an SCRI 1987 cross (parents Ben Alder × Ben Loyal) and has large berries and commercially acceptable fruit quality parameters. Parent S10 was a selection from East Malling cross B1834 (parents B1426 × Ben Lomond) and has small to medium sized berries with lower fruit quality, earlier bud break and resistance to gall mite (Cecidophyopsis ribis). The population was grown in an un-replicated trial and fruit was sampled from the parents and progeny in 2008 and 2009. To ensure equivalent ripeness, fruit from different progeny was collected each day when first ripe over the harvest period (late July to early August in 2008 and 2009) and frozen, and a sub-sample (> 10 g) was used for juicing.

2.2 Metabolite analysis

2.2.1 Juice extractions

Juice was extracted from frozen fruits after homogenisation in a Waring blender and centrifugation at 2500 × g for 10 min at 5 °C. The juice sample (supernatant) was frozen in aliquots.

2.2.2 Biochemical analysis

Two sets of traits were analysed in the juice samples. The first set comprised general traits, e.g. total anthocyanins and total phenolics. The second set consisted of specific components measured separately: ascorbate, citrate, malate and succinate and, by summation, total acids (TOAS), all measured by HPLC, and 46 mainly polyphenol components measured by liquid chromatography mass spectrometry. For all chromatographic measurements, the extractions were performed twice and the standard deviation was calculated from these. For the total phenol and anthocyanin analyses, triplicates were recorded.

2.2.3 Total anthocyanin and phenolic measurements

The juice samples were analysed by the colorimetric total phenol and total anthocyanin methods described by McDougall et al. (2005). Standard curves of gallic acid (Sigma Chem Co. Ltd) and cyanidin-3-O-glucoside (Extrasynthese), respectively, were used for quantification.

2.2.4 Ascorbate and organic acids

Juice was diluted 10 times in 5% metaphosphoric acid containing 5 mM TCEP and analysed by HPLC–PDA according to Hancock et al. (2007), using an isocratic mobile phase of 8 mM H2SO4 in water at 0.6 ml/min over 30 min. Quantification was based on standard curves of ascorbate, citric acid, malic acid and succinic acid (Sigma) from 0.25 to 1.0 mg/ml.
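The quantification step can be illustrated with a minimal sketch, assuming a linear standard curve fitted by ordinary least squares and the 10-fold dilution described above. The peak areas are made-up illustrative numbers and the function names (`fit_line`, `juice_concentration`) are hypothetical; they are not taken from Hancock et al. (2007) or from the software used in this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def juice_concentration(peak_area, slope, intercept, dilution_factor=10.0):
    """Convert a peak area to mg/ml in the undiluted juice."""
    diluted_conc = (peak_area - intercept) / slope
    return diluted_conc * dilution_factor


# Hypothetical standard curve spanning 0.25 to 1.0 mg/ml.
std_conc = [0.25, 0.5, 0.75, 1.0]          # mg/ml
std_area = [50.0, 100.0, 150.0, 200.0]     # illustrative peak areas only

slope, intercept = fit_line(std_conc, std_area)
conc = juice_concentration(120.0, slope, intercept)
print(f"{conc:.2f} mg/ml in undiluted juice")
```

A sample whose diluted concentration falls outside the 0.25 to 1.0 mg/ml range of the standards would be extrapolated by this sketch, so in practice such a sample would need a different dilution.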

2.2.5 Liquid Chromatography Mass Spectrometry (LC–MS) Analysis

After thawing on ice, juice samples were vortexed and triplicate 475 μL samples were transferred to separate Eppendorf tubes. The internal standard solution (25 μL of 0.5 mg/mL Morin in methanol; Sigma Chem Co. Ltd) was added and the samples were vortex mixed. After centrifugation at 10 000 × g for 10 min at 5 °C, the supernatants were removed and placed in 0.45 μm PTFE filter vials (Thomson Instrument Company, Bioprocess Engineering Services Ltd, Kent, UK) prior to analysis. Samples were analysed using an LC system consisting of an Accela 600 quaternary pump and Accela PDA detector coupled to an LTQ Orbitrap mass spectrometer (Thermo Fisher Scientific Ltd.). Samples (10 μL) were injected onto a 2 × 150 mm (4 μm) Synergi Hydro-RP 80 column fitted with a C18 4 × 2 mm Security Guard cartridge (Phenomenex Ltd, Macclesfield, UK). Auto-sampler and column temperatures were maintained at 6 and 30 °C, respectively. The samples were analysed at a flow rate of 200 μL/min using a binary mobile phase of (A) 0.1% aqueous formic acid and (B) 0.1% formic acid in 50% acetonitrile/water with the following gradient: 0 − 5 min, 5% B; 5 − 22 min, 5 − 50% B; 22 − 32 min, 50 − 100% B; and 32 − 34 min, 100% B. Mass detection was carried out using an LTQ Orbitrap mass spectrometer in positive ESI mode. Two scan events were employed; full-scan analysis was followed by data-dependent MS/MS of the three most intense ions using collision energies of 45 eV, with the source voltage set at 3.4 kV, in wide-band activation mode. The instrument was optimized by tuning against Morin at a resolution of 100,000 in a range of 80 − 2000 mass units. For optimal electrospray ionization, the source conditions were set at a source temperature of 280 °C, sheath gas at 60 arbitrary units, and auxiliary gas at five arbitrary units. Prior to analysis, the mass accuracy of the instrument was assured by calibration following the manufacturer’s protocols. All predicted formula data presented were accurate at < 2 ppm.

The quality of the MS response was checked by monitoring blanks containing internal standards and quality control samples of juices from the parental lines interspersed through the sequence of samples. After peak checking, raw peak areas for the major components were obtained using the resident Xcalibur software. After export of MS peak data to Microsoft Excel, the peak areas were expressed as ratios to the internal standard (Morin).

2.3 Statistical analysis

2.3.1 Exploratory analysis of metabolite data

The mean trait levels for the parents and each offspring were calculated as the mean over the two technical replicates for each year separately. The Pearson correlation between the offspring means for each year was calculated to assess the repeatability of each trait. The distributions of the metabolite levels were examined; most were found to be skewed and so log10 transformations were made of the metabolite means before QTL mapping. The general traits (total anthocyanins (TA), total phenols (TP), organic acids etc.) had more symmetric distributions and were analysed on the original scale.
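The exploratory steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the offspring identifiers and metabolite values are invented, and the helper names `offspring_mean` and `pearson` are hypothetical.

```python
import math


def offspring_mean(replicates):
    """Mean over the technical replicates for one offspring in one year."""
    return sum(replicates) / len(replicates)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented data: {offspring_id: (replicate 1, replicate 2)} for each year.
year_2008 = {"p1": (1.0, 1.2), "p2": (4.8, 5.2), "p3": (20.0, 22.0), "p4": (0.4, 0.6)}
year_2009 = {"p1": (1.5, 1.7), "p2": (6.0, 6.4), "p3": (18.0, 19.0), "p4": (1.0, 1.2)}

# Only offspring measured in both years contribute to the correlation.
shared = sorted(set(year_2008) & set(year_2009))

# Skewed metabolite means are log10-transformed before further analysis.
log_2008 = [math.log10(offspring_mean(year_2008[k])) for k in shared]
log_2009 = [math.log10(offspring_mean(year_2009[k])) for k in shared]

r = pearson(log_2008, log_2009)
print(f"between-year correlation on the log10 scale: {r:.2f}")
```

The log10 transform requires strictly positive means, so a metabolite that is undetected in a sample would need separate handling, which this sketch does not attempt.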

2.3.2 Genetic map and QTL analysis

Genetic maps of this population have been published by Brennan et al. (2008) and Russell et al. (2011). Most recently, a high-density map combining SNP and SSR markers from Russell et al. (2011) and SNPs from genotyping by sequencing (GbS) has been published by Russell et al. (2014). This was constructed using 216 offspring from SCRI 9328, and has a total map length of 771 cM. The mean marker density, excluding co-segregating markers, is 1.74 markers per cM. This map was used here for the QTL mapping analysis, based on approximately 140 offspring which had both SNP genotypes and trait information available.

QTL mapping was carried out using the MapQTL 5 software (Van Ooijen 2004) and the QTL mapping routines in GenStat 17 (GenStat for Windows 17th Edition 2014. VSN International, Hemel Hempstead, UK. GenStat.co.uk). Kruskal–Wallis mapping in MapQTL 5 was used as a preliminary test to identify regions of the genome linked to each metabolite, and to explore whether alleles from one or both parents were contributing. A permutation test was carried out using GenStat to establish appropriate thresholds for the Kruskal–Wallis test. Interval mapping and multiple QTL mapping were then carried out using GenStat. This analysis combines information across genetic markers along the chromosome to estimate the probabilities of each possible QTL genotype for each offspring at each position. In a cross such as this with outbreeding parents, the parental genotypes at a QTL are usually represented as ab x cd, with offspring genotypes ac, ad, bc and bd. Genetic predictors for the S36 (maternal) additive effect (P1), the S10 (paternal) additive effect (P2) and the dominance effect (D) can then be derived from the genotype probabilities pr(ac) etc. as:

$$ P_{1} = pr(bc) + pr(bd) - pr(ac) - pr(ad) $$
$$ P_{2} = pr(bd) + pr(ad) - pr(bc) - pr(ac) $$
$$ D = pr(bd) - pr(bc) - pr(ad) + pr(ac) $$

and these can be used as explanatory variables in a linear model for each metabolite. For metabolites that were detected in both years, a multi-environment QTL analysis was carried out in GenStat to test for QTL × year interactions. The approach of Li and Ji (2005), based on the effective number of independent tests, was used to obtain a genome-wide significance threshold.
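As an illustration of how the three genetic predictors behave, the sketch below evaluates the formulae for P1, P2 and D and plugs them into a linear predictor of the form mu + a1·P1 + a2·P2 + dom·D. It is a minimal sketch, not the GenStat routine: the function names and the coefficient values (mu = 10, a1 = 2, a2 = 0, dom = 0) are hypothetical and chosen only to show a purely maternal additive effect.

```python
GENOTYPES = ("ac", "ad", "bc", "bd")


def genetic_predictors(pr):
    """Return (P1, P2, D) for one offspring at one map position.

    `pr` maps each offspring genotype ('ac', 'ad', 'bc', 'bd') to its
    probability; the four probabilities are assumed to sum to one.
    """
    p1 = pr["bc"] + pr["bd"] - pr["ac"] - pr["ad"]  # maternal (S36) contrast: b vs a
    p2 = pr["bd"] + pr["ad"] - pr["bc"] - pr["ac"]  # paternal (S10) contrast: d vs c
    d = pr["bd"] - pr["bc"] - pr["ad"] + pr["ac"]   # dominance contrast
    return p1, p2, d


def expected_level(mu, a1, a2, dom, pr):
    """Linear predictor for a trait given hypothetical effect sizes."""
    p1, p2, d = genetic_predictors(pr)
    return mu + a1 * p1 + a2 * p2 + dom * d


# With a fully informative marker each genotype has probability one.
for g in GENOTYPES:
    pr = {k: 1.0 if k == g else 0.0 for k in GENOTYPES}
    print(g, genetic_predictors(pr), expected_level(10.0, 2.0, 0.0, 0.0, pr))

# With no marker information all genotypes are equally likely and
# every predictor is zero, so the offspring contributes no QTL signal.
print(genetic_predictors({g: 0.25 for g in GENOTYPES}))
```

With only the maternal coefficient non-zero, the two genotype classes carrying the b allele share one expected level and the two carrying the a allele share another, regardless of the paternal allele.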

3 Results

3.1 Exploratory analysis of general and metabolite data

Forty-six metabolites were detected in the fruit of the progeny by LC–MS in 2008, and all but one (K-mal-Glc) were detected again in 2009. Most components were identified as polyphenols except for two amino acids, tryptophan and phenylalanine (Table 1; Supplementary Fig. S1). Table 2 summarises the mean levels of ascorbate, citrate, malate and succinate, total acids (TOAS), total anthocyanins (TA) and total phenolics (TP) and the metabolites for the parents S10 and S36 for each year, and the mean, minimum, maximum and standard deviation (SD) for the offspring, together with the correlation between years. The mean levels for the organic acids were higher in 2009 than in 2008, but similar between years for total anthocyanins and total phenolics. The general trait levels showed moderate correlations between the two years, with values between 0.41 and 0.68, which were all statistically significant (p < 0.001) for this size of population.

Table 1 The metabolites detected by LC–MS
Table 2 Summary statistics for the metabolites and general traits for each year

The parents had similar characteristic LC–MS traces but with some quantitative differences. Line S10 had relatively lower amounts of the major delphinidin anthocyanins and other more minor components (Fig. S1). Four anthocyanins, Dp-Glc, Dp-Rut, Cy-Glc and Cy-Rut, were the most abundant components in the parents and the progeny (Table 1; Fig. S1), which is characteristic of blackcurrants (McDougall et al. 2005; Slimestad and Solheim 2002; Allwood et al. 2019) where they can comprise up to 70% of the total phenolic composition. The anthocyanin rutinosides were generally more abundant than glucosides which is often noted in juices over solvent-extracted samples (Mattila et al. 2011). The presence of other anthocyanin components was confirmed including the later eluting, characteristic coumaroylated derivatives (Cy-coum-Glc and Dp-coum-Glc), some more minor rutinosides of Petunidin (Pt), Pelargonidin (Pg) and Peonidin (Pn) (Slimestad and Solheim 2002) along with condensation products of anthocyanins with epigallocatechin (EGC; McDougall et al. 2005). The samples also contained characteristic glucosides and rutinosides of the flavonols Myricetin (M), Quercetin (Q), Kaempferol (K) and Isorhamnetin (IR), also characteristic of blackcurrants (Slimestad and Solheim 2002; Mattila et al. 2011). Malonylated flavonol glucosides were also noted. There was evidence for the presence of coumaroylated flavonol hexoses (i.e. M-coum-Hex and K-coum-Hex). These had characteristic PDA spectra (pronounced absorbance at 355 and 315 nm) and yielded the expected predicted molecular formulae (e.g. PubChem Compound IDs 44259435 & 90657184). Further studies in negative ionisation mode showed that the putative M-coum-Hex peak gave m/z 625.1214 with a predicted molecular formula of C30H25O15 and yielded MS2 fragments of 479 (loss of 146 = coumaroyl), 463 (loss of 162 = hexose) and 317 (loss of 308 = coumaroyl and hexose). We have noted similar evidence for the presence of M-coum-Glc in our other LC-MSn studies on blackcurrant samples (results not shown). However, these components have not yet been confirmed in blackcurrants although K-coum-Glc has been noted in strawberry extracts (Zhang et al. 2008) and Q-coum-Glc in cranberries (Vvedenskaya et al. 2004).
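The neutral-loss assignments for the putative M-coum-Hex peak can be checked with simple arithmetic on nominal masses. The sketch below is illustrative only: it uses the nominal precursor m/z and the nominal losses quoted above, and the variable names are hypothetical.

```python
PRECURSOR_MZ = 625  # nominal m/z of the putative M-coum-Hex ion (negative mode)

# Nominal neutral losses quoted in the text.
NEUTRAL_LOSSES = {"coumaroyl": 146, "hexose": 162}

fragments = {
    "loss of coumaroyl": PRECURSOR_MZ - NEUTRAL_LOSSES["coumaroyl"],
    "loss of hexose": PRECURSOR_MZ - NEUTRAL_LOSSES["hexose"],
    "loss of coumaroyl and hexose": PRECURSOR_MZ - sum(NEUTRAL_LOSSES.values()),
}

for label, mz in fragments.items():
    print(f"{label}: m/z {mz}")
```

The three values reproduce the reported MS2 fragments at 479, 463 and 317.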

Apart from M-Gal-b, which was the only metabolite for which the correlation between the years was not significant, the mean metabolite levels were higher in 2009 than in 2008. Correlations (after a log transformation) between the two years ranged from 0.16 (for M-Gal-b) up to 0.88 for IR-Glc-a (Table 2). Correlations among the metabolites and general traits within each year are shown in Supplementary Tables S1a and S1b. These are discussed below along with the QTL results.

3.2 QTL mapping

For most of the traits, there was good agreement between the QTLs found in the two years. The results presented here are from a QTL by year analysis, fitting joint effects of the QTLs, year and their interaction. The threshold for declaring a QTL as present was estimated by the Li and Ji (2005) approach as − log10(p) = 3.46 or 5.16, corresponding to a genome-wide significance of 0.05 and 0.001 respectively. In general, QTLs were defined as major if they had − log10(p) > 10, as strong if − log10(p) values were between 5 and 10, and as minor if − log10(p) < 5. At least one QTL was detected for each metabolite, but usually three or four QTLs were found and for chlorogenic acid (CGA) there were six QTLs across LG 1, LG 2, LG 3, LG 4, LG 6 and LG 8. Major QTLs were detected on all eight linkage groups. Details of the QTLs are given in Table 3, and the positions are shown in the linkage maps in Fig. 1a–f. The QTLs are discussed below for each linkage group in turn, ordered by their significance.
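The thresholding and classification logic can be sketched as below. This is a hedged illustration rather than the GenStat implementation: the effective number of tests follows the formula published by Li and Ji (2005), computed from eigenvalues of the marker correlation matrix, and is combined here with a Šidák correction. The eigenvalues, the value of 145 effective tests and the handling of values exactly equal to 5 or 10 are assumptions made for the example, not values reported in this study.

```python
import math


def li_ji_effective_tests(eigenvalues):
    """Effective number of independent tests (Li and Ji 2005).

    Each eigenvalue contributes 1 if its absolute value is at least 1,
    plus the fractional part of its absolute value.
    """
    total = 0.0
    for lam in eigenvalues:
        x = abs(lam)
        total += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return total


def sidak_threshold(alpha_genome, m_eff):
    """Point-wise threshold on the -log10(p) scale for a genome-wide alpha."""
    p_point = 1.0 - (1.0 - alpha_genome) ** (1.0 / m_eff)
    return -math.log10(p_point)


def classify_qtl(neg_log10_p, threshold):
    """Label a QTL using the major/strong/minor scheme described above.

    Assumption: values of exactly 5 or 10 are treated as 'strong'.
    """
    if neg_log10_p < threshold:
        return "not significant"
    if neg_log10_p > 10.0:
        return "major"
    if neg_log10_p >= 5.0:
        return "strong"
    return "minor"


# Hypothetical eigenvalues of a 4-marker correlation matrix (they sum to 4).
m_eff_toy = li_ji_effective_tests([2.5, 1.0, 0.4, 0.1])
print(f"effective tests for the toy example: {m_eff_toy:.1f}")

# Hypothetical effective number of tests for a whole genome.
threshold = sidak_threshold(0.05, 145.0)
print(f"threshold: {threshold:.2f}")

for value in (101.3, 8.3, 4.0, 3.0):
    print(value, classify_qtl(value, threshold))
```

Because the threshold depends on the marker correlation structure, a denser or more strongly linked map would give a different effective number of tests and therefore a different cut-off.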

Table 3 The QTLs detected for each trait
Fig. 1 Linkage maps showing QTL locations. The box shows the one-LOD support interval and the whiskers show the two-LOD interval. Major QTLs are shown as solid boxes, strong QTLs as diagonally filled boxes and minor QTLs as unfilled boxes. Red = anthocyanins, blue = flavonols, black = other traits. Figure 1a–h show linkage groups (LG) 1–8

3.2.1 LG 3

The most significant region was around 34–35 cM on LG 3, where major QTLs were detected for the coumaroylated metabolites, Cy-coum-Glc, Dp-coum-Glc, K-coum-Hex, and M-coum-Hex, with − log10(p) values between 65.1 and 101.3. These were all highly positively correlated (Table S1a, b). In general, QTLs for more than one metabolite were often found in close proximity on the linkage groups. Where metabolites, such as these coumaroylated components, have closely located QTLs, additive and dominance effects of the parents that are similar in significance and direction may indicate a common underlying gene causing the metabolites to increase or decrease together, while effects opposite in direction may suggest different metabolic pathways, or that one metabolite increases as another decreases. The S36 parent had higher values than the S10 parent for all of these metabolites in both years and the effect in the offspring was predominantly an additive effect of parent S36. The additive effects of the S10 parent and the dominance effects were not significant, and there was only weak evidence for any QTL × year effect (p = 0.04). Figure 2a illustrates the mean levels of the coumaroylated metabolites for the four possible offspring genotype classes at 35 cM, with the offspring receiving the B allele from parent S36 having higher levels than those with the A allele but little difference between the offspring receiving the C allele or the D allele from parent S10. This region explained around 70% of the variation in these metabolites. Candidate SNPs for marker-assisted selection for these traits will have genotype AB × AA for the parents, and the closest such marker was comp11926_c0_seq1F7.

Fig. 2 Clusters of metabolites with significant QTLs centred around 35 cM on LG3, showing the means for each possible offspring genotype. The profiles are adjusted to have zero mean for offspring genotype AC and are scaled by dividing through by the average SED for comparability. The coumaroyl metabolites in Cluster 1 (Fig. 2a) show a positive effect for S36 (parent 1), with the offspring inheriting the B allele from this parent having significantly higher levels than the offspring inheriting the A allele. The flavonol metabolites in Cluster 2 (Fig. 2b) have lower levels associated with the B allele

A major QTL also mapped to this region for the flavonol, Q-mal-Glc, with − log10(p) = 16.0. Again, the significant effect was that of parent S36, but for this the offspring receiving the B allele had lower levels than those receiving the A allele, the opposite situation to the coumaroylated metabolites above. There were also major QTLs for the flavonol derivatives M-mal-Glc and M-mal-Gal, and strong QTLs for Q-Glc, K-Glc and IR-mal-Glc-b on LG 3, with peak significance between 30 and 43 cM, with similar effects to Q-mal-Glc. Figure 2b shows the mean levels of these metabolites for the four offspring genotype classes. These effects are consistent with the negative correlations found between these flavonols and the coumaroylated metabolites (Table S1a, b).

In addition to QTLs mapping to 34-35 cM, there was a second region of major QTLs on LG 3 between 52 and 70 cM. The most significant QTL was for nigrumin coumarate (Nig-coum) at 57 cM on LG 3, with − log10(p) = 35.2. The closest marker is comp188_c0_seq1, which is heterozygous in the S10 parent only (i.e. segregating AA x AB with AB offspring having lower mean levels than AA). There was another significant QTL (with − log10(p) = 20.2) nearby at 52 cM for nigrumin ferulate (Nig-fer). These butenenitrile components (Lu et al. 2002) were highly correlated, with a correlation of 0.77 in 2008 and 0.87 in 2009. Both parents had significant effects for these metabolites, especially S10, but there were no significant dominance effects. Citrate also has its largest QTL at 57 cM on LG3, with − log10(p) = 31.1, and shows a similar pattern of effects.

The largest QTLs for the major anthocyanins Cy-Glc and Dp-Glc were also in this region of LG 3, with the highest significance (− log10(p) = 34.4 and 18.5 respectively) at 56 cM. For Cy-Rut, the maximum peak was nearby, at 64 cM, with − log10(p) = 5.5. For each of these QTLs, there were again significant effects, with the same direction, for each parent, but no significant dominance effects. However, there was no significant QTL for the other major anthocyanin, Dp-Rut, in this region. There were further major QTLs in this region, with consistent effects with the same direction, for the less abundant anthocyanin derivatives, Dp-EGC-Glc-b, Cy-EGC-Rut-b, and EGC-antho-Glc, and strong QTLs for Dp-diGlc-b, Dp-EGC-Glc-a, Pg-Rut, Pt-Rut, Dp-EGC-Rut-b and Dp-EGC-Rut-c. However, all of these except Cy-EGC-Rut-b have larger QTLs elsewhere. There were also minor QTLs between 50 and 82 cM for the anthocyanins Dp-EGC-Rut-a, Cy-EGC-Rut-a and Pn-Rut. Total anthocyanin content (TA) also had a strong QTL in this region, with the parental effects in the same direction as for the above individual anthocyanin metabolites.

There were also major QTLs for the flavonols, M-mal-Gal (in addition to the QTL at 39 cM), M-Gal-b, IR-Glc-a and Q-Gal, with peaks in the region from 56–65 cM, with − log10(p) values of 28.7, 14.7, 13.0 and 12.6 respectively. There were also strong QTLs for Q-Rut and IR-mal-Glc-a in the same region and minor QTLs for M-Gal-a, M-Glc, M-Rut, K-Rut, K-mal-Glc (only found in 2008) and IR-Glc-b between 60 and 79 cM. Minor QTLs were also found in this region for CGA, TP, succinate and malate, together with a major QTL for TOAS.

3.2.2 LG 4

There was a major QTL for IR-Glc-a on LG4 at 60 cM (− log10(p) = 83.1). The effect in the offspring was predominantly an effect of the S10 parent. There was a significant interaction between the effect of parent S10 and year, with the effect in 2008 being larger than in 2009. The isomer IR-Glc-b also had its largest QTL in this region but this QTL was much less significant, with − log10(p) = 5.3. There was a further strong QTL for K-coum-Hex nearby, at 55 cM. For all of these QTLs, the effect of the S10 parent was in the same direction. There was also a major QTL for M-Rut and a strong QTL for Q-Rut at 60–61 cM, where again the effect of the S10 parent was most significant, but with the opposite direction to the above metabolites. Minor QTLs for the flavonols, M-Glc, M-Gal-b, M-coum-Hex and IR-mal-Glc-b, also mapped to the region from 60 to 75 cM, and there was a QTL for IR-Gal at 99 cM. For these, the effect of the S36 parent was generally more significant, with a negative effect for the M- metabolites and a positive effect for the IR- metabolites. A strong QTL for IR-mal-Glc-a was separate from the rest, at 19 cM.

There were also strong QTLs in the region 57–60 cM of LG4 for the anthocyanin glucosides, Cy-Glc and Dp-Glc (− log10(p) = 18.6 and 10.6 respectively). None of these show a significant QTL by year interaction. Both parents had significant additive effects of similar sizes. These QTLs, while highly significant, were smaller than the QTLs for these anthocyanins on LG3. Five other anthocyanins had minor QTLs lying between 55 and 63 cM: Cy-coum-Glc, Dp-coum-Glc and Cy-EGC-Rut-a had effects with the same directions as for Cy-Glc and Dp-Glc while Dp-Rut and Pg-Rut had significant effects for S10 only, and with the opposite direction.

The region from 50 to 60 cM also contained a strong QTL for succinate, with − log10(p) = 8.3. Both parents had significant effects. There were also minor QTLs for malate, TA, and TP, with effects in the same direction, and a minor QTL for ascorbate with its effects in the opposite direction. Further minor QTLs for Nig-coum, Nig-fer and CGA were detected at 92 cM, 18 cM and 70 cM.

3.2.3 LG 5

There was a major QTL for the metabolite with M + H = 785 at 36 cM on LG5, with − log10(p) = 57.6. This metabolite may be an EGC-anthocyanin glucoside derivative (EGC-antho-Glc), as its exact mass gave the formula C36H33O20, just one oxygen different from Dp-EGC-Glc-a & b, which also had major QTLs here (− log10(p) values = 27.8 and 26.2 respectively). EGC-antho-Glc had a high correlation of 0.93 with Dp-EGC-Glc-a in 2008 and 0.87 in 2009. Dp-diGlc-a also had a major QTL in this region (− log10(p) = 12.3). All of these metabolites showed high positive correlations, and QTL effects with the same direction. QTL × year interactions were either weak or non-significant, apart from the interaction of the effect of S10 with year for EGC-antho-Glc (p = 0.002), where the effect was larger in 2008 than in 2009. The additive effects of both parents were significant, with similar sizes and the same direction, but there was little evidence of a dominance effect. This is a very dense region and there is some variation in the marker showing the highest association with these traits. However, all the most significant markers, such as CL1Contig525_204 at 36 cM, were of type AB × AB, with the AB heterozygote class having mean metabolite values between those of the AA and BB homozygotes. Two further anthocyanins, Dp-EGC-Rut-a and Cy-EGC-Rut-b, had strong QTLs between 30-35 cM but for these, the S36 parent was more significant.

There were also strong QTLs on LG5 in the region 40–44 cM for the flavonols, M-Gal-a, Q-Glc and Q-mal-Glc, and for Q-Gal at 24 cM. Both parents had significant effects, but with different patterns. For example, the alleles associated with an increase in Q-mal-Glc were associated with a decrease in Q-Glc, its immediate precursor in the metabolomic pathway. This suggests that this QTL is associated with the addition of a malonyl group to the Q-Glc precursor. This is different to the QTL for these metabolites on LG3 at 33–34 cM, where the alleles increase or decrease both metabolites together.

Further strong QTLs were found in the region 28-44 cM, for ascorbate, caffeic acid, TP, HBW and citrate (although the latter had a significant year by QTL effect with p < 0.001, and only the 2009 coefficient was significant).

3.2.4 LG 1

There was a major QTL for the minor anthocyanin, Pg-Rut, at 49 cM on LG 1 (− log10(p) = 42.7). Pt-Rut and Pn-Rut also had their largest QTLs on this LG. The peak for Pt-Rut was at 43 cM (− log10(p) = 13.5), while for Pn-Rut, two QTLs were detected at 29 cM and 60 cM (− log10(p) = 10.8 and 9.6 respectively) in a multiple QTL model. For these QTLs, only the effect of the S36 parent was significant. The relationships among these three metabolites in the offspring were complex, and are illustrated in Fig. 3 for the 2008 levels, where the offspring are coloured according to their genotype at the most closely associated marker, comp23369_c0_seq1, at 49.8 cM on LG1, which has parental genotype AB × AA. Pt-Rut and Pn-Rut were highly correlated and showed QTL effects in the same direction, with offspring having the AB genotype having lower values for both metabolites than those having the AA genotype (Fig. 3). Pg-Rut had QTL effects in the opposite direction. These data may indicate that this QTL controls the activity of flavonol hydroxylases that catalyse the conversion of precursors away from the Pg branch and into the Pn and Pt anthocyanidin branches (Fig. S2). The plot of Pg-Rut against Pt-Rut shows two clear clusters, corresponding to the marker genotypes. The overall correlations of Pg-Rut with Pt-Rut and Pn-Rut are not significant, but there are positive correlations within each cluster. The corresponding plot for the 2009 data (not shown) is very similar.

Fig. 3 The relationship among the anthocyanins Pg-Rut, Pn-Rut and Pt-Rut in 2008, labelled by the genotype at marker comp23369_c0_seq1 at 50 cM on LG1. The closed circles represent values from the AA homozygote and the open circles the AB heterozygote

There were also strong QTLs in this region of LG1 for Dp-Rut, Dp-EGC-Rut-a and Dp-EGC-Glc-a and a minor QTL for Cy-EGC-Rut-b, but the rutinoside metabolites all showed strong QTL × year interactions (p < 0.001), with the effects for each parent being significant only in 2008. For Dp-EGC-Glc-a, only the S10 parent was significant, with its effect in the opposite direction to the rutinoside metabolites.

The flavonols, M-Gal-a, K-Rut and Q-Glc, also had major QTLs in this region (− log10(p) values of 11.2, 10.8 and 10.0 respectively). There were also strong QTLs for K-mal-Glc, IR-Glc-a and IR-mal-Glc-a and minor QTLs for M-Gal-b, M-Rut, IR-Gal, IR-Glc-b and IR-mal-Glc-b. The three myricetin metabolites showed significant QTL x year effects (p < 0.001) but the others did not. K-Rut, K-mal-Glc, IR-Gal, IR-Glc-b, IR-mal-Glc-a and IR-mal-Glc-b showed similar effects to Pt-Rut and Pn-Rut, with significant effects of the S36 parent only. However, M-Gal-a, Q-Glc, IR-Glc-a have significant effects for both parents, and in the opposite direction to those of the other flavonols.

There were also strong QTLs mapped to this region for citrate, succinate, ascorbate, TOAS, TA and TP. The effects of both parents were significant and similar in size apart from for citrate, which showed a strong QTL × year interaction. A minor QTL for Nig-coum was also found here. A major QTL for Trp was found at 32 cM on LG 1, with only the effect of the S10 parent being significant.

Russell et al. (2014) analysed hundred berry weight (HBW) in this cross and found that the largest QTL was also at 52 cM on LG1. The most significantly linked marker was TP3120, with genotype AB × AB. This QTL showed a strong dominance effect, and offspring with the BB genotype at this marker had on average 17.3 g greater HBW (25.4% increase) than the AA and AB genotypes. We hypothesised that some of the metabolite QTLs might be influenced by berry size, with smaller berries having a higher proportion of skin and so having a higher level of metabolites that are more abundant in skins. This was tested by regressing the metabolite level on HBW and the genotype at marker TP3120. If the marker was significantly associated with the metabolite level in a regression model without HBW as an explanatory variable but was not significant in a model including HBW, this would indicate that the metabolite level was being driven by berry size. However, none of the metabolites showed this pattern. It therefore seems more likely that the QTLs on LG 1 that affect the metabolite levels are different to that for HBW.
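The logic of this berry-size test can be sketched with two nested regressions. The code below is a minimal illustration, not the analysis performed in this study: the data are invented so that the metabolite depends on HBW alone, the marker is coded as a hypothetical 0/1 indicator for the BB genotype, and the helper names are assumptions. It reports F statistics only; converting them to p-values would require an F distribution function, which is omitted here.

```python
def ols_rss(columns, y):
    """Residual sum of squares from least squares of y on an intercept
    plus the given predictor columns (solved via the normal equations)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def marker_f_statistic(y, marker, covariates=()):
    """F statistic (1 numerator df) for adding `marker` to a model that
    already contains an intercept and `covariates`."""
    covariates = list(covariates)
    rss_without = ols_rss(covariates, y)
    rss_with = ols_rss(covariates + [marker], y)
    residual_df = len(y) - (len(covariates) + 2)
    return (rss_without - rss_with) / (rss_with / residual_df)


# Invented data for eight offspring: BB genotypes have heavier berries,
# and the metabolite level depends on berry weight only.
marker_bb = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
hbw = [60.0, 64.0, 68.0, 72.0, 78.0, 82.0, 86.0, 90.0]
noise = [0.5, -0.5, -0.5, 0.5, -0.5, -0.5, 0.5, 0.5]
metabolite = [100.0 - 0.5 * w + e for w, e in zip(hbw, noise)]

f_marker_only = marker_f_statistic(metabolite, marker_bb)
f_marker_given_hbw = marker_f_statistic(metabolite, marker_bb, covariates=[hbw])
print(f"marker alone: F = {f_marker_only:.1f}")
print(f"marker after adjusting for HBW: F = {f_marker_given_hbw:.1f}")
```

In this constructed case the marker F statistic is large on its own and collapses once HBW is in the model, which is the berry-size-driven pattern that was looked for and not found among the metabolites in this study.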

3.2.5 LG 2

A more diverse range of metabolites had major QTLs on LG2, between 53 and 66 cM. The most significant were Dp-Rut and Dp-EGC-Rut, along with M-Gal-a and M-Rut (− log10(p) values of 22.4, 19.2, 16.0 and 10.7 respectively). For these traits, the main effect was of parent S10, with the offspring inheriting the AB genotype having lower levels than those inheriting the AA genotype. There were strong QTLs in this region for Cy-EGC-Rut-b, Q-Rut, M-Gal-b, M-mal-Gal and Q-mal-Glc and minor QTLs for Dp-Glc, Dp-EGC-Glc-b, Dp-coum-Glc, Pt-Rut, M-mal-Glc and Q-Gal, all of which had effects in the same direction.
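For readers less used to the − log10(p) scale, the short sketch below converts the four scores quoted above back to p-values. The scores are taken from the text; the function name `to_p_value` is a hypothetical helper introduced only for this illustration.

```python
import math

# − log10(p) scores reported above for the most significant LG2 QTLs
neg_log10_p = {"Dp-Rut": 22.4, "Dp-EGC-Rut": 19.2, "M-Gal-a": 16.0, "M-Rut": 10.7}


def to_p_value(score):
    """Convert a − log10(p) score back to the p-value it encodes."""
    return 10.0 ** (-score)


p_values = {name: to_p_value(score) for name, score in neg_log10_p.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.1e}")
```

Each unit increase in the score corresponds to a tenfold smaller p-value, so the gap between Dp-Rut and M-Rut spans almost twelve orders of magnitude.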

There were further major QTLs in this region for Cy-Glc, Caff, K-Glc and Dp-diGlc-b (− log10(p) values of 12.7, 12.6, 11.9, 10.9 and 10.1 respectively), with the main effect again being parent S10, but with its effect in the opposite direction to the metabolites above. A strong QTL in this region for Q-Glc also showed the same pattern of effects.

For a few components, the effects of both parents were significant. The largest QTLs for CGA and TA occurred at 58 cM and 61 cM with − log10(p) values of 16.0 and 10.5 and there was also a major QTL for Phe. There was a minor QTL here for Cy-Rut. For Cy-EGC-Rut-a, there was a strong QTL, with only parent S36 having a significant effect but this was well-separated from the other QTLs, with the peak at 29 cM.

3.2.6 LG 6

Succinate had its most significant QTL on LG6 at 17 cM, with − log10(p) = 16.1. This was mainly an effect of parent S36. There were also strong QTLs for the anthocyanins Cy-Rut, Cy-EGC-Rut-a, Pt-Rut, Pn-Rut and minor QTLs for Cy-EGC-Rut-b and Dp-Rut with peaks between 11 and 22 cM. These were also mainly effects of the S36 parent, with smaller effects of the S10 parent. The effects for the EGC metabolites, Cy-EGC-Rut-a and Cy-EGC-Rut-b, were in the same direction as for succinate, while those for the other anthocyanins were in the opposite direction.

There were further strong QTLs in the same region for M-mal-Gal, K-Glc and citrate and minor QTLs for TA and Trp. There were minor QTLs for Caff, CGA and Phe in the range 25–31 cM and one for IR-Gal at 58 cM.

3.2.7 LG 8

There was a major QTL for CGA mapping to 53 cM on LG8 (− log10(p) = 14.2). The main effect was of parent S36, with the offspring inheriting the AB genotype having higher levels than those inheriting the AA genotype. However, LG8 has very few markers from the S10 parent in this region, so its effects are difficult to estimate reliably. There was also a strong QTL for Dp-EGC-Glc-b and a minor QTL for Dp-EGC-3-Glc-a in this region, with similar effects, together with a minor QTL for citrate with its effect in the opposite direction. There were also strong QTLs at 69–70 cM on LG8 for IR-mal-Glc-a, IR-mal-Glc-b and IR-Gal and minor QTLs for K-Glc and Q-mal-Glc. These all showed similar patterns of allele effects, with significant dominance.

3.2.8 LG 7

LG7 contains very few markers from the S36 parent and the QTLs all showed significant effects of S10 only. Two metabolites, Dp-EGC-Rut-b and Dp-diGlc-a, had major QTLs on LG7 in the region 24–25 cM, with − log10(p) values of 11.7 and 10.4 respectively. There were also strong QTLs in the same region for Dp-EGC-Rut-c and Dp-diGlc-b and minor QTLs for M-coum-Hex and IR-mal-Glc-a, together with a minor QTL for IR-Glc-b at 36 cM. The AA genotype at this marker had higher levels of all metabolites except for the last two, which had the opposite effects. There was a further strong QTL for EGC at 49 cM and a minor one for Dp-EGC-Glc-a at 62 cM.

4 Discussion

We have detected many significant QTLs affecting levels of metabolites in this cross, with generally good consistency across the two years. Commercially, anthocyanins and vitamin C have been the most important components in blackcurrant. One strong QTL for ascorbate content was found on LG5, very close to major QTLs for the anthocyanins Dp-diGlc-a, Dp-EGC-Glc-a, Dp-EGC-Glc-b and EGC-antho-Glc, and another on LG1, close to strong QTLs for Pg-Rut and Pt-Rut and strong or minor QTLs for several other anthocyanins and flavonols. There was also a weak QTL for ascorbate on LG4, again close to weak QTLs for several other anthocyanins. These areas need further work and may inform future research on the control of ascorbate levels in blackcurrant.

The focus was on the major QTLs for total anthocyanin content found on different linkage groups. There was a major QTL for total anthocyanins on LG2 at 61 cM, with − log10(p) = 10.5. Indeed, two of the major anthocyanins, Dp-Rut and Cy-Glc, also had major QTLs around this area but many other metabolites also map here. In decreasing order of significance, these include Dp-Rut, Dp-EGC-Rut, chlorogenic acid, Cy-Glc, caffeic acid, Phe, Dp-diGlc-b, Cy-EGC-Rut-b, Pn-Rut and Pt-Rut. These also co-locate with more minor QTLs for other anthocyanin derivatives such as Dp-EGC-Glc-b, Dp-EGC-Rut-b, Pg-Rut and Cy-Rut. There are also major QTLs for flavonol components in this area, e.g. for M-Gal-a, K-Glc and M-Rut. Because the peak of a QTL will overlie many genes that could be involved in various functions, it is perhaps unsurprising that this area is not particularly specific for the major anthocyanin components. This could also be expected because total anthocyanin values are assessed on red colouration, which may be influenced by all anthocyanin derivatives.

There were major QTLs for the major anthocyanidin glucosides, Cy-Glc and Dp-Glc, on LG3 at 58 cM close to a strong QTL for total anthocyanins. Once again, this area also has major QTLs for Dp-EGC-Glc-b, EGC-antho-Glc and Cy-EGC-Rut-b but also strong or minor QTLs for other anthocyanin components. On LG4 at ~ 61 cM, there were major QTLs for Cy-Glc and Dp-Glc which lie under a minor QTL for total anthocyanins. However, once again, this also coincides with major QTLs for the flavonols, IR-Glc-a and M-Rut and a number of less significant QTLs for other components. Overall, it does not appear that any one QTL provides a major genetic influence that controls the levels of the major anthocyanins and is crucial for blackcurrant juice colour.

Although the coumaroylated anthocyanins, Cy-coum-Glc and Dp-coum-Glc, are relatively minor phenolic components in blackcurrant (Slimestad and Solheim 2002), coumaroylation has been suggested to alter both the colour and stability of anthocyanins (Sigurdson et al. 2018) and increasing the levels of coumaroylated anthocyanins may influence blackcurrant juice colour. The finding of very significant QTLs on LG3 at ~ 34 cM, not only for these coumaroylated anthocyanins but also for the putative coumaroylated flavonol components, K-coum-Glc and M-coum-Glc, suggests some element of genetic control over the addition of coumaroyl groups. This area also has strong QTLs for malonylated flavonols, which may suggest that control of more general esterification reactions is involved. On LG8 around 65 cM, there were strong QTLs for two IR-mal-Glc isomers and IR-Gal, and minor QTLs for K-Glc and Q-mal-Glc. It is possible that the grouping of these QTLs could specifically relate to control of malonylation reactions.

The detection of strong QTLs for EGC-anthocyanin derivatives (McDougall et al. 2005) is of interest. LG5 has major QTLs for two isomers of Dp-EGC-Glc and strong QTLs for Dp-EGC-Rut-a and Cy-EGC-Rut-b close to 35 cM. The putative EGC anthocyanin derivative (EGC-antho-Glc with m/z [M + H]+ = 785) also has a major QTL in this area, as does the 3,5-diglucoside of delphinidin, Dp-diGlc-a. There is a similar collection of QTLs for Dp-diGlc-b and Dp-EGC-Rut-a on LG2. On LG7, major or strong QTLs for two Dp-EGC-Rut isomers were also closely associated with QTLs for two Dp-diGlc isomers. Overall, these strong QTLs indicate that there may be some genetic control of the levels of the EGC-anthocyanin components, which suggests that they are not formed through a non-catalysed condensation between their parent anthocyanins and epigallocatechin. The relationship between the diglucosides and the EGC derivatives is not understood.

The genomic regions that underlie some of the most significant QTLs were examined for the presence of genes that could contribute to the levels of the metabolites. Given the status of annotation of the blackcurrant genome, this process uncovers many genes with homologies to proteins of unknown function and possible regulatory genes. Full exploration of the involvement of these genes in the accumulation of specific polyphenol components would require substantial further analysis and is beyond the scope of this paper. However, beneath the major QTL on LG5 at 35–36 cM, which was associated with EGC condensation products of Dp-Glc, there were genes with high sequence homology to structural genes involved in the anthocyanin biosynthetic pathway. In particular, there were genes with high homology to chalcone synthase (CHS; apple, 92% homology; Arabidopsis AT5G13930.1, 89% homology) and leucoanthocyanidin dioxygenase (LDO; apple, 77% homology; Arabidopsis AT4G22880.1, 76% homology). Chalcone synthase catalyses the synthesis of naringenin chalcone from p-coumaroyl-CoA (Fig. S2), which is the first committed step in the flavonoid pathway that ultimately produces anthocyanins. Leucoanthocyanidin dioxygenase (also called anthocyanidin synthase, ANS) catalyses the conversion of leucoanthocyanidin precursors into the main anthocyanidins, pelargonidin, cyanidin and delphinidin (Fig. S2), and both genes have been associated with anthocyanin biosynthesis in berries (Jaakola et al. 2002).

CHS is a member of a plant-specific multigene family of polyketide synthases (Austin and Noel 2003), which may show different specificities or be differentially expressed in various tissues (Zhou et al. 2013). There are also multiple genes homologous to ANS in Arabidopsis and these multiple forms may catalyse the synthesis of specific anthocyanidins from their leucoanthocyanidin precursors and may be expressed differentially in specific tissues (https://pmn.plantcyc.org/ARA/NEW-IMAGE?object=PWY-5125). The importance of ANS in anthocyanin accumulation is highlighted by the finding that yellow raspberries that lack anthocyanins have a non-functional ANS gene with nonsense mutations (Rafique et al. 2016). Crucially, genes with similar homology to the Arabidopsis CHS gene (AT5G13930.1) and the Arabidopsis ANS gene (AT4G22880.1) were found to be expressed during ripening and colour development of blackcurrant fruit (Jarret et al. 2018). Similarly, Li et al. (2015) noted that the expression of an ANS homologue was upregulated during fruit maturation and correlated with the accumulation of anthocyanins in blackcurrant. A key problem associated with the annotation of multigene families such as the polyketide synthases and 2-oxoglutarate dependent dioxygenases (of which ANS is a member) is that functional and sequence divergence between species can often lead to misleading gene annotation. However, the finding that these genes underlie QTLs associated with specific anthocyanins, combined with the previous observation that their expression occurred during fruit colour development (Jarret et al. 2018), provides a high degree of confidence that the genes are indeed correctly annotated.

5 Conclusions

The identification of major, robust QTLs for key polyphenol classes and for individual polyphenols in blackcurrant provides great potential for marker-assisted breeding for improved levels of industrially important components. However, further work is required to examine the genes that may control the levels of these components.