Introduction

Low input crop production systems practiced in developing countries contribute significantly to world food needs. These systems are currently gaining interest in resource rich countries that have traditionally practiced high input agriculture (Laperche et al. 2006). The major reasons cited for the increased interest in low input farming in developed countries are the potentially negative environmental consequences of the misuse of agro-chemicals, particularly the pollution of water resources due to high rates of nitrogen fertiliser and pesticides; increased production costs coupled with reduced profits; and the urban consumer perception that the products of large scale industrialized agriculture are of comparatively low quality. The low input systems being adapted in developed countries are environmentally friendly technologies which use far fewer inputs and are potentially less harmful to the environment, but still ensure adequate and quality food production (Bedo et al. 2005). Furthermore, the same low input systems significantly reduce production costs, which are estimated to amount to 80% of gross farm income in high resource intensive agricultural systems (Daberkow and Reichelderfer 1988).

It has been estimated that up to 50% of yield from most modern cultivars is derived from high usage of external inputs like fertiliser, pesticides, and adequate moisture (Ceccarelli 1996b) and most modern varieties do not appear to be adapted to low input levels (Murphy et al. 2007). Therefore, breeding for low input systems should focus on genetically improving input use efficiencies. This focus on adaptation to low input systems may involve the recovery of genes that could have been lost through modern breeding. This derives from the reported erosion of genetic variation for abiotic stress tolerance caused by domestication, breeding and selection (Forster et al. 2000). Studies of barley adaptation to drought conditions confirmed the loss of drought tolerance due to breeding and selection, as reported by Forster et al. (2000). Many results from adaptation studies have shown that landraces were better adapted to stress environments, while modern genotypes were better adapted to stress-free, high yielding environments (Ceccarelli 1996a; Pswarayi et al. 2008). Landraces could be used as a source of genes for adaptation to low yielding environments characterised by drought stresses, limited moisture availability, low fertility, and other related factors.

Most contemporary studies carried out on barley adaptation to low yielding Mediterranean environments centred on identifying quantitative trait loci (QTL) related to stress tolerance in doubled haploid or recombinant inbred line populations produced from single crosses (Teulat et al. 2002, 2003; Baum et al. 2003; Forster et al. 2004; Francia et al. 2004; Tondelli et al. 2006) or, more recently, on a wide collection of germplasm by association mapping (Comadran et al. 2008a). The current study set out to investigate changes in allele frequencies of marker loci close to QTL for yield in landraces, old, and new cultivars. These changes may be associated with modern breeding. We also look at the effects of these allele frequency changes on adaptation to low and high yield potential environments, with the purpose of identifying possible chromosomal regions subject to selection during breeding.

Materials and methods

Genetic material

The genetic material consisted of a collection of 192 entries, grouped by breeding class (BC) into 83 landraces (L), 44 old (O) and 65 modern genotypes (M), obtained from within the Mediterranean basin (Italy, Spain, Jordan, Turkey, Morocco, and Algeria) and elsewhere (Holland, Germany, Sweden, the UK, Denmark, USA and the Czech Republic). Four of the original 192 lines were discarded as detailed genetic and phenotypic assessment revealed that they were originally misclassified, leaving 188 for analysis. Seed for field-testing was multiplied at the International Centre for Agricultural Research in the Dry Areas (ICARDA), Aleppo, Syria in 2003. Each entry was genotyped using 49 genomic and EST derived Simple Sequence Repeat (SSR) and one Single Nucleotide Polymorphism marker (SNP). These SSR and SNP gave good coverage of the barley genome (Russell et al. 2004). Genotyping was carried out to test for the existence of subpopulations within the genetic material by means of the software package Structure (Pritchard et al. 2000). Five groupings, hereafter referred to as ‘subpopulations’, related to geographical origin, spike type, and growth habit were identified: East Mediterranean (E); South West Mediterranean (SW); North Mediterranean six rows mainly winter (N6w); North Mediterranean two row spring (N2s) and Turkish genotypes (T) (Comadran et al. 2008b).

To investigate allelic diversity in the genetic material, whole-genome profiling using Diversity arrays technology (DArT®, www.diversityarrays.com) was carried out and a total of 1130 biallelic markers were identified. Of the 1130 polymorphic DArT markers, 811 were mapped on a consensus DArT map assembled from seven individual barley mapping populations (Wenzl et al. 2004, 2006) and were used for association mapping. There was significant linkage disequilibrium up to 3 cM on the genome, which, given the current genome coverage, made possible a dense genome wide scan for associations between DArT markers and yield QTL (Comadran et al. 2008b).

Field trials

Trials were conducted in 2004 in Spain (ESP) and in 2004 and 2005 in Syria (SYR) at locations contrasting in moisture (dry and wet) and fertility (inherently low and high), which represented low and high yield potential environments respectively (Table 1). Throughout the paper we will use the term ‘site’ to identify the following country by year combinations, ESP4, SYR4 and SYR5, and the following codes ESP4H, ESP4L, SYR4H, SYR4L, SYR5H and SYR5L to identify the six ‘trials’, in which the last letters L and H refer to the expected low and high yield potential locations within a given site. We use the term adaptation in this paper in its simplest sense, that is, to denote a relatively good performance of a genotype or genotype class at a given set of trials or sites.

Table 1 Soil physical properties, annual rainfall, long term average yields and agronomical practices of the four testing locations used

The experimental design for each trial was partially replicated: four repeated checks were included in a systematic diagonal fashion, and a random set of 25% of the entries was replicated in each trial. The checks (a local landrace, a local old variety, a local modern variety and an improved variety ‘Rihane-03’ which was common to the two countries) were used to adjust for spatial variation. Trials were sown in plots of 6 m² and grown according to local practice for sowing rate and other inputs (Table 1).

Statistical analysis

Individual and combined site analyses of variance

Best Linear Unbiased Predictors (BLUPs) for each of the lines were first generated for each trial using a mixed model which was based on an approach described by Piepho et al. (2005). Each analysis included a post-blocking spatial adjustment by including rows and columns as random factors in the model. The partially replicated entries were considered random effects, while the repeated checks were considered fixed.

A combined analysis of variance across locations was performed on the two-way table of line by trial means, where each mean represented the spatially adjusted Best Linear Unbiased Predictor for a line in a trial. Trials were classified by the factorial combination of the expected yield potential of the environment (YP), high (H) vs. low (L), and site (ESP4/SYR4/SYR5).

Due to the lack of balance of the numbers of landraces, old and modern cultivars across the five subpopulations, it was not possible to orthogonally partition the genotypic main effects into effects due to breeding class, structure and their interaction. Thus, genotypic effects were partitioned according to a factor with 15 levels, formed from the product of the three breeding classes (landraces/old/modern) and the five subpopulations (E, N2s, N6w, SW or T). The linear model used for the across trial analysis of variance was:

$$ \begin{aligned} {\text{Yield}} = {} & {\text{overall mean}} + [{\text{Site}} + {\text{YP}} + {\text{Site}}*{\text{YP}}] + [({\text{BC}}*{\text{S}}) + {\text{Residual G}}] \\ & + [({\text{BC}}*{\text{S}})*{\text{Site}} + ({\text{BC}}*{\text{S}})*{\text{YP}} + ({\text{BC}}*{\text{S}})*{\text{Site}}*{\text{YP}} + {\text{Residual GE}}]. \end{aligned} $$

The brackets delineate the partitioning of the environmental main effect, the genotypic main effect, and the genotype by environment interaction. The environmental main effect is partitioned into Site, expected yield potential of the environment (YP) and their interaction (Site*YP). The genotypic main effect is partitioned into the effects of the factor from the product ‘breeding class × subpopulation’ ((BC*S)) and a residual genotypic main effect component. The genotype by environment interaction is partitioned into a residual component and three interaction terms: ‘breeding class × subpopulation’ by site combination ((BC*S)*Site); ‘breeding class × subpopulation’ by yield potential ((BC*S)*YP) and ‘breeding class × subpopulation’ by site by yield potential ((BC*S)*Site*YP). For the sake of simplicity, we will consider all terms in the model fixed, with the exception of the residual genotypic main effects and residual genotype by environment interaction terms.

The genotype by environment interaction (GE) was studied by the AMMI model (Gauch 1992). GE was graphically displayed by a biplot of the interaction scores, where the AMMI model was applied to the (BC*S) by trial table of means. The basis of the biplot was an AMMI2 model, i.e., a model with two GE principal component axes, IPCA1 and IPCA2. Provided that GE is sufficiently approximated by IPCA1 and IPCA2, distances from the origin are indicative of the amount of interaction exhibited by genotypes over environments or by environments over genotypes. In a vector representation, the genotype and environment points determine lines starting at the origin (0, 0). The angle between the vectors of genotype i and environment j assesses their interaction: they interact positively for acute angles, negatively for obtuse angles, and do not interact for right angles. The extent (degree) of interaction of a genotype i in environment j is approximated by projecting the genotype point onto the line determined by the environmental vector, where the distance of the projection from the origin provides information about the magnitude of the interaction.
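The biplot reading rules above can be illustrated with a minimal Python sketch. It is not the GENSTAT analysis used in this study. It assumes that the IPCA1 and IPCA2 scores have already been obtained and are scaled so that the inner product of a genotype and an environment score vector approximates their interaction; the function name `biplot_interaction` and the scores in the usage note are hypothetical.

```python
import math

def biplot_interaction(genotype_scores, environment_scores):
    """Read one genotype/environment pair off an AMMI2 biplot.

    Both arguments are (IPCA1, IPCA2) score pairs. Returns the inner
    product (approximate interaction under the assumed scaling), the angle
    between the two vectors in degrees, and the signed length of the
    projection of the genotype point onto the environment vector.
    """
    g1, g2 = genotype_scores
    e1, e2 = environment_scores
    dot = g1 * e1 + g2 * e2
    norm_g = math.hypot(g1, g2)
    norm_e = math.hypot(e1, e2)
    if norm_g == 0.0 or norm_e == 0.0:
        # A point at the origin shows no interaction; the angle is undefined.
        return 0.0, None, 0.0
    cos_angle = max(-1.0, min(1.0, dot / (norm_g * norm_e)))
    angle = math.degrees(math.acos(cos_angle))
    projection = dot / norm_e
    return dot, angle, projection
```

For example, with hypothetical scores, `biplot_interaction((1.0, 1.0), (2.0, 0.0))` gives an acute angle and a positive projection (positive interaction), whereas `biplot_interaction((-1.0, 0.0), (3.0, 0.0))` gives an obtuse angle and a negative projection.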

Association analyses

QTL detection on individual trials was carried out by single marker regression for each of the 811 DArT® markers of known map position on data adjusted for the five subpopulations (E, N2s, N6w, SW and T). For a given trial and for the i-th DArT marker, DArT_i, the linear model (single QTL model) was:

$$ {\text{Yield}} = {\text{mean}} + {\text{subpopulation}} + {\text{DArT}}_i + {\text{error}} $$

where Yield is the grain yield BLUP for any one of the 188 genotypes. The term subpopulation in the model, which was considered fixed, represented the five subpopulations (E, N2s, N6w, SW or T) to which the genotypes belonged, as identified by the Structure analysis (Pritchard et al. 2000). DArT_i is a binary variable (0, 1) representing presence or absence of the anonymous sequences evaluated, with i = 1…811.
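The single QTL model can be sketched in a few lines of Python. This is an illustration only, not the GENSTAT code used in the study. It relies on the fact that centring both the yield and the marker score within subpopulations, and then regressing one on the other, gives the same marker effect as fitting the full model with a fixed subpopulation term by least squares. The function names are hypothetical, and the sketch returns the F statistic without a P-value.

```python
from collections import defaultdict

def centre_within_groups(values, groups):
    """Subtract from each value the mean of its own group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def single_marker_effect(yields, subpopulations, marker):
    """Marker effect and F statistic, adjusted for subpopulation.

    yields: grain yield BLUPs; subpopulations: group labels;
    marker: 0/1 scores (absence/presence) for one DArT marker.
    """
    y = centre_within_groups(yields, subpopulations)
    x = centre_within_groups(marker, subpopulations)
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("marker does not vary within any subpopulation")
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    effect = sxy / sxx                      # yield change for allele presence
    ss_marker = effect * sxy                # sum of squares due to the marker
    ss_residual = sum(yi * yi for yi in y) - ss_marker
    # Residual df: entries minus subpopulation means minus one marker term.
    df_residual = len(yields) - len(set(subpopulations)) - 1
    f_stat = ss_marker / (ss_residual / df_residual)
    return effect, f_stat
```

Applying `single_marker_effect` to each of the 811 markers in turn would give one test per marker and trial, which is where the multiple-testing problem addressed next arises.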

To solve the multiple-testing problem occurring in association mapping, we chose to control the false discovery rate (FDR) following the procedure described by Benjamini and Hochberg (1995). We interpreted the association analyses across the six trials as a multi-trait problem and derived a common FDR threshold for all six trials simultaneously, applying the recommendations given by Benjamini and Yekutieli (2005).
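The Benjamini and Hochberg (1995) step-up rule can be sketched as follows. Pooling the P-values of all trials into a single list, as in the usage note, is an assumption made here for illustration and may differ in detail from the multi-trait recommendations of Benjamini and Yekutieli (2005) applied in the study; the function name `bh_threshold` is hypothetical.

```python
def bh_threshold(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up rule.

    Returns the largest P-value p(k) such that p(k) <= (k / m) * fdr,
    where p(1) <= ... <= p(m) are the sorted P-values. All tests with a
    P-value at or below the returned threshold are declared significant.
    Returns None when no test passes.
    """
    ordered = sorted(p_values)
    m = len(ordered)
    threshold = None
    for rank, p in enumerate(ordered, start=1):
        if p <= rank / m * fdr:
            threshold = p  # keep the largest rank that satisfies the rule
    return threshold
```

A common threshold across trials could then be obtained, under the pooling assumption, by concatenating the per-trial lists of P-values before calling `bh_threshold`, and comparing every single marker test with the value returned.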

To arrive at a multi-QTL model, the significant markers, or putative QTL, of the single marker analyses were used as the starting set of predictor variables in a backward stepwise regression. Starting from a model that included the subpopulations, this procedure eliminated the markers whose contribution in terms of sum of squares was not significant. To carry out the variable subset selection procedure effectively, it is essential to have no missing data points. In this study, missing DArT data for any given genotype were estimated based on the frequencies of each DArT in its five nearest non-missing entries as determined by genetic similarity.
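The imputation of missing marker scores can be sketched as below. Several details are assumptions, since they are not specified in the text: genetic similarity is taken as the simple matching coefficient over markers scored in both entries, and the frequency among the nearest entries is converted into a 0/1 score by a majority rule. The function names are hypothetical.

```python
def simple_matching(a, b):
    """Share of markers scored in both profiles that carry the same score."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    return sum(x == y for x, y in shared) / len(shared)

def impute_missing(profiles, k=5):
    """Fill missing scores (None) from the k most similar scored entries.

    profiles: list of marker profiles, one list of 0/1/None per entry.
    For each missing score, the k entries most similar to the entry in
    question that are not missing at that marker act as donors.
    """
    filled = [list(profile) for profile in profiles]
    for g, profile in enumerate(profiles):
        for m, score in enumerate(profile):
            if score is not None:
                continue
            donors = [h for h in range(len(profiles))
                      if h != g and profiles[h][m] is not None]
            donors.sort(key=lambda h: simple_matching(profile, profiles[h]),
                        reverse=True)
            nearest = donors[:k]
            if nearest:
                freq = sum(profiles[h][m] for h in nearest) / len(nearest)
                filled[g][m] = 1 if freq >= 0.5 else 0  # assumed majority rule
    return filled
```

Similarities are computed on the original profiles rather than on partly imputed ones, so the result does not depend on the order in which entries are processed.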

Changes in allelic frequencies at marker loci linked to yield QTL

Two tests were carried out to test for differences between QTL allele frequencies among breeding classes (landraces, old and modern genotypes): (1) an exploratory chi-square test of homogeneity of frequencies across all subpopulations, followed by (2) Fisher’s exact multinomial tests to test for selection within individual subpopulations. Fisher’s tests were carried out due to the small numbers of entries within each subpopulation class. Results were interpreted as follows. Directional changes associated with modern breeding could be inferred from an increase of the positive alleles across breeding classes, from landraces through old to modern releases. Changes in allelic frequencies at marker loci linked to QTL were considered to favour modern cultivars when the frequency of the positive allele increased from landraces to modern cultivars. Changes in allelic frequencies due to selection were considered to favour landraces when the frequency of the positive QTL allele decreased from landraces to bred cultivars.
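The exploratory chi-square test of homogeneity can be sketched as follows for a table of breeding classes (rows) by marker alleles (columns). For three breeding classes and a biallelic marker the test has two degrees of freedom, in which case the P-value has the closed form exp(−χ²/2); the sketch returns a P-value only in that case. The function name and the counts in the usage note are hypothetical and are not taken from Table 4. Fisher’s exact multinomial test is not covered.

```python
import math

def chi_square_homogeneity(table):
    """Pearson chi-square test of homogeneity for a table of counts.

    table: list of rows, e.g. [[present, absent], ...] per breeding class.
    Assumes every row total and column total is greater than zero.
    Returns (statistic, degrees of freedom, P-value); the P-value is given
    only for df == 2, where it equals exp(-statistic / 2), else None.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    p_value = math.exp(-stat / 2.0) if df == 2 else None
    return stat, df, p_value
```

With hypothetical counts of marker presence and absence in landraces, old and modern cultivars, the call would be `chi_square_homogeneity([[12, 68], [16, 28], [36, 28]])`.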

All statistical analyses were carried out using GENSTAT version 9 (Payne et al. 2006).

Results

Individual and combined site analyses of variance

Average grain yields for the six trials ranged between 1.33 (SYR4L) and 6.32 (ESP4H) t/ha (Table 2). Average yield differences between high and low yield potential trials were 1.21, 2.66 and 3.08 t/ha for SYR5, SYR4 and ESP4 respectively. On average, and particularly for the most productive sites, ESP4 and SYR5, modern genotypes had larger yield differences between the low and high yield potential trials than old cultivars and landraces (2.53, 2.43 and 2.08 t/ha respectively). For example, in ESP4 the average yield differences for modern genotypes and landraces were 3.45 and 2.66 t/ha respectively. However, this was not always the case, particularly for non-local germplasm: yield differences for N6w landraces were significantly larger than for some modern genotypes in the SW and T subpopulations.

Table 2 Average grain yields (t/ha) for 188 barley genotypes grouped by breeding class and subpopulation combination at each of the two trials differing in expected yield potential (high and low) in three Mediterranean sites

Correlations of genotypic yields between the six trials were generally low; the maximum correlation between a high and a low yielding trial at a given site was observed for SYR4 (r = 0.254, P = 0.0004). The highest correlations were observed between trials of the same yield potential at different sites (SYR5H and ESP4H: r = 0.546, P < 0.0001; and SYR5L and ESP4L: r = 0.452, P < 0.0001).

From the combined analysis of variance (Table 3), environment, genotype and the GE interaction terms were all statistically significant (P < 0.0001). Most of the strictly environmental differences were related to the contrast between high and low potential yield trials, while the site by yield potential of the environment interaction, although highly significant, explained the smallest part of the environmental sum of squares. Significant differences were detected between the levels of the product factor ‘breeding class × subpopulation’, (BC*S), while the corresponding residual genotypic variation within the classes defined by this product factor term was not significant (P = 0.064). The GE sum of squares was almost three times the sum of squares due to the genotypic main effect. The most significant GE term was the interaction between the product factor ‘breeding class × subpopulation’ and the contrast between high and low yield potential trials, (BC*S)*YP, reflecting differential adaptation of landraces, old and modern cultivars from the different subpopulations to high or low yield potential conditions.

Table 3 Combined analysis of variance for grain yield of 188 barley genotypes grouped by the combination of breeding class (landrace, old and modern variety, BC) and subpopulation (East Mediterranean, South West Mediterranean, North Mediterranean six rows, North Mediterranean two row and Turkish genotypes) at each of two trials differing in expected yield potential (contrast between high and low input trials, YP) in three sites in the Mediterranean Basin

In the biplot demonstrating the GE as decomposed in an AMMI2 model (Fig. 1), IPCA1 explained 77% of the sum of squares of the interaction variance and, as we could have expected, basically separated low yield potential trials, with negative scores, from high yield potential trials, with positive scores, except for SYR4H which was placed closer to the low yield potential trials. Old and modern bred cultivars (with the exception of the Turkish entries) tended to cluster with high yield potential environments (ESP4H and SYR5H) while landraces showed adaptation to low yield potential environments (SYR4L, ESP4L, SYR5L). All Turkish entries (landraces, old and modern) clustered together with low potential environments (Fig. 1). N2s and N6w modern entries adapted best to high yield potential environments (largest positive scores on IPCA1); while East Mediterranean landraces (E_L) adapted to low yield potential environments (largest negative scores on IPCA1). IPCA2 explained 15% of the interaction variation and may reflect a different behaviour of the two SYR4 trials versus the rest of the trials. Although the first interaction axis, IPCA1, could be inferred to be related to soil fertility and moisture availability (high vs. low productivity contrast), no obvious ecophysiological explanation could be found for IPCA2.

Fig. 1

AMMI biplot. Black dots represent germplasm by structure groups. Squares represent trials. The size of each square is proportional to its average yield. The unexplained variance for each trial is shown as a gray cut-out. Key: E_L: East Mediterranean landraces (East Mediterranean old and modern entries were not included in the biplot because the numbers of entries, 3 and 1 respectively, were considered too low to be representative); N2s_L, N2s_O, N2s_M: North Mediterranean two row, generally spring habit, landraces, old and modern entries; N6w_L, N6w_O, N6w_M: North Mediterranean mainly six row winter landraces, old and modern entries; SW_L, SW_O, SW_M: South West Mediterranean landraces, old and modern entries; T_L, T_O and T_M: Turkish landraces, old and modern entries

Different subpopulations (E, N2s, N6w, SW, and T) showed different environmental preferences within the low and high yield potential trials (Fig. 1). N2s landraces yielded relatively consistently across environments (located close to the origin). N6w landraces were adapted to SYR4, while E landraces tended to be relatively adapted to the SYR5L trial. In relation to the high yield potential environments, SW old and modern entries were best adapted to SYR5H while N2s and N6w old and modern entries to ESP4H. All Turkish genotypes (landraces, old and modern) clustered with the low input environments, without any specific inclination to either Syrian or Spanish trials.

Association analyses

We based our threshold for significance in single marker models on a false discovery rate of 5% across the six trials. The corresponding P-value was 0.0007, which translated to 3.14 on a −log10 (P-value) scale. A very large number of significant DArT markers (over 100 per trial, data not shown) were detected by single marker regression on the yield data (BLUPs) without adjusting for subpopulation. However, the total number of QTL detected dropped to 32 when the subpopulation term was included in the single marker model. Eighteen of the QTL were detected in low potential environments, and 14 in high (4 in ESP4L; 5 in SYR4L; 9 in SYR5L; 6 in ESP4H and 8 in SYR5H). Backward stepwise regression on markers from individual trials further reduced the total number of putative QTL from 32 to 28 (16 and 12 on low and high potential, respectively) (left side of Table 4). The number of QTL per trial identified by the backward stepwise regression varied up to eight (SYR5H), with some individual QTL explaining up to 10% of the total genotypic sum of squares in addition to that explained by the subpopulations. None of the QTL were found within a given bin at any chromosome in more than two of the six trials. The multiple R² explained just by the multiple QTL model identified by backward selection ranged from just over 5% in the case of SYR4H to almost 30% for SYR5H: SYR4H (5%); SYR5L (24.2%); SYR5H (27.7%); ESP4H (23.8%); ESP4L (22.5%); SYR4L (26.1%). Presence of an anonymous DArT sequence was more frequently associated with a yield increase (21 out of 28, Table 4) than with a yield decrease. The highest increase (bPb3852, bin 6, 5H) was 0.50 t/ha from a high potential trial (ESP4H).

Table 4 QTL from individual trials and frequencies of associated markers in landraces, old and modern cultivars in three low (top) and three higher yielding trials (bottom)

Changes in allele frequencies at marker loci linked to grain yield QTL

The right hand side of Table 4 shows the allelic frequencies for every significant QTL in the different breeding classes (landraces, old and modern varieties) for the whole set of 188 entries and separately for four of the five subpopulations (N2s, N6w, SW and T). Subpopulation E was not included in the tests for selection within subpopulations due to the low number of old cultivars (3) and modern cultivars (1) (see Table 2).

An exploratory chi-square test for homogeneity of marker allele frequencies across landraces, old, and modern genotypes at individual trials (Table 4) revealed that 11 of the total of 28 QTL detected in individual trials (39%) showed significant differences in marker/QTL allele frequencies. Most of them had allele frequencies within the old cultivars that were intermediate between landraces and modern releases and, thus, changes in allele frequencies could be a consequence of modern breeding. Based on allelic frequency and effects, nine of these 11 QTL had the favourable allele more frequent in modern cultivars: six through apparent selection for alleles increasing yield and three through apparent selection against alleles decreasing yield. Only two QTL, associated with markers bPb9005 in 1H bin 8 and bPb2147 in 5H bin 7, showed indications of selection favouring landraces, with the negative marker allele increasing in frequency in modern genotypes: bPb9005 landraces (40%), old (76%) and modern cultivars (81%); bPb2147 landraces (46%), old (88%) and modern genotypes (94%). The remaining 17 QTL (61%) were apparently breeding neutral, i.e. did not show significant differences in the allele frequencies at marker loci linked to the QTL in landraces, old and modern genotypes. All but 4 of these 17 were QTL detected under low potential yield conditions (Table 4).

This trend was also observed at the subpopulation level in Table 4 using Fisher’s multinomial exact tests. All nine QTL in which the positive alleles were more frequent in modern cultivars and the QTL at marker bPb2147 in which the negative allele was more frequent in modern genotypes, were confirmed for at least one subpopulation and in most cases for two. However, the QTL linked to marker bPb9005, in 1H bin 8, in which the positive allele was more frequent in landraces was not confirmed within any subpopulation which may suggest a spurious association unrelated to breeding. Changes in the positive allele frequency at marker loci linked to QTL related to modern breeding were most frequent within the SW subpopulation (11 QTL), followed by N2s and N6w (4 QTL). The Turkish subpopulation did not show a single event of changes in allele frequencies at marker loci linked to QTL.

Discussion

The general pattern of genotype adaptation, determined by the relative performance of the different genotypes in individual trials, coincided with many reports such as those by Ceccarelli (1996a) and Pswarayi et al. (2008). Landraces were particularly adapted to low yield potential environments while cultivars adapted best to high yielding trials (Fig. 1). Yield differences between landraces, old and modern genotypes were generally smaller within the low yield potential environment. Crops developed for adaptation to low potential environments characterised by intense abiotic stress show reduced yield potential under both low (stress) and high (non stress) yield environments (Rosielle and Hamblin 1981). Under these severe conditions, in which some of the landraces may have originated, crop survival, rather than grain yield, is generally the ultimate goal. Increased yield of modern bred cultivars under high input conditions did not result in significantly higher yields than landraces in low potential environments (Table 3). This suggests that modern cultivars are not always directly suitable for low input systems and points to their need for inputs in order to achieve high yields, as reported by Ceccarelli (1996b). In fact, modern cultivars have been observed to be generally outyielded by local landraces in trials with mean yields below 2 t/ha in a wide collection of Mediterranean environments (Pswarayi et al. 2008).

The number [16 (low yield potential environments), 12 (high yield potential environments)] and magnitude of explained genotypic variation of significant QTL (R² up to 10% for low and high yield potential environments) were comparable for low and high input conditions. However, most QTL were specific to either high or low yielding environments. Yield under poor conditions has been seen as a trade off for increased yield under optimum conditions (Ceccarelli and Grando 1993) and, therefore, genotypes cannot have high yield under both low and high input conditions. This principle is substantiated for these specific germplasm and environmental conditions. Only in two instances were QTL found within the same chromosomal bin (5H bin 4 and 5H bin 7) in both high and low yielding trials (Table 4). However, the effects of each QTL differed between environments: bPb9163 at 5H bin 4 showed negative effects (reduced yield) in both low and high potential trials, although yield reduction was relatively higher in high potential trials. This marker, thus, showed quantitative QTL by environment interaction. The putative QTL at 5H bin 7 linked to bPb2147 changed the sign of its effect between the low and high potential environments: in low potential environments it decreased yield, whereas in high potential environments it increased yield. This type of interaction is called qualitative. Environment specific QTL were also reported, for example, in barley (Romagosa et al. 1996; Malosetti et al. 2004), rice (Hemamalini et al. 2000) and maize (Beavis and Keim 1996; Austin and Lee 1998; Boer et al. 2007), some of them grown under low and high moisture regimes. Environment specific QTL may occur because the same traits measured in different environments are actually different traits (Falconer 1981), such that a trait like yield under stress (low potential) and non stress conditions (high potential) may actually be mutually exclusive events (Ceccarelli and Grando 1993). The mutually exclusive events might become manifest in different environments through the switching on and off of different genes prompted by different environmental signals (Lin and Togashi 2002).
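The distinction between quantitative and qualitative QTL by environment interaction drawn above can be made explicit with a small Python sketch. It only compares the signs and sizes of two estimated effects and ignores their standard errors, which is a simplification; the function name and the effect values in the usage note are hypothetical.

```python
def classify_qtl_by_environment(effect_low, effect_high):
    """Classify QTL by environment interaction from two allele effects.

    effect_low / effect_high: estimated yield effect (t/ha) of the same
    marker allele in a low and in a high yield potential trial.
    """
    if effect_low == effect_high:
        return "no interaction"
    if effect_low * effect_high < 0:
        return "qualitative"   # the effect changes sign (cross-over)
    return "quantitative"      # same direction, different magnitude
```

For instance, `classify_qtl_by_environment(-0.10, -0.30)` returns `"quantitative"`, the pattern described for bPb9163, while `classify_qtl_by_environment(-0.10, 0.30)` returns `"qualitative"`, the pattern described for the QTL at 5H bin 7.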

Significant changes in allele frequencies across landraces, old, and new barley cultivars of markers associated with QTL were detected more often in high yield potential environments (67%: 8 out of 12) than in low yield potential environments (19%: 3 out of 16). Thus, these results suggest that modern breeding may have increased frequencies of marker alleles close to QTL that favour production under high yield potential environments at the expense of yield under low potential conditions. An example of the possible impact of selection for positive QTL alleles under high yield potential conditions is the DArT marker bPb3852 (5H bin 6), which contributed a yield gain of 0.50 and 0.38 t/ha on ESP4H and SYR5H, respectively (Table 4). This allele increased in frequency from 15% in landraces to 36% in old cultivars and 56% in modern cultivars (Table 4). This increment was particularly important in N2s and SW subpopulations. Changes in frequency of this marker in N6w were not that steep, as its frequency was already high for the three breeding classes (Table 4).

This study was intended as exploratory and a number of weaknesses were undoubtedly present such as the reduced number of consistent QTL detected in more than one trial. Therefore these results should be interpreted with care. However, they illustrate possible consequences from modern breeding which could be used for development of improved varieties targeted for low input systems. One example of such consequences derived from these results is that breeding and selection for the low potential environments should not neglect landraces. Landraces not only adapted relatively better to low yield potential environments than old and modern cultivars, but some of the key genetic regions responsible for such an adaptation pattern may have been unintentionally ignored by modern breeding.