Introduction

Developing cold tolerant biomass sorghum genotypes is an important breeding goal in order to have an alternative crop to maize, which presently dominates the area cultivated for methane production in Central Europe. Sorghum is a thermophilic crop, mainly grown in the semi-arid tropics and subtropics. Low soil temperature in spring may delay planting time or result in low and non-uniform emergence. Planting is recommended when stable seedbed temperatures of more than 10 °C are achieved (Anda and Pinter 1994; Brar and Stewart 1994).

Agronomists and plant physiologists have used different approaches to describe germination and emergence: final germination or emergence rate (Tiryaki and Buyukcingil 2009), germination indices (Afzal et al. 2008), and time to onset (T 1), end (T 100) and median time to emergence (T 50) or germination derived from functions describing the germination process regressed against time (Snapp et al. 2008). Several single-value germination indices, e.g. Kotowski’s coefficient of velocity (Kotowski 1926) or Timson’s cumulative germination index (Timson 1965), are widely used, but their final values cannot be traced back to direct measures of T 1, emergence rate (ER) and time span of emergence (T 100 − T 1), which describe the germination or emergence process (Brown and Mayer 1988a). The emergence process is important since both the final percentage of emergence and the time at which emergence occurs are temperature dependent, and good field emergence requires high rates of uniformly germinating seeds under both optimum and low temperature conditions (Kanemasu et al. 1975). Logistic regressions carried out on cumulative germination rates are commonly used to describe germination (Hsu et al. 1984; Schimpf et al. 1977). Several functions describing the germination process were compared by Brown and Mayer (1988b), who recommended the Weibull function (Weibull 1951). However, comparability of germination curves computed from data of different temperature regimes is limited (Dumur et al. 1990). The Weibull function, for instance, has a parameter describing the shape of the regression, which affects the rate of increase of germination but has no biological meaning. An alternative approach is the use of piecewise linear regression models (Kempenaar and Schnieders 1995). The advantages are (1) the possibility to directly compare model parameters from different datasets since parameters are not interrelated, i.e., the change of one model parameter does not necessarily lead to a change of a second parameter, and (2) model parameters describe physiological processes or are simple statistical measures.

Genotypic differences in temperature response and base temperatures ranging from 5.9 to 9.8 °C were reported for germination of 16 sorghum cultivars (Wade et al. 1993). Genetic variation in base temperature and emergence rates at low temperatures was assumed to be the result of adaptation processes (Tiryaki and Andrews 2002). Chinese landraces had higher germination percentages and shorter time to 50 % germination at low temperatures than US breeding lines (Franks et al. 2006).

Quantitative trait loci (QTL) analysis for cold tolerance is a useful tool and a first step towards marker-assisted selection of cold tolerant genotypes (Knoll and Ejeta 2008). QTLs for germination rate were found in rice (Ji et al. 2009), wild barley (Vanhala and Stam 2006) and sorghum (Burow et al. 2011; Knoll et al. 2008). Knoll et al. (2008) identified QTL for field emergence in sorghum recombinant inbred lines (RIL) developed from a cross between a caudatum of African origin and the cold tolerant Chinese kaoliang ‘Shan Qui Red’ on chromosome SBI-01. Cold tolerance QTL were detected in the same region of SBI-01 by Burow et al. (2011).

For identifying QTL for adaptation processes, multi-environment trials are needed. Lacaze et al. (2009) carried out QTL analysis in a bi-parental barley population on the slope of individual genotype trait values regressed against the population mean in different environments. Kraakman et al. (2004) used a set of modern spring barley cultivars in order to detect marker–trait associations for mean yield and yield stability across environments. El Soda et al. (2010) used stability parameters to detect QTL for drought tolerance in wild barley introgression lines. It has been suggested that stability parameters can be used to distinguish between loci, in which constitutive genes are directly influenced by the environment and loci distinct from the constitutive genes but regulating them.

In contrast to QTL mapping in bi-parental crosses, association studies can be carried out on structured and unstructured populations, potentially carrying more than two alleles on a certain locus (Flint-Garcia et al. 2003). Advantages of association studies are that time consuming and expensive development of bi-parental crosses is not needed and a wider gene pool can be analyzed (Neumann et al. 2010). Genome-wide association studies were carried out, e.g., on traits like days to heading, culm diameter, leaf length and width in sorghum using SSR markers (Shehzad et al. 2009). Association mapping with Diversity Array Technology markers (DArT) was reported for barley (Pswarayi et al. 2008) and wheat (Crossa et al. 2007). The disadvantages are that DArT markers are bi-allelic and dominant and are based on unknown sequences (Mace et al. 2008). However, compared to SSRs, DArT markers allow a cost efficient and fast genome-wide genotyping.

The objective of the present study was to detect marker–trait associations for emergence across different temperature regimes in sorghum. The process of emergence in the different temperature regimes was described by cumulative emergence percentages (CEP) over time in order to derive traits like FEP, T 100 − T 1 and ER from piecewise linear regressions carried out on CEP. Since superior genotypes show high emergence percentages in a wide range of environments while emergence takes place shortly after sowing and all plants emerge nearly at the same time, the parameters FEP, ER, T 1, T 50, T 100 and T 100 − T 1 are relevant. To evaluate the temperature effect on emergence, across environment means (M) and Finlay–Wilkinson slopes (FW) (Finlay and Wilkinson 1963) were estimated. FEP was computed separately for low and normal temperatures and base temperature (T b) and thermal time (E TS) were calculated based on T 50 data. Genome-wide association studies were carried out on these parameters.

Materials and methods

Plant material

The study was carried out on a diverse set of genotypes comprising 194 biomass sorghum lines. The set includes Sorghum bicolor and S. bicolor sudanense genotypes. DNA was extracted from leaf tips using the cetyl trimethylammonium bromide (CTAB) method. The genotypes were fingerprinted with 688 polymorphic DArT markers. Marker positions were taken from Mace et al. (2008). Unmapped markers and completely linked markers were excluded from the study and further 115 markers with frequencies <5 % of the rare allele were also removed. Association studies were carried out using the remaining 171 polymorphic DArTs.

Experimental design and data collection

The experiment was conducted in growth chambers set to nine temperature regimes ranging from 9.4 to 19.9 °C. Overall mean, mean night and day air and soil temperatures are shown in Table 1. A mean temperature of 9.4 °C was used as the lowest temperature treatment since pretests on a population subset revealed that the base temperature of emergence is expected to be higher than 8 °C and lower than 11 °C for most of the lines. Air and soil temperatures were measured every 5 min directly above the trays and at 10 mm depth using TinyTag View 2 data loggers (Gemini Data Loggers Ltd., West Sussex, UK) during the entire duration of the study.

Table 1 Average daily mean, night and day air and mean soil temperatures in the nine temperature treatments

Individual temperature regimes were arranged as randomized complete block designs with two replications. Light was applied for 12 h with 10 h full light and 2 h twilight. The genotypes were sown in trays filled with 50 % Klasmann Potgrond P (Klasmann-Deilmann, Groß-Hesepe, Germany) and 50 % loamy humic sand. A total of 18 seeds per line, treatment and replication were sown at 10 mm depth.

The number of emerged seeds was counted daily until no further seeds emerged. A plant was defined as emerged if the coleoptile was visible. Cumulative emergence percentage (CEP) for each day was calculated using the following equation:

$$ {\text{CEP}} = \sum {\text{NES}}_{i} /18 \times 100 $$
(1)

where NES i is the number of seeds emerged on day i and 18 is the total number of seeds. Mean CEPs of the two replications were calculated and used for parameter estimation.
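As a minimal sketch of Eq. 1 and the averaging over replications, the following Python snippet accumulates daily emergence counts into CEP and averages two replications day by day. The function names and the daily counts are hypothetical and serve only as an illustration; the study itself does not provide code.

```python
from itertools import accumulate

SEEDS_SOWN = 18  # seeds per line, treatment and replication (from the text)


def cumulative_emergence_percentage(daily_counts, seeds_sown=SEEDS_SOWN):
    """Return CEP (%) for each day, given newly emerged seedlings per day (Eq. 1)."""
    return [100.0 * total / seeds_sown for total in accumulate(daily_counts)]


def mean_cep(rep_a, rep_b):
    """Average the CEP curves of two replications day by day."""
    return [(a + b) / 2.0 for a, b in zip(rep_a, rep_b)]


# Hypothetical daily counts for one line in one temperature regime
rep1 = cumulative_emergence_percentage([0, 0, 3, 6, 6, 3, 0])
rep2 = cumulative_emergence_percentage([0, 3, 3, 6, 3, 0, 0])
cep = mean_cep(rep1, rep2)
```

The last element of `cep` corresponds to the final emergence percentage of the averaged curve.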

Data analysis

A piecewise linear regression was fitted to cumulative emergence percentages (Fig. 1) in order to derive the parameters onset of emergence (T 1), median time to emergence (T 50), emergence rate (ER) and end of emergence (T 100) using SAS 9.1 (SAS Institute Inc., Cary, NC, USA). The equation used was:

$$ \begin{aligned} & {\text{CEP}} = 0 \quad t \le T_{1} \\ & {\text{CEP}} = {\text{ER }}\left( {t - T_{1} } \right) \quad T_{1} < t < T_{100} \\ & {\text{CEP}} = {\text{ER }}\left( {T_{100} - T_{1} } \right) \quad t \ge T_{100} , \\ \end{aligned} $$
(2)

where t is the actual number of days from sowing (DAS). CEP equals the final emergence percentage (FEP) if t ≥ T 100. The regression slope between T 1 and T 100 is the estimator for the daily emergence rate (ER). T 50 was estimated as follows:

$$ T_{50} = T_{1} + 0.5{\text{ FEP}}/{\text{ER}}. $$
(3)
Fig. 1

Piecewise linear regression for calculating onset (T 1) and end (T 100) of emergence, uniformity (T 100 − T 1), emergence rate (ER) (regression slope) and the median of emergence time (T 50) (a) and T 1, T 100, T 100 − T 1, T 50 and FEP of the population mean in nine temperature treatments (b)
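The piecewise model of Eq. 2 and the T 50 estimate of Eq. 3 can be sketched in Python as follows. The fitting routine is an assumption made for illustration (a simple grid search over T 1 and T 100 with a closed-form least-squares estimate of ER); the study used SAS 9.1, and its exact fitting procedure is not described here. All function names and the synthetic data are hypothetical.

```python
def cep_model(t, t1, t100, er):
    """Piecewise linear CEP (Eq. 2): 0 up to T1, linear rise, plateau (FEP) from T100."""
    return er * min(max(t - t1, 0.0), t100 - t1)


def fit_piecewise(days, cep, step=0.5):
    """Grid search over (T1, T100); ER is the least-squares slope for each pair."""
    n_steps = int((max(days) - min(days)) / step) + 1
    grid = [min(days) + k * step for k in range(n_steps)]
    best = None
    for t1 in grid:
        for t100 in grid:
            if t100 <= t1:
                continue
            # Regressor: time elapsed since T1, capped at the plateau
            g = [min(max(t - t1, 0.0), t100 - t1) for t in days]
            denom = sum(v * v for v in g)
            if denom == 0.0:
                continue
            er = sum(y * v for y, v in zip(cep, g)) / denom
            sse = sum((y - er * v) ** 2 for y, v in zip(cep, g))
            if best is None or sse < best[0]:
                best = (sse, t1, t100, er)
    _, t1, t100, er = best
    fep = er * (t100 - t1)
    t50 = t1 + 0.5 * fep / er if er else float("nan")  # Eq. 3
    return {"T1": t1, "T100": t100, "ER": er, "FEP": fep, "T50": t50}


# Synthetic curve generated from known parameters (hypothetical values)
days = list(range(0, 26))
observed = [cep_model(t, 10.0, 18.0, 12.0) for t in days]
params = fit_piecewise(days, observed)
```

Because FEP equals ER (T 100 − T 1), Eq. 3 places T 50 exactly midway between T 1 and T 100, so uniformity (T 100 − T 1) follows directly from the fitted parameters.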

Time span of emergence or uniformity of emergence was defined as T 100 − T 1. For comparing the genotypes over a series of environments, stability analysis was carried out according to Finlay and Wilkinson (1963). Genotype performance across environments was estimated by regressing individual genotypes against the population mean:

$$ Y_{ij} = \mu + \beta_{i} e_{j} + g_{i} $$
(4)

where μ is the overall population mean, β i is the linear regression coefficient for the ith genotype, e j is the effect of the jth environment and g i is the effect of the ith genotype.

Data was subjected to analysis of covariance using the following model for \( i = 1,2,3, \ldots ,k \) genotypes and \( j = 1,2,3, \ldots ,n \) environments:

$$ Y_{ij} = \mu + \tau_{i} + \beta x_{ij} + \gamma_{i} x_{ij} + \varepsilon_{ij} $$
(5)

where \( \mu_{i} = \mu + \tau_{i} \) is the intercept of the ith genotype, \( {\text{FW}}_{i} = \beta + \gamma_{i} \) is the slope of the performance of genotype i in nine environments regressed against the population means of the environments \( x_{ij} \) (Finlay and Wilkinson 1963), and \( \varepsilon_{ij} \) is the random error of the ith genotype in the jth environment. Analysis of covariance was carried out on the parameters \( T_{1} , \, T_{50} , \, T_{100} ,T_{100} - T_{1} \), ER and on FEP. FEP was arcsine-square root transformed prior to carrying out analysis of covariance.
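The stability analysis can be illustrated with a short Python sketch that regresses each genotype's trait values on the environment means and reports the across-environment mean (M) and the Finlay–Wilkinson slope (FW). Fitting a separate ordinary least-squares line per genotype is a simplification assumed here; it yields the same genotype-specific slopes as a covariance model with genotype-specific intercepts and slopes. The data and names are hypothetical, and the arcsine-square root transformation of FEP is omitted for brevity.

```python
def ols(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def finlay_wilkinson(trait):
    """trait[g][j] is the value of genotype g in environment j.

    Returns {g: (M, FW)} with M the across-environment mean and FW the slope
    of genotype g regressed on the population means of the environments.
    """
    genotypes = list(trait)
    n_env = len(trait[genotypes[0]])
    env_means = [
        sum(trait[g][j] for g in genotypes) / len(genotypes) for j in range(n_env)
    ]
    result = {}
    for g in genotypes:
        _, slope = ols(env_means, trait[g])
        result[g] = (sum(trait[g]) / n_env, slope)
    return result


# Hypothetical FEP values (%) of three lines in four temperature regimes
fep = {
    "line_A": [60.0, 70.0, 80.0, 90.0],
    "line_B": [40.0, 60.0, 80.0, 100.0],
    "line_C": [80.0, 80.0, 80.0, 80.0],
}
stability = finlay_wilkinson(fep)
```

In this toy example `line_C` has a slope of 0, i.e., its FEP does not respond to temperature, whereas `line_B` responds twice as strongly as the population average.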

Mean final emergence percentage over environments 1, 2, 3, and 4 was considered as FEP under low temperature conditions (FEPcold) while FEP of environments 7, 8, and 9 was averaged to define FEP under normal conditions (FEPnormal). Two factorial analysis of variance (proc GLM SAS 9.1) was carried out on arcsine-square root transformed data considering FEPcold and FEPnormal as two treatments with the individual temperature regimes as 4 or 3 replications.

Linear regression analysis was carried out on developmental rates (1/T 50) of the 9 temperature regimes. Base temperature (T b) was estimated by linear extrapolation to define the temperature at which the development rate becomes 0:

$$ T_{bi} = - \beta_{0i} /\beta_{i} . $$
(6)

where β i is the regression slope and β 0i is the y-axis intercept of the ith genotype. The temperature sum required for emergence (E TS) was defined as 1/β i .
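A minimal Python sketch of Eq. 6, assuming a plain least-squares regression of the development rate 1/T 50 on mean temperature: the base temperature is the x-axis intercept and the thermal time is the inverse of the slope. The temperatures and T 50 values below are synthetic and hypothetical; they were generated from T b = 7 °C and E TS = 60 °C day so that the calculation can be checked.

```python
def base_temperature_and_thermal_time(temps, t50_days):
    """Regress development rate (1/T50) on temperature.

    Returns (Tb, ETS): Tb = -intercept / slope (Eq. 6) and ETS = 1 / slope.
    """
    rates = [1.0 / d for d in t50_days]
    n = len(temps)
    mean_t, mean_r = sum(temps) / n, sum(rates) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(temps, rates))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_t
    return -intercept / slope, 1.0 / slope


# Synthetic example: T50 = ETS / (T - Tb) with Tb = 7 °C and ETS = 60 °C day
temps = [9.4, 11.0, 13.0, 15.0, 17.0, 19.9]
t50_days = [60.0 / (t - 7.0) for t in temps]
tb, ets = base_temperature_and_thermal_time(temps, t50_days)
```

The extrapolation is only meaningful while 1/T 50 increases linearly with temperature over the sampled range.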

Pearson’s correlation coefficients were calculated between parameters and across environment means of traits. Variance components were assessed using restricted maximum likelihood (REML) estimates (proc MIXED, SAS 9.1). Broad sense heritability (h 2) was calculated according to Hill et al. (1998):

$$ h^{2} = \frac{{\sigma_{\text{G}}^{2} }}{{\sigma_{\text{G}}^{2} + \sigma_{{{\text{G}} \times {\text{E}}}}^{2} \frac{1}{n} + \sigma^{2} \frac{1}{n}}} $$
(7)

where \( \sigma_{\text{G}}^{2} \) is the genotypic variance, \( \sigma_{{{\text{G}} \times {\text{E}}}}^{2} \) is the genotype × environment interaction variance, σ 2 is the error variance, and n is the number of environments.
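As a worked illustration of Eq. 7, the following Python function computes broad sense heritability from variance components. The variance components in the example are hypothetical numbers chosen for easy arithmetic, not estimates from the study; only n = 9 environments is taken from the text.

```python
def broad_sense_heritability(var_g, var_gxe, var_error, n_env):
    """Eq. 7: h2 = var_G / (var_G + var_GxE / n + var_error / n)."""
    return var_g / (var_g + var_gxe / n_env + var_error / n_env)


# Hypothetical variance components with n = 9 environments
h2 = broad_sense_heritability(var_g=8.0, var_gxe=9.0, var_error=9.0, n_env=9)
```

With these values the denominator is 8 + 1 + 1 = 10, giving h 2 = 0.8.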

The population structure of 194 individuals was determined using the software package STRUCTURE assuming an admixture model (Pritchard et al. 2000). We used a burn-in phase of 10,000 iterations followed by 10,000 Markov chain Monte Carlo iterations in order to detect the “true” number of K groups in the range of K = 1–20 possible groups. δK was calculated according to Evanno et al. (2005). The cluster analysis was carried out with TASSEL 2.01 using the neighbor-joining method.

Linkage disequilibrium (LD) parameters were estimated by using the software TASSEL 2.01 (Bradbury et al. 2007). The p values of pairwise LD were computed using 1,000 permutations. LD was calculated for all pairs of loci. The critical R 2 for unlinked loci was estimated after square root transformation of the R 2 values (Breseghello and Sorrells 2006). The 95 % percentile of this distribution is the threshold beyond which LD was likely to be caused by genetic linkage. A second-degree LOESS curve was plotted through the R 2 data and the point of intersection with the threshold value was used as the genome-wide estimate of LD among loci (Breseghello and Sorrells 2006).

TASSEL 2.01 (Bradbury et al. 2007) was used for identifying significant associations between the 171 markers and a total of 16 traits. The data were subjected to both a general linear model (GLM) and a mixed linear model (MLM) (Zhang et al. 2010). The Q-matrix, which shows the probability that a genotype belongs to a subpopulation, was estimated with STRUCTURE and used in both models. A kinship matrix was computed with TASSEL 2.01 and used in MLM. An F test with 1,000 permutations was carried out in order to adjust p values of GLM (Churchill and Doerge 1994).

For verification of significant marker–trait associations, the population was divided into two subpopulations at each relevant locus according to the allelic state of the individuals and pairwise t tests (p < 0.05) were performed in order to test if the marker genotypes differ significantly for the respective trait.

Results

Figure 1b shows that FEP increased with increasing temperature. FEP of the population mean was 95.7 % at 19.9 °C but less than 80 % if air temperature was below 10.8 °C. Lowest FEP was 35.8 % at 9.4 °C. Mean FEPcold was 61.6 % and ranged between 12.5 and 93.1 % while mean FEPnormal was 93.3 % (Table 2).

Table 2 Genotype mean, minimum and maximum across all nine environments and the mean, minimum, maximum and average R 2 of Finlay–Wilkinson slopes for the parameters final emergence percentage (FEP), emergence rate (ER), onset (T 1) and end (T 100) of emergence, the median of emergence time (T 50) and uniformity (T 100 − T 1). Genotype mean, minimum, and maximum for FEP under cold and normal conditions, base temperature (T b) and thermal time (E TS)

T 100 − T 1 decreased with increasing temperatures and both onset and end of emergence occurred earlier at higher temperatures. Mean T 100 − T 1 was 8.1 days across all temperature regimes (Table 2). Emergence started on average 10.2 DAS and ended 18.3 DAS. The population mean of ER averaged over environments was 15.4 % day−1 and ranged between 6 % day−1 at 9.4 °C and 27 % day−1 at 19.9 °C. T 50 of the population mean was reached 18.5 DAS at 9.4 °C and 11 DAS at 19.9 °C. T b ranged from 5.1 to 8.7 °C; mean E TS was 54.2 °C day and ranged between 41.6 and 93.3 °C day.

Analysis of covariance revealed that both the genotype and the genotype × environment interaction (GEI) effect were significant for all analyzed traits (Table 3). Genotype effects were highly significant for FEP, T 1, T 50 and ER (p < 0.001) but also significant for T 100 (p < 0.05) and T 100 − T 1 (p < 0.01). Estimated h 2 was highest for FEP (0.92) (Table 4). For all other traits h 2 ranged between 0.73 and 0.86. Analysis of variance for FEPcold and FEPnormal revealed that genotype and temperature effects were significant while genotype × temperature interaction effect was not statistically significant (p = 0.07) (Table 5).

Table 3 Covariance analysis for final emergence percentage (FEP), emergence rate (ER), onset (T 1) and end (T 100) of emergence, the median of emergence time (T 50) and uniformity (T 100 − T 1)
Table 4 Variance components and heritability of final emergence percentage (FEP), emergence rate (ER), onset (T 1) and end (T 100) of emergence, the median of emergence time (T 50) and uniformity (T 100 − T 1)
Table 5 Analysis of variance (ANOVA) for final emergence percentage under cold (FEPcold) and normal (FEPnormal) conditions

Results of FEPcold and FEPnormal, stability analysis for FEP, T 1, T 50, T 100, ER, and T 100 − T 1 as well as T b and E TS are shown in Fig. 2. For FEP (b) and ER (g), a high genotype mean and a small FW illustrate the superiority of a genotype. A small genotype mean and FW are desirable for the traits T 1 (c), T 50 (d), T 100 (e) and T 100 − T 1 (f). Ranges of FW are shown in Table 2. The highest variation of FW among genotypes was observed for ER. ER(FW) ranged from 0.3 to 2.4. Average R 2 for FW ranged between 0.77 for T 100 − T 1 and 0.98 for T 50 (Table 2). A low T b in combination with a short E TS indicates a desirable genotype (Fig. 2h).

Fig. 2

Final emergence percentage under cold (FEPcold) and normal (FEPnormal) conditions (a), Finlay–Wilkinson regression for calculating Finlay–Wilkinson slope (FW) and across environment mean (M) for FEP (b), onset (T 1) (c), median time to emergence (T 50) (d), end of emergence (T 100) (e), uniformity (T 100 − T 1) (f) and emergence rate (ER) (g) in nine environments and the relationship between development rates and mean air temperatures for calculating base temperature (T b) and thermal time for emergence (E TS) (h). E TS is the inverse of regression slope (sl). Filled symbols indicate the genotype with highest FEP (a, b) or emergence rate (g) and development rates of T 50 (h) or the shortest duration of T 1 (c), T 50 (d), T 100 (e) and uniformity (f), respectively. Open symbols represent the worst performing genotype. Selection was done for each trait separately while selection criterion for T b and E TS was T 50

Correlations between mean genotype performance and FW were significant for T 1, T 50 and T 100 (ESM_Table 1) while the correlation between FEP(M) and FEP(FW) was not statistically significant. T b was significantly correlated to T 50(FW) (0.54, p < 0.001) and E TS (−0.78, p < 0.001) while correlations between T 50(FW) and E TS were not statistically significant.

The maximum value of δK occurred at K = 2. Accordingly, each of the 194 lines was assigned to one of the K = 2 groups: 54 lines (28 %) belong to group 1 while the remaining 140 lines (72 %) belong to group 2 (Fig. 3a). Most genotypes of group 1 are members of the S. bicolor sudanense clusters of Fig. 3b. These are the clusters from genotype 9 to 189 at the bottom and from genotype 117 to 28 on the right-hand side.

Fig. 3

Population structure (a) and neighbor-joining dendrogram (b) of the diversity set. The population structure shows two distinct groups: Group 1 is represented by grey boxes (a) or a grey ellipse (b) and group 2 is represented by white boxes

LD in relation to the genetic distance of marker pairs on the same chromosome is shown for the whole population (Fig. 4b) and for the two subpopulations group 1 and 2 (Fig. 4c, d). LD of marker pairs from different chromosomes is illustrated by box and whisker plots. For the whole population, significant LD (p < 0.05) was observed for 723 marker pairs (50.6 %) located on the same chromosome. Mean R 2 for all intrachromosomal marker pairs was 0.08. Group 2 showed fewer marker pairs (19.5 %) significantly in LD than group 1 (28.5 %). Mean R 2 for all intrachromosomal marker pairs was 0.08 in group 1 (1,427 marker pairs) and 0.05 in group 2 (1,222 marker pairs). The critical R 2 value was 0.53 for the whole population and 0.54 and 0.40 for group 1 and 2, respectively. Beyond this value, LD was likely to be caused by genetic linkage. Mean distance of marker pairs showing an LD beyond this threshold was 13.2 cM in the whole population, while in the groups mean distance was 30 cM (group 2) and 24.4 cM (group 1). The LOESS curve did not cross the critical R 2 baseline in any case, which suggests that LD decayed fast. Another indicator of fast LD decay is that mean R 2 remained below 0.15 if the distance was larger than 8 cM (Fig. 4a).

Fig. 4

Mean R 2 values for different centimorgan (cM) classes (a). Linkage disequilibrium parameter R 2 plotted against the genetic distance in cM for the whole population (b), group 1 (54 genotypes) (c) and group 2 (140 genotypes) (d). The bottom black line shows the second-degree LOESS curve. Box plots show the distribution of R 2 derived from pairwise LD of unlinked loci, dotted lines indicate the median and straight lines represent the mean. Boxes show the 25 and 75 % percentile, whiskers the 95 and 5 % percentile. The critical R 2 is given by horizontal black lines

A comparison of both methods, GLM without permutation test and MLM, using the rank sum method according to Stich et al. (2008) shows that the mean squared difference (MSD) between observed and expected p-values was higher for GLM than for MLM for all traits (Table 6).
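The MSD criterion can be sketched in Python as follows. The sketch assumes that the expected p-values are the uniform quantiles i/n of the ordered observed p-values; this is an assumption about the calculation, not a detail given in the text, and the p-values in the example are hypothetical.

```python
def mean_squared_difference(p_values):
    """MSD between ordered observed p-values and their expected values.

    Assumption: under the null hypothesis the i-th smallest of n p-values
    is expected to be i / n (uniform distribution).
    """
    ordered = sorted(p_values)
    n = len(ordered)
    return sum((p - (i + 1) / n) ** 2 for i, p in enumerate(ordered)) / n


# Hypothetical p-values: a well-calibrated model and an inflated one
calibrated = mean_squared_difference([0.25, 0.5, 0.75, 1.0])
inflated = mean_squared_difference([0.001, 0.002, 0.01, 0.02])
```

A smaller MSD indicates that the observed p-values follow the expected distribution more closely, i.e., fewer spurious associations caused by population structure.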

Table 6 Mean squared difference (MSD) of the general linear model (GLM) without permutation test and the mixed linear model (MLM), number of significant marker–trait associations using GLM with permutation test and MLM, and number of significant marker–trait associations using both models

Table 6 shows the number of significant marker–trait associations identified with GLM after carrying out the permutation test and MLM. A total of 102 marker–trait associations was congruently detected by both models while 174 loci were significantly associated to one of the analyzed traits using MLM and 196 loci using GLM. The highest number of significant marker–trait associations was detected for E TS and T 1(M). Application of GLM revealed 39 marker–trait associations for E TS while 14 marker–trait associations were detected using MLM. Only 11 of the loci turned out to be significant in both models. Only 3 loci were significant applying both models on T 100 − T 1(M) data.

ESM_Table 2 shows marker–trait associations that were significant using both GLM and MLM models. Means and standard deviations for the trait values of the two groups of marker genotypes are shown. The common allele is defined as the predominant allele. Pairwise t tests comparing the rare and common allele revealed that 17 of the marker–trait associations that were significant in both models, GLM and MLM, were not significant according to the t test. For example, the traits T 1(M), T 1(FW) and T 50(M) were associated with marker loci sPb-6748 and sPb-3298 on chromosome SBI-09 according to GLM and MLM, but the t test showed no significant difference between trait values of the marker-genotype groups.

If marker–trait associations not significant according to the t test are excluded, a total of 85 marker–trait associations remain for 16 traits (Fig. 5). Of these, 24 markers were associated with only one trait and 20 markers were associated with between two and six traits. A total of 42 temperature response QTL, i.e., marker–trait associations for FW, T b and E TS, were found, of which 14 were detected on chromosome SBI-08. Some marker–trait associations were detected for genotype mean and FW at the same position, e.g. for T 100 − T 1 on chromosome SBI-01, T 1 on chromosomes SBI-03, SBI-06 and SBI-09, T 50 on chromosomes SBI-03, SBI-04 and SBI-09 and T 100 on chromosomes SBI-04, SBI-08 and SBI-09. Marker–trait associations for FEP(M) and FEP(FW) did not co-localize. Four marker–trait associations for FEP(M) were located on chromosome SBI-01 between 25 and 66 cM. Marker–trait associations for FEP(FW) were found on SBI-03, SBI-04, SBI-05, SBI-08, SBI-09 and SBI-10. Co-localization of marker–trait associations for FEPcold and FEP(M) was detected on chromosomes SBI-01 and SBI-03 while marker–trait associations for FEPnormal did not co-localize with FEPcold, FEP(M) or FEP(FW). Marker–trait associations for T b and T 50(FW) were co-located on chromosomes SBI-01 and SBI-08, while E TS and ER(FW) were associated with sPb-0258, sPb-1661 and sPb-1881 on chromosome SBI-08.

Fig. 5

Marker–trait associations for final emergence percentage (FEP) under cold (checkered square) and normal (striped square) conditions and for genotype mean (filled symbols) and Finlay–Wilkinson slope (unfilled symbols) of FEP (square), onset (hexagon) and end (diamond) of emergence, uniformity (right angled triangle), emergence rate (circle) and the median of emergence time (upright triangle) and marker–trait associations for base temperatures (star) and thermal time of T 50 (sigma)

A positive effect of the rare allele on across environment means (reducing T 1, T 100 and T 100 − T 1 and increasing FEP and ER) was observed for 21 markers. The rare allele of sPb-7795 on chromosome SBI-03 increased FEP(M) while the rare allele of sPb-7290 on chromosome SBI-06 caused an earlier T 1(M). Reducing T 100 − T 1(M) was associated with the rare alleles of sPb-3801, sPb-4081 and sPb-9894 on SBI-01, SBI-02 and SBI-03 while sPb-4081 and sPb-1925 on chromosome SBI-02 increased ER(M).

Discussion

Statistical models and crop models for QTL detection

The objective of the present study was to identify marker–trait associations for sorghum emergence under a broad range of temperature regimes. Sorghum cultivation in temperate climates requires the development of genotypes with high FEP under both low and optimum temperature conditions. Such ideotypes should emerge uniformly and shortly after sowing. The latter makes it necessary to understand emergence as a process, which is described best by CEP, allowing a precise estimation of T 1, T 100, T 50, ER and T 100 − T 1. Using piecewise linear regressions allowed us to describe the variability of the emergence process of 194 sorghum genotypes grown in nine temperature regimes, whereas it was not possible to generate model parameters for all genotypes in every environment if the Weibull function was applied (data not shown). Generally, Weibull is highly recommended for describing germination and emergence data (Brown and Mayer 1988b). However, simplicity and flexibility as well as independence and biological interpretability of all parameters (Vieth et al. 1989) make piecewise linear regressions the model of choice, enabling the direct comparison of largely contrasting genotypes cultivated under a broad range of environmental conditions (Trudgill et al. 2000).

For marker-assisted selection it is necessary to identify QTLs that are stable across environments (Burow et al. 2011). Under situations of environmental stress, reproducibility of phenotypic data and QTL detection are low. Traits that are highly influenced by environmental factors cannot, by definition, produce the same results in different environments. A challenge is to carry out QTL analysis directly on parameters of the response curves of a trait to its influencing factors. Thus, genetic dissection of adaptation processes is done best by using mathematical functions (e.g., growth functions) for QTL detection (Reymond et al. 2003). Stability parameters are good statistical estimators for such a situation and they have already been used to distinguish between QTL for the trait itself and for GEI effects (Kraakman et al. 2004; Lacaze et al. 2009).

In contrast to stability parameters, T b and E TS are broadly used crop-modeling parameters and theoretically can be used for predicting mean emergence time of any genotype in different environments. Predicting the performance of different genotypes in different environments is a major goal for combining crop-modeling approaches with quantitative genetic analyses. However, the use of stability parameters has several advantages. Detailed environmental and climatic data is lacking in many state of the art breeding trials. Multi-environment trials with many factors, which cannot be controlled completely (e.g., temperature, soil type and structure, rainfall), are commonly used to carry out stability analyses. The linear regression model for estimating T b and E TS works since 1/T 50 data of all genotypes is within the linear increase of emergence time in relation to temperature for the sampled environments. This does not hold true for all traits. Functions that fit the temperature response of FEP of individual genotypes used in the present study would include exponential, linear, and monomolecular ones since not all temperatures from the minimum to the optimum for the individuals were sampled. As parameters of different functions (e.g., exponential and monomolecular) cannot be used simultaneously for QTL detection, stability parameters, which have the disadvantage not to represent real physiological responses to the environment, are an adequate compromise in many situations (El Soda et al. 2010). Stability analyses applied on data from controlled environments, e.g., varying only in temperature, are not common but have the main advantage that different reactions of genotypes can be traced back to a single influencing factor. In conclusion, QTL for FW in our study are truly temperature response QTL, which neither interacted with nor were affected by other environmental variables.

In our study, 11 of 32 marker–trait associations for FW co-locate with genotype mean performance QTL of the analogous trait. The only trait with no co-localization of FW and genotype mean performance was FEP. According to Kraakman et al. (2004) the co-localization of QTL for mean trait performance and stability parameters is an indicator for genotypic differences in the allelic sensitivity, while QTL for stability parameters which are far from any QTL for the trait itself, suggest a gene regulatory network in which adaptive genes switch on or off the constitutive genes influenced by the environment. In the present study, there is always a strong positive correlation between FW and mean genotype performance of T 1, T 50, and T 100, since all genotypes emerge relatively fast under favorable conditions, while those with a low cold tolerance, emerge late at low temperatures. The same relation holds also true for T 100 − T 1 but not for FEP. The absence of co-localization of FW and mean genotype performance QTL for FEP suggests a gene regulatory network but another reason could be that FEP is influenced by seed quality. Negative effects on seed quality may result from some extremely late flowering genotypes, i.e., seeds may not have reached maturity at harvest time and immature seeds have a reduced FEP (Shepard et al. 1996). In contrast to FEP, time of emergence includes only those seeds, which do emerge, and FEP QTL may also include QTL for flowering time. A common approach to separate seed quality QTL from QTL for cold tolerance is to use relative values (FEP at low temperatures over FEP at normal temperatures). The use of FW can be seen as a different approach to correct FEP data for differences in seed quality. A small slope indicates that FEP is not or only slightly affected by temperature regardless of FEP at higher temperatures, i.e., cold tolerant genotypes have a small FEP(FW) value.

Genome regions affecting the germination process

Results of the present study confirm earlier QTL studies and show that the most promising regions for emergence and cold tolerance during emergence are located on SBI-01 (Knoll et al. 2008; Burow et al. 2011). Burow et al. (2011) found QTL for early emergence close to Xtxp350, in the same region as the QTL for FEP(M) and FEPnormal at DArT markers sPb-3891 and sPb-2583, respectively (Fig. 6). Knoll et al. (2008) described QTL for early field emergence flanked by SSR markers OPA19 and umc83. The latter was mapped close to sPb-8947 (Mace et al. 2008), which is associated with FEP(M) in the present study. Another interesting region lies between Xtxp043 and Xtxp032, where sPb-0090 was mapped according to Mace et al. (2008). sPb-0090 is associated with FEP(M), and Xtxp043 is a flanking marker of a QTL detected by Burow et al. (2011). High-resolution SNP maps allowing regional association studies are needed to identify candidate genes within these important QTL regions.

Fig. 6
Alignment of the genetic map of chromosome SBI-01 from Burow et al. (2011) (a), the present study (b) and Knoll et al. (2008) (c). Underlined and italic markers represent flanking markers of QTL for early field emergence or germination at 30 °C (a) and early vigor, early and late emergence (b). Symbols indicate final emergence percentage over all environments (filled square), under cold (checkered square) and normal conditions (striped square)

The rare allele of sPb-3801 on SBI-01 reduces T 100 − T 1(M) and T 100 − T 1(FW) and thus homogenizes the emergence process. Improving emergence percentage and uniformity leads to a better canopy establishment resulting in higher and more stable yields (Cisse and Ejeta 2003). However, low seedling vigor and a prolonged juvenile development at low temperatures may lead to a delayed canopy closure and yield reduction despite high FEPs and uniformly emerging seeds. Thus, improving cold tolerance of a crop is not simply done by improving seed emergence. QTL for early vigor and field emergence were identified between Xtxp043 and Xtxp032 by Knoll et al. (2008). QTL regions affecting emergence and seedling vigor at the same time in the same direction may be the most promising ones for improving the cold tolerance of a crop.

The QTL for T b on SBI-01 is difficult to interpret since the marker allele that decreases T b increases E TS. The parameters are negatively correlated, and both T b and E TS depend on the regression slope β i (Eq. 6) of the rates of development of T 50 regressed against temperatures. If a marker allele has an effect on E TS and the intersection between the regression lines of the negative and the positive allele is >0, the increase in E TS leads to a decreasing T b and vice versa. Selection for T b only makes sense if E TS is not significantly affected (Fig. 2h) or is positively affected (intersection < 0) at the same time.
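
The dependence of both parameters on the regression slope can be made explicit with a small sketch. It assumes the common thermal-time form 1/T 50 = (T − T b)/E TS, so that E TS is the reciprocal of the slope and T b is the temperature at which the fitted line crosses zero; Eq. 6 may be parameterized differently. The two allele classes and all numbers are hypothetical.

```python
import numpy as np

def thermal_time_params(temps, t50):
    """Fit 1/T50 = intercept + slope * temperature; return (T_b, E_TS)."""
    rate = 1.0 / np.asarray(t50, dtype=float)
    slope, intercept = np.polyfit(np.asarray(temps, dtype=float), rate, 1)
    return -intercept / slope, 1.0 / slope

temps = np.array([12.0, 16.0, 20.0, 24.0])   # hypothetical temperature regimes (deg C)
# Hypothetical T50 (days) generated from T_b = 8, E_TS = 50 and T_b = 6, E_TS = 60
t50_allele_a = 50.0 / (temps - 8.0)
t50_allele_b = 60.0 / (temps - 6.0)

tb_a, ets_a = thermal_time_params(temps, t50_allele_a)
tb_b, ets_b = thermal_time_params(temps, t50_allele_b)

# Temperature and rate at which the two regression lines intersect
t_cross = (tb_a / ets_a - tb_b / ets_b) / (1.0 / ets_a - 1.0 / ets_b)
rate_cross = (t_cross - tb_a) / ets_a
print(tb_a, ets_a, tb_b, ets_b, rate_cross)
```

Because the two lines intersect at a positive rate in this example, the allele class with the larger E TS has the lower T b, which mirrors the situation described for the QTL on SBI-01.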

Knoll et al. (2008) detected a QTL for germination at high (30 °C) and low (13 °C) temperatures on SBI-03. This QTL region is not the same as the one we identified on SBI-03 for FEP. According to Mace et al. (2008), the flanking markers of the earlier identified QTL map to the large gap in our map of SBI-03. Nevertheless, the region between 4 and 5 cM on SBI-03 is a promising QTL region. The rare allele of sPb-7795 is associated with a positive effect on FEP(M) and FEPcold, and the rare allele of the very close marker sPb-5454 decreases FEP(FW), i.e., improves cold tolerance. Srinivas et al. (2009) detected QTL for maturity close to sPb-7795 on SBI-03 (Mace and Jordan 2011), which may support the hypothesis that maturity affects seed quality.

Zhang et al. (2005) detected two QTL for germination under low temperature on chromosomes 3 and 8 in a rice RIL population. Rice chromosomes 3 and 8 are widely homologous to SBI-01 and SBI-07 according to Ventelon et al. (2001) and following the nomenclature of Kim et al. (2005). A major QTL for germination at optimal temperatures was found on rice chromosome 2 in an F2 population (Li et al. 2011); chromosome 2 is globally homologous to SBI-04 (Ventelon et al. 2001). We found marker–trait associations for FEP(M) on SBI-01, SBI-03 and SBI-09 and for FEP(FW) on SBI-04. The rare allele of sPb-4851 decreases FEP(FW) and T b. A reduction of T b probably leads to smaller reductions of FEP under low temperatures.

QTL for maize germination percentage under low temperature conditions were identified on maize chromosome 4 (Hund et al. 2004). Liu et al. (2011) found QTL for maize germination percentage related to seed vigor on chromosomes 4, 7 and 10. We detected a QTL for FEP(FW) on SBI-04, but no QTL for FEP(M), FEPnormal, or FEPcold on SBI-04 and SBI-05, which contain regions homologous to maize chromosome 4 (Whitkus et al. 1992). We also detected a QTL for FEP(FW) on SBI-08, which carries regions homologous to maize chromosome 10. Another promising region on SBI-08 between 73 and 111 cM carries no FEP QTL but three marker–trait associations with a positive effect of the rare allele on ER(FW), E TS, one QTL for T b, and several QTL for traits related to emergence time. Limami et al. (2002) found QTL for T 50 on maize chromosome 2, which is homologous to regions of SBI-02 and SBI-06 (Whitkus et al. 1992), and on maize chromosome 4, which is homologous to regions of SBI-04 and SBI-05 (Whitkus et al. 1992). Our results show marker–trait associations for T 50(M) and/or T 50(FW) on SBI-04 and SBI-06. Possibly the same genes regulate cold tolerance during emergence of maize, sorghum and rice. In addition, the identification of candidate genes is required to provide more detailed information about the genetic background.

Power and reliability of QTL detection

LD of the present sorghum population decayed within 8 cM, while average marker distance was 8.7 cM and the largest gap between markers was 66 cM. Large gaps in combination with fast LD decay make it impossible to screen the whole genome for significant marker–trait associations. Mean LD values are useful but give no information about the local extent of LD, since LD varies strongly across the genome (Sorkheh et al. 2008) and also between groups of a population. We observed a higher mean R 2 and critical R 2 threshold for group 1 than for group 2. Possible reasons are the different population sizes of the groups and differences in the number of polymorphic markers between the groups. Mean R 2 of the whole population was higher than the values obtained by Bhosale et al. (2011) and Hamblin et al. (2004). Both studies included wild sorghum accessions. Wild sorghums have higher outcrossing rates than cultivated ones, and high outcrossing rates decrease the extent of LD.
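
For reference, the pairwise LD measure can be sketched as follows, assuming that R 2 denotes the usual squared allelic correlation between two biallelic markers scored as 0/1 in inbred lines. The marker vectors are invented and the function name is hypothetical.

```python
import numpy as np

def ld_r2(marker_a, marker_b):
    """Squared correlation between two polymorphic 0/1-coded markers."""
    a = np.asarray(marker_a, dtype=float)
    b = np.asarray(marker_b, dtype=float)
    p_a, p_b = a.mean(), b.mean()
    d = (a * b).mean() - p_a * p_b            # disequilibrium coefficient D
    return d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))

r2_linked = ld_r2([1, 1, 0, 0], [1, 1, 0, 0])        # identical scores
r2_independent = ld_r2([1, 1, 0, 0], [1, 0, 1, 0])   # no association
print(r2_linked, r2_independent)
```

Averaging such values over marker pairs within distance classes gives the kind of mean R 2 and decay estimates discussed here; the estimate depends on sample size and on which markers are polymorphic in a group.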

Different strategies like integrating the population structure (Pritchard et al. 2000) and familial relatedness (Yu et al. 2006) have been used to reduce false positive marker–trait associations. Kinship coefficients are used to correct association studies for familial relatedness and show the probability that homologous loci are identical by descent. The MLM approach takes both population structure and kinship matrix into account while GLM as implemented in TASSEL 2.01 uses only the population structure (Casa et al. 2008; Shehzad et al. 2009). Our results show that type I error rates of GLM are higher than those of MLM, which is in accordance with Neumann et al. (2010). Neumann et al. (2010) concluded that some associations can only be detected by GLM but, since GLM may result in many false positive marker–trait associations, both approaches GLM and MLM should be used together.

We observed that controlling GLM type I error rates with a permutation test (Churchill and Doerge 1994) reduces the number of detected marker–trait associations to a level similar to that of MLM. However, only approximately 50 % of the identified loci were detected by both methods, and a subsequent t test revealed that 16 of the shared marker–trait associations were not significant. In conclusion, even when population structure and/or familial relatedness are taken into account, both GLM and MLM may result in spurious marker–trait associations, and comparing the results of different models may presently be the most useful way of detecting reliable associations (Shehzad et al. 2009).
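
The permutation approach can be sketched as follows. This is a simplified illustration, not the TASSEL GLM: it ignores population structure and uses the squared marker–trait correlation as a stand-in test statistic. Phenotypes are shuffled across lines, the largest statistic over all markers is recorded for each shuffle, and the (1 − alpha) quantile of these maxima serves as the experiment-wise threshold. All data are simulated and all names are hypothetical.

```python
import numpy as np

def max_r2(genotypes, phenotype):
    """Largest squared marker-trait correlation over all markers.

    genotypes: lines x markers, 0/1 coded, every marker polymorphic.
    """
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    r2 = (g.T @ y) ** 2 / ((g ** 2).sum(axis=0) * (y @ y))
    return r2.max()

def permutation_threshold(genotypes, phenotype, n_perm=500, alpha=0.05, seed=1):
    """Experiment-wise threshold from the null distribution of the maximum."""
    rng = np.random.default_rng(seed)
    null_max = [max_r2(genotypes, rng.permutation(phenotype)) for _ in range(n_perm)]
    return float(np.quantile(null_max, 1.0 - alpha))

# Simulated example: 100 lines, 20 markers, marker 0 affects the trait
rng = np.random.default_rng(0)
geno = rng.integers(0, 2, size=(100, 20)).astype(float)
pheno = 2.0 * geno[:, 0] + rng.normal(size=100)

threshold = permutation_threshold(geno, pheno)
observed = max_r2(geno[:, :1], pheno)   # statistic of the causal marker alone
print(threshold, observed)
```

Taking the maximum over all markers in each permutation is what controls the error rate across the whole marker set rather than per marker.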

Conclusions

In accordance with previous studies, we conclude from the present work that one of the most promising regions for improving FEP is located on SBI-01. However, the time-point at which emergence occurs as well as the across-environment stability of FEP are likely to be regulated by distinct QTL regions. Piecewise linear regressions gave a good estimate of the emergence process of different genotypes. In combination with stability analysis, the emergence model precisely described the emergence process across different temperature regimes. This combination enabled the detection of QTL for GEI effects. An interesting alternative approach is to use physiologically more meaningful parameters like T b and E TS as input traits for QTL detection. A shift in T b without negatively affecting development processes is the most promising avenue to adapt crops to new cultivation areas with lower temperatures. However, the identification of stable markers and candidate genes for sorghum cold tolerance during emergence requires the development of high-density genetic maps.