Introduction

Multi-environment trials (MET) are needed in forest genetic testing because the suitability of forest species varies geographically (Dutkowski 2005). One of the most important phenomena in MET is that genotype-by-environment interaction (GEI) is often significant. Accurate assessment of GEI is critical for further breeding and for the subsequent deployment of tree varieties. Thus far, most agricultural MET analyses have adopted joint regression (Finlay and Wilkinson 1963), additive main effects and multiplicative interaction (AMMI) analysis (Gauch and Zobel 1997), or the genotype main effect plus genotype-by-environment interaction (GGE) biplot (Yan 2001). MET analysis generally includes genotype evaluation, test-environment evaluation, and mega-environment delineation. AMMI and the GGE biplot are equally capable of delineating mega-environments (Gauch et al. 2008). Furthermore, the GGE biplot offers a more complete and visual representation of genotype performance and stability and of the representativeness of test environments (Yan 2001). As far as we know, the application of AMMI and the GGE biplot to trees has been reported for only a few species, such as Pinus radiata (Ding and Wu 2008), Populus (Sixto et al. 2011), and Michelia chapensis (Wang et al. 2016).

Although AMMI and the GGE biplot are widely applied to phenotypic data in crops, they have some limitations: (1) they are limited to fixed-effect models; (2) the test environments should be homogeneous; and (3) the trial data should be balanced. First, homogeneity of the test environment is hardly achievable in forest trials, as confirmed by the present study, in which the test sites showed diverse patterns of spatial variation. Second, forest trials usually have missing data and are often highly imbalanced. Moreover, genetic parameters such as breeding values require a random-effect model to be estimated. These problems might therefore greatly restrict the application of AMMI and the GGE biplot in forest trials.

In recent years, the factor analytic (FA) method has been used in forest MET because it can handle highly imbalanced datasets (Costa e Silva et al. 2006; Costa e Silva and Graudal 2008; Cullis et al. 2014; Ivkovic et al. 2015; Chen et al. 2017). Moreover, the FA model can capture complex variance structures with a relatively small number of variance parameters (Kelly et al. 2007). Unlike AMMI and the GGE biplot, however, the FA method does not provide intuitive outputs such as the discrimination and representativeness of test environments or the yield performance and stability of test genotypes.

In this study, we present a compound approach that combines the FA method with AMMI and the GGE biplot in order to exploit the advantages of these methods in MET. We first used spatial analysis combined with the FA method to obtain the best linear unbiased prediction (BLUP) of each individual tree at each test site; AMMI and the GGE biplot were then employed for family evaluation, test-environment evaluation, and mega-environment delineation. This approach avoids the limitations of a fixed-effect model and the assumption of test-environment homogeneity, improves the results of AMMI and GGE biplot analysis, and may serve as a reference for future application in tree breeding programs.

Materials and methods

Experimental design

A total of 36 open-pollinated (OP) families of Pinus taeda were used in this study. The progeny trials followed a randomized complete block design (RCBD) with three replications and five trees per plot. The same design was implemented at all six sites. Detailed information on the test sites is summarized in Table 1. Diameter at breast height (DBH) of all trees was measured in 2010, at age 15.

Table 1 Details of six test sites in South China

Statistical analysis

The model for MET BLUP by FA method (Lin 2016) is expressed as

$$y_{ij} =\upmu + S_{i} + SG_{ij} + e_{ij}$$
(1)

where \(y_{ij}\) is the observation of the jth individual tree at the ith site, \(\upmu\) is the overall mean, \(S_{i}\) is the fixed effect of the ith site, \(SG_{ij}\) is the random interaction effect of the individual tree with the ith site, and \(e_{ij}\) is the random residual.

The variance matrix (G) of \(SG_{ij}\) could be fitted by FA method (Smith et al. 2001) as

$${\text{G}} = \left( {\Gamma \times\Gamma ^{T} +\Psi } \right) \otimes A$$
(2)

where \(\Gamma\) is the matrix of site loadings, \(\Psi\) is a diagonal matrix of specific variances for each site, \(\otimes\) is the Kronecker product, and A is the numerator relationship matrix of the individual trees.
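As an illustration of Eq. (2), the following minimal Python sketch assembles G from a loading matrix, a vector of specific variances, and a relationship matrix. It is not the ASReml code used in this study; the function names and all numerical values (three sites, one factor, two half-sib trees) are hypothetical and chosen only to make the structure visible.

```python
import numpy as np


def fa_site_covariance(loadings, specific_var):
    """Between-site genetic covariance, Gamma Gamma^T + Psi."""
    loadings = np.asarray(loadings, dtype=float)
    psi = np.diag(np.asarray(specific_var, dtype=float))
    return loadings @ loadings.T + psi


def fa_genetic_covariance(loadings, specific_var, relationship):
    """Full matrix G = (Gamma Gamma^T + Psi) kron A, as in Eq. (2)."""
    site_cov = fa_site_covariance(loadings, specific_var)
    return np.kron(site_cov, np.asarray(relationship, dtype=float))


# Hypothetical inputs: 3 sites, 1 factor, 2 half-sib trees.
gamma = np.array([[0.9], [0.7], [0.4]])   # site loadings
psi = np.array([0.10, 0.20, 0.30])        # site-specific variances
A = np.array([[1.00, 0.25],
              [0.25, 1.00]])              # numerator relationship matrix

G_sites = fa_site_covariance(gamma, psi)
G = fa_genetic_covariance(gamma, psi, A)

# Genetic correlations between sites implied by the FA structure.
sd = np.sqrt(np.diag(G_sites))
site_corr = G_sites / np.outer(sd, sd)
print(np.round(site_corr, 3))
```

The sketch also shows why the FA structure is parsimonious: for six sites, a one-factor model has 12 variance parameters (six loadings and six specific variances), whereas an unstructured 6 × 6 covariance matrix has 21.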

For spatial analysis (Dutkowski et al. 2002), the residual \(e_{ij}\), with variance matrix R, could be partitioned into a spatially correlated error with variance \(\delta_{\xi }^{2}\) and an independent error with variance \(\delta_{\eta }^{2}\), so that R could be written as

$${\text{R}} = \delta_{\xi }^{2} \left( {\mathop \sum \limits_{c} (\rho_{c} ) \otimes \mathop \sum \limits_{r} (\rho_{r} )} \right) + \delta_{\eta }^{2} I$$
(3)

where \(\delta_{\xi }^{2}\) is the variance of the spatially correlated error, \(\mathop \sum \limits_{c} (\rho_{c} )\) and \(\mathop \sum \limits_{r} (\rho_{r} )\) are the autoregressive correlation matrices for columns and rows, \(\rho_{c}\) and \(\rho_{r}\) are the autoregressive parameters in the column and row directions, \(\delta_{\eta }^{2}\) is the variance of the independent error, and I is an identity matrix. The significance of all parameters of the spatial model was tested using the log-likelihood ratio test (LRT).
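The structure of Eq. (3) can be sketched in a few lines of Python. The sketch assumes first-order autoregressive (AR1) correlation matrices in both directions and trees ordered by row within column; these choices, the function names, and the parameter values are illustrative assumptions rather than details taken from the fitted models.

```python
import numpy as np


def ar1_matrix(n, rho):
    """AR1 correlation matrix with entries rho^|i - j| (assumed form)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def spatial_residual_covariance(n_col, n_row, rho_c, rho_r,
                                var_spatial, var_indep):
    """R = var_spatial * (Sigma_c kron Sigma_r) + var_indep * I, Eq. (3)."""
    sigma_c = ar1_matrix(n_col, rho_c)
    sigma_r = ar1_matrix(n_row, rho_r)
    n = n_col * n_row
    return var_spatial * np.kron(sigma_c, sigma_r) + var_indep * np.eye(n)


# Hypothetical site: 2 columns x 3 rows, positive autocorrelation.
R = spatial_residual_covariance(n_col=2, n_row=3, rho_c=0.5, rho_r=0.2,
                                var_spatial=4.0, var_indep=1.0)
print(np.round(R, 2))
```

With these values, neighbouring trees in the same column share a covariance of 4.0 × 0.2 = 0.8, and trees in the same row of adjacent columns share 4.0 × 0.5 = 2.0, while every tree has a total residual variance of 5.0.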

After fitting the model in ASReml 4.0 (Gilmour et al. 2016), the estimated breeding values of individual trees in each site can be obtained. Further, these values plus the overall mean (μ) were treated as BLUP data for the following analysis.

The AMMI model equation (Crossa 1990) is written as

$$y_{ijk} =\upmu + \alpha_{i} + \beta_{j} + Rep(\beta )_{jk} + \mathop \sum \limits_{n = 1}^{N} \lambda_{n} \gamma_{in} \delta_{jn} + \theta_{ij} + \varepsilon_{ijk}$$
(4)

where \(y_{ijk}\) is the raw observation or the estimated BLUP data, \(\upmu\) is the grand mean, \(\alpha_{i}\) is the main effect of the ith family, \(\beta_{j}\) is the main effect of the jth site, \(Rep(\beta )_{jk}\) is the effect of the kth replicate within the jth site, \(\lambda_{n}\) is the singular value for interaction principal component (IPC) axis n, \(\gamma_{in}\) is the score of family i on IPC axis n, \(\delta_{jn}\) is the score of site j on IPC axis n, N is the number of IPC axes retained, \(\theta_{ij}\) is the interaction residual not explained by the N retained IPC axes, and \(\varepsilon_{ijk}\) is the residual error.
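The multiplicative part of Eq. (4) can be illustrated on a family-by-site table of means. The sketch below removes the additive main effects and applies a singular value decomposition to the remaining interaction matrix. It works on cell means only (no replicate term), it is not the agricolae implementation used in this study, and the 4 × 3 table of DBH means is hypothetical.

```python
import numpy as np


def ammi_decompose(means, n_axes=2):
    """Additive main effects plus SVD of the family-by-site interaction."""
    means = np.asarray(means, dtype=float)
    grand = means.mean()                          # mu
    fam_eff = means.mean(axis=1) - grand          # alpha_i
    site_eff = means.mean(axis=0) - grand         # beta_j
    interaction = means - grand - fam_eff[:, None] - site_eff[None, :]

    # interaction = sum_n lambda_n * gamma_in * delta_jn
    u, s, vt = np.linalg.svd(interaction, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)           # share of GEI per IPC axis

    multiplicative = (u[:, :n_axes] * s[:n_axes]) @ vt[:n_axes]
    fitted = grand + fam_eff[:, None] + site_eff[None, :] + multiplicative
    return {
        "interaction": interaction,
        "singular_values": s,
        "family_scores": u[:, :n_axes],           # gamma_in
        "site_scores": vt[:n_axes].T,             # delta_jn
        "explained": explained,
        "fitted": fitted,
        "residual": means - fitted,               # theta_ij
    }


# Hypothetical DBH means: 4 families (rows) x 3 sites (columns).
dbh_table = np.array([[15.2, 14.1, 16.8],
                      [14.8, 15.0, 15.9],
                      [16.1, 13.7, 17.4],
                      [15.0, 14.6, 16.0]])
ammi = ammi_decompose(dbh_table, n_axes=2)
print(np.round(ammi["explained"], 3))
```

Because a double-centered 4 × 3 table has rank at most two, two IPC axes reproduce the interaction exactly in this toy case; with 36 families and six sites, the first two axes capture only part of the GEI, which is the percentage reported in the results.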

The GGE model equation for the first two principal components (Yan 2001) is written as

$$y_{ij} - \beta_{j} = \lambda_{1} \xi_{i1} \eta_{j1} + \lambda_{2} \xi_{i2} \eta_{j2} + \varepsilon_{ij}$$
(5)

where \(y_{ij}\) is the measured mean or the mean estimated breeding value of the ith family at the jth site, \(\beta_{j}\) is the measured mean or the mean estimated breeding value of all families at the jth site, \(\lambda_{1}\) and \(\lambda_{2}\) are the singular values for the first two principal components (PC1 and PC2), \(\xi_{i1}\) and \(\xi_{i2}\) are the scores of family i for PC1 and PC2, \(\eta_{j1}\) and \(\eta_{j2}\) are the scores of site j for PC1 and PC2, and \(\varepsilon_{ij}\) is the residual error.
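Eq. (5) amounts to a singular value decomposition of the site-centered table of family means. The following sketch shows one way to compute biplot scores from it. The partition exponent f (0.5 for symmetrical scaling, 0 for environment-focused scaling) follows the usual singular-value partitioning convention and is an assumption here, as are the function name and the hypothetical 4 × 3 table; the sketch is not the GGEBiplotGUI implementation.

```python
import numpy as np


def gge_biplot_scores(means, f=0.5, n_axes=2):
    """Family and site scores from the SVD of site-centered means."""
    means = np.asarray(means, dtype=float)
    centered = means - means.mean(axis=0, keepdims=True)   # y_ij - beta_j
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    total_ss = np.sum(centered ** 2)

    u, s, vt = u[:, :n_axes], s[:n_axes], vt[:n_axes]
    family_scores = u * s ** f              # lambda^f * xi
    site_scores = vt.T * s ** (1.0 - f)     # lambda^(1 - f) * eta
    explained = s ** 2 / total_ss           # share of G + GE per axis
    return family_scores, site_scores, explained


# Hypothetical DBH means: 4 families (rows) x 3 sites (columns).
dbh_means = np.array([[15.2, 14.1, 16.8],
                      [14.8, 15.0, 15.9],
                      [16.1, 13.7, 17.4],
                      [15.0, 14.6, 16.0]])

fam_scores, site_scores, share = gge_biplot_scores(dbh_means, f=0.5)

# Rank-2 approximation of the centered table; the row with the largest
# value in each column is the family predicted to perform best at that site.
approx = fam_scores @ site_scores.T
winners = approx.argmax(axis=0)
print(np.round(share, 3), winners)
```

Whatever value of f is chosen, the product of family and site scores is the same rank-2 approximation, so the scaling changes how the biplot is read (distances among families versus distances among sites) but not the fitted values.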

AMMI stability value (ASV) is calculated by the following formula (Purchase 1997) as

$${\text{ASV}} = \sqrt {\left( {\left( {SS_{IPC1} /SS_{IPC2} } \right) \times IPC1} \right)^{2} + IPC2^{2} }$$
(6)

where \(SS_{IPC1}\) and \(SS_{IPC2}\) are the sums of squares for IPC1 and IPC2, and IPC1 and IPC2 are the family scores on these two axes.
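Eq. (6) is simple enough to evaluate by hand, but a short sketch makes the ranking step (rASV) explicit. The family labels, IPC scores, and sums of squares below are hypothetical and serve only as a worked example.

```python
import math


def ammi_stability_value(ipc1, ipc2, ss_ipc1, ss_ipc2):
    """ASV from Eq. (6): IPC1 is weighted by SS_IPC1 / SS_IPC2."""
    weight = ss_ipc1 / ss_ipc2
    return math.sqrt((weight * ipc1) ** 2 + ipc2 ** 2)


def rank_by_asv(scores, ss_ipc1, ss_ipc2):
    """Return ASV per family and its rank (1 = lowest ASV = most stable)."""
    asv = {fam: ammi_stability_value(s1, s2, ss_ipc1, ss_ipc2)
           for fam, (s1, s2) in scores.items()}
    ordered = sorted(asv, key=asv.get)
    ranks = {fam: pos for pos, fam in enumerate(ordered, start=1)}
    return asv, ranks


# Hypothetical IPC1/IPC2 scores for three families.
family_ipc = {"F1": (0.3, 0.4), "F2": (-0.1, 0.2), "F3": (0.5, -0.1)}
asv_values, asv_ranks = rank_by_asv(family_ipc, ss_ipc1=8.0, ss_ipc2=4.0)
print(asv_values, asv_ranks)
```

Here the weight is 8.0 / 4.0 = 2, so family F2 has ASV = sqrt(0.2² + 0.2²) ≈ 0.28 and ranks first, that is, it would be judged the most stable of the three.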

The MET BLUP procedure was implemented with the program code of Lin (2016) in ASReml 4.0 (Gilmour et al. 2016). AMMI and GGE biplot analyses were carried out with the R packages agricolae (De Mendiburu 2016) and GGEBiplotGUI (Frutos et al. 2014), respectively. The GGE biplot was based on singular value decomposition with symmetrical scaling and focused on the environment (Yan 2010).

Results and discussions

Test environment heterogeneity

Because AMMI and GGE biplot analyses assume homogeneous test environments, it was necessary to test whether all the trial sites were homogeneous. The environmental errors varied greatly among trial sites (Table 2). For the independent error, site 5 had the largest estimate (5.04), while that of site 4 was not estimated and might be zero. For the spatially correlated error, site 4 had the largest estimate (6.96), followed by site 2 (4.58), while that of site 5 was not estimated and might be zero. When the independent error is large and uncorrelated, the spatially correlated error may be zero, and vice versa (Dutkowski 2005). In addition, significant autocorrelation existed in the row and/or column direction at every site except site 5. The autocorrelation was negative in the row direction (sites 1 and 3) or the column direction (site 5), although these estimates were not statistically significant, suggesting that competition was absent or weak. These results indicated that forest trial environments are usually heterogeneous and that spatial analysis can effectively account for environmental heterogeneity within each site, which is consistent with studies of other tree species (Costa et al. 2001; Dutkowski et al. 2002, 2006; Terrance and Jayawickrama 2008). Therefore, caution is needed when using phenotypic data directly for AMMI and GGE biplot analysis.

Table 2 Spatial variation pattern of raw data

AMMI analysis of raw data and BLUP data

The AMMI combined analysis of variance for both raw data and BLUP data revealed that the effects of site, family, and their interaction (GEI) were highly significant (Table 3). For raw data, site explained 41.05% of the total sum of squares (TSS) and GEI captured 16.03% of TSS, indicating that the sites were quite diverse and could affect family performance. Replication accounted for 3.89% of TSS and was highly significant. In addition, the first two interaction principal components (IPC1 and IPC2) accounted for only 63.99% of GEI, and only IPC1 had a significant effect. For BLUP data, site and GEI had extremely significant effects, with higher F values than in raw data, implying stronger site and GEI effects. Although site explained a greater percentage of TSS (69.31%), the residual error fell to 16.69% of TSS, much smaller than in raw data (28.17%). Furthermore, the replication effect accounted for only a small percentage of TSS (0.23%, compared with 3.89% in raw data) and was not significant. This result agrees with reports that spatial analysis can effectively reduce the residual error and greatly diminish design effects (Dutkowski 2005; Dutkowski et al. 2006; Terrance and Jayawickrama 2008). Moreover, both IPC1 and IPC2 had significant effects and together accounted for 72.81% of total GEI, higher than in raw data, even though GEI captured a smaller percentage of TSS (11.42%). These results implied that BLUP data gave a more reliable interpretation of DBH variation than raw data.

Table 3 AMMI combined analysis of variance for raw data and BLUP data

The overall mean of DBH was similar between raw data (15.62) and BLUP data (15.43), but the ASV and the rank of ASV (rASV) of the families differed greatly (Table 4). In addition, the Spearman correlation of rASV between raw data and BLUP data was only 0.36 (p < 0.05). In raw data, family 30 had the lowest ASV (0.10), indicating that it was the most stable genotype, but it ranked 12th in BLUP data. In BLUP data, family 32 had the lowest ASV (0.10) and was the most stable genotype, but it ranked only 7th in raw data. Only four families (7, 18, 20, and 34), or 11.11% of all families, had a stable rASV in both raw data and BLUP data. Because the first two IPCs of BLUP data accounted for 72.81% of total GEI, more than in raw data (63.99%), the ASV results from BLUP data were more credible.

Table 4 Overall mean of DBH and AMMI stability value (ASV) of raw data and BLUP data

GGE analysis of raw data and BLUP data

Which-Win-Where

The “Which-Win-Where” function of the GGE biplot connects the outermost genotypes to form a polygon and draws a line from the origin perpendicular to each side of the polygon. The test environments are thereby grouped, and the superior genotype is identified within each environment group (Yan 2010). For raw data, the six test sites were divided into three groups: sites 1, 2, and 5 in one group, sites 4 and 6 in another, and site 3 on its own (Fig. 1a). Family 10 was the top-performing genotype at sites 4 and 6, and family 20 at sites 1, 2, and 5. For BLUP data, the six sites were divided into four groups: sites 1, 5, and 6 in one group, and sites 2, 3, and 4 each forming a separate group (Fig. 1b). Family 19 was the top-performing genotype at sites 1, 5, and 6, and families 9, 22, and 11 were the top-performing genotypes at sites 2, 3, and 4, respectively. Furthermore, the first two PCs of raw data accounted for only 66.21% of total GEI, less than for BLUP data (78.46%). As with the AMMI results, the Which-Win-Where results from BLUP data were more reasonable.

Fig. 1 Which-Win-Where views of the GGE biplot based on symmetrical singular value decomposition (SVD) with standard deviation (SD) scaling for raw data (a) and BLUP data (b). AXIS1 and AXIS2 stand for PC1 and PC2, respectively. The green numbers stand for genotypes and the blue characters stand for sites

Discrimination and representation of environments

The choice of test environments directly affects the reliability of variety selection, and an ideal test environment should be both strongly discriminative and representative. In Fig. 2, the blue line with an arrow is the average environment axis, and the length of the dotted line between a test environment and the origin represents the discriminative ability of that environment. The angle between a test-environment vector and the average environment axis represents the representativeness of that environment: the smaller the angle, the more representative the environment. If the angle is obtuse, the site is not suitable as a test environment. For raw data, sites 3, 4, and 6 were the most discriminative environments, and site 5 was the most representative (Fig. 2a). Therefore, site 5 was the ideal test environment for raw data. For BLUP data, sites 1, 4, and 6 were the most discriminative environments, site 2 was the least discriminative, and site 1 was the most representative (Fig. 2b).
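The two quantities read from Fig. 2 can also be computed directly from the site scores. The sketch below assumes that the average environment axis runs from the origin through the mean of the site scores on PC1 and PC2, which is the usual construction; the function name and the three hypothetical site scores are illustrative only, and every vector is assumed to have non-zero length.

```python
import math


def discrimination_and_representativeness(site_scores):
    """Vector length and angle (degrees) to the average environment axis."""
    n = len(site_scores)
    avg_x = sum(x for x, _ in site_scores.values()) / n
    avg_y = sum(y for _, y in site_scores.values()) / n
    avg_len = math.hypot(avg_x, avg_y)

    result = {}
    for site, (x, y) in site_scores.items():
        length = math.hypot(x, y)                  # discriminative ability
        cos_angle = (x * avg_x + y * avg_y) / (length * avg_len)
        cos_angle = max(-1.0, min(1.0, cos_angle)) # guard against rounding
        angle = math.degrees(math.acos(cos_angle)) # representativeness
        result[site] = (length, angle)
    return result


# Hypothetical PC1/PC2 scores for three sites.
site_scores_example = {"S1": (3.0, 1.0), "S2": (2.0, 2.0), "S3": (1.0, 3.0)}
for site, (length, angle) in discrimination_and_representativeness(
        site_scores_example).items():
    print(site, round(length, 3), round(angle, 2))
```

In this toy case S2 lies on the average environment axis (angle close to 0°) and is the most representative site, whereas S1 and S3 have longer vectors and are therefore more discriminative; an angle above 90° would flag a site as unsuitable.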

Fig. 2 Discrimination and representation views of the GGE biplot based on environment-focused scaling for raw data (a) and BLUP data (b). AXIS1 and AXIS2 stand for PC1 and PC2, respectively. The green numbers stand for genotypes and the blue characters stand for sites

Yield and stability analysis

The GGE biplot uses the average environment coordinate (AEC) to evaluate the yield and stability of genotypes (Yan 2001). The AEC consists of the average environment axis (solid green line with an arrow) and a solid line perpendicular to it through the origin (Fig. 3). A black dotted line runs from each genotype perpendicular to the average environment axis: the point at which it meets the axis indicates the average yield of the genotype across all environments, and its length indicates stability, with a longer dotted line meaning less stable yield.

Fig. 3 Yield performance and stability views of the GGE biplot based on environment-focused scaling for raw data (a) and BLUP data (b). AXIS1 and AXIS2 stand for PC1 and PC2, respectively. The green numbers stand for genotypes and the blue characters stand for sites

The solid green line through the origin, perpendicular to the average environment axis, marks the grand (overall) mean. Genotypes on the left side of this line yielded below the grand mean, while genotypes on the right side yielded above it. Yield performance and stability differed greatly between raw data and BLUP data. For yield performance in raw data (Fig. 3a), family 20 was the highest, family 23 was close to the overall mean, and family 9 was the lowest. The most stable genotypes were families 14 and 24, and the most unstable were families 3, 4, 10, 28, and 34.

An ideal genotype should combine high yield with stability (Yan 2010), which made family 21 the best genotype in raw data. In BLUP data, by contrast, the highest-yielding genotypes were families 19 and 10, the most stable was family 10, and the most unstable were families 11, 22, and 24 (Fig. 3b). The ideal genotype in BLUP data was family 10. Although yield performance and stability appeared to differ greatly between raw data and BLUP data, some consistency was found between the two datasets. For example, families 9, 2, and 32 were the worst genotypes in both datasets. As with the Which-Win-Where results, we consider that BLUP data gave better results than raw data for yield performance and genotype stability.

Conclusions

When AMMI and the GGE biplot are applied directly to phenotypic data, genetic parameters cannot be obtained and selection efficiency cannot be calculated. Nevertheless, AMMI and the GGE biplot have obvious advantages in mega-environment delineation, test-environment evaluation, and genotype evaluation. Whether AMMI and the GGE biplot are suitable for forest trials has yet to be established. We used a dataset from six progeny testing sites to test a combined spatial/FA model with AMMI and the GGE biplot for use in forest progeny trials. Our results showed that first obtaining BLUP data from the raw phenotypic data of a forest MET, by modelling spatial effects with the FA method, markedly improved the results of AMMI and GGE biplot analysis. First, spatial analysis with the FA method removed the effects of different spatial variation patterns from the phenotypic data, thereby addressing the assumption of test-environment homogeneity. Second, BLUP data greatly reduced the share of residual error in TSS and clearly increased the proportion of variation explained in the AMMI analysis. Finally, raw data and BLUP data gave substantially different results in the GGE biplot. Furthermore, the Spearman correlation of rASV between raw data and BLUP data was low (r = 0.36, p < 0.05), and the first two principal components explained a higher percentage of GEI in BLUP data than in raw data. Therefore, we suggest that carrying out the BLUP procedure by spatial analysis with the FA method will yield more credible results. It should be noted that the spatial model might need to include design effects (such as block or plot) for different measured traits if these effects are significant. This BLUP procedure also has other potential advantages. For example, if pedigree files or other relationships (such as genomic relationships) are available for the test genotypes, missing values can be predicted even when the dataset is highly imbalanced. Another advantage is that, because the FA method is a random-effect model, genetic and residual variances can be estimated for further analysis of genetic parameters.