Introduction

Wheat production is threatened by biotic and abiotic stresses. Yellow rust (YR) is one of the most important biotic stresses and can cause up to 100% yield loss under epidemic conditions. Globally, regional crop losses due to YR typically range from 0.1 to 5% per year and occasionally reach 25% (Wellings 2011). Puccinia striiformis Westend. f. sp. tritici Eriks. & Henn. (PST), the causal agent of the disease, is a heteroecious macrocyclic fungus that can be further differentiated into races based on compatible/incompatible reaction patterns on differential sets. PST evolves rapidly into new virulent races through mutation and somatic hybridization, which results in the breakdown of genetic resistance in wheat cultivars. Cool temperate seasons with sufficient moisture are the most important environmental factors affecting the biology of PST (Chen 2020).

Host plant resistance is considered the most effective, reliable, economical, and eco-friendly approach to managing the disease (Chen 2020); therefore, breeding for resistance is one of the most important components of wheat breeding programs. Resistance to YR in wheat is conferred by different Yr genes. To date, 83 officially and 69 temporarily designated Yr genes have been identified (Yuan et al. 2020). Race-specific and non-race-specific resistance are the two main types of resistance, inherited qualitatively and quantitatively, respectively. Race-specific (seedling) resistance genes often confer protection at all growth stages, but this protection is usually short-lived and vulnerable to newly emerged virulent PST races. Genotypes carrying non-race-specific resistance genes, in contrast, are susceptible at the seedling stage and express a slow-rusting type of resistance at the post-seedling and adult plant stages. Non-race-specific resistance involves multiple genes contributing minor to intermediate effects (Wellings 2011; Chen 2013).

Multi-environment trials (METs) are at the center of every breeding program for selecting new cultivars with maximum stability over years and reliable adaptability to the target environment(s). In classical wheat breeding programs for YR resistance, field-based assessment of post-seedling resistance is achieved by measuring Final Rust Severity (FRS), Area Under the Disease Progress Curve (AUDPC), Infection Rate (IR), and Coefficient of Infection (CI). Customarily, parametric statistical approaches have been applied to evaluate genotype and environment main effects, including ANalysis Of VAriance (ANOVA), joint regression, multivariate clustering, Principal Component Analysis (PCA), Additive Main effects and Multiplicative Interaction (AMMI) analysis, and Genotype main effect plus Genotype × Environment interaction (GGE) analysis, all of which have been frequently applied in the YR-wheat pathosystem (Dolores Vasquez et al. 2012; Ahmad Shah et al. 2014; El-Amil et al. 2019). For non-parametric and categorical data, on the other hand, only a few approaches have been proposed, mainly based on comparing the ranks of genotypes in each environment. These approaches have been applied to study plant reactions to biotic and abiotic stresses within the framework of METs (Sabaghnia 2016).

The generalized linear model (GLM) is another method for analyzing non-normal data sets and is suited to categorical, count, or binary response variables (Davila et al. 2012). It is a natural extension of the classical linear regression model, which is based on the normal distribution, to other distributions of the response. PCA, in turn, is the most popular dimension reduction method and has been widely used in plant sciences (Nayak et al. 2018). GLM-PCA (Collins et al. 2002) combines these methods: it avoids the pitfalls of normalization and generalizes PCA to exponential family likelihoods. Recently, Townes et al. (2019) used it to analyze raw counts from single-cell RNA-Seq experiments.
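To make the link between PCA and exponential family likelihoods concrete, the objective of exponential family PCA can be sketched as follows, in the spirit of Collins et al. (2002). The symbols \(x_{ij}\), \(\theta_{ij}\), \(G\), \(a_{i}\), and \(v_{j}\) are notation introduced here for illustration and are not taken from the present study; \(x_{ij}\) is the observation for genotype \(i\) in environment \(j\), \(\theta_{ij}\) is the natural parameter, constrained to low rank, and \(G\) is the log-partition function of the chosen family.

\[
\min_{\{a_{i}\},\{v_{j}\}} \; \sum_{i,j} \left[ G(\theta_{ij}) - x_{ij}\,\theta_{ij} \right], \qquad \theta_{ij} = a_{i}^{\top} v_{j}
\]

For a Gaussian family with unit variance, \(G(\theta) = \theta^{2}/2\), so that

\[
\sum_{i,j} \left[ \tfrac{1}{2}\theta_{ij}^{2} - x_{ij}\,\theta_{ij} \right] = \tfrac{1}{2} \sum_{i,j} \left( x_{ij} - \theta_{ij} \right)^{2} - \tfrac{1}{2} \sum_{i,j} x_{ij}^{2}
\]

The last term does not depend on the parameters, so the Gaussian case reduces to the least-squares low-rank approximation that underlies conventional PCA. Other families (for example Poisson, with \(G(\theta) = e^{\theta}\)) give different objectives. Actual software implementations may additionally differ in centering, scaling, or offsets, so this is a sketch of the idea and not a description of any particular package.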

The objective of this study was to identify phenotypically stable resistant sources to YR in elite wheat genotypes generated through breeding programs using non-parametric and GLM-PCA approaches.

Materials and methods

Plant materials and field screening

Thirty-five elite spring wheat genotypes along with a susceptible check (Bolani) (Table 1) were screened for yellow rust resistance at three locations in Iran: Ardebil (38° 15′ N 48° 20′ E, 1350 m) in the northwest, Zarghan (29° 46′ N 52° 43′ E, 1604 m) in the south, and Mash’had (36° 2′ N 59° 6′ E, 1050 m) in the northeast, during the 2015–2016 and 2016–2017 cropping seasons. Agro-ecologically, the climate of Zarghan is mild warm, while Ardebil and Mash’had are cold and mild cold, respectively (Table 2).

Table 1 Details of spring wheat genotypes used in this study
Table 2 Climatological identities of environments (combination of location × year) in the growing seasons of trials for multi-environmental study

About 35–40 g of seed of each accession was sown on top of hills in double 1-m rows, with a plant-to-plant distance of about 0.2 cm within a row. The spacing between rows and between plots was 30 cm and 60 cm, respectively. The susceptible check cultivar “Bolani” was planted in double rows after every five plots as well as around the experimental field to facilitate the establishment and spread of the disease inside the field. Regular agronomic practices were followed during the growing seasons. Artificial inoculation was carried out using pathogenic isolates collected and multiplied from local samples of the disease. Rust severity (percentage of rust infection on the plant) (Peterson et al. 1948) and infection type (IT) were recorded once disease severity (DS) on the susceptible check exceeded 60%. Six major ITs were distinguished: O = Immune; R = Resistant; MR = Moderately Resistant; M = Moderate; MS = Moderately Susceptible; and S = Susceptible (Roelfs et al. 1992). The final reaction of a plant to the disease was recorded as final disease severity (FDS) plus IT.

Race identification

Isolates of the pathogen sampled from naturally infected fields at the three locations were artificially multiplied on the susceptible check “Bolani” in the greenhouse. The harvested uredospores were then mixed with talcum powder in a ratio of 1:3 and immediately sprayed on wheat seedlings at the fully expanded first flag leaf stage. Subsequently, the inoculated plants were sprayed with tap water, covered with plastic bags, and incubated in a dark cold room at 10 °C for 24 h, then moved to the greenhouse at 18 ± 2 °C under a 16 h light/8 h dark photoperiod at 300 μmol m⁻² s⁻¹ for 20 days. Seedlings of the standard differential set were inoculated and incubated by this method and evaluated according to the scale proposed by McIntosh et al. (1995), where 0 = no visible uredium, ‘;’ = hypersensitive flecks, 1 = small uredium with necrosis, 2 = small to medium-sized uredium with green islands surrounded by necrosis or chlorosis, 3 = medium-sized uredium with or without chlorosis, and 4 = large uredium without chlorosis. The designations ‘+’ and ‘−’ were used with the 0–4 scale to indicate uredia larger and smaller than normal, respectively. The nomenclature was based on the avirulence/virulence formula on 19 differential testers for yellow rust (Chen et al. 2009).

Greenhouse screenings

Fresh uredospores of two highly virulent isolates of the rust pathogen, races 134E138+, A, YR27 and 38E190+, A, YR27, collected from Mash’had and Zarghan, respectively, during 2015–17 and multiplied on the susceptible cultivar “Bolani”, were used for inoculation. Inoculation and scoring followed the methods described above.

Statistical analysis

All analyses were conducted in R version 4 using appropriate packages. To quantify the infection types, CI was calculated by multiplying disease severity by 0, 0.2, 0.4, 0.6, 0.8, and 1 for O, R, MR, M, MS, and S reactions, respectively (Elbasyoni et al. 2019). Normality and homogeneity of variance were tested using the Shapiro–Wilk and Bartlett’s tests, respectively, in the car package. Outliers were identified using Grubbs’ method in the package outliers, their significance was assessed by P-value, and they were then imputed with median values using dlookr to improve the uniformity and normality of the data set. Subsequently, the best transformation for the data set (CI values) was identified using rcompanion and applied to reduce skewness. Two non-parametric statistics, \({S}_{\mathrm{i}}^{(1)}\) and \({S}_{\mathrm{i}}^{(2)}\) (the mean absolute rank difference and the between-rank variance of a given genotype over environments, respectively) (Huehn 1990a, b), were calculated using the package agricolae to detect stable genotypes across locations. Based on these statistics, a genotype is considered stable if it shows similar ranking across environments. In the next step, the CI data set was subjected to GLM-PCA with the Gaussian family as implemented in glm, and the resulting matrices were visualized using ggplot2. Finally, to separate genotypes with adult plant resistance (APR) from those that are also resistant at the seedling stage (all-stage resistance), the greenhouse screening data were compared with the field data.
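The CI calculation described above can be illustrated with a minimal Python sketch (the study itself used R). The coefficients are those stated in the text; the function names and the compact field notation such as "40MS" (severity followed by infection type) are assumptions made here for illustration only.

```python
# Coefficients assigned to each infection type, as stated in the text.
IT_COEFFICIENT = {"O": 0.0, "R": 0.2, "MR": 0.4, "M": 0.6, "MS": 0.8, "S": 1.0}


def coefficient_of_infection(severity_percent, infection_type):
    """Return CI = disease severity (%) multiplied by the infection-type coefficient."""
    if not 0 <= severity_percent <= 100:
        raise ValueError("severity must be a percentage between 0 and 100")
    return severity_percent * IT_COEFFICIENT[infection_type.upper()]


def parse_field_score(score):
    """Split a hypothetical field note such as '40MS' into (40.0, 'MS')."""
    digits = "".join(ch for ch in score if ch.isdigit() or ch == ".")
    letters = "".join(ch for ch in score if ch.isalpha())
    # Assumption: an immune plant may be noted as 'O' with no severity figure.
    return (float(digits) if digits else 0.0, letters.upper())


if __name__ == "__main__":
    for note in ["40MS", "10MR", "100S", "O"]:
        severity, infection_type = parse_field_score(note)
        print(note, coefficient_of_infection(severity, infection_type))
```

Because the coefficient rescales severity, two plots with the same severity but different infection types receive different CI values, which is what allows the ordinal infection type to enter a quantitative analysis.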

Results

In both growing seasons, the highest and lowest temperatures were recorded at Zarghan and Ardebil, respectively. The highest precipitation was observed at Mash’had in 2015–2016 and at Zarghan in 2016–2017, while Ardebil had high relative humidity in both years (Table 2). In all environments, temperature, humidity, and rainfall during the growing seasons were favorable for YR incidence and development in the experimental plots, so high disease severity was observed on the susceptible check across environments.

Wide variation in virulence was observed in the pathogen population sampled from the environments. The virulence pattern changed notably over the years of the study, except at Ardebil, where the same race (6E190+, A, YR27) was prevalent in both years (Table 3).

Table 3 Avirulence/virulence patterns of races of Puccinia striiformis f.sp. tritici identified in different combinations of location × year (environments)

Different infection types and rust severities were recorded on the wheat genotypes across environments, and the performance of a given genotype was not always stable across them, which could be taken as a first indication of interaction between years, locations, and genotypes (Table 4). The field data set was tested for normality of distribution and homogeneity of variance. Box and whisker plots of rust severity on the wheat genotypes (Fig. 1a) and in the different environments (Fig. 1b) showed frequent outliers and discrepancies in the data distribution. Grubbs’ test identified the CI values of the susceptible check “Bolani” in all environments as outliers, so they were imputed with the median value. The Shapiro–Wilk test of normality of CI values on the wheat genotypes gave W = 0.9197, P-value = 1.925 × 10⁻⁹, indicating a non-normal distribution of the data. Similarly, the hypothesis of homogeneous variance was rejected, with Bartlett’s χ² = 14.225, df = 2, P-value = 0.00081 for locations and χ² = 16.981, df = 1, P-value = 3.96 × 10⁻⁵ for years.
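The outlier step can be sketched in Python as follows. This is a simplified illustration and not the procedure of the R packages used in the study: it computes the Grubbs statistic G for the most extreme value and replaces that value with the sample median when G exceeds a critical value. The critical value depends on sample size and significance level and is left here as a caller-supplied parameter (`critical_g`, a hypothetical name); the numbers in the example are invented.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) for the value farthest from the sample mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all values are identical; G is undefined")
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / sd, index


def impute_if_outlier(values, critical_g):
    """Replace the most extreme value with the median if G > critical_g.

    critical_g must be supplied by the caller (e.g. from a Grubbs table);
    it is not computed here.
    """
    g, index = grubbs_statistic(values)
    cleaned = list(values)
    if g > critical_g:
        cleaned[index] = statistics.median(values)
    return cleaned, g


if __name__ == "__main__":
    ci_values = [10, 12, 11, 9, 13, 100]  # invented CI values
    cleaned, g = impute_if_outlier(ci_values, critical_g=1.9)  # 1.9 is illustrative
    print(round(g, 3), cleaned)
```

Only one value is examined per call; repeated application would be needed to screen several suspected outliers.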

Table 4 Reaction and coefficient of infection (CI) of advanced dryland wheat accessions to yellow rust in different combinations of location × year (environments)
Fig. 1

Box plots of rust severity on wheat genotypes (a) and in environments (b). For each genotype or environment, the box represents the interquartile range, the heavy horizontal line represents the median, and the fine horizontal lines represent the minimum and maximum values excluding outliers (the circles)

Comparison of Huehn’s two stability statistics, \({S}_{\mathrm{i}}^{(1)}\) and \({S}_{\mathrm{i}}^{(2)}\), with the approximate χ² values \({Z}_{\mathrm{i}}^{(1)}\) and \({Z}_{\mathrm{i}}^{(2)}\) indicated that neither statistic exceeded its critical value (Table 5). Based on these statistics, therefore, the performance of the genotypes was phenotypically stable across environments and no genotype × environment (G × E) interaction was detectable.
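For readers unfamiliar with these rank-based statistics, the following Python sketch computes the mean absolute rank difference and the between-rank variance of each genotype over environments. It is an illustration under stated assumptions, not the agricolae implementation: genotypes are ranked on the raw values within each environment (no prior correction for genotype effects), tied values receive their mean rank, and the Z statistics and critical values are not computed. The data in the example are invented.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values in ascending order (1 = lowest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def huehn_statistics(table):
    """table[i][j] = value of genotype i in environment j.

    Returns one (S1, S2) tuple per genotype:
    S1 = mean absolute difference between ranks over all environment pairs,
    S2 = variance of the ranks over environments.
    """
    n_gen, n_env = len(table), len(table[0])
    # Rank the genotypes within each environment (column-wise).
    col_ranks = [average_ranks([table[i][j] for i in range(n_gen)])
                 for j in range(n_env)]
    results = []
    for i in range(n_gen):
        r = [col_ranks[j][i] for j in range(n_env)]
        s1 = 2 * sum(abs(a - b) for a, b in combinations(r, 2)) / (n_env * (n_env - 1))
        mean_r = sum(r) / n_env
        s2 = sum((x - mean_r) ** 2 for x in r) / (n_env - 1)
        results.append((s1, s2))
    return results


if __name__ == "__main__":
    ci = [[5, 20, 40],   # invented CI values: 3 genotypes x 3 environments
          [10, 10, 10],
          [30, 30, 5]]
    for genotype, (s1, s2) in enumerate(huehn_statistics(ci), start=1):
        print(genotype, round(s1, 3), round(s2, 3))
```

A genotype that keeps the same rank in every environment has both statistics equal to zero; larger values indicate that its ranking changes between environments.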

Table 5 Mean of the coefficient of infection (CI) of wheat genotypes against yellow rust and non-parametric statistics * of their stability across the experimental locations

The plots of scores and loadings obtained from GLM-PCA are shown in Fig. 2a and b, respectively. Figure 2a shows that the environments are distinctly distributed across the plot, with Mash’had-2017 and Zarghan-2016 located far from the center of the plot; they can therefore be considered the more informative environments. Because the cosine of the angle between the lines of two environments represents the correlation between them, these two environments can also be considered to have exerted different disease pressure on the wheat genotypes. In contrast, Ardebil-2016, Ardebil-2017, and Zarghan-2017 clustered together roughly in the center of the plot, followed by Mash’had-2016 at the bottom of the plot, with small angles between their lines. These environments were therefore roughly similar and had less effect on disease screening.
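The two quantities used in this reading of the plot, the distance of an environment from the origin and the cosine of the angle between two environment vectors, can be computed as in the following Python sketch. The coordinates and environment labels are invented for illustration and are not the values underlying Fig. 2a.

```python
import math


def vector_length(v):
    """Euclidean distance of a point from the origin of the plot."""
    return math.sqrt(sum(x * x for x in v))


def cosine_of_angle(u, v):
    """Cosine of the angle between two vectors drawn from the origin."""
    length_u, length_v = vector_length(u), vector_length(v)
    if length_u == 0 or length_v == 0:
        raise ValueError("cosine is undefined for a zero-length vector")
    return sum(a * b for a, b in zip(u, v)) / (length_u * length_v)


# Hypothetical two-dimensional coordinates of three environments.
environments = {"Env-A": (3.0, 0.5), "Env-B": (-0.4, 2.5), "Env-C": (0.3, 0.2)}

if __name__ == "__main__":
    for name, coords in environments.items():
        print(name, "distance from origin:", round(vector_length(coords), 3))
    names = list(environments)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            cos = cosine_of_angle(environments[names[i]], environments[names[j]])
            print(names[i], names[j], "cosine:", round(cos, 3))
```

A cosine near 1 corresponds to a small angle and closely related environments, a cosine near 0 to a right angle and unrelated environments, and a negative cosine to environments that rank genotypes in opposite ways.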

Fig. 2

Plots of scores (a) and loadings (b) based on GLM-PCA of corrected CI values of yellow rust on wheat genotypes observed in different environments. The solid line below indicates the best fit-regression line. The selected genotypes were confined by the circle

Figure 2b shows the distribution of loadings in the plane of the first two dimensions. Horizontal and vertical lines through the origin divide the plot into four sections, where positive and negative values indicate that a variable is positively or negatively correlated with a principal component, respectively. A large loading (positive or negative) indicates that a variable has a strong effect on that dimension. A GLM-based regression line fitted to the data represents the main trend of the data distribution across the plot. In this plot, three genotypes, 24 (URBWYT(95–98)#7), 31 (URBWYT(95–98)#14), and 21 (URBWYT(95–98)#5), were located at the far lower end of section four, with negative values for both dimensions, and were clearly separated from the other members of the collection. These genotypes, which had the lowest CI values in all environments, can be identified as phenotypically stable resistant entries across environments. Genotypes 4 (URBWYT(94–97)#4) in section one; 6 (URBWYT(94–97)#6) and 30 (URBWYT(95–98)#13) in section two; 34 (Qaboos) in section three; and 14 (URBWYT(94–97)#14), 15 (URBWYT(94–97)#15), and 19 (URBWYT(95–98)#2) in section four were also well separated from the rest but showed a susceptible reaction in at least one environment, so they can be regarded as entries with resistance specific to given environments. According to this result, a highly negative value for both dimensions in the GLM-PCA loading plot is necessary for a genotype to be considered generally resistant.

The seedling tests conducted under controlled greenhouse conditions with two different races of yellow rust produced similar patterns of susceptibility/resistance: 30 genotypes (83.33%) were susceptible and 6 (16.67%) were resistant to both races (Table 2). Genotypes 34 (Qaboos), 32 (URBWYT(95–98)#15), 30 (URBWYT(95–98)#13), 3 (URBWYT(94–97)#3), and 4 (URBWYT(94–97)#4) showed highly resistant reactions at the seedling stage, but according to Fig. 2b, only genotypes 4 (URBWYT(94–97)#4), 30 (URBWYT(95–98)#13), and 34 (Qaboos) could be identified as all-stage resistant (see Table 4).
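The logic of comparing seedling and field reactions can be summarized in a short Python sketch. This is a generic decision rule written for illustration, not the criterion applied in the study, where field performance was judged from the GLM-PCA plot. The function name, the category labels, and the field cut-off `ci_cutoff` are hypothetical.

```python
def classify_resistance(seedling_resistant, field_ci, ci_cutoff=20.0):
    """Classify a genotype from its seedling reaction and field CI values.

    seedling_resistant: True if the genotype was resistant in the greenhouse.
    field_ci: CI values recorded in the field environments.
    ci_cutoff: hypothetical upper CI limit for calling a field reaction resistant.
    """
    field_resistant = all(ci <= ci_cutoff for ci in field_ci)
    if not field_resistant:
        return "not consistently resistant in the field"
    if seedling_resistant:
        return "all-stage resistance"
    return "adult plant resistance"


if __name__ == "__main__":
    print(classify_resistance(True, [0, 4, 2]))     # resistant at both stages
    print(classify_resistance(False, [8, 4, 10]))   # resistant only as adult plant
    print(classify_resistance(False, [60, 4, 10]))  # susceptible in one environment
```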

Discussions

YR-resistant cultivars are in high demand in wheat production worldwide. Before being released as a new commercial cultivar, an advanced wheat genotype is routinely screened for the disease at different stages of the breeding program, both in the field (at hot spots) within the framework of METs and under controlled conditions using specific races of PST. Identifying sources of resistance is therefore vitally important in wheat breeding for disease resistance. Adult plant resistance (APR) that is effective against a broad range of YR races and maintains adequate levels of resistance even under conditions favorable for rust epidemics (durable resistance) is an ideal option for managing the disease (Li et al. 2020). The robustness of APR can be evaluated in time and space, mainly through genotype × environment (G × E) trials. In this study, we evaluated 35 wheat genotypes for quantitative resistance to YR in six field environments. The phenotypic reactions of the wheat genotypes differed among environments, a phenomenon that has frequently been reported in previous studies (Ali et al. 2009; Randhawa et al. 2012).

The CI data set did not follow a normal distribution, a feature that is common in biosciences such as ecology, evolution, plant pathology, and plant breeding (Bolker et al. 2009). CI values, obtained by multiplying IT by DS, can be regarded as a corrected form of an ordinal data set, because they quantify the infection type, which is otherwise a purely qualitative factor. The severity of many plant diseases is assessed in the same way, on an ordinal scale that is interpretable only in terms of the order of its categories, although continuous scales are also widely used (Shah and Madden 2004; Sabaghnia 2016). In the next step, we subjected the transformed CI data set to the two non-parametric statistics proposed by Huehn (1990a, b), which have already been used to detect phenotypic stability of resistance in various pathosystems (Haynes et al. 1998; Forbes et al. 2005; Lillemo et al. 2010).

Predicting the behavior of a genotype in different environments and establishing a cut-off point to distinguish overall susceptible/resistant patterns is very difficult in METs, because genotype performance is governed mainly by three components of variance: the genotype × environment (G × E) interaction, the interaction between host genotype and biotypes of the biotic agent, and the three-way interaction host genotype × environment × biotype (Rubiales et al. 2012). Relating the phenotypic expression of resistance to these three factors facilitates its proper identification and characterization, its efficient use in breeding programs, and disease management strategy (Mukherjee et al. 2013). The non-parametric approaches of Huehn (1990a, b) indicated no significant G × E interaction. According to Forbes et al. (2005), when these statistics do not show significant interaction, the data set is not suitable for either GGE or AMMI analysis. Under these circumstances, a simple way to compare disease pressure between environments is to compare the disease severity observed on the susceptible check across environments (Ali et al. 2009), but this gives no further information about the relationships between environments or the degree of stability of resistance within genotypes. Instead, in the current study we used GLM-PCA (Collins et al. 2002), a technique in which the specified family and link function extend PCA beyond the normal distribution. Using this technique, two separate scatterplots based on two separate matrices were produced (Fig. 2a, b). Figure 2a allowed the two most informative environments to be detected. Furthermore, the same race was identified in both years of the study at Ardebil. Correspondingly, the two Ardebil environments were located at a short distance from each other and from Zarghan-2017, meaning that the pathogen races exerted similar pressure on the wheat genotypes in these environments.

In this study, we showed that when a data set cannot be subjected to parametric analysis because of its non-normal distribution and heterogeneous variances, and when non-parametric approaches are also unable to detect G × E interaction, GLM-PCA can identify phenotypically stable resistant genotypes as well as the more informative environments in METs. The plotting facilities of ggplot2 made visual detection of relationships between features of the G × E interaction possible. The usefulness of PCA for detecting samples that behave differently from the main population has already been reported (Reyna et al. 2017), although conventional PCA gave different patterns of distribution of genotypes and variables here (data not shown). Taken together, the plots in Fig. 2a and b can be considered analogous to the GGE biplot, which was devised to facilitate visual cultivar evaluation and mega-environment identification.

The reaction of wheat genotypes at field scale is regulated not only by environmental factors (Millus et al. 2009; Grabow et al. 2016) but also by the race profile of the region (Liu et al. 2017). The absence of significant G × E interaction, together with roughly similar meteorological conditions among the environments, suggests that the different responses to yellow rust observed in the wheat genotypes were due mainly to the different virulence factors of the PST races and the corresponding Yr genes in the host, which agrees with previous studies on the PST race profile of a region (Safavi et al. 2013). Nevertheless, an effect of different environmental conditions cannot be completely ruled out and needs complementary studies.