Introduction

Rice (Oryza sativa L.) is the staple food crop for more than half of the world’s population and is cultivated under diverse agro-climatic conditions. Rice blast, caused by the fungus Magnaporthe oryzae, is one of the most devastating diseases of rice and ranks among the top 10 fungal diseases, posing a major threat to global food security (Dean et al. 2005). In India, it has caused considerable yield losses over the past few decades, ranging between 20 and 100% (Sharma et al. 2012). Most plant pathogens mutate rapidly, which results in the breakdown of resistance and causes epidemics; this is further aggravated by conducive weather conditions, high disease pressure and genome instability of the pathogen. The genome of M. oryzae is rich in retrotransposons and repetitive segments (Dean et al. 2005), which help the fungus to change its virulence and overcome the resistance conferred by R genes (Vasudevan et al. 2014). The rice blast system follows the classical gene-for-gene model, in which the product of a blast resistance (R) gene prevents infection by races of M. oryzae carrying the corresponding avirulence (Avr) gene (Silue et al. 1992; Sahu et al. 2018). The disease can be managed by maintaining adequate flood depth, choosing suitable planting dates, applying the recommended dose of nitrogen fertilizer, using chemical fungicides and growing resistant varieties (Bonman 1992). Among these, the use of resistant varieties is the most economical and environment-friendly method (Panda et al. 2017).

To date, more than 100 race-specific R genes and more than 350 QTLs for resistance to M. oryzae have been identified, of which 27 have been cloned and characterized. Most of them encode nucleotide-binding site-leucine-rich repeat (NBS-LRR) proteins, except pi21 and Pid2, which encode a proline-containing protein and a receptor kinase, respectively (Fukuoka et al. 2009; Kouzai et al. 2013; Zheng et al. 2016; Zhu et al. 2016; Yadav et al. 2019). Among them, eight genes are located in two gene clusters: Pi2, Pi9 and Piz-t at the Pi2 locus, and Pik, Pik-m, Pik-p, Pi1 and Pi-ke at the Pik locus (Wang et al. 2016). Most of the blast R genes are dominant, except the recessive genes pi21, Pid-2 and pi66(t) (Liang et al. 2016). Most of the R genes were identified in landraces, cultivars or wild rice collections because of differential physiological races of M. oryzae (Tanksley et al. 1997). The existing phenotypic screening technique for blast resistance is time-consuming and laborious, and entails specific procedures. Many PCR-based molecular markers have been developed for fine-mapped and cloned blast R genes, and these can be used for mining and identifying different R genes.

At present, molecular markers contribute significantly to the efficiency and precision with which blast resistance genes are incorporated into cultivars (Wang et al. 2014). Marker-assisted selection (MAS) is an advanced molecular tool in rice breeding for improving resistance to rice blast, and with its aid many rice cultivars resistant to biotic stresses have been developed and widely accepted by farmers (Xu and Crouch 2008). Association mapping (AM) is a molecular approach for identifying target genes governing important traits in a natural population that includes diverse germplasm. AM can be categorized into candidate-gene association mapping and genome-wide association mapping (Zhang et al. 2017). Because AM uses a natural population rather than a biparental mapping population, it saves the time required to constitute the population and hastens the identification of genes in crops; AM is therefore a powerful means of genetic dissection and of identifying a gene of interest.

Varieties released by the National Rice Research Institute (NRVs), which are grown in blast-endemic areas across diverse agro-ecological zones in India, can be studied as donor sources of favourable genes for biotic stresses. In a previous study, NRVs were genotyped for 12 major blast resistance genes using 17 molecular markers (Yadav et al. 2017). In continuation of that work, the present study was undertaken to investigate the genetic association of 36 mapped resistance genes in 80 NRVs using linked/functional markers. The objective was to identify the candidate R genes that confer blast resistance to these NRVs and that could be used for identifying novel donor sources (R genes/alleles) for blast resistance and for genomic studies.

Material and methods

Plant material and disease reaction in uniform blast nursery

A set of 80 NRVs originating from eight different agro-ecologies (table 1) was collected from the National Gene Bank, NRRI, Cuttack (table 1 in electronic supplementary material at http://www.ias.ac.in/jgenet). These NRVs were phenotyped for leaf blast under natural conditions in the uniform blast nursery (UBN). The screening was done in two replications during the dry and wet seasons of 2015–2016 at the research farm of NRRI, Cuttack (\(85{^{\circ }}55'48''\)E longitude and \(20{^{\circ }}26'35''\)N latitude). Thirty seeds of each NRV were sown in a 50-cm long row with a row spacing of 10 cm. The highly susceptible varieties HR12 and CO39 were used as spreader rows to ensure uniform spread of the disease. Disease scores were recorded from 25 to 40 days after sowing at 5-day intervals, when the spreader rows showed more than 85% infection. Disease reaction was scored using the standard evaluation system (SES) of IRRI, Philippines (2002), on a 0–9 scale as resistant (0–3), moderately resistant (4–5) or susceptible (6–9). Whenever the blast disease scores differed between replications or seasons, the higher score was considered for evaluation.
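The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (class boundaries 0–3, 4–5 and 6–9, with the highest score across replications and seasons retained); the function names and the example scores are hypothetical and are not taken from the study.

```python
def classify_reaction(score: int) -> str:
    """Map a 0-9 SES leaf blast score to a disease reaction class."""
    if not 0 <= score <= 9:
        raise ValueError(f"SES score must be between 0 and 9, got {score}")
    if score <= 3:
        return "resistant"
    if score <= 5:
        return "moderately resistant"
    return "susceptible"


def final_reaction(scores):
    """Keep the highest score across replications and seasons, then classify it."""
    worst = max(scores)
    return worst, classify_reaction(worst)


# Hypothetical scores for one variety: two replications in each of two seasons.
example_scores = [2, 3, 4, 3]
print(final_reaction(example_scores))
```

Taking the maximum rather than the mean is the conservative choice: a variety is called resistant only if it scored 3 or below in every replication and season.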

Table 1 The NRRI released varieties (NRVs) of different ecologies.

Genomic DNA isolation

Young leaves from 3-week-old seedlings were collected and stored in a −80\({^{\circ }}\)C freezer. Genomic DNA was isolated following the method of Doyle and Doyle (1990) with slight modifications. In brief, a 200 mg leaf sample was ground in liquid nitrogen; the powder was immediately transferred to 1 mL of CTAB isolation buffer and incubated at 65\({^{\circ }}\)C in a recirculating water bath. After 1 h, an equal volume of PCI (phenol:chloroform:isoamyl alcohol; 25:24:1) was added, and the mixture was centrifuged at 10,000 rpm for 10 min. The aqueous phase was transferred to a new 2 mL tube, mixed with chloroform:isoamyl alcohol (24:1) and centrifuged again as above. The aqueous phase was pipetted into a new 2 mL tube, and two volumes of absolute alcohol were added, followed by \(1/10^{\mathrm{th}}\) volume of sodium acetate (3.5 M), and mixed properly. The samples were kept at −20\({^{\circ }}\)C for 2 h and then centrifuged at 10,000 rpm for 10 min at room temperature (RT). The white pellet was washed with 70% ethanol by centrifuging at 7000 rpm for 7 min at RT and then air dried. The completely dried pellet was dissolved in nuclease-free water for quantification and PCR amplification. The quantity and quality of the DNA were assessed by 0.8% agarose gel electrophoresis and a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Waltham, USA). Nuclease-free water was used to dilute the DNA samples to a concentration of 20 ng/\(\mu \)L for PCR amplification.

PCR amplification and visualization

The polymerase chain reactions (PCR) were performed in a 20 \(\mu \)L reaction volume containing 25 ng of template DNA, 1× Taq buffer (10 mM Tris-HCl, 50 mM KCl, pH 8.3), 0.2 \(\mu \)M of each dNTP, 0.2 \(\mu \)M each of the forward and reverse primers, and 1 U of Taq DNA polymerase (DreamTaq, Thermo Scientific, USA). The PCR programme was as follows: initial denaturation at 94\({^{\circ }}\)C for 5 min; 35 cycles of denaturation at 94\({^{\circ }}\)C for 45 s, primer annealing for 45 s at a marker-specific temperature (table 2) and extension at 72\({^{\circ }}\)C for 45 s; and a final extension at 72\({^{\circ }}\)C for 10 min. The PCR products were separated by electrophoresis in 3.5% agarose gels and visualized using a gel documentation system (Alpha Imager, USA). The amplified products were scored as present (1) or absent (0). The PCR was repeated twice for each marker to cross-check the scoring data.
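As a worked example of the reaction set-up, the volumes of template and primer per 20 \(\mu \)L reaction follow from the dilution relation C1·V1 = C2·V2. The sketch below uses the 20 ng/\(\mu \)L working DNA stock and the final concentrations given in the text; the 10 \(\mu \)M primer stock is an assumption made only for illustration, as the stock concentration is not stated.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock needed to reach final_conc in the reaction (C1*V1 = C2*V2).

    final_conc and stock_conc must be given in the same unit.
    """
    return final_conc * reaction_volume_ul / stock_conc


REACTION_UL = 20.0          # reaction volume from the text
DNA_STOCK_NG_PER_UL = 20.0  # diluted DNA concentration from the text
TEMPLATE_NG = 25.0          # template amount per reaction from the text
PRIMER_STOCK_UM = 10.0      # ASSUMPTION: primer working stock, not stated in the text

template_ul = TEMPLATE_NG / DNA_STOCK_NG_PER_UL
primer_ul = stock_volume_ul(0.2, PRIMER_STOCK_UM, REACTION_UL)

print(f"template DNA: {template_ul:.2f} uL per reaction")
print(f"each primer:  {primer_ul:.2f} uL per reaction")
```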

Table 2 List of markers used for genetic association of blast resistance in rice varieties.

Statistical data analysis

A total of 36 linked/functional markers were used to score the presence or absence of the resistance genes in the 80 NRVs. A similarity matrix based on Jaccard’s coefficient was computed from the binary data. The polymorphism information content (PIC) value, allele number and allele frequency were estimated for each marker using PowerMarker v3.25 (Liu and Muse 2005). The genetic distance matrix among the NRVs was analysed through principal co-ordinate analysis (PCoA) in GenAlEx 6.5. Similarly, analysis of molecular variance (AMOVA) between and within populations and population assignment were carried out in GenAlEx 6.5 (Peakall and Smouse 2012). An unweighted neighbour-joining (NJ) unrooted tree was constructed in the DARwin 5 program (Perrier and Jacquemoud-Collet 2006, DARwin Software, http://darwn.cirad.fr/darwin). The dissimilarity index was estimated using Nei’s coefficient (Nei 1973) with 1000 bootstrap replicates. The general linear model (GLM) function in TASSEL 5 (Bradbury et al. 2007) was used to test the genetic association of blast resistance genes with the disease, and was run with 1000 permutations. The population structure analysis was performed using the Bayesian model-based approach implemented in Structure v 2.3.4 (Pritchard et al. 2000). The number of subgroups (K) was varied from 1 to 10. Structure was run using the admixture model with correlated allele frequencies and five independent iterations per K, with a burn-in period of 200,000 followed by 200,000 Markov chain Monte Carlo (MCMC) replications. The optimal K was determined from the peak value of \(\Delta \)K (Evanno et al. 2005) using Structure Harvester 0.6.93 (Earl 2012).
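To illustrate how a Jaccard similarity matrix is obtained from presence/absence scores, a minimal NumPy sketch is given below. It is not the software used in the study; the function names and the three-variety, six-marker score matrix are hypothetical. Jaccard's coefficient counts the markers present in both varieties and divides by the markers present in at least one of them, so shared absences are ignored.

```python
import numpy as np


def jaccard_similarity(a, b):
    """Jaccard coefficient between two 0/1 marker profiles."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.sum(a | b)
    if union == 0:
        return float("nan")  # undefined when neither profile has any band
    return float(np.sum(a & b) / union)


def jaccard_matrix(scores):
    """Pairwise similarity matrix; rows are varieties, columns are markers."""
    scores = np.asarray(scores)
    n = scores.shape[0]
    sim = np.ones((n, n))  # diagonal is set to 1 by convention
    for i in range(n):
        for j in range(i + 1, n):
            sim[i, j] = sim[j, i] = jaccard_similarity(scores[i], scores[j])
    return sim


# Hypothetical scores: 3 varieties (rows) x 6 markers (columns).
example = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0],
]
print(jaccard_matrix(example))
```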

Fig. 1 Position of marker loci used in this study.

Results

Phenotyping and genetic diversity

The panel of 80 NRVs was phenotyped for resistance to leaf blast. Nineteen NRVs (23.75%) were found to be resistant, 21 (26.25%) moderately resistant and 40 (50%) susceptible. The disease score ranged from 0 to 9. Interestingly, resistant NRVs were observed across all the ecologies. The highest number of resistant varieties was observed in the irrigated ecology (5) and the lowest in Boro, whereas the medium deep water and coastal saline ecologies had one each (table 1). The NRVs were genotyped with 36 markers corresponding to 36 blast resistance genes (figure 1). The genetic diversity parameters of the 36 marker loci are presented in table 3. The major allele frequency varied from 0.52 to 0.93 with a mean of 0.75. Similarly, the gene diversity of the 36 markers varied from 0.11 to 0.49 with a mean of 0.34. The PIC was used to measure the informativeness of each marker. The PIC value ranged from 0.11 (RM101 and RM11787) to 0.37 (RM72 and Pia-STS) with an average of 0.27. The two markers with the highest PIC value (0.37), RM72 and Pia-STS, correspond to the Pi33 and Pia genes and can be used effectively for genetic diversity studies (table 3).
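The relationship between major allele frequency, gene diversity and PIC can be made concrete with the standard formulas: gene diversity is \(1-\sum p_i^2\), and PIC (Botstein et al. 1980) subtracts \(\sum_{i<j} 2p_i^2p_j^2\) from it. The sketch below assumes two allelic states per marker (presence and absence); it is an illustration only and not the PowerMarker implementation.

```python
def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content from a list of allele frequencies."""
    n = len(freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return gene_diversity(freqs) - cross


# Two-state marker at the lowest major allele frequency reported (0.52).
most_informative = [0.52, 0.48]
print(round(gene_diversity(most_informative), 4), round(pic(most_informative), 4))
```

For a two-state marker PIC cannot exceed 0.375 (reached at equal frequencies), which is consistent with the highest reported value of 0.37 at a major allele frequency of 0.52.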

Table 3 Genetic diversity indices of 36 marker loci for NRVs.
Fig. 2 Unweighted NJ tree based on molecular markers linked to blast resistance in 80 NRVs. The NRVs are coloured according to (a) subpopulations determined from structure analysis (SG1, red; SG2, pink; SG3, blue; admixture, green); (b) disease reaction (resistant, green; moderately resistant, pink; susceptible, red).

Fig. 3 Estimated population structure of the rice NRVs, partitioned into coloured segments that represent the estimated membership for K\(=\)3. The maximum of the ad hoc measure \(\Delta \)K was observed at K\(=\)3, which indicated that the entire population can be grouped into three subgroups.

Genetic relatedness through cluster analysis, population structure and PCoA

Cluster analysis was performed with the UPGMA and NJ methods in DARwin software, based on 36 markers linked to 36 blast resistance genes. It categorized the NRVs into three major clusters (I, II and III) (figure 2). Cluster I consisted of 43 NRVs and was further divided into two subclusters, IA and IB. Subcluster IA included 38 NRVs, with 11 (28.95%) resistant genotypes. Subcluster IB included only five NRVs, with no resistant genotype. Similarly, cluster II comprised 32 NRVs and was further divided into two subclusters, IIA and IIB. Subcluster IIA consisted of 27 NRVs, six of which were resistant (22.22%), whereas subcluster IIB consisted of only five NRVs, with one resistant genotype. Cluster III was the smallest cluster, comprising five NRVs with only one resistant variety. Interestingly, the majority of resistant genotypes were grouped in major cluster I and fewer in major cluster II. Genetically similar NRVs were clustered together in the same group, whereas NRVs of the same ecology were clustered in different groups.

The population structure of the 80 NRVs was explored using the model-based approach with 36 markers corresponding to 36 blast resistance genes. The peak of the ad hoc statistic \(\Delta \)K was observed at K=3 (figure 3), which indicated the presence of three subgroups (SG1, SG2 and SG3) in the NRVs. With a membership threshold of >55%, the NRVs were classified into three subgroups, with two admixed varieties (table 4). SG1 consisted of 24 NRVs, of which eight (33.33%) were highly resistant. SG2 comprised 27 NRVs, of which six (22.22%) were highly resistant. Similarly, SG3 included 27 NRVs, with five (18.52%) highly resistant NRVs. Interestingly, resistant NRVs were distributed in all three subgroups, with the maximum percentage in SG1. The moderately resistant genotypes were mostly in SG1 and SG2, whereas SG3 was dominated by susceptible genotypes. Further, the highly resistant variety Sarasa belonged to SG2. Accordingly, structure analysis could not differentiate the resistant genotypes but partially separated the moderately resistant and susceptible genotypes.
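The choice of K from \(\Delta \)K can be illustrated with a short sketch of the Evanno statistic, \(\Delta K = |L(K+1) - 2L(K) + L(K-1)| / s[L(K)]\), computed here from the mean and standard deviation of the log-likelihoods over replicate runs at each K. This is one common reading of the formula and is not the Structure Harvester code; all log-likelihood values below are hypothetical and were chosen only to produce a peak at K = 3.

```python
import numpy as np


def delta_k(ln_prob):
    """Evanno's delta K from replicate ln P(D) values.

    ln_prob maps each K to a list of ln P(D) values, one per run.
    K values are assumed to be consecutive integers; delta K is
    undefined for the smallest and largest K.
    """
    ks = sorted(ln_prob)
    means = {k: float(np.mean(ln_prob[k])) for k in ks}
    sds = {k: float(np.std(ln_prob[k], ddof=1)) for k in ks}
    result = {}
    for k in ks[1:-1]:
        second_diff = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
        result[k] = second_diff / sds[k] if sds[k] > 0 else float("inf")
    return result


# Hypothetical ln P(D) values for five runs at each K (not data from the study).
runs = {
    1: [-3000.0, -3002.0, -2998.0, -3001.0, -2999.0],
    2: [-2800.0, -2803.0, -2797.0, -2801.0, -2799.0],
    3: [-2600.0, -2601.0, -2599.0, -2600.5, -2599.5],
    4: [-2590.0, -2595.0, -2585.0, -2592.0, -2588.0],
    5: [-2585.0, -2590.0, -2580.0, -2588.0, -2582.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```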

Table 4 Population structure group of NRVs based on inferred ancestry values.

The marker genotype data were used in a PCoA to estimate the genetic relationships among the NRVs. According to the PCoA, the first two axes explained 12.71% and 8.93% of the total variance (table 5). Based on disease reaction, the NRVs were distributed across both axes: resistant NRVs were found mostly in the first quadrant and moderately resistant genotypes in the third and fourth quadrants, whereas susceptible genotypes were distributed in all four quadrants (figure 4). The population assignment test, performed in GenAlEx, partly distinguished the resistant population from the moderately resistant and susceptible populations (figure 5, a&b). Similarly, it partly distinguished the moderately resistant population from the susceptible population (figure 5c).

Table 5 Percentage of variation explained by the first three axes using blast resistance gene in PCoA.
Fig. 4 PCoA of 36 molecular markers linked to blast resistance in 80 NRVs.

AMOVA

AMOVA is a statistical method for partitioning molecular variation using molecular markers. For AMOVA, the 80 NRVs were categorized into three groups based on their disease reaction: resistant (19), moderately resistant (21) and susceptible (40). Greater variance (97%) was observed within populations, whereas less (3%) was observed between populations (figure 6; table 6). The highest pairwise fixation index \((F_{\mathrm{ST}})\) value of 0.040 was observed between the resistant and susceptible populations, while the lowest was observed between the resistant and moderately resistant populations. The inbreeding coefficients \(F_{\mathrm{IS}}\) and \(F_{\mathrm{IT}}\) were observed to be 1.000. The low \(F_{\mathrm{ST}}\) values suggested that individuals from the different populations are weakly isolated and genetically closely related. The pairwise Nei’s genetic distance ranged from 0.034 (between the resistant and moderately resistant populations) to 0.043 (between the resistant and susceptible populations) (table 7).
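For readers unfamiliar with the pairwise distance reported above, Nei's standard genetic distance between two populations is \(D = -\ln (J_{XY}/\sqrt{J_X J_Y})\), where \(J_X\), \(J_Y\) and \(J_{XY}\) are sums over loci of the products of allele frequencies within and between populations. The sketch below is a simplified illustration that treats band presence and absence as the two allelic states of each marker; GenAlEx may estimate allele frequencies differently, and the frequencies used here are hypothetical.

```python
import math


def nei_distance(p_x, p_y):
    """Nei's standard genetic distance between populations X and Y.

    p_x and p_y hold, for each locus, the frequency of the 'present'
    state in X and Y; the 'absent' state has frequency 1 - p.
    """
    if len(p_x) != len(p_y):
        raise ValueError("both populations need the same number of loci")
    j_x = sum(p * p + (1 - p) ** 2 for p in p_x)
    j_y = sum(p * p + (1 - p) ** 2 for p in p_y)
    j_xy = sum(px * py + (1 - px) * (1 - py) for px, py in zip(p_x, p_y))
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


# Hypothetical presence frequencies at four markers in two populations.
resistant = [0.80, 0.60, 0.30, 0.90]
susceptible = [0.60, 0.50, 0.40, 0.70]
print(round(nei_distance(resistant, susceptible), 4))
```

A distance near zero, as in the values reported here, means that the populations share very similar marker frequencies.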

Fig. 5 Population assignment of NRVs showing the log-likelihood assignment of each NRV by disease reaction: (a) resistant and susceptible populations, (b) resistant and moderately resistant populations, (c) moderately resistant and susceptible populations.

Genetic association of blast resistant genes

The association between disease score and molecular markers was investigated using the GLM to test for significance. Among the 36 markers corresponding to the 36 blast resistance genes, only two markers (RM7364 and pi21_79-3), corresponding to the blast resistance genes Pi56(t) and pi21, were significantly associated with blast disease resistance. The phenotypic variance explained by the two markers varied from 4.9 to 5.1% (table 8). Of the two, the pi21 marker explained the higher phenotypic variance (5.1%), followed by RM7364 (4.9%), whereas the remaining markers did not exhibit significant association at \(P<\) 0.1.
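The idea behind a single-marker association test with permutations can be sketched as follows. For one binary marker, the proportion of phenotypic variance explained (R²) in a simple linear model equals the squared Pearson correlation between marker score and disease score, and a permutation P value is obtained by shuffling the disease scores. This is a simplified stand-in for illustration and not the TASSEL GLM, which may also include covariates; the marker and disease scores below are hypothetical.

```python
import numpy as np


def marker_r2(marker, score):
    """R^2 of disease score regressed on a single 0/1 marker."""
    marker = np.asarray(marker, dtype=float)
    score = np.asarray(score, dtype=float)
    if marker.std() == 0 or score.std() == 0:
        return 0.0  # a monomorphic marker explains no variance
    r = np.corrcoef(marker, score)[0, 1]
    return float(r * r)


def permutation_test(marker, score, n_perm=1000, seed=0):
    """Return (observed R^2, permutation P value) for one marker."""
    rng = np.random.default_rng(seed)
    observed = marker_r2(marker, score)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(np.asarray(score, dtype=float))
        if marker_r2(marker, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: marker presence (1) tracks a high disease score.
marker = [0, 0, 0, 0, 1, 1, 1, 1]
score = [1, 2, 1, 2, 8, 9, 8, 9]
r2, p_value = permutation_test(marker, score)
print(round(r2, 3), p_value)
```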

Discussion

The genetic diversity of crop plants has been eroded by the replacement of landraces and traditional local varieties with improved, high-yielding varieties (Tanksley et al. 1997). The emergence of new and virulent races poses a constant threat to sustainable rice production and global food security. To keep pace with the pathogen, it is necessary to identify potential donors of novel resistance genes/alleles to combat the damage caused by this disease. In this study, we performed candidate gene-based screening of NRVs from eight ecologies as blast resistance donors for rice breeding, using 36 known blast resistance genes.

Fig. 6 AMOVA analysis of NRVs.

Among the NRVs screened for blast resistance in the uniform blast nursery under natural conditions at NRRI, Cuttack, 19 were found to be resistant to leaf blast. Of these 19 NRVs, seven were reported to have been released as resistant to M. oryzae (Yadav et al. 2017). These seven NRVs belong to three different agro-ecologies: Satya Krishna, Chandrama and Abhishek (irrigated), Sahbhagidhan (upland), and Sumit, Reeta and Samalei (shallow lowland). Interestingly, all these NRVs were released after 2006 except Samalei. Similarly, Zhu et al. (2016) reported 40 cultivars as highly resistant in China, 20 of which had previously been reported to be resistant. In a breeding programme, identification of individual resistance genes is often difficult through phenotype-based screening, as the phenotype is influenced by the developmental stage and environmental conditions. Instead, DNA markers linked to R genes offer the easiest and quickest way to identify and select several blast resistance genes without phenotype-based screening (Hayashi et al. 2006).

Table 6 AMOVA among and within populations.
Table 7 Pairwise population matrix of Nei’s genetic distance among the three populations of NRVs.
Table 8 Genetic association of rice blast resistant genes with blast disease in 80 NRVs.

The average gene diversity was 0.34, ranging from 0.11 to 0.49, whereas the major allele frequency had a mean value of 0.75 and varied from 0.52 to 0.93. In previous studies, gene diversity was observed to be 0.67 in 107 NE collections, 0.25 in 80 NRVs, 0.227 in 288 landraces and 0.32 in 167 landraces (Roy et al. 2016; Yadav et al. 2017; Susan et al. 2019; Yadav et al. 2019). The degree of polymorphism was assessed by calculating PIC values, which varied from 0.11 (RM101 and RM11787) to 0.37 (RM72 and Pia-STS) with an average of 0.27. This average was higher than that observed by Yadav et al. (2017) in 80 NRVs (0.18) and by Susan et al. (2019), and lower than the 0.62 reported by Roy et al. (2016) in NE Himalayan landraces. In the present study, low gene diversity and PIC values were observed compared with landraces. These NRVs have been developed under strong artificial selection pressure, which does not operate as strongly in landraces.

Based on the marker genotype data, distance-based clustering categorized the NRVs into three major clusters. Most of the resistant NRVs were grouped in major cluster I, followed by major cluster II. Our results are in accordance with previous studies in which resistant accessions were clustered in one group and susceptible accessions in another (Yadav et al. 2017; Susan et al. 2019). Interestingly, NRVs of the same ecology were not grouped together, whereas genetically similar NRVs were.

The genetic architecture of the NRVs was investigated using the model-based Structure software with 36 molecular markers. Population structure analysis differentiated the 80 NRVs into three subgroups (SG1, SG2 and SG3) with two admixtures. The largest number of resistant genotypes (eight; 33.33%) belonged to SG1, whereas SG2 and SG3 included six (22.22%) and five (18.52%) resistant NRVs, respectively. Consequently, only a weak association with blast reaction was observed through structure analysis. Yadav et al. (2017) categorized 80 NRVs into three subgroups through structure analysis using 17 markers linked to blast resistance. Similarly, Roy et al. (2016) and Susan et al. (2019) divided NE landraces into three and two subpopulations, respectively. The result of the cluster analysis is in accordance with the structure analysis of the NRVs: the NRVs in SG1 and SG2 largely corresponded to cluster I, whereas most of the NRVs of cluster III belonged to SG3.

The PCoA scatter plot partitioned the resistant and susceptible NRVs into different quadrants. Likewise, previous studies also showed the partitioning of resistant and susceptible genotypes into different groups (Yadav et al. 2017; Susan et al. 2019). However, population assignment only partially differentiated the resistant and moderately resistant populations.

AMOVA is a method of estimating molecular variance within a species. Based on AMOVA, variation within populations was higher (97%) than variation between populations (3%). The highest \(F_{\mathrm{ST}}\) was observed between the resistant and susceptible populations (0.040), while the lowest was observed between the resistant and moderately resistant populations. The inbreeding coefficients \(F_{\mathrm{IS}}\) and \(F_{\mathrm{IT}}\) were observed to be 1.0. The low \(F_{\mathrm{ST}}\) values indicated low divergence between the subgroups. Yadav et al. (2017) observed more variance within populations (96%) than between populations (4%), and their \(F_{\mathrm{ST}}\) was in accordance with our result. Similarly, Susan et al. (2019) reported higher variance within populations (96%) and lower variance between populations (4%) in NE landraces. The pairwise Nei’s genetic distance ranged from 0.034 (between the resistant and moderately resistant populations) to 0.043 (between the resistant and susceptible populations).

Candidate gene-based association mapping is an approach to dissecting a trait of interest by investigating individual genes for genetic association with a phenotype (Neale and Savolainen 2004). Through GLM, among the 36 markers tested, two markers corresponding to two blast resistance genes, Pi56(t) and pi21, were found to be significantly associated with blast resistance, with the phenotypic variance explained varying from 4.9 to 5.1%. Interestingly, the Pi56(t) and pi21 genes have been reported to be broad spectrum in nature, and the individual markers can be used for identifying rice blast resistance genes in diverse rice germplasm. The genetic association with rice blast resistance reported in 80 NRVs, 167 landraces and 288 NE landraces varied from 6.5 to 7.7%, which demonstrated its value in identifying markers associated with blast resistance (Yadav et al. 2017; Susan et al. 2019; Yadav et al. 2019). However, the present study did not completely explain the relationship between the resistance gene(s) and the disease reaction. Similarly, Yadav et al. (2017) were not able to explain the resistance spectrum using 17 markers corresponding to 12 blast resistance genes. The resistance spectrum of these NRVs could be explained through the identification of new blast resistance genes, their allelic variants or QTLs. In addition, these varieties were released with multiple stress tolerance for different ecologies and could be tested for other biotic and abiotic stresses. The resistant NRVs identified in the present study could be used as potential donors in breeding for blast resistance, as well as genetic material for the identification, cloning and characterization of new blast resistance genes.