Introduction

Tar spot complex of maize (Zea mays L. subsp. mays) causes severe yield losses in some countries in America and the West Indies (Hock et al. 1995). When infection occurs on susceptible maize genotypes, and conditions are favorable for disease development, the whole maize foliage can be destroyed in 8–14 days, making tar spot complex (TSC) one of the most devastating diseases of maize (Bajet et al. 1994). TSC results from an interaction of at least three fungal species: Phyllachora maydis, Monographella maydis, and Coniothyrium phyllachorae (Hock et al. 1992). Phyllachora maydis is usually the first to cause leaf lesions. While M. maydis is a common benign saprophyte on leaf surfaces, it becomes pathogenic and highly virulent only in association with P. maydis. Coniothyrium phyllachorae may be a hyperparasite of the other two, but its role is not yet fully understood (Bajet et al. 1994; Ceballos and Deutsch 1992).

Host reaction to P. maydis and M. maydis can easily be distinguished. Phyllachora maydis induces small, round dark lesions, 1–2 mm in diameter, whereas M. maydis causes a brown elliptic necrotic halo surrounding each P. maydis lesion (Hock et al. 1989). Some spots enlarge around the ascomata, with an initially water-soaked area becoming necrotic, to form circular-oval, brown lesions 3–8 mm in diameter with a dark outer edge (Supplementary Figure S1 a–c); this is called the ‘fish-eye’ symptom associated with TSC disease (Bajet et al. 1994; Hock et al. 1992). The fungus spreads from the lowest leaves to upper leaves, leaf sheaths and the husks of developing ears. When conditions favor disease, leaves may be fully blighted and dead in 21–30 days (Hock et al. 1992). In susceptible genotypes, 80 % or more of the leaf area may be affected, leaving little green tissue or killing the plant entirely (Supplementary Figure S1 d–g) (Ceballos and Deutsch 1992).

In the presence of M. maydis, P. maydis produces a toxin that rapidly kills plant tissue and affects the photosynthetic ability of the plant. The resulting ears are shriveled and have reduced weight and fewer kernels, which may germinate prematurely on the cob (Supplementary Figure S1 h–i) (Bajet et al. 1994; Hock et al. 1989; CIMMYT Maize Program 2004). In 2012, more than 6000 ha of maize were devastated by TSC in just 8 days, following a heavy 4-day fog in the state of Oaxaca, Mexico, resulting in an estimated yield loss of up to 70 %. In the northern province of Guatemala, an estimated yield loss of up to 75 % was reported in the 2008/2009 season, and similar problems have been observed in Honduras and El Salvador in recent years (ProMed-Mail 2009; ProMed-Mail 2011; ProMed-Mail 2013).

Developing resistant maize varieties and hybrids is the preferred method for controlling TSC because this strategy is the most cost-effective and efficient, and can easily be adopted by smallholder farmers who produce most of the maize in Latin America. The International Maize and Wheat Improvement Center (CIMMYT) identified and developed breeding lines and populations that showed resistance to TSC (Bajet et al. 1994; Ceballos and Deutsch 1992; Mahuku et al. 2013). However, to date, the genetic architecture underlying resistance to TSC in maize is not well understood. Only two studies have been conducted to dissect the genetics of host resistance to P. maydis (Ceballos and Deutsch 1992; Hernández-Ramos et al. 2015). The first paper described resistance to P. maydis as simple in nature and primarily conditioned by single dominant genes with additive effects (Ceballos and Deutsch 1992). The second paper reported general combining ability (GCA) as a very important factor in determining resistance to TSC (Hernández-Ramos et al. 2015). Given the increasing negative impact of TSC on maize production in some countries in America, it is important to identify and deploy heritable resistance to the disease.

Identification and development of molecular markers closely linked to underlying genes or quantitative trait loci (QTL) and their application in marker-assisted selection (MAS) can enhance the efficiency of breeding activities to introgress TSC resistance genes into adapted but susceptible germplasm. This requires first identifying stable sources of resistance and mapping genes or QTL conditioning TSC resistance to specific regions of the maize genome to allow the use of MAS. Classical linkage mapping based on bi-parental mapping populations is an effective method and has been used to identify large effect resistance loci for several diseases of maize (Chen et al. 2012; Weng et al. 2012; Yang et al. 2010; Zhang et al. 2012). However, resolving small effect QTL is challenging with bi-parental mapping populations (Holland 2007). Recently, genome-wide association studies (GWAS) have shown great potential to overcome the limitations of QTL mapping by detecting QTL with high resolution in diverse germplasm panels (Myles et al. 2009), but the often high false positive rate is a drawback (Larsson et al. 2013). Classical linkage mapping and GWAS, when combined, provide a more powerful approach to identify candidate QTL for complex diseases. For example, natural allelic variation for resistance to northern corn leaf blight was detected in maize using such a combined approach (Poland et al. 2011). Aluminum tolerance genes were likewise identified by combined association and linkage analysis (Krill et al. 2010).

To the best of our knowledge, no reports have been made on mapping QTL associated with TSC resistance. Understanding the genetic basis of TSC resistance will facilitate selection of resistance genes and their use in maize breeding. This study was conducted to elucidate the genetic architecture of TSC resistance using GWAS with 56 K SNP markers, TSC phenotypic data collected from multiple environments and QTL mapping in three independent bi-parental populations. The objectives were to (1) identify the best-bet donors of TSC resistance among 890 tropical maize inbred lines, (2) estimate the number and chromosomal position of QTL conferring resistance to TSC, (3) identify favorable haplotypes and candidate genes for TSC resistance, and (4) determine markers flanking the favorable allele(s) that could be further developed and applied in MAS to improve TSC resistance. This is the first study using GWAS and QTL linkage mapping to detect loci associated with TSC resistance in maize.

Materials and methods

Plant material

Eight hundred and ninety (890) maize inbred lines, broadly representing tropical/subtropical maize genetic diversity and including CIMMYT maize lines (CMLs) and germplasm derived from breeding programs targeting tolerance to drought, soil acidity and low nitrogen, and resistance to insects and pathogens, were used for GWAS (Semagn et al. 2012; Wen et al. 2011). Most of these lines are included in two association mapping (AM) panels constituted under the DTMA (Drought Tolerant Maize for Africa) and IMAS (Improved Maize for African Soils) projects led by the Global Maize Program of CIMMYT, which comprise 300 and 400 lines, respectively (Semagn et al. 2012; Wen et al. 2011). Linkage mapping was conducted on a set of three bi-parental populations: population 1 (Pop1) was a doubled haploid (DH) population derived from an F1 cross between white maize inbred lines CML495 (TSC resistant) and LaPostaSEQ.C7F64-2-6-2-2-B*3 (TSC susceptible) and consisted of 201 DH lines; population 2 (Pop2) was a DH population derived from an F1 cross between yellow maize inbred lines CML451 (TSC resistant) and DTPYC9-F46-1-2-1-2-B*3 (TSC susceptible) and was composed of 116 lines; population 3 (Pop3) was a white F2:3 population derived from a cross between CML492 and LPSMT and was composed of 277 families.

Field trials

The DTMA-AM panel, comprising 300 lines, was evaluated for response to TSC in field trials conducted in six environments (location × year combinations), five in Mexico and one in Colombia (Table 1). In Mexico, the panel was evaluated at CIMMYT’s lowland tropical experiment station of Agua Fria in 2009, 2011 and 2012, and in Guerrero in 2011 and 2012 (Table 1). The DTMA-AM panel was further evaluated in Tambo, Colombia in 2010 (Table 1). The IMAS-AM panel was evaluated in Agua Fria and Guerrero in 2011. A subset of 138 inbred lines, representing different disease response classes and selected using data from all sites, was evaluated in 2013 in Agua Fria (Table 1). The bi-parental populations were evaluated in Agua Fria: Pop1 in 2012 and 2013, and Pop2 and Pop3 in 2012. For all trials, experimental units consisted of single-row plots arranged in an α-lattice design with three replications. Twenty seeds were planted in 2-m rows, with 0.75 m between rows and 0.2 m between plants in a row. Two seeds were planted per hill and later thinned to a single plant. Standard agronomic practices were followed and application of nitrogen, phosphorus and potassium fertilizers was based on recommended amounts for each location.

Table 1 Description of environments and number of elite maize inbred lines evaluated for response to tar spot complex (TSC) in each environment

Disease evaluations

Experiments were conducted in areas known to have high and consistent natural expression of the disease. In all years, disease pressure and severity were adequate to separate resistant from susceptible maize genotypes. Disease severity was evaluated three times: the first evaluation was conducted approximately 2 weeks after flowering, and the second and third evaluations followed at approximately 7-day intervals. Disease expression and severity were rated on a per-plot basis, using a 1–5 scale with 0.5 increments: 1 = highly resistant (HR): no visible disease symptoms or lesions identifiable on any of the leaves; 2 = resistant (R): moderate lesion development below the leaf subtending the ear, or disease symptoms covering approximately 30 % of the leaf area; 3 = moderately susceptible (MS): heavy lesion development on and below the leaf subtending the ear and a few lesions above it, or 50 % of the leaf surface showing disease symptoms; 4 = susceptible (S): many or severe lesions on all but the uppermost leaves, which may have a few lesions, with lesions coalesced and blighted, or 70 % of the leaf surface showing disease symptoms; 5 = highly susceptible (HS): all leaves dead with no green leaf tissue remaining, or disease symptoms on >80 % of the leaf surface.

Phenotypic data analysis

Descriptive statistics (e.g. mean, range, skewness and kurtosis) and correlations of phenotypic data were calculated in Excel 2010. Other statistical analyses were performed using R software (R Core Team 2013). Violin plots showing the probability density of TSC phenotypic data for each environment were generated using a violin plot package in R. To verify that data were normally distributed, residuals for each site were calculated using the resid function in R, and density plots were generated using the plot and density functions. Variance components were estimated by fitting the model with the REML option of the lmer function in R and extracting the components with the VarCorr function; broad sense heritability (\(H^{2}\)) and repeatability (\(H^{2}\)) for single and combined environments were estimated using the following formulae (Knapp et al. 1985):

$${\text{Single environment:}}\;H^{2} = \frac{\sigma_{\text{G}}^{2}}{\sigma_{\text{G}}^{2} + \sigma_{\text{e}}^{2}/r} \tag{1}$$
$${\text{Combined environments:}}\;H^{2} = \frac{\sigma_{\text{G}}^{2}}{\sigma_{\text{G}}^{2} + \sigma_{\text{GL}}^{2}/l + \sigma_{\text{e}}^{2}/(lr)} \tag{2}$$

where \(\sigma_{\text{G}}^{2}\) is the genetic variance, \(\sigma_{\text{GL}}^{2}\) is the genotype × environment variance, \(\sigma_{\text{e}}^{2}\) is the error variance, l is the number of environments, and r is the number of replications in each environment. Correlation coefficients were obtained using Pearson’s statistic, implemented using the cor function in R. Best linear unbiased predictions (BLUPs) were obtained for single and combined environments by fitting mixed linear models with the lmer function in the R package “lme4” to estimate entry effects, using the following formulae:

$${\text{Single environment BLUP:}}\;{\text{Pheno}} \sim {\text{Cov}} + (1|{\text{Entry}}) + (1|{\text{Rep}}) + (1|{\text{Block}}:{\text{Rep}}) \tag{3}$$
$${\text{Combined BLUP:}}\;{\text{Pheno}} \sim {\text{Cov}} + (1|{\text{Entry}}) + (1|{\text{Env}}) + (1|{\text{Rep}}:{\text{Env}}) + (1|{\text{Block}}:{\text{Rep}}:{\text{Env}}) + (1|{\text{Entry}}:{\text{Env}}) \tag{4}$$

where Pheno is the trait data; Cov is the covariate (silking date), fitted as a fixed linear covariate to correct the phenotype; Entry refers to the lines evaluated; Env to the environments, each a combination of year and location; Rep to the replications in each environment; and Block to the blocks of the α-lattice experimental design. The parentheses indicate random effects, the vertical bar character “|” separates an expression for a model matrix and a grouping factor, and “:” refers to interactions. Environments were clustered from the environment correlation matrix using the ward.D method in the hclust function in R. Principal Coordinate Analysis (PCO) of environments was performed on the environment correlation matrix using the prcomp function in R.
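As an illustration of Eqs. 1 and 2, the following minimal Python sketch computes repeatability and heritability from variance components. The function names and the variance component values are hypothetical and are not taken from this study; in the analysis described above, the components were estimated with lmer and VarCorr in R.

```python
def repeatability_single(var_g, var_e, n_reps):
    """Eq. 1: H^2 = var_G / (var_G + var_e / r)."""
    return var_g / (var_g + var_e / n_reps)


def heritability_combined(var_g, var_gl, var_e, n_envs, n_reps):
    """Eq. 2: H^2 = var_G / (var_G + var_GL / l + var_e / (l * r))."""
    return var_g / (var_g + var_gl / n_envs + var_e / (n_envs * n_reps))


# Hypothetical variance components, for illustration only.
var_g, var_gl, var_e = 0.30, 0.10, 0.45

# Three replications per trial and seven environments, as in the trial design.
h2_single = repeatability_single(var_g, var_e, n_reps=3)
h2_combined = heritability_combined(var_g, var_gl, var_e, n_envs=7, n_reps=3)
print(round(h2_single, 3), round(h2_combined, 3))
```

With the same variance components, the combined estimate is higher than the single-environment one because the genotype × environment and error terms are divided by the number of environments.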

Genotypes analysis and GWAS

Total DNA was extracted from bulked young leaves of each inbred line according to the CTAB method (CIMMYT 2005), and the DNA quality of each sample was checked using gel electrophoresis and a spectrophotometer (NanoDrop ND8000, Thermo Scientific) before genotyping. Genotyping was performed using the Illumina MaizeSNP50 BeadChip, which contains 56,110 evenly spaced SNPs covering the whole maize genome (Ganal et al. 2011). Summary statistics of the genotypic data (allele frequency, heterozygosity rate and missing rate) were calculated using the software PLINK v1.07 (Purcell et al. 2007), and markers were filtered using Excel 2010.

Principal Component Analysis (PCA), used to visualize genetic relationships between maize lines and to control for population structure, was conducted using the prcomp function in R, and the genomic relationship (kinship) matrix was calculated in R using the following formula (VanRaden 2008):

$$G = \frac{zz^{\prime}}{2\sum_{i} P_{i}\left(1 - P_{i}\right)} \tag{5}$$

where z is the genotype matrix of SNP codes (0 = homozygous major allele, 1 = heterozygous, 2 = homozygous minor allele), and \(P_{i}\) is the minor allele frequency for SNP i (i = 1, …, total number of SNPs).
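A minimal Python sketch of Eq. 5 is given below for a toy genotype matrix. The function name and the toy data are hypothetical. The sketch centers each SNP column by twice its allele frequency before forming the cross-product, as in VanRaden (2008); whether the same centering was applied in this study is an assumption, since the text above only gives the 0/1/2 coding.

```python
def vanraden_kinship(genotypes):
    """Genomic relationship matrix from minor-allele counts.

    genotypes: list of lines, each a list of SNP codes (0, 1 or 2).
    Returns a square matrix (list of lists) with one row per line.
    """
    n_lines = len(genotypes)
    n_snps = len(genotypes[0])
    # Minor allele frequency P_i of each SNP, estimated from the sample.
    freqs = [sum(line[i] for line in genotypes) / (2 * n_lines)
             for i in range(n_snps)]
    # Assumption: center each SNP by 2 * P_i, following VanRaden (2008).
    z = [[line[i] - 2 * freqs[i] for i in range(n_snps)]
         for line in genotypes]
    # Denominator of Eq. 5: 2 * sum of P_i * (1 - P_i) over SNPs.
    denom = 2 * sum(p * (1 - p) for p in freqs)
    return [[sum(a * b for a, b in zip(row_j, row_k)) / denom
             for row_k in z] for row_j in z]


# Hypothetical toy data: four lines scored at four SNPs.
toy = [
    [0, 0, 2, 0],
    [0, 2, 0, 0],
    [2, 0, 0, 1],
    [0, 0, 0, 0],
]
G = vanraden_kinship(toy)
for row in G:
    print([round(value, 3) for value in row])
```

Because the allele frequencies are estimated from the same sample, the centered columns sum to zero and each row of G sums to zero, which is a convenient sanity check on the computation.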

GWAS was conducted in TASSEL software (Bradbury et al. 2007) using a mixed linear model that included BLUPs, markers, the kinship matrix (K) and principal components from PCA. The analysis was conducted for each environment separately, and the p values for each marker were combined using Fisher’s method as described by Chen (2011); the result was used to make a Manhattan plot. The following formula was used for combining marker p values (Chen 2011):

$$X = -2\sum_{i=1}^{k} \ln\left(P_{i}\right) \tag{6}$$

where \(P_{i}\) is the p value from study i (here, environment i) and k is the number of p values combined. X follows a \(\chi^{2}_{2k}\) distribution, so the combined p value was obtained using the pchisq function in R. The significance threshold for the Fisher combined p value was set using the Bonferroni correction (Bland and Altman 1995).
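The combination step in Eq. 6 can be sketched in a few lines of Python. This is an illustration rather than the code used in the study: the function names are hypothetical, the upper tail of the chi-square distribution with 2k degrees of freedom is computed from its closed form for even degrees of freedom instead of calling pchisq, and the significance level of 0.05 in the Bonferroni helper is an assumption, since the level is not stated in the text.

```python
import math


def fisher_combined_p(p_values):
    """Combine k p values; returns (X, combined p value).

    X = -2 * sum(ln P_i) follows a chi-square distribution with 2k df.
    For even df the upper tail has the closed form
    exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!.
    All p values must be greater than zero.
    """
    k = len(p_values)
    x = -2.0 * sum(math.log(p) for p in p_values)
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return x, math.exp(-half) * total


def bonferroni_threshold(n_tests, alpha=0.05):
    """-log10 of the Bonferroni-corrected level (alpha is an assumption)."""
    return -math.log10(alpha / n_tests)


# Hypothetical per-environment p values for one marker.
x_stat, p_combined = fisher_combined_p([0.01, 0.20, 0.03, 0.50])
print(x_stat, p_combined)
print(bonferroni_threshold(50000))
```

A marker would be declared significant when the −log10 of its combined p value exceeds the Bonferroni threshold computed for the number of markers tested.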

QTL mapping

Low-density markers (around 200 markers) are sufficient for QTL mapping in bi-parental populations. The KASP™ (Kompetitive Allele-Specific PCR) system from LGC (http://www.lgcgroup.com/) was therefore chosen as the genotyping platform for the bi-parental populations, given its low cost, flexibility and high accuracy compared to other low-density marker platforms (He et al. 2014; Semagn et al. 2014). The linkage map was constructed using the software IciMapping v3.2 (Wang et al. 2012) with known SNPs anchored on each chromosome, and unknown SNP markers were assigned to known chromosomes by smallest linkage distance. The linkage map was visualized using MapDraw (Liu and Meng 2003). Inclusive Composite Interval Mapping (ICIM) (Li et al. 2008) in the software IciMapping v3.2 was used for QTL mapping (Wang et al. 2012). ICIM applies a two-step strategy that separates cofactor selection from the interval mapping process, which controls background additive and dominance effects more effectively and improves the mapping of QTL with additive effects compared to Composite Interval Mapping (CIM). The step for ICIM was set to 1 cM, and the logarithm of the odds (LOD) threshold was set to 2.5.

Candidate gene and haplotype analysis

Based on the GWAS results, the sequence of each significant SNP was used to perform a BLAST search against the “B73” genome sequence through the MaizeGDB database (http://www.maizegdb.org/). The candidate gene and its hypothetical protein sequence were obtained from MaizeGDB, and the gene annotation was obtained from PlantGDB (http://www.plantgdb.org/ZmGDB/) and Gramene (http://www.gramene.org/). The linkage disequilibrium (LD) between markers was estimated using standardized disequilibrium coefficients (D′) and squared allele frequency correlations (\(r^{2}\)) based on the method described by Flint-Garcia et al. (2003). LD plots were generated using the LDheatmap package in R software (R Core Team 2013). Haplotype blocks were identified based on LD, and the effect of each haplotype allele was calculated using the lmer function in R.
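To make the two LD measures concrete, the sketch below computes D′ and \(r^{2}\) for a pair of biallelic SNPs from phased two-locus haplotypes, using the standard definitions. The function name and the toy data are hypothetical, and treating each inbred line as contributing one known haplotype is a simplifying assumption.

```python
def pairwise_ld(haplotypes):
    """Return (D', r^2) for two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples,
    with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n           # allele 1 frequency, SNP 1
    p_b = sum(h[1] for h in haplotypes) / n           # allele 1 frequency, SNP 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n
    d = p_ab - p_a * p_b                              # raw disequilibrium D
    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Hypothetical examples.
complete = [(1, 1), (1, 1), (0, 0), (0, 0)]   # only two haplotypes present
partial = [(1, 1), (1, 1), (1, 0), (0, 0)]    # three of four haplotypes present
print(pairwise_ld(complete))
print(pairwise_ld(partial))
```

The second example shows why both measures are reported: D′ stays at its maximum whenever one of the four possible haplotypes is absent, whereas \(r^{2}\) also depends on how similar the allele frequencies at the two SNPs are.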

Results

Response of tropical maize germplasm to TSC

The descriptive statistics for response of maize inbred lines in seven environments are presented in Table 2. Disease expression varied between locations (Table 2; Fig. 1), but for each environment there was sufficient differentiation of resistant and susceptible germplasm, and the repeatability (\(H^{2}\)) of the trials was generally high, ranging from 0.57 (Guerrero 2012) to 0.96 (Agua Fria 2013), indicative of reliable phenotypic data. Among the seven environments, the greatest differentiation occurred in AF13A, where disease scores ranged from 1 to 4.5 on a 1 to 5 scale, and the average disease score was 2.75. The least disease pressure was observed for COL10 (average disease score of 1.73; maximum disease score of 3.67), revealing possible differences in TSC pathogen population structure or variation in environmental conditions favoring disease expression. The distribution of disease scores and the high repeatability (\(H^{2}\)) revealed the presence of predominantly genetically controlled resistance to TSC in tropical maize germplasm. Kurtosis results revealed the predominance of major genes controlling TSC resistance. Most of the values of skewness were positive, revealing that most of the inbred lines were susceptible to TSC in most of the environments, except in AF13A, where the skewness is towards the left. This is not surprising as the genotypes used in AF13A had been selected to validate responses to TSC.

Table 2 Descriptive statistics for response of elite tropical maize inbred lines to tar spot complex (TSC) in seven environments
Fig. 1 Violin plots depicting tar spot complex (TSC) scores of elite tropical maize inbred lines evaluated in seven environments. The black bars inside each plot represent the first and third quartiles. White dots represent the median. The width of the plot represents the probability density of the data at different values

Ward hierarchical clustering of the seven environments and principal coordinate analysis (PCO) revealed differences in TSC responses in different environments (Fig. 2). Therefore, GWAS was conducted for each environment. Response to TSC was negatively correlated with maturity (r = −0.56; p < 0.0001 for anthesis time) (Table 3). Therefore, anthesis date (AD) was used as a covariate to correct for the effect of maturity on response to TSC in the final BLUP calculation and in the GWAS analysis. From the phenotypic evaluation of maize inbred lines, several lines that had a resistant response in all environments were identified (Supplementary Table S1). These inbred lines could be used immediately in breeding programs to improve TSC resistance.

Fig. 2 Ward hierarchical clustering and principal coordinate analysis of tar spot complex (TSC) data from seven environments. The two analyses revealed differences in response to TSC in different environments (environmental grouping)

Table 3 Phenotypic (below the diagonal) and genetic (above the diagonal) correlation between response to TSC and maturity (flowering time) measured as anthesis day (AD) and silking day (SD)

Response of bi-parental populations to TSC

Three bi-parental populations (Pop1 is a DH population with 201 lines; Pop2 is a DH population with 116 lines; Pop3 is an F2:3 population with 277 families) and their parents were evaluated in Agua Fria in 2012 and 2013. The parents of these three populations showed significant differences in TSC resistance across all environments. For example, the average TSC score for the resistant parent of Pop1 (CML495) was 1.18 over the 2 years, while that of the susceptible parent was 3.36. Similar trends were observed for Pop2 and Pop3. Within each population, significant differences between entries were observed; for example, the minimum mean disease score over the two environments for Pop1 was 1 and the maximum mean disease score was 4 (Table 4). The positive values of kurtosis and skewness revealed the presence of few genes conditioning resistance, and that most of the lines were TSC susceptible. The repeatability (\(H^{2}\)) of the data ranged from 0.69 in AF12 to 0.86 in AF13, with a combined \(H^{2}\) of 0.71, revealing that the data were reliable in separating resistant from susceptible lines. The mean score of the resistant parent (CML451) for Pop2 was 1.33 while that for the susceptible parent (DTPYC9-F46-1-2-1-2-B*3) was 3.33, and the repeatability (\(H^{2}\)) of the experiment was 0.68. For Pop3, the repeatability (\(H^{2}\)) was 0.64. Among the three populations, the repeatability (\(H^{2}\)) of TSC resistance was generally high, ranging from 0.64 to 0.86, revealing the presence of predominantly genetically controlled resistance in these populations.

Table 4 Descriptive statistics for response of bi-parental populations to tar spot complex (TSC)

A major QTL for TSC resistance identified by GWAS

Marker-trait association analysis was conducted separately for each of the seven environments, using a compressed mixed linear model (MLM) integrating PCA and kinship in TASSEL software (Bradbury et al. 2007) (Supplementary Figure S2), and the Fisher method was then used to combine the p values from all environments. GWAS revealed a total of 43 markers that were significantly associated with TSC resistance using the Bonferroni correction threshold (−log10 (p) > 5.99), which is a conservative threshold for avoiding false positives in GWAS (Bland and Altman 1995). These markers were distributed in three QTL regions (Supplementary Table S2). One marker was located on chromosome 2 (bin 2.07) (PZE-102127730; p = 6.09E−08); one on chromosome 7 (bin 7.02) (PZE-107043784; p = 2.71E−10); and the rest of the markers (41) were located on chromosome 8 (bin 8.03) at around the 81 million base pair position based on the B73 reference genome (B73 RefGen_v1) (Fig. 3). The most significant marker, with the smallest p value (3.29E−19; PZE-108048978), was on chromosome 8 at position 82523744 on the B73 reference genome (B73 RefGen_v1), and it explained 7.25 % of the phenotypic variance. Results from this study reveal that TSC resistance appears to be simple in nature, and conditioned by three QTL located on chromosomes 2, 7 and 8. The TSC locus on chromosome 8, designated qRtsc8-1, is a high LD region close to the centromere spanning about 12,000 kb. This locus might be a major QTL (p = 3.29E−19) conditioning TSC resistance in maize. Candidate gene analysis of qRtsc8-1 identified 203 gene loci that included 104 putative uncharacterized proteins and 99 hypothetical genes with known predicted function (Supplementary Table S5). Of the genes with known function, GRMZM2G073884, coding the leucine-rich repeat receptor-like protein (LRR-RKS), was associated with the SNP marker having the lowest p value in GWAS. The LRR-RKS gene family is associated with disease resistance in plants.

Fig. 3 Genome-wide association analysis (GWAS) of resistance to tar spot complex (TSC) with a mixed linear model and combined phenotypic data from seven environments, represented by a Manhattan plot with chromosomes and physical positions of SNPs on the X-axis and the −log10 p value of each SNP derived from the association study on the Y-axis

The major QTL was confirmed by linkage mapping

Linkage map construction and QTL mapping were conducted using IciMapping v3.2 (Wang et al. 2012). The linkage map for Pop1 was constructed using 166 SNP markers that were polymorphic between the inbred parental lines, CML495 and LaPostaSEQ.C7F64-2-6-2-2-B*3. The map length was 1260 cM and the average interval between markers was 8.83 cM (Supplementary Figure S3). For Pop2, the linkage map was constructed using 253 SNP markers that were polymorphic between the parental lines CML451 and DTPYC9-F46-1-2-1-2-B*3, and the map length was 1352.4 cM with an average marker interval of 5.57 cM (Supplementary Figure S4). A linkage map was not constructed for Pop3, as only a few markers (124 SNPs) were detected in this population. Instead, the single marker analysis (SMA) method was used for QTL mapping. The marker order of the final two linkage maps was in agreement with the maize physical map (http://www.maizegdb.org).

A total of four QTL were detected in Pop1 using combined data from the two environments in which Pop1 was evaluated, and these QTL were located in chromosome bins 1.05, 7.02, 8.03 and 10.04 (Fig. 4). The QTL on chromosome 8 was detected in both environments at a LOD score of 27.69. This QTL explained 41.46 % of the phenotypic variation for TSC in this population (Supplementary Table S3). This appears to be a major QTL for TSC resistance and it was stable across environments. The QTL on chromosome 10 was also detected in both environments, and the phenotypic variance attributed to this QTL ranged from 4.02 to 5.27 %, revealing that it is a minor QTL. The QTL on chromosomes 1 and 7 appear to be minor and not stable, as they were detected only in AF13 and not in AF12.

Fig. 4 Plot of LOD scores from quantitative trait loci (QTL) analysis for tar spot complex (TSC) resistance across 10 maize chromosomes in three bi-parental populations. Pop1 and Pop2 are doubled haploid populations composed of 201 and 116 individuals, respectively, while Pop3 is an F2:3 derived population with 277 families. A major QTL in chromosome bin 8.03 was detected in all three populations

Three QTL were detected in Pop2, in chromosome bins 6.02, 7.02 and 8.03 (Fig. 4). The QTL on chromosome 8 had the largest effect and explained 18.73 % of the phenotypic variation for TSC (Supplementary Table S3). The QTL on chromosomes 6 and 7 had small effects, explaining 10.41 and 10.43 % of the phenotypic variance, respectively. For the F2:3 derived population (Pop3), two chromosomal loci were detected using single marker analysis, one in bin 6.02 and another in bin 8.03 (Fig. 4). The region on chromosome 8 had the highest LOD value; the marker PZA03638_1 (LOD = 17.87) explained 25.71 % of the phenotypic variation for TSC resistance (Supplementary Table S4). The marker on chromosome 6 (PZB00942_1; LOD = 3.09) explained 5.01 % of the phenotypic variation for TSC.

The QTL on chromosome 8 was detected in all three bi-parental populations and across all environments, revealing that it is a stable QTL conditioning TSC resistance in tropical maize germplasm and in different environments and backgrounds. The same QTL was also detected through GWAS. This QTL, designated qRtsc8-1, had the largest LOD value, revealing that it is a major resistance locus for TSC.

Haplotype and candidate gene analysis for qRtsc8-1 region

Linkage disequilibrium (LD) analysis within the qRtsc8-1 region revealed that the average LD between markers was 0.75 (\(r^{2}\)) and the maximum LD was 1 (\(r^{2}\)), revealing that the markers in this region were in a high LD block (Fig. 5). Haplotype construction using 41 significant SNP markers in the qRtsc8-1 region identified more than 20 haplotype alleles; the top 10 haplotypes were named H1 to H10. Two haplotypes (H1 and H2) occurred at more than 10 % frequency within the GWAS panel. The major haplotype (H1) had a frequency of 40.9 % within the GWAS panel; a second haplotype (H2) had a frequency of 17.3 %, while a third haplotype (H3) had a frequency of 9.6 % (Table 5). Analysis to estimate the contribution of each haplotype revealed that haplotype H5 had the highest effect on disease, increasing TSC resistance by 14.6 % compared to the major haplotype H1 (Table 5). The frequency of occurrence of this haplotype in the GWAS panel was 3.5 %, revealing that it is a rare haplotype. The haplotype with the second greatest effect on TSC was H2, which increased resistance by 11.1 %, followed by H9 (9.5 %) and H8 (8.1 %). Haplotypes H3, H4, H6, H7, and H10 increased susceptibility to TSC compared to the major haplotype H1, with haplotype H6 having the greatest negative effect (−12.6 %). The two most favorable haplotypes, H2 and H5, occurred in the GWAS panel at frequencies of 17.3 and 3.5 %, respectively, compared to H1, which occurred at a frequency of 40.9 %. Therefore, it is very valuable to increase the frequency of the favorable haplotypes, especially H2 and H5, in breeding programs, as these contributed most to TSC resistance.
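The bookkeeping behind such a haplotype summary can be sketched as follows. This is a simplified illustration with hypothetical names and toy data: it tabulates haplotype frequencies and compares mean disease scores against the most frequent haplotype, whereas the haplotype effects reported above were estimated with a mixed model (lmer) and expressed as percentages.

```python
from collections import Counter, defaultdict


def haplotype_summary(records):
    """Summarize haplotype frequency and mean disease score.

    records: list of (haplotype_string, disease_score) pairs, one per line.
    Returns (major_haplotype, {haplotype: summary dict}).
    """
    counts = Counter(hap for hap, _ in records)
    total = len(records)
    scores = defaultdict(list)
    for hap, score in records:
        scores[hap].append(score)
    # The most frequent haplotype serves as the reference.
    major = counts.most_common(1)[0][0]
    major_mean = sum(scores[major]) / len(scores[major])
    summary = {}
    for hap, n in counts.items():
        mean = sum(scores[hap]) / n
        summary[hap] = {
            "frequency": n / total,
            "mean_score": mean,
            # Negative values mean lower disease scores than the reference.
            "diff_vs_major": mean - major_mean,
        }
    return major, summary


# Hypothetical toy data: three-SNP haplotypes with 1-5 disease scores.
toy = [("AAG", 3.0), ("AAG", 3.5), ("AAG", 2.5),
       ("GCT", 1.5), ("GCT", 2.0), ("ACT", 4.0)]
major, summary = haplotype_summary(toy)
for hap, stats in summary.items():
    print(hap, stats)
```

On the 1–5 scale used here, a haplotype with a negative difference relative to the major haplotype is associated with greater resistance.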

Fig. 5 Linkage disequilibrium (LD) analysis across the significant markers detected through genome-wide association analysis (GWAS) in the qRtsc8-1 region. LD is measured as (a) \(r^{2}\) and (b) D′, with maximum LD depicted in red and least LD depicted in white. The five SNPs with the lowest p values (highly significant) are indicated on each heatmap, while the rest of the markers are shown in Tables S2 and S5

Table 5 Top ten haplotypes, haplotype genotype and frequencies in the qRtsc8-1 region and haplotype contribution to TSC resistance or susceptibility

Discussion

A major QTL conditions TSC resistance

Previous studies using segregating bi-parental populations revealed the presence of a single dominant gene conditioning TSC resistance in tropical maize (Ceballos and Deutsch 1992). Using GWAS and linkage mapping, we confirmed the presence of a major QTL conditioning TSC resistance in tropical maize germplasm, and further showed that the major QTL is located on chromosome 8, bin 8.03. The QTL was confirmed in three bi-parental populations developed from different parental lines. We propose to name this QTL qRtsc8-1, and to the best of our knowledge, this is the first locus reported to condition TSC resistance in maize.

In addition to qRtsc8-1, two minor QTL were detected on chromosome 2 (bin 2.07) and chromosome 7 (bin 7.02). The QTL on chromosome 7 was also detected in two bi-parental mapping populations and it explained up to 10.4 % of the phenotypic variation observed for TSC. Furthermore, three additional QTL were detected through linkage mapping, on chromosome 6 bin 6.02 (Pop2 and Pop3), chromosome 1 bin 1.05 (Pop1) and chromosome 10 bin 10.04 (Pop1). Taken together, it appears that TSC resistance loci are distributed on several chromosomes (e.g. 1, 2, 6, 7, 8, and 10) within the maize genome. The distribution of disease resistance QTL across the maize genome is not a new phenomenon and has been reported for other maize diseases (Wisser et al. 2006). QTL for TSC resistance were in regions harboring resistance QTL for other maize diseases, including gray leaf spot (localized on chromosome 1, bin 1.05; chromosome 2, bin 2.07; chromosome 6, bin 6.02; chromosome 7, bin 7.02; chromosome 8, bin 8.03 and chromosome 10, bin 10.04) (Wisser et al. 2006). Other maize diseases with resistance QTL localized in the same regions as those of TSC are common smut, multiple viral pathogens, southern corn leaf blight, northern corn leaf blight, and common rust (Wisser et al. 2006). Detailed analysis of the same GWAS population evaluated against several maize pathogens would provide more information on the organization of disease resistance genes in the maize genome. Nevertheless, our study makes an important contribution towards understanding the genetic architecture of disease resistance in maize by providing information on the genetics of TSC resistance in tropical maize.

Most maize production in Mexico and South America is by resource-limited smallholder farmers; therefore, the development of TSC-resistant maize germplasm is desirable to mitigate the negative effects of the disease. In breeding, major resistance genes are usually easier to transfer to an elite background through backcrossing using marker-assisted selection (MAS). As a major TSC resistance gene, qRtsc8-1 can easily be transferred to improve elite and desirable inbred lines that are crucial for South American-based breeding programs. However, to prolong the utility of qRtsc8-1, it must be protected in a background of minor QTL. Thus, the minor QTL identified in this study will play a crucial role in stabilizing and prolonging the utility of qRtsc8-1, and should be incorporated into elite maize inbred lines with the help of molecular markers. In the meantime, the identified resistant inbred lines should be evaluated in different environments of South America where TSC is a major problem, to establish the usefulness of qRtsc8-1.

Donors of TSC resistance

In addition to identifying a major QTL for TSC, this study identified 17 maize inbred lines that were highly resistant to TSC across environments. The highly resistant lines will form the foundation for maize breeding programs as donors of TSC resistance and help mitigate the detrimental effects of this disease (S1 Table). Haplotype analysis revealed that most TSC resistance donor lines contain the favorable haplotypes H2, H5, and H9, which increased TSC resistance by 11.1, 14.6, and 9.5 %, respectively. As the frequency of these three haplotypes is very low in the 890 elite inbred lines evaluated (17.3, 3.5, and 1.7 %, respectively), efforts should be made to increase the frequency of these favorable haplotypes in breeding germplasm, and to deploy them at farm level. The SNP markers PZE-108048700, PZE-108048709 and PZE-108048710 were strongly associated with TSC resistance and segregated with the favorable haplotype alleles; therefore, these markers can be used to detect the favorable haplotypes and help increase their frequency in breeding programs through MAS. Another practical application of these markers is to survey inbred lines for the presence of favorable haplotype alleles and then evaluate only the lines carrying favorable haplotypes under field conditions, thus reducing the costs associated with evaluating large numbers of inbred lines.
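The marker-based survey described above could be sketched as follows. This is only an illustration under stated assumptions: the three marker names come from the text, but the allele strings assigned to H2, H5 and H9, the line names, and the function names `call_haplotype` and `shortlist` are hypothetical placeholders, not the published haplotype definitions.

```python
# Markers named in the text, in a fixed order used to build the haplotype string.
MARKERS = ("PZE-108048700", "PZE-108048709", "PZE-108048710")

# HYPOTHETICAL allele patterns: placeholders only, not the published haplotypes.
FAVORABLE_HAPLOTYPES = {"H2": "AGC", "H5": "AGT", "H9": "GAC"}


def call_haplotype(genotype):
    """Return the favorable haplotype name carried by one inbred line, or None.

    `genotype` maps marker name -> single-letter allele call (inbred lines are
    treated as homozygous). A missing call is coded "N" and never matches.
    """
    alleles = "".join(genotype.get(marker, "N") for marker in MARKERS)
    for name, pattern in FAVORABLE_HAPLOTYPES.items():
        if alleles == pattern:
            return name
    return None


def shortlist(lines):
    """Keep only the lines carrying a favorable haplotype (candidates for field tests)."""
    calls = {name: call_haplotype(genotype) for name, genotype in lines.items()}
    return {name: hap for name, hap in calls.items() if hap is not None}


# Made-up genotypes for four hypothetical inbred lines.
example_lines = {
    "line_A": {"PZE-108048700": "A", "PZE-108048709": "G", "PZE-108048710": "C"},
    "line_B": {"PZE-108048700": "A", "PZE-108048709": "A", "PZE-108048710": "C"},
    "line_C": {"PZE-108048700": "G", "PZE-108048709": "A", "PZE-108048710": "C"},
    "line_D": {"PZE-108048700": "A", "PZE-108048709": "G"},  # one call missing
}

selected = shortlist(example_lines)
```

In such a scheme only the shortlisted lines would be advanced to field evaluation, which is where the cost saving mentioned above would come from.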

Candidate gene analysis for qRtsc8-1

In maize, linkage disequilibrium (LD) is variable across chromosomes and subpopulations (Olukolu et al. 2013). The LD decay distance across the whole genome in maize is around 27.7 kb in temperate germplasm (Weng et al. 2011) and around 5–10 kb in tropical germplasm (Lu et al. 2011). Olukolu et al. (2013) reported that r² fell below 0.1 on average for marker pairs separated by more than 10 kbp in tropical subpopulations and by 10–100 kbp in temperate subpopulations (Lu et al. 2011). LD decay at different locations on the same chromosome can be very different, with LD extending up to 34 times farther at sites close to the centromere than at sites at the distal ends of chromosomes. Therefore, LD decay results guide the selection of candidate gene regions, depending on whether the significant marker is near the telomere or the centromere.
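To make the r² statistic and the notion of an LD decay distance concrete, a minimal sketch is given below. It assumes biallelic markers scored as 0/1 on homozygous inbred lines, and it uses a deliberately simple decay rule (the shortest observed distance at which r² is below a threshold) rather than a fitted decay curve; the function names and the example values are illustrative and are not taken from the study.

```python
def ld_r2(a, b):
    """Pairwise LD (r^2) between two biallelic markers.

    `a` and `b` are equal-length sequences of 0/1 allele codes, one entry per
    inbred line (lines assumed homozygous, so each line is one haplotype).
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("marker vectors must be non-empty and of equal length")
    p_a = sum(a) / n                               # frequency of allele 1 at marker a
    p_b = sum(b) / n                               # frequency of allele 1 at marker b
    p_ab = sum(x * y for x, y in zip(a, b)) / n    # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic marker")
    return d * d / denom


def decay_distance(pairs, threshold=0.2):
    """Shortest distance (bp) at which r^2 is below `threshold`, or None.

    `pairs` is an iterable of (distance_bp, r2) tuples, for example r^2 values
    averaged within distance bins. This is a simplification of curve fitting.
    """
    for distance, r2 in sorted(pairs):
        if r2 < threshold:
            return distance
    return None


# Toy data: two markers in complete LD and two that are independent.
complete = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
independent = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])

# Toy binned r^2 values, not data from the study.
toy_bins = [(1000, 0.8), (5000, 0.4), (20000, 0.15), (50000, 0.05)]
toy_decay = decay_distance(toy_bins, threshold=0.2)
```

A larger decay distance means that a significant marker tags a wider interval, so the candidate gene search window has to be widened accordingly.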

The QTL on chromosome 8 (qRtsc8-1) was located around the centromere, a region with slow LD decay (Supplementary Figure S5). The LD decay distance for this region was around 8 Mb at r² = 0.2; therefore, the candidate gene analysis region was large and included all significant markers. A total of 203 genes were found in the qRtsc8-1 region, including 104 putative uncharacterized proteins and 99 genes with predicted function. It is common for QTL regions identified by linkage mapping in maize to encompass >10 Mb of sequence and hundreds of genes (Poland et al. 2009; Zuo et al. 2015). As a result, a QTL might represent the combined effect of multiple genes whose individual effects are very difficult to pinpoint. A QTL for head smut resistance in maize was narrowed down to a 152-kb interval that contained five genes, ZmWAK, ZmHCH, ZmTIF, ZmXa21-1 and ZmXa21-2 (Zuo et al. 2015), showing that a QTL might result from the effect of multiple genes. It is also not surprising that most of the genes (104) did not encode a characterized protein. It is possible that the B73 reference genome does not contain the causal gene, given that B73 is temperate and the lines used in this study are tropical. While fine mapping qHSR1, a QTL associated with quantitative resistance to head smut in maize, Zuo et al. (2015) showed that two of the five genes in the 152-kb interval, ZmXa21-1 and ZmXa21-2 from the inbred line Mo17, were completely missing in B73. Ongoing experiments to fine map qRtsc8-1 will provide more information on the number and nature of genes associated with this QTL.

Of the 99 genes with predicted function, nine were annotated as having a transmembrane transport function and seven were associated with DNA binding proteins. One gene, GRMZM2G073884, belonged to the leucine-rich repeat receptor-like kinase (LRR-RLK) family and was the closest to the most significant marker. LRR domains have long been implicated in plant disease resistance, and genes carrying them are considered to be “R-genes”, which contain NB domains that are missing in RLKs. Several studies have shown that the LRR-RLK gene family plays critical and diverse roles in plant response to stress (Roux et al. 2011; Torii 2004; Zan et al. 2013). GRMZM2G073884 is therefore a likely candidate gene associated with qRtsc8-1 and conditioning TSC resistance in tropical maize. Other candidate genes identified by GWAS included serine-threonine protein kinase genes; this family of genes is known to be involved in plant defense response. Further work is needed to fine map this gene and develop functional markers for MAS. Nevertheless, the haplotype and candidate gene analyses conducted in this study have provided the basis for fine mapping qRtsc8-1.

Combining association and QTL mapping is a powerful method for QTL detection

Linkage and association analyses are two complementary approaches commonly used to map causal genes and dissect the genetic basis of traits of interest (Lu et al. 2010; Tao et al. 2013; Tian et al. 2011). Association mapping offers higher resolution than traditional QTL mapping and is suitable for mapping multiple traits (Breseghello and Sorrells 2006; Yu and Buckler 2006; Yan et al. 2011). A major drawback of association studies is the high rate of false positives, which results mainly from population structure (Larsson et al. 2013). Including principal components (PCA) or Structure estimates (Q) as fixed effects and a kinship matrix (K) in the Mixed Linear Model (MLM) significantly reduces the false positive rate (Andersen et al. 2005; Yu et al. 2006).
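For reference, the general form of such a mixed linear model is commonly written as below. This is the generic Q + K formulation, given as a sketch of the idea rather than the exact model specification fitted in this study; the symbols are introduced here for illustration.

```latex
\[
  \mathbf{y} = \mathbf{X}\boldsymbol{\beta}
             + \mathbf{S}\boldsymbol{\alpha}
             + \mathbf{Q}\mathbf{v}
             + \mathbf{Z}\mathbf{u}
             + \mathbf{e},
  \qquad
  \mathbf{u} \sim N(\mathbf{0}, \sigma_g^2 \mathbf{K}),
  \quad
  \mathbf{e} \sim N(\mathbf{0}, \sigma_e^2 \mathbf{I})
\]
```

Here y is the vector of phenotypes, β collects fixed effects other than the marker and population structure, α is the effect of the SNP being tested, v holds the effects of population structure (Q estimates or principal components), u is the vector of random polygenic effects whose covariance is proportional to the kinship matrix K (some formulations scale it as 2K), and e is the residual. The Qv term absorbs broad population structure and the Zu term absorbs finer relatedness, which is how the model limits spurious marker-trait associations.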

Bi-parental populations do not have the problem of population structure, since each is developed from two inbred lines. Linkage mapping therefore yields significantly fewer false positives than GWAS, but at very low resolution. Using both GWAS and QTL mapping exploits the complementary strengths and weaknesses of the two approaches for identification of resistance loci across the genome. In this study, both GWAS and linkage mapping methods were used to identify loci associated with resistance to TSC. The qRtsc8-1 locus was detected as a major resistance QTL in both GWAS and linkage mapping. Two minor QTL, on chromosomes 2 and 7, were detected through GWAS. The QTL on chromosome 7 was also detected in bi-parental populations. In addition, three QTL, in bins 1.05, 6.02 and 10.04, were identified through linkage mapping only. These results reveal that, in addition to a major TSC resistance gene on chromosome 8, minor QTL distributed on other chromosomes also condition resistance to TSC. Therefore, for durability, it is prudent to combine minor and major genes in the same background to avoid resistance breakdown.

Conclusion

This study identified 17 inbred lines that were highly resistant to TSC across environments. All lines carry favorable haplotype alleles that increase TSC resistance by 9.5–14.6 %. These elite maize inbred lines have been made available to breeding programs tasked with developing resistance to the highly destructive TSC disease. A major QTL localized on chromosome 8, bin 8.03, near the centromere region and designated qRtsc8-1, was identified by GWAS and validated through linkage mapping in three independent bi-parental populations. To the best of our knowledge, this is the first report of a major QTL conditioning TSC resistance in tropical maize germplasm. A survey to document the frequency of qRtsc8-1 in elite maize inbred lines should be conducted. This information can guide breeding programs in selecting parental lines, as well as lines to convert to TSC resistance. The identification of a major locus for TSC resistance lays the foundation for fine mapping and for developing breeder-ready markers for use in MAS to introgress this major QTL into elite and adaptable germplasm.

Author contribution statement

Designed the experiments: GM, LN, RS. Performed the experiments: GM, RS, KVOG, ALA, LN. Analyzed the data: JC, GM. Wrote the manuscript: GM, JC, RS, YX.