Introduction

Kernel test weight (KTW) is the weight of kernels in a specified unit volume. As an important agronomic trait, KTW is closely associated with the degree of kernel maturity, integrity, and uniformity (Cabral et al. 2018). It reflects maize grain quality and affects the nutritive value, milling quality, and export of corn (Dorsey-Redding et al. 1991). However, the genetic basis of KTW remains largely unclear. Previous studies have indicated that KTW is highly heritable in oat and wheat (Pixley and Frey 1991; Sun et al. 2008), which suggests that its genetic basis can be dissected by quantitative genetic methods. Quantitative trait loci (QTL) mapping has become a routine strategy for understanding the genetic architecture of complex quantitative traits (Mdela et al. 2006), and QTL analysis of KTW has been conducted in various plant species, including maize and wheat. Sun et al. (2008) mapped 12 QTL for wheat KTW in a recombinant inbred line (RIL) population across four environments using a mixed model. In maize, five QTL controlling KTW were detected under two environments by mixed-model composite interval mapping in a population of 225 F2:3 families derived from the inbred lines Chang 7-2 × Zheng 58 (Ding et al. 2011).

Recently, genome-wide association studies (GWAS) have also been employed to uncover the genetic architecture of KTW in wheat (Kristensen et al. 2018). Schulthess et al. (2017) identified 40 single-nucleotide polymorphisms (SNPs) associated with KTW in wheat using the GABI-WHEAT population. However, because rare variants are difficult to detect in GWAS and QTL mapping suffers from high false-positive rates, either method used alone is limited in its ability to reveal the genetic basis of quantitative traits (Andersen et al. 2005; Visscher 2008; Li et al. 2016). An approach that combines association mapping and linkage analysis has therefore been proposed as a more powerful and efficient way to understand the genetic architecture of complex quantitative traits in maize (Liu et al. 2015; Mahuku et al. 2016). This combined approach has been used to detect loci that control various quantitative traits in maize, such as plant height (PH) (Li et al. 2016), ear height (EH) (Li et al. 2016), tassel traits (Wu et al. 2016), amino acid content (Deng et al. 2017), heavy metal accumulation (Zhao et al. 2018), and kernel size (Liu et al. 2017b). Several genetic loci associated with maize kernel size and weight-related traits were co-identified by GWAS and QTL mapping, including 37 KTW-associated SNPs identified by a stepwise regression method (Liu et al. 2017b).

Previous studies have shown that KTW is influenced by multiple factors, including kernel size and kernel composition (Sun et al. 2008; Ding et al. 2011; Blandino et al. 2013). Accordingly, some genes that regulate seed development have been shown to influence KTW. The gene Opaque-2 has been reported to influence protein composition by increasing the lysine content of maize endosperm, which inevitably decreases kernel weight and test weight (Mertz E. T. 1964). The maize mutant miniature1 (mn1), which is deficient in cell wall invertase 2, has a lower sucrose content in the kernel and consequently reduced kernel weight and KTW (Byung-Ho et al. 2009). The gene emp4, which encodes a repeat protein essential for maize seed development (Gutierrez-Marcos et al. 2007), was strongly linked to the significant KTW-associated SNPs detected previously (Liu et al. 2017b).

This study used an association panel and an IBM Syn10 DH population to identify QTL and significant SNPs related to KTW by combining linkage analysis and GWAS in different environments. The objectives of the study were (1) to assess phenotypic variation in KTW in two maize populations across multiple environments, (2) to map QTL and SNPs that are significantly associated with KTW, (3) to identify candidate genes involved in KTW, and (4) to detect genetic loci co-localized in both populations and explore potential functional genes for KTW. Our results will improve the understanding of the molecular mechanisms underlying maize KTW and provide novel molecular markers that breeders may use to develop superior varieties.

Materials and methods

Plant materials

For QTL mapping, we used an intermated B73 × Mo17 (IBM) Syn10 doubled haploid (DH) population comprising 252 DH lines, together with the original parental lines B73 and Mo17 (Hussain et al. 2007). The association panel for GWAS comprised 310 inbred lines, including the parents of widely used hybrids in southwest China, newly selected and improved inbred lines, inbred lines of temperate heterotic groups, and exotic inbred lines from CIMMYT or the US (Zhang et al. 2016).

Field experiment

The IBM population and the association panel were planted separately in a completely randomized design of two-row plots with two replicates. Each row was 4 m long and contained 16 plants, with 0.8 m between the rows of each plot. All materials were planted at a standard density of 50,000 plants per hectare under standard cultivation management. In 2015, the IBM population was planted in three locations: Xishuangbanna (XSBN, Yunnan Province, 22.02°N, 100.80°E) (November 2015–April 2016), Xinxiang (XX, Henan Province, 35.19°N, 113.53°E) (June 2015–October 2015), and Chongzhou (CZ, Sichuan Province, 30.32°N, 103.38°E) (April 2015–August 2015). In 2016, the association panel was planted in the following three environments: Xishuangbanna in Yunnan Province (XSBN, 22.02°N, 100.80°E) (November 2016–April 2017), Ya’an in Sichuan Province (YA, 29.59°N, 102.57°E) (April 2016–August 2016), and Hongya in Sichuan Province (HY, 29.54°N, 103.22°E) (April 2016–August 2016).

Phenotyping and data analysis

KTW was measured using the LDS-1G KTW tester (Changsha Tengyang Instrument Co., Ltd.) according to the SAC method GB 1353-1999. For each replicate, the kernels of ten selected ears were collected and mixed for KTW measurement. The average of three repeated measurements was used as the phenotypic value of each line. The phenotypic performance of KTW in a single environment was taken as the average of the two replicates of each line. ANOVA for KTW was performed for each of the two populations using the general linear model (GLM) in SPSS Statistics 20.0 (http://www.spss.com). The GLM was \( y_{ijk} = \mu + E_{i} + G_{j} + R(E)_{ik} + (E \times G)_{ij} + \varepsilon_{ijk} \), where \( \mu \) is the overall mean of KTW, \( E_{i} \) is the effect of the ith environment, \( G_{j} \) is the genetic effect of the jth line, \( R(E)_{ik} \) is the effect of the kth replication within the ith environment, \( (E \times G)_{ij} \) is the interaction between the ith environment and the jth line, and \( \varepsilon_{ijk} \) is the residual error. Phenotypic data were used to calculate best linear unbiased predictions (BLUPs) using a mixed linear model in the R package “lme4”. Broad-sense heritability (\( h_{B}^{2} \)) for KTW was estimated as defined by Knapp et al. (1985).
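The heritability estimate can be illustrated with a minimal sketch. It assumes the commonly used entry-mean, variance-component form \( h_{B}^{2} = \sigma_{G}^{2} / (\sigma_{G}^{2} + \sigma_{GE}^{2}/e + \sigma_{\varepsilon}^{2}/(er)) \), where e is the number of environments and r the number of replicates per environment; this may differ in detail from the exact estimator applied in the analysis. The function name and the variance components in the example are hypothetical and are not values from this study.

```python
def broad_sense_heritability(var_g, var_ge, var_err, n_env, n_rep):
    """Entry-mean broad-sense heritability from variance components.

    Assumed form: var_g / (var_g + var_ge / e + var_err / (e * r)).
    """
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    phenotypic = var_g + var_ge / n_env + var_err / (n_env * n_rep)
    if phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    return var_g / phenotypic


# Hypothetical variance components in (g/L)^2, three environments, two replicates.
h2_example = broad_sense_heritability(
    var_g=900.0, var_ge=300.0, var_err=600.0, n_env=3, n_rep=2
)
print(round(h2_example, 2))
```

With three environments and two replicates, the genotype-by-environment and residual components are divided by 3 and 6, respectively, so additional environments or replicates raise the entry-mean heritability.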

Linkage mapping

A high-density genetic map comprising 6618 recombination bins, obtained from the iPlant database (http://www.iplantcollaborative.org/ci/discovery-environment), was used for QTL mapping. Linkage analysis was performed using QTL Cartographer version 1.17f with composite interval mapping (CIM) (Liu et al. 2017a). A LOD score of 2.5 was set as the threshold, and QTL detected in different environments were treated as the same QTL when the distance between their peaks was < 10 cM.
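The rule for treating QTL from different environments as the same QTL can be sketched as follows. This is an illustration only: it assumes that peaks on the same chromosome are chained together whenever neighbouring peaks lie < 10 cM apart, which is one possible reading of the rule, and the function name and peak positions are hypothetical.

```python
from collections import defaultdict


def merge_qtl_peaks(peaks, max_gap_cm=10.0):
    """Group QTL peaks that lie on one chromosome and close together.

    peaks: list of (environment, chromosome, peak_position_cM) tuples.
    Neighbouring peaks closer than max_gap_cm are placed in the same group.
    """
    by_chrom = defaultdict(list)
    for peak in peaks:
        by_chrom[peak[1]].append(peak)

    groups = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom], key=lambda p: p[2])
        current = [ordered[0]]
        for peak in ordered[1:]:
            if peak[2] - current[-1][2] < max_gap_cm:
                current.append(peak)      # same common QTL
            else:
                groups.append(current)    # close the group, start a new one
                current = [peak]
        groups.append(current)
    return groups


# Hypothetical peaks: two environments agree on chromosome 7, one does not.
example_peaks = [
    ("XSBN", 7, 101.2),
    ("XX", 7, 106.9),
    ("CZ", 7, 140.0),
    ("XX", 1, 55.0),
]
merged = merge_qtl_peaks(example_peaks)
print([len(group) for group in merged])
```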

Genome-wide association study (GWAS) and superior allele analysis

The association panel was previously genotyped using the Illumina MaizeSNP50 Genotyping BeadChip (Zhang et al. 2016), which identified 56,110 SNPs in the population. In this study, the 56,110 SNPs were filtered to remove SNPs with a missing rate > 20%, a minor allele frequency (MAF) < 0.05, or a heterozygosity > 20%. The remaining 43,782 SNPs were used for GWAS. To determine the most suitable GWAS model, we tested six models for identifying significant KTW-associated SNPs. Four were from TASSEL 5.0 (Bradbury et al. 2007): the general linear model (GLM), GLM with principal component analysis (PCA), the mixed linear model (MLM), and MLM with PCA. The other two were run in R: the compressed mixed linear model (CMLM) in GAPIT (Yu et al. 2006; Zhang et al. 2010) and the fixed and random model Circulating Probability Unification (FarmCPU) (Liu et al. 2016). We selected the most suitable model according to how the observed P values departed from the null expectation in the Q–Q plots of the different models, as described by Kaler et al. (2017). The suggestive P value (0.05/N) was set as the significance threshold, where N is the effective number of independent SNPs calculated with the simpleM package in R, which limits the false negatives caused by an overly conservative correction (Gao et al. 2008).
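The three SNP filters can be illustrated with a short sketch. It assumes biallelic SNPs stored as two-character genotype strings with `None` for missing calls; the function name and the example genotypes are hypothetical, and the filtering software actually used is not stated in the text.

```python
from collections import Counter


def passes_qc(calls, max_missing=0.20, min_maf=0.05, max_het=0.20):
    """Return True if a SNP survives the missing-rate, MAF and heterozygosity filters.

    calls: list of genotype strings such as "AA" or "AG", or None when missing.
    """
    n_total = len(calls)
    called = [c for c in calls if c is not None]
    if n_total == 0 or not called:
        return False

    missing_rate = 1 - len(called) / n_total
    het_rate = sum(1 for c in called if c[0] != c[1]) / len(called)

    allele_counts = Counter(allele for c in called for allele in c)
    if len(allele_counts) < 2:
        maf = 0.0  # monomorphic among the called genotypes
    else:
        # Frequency of the second most common allele (biallelic assumption).
        ranked = sorted(allele_counts.values(), reverse=True)
        maf = ranked[1] / sum(ranked)

    return missing_rate <= max_missing and maf >= min_maf and het_rate <= max_het


# Hypothetical SNPs: one that passes, one that is too rare, one with too many missing calls.
common_snp = ["AA"] * 6 + ["GG"] * 4
rare_snp = ["AA"] * 39 + ["GG"] * 1
sparse_snp = ["AA"] * 4 + ["GG"] * 3 + [None] * 3
print(passes_qc(common_snp), passes_qc(rare_snp), passes_qc(sparse_snp))
```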

For the significant SNPs, a superior genotype was determined based on the effect value of each significant SNP, as described by Ma et al. (2018). For each SNP, the percentage of the superior allele in these elite inbred lines was calculated as the number of lines containing superior alleles divided by the number of total lines. For each line, the proportion of superior alleles among these SNPs was equal to superior allele number divided by total SNP number.
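The two proportions defined above can be written out as a minimal sketch. The data layout (one dictionary of alleles per line), the function name, and the line and SNP names are hypothetical and serve only to illustrate the arithmetic.

```python
def superior_allele_summary(genotypes, superior):
    """Summarise superior-allele usage per SNP and per line.

    genotypes: {line_name: {snp_name: allele}}
    superior:  {snp_name: superior_allele}
    Returns (percentage of lines carrying the superior allele at each SNP,
             proportion of SNPs at which each line carries the superior allele).
    """
    lines = list(genotypes)
    per_snp_pct = {
        snp: 100.0 * sum(genotypes[line].get(snp) == allele for line in lines) / len(lines)
        for snp, allele in superior.items()
    }
    per_line_prop = {
        line: sum(genotypes[line].get(snp) == allele for snp, allele in superior.items())
        / len(superior)
        for line in lines
    }
    return per_snp_pct, per_line_prop


# Hypothetical genotypes for four lines at two significant SNPs.
example_genotypes = {
    "line1": {"snpA": "A", "snpB": "C"},
    "line2": {"snpA": "A", "snpB": "T"},
    "line3": {"snpA": "G", "snpB": "T"},
    "line4": {"snpA": "A", "snpB": "T"},
}
example_superior = {"snpA": "A", "snpB": "C"}
snp_pct, line_prop = superior_allele_summary(example_genotypes, example_superior)
print(snp_pct, line_prop)
```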

Candidate gene association study

Gene models were considered candidate genes if they were located within the 300 kb flanking regions of the SNPs co-localized by GWAS and QTL mapping. A total of 80 lines covering all the subgroups were selected for the candidate gene association study, and their genomic DNA was extracted at the three-leaf stage using the CTAB method (Gawel and Jarret 1991). For each candidate gene, the gene region and the 2000 bp upstream of it were amplified by PCR. Sequence alignment and assembly were conducted using the DNAMAN program (Woffelman 2004), and the alignment results were exported from DNAMAN in Phylip format. BioEdit and ClustalX were used to refine the sequences. The sequence diversity of each gene among the 80 lines was assessed using DnaSP v5.0. The resulting SNPs and InDels with a minor allele frequency > 0.05 were tested for association with maize KTW using the FarmCPU model in GAPIT, as implemented in R. The P value threshold was set at 0.05 divided by the number of SNPs.
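The 300 kb flanking-region criterion amounts to a simple interval overlap test, sketched below. It assumes that a gene qualifies when any part of it overlaps the window of 300 kb on either side of the SNP; the function name, gene identifiers, and coordinates are hypothetical.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, flank_bp=300_000):
    """Return the IDs of genes overlapping the +/- flank_bp window around a SNP.

    genes: list of (gene_id, chromosome, start_bp, end_bp) tuples.
    """
    window_start = max(0, snp_pos - flank_bp)
    window_end = snp_pos + flank_bp
    return [
        gene_id
        for gene_id, chrom, start, end in genes
        if chrom == snp_chrom and start <= window_end and end >= window_start
    ]


# Hypothetical gene models around a SNP at 1,000,000 bp on chromosome 3.
example_genes = [
    ("geneA", 3, 650_000, 690_000),      # ends before the window starts
    ("geneB", 3, 695_000, 705_000),      # overlaps the left edge
    ("geneC", 3, 1_200_000, 1_250_000),  # inside the window
    ("geneD", 5, 1_000_000, 1_010_000),  # different chromosome
]
nearby = genes_near_snp(3, 1_000_000, example_genes)
print(nearby)
```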

Results

Phenotypic descriptions

To evaluate phenotypic variation among the experimental lines, we measured maize KTW in three environments for each population; both populations showed abundant phenotypic variation. KTW ranged from 559.50 to 806.00 g/L and from 562.33 to 849.50 g/L, with standard deviations (SD) of 36.02–41.71 g/L and 30.08–42.90 g/L across environments, in the association panel and the IBM population, respectively (Table 1). In addition, KTW segregated continuously in each of the two populations, with absolute skewness values < 1, indicating that maize KTW was approximately normally distributed, as expected for a quantitative trait. The estimated broad-sense heritability (\( h_{B}^{2} \)) of KTW ranged from 0.73 to 0.80 (Table 2), suggesting that the phenotypic variance of maize KTW was mainly controlled by genetic factors. The coefficient of variation of KTW was 4.18–6.28%, suggesting that maize KTW had a large potential for genetic improvement. Moreover, the combined analysis of variance showed that maize KTW differed significantly among genotypes (G) and environments (E), and that the genotype × environment interaction (G × E) was also significant, in both populations.
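For readers who wish to reproduce descriptive statistics of this kind, the sketch below computes a coefficient of variation and a moment-based skewness. The exact estimators used by the statistics software may differ (for example, a bias-adjusted skewness), and the KTW values in the example are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample adjustment)."""
    n = len(values)
    mean = statistics.mean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical KTW values (g/L) for five lines, symmetric around 720 g/L.
example_ktw = [680.0, 700.0, 720.0, 740.0, 760.0]
cv_example = coefficient_of_variation(example_ktw)
skew_example = skewness(example_ktw)
print(round(cv_example, 2), round(skew_example, 2))
```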

Table 1 Phenotypic performance of KTW in two maize populations
Table 2 ANOVA for KTW of two populations in multiple environments

Population structure and linkage disequilibrium decay

The structure of the association panel was calculated using STRUCTURE software as described in the previous study (Zhang et al. 2016). When K = 3, the ΔK reached a sharp peak, indicating that the panel with 310 inbred lines could be assigned to three subpopulations, namely, non-stiff stalk (NSS), stiff stalk (SS), and tropical (Fig. S1).

The extent of linkage disequilibrium (LD) was estimated for all chromosomes using the 43,782 high-quality SNPs. LD decreased rapidly with increasing physical distance between SNPs (Fig. S2), but the decay rate varied among chromosomes. At \( r^{2} = 0.1 \), the mean LD decay distance was 300–400 kb across all chromosomes. Therefore, a gene was considered a candidate gene for KTW when it was located within the 300 kb flanking regions of a significant SNP detected by GWAS, as described in previous studies (Zhang et al. 2016; Zhao et al. 2018).
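The pairwise LD statistic \( r^{2} \) behind such a decay curve can be sketched for homozygous inbred lines, where each line contributes one haplotype per SNP pair. This is a generic illustration of \( r^{2} = D^{2} / (p_{A}(1-p_{A})p_{B}(1-p_{B})) \), not the software routine used for Fig. S2; the 0/1 allele coding and the example data are hypothetical.

```python
def ld_r2(alleles_a, alleles_b):
    """Squared allele-frequency correlation between two biallelic SNPs.

    alleles_a, alleles_b: equal-length lists of 0/1 allele codes, one entry per line.
    """
    if len(alleles_a) != len(alleles_b) or not alleles_a:
        raise ValueError("allele lists must be non-empty and of equal length")
    n = len(alleles_a)
    p_a = sum(alleles_a) / n
    p_b = sum(alleles_b) / n
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical SNP pairs: perfectly linked, and fully independent.
r2_linked = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
r2_independent = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
print(r2_linked, r2_independent)
```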

QTL for KTW

A total of 30 QTL were identified in various environments (Table 3); the phenotypic variation explained (PVE) by each QTL averaged 5.65% and ranged from 3.32 to 9.36%. Based on the physical intervals of their peaks in the genome, the 30 QTL were clustered into 19 common QTL. Of these, five were detected repeatedly across multiple environments: one (qKTW7) in all the environments, two (qKTW1-1 and qKTW5-3) in three environments, and two (qKTW4-5 and qKTW6-2) in two environments. These environment-stable QTL (except qKTW4-5) showed positive additive effects, indicating that alleles from the high-KTW parent B73 contributed to greater KTW. In addition, qKTW1-1, qKTW5-3, and qKTW7 had PVE values of 9.25%, 9.36%, and 8.90% and LOD values of 6.36, 7.02, and 6.70, respectively, in the investigated environments. This suggested that these QTL were major loci influencing maize KTW.

Table 3 QTL identified for KTW across multiple environments

GWAS loci of KTW and distribution of superior alleles in maize elite lines

GWAS was performed using R and TASSEL 5.0. To determine the optimal association model, we tested and compared six models. Only the Q–Q plot from the FarmCPU model departed sharply from the expected P value distribution in the tail area alone (Fig. 1b and S3). This indicated that the FarmCPU model effectively controlled both false positives and false negatives and could be applied to GWAS of KTW. Thus, the FarmCPU model was selected to analyze the association between SNPs and maize KTW.
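The coordinates of a Q–Q plot of this kind can be computed as in the sketch below. It assumes that the expected P values under the null hypothesis are the uniform quantiles rank/(n + 1), which is one common convention; the plotting software may use a slightly different one, and the P values in the example are hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, most significant first.

    Expected values assume P values are uniform on (0, 1) under the null.
    """
    if any(p <= 0 or p > 1 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    ordered = sorted(p_values)
    n = len(ordered)
    return [
        (-math.log10(rank / (n + 1)), -math.log10(p))
        for rank, p in enumerate(ordered, start=1)
    ]


# Hypothetical P values from three SNPs.
example_points = qq_points([0.5, 0.01, 0.2])
for expected, observed in example_points:
    print(round(expected, 3), round(observed, 3))
```

Points that follow the diagonal indicate P values consistent with the null expectation, whereas a departure confined to the largest expected values corresponds to the tail-only deviation described above.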

Fig. 1 a Manhattan plots (i: YA; ii: XSBN; iii: HY; iv: BLUP) and b Q–Q plots for KTW from GWAS using the FarmCPU method

simpleM indicated that the effective number of SNPs was 24,535; therefore, the P value threshold was set at 2.04E−06 (0.05/24,535). Across multiple environments, 18 significant SNPs associated with KTW were identified, distributed on chromosomes 1, 3, 4, 5, 6, 7, and 10 (Table 4; Fig. 1a, b). Only the significant SNP on chromosome 7 (SYN18432) was co-detected in multiple environments (XSBN and BLUP). Among these SNPs, PZE-103144282, which was detected in BLUP, had the lowest P value (3.55E−16). Notably, PZE-103165542, which was detected in XSBN, was located exactly within the QTL qKTW3-2 that was identified in the IBM population in XX. The environment-stable locus and the large-effect loci mentioned above deserve more attention in KTW improvement.
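The threshold arithmetic can be checked directly. The sketch below only reproduces the division 0.05/24,535 and compares the lowest reported P value against it; the function name is hypothetical, and the effective SNP number is taken from the text rather than recomputed.

```python
def bonferroni_threshold(alpha, n_effective):
    """Significance threshold for n_effective independent tests."""
    if n_effective <= 0:
        raise ValueError("n_effective must be positive")
    return alpha / n_effective


threshold = bonferroni_threshold(0.05, 24535)
print(f"{threshold:.2E}")       # prints 2.04E-06

# The lowest reported P value (PZE-103144282 in BLUP) is far below the threshold.
print(3.55e-16 < threshold)     # prints True
```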

Table 4 Significant and repetitive SNPs identified in this study across multiple environments

The association panel included 35 elite maize lines that are parents of commercial varieties (Table S1), which allowed us to estimate the utilization of superior alleles in maize breeding. In this study, the allele related to a higher KTW was defined as the superior allele. The percentage of the superior allele at each significant locus ranged from 17.14 to 88.57% in the elite lines (Fig. 2, Table S1), with 7 SNPs having ≥ 50% superior alleles and the other 11 loci < 50%. Furthermore, the superior allele percentages were > 80% at four SNP loci (SYN20747, SYN29999, PZB01964.5 and SYN18432). However, two SNPs (PZE-110016084 and PZE-105079209) had a superior allele percentage of < 20%, suggesting that these loci should be given preference during marker-assisted selection (MAS) breeding to improve the KTW of maize. In addition, the number of superior alleles in each elite line varied from 5 to 12, with 17 lines carrying ≥ 9 superior alleles and the remaining 18 lines < 9 (Fig. 2, Table S1). Three elite lines (Zheng58, 178, and 646) had 12 superior alleles, whereas two elite lines (572 and Huang C) had only five.

Fig. 2 Distribution of superior SNP alleles in the 35 elite maize inbred lines. Red and white represent superior and inferior alleles, respectively (color figure online)

Candidate genes involved in maize KTW

The 18 KTW-associated SNPs were closely linked to 12 candidate genes (Table 4). The expression of these 12 genes has previously been reported to vary extensively among maize tissues (Debbie et al. 2007; Sekhon et al. 2011). Among them, five genes had higher expression levels in developing seeds than in other tissues, with absolute expression (AE) ranging from 1326.12 to 13,098.74. Specifically, Zm00001d040090, which encodes the homeobox-leucine zipper protein ANTHOCYANINLESS 2 and has previously been shown to affect kernel size in maize (Khaled et al. 2005; Depege-Fargeix et al. 2010), had an AE of up to 12,625.73 in seeds at DAP. Zm00001d021282, which encodes an unknown protein, had a high AE (17,074.47–21,343.1) in both endosperm and seed during the whole ripening stage (Debbie et al. 2007; Sekhon et al. 2011). Zm00001d048823 was annotated as monosaccharide-sensing protein 2 and was generally highly expressed (3672.89–7345.78) during seed development (Debbie et al. 2007; Sekhon et al. 2011). Remarkably, Zm00001d023710, which encodes small RNA-degrading nuclease 5, showed its highest AE (1326.12) in embryos, greater than in any other tissue during the whole ripening stage (Debbie et al. 2007; Sekhon et al. 2011).

Additionally, within the confidence intervals of the mapped QTL and the LD regions of the GWAS loci, we searched for genes previously reported to control seed development, kernel size, and/or kernel weight. ZmYUC1, which encodes the endosperm-specific YUCCA1 protein and influences normal endosperm development in maize (Bernardi et al. 2012), was located in the candidate flanking region (chr10: 16.06–16.66 Mb) of PZE-110016084 (P = 1.05E−06 in BLUP). As shown in Table 3, BAP2, which was reported to control maize seed development (Serna et al. 2001), was mapped to qKTW4-3 (chr4: 28.61–36.78 Mb). ZmTCRR-1, encoding a member of the type-A response regulator class of proteins, was found within qKTW4-3 (chr4: 63.55–9.23 Mb) (Muniz et al. 2006). Interestingly, ZmTCRR-1 was previously shown to be expressed exclusively in the endosperm transfer-cell layer at 8–14 days after pollination (DAP) and to influence seed development. Furthermore, dek36, which influences maize kernel size by encoding a mitochondria-targeted E+ subgroup PPR protein (Wang et al. 2017), was detected within qKTW5-1 (chr5: 4.73–0.78 Mb). ZmSWEET4c, which regulates seed filling in maize by mediating transepithelial hexose transport across the basal endosperm transfer layer (Sosso et al. 2015), was located within the interval of qKTW5-2 (chr5: 130.36–46.33 Mb).

Candidate gene association analysis reveals potentially functional genes co-localized by GWAS and QTL mapping

By comparing the physical positions of the GWAS loci and the QTL detected in this study, we found one SNP (PZE-103165542) that was located in the QTL qKTW3-2 (according to B73 RefGen_v4). Based on the LD decay distance (300 kb) of the association panel, we screened the 300 kb flanking region of PZE-103165542 for open reading frames (ORFs). A total of 17 ORFs were identified as candidate genes co-localized by GWAS and QTL mapping (Table 5). To further explore the functional genes involved in maize KTW, we separately tested the associations between the variants of each candidate gene (including the gene region and the promoter region) and KTW in the 80 lines. A total of five SNPs and one InDel from six candidate genes were significantly associated with KTW at the level of P = 0.05/N (Table S2). Of these, three variants were annotated as synonymous or intronic and were assumed not to change the amino acid sequence or gene expression. The remaining significant variants were within the promoters of Zm00001d044075 (S3_218394531) and Zm00001d044086 (S3_218640604) and the 3′-UTR of Zm00001d044081 (S3_218485181). Because variants in promoters and 3′-UTR regions are considered important factors that influence gene expression levels and cause phenotypic variation (Hughes 2006), Zm00001d044075, Zm00001d044086, and Zm00001d044081 were taken as potentially functional genes involved in maize KTW. Among these, Zm00001d044075 encodes an unknown protein and Zm00001d044086 encodes a nucleic acid-binding OB-fold-like protein, which was associated with KTW across two environments (HY and YA). Zm00001d044081, which is highly expressed in maize seeds, was annotated as homeobox-leucine zipper protein ATHB-4, whose Arabidopsis ortholog was previously shown to control apical embryo development (Turchi et al. 2013).

Table 5 List of candidate genes and number of markers genotyped on each for association study

Discussion

Phenotypic variability in KTW in two populations

Accurate phenotypic data are necessary for identifying the genetic basis of quantitative traits. This study used the LDS-1G KTW tester to determine maize KTW according to the SAC method GB 1353-1999, which enables quick and precise screening of a large number of lines (SAC 2003; Ding et al. 2011). Abundant phenotypic variation within a population is a prerequisite for dissecting the genetic architecture of target traits (Wu et al. 2016). A previous study (Hussain et al. 2007) showed that the IBM Syn10 DH population, which has high recombination rates, exhibits extensive phenotypic variability, thereby contributing to the improvement of maize germplasm and the identification of important genes. In our study, the IBM Syn10 DH population exhibited KTW phenotypic variability ranging from 559.50 to 806.00 g/L (Table 1), a considerably wider range than that reported by Ding et al. (2011) (728.3–789.4 g/L). Meanwhile, the GWAS population used in this study consisted of stiff stalk (SS), non-stiff stalk (NSS), and tropical groups that were previously shown to have high genetic diversity (Zhang et al. 2016). In this study, the association panel also displayed extensive phenotypic variability in KTW (562.33–849.50 g/L) (Table 1), which is beneficial for understanding the genetic basis of maize KTW. Furthermore, KTW differed significantly (P < 0.001) among genotypes and environments, and the genotype × environment interaction was also significant. Therefore, environmental effects should be fully considered when breeding maize varieties with improved KTW.

Combination of association and QTL mapping facilitates the elucidation of the genetic basis of KTW in maize

The selection of a suitable model is essential for GWAS of quantitative traits in maize. In our study, six models were compared to find the optimal model for GWAS of KTW. The Q–Q plots from GLM and GLM + PCA deviated strongly from the expected P value distribution (Fig. S3), indicating that these two models resulted in a high level of false positives (type I errors). In contrast, the Q–Q plots of CMLM, MLM, and MLM + PCA showed no sharp deviation (Fig. S3), suggesting that these three models were too stringent in reducing false positives (type I errors) and therefore produced more false negatives (type II errors). By comparison, the Q–Q plot generated by the FarmCPU model showed a more ideal trend, effectively balancing type I and type II errors. Therefore, FarmCPU was the optimal model for dissecting KTW in the present study.

A large number of studies have shown that association and QTL mapping are two complementary methods for mapping causal genes and dissecting the genetic basis of traits of interest (Lu et al. 2010; Tian et al. 2011; Tao et al. 2013; Mahuku et al. 2016). Compared to traditional QTL analysis, association mapping has a higher resolution and is more suitable for mapping multiple traits (Yu and Buckler 2006). However, the major shortcoming of association mapping is the high false-positive rate caused by population structure (Larsson et al. 2013). Although this problem can be substantially reduced by including PCA or structure (Q) as fixed effects and a kinship matrix (K) in the MLM (Andersen et al. 2005; Mahuku et al. 2016), association analysis still does not suit every study. QTL mapping has no drawbacks in terms of population structure, but its detection of rare variants is quite challenging (Visscher 2008). To better dissect the genetic basis of KTW in maize, we combined GWAS and linkage mapping to identify KTW-associated loci. In our study, a genetic locus on chromosome 3 was co-localized by GWAS and linkage mapping, revealing 16 potential candidate genes involved in KTW. Among these, three were further supported by candidate gene association analysis. Remarkably, the Arabidopsis ortholog of Zm00001d044081 was previously shown to influence apical embryo development and meristem function (Turchi et al. 2013), and Zm00001d044081 itself has a higher expression level in maize seeds than in other tissues. These findings indicate that integrating association and QTL mapping can improve the reliability and efficiency of detecting genetic loci associated with maize KTW.

Genetic architecture of KTW in maize

Recently, several studies have reported the genetic basis of KTW in different plant species (Holland and Munkvold 2001; Ding et al. 2011; Schulthess et al. 2017; Cabral et al. 2018). However, only a few studies in maize have used both linkage analysis and GWAS (Liu et al. 2017b). In this study, we assessed the genetic architecture of maize KTW using multi-environment linkage mapping and GWAS. Phenotypic analysis revealed that maize KTW had a high heritability of 0.73–0.80 (Table 2), indicating that its phenotypic variance is mainly controlled by genetic factors. Moreover, in the IBM Syn10 DH population, 19 QTL for KTW were detected, none of which was a major QTL (PVE > 10%), illustrating that maize KTW is predominantly controlled by minor-effect loci. This is generally in agreement with previous studies (Ding et al. 2011; Liu et al. 2017b). Here, nine QTL with LOD values < 7.02 were found to have a PVE > 5%, of which five were consistent with previously reported QTL (Ding et al. 2011; Liu et al. 2017b). Specifically, qKTW4-5 and qKTW5-3 overlapped with the QTL in bin4.08/4.09 (bnlg292-umc1328) and bin5.06 (umc1019-phi087), respectively, that were detected by Ding et al. (2011). The QTL qKTW4-4 detected in this study was also previously identified in two RIL populations (K22 × C17 and ZHENG58 × SK) by Liu et al. (2017b). Additionally, two QTL in DE3 × BY815 and one significant SNP (chr7.S_158996543) reported by Liu et al. (2017b) were all located within qKTW7 identified in the present study. In total, 57.89% (11/19) of the QTL detected in this study were consistent with QTL previously reported by Ding et al. (2011) and Liu et al. (2017b), suggesting that these QTL are common across environments and populations. The other eight QTL detected in our study are novel and may represent environment- and/or population-specific QTL.

In the association panel, 18 significant KTW-associated SNPs were identified, among which 7 were mapped to previously detected QTL intervals (Ding et al. 2011; Liu et al. 2017b) and 1 (PZB01964.5) was closely linked to the ZmOCL1 gene that influences maize kernel size (Khaled et al. 2005). The SNP SYN20747, which had a P value of 7.35E−07, was located within QTL intervals detected by both Ding et al. (2011) and Liu et al. (2017b). The remaining five significant SNPs (SYN21456 and PZE-104127177 on chromosome 4, SYN18432 on chromosome 7, and PZE-110045620 and SYN4216 on chromosome 10) were cross-validated by the linkage mapping results that Ding et al. (2011) and Liu et al. (2017b) obtained in segregating populations.

Overlapped genetic basis between maize KTW and kernel size

KTW is the weight of kernels in a unit volume and is influenced by kernel size (Sun et al. 2008; Ding et al. 2011). Our previous study investigated kernel size-related traits in the two populations under the same environments (Liu et al. 2019), which enabled us to analyze the phenotypic correlations between KTW and kernel size. The results showed that KTW was significantly negatively associated with kernel length (KL), kernel width (KW), and kernel thickness (KT), with correlation coefficients ranging from 0.17 to 0.40 in the association panel and 0.15–0.24 in the IBM population (Table S3). Moreover, two (10.53%) KTW QTL (qKTW5-2 and qKTW7) detected in this study overlapped with the kernel size QTL qKS5-2 and qKW7-3, respectively, identified in our previous study. In particular, two KTW-associated SNPs (SYN21456 and SYN4216) were located exactly in the intervals of qKT4-3 and qKW10-2, respectively. These results suggest that KTW and kernel size partly share a genetic basis. By comparing our present study with the results reported by Liu et al. (2017b) and Zhang et al. (2017), we found that 14 (73.7%) of the QTL controlling KTW detected in this study were previously reported to regulate kernel size. Among them, qKTW8 overlapped with four previously reported QTL for KL, KW, KT, and kernel volume (KV), respectively (Liu et al. 2017b; Zhang et al. 2017). Furthermore, 10 (55.6%) KTW-associated SNPs were situated within the confidence intervals of previously identified QTL for kernel size (Liu et al. 2017b; Zhang et al. 2017). Remarkably, the environment-stable SNP (SYN18432) was harbored by two previously identified QTL (qKL7 and qKW7-1) (Liu et al. 2017b; Zhang et al. 2017). These findings support the presumption that maize KTW and kernel size are closely related genetically.
However, 5 (26.32%) QTL and 8 (44.44%) SNPs were detected specifically in this study and have not been reported to control maize kernel size in previous studies. These KTW-specific loci should be given more attention when improving KTW in maize breeding.

Potential application of the detected QTL/SNPs for MAS to improve KTW

Compared to traditional breeding, MAS reduces the effort and time required for large-scale field tests and is therefore more efficient in improving maize yield (Jiao et al. 2012). Because KTW is a complex quantitative trait, repeatedly detected or co-located loci are the most important for MAS and the most useful for breeding (Ribaut et al. 1998; Peng et al. 2011; Zhang et al. 2017). In our study, several such loci for KTW were consistently detected across multiple environments or different populations, including qKTW1-1, qKTW5-3, qKTW7, PZE-103165542, and SYN18432. Additionally, the SNP S_3_218483378 in the 3′-UTR of the gene Zm00001d044081 was significantly associated with KTW across three environments and thus represents an environment-stable locus controlling maize KTW. These loci should be a priority for MAS to improve maize KTW and for fine mapping. More importantly, qKTW7 was detected across all environments in this study; it overlapped with the two KTW QTL identified in the RIL population (DE3 × BY815) and contained one KTW-associated SNP (chr7.S_158996543) reported by Liu et al. (2017b). The QTL qKTW5-3, which was detected across three environments in this study, overlapped with QTL reported previously (Ding et al. 2011; Liu et al. 2017b). Evaluation of superior allele utilization showed that the elite lines possess only 27.8–66.7% of the superior alleles. Moreover, the 18 elite lines that contained 5–8 superior alleles had an average KTW of 688.22 g/L, which was significantly lower than that of the other 17 elite lines that possessed 9–12 superior alleles (739.70 g/L) (Table S3). These findings suggest that integrating more superior alleles into these lines by MAS could improve their KTW.