Introduction

Although maize is the highest yielding of the three staple crops in China, its mechanization coverage is lower than that of the other two (rice and wheat). As the labor force shrinks, the full mechanization of maize production is an inevitable trend and has become a key goal of breeding. However, mechanized harvesting of maize still lags behind, especially in China. One of the most important reasons lies in the germplasm resources, in which the grain water content (GWC) at harvest remains too high (Ma et al. 2016). A high GWC restricts mechanized harvesting and causes a series of post-harvest problems, such as ear germination and grain mildew, and it increases the grain breakage rate as well as drying and storage costs (Sweeney et al. 1994; Capelle et al. 2010; Xiang et al. 2012; Kebebe et al. 2015). In China, the government has set different standards for grain moisture content to support the development of maize varieties suited to mechanized harvesting in different regions. It advises that the GWC should be 16.15–24.78%, and the best harvest quality documented so far was obtained at about 20% GWC in the Huanghuaihai summer sowing group (Li et al. 2018a). Consequently, developing methods to minimize the GWC at harvest has become an urgent task for breeders.

As maize matures, the change in GWC can be divided into two stages, namely, dry matter accumulation before physiological maturity (PM) and natural drying after PM (Shaw and Loomis 1950). A high grain filling rate before PM and a high grain dehydration rate (GDR) after PM contribute to a lower GWC at harvest (Freppon et al. 1992). Previous studies reported that GWC and GDR in maize are affected by many other factors, including the endosperm composition, starch synthesis in kernels, kernel shape, rows per ear, husk characteristics, cultivation measures, and the temperature and humidity of the external environment (Schmidt and Hallauer 1966; Purdy and Crane 1967; Nass and Crane 1970; Hunter et al. 1979; Hadi et al. 2009; Takhar et al. 2011; Li et al. 2016; Qu et al. 2021).

Currently, the genetic analysis of GWC is a research hotspot for geneticists and molecular biologists, who have made great efforts to understand its regulatory mechanism (Sala et al. 2006, 2012; Liu et al. 2010; Wang et al. 2012; Li et al. 2014; Dai et al. 2017; Jia et al. 2020; Li et al. 2020b, 2021b; Zhang et al. 2020). Sala et al. (2006) used 181 F2:3 family lines for linkage mapping and identified six putative QTL for grain moisture (GM) at harvest with 10.4 and 19.7% of the phenotypic variation, while three QTL accounted for 16.4 and 20.7% of the phenotypic and genotypic variation for GDR, respectively. Jia et al. (2020) identified seven commonly associated SNPs for GDR using 309 maize inbred lines, along with 285 differentially expressed genes from transcriptome analysis. Liu et al. (2020) successfully refined one QTL, qGwc1.1, into a 2.05 Mb region using 362 recombinant inbred lines (RILs). Meanwhile, by combining an association panel, genetic population analysis, transcriptomic profiling, and gene editing, Li et al. (2021c) revealed 71 QTL that influence GWC and identified GRMZM5G805627 (ZmGAR2) and GRMZM2G137211 (ZmCRY1-9) as candidate genes underlying major QTL for grain moisture in maize.

Although existing studies have made substantial progress in understanding the genetic characterization of GWC and GDR, few genes have been cloned and few reports have focused on the hybrids widely used in production. In this study, we constructed a hybrid population comprising 442 different F1 crosses and performed a genome-wide association analysis of GWC and GDR after physiological maturity. We then analyzed the genetic basis of GWC in hybrid maize and nominated several candidate genes. The results can assist and guide future breeding work.

Materials and methods

Materials and field experiment design

One hybrid population containing 442 F1 cross-combinations (including four normal hybrids as checks) was derived from 113 inbred lines based on the North Carolina II (NCII) mating design (Table S1). Among them, 25 inbred lines were selected from the Shaan A group (named group A) and 88 from the Shaan B group (named group B). Nine inbred lines were set as test lines: A008 (KA105), A009 (KA064), A019 (91227), and A021 (2014KA60) from group A, and B018 (KB024), B031 (KB207), B054 (KB106), B110 (KB262), and B137 (KB588) from group B. All F1 materials were planted with two replications in an alpha lattice design at two locations, Yangling (34° 16′ N, 108° 40′ E) and Yulin (38° 30′ N, 109° 77′ E), in Shaanxi province. Each material was planted in four rows with a 5 m row length, 0.6 m row spacing, and 18 cm plant spacing. Field management followed local production practice. The detailed mating design and field experiment design were reported in a previous study (Li et al. 2020a).

Phenotype collection and statistical analysis

For each F1 material, at least 15 individuals with the same flowering period and growth vigor were marked at the silking stage for subsequent measurement. The grain water content (GWC) was measured on individuals with a uniform silking time during kernel development using a digital timber-moisture meter (BLD5609, PROTIMETER Company, TIMBERMASTER), which has been validated and used in genetic analyses of GWC in other research (Yang et al. 2010). To reduce the influence of the growth period on the GWC, all phenotypes were determined on the same individuals and measured every 7 days, starting at 35 days after silking (DAS) and continuing until harvest (GWC at 35 DAS was denoted M35, and the subsequent measurements were denoted M42, M49, M56, and M63). For each material, we recorded nine data points per period, from three ears with three readings per ear. Meanwhile, ten other traits were investigated: the ear leaf length (ELL, cm), ear leaf width (ELW, cm), ear leaf area (ELA, cm², calculated as ELW × ELL × 0.75), rind penetrometer resistance of the third internode above ground (RPR_TIAG, N/mm²), rind penetrometer resistance of the first internode under the ear (RPR_IUE, N/mm²), tassel branch number (TBN), plant height (PH, cm), ear height (EH, cm), grain moisture (moisture, %, determined by PM-8188), and grain yield (yield, t/ha, adjusted to a 14% moisture content), as described previously (Li et al. 2020a).

In addition, we calculated the grain dehydration rate (GDR) according to the area under the dry down curve (AUDDC) method, which can quickly and effectively identify the GDR of hybrids, as the standard for evaluating GDR (Yang et al. 2010). The lower the AUDDC, the higher the GDR. The formula is as follows:

$$\mathrm{AUDDC}=\sum_{i=1}^{n-1}\left[\frac{{y}_{i}+{y}_{i+1}}{2}\right]\left({t}_{i+1}-{t}_{i}\right)$$

where n is the number of evaluations, y is the GWC, i indexes the ith measurement date, and t is the number of days after silking. Using this method, we calculated the AUDDC of four stages (A1 from the GWC at 35 and 42 DAS, A2 from the GWC at 42 and 49 DAS, A3 from the GWC at 49 and 56 DAS, and A4 from the GWC at 56 and 63 DAS) and the total AUDDC (denoted AT, the sum of A1, A2, A3, and A4).
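As a minimal illustration of the AUDDC formula, the following Python sketch applies the trapezoidal sum to one hybrid. The function name `auddc` and the GWC readings are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def auddc(gwc: Sequence[float], das: Sequence[float]) -> float:
    """Trapezoidal area under the dry-down curve for paired (DAS, GWC) values."""
    if len(gwc) != len(das) or len(gwc) < 2:
        raise ValueError("need at least two paired (DAS, GWC) observations")
    return sum(
        (gwc[i] + gwc[i + 1]) / 2 * (das[i + 1] - das[i])
        for i in range(len(gwc) - 1)
    )


# Hypothetical GWC readings (%) for one hybrid at 35, 42, 49, 56, and 63 DAS.
das = [35, 42, 49, 56, 63]
gwc = [40.0, 38.0, 35.0, 31.0, 28.0]

# Stage-wise values (A1 to A4) use consecutive measurement pairs.
stages = [auddc(gwc[i:i + 2], das[i:i + 2]) for i in range(4)]

# The total (AT) over all five dates equals the sum of the four stages.
total = auddc(gwc, das)
```

With these illustrative readings, each stage contributes the mean of two consecutive GWC values multiplied by the 7-day interval, so a hybrid that dries faster yields smaller stage values and a smaller total.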

The descriptive statistics for the phenotypes were computed using Microsoft Excel 2019 and R (version 4.0.4; https://www.r-project.org/). Analysis of variance and broad-sense heritability (H2) were calculated with Genstat 21st edition (www.vsni.co.uk). The variance analysis model is as follows:

$${Y}_{ijk}=\mu +{G}_{i}+{E}_{j}+{GE}_{ij}+{R}_{j(k)}+{\varepsilon }_{ijk}$$

where Yijk is the observed value for the ith combination in the kth replication of the jth environment, μ is the overall mean, Gi is the genotype effect, Ej is the environmental effect, GEij is the genotype × environment interaction effect, Rj(k) is the replication effect, and εijk is the error term.

The calculation formula for H2 is

$${H}^{2}=\frac{{\sigma }_{G}^{2}}{{\sigma }_{G}^{2}+\frac{{\sigma }_{G\times E}^{2}}{n}+\frac{{\sigma }_{e}^{2}}{n\times r}}$$

where \({\sigma }_{G}^{2}\) is the genetic variance, \({\sigma }_{G\times E}^{2}\) is the variance of the genotype × environment interaction, \({\sigma }_{e}^{2}\) is the variance of the random error, n is the number of environments, and r is the number of replications. Best linear unbiased estimations (BLUEs) were obtained with the R package lme4 (Bates et al. 2015; Li et al. 2020a).
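The heritability formula can be sketched in a few lines of Python. The function name and the variance components below are hypothetical and serve only to show how the three components are weighted by the number of environments and replications.

```python
def broad_sense_heritability(
    var_g: float, var_ge: float, var_e: float, n_env: int, n_rep: int
) -> float:
    """H2 = var_g / (var_g + var_ge / n + var_e / (n * r))."""
    if n_env < 1 or n_rep < 1:
        raise ValueError("n_env and n_rep must be positive")
    return var_g / (var_g + var_ge / n_env + var_e / (n_env * n_rep))


# Hypothetical variance components for a trait measured in two
# environments with two replications each (the layout used in this study).
h2 = broad_sense_heritability(var_g=2.0, var_ge=1.0, var_e=4.0, n_env=2, n_rep=2)
```

Because the interaction and error variances are divided by n and n × r, adding environments or replications raises the estimated heritability of the genotype means even when the variance components themselves are unchanged.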

Genome-wide association analysis in hybrid combinations and candidate genes’ screening

Genotyping and the analysis of genetic characteristics have been described in detail previously (Li et al. 2020a). The additive and dominance effects of each locus and the pairwise interaction (epistatic) effects can be estimated in a genome-wide association study (GWAS). The baseline model used for genome-wide association mapping is the mixed linear model (Yu et al. 2006), which has become a standard method for GWAS. To comprehensively analyze the genetic mechanism of hybrid GWC and GDR, we adopted a GWAS method that estimates the genetic architecture considering additive (Add), dominance (Dom), and epistatic effects (additive × additive, AA; additive × dominance, AD; dominance × additive, DA; dominance × dominance, DD) (Jiang et al. 2017). The effects are estimated as follows:

$$Y\sim \mu +\sum_{i}{a}_{i}+\sum_{i}{d}_{i}+\sum_{i,j,i\ne j}(\frac{1}{2}{a}_{i}{a}_{j}+\frac{1}{2}{a}_{i}{d}_{j}+\frac{1}{2}{d}_{i}{a}_{j}+\frac{1}{2}{d}_{i}{d}_{j})+\varepsilon$$

where \(Y\) is the phenotypic value, \(\mu\) is the population mean, \({a}_{i}\) is the additive effect of locus \(i\), \({d}_{i}\) is the dominance effect of locus \(i\), \(\frac{1}{2}{a}_{i}{a}_{j}+\frac{1}{2}{a}_{i}{d}_{j}+\frac{1}{2}{d}_{i}{a}_{j}+\frac{1}{2}{d}_{i}{d}_{j}\) is the epistatic effect between loci \(i\) and \(j\), and \(\varepsilon\) is the residual error.
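To make the four epistatic components concrete, the sketch below codes a pair of biallelic SNP genotypes into additive, dominance, and interaction indicators. The coding (additive −1/0/1, dominance 0/1/0) is a common textbook convention assumed here for illustration; it may differ from the exact parameterization of Jiang et al. (2017), and the 1/2 weights of the model are not applied. All function names are hypothetical.

```python
from typing import Dict


def additive_code(alt_count: int) -> int:
    """Map the alternate-allele count (0, 1, 2) to -1, 0, 1 (assumed coding)."""
    if alt_count not in (0, 1, 2):
        raise ValueError("genotype must be coded as 0, 1, or 2 alternate alleles")
    return alt_count - 1


def dominance_code(alt_count: int) -> int:
    """Return 1 for a heterozygote and 0 for either homozygote (assumed coding)."""
    if alt_count not in (0, 1, 2):
        raise ValueError("genotype must be coded as 0, 1, or 2 alternate alleles")
    return 1 if alt_count == 1 else 0


def epistatic_codes(g_i: int, g_j: int) -> Dict[str, int]:
    """Products of the main-effect codes give the AA, AD, DA, and DD indicators."""
    a_i, d_i = additive_code(g_i), dominance_code(g_i)
    a_j, d_j = additive_code(g_j), dominance_code(g_j)
    return {"AA": a_i * a_j, "AD": a_i * d_j, "DA": d_i * a_j, "DD": d_i * d_j}


# A hybrid homozygous at locus i and heterozygous at locus j contributes
# only to the additive x dominance (AD) term under this coding.
example = epistatic_codes(2, 1)
```

Under this coding, only double heterozygotes carry a nonzero DD indicator, and only double homozygotes carry a nonzero AA indicator, which is why a hybrid population with many heterozygous loci is informative for dominance-related terms.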

The genotype data used for the genome-wide association analysis have been described in detail previously (Li et al. 2020a). After retaining SNPs with a minor allele frequency (MAF) ≥ 0.05 and a missing rate (MR) < 0.1, 19,461 high-quality SNPs remained, from which the number of independent SNPs (883) was calculated (Gao et al. 2008). On the −log10(P) scale, the significance threshold for the additive and dominance effects was set to 3.95 (P = 0.1/883), and a stricter threshold of 7.59 (P = 0.01/[883 × (883 − 1)/2]) was applied to the epistatic effects to facilitate the identification of more reliable GWC-related interacting SNPs.
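The two significance thresholds follow directly from the number of independent SNPs, as the short Python check below shows. The variable names are illustrative; only the values 883, 0.1, and 0.01 come from the text.

```python
import math

n_independent = 883  # independent SNPs among the 19,461 markers

# Single-locus tests (additive and dominance effects): one test per independent SNP.
single_locus_tests = n_independent

# Epistasis tests: one test per unordered pair of independent SNPs.
pairwise_tests = n_independent * (n_independent - 1) // 2

# Thresholds expressed as -log10(P).
main_effect_threshold = -math.log10(0.1 / single_locus_tests)
epistasis_threshold = -math.log10(0.01 / pairwise_tests)

print(f"{main_effect_threshold:.2f} {epistasis_threshold:.2f}")
```

The epistasis threshold is stricter for two reasons: the nominal level drops from 0.1 to 0.01, and the number of tests grows from 883 to 389,403 SNP pairs.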

Candidate genes’ annotation and protein-coding interaction

According to the LD decay distance (150 kb at R2 > 0.2), we considered genes within 150 kb of the significant loci as candidate genes, using the B73_RefGenV4 genome database on the MaizeGDB website (https://www.maizegdb.org/) (Li et al. 2018b). Gene expression data came from previous studies (NCBI Gene Expression Omnibus under accession number GSE15881, Qu et al. 2022). The edgeR software (http://www.bioconductor.org/packages/release/bioc/html/edgeR.html) was used to screen the differentially expressed genes (DEGs), with a significance threshold of 0.05 and a fold change of 2. Meanwhile, the public databases MaizeGDB, NCBI (https://www.ncbi.nlm.nih.gov/), and InterPro (https://www.ebi.ac.uk/interpro/) were used to search for the structure and function of the protein families. For GO enrichment analysis, the background annotation came from AGRIGO (http://bioinfo.cau.edu.cn/agriGO/), and the analysis used the tools in Gene Denovo (https://www.omicshare.com/tools/). The protein interaction network of the candidate genes was constructed using the online tool STRING (https://cn.string-db.org/) and visualized in Cytoscape (3.9.0).
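The window-based selection of candidate genes can be sketched as a simple interval-overlap query. The gene identifiers and coordinates below are hypothetical placeholders rather than B73_RefGenV4 annotations; only the 150 kb window and the SNP position chr2:14,522,675 come from this study.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start: int
    end: int


def genes_near_snp(
    genes: List[Gene], chrom: str, pos: int, window: int = 150_000
) -> List[str]:
    """Return IDs of genes overlapping [pos - window, pos + window] on chrom."""
    lo, hi = pos - window, pos + window
    return [
        g.gene_id
        for g in genes
        if g.chrom == chrom and g.end >= lo and g.start <= hi
    ]


# Hypothetical annotation records (placeholder IDs and coordinates).
annotation = [
    Gene("geneA", "chr2", 14_400_000, 14_405_000),  # inside the window
    Gene("geneB", "chr2", 14_700_000, 14_710_000),  # beyond the window
    Gene("geneC", "chr8", 14_500_000, 14_510_000),  # different chromosome
    Gene("geneD", "chr2", 14_670_000, 14_680_000),  # overlaps the window edge
]

candidates = genes_near_snp(annotation, "chr2", 14_522_675)
```

Here a gene counts as a candidate if any part of it overlaps the window, which is one reasonable reading of "within 150 kb"; requiring the whole gene to lie inside the window would be a stricter alternative.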

Results

Phenotypic variations of GWC in hybrid population

To evaluate the dynamic variation of GWC and the corresponding AUDDC from maturity to harvest in maize hybrids, the GWC was determined in five periods (M35, M42, M49, M56, and M63, i.e., 35 to 63 days after silking) in two environments. The analysis of variance components showed that the effects of genotype and of the genotype-environment interaction were significant (P < 0.001) for GWC and AUDDC in all periods (Table 1). In particular, the effects of both parents were significant (P < 0.001) for GWC and AUDDC in all periods, and the genotype interaction effect between the two parents was significant in most periods, except for M49. However, the interaction effect between the parental combination and the environment was significant only in M42, M63, and A4.

Table 1 Analysis of variance components (σ2) and descriptive statistics of BLUEs for GWC and AUDDC in hybrid population

As the kernels developed, the average GWC decreased from 40.28 to 27.65%, and its coefficient of variation (CV) increased from 1.99 to 8.31%. The same pattern was observed for AUDDC, for which the average changed from 277.72 to 209.30 and the CV from 2.27 to 6.56%. This result indicates that the differences between materials became increasingly apparent at later times because of their different dehydration rates. The heritability ranged from 0.42 to 0.77 for GWC at different stages and from 0.64 to 0.83 for AUDDC during different periods (Table 1). In addition, we found that GWC and AUDDC were highly significantly positively correlated at different stages (Fig. 1), and the grain yield (GY, t/ha), grain moisture (GM) at harvest, ear leaf width (ELW), and ear leaf area (ELA) were positively correlated with GWC and GDR at later stages (Figure S1).

Fig. 1

The pairwise correlations among the GWC, AUDDC, and the 10 other agronomic traits based on the best linear unbiased estimations (BLUEs) of 442 single-cross combinations. The numbers are Pearson correlation coefficients, and stars indicate the significance level: * for 0.05, ** for 0.01, *** for 0.001, and no star for not significant

Parental effect of GWC in hybrid population

To verify the effect of different test parents on the GWC and AUDDC in the F1 generation, we compared hybrids derived from different tester lines using the Fisher LSD method at a significance level of 0.05 (Fig. 2). The decline of GWC was apparently faster in the later stage than in the earlier stage (Fig. 2A). When we plotted the individual trends for the F1s derived from the same test line, the GWC of the combinations from the test lines A008, A009, and B137 was at a high level in all periods, while that of the combinations from A021 and B018 was always at a low level. Interestingly, the combinations from B110 had a high GWC at 35–49 DAS but dehydrated rapidly in the later stage, so this line can be used to improve late-stage dehydration during breeding (Fig. 2B, C). These significant differences among the groups of combinations sharing a test line indicate that the test lines contribute to the GWC of their combinations. Among them, A021 and B018 can be used to improve the germplasm by improving the dehydration rate during the breeding process.

Fig. 2

Phenotypic variation of GWC in this hybrid population. A GWC trend of all materials. B The average GWC trend of hybrid combinations with different test lines. C Differences in GWC between 35 and 63DAS of hybrid combinations with different test lines

Genome-wide association analysis of GWC and AUDDC

Genome-wide association analysis of GWC and AUDDC using the BLUE values of 442 F1 hybrids across two environments (Yangling and Yulin) identified 48 independently associated SNPs, including 22 SNPs for GWC and 26 for AUDDC, with ten SNPs common to GWC and AUDDC (Table 2, Fig. 3A, and Figures S2 and S3). After the associated SNPs with the higher P value were removed according to the R2 between SNPs, 26 unique SNPs remained, and 3 of them were simultaneously associated with more than three GWC and AUDDC traits. One was chr4:150,010,129, which was simultaneously associated with M35, M42, M49, A1, A2, A3, and AT. The other two were chr2:14,522,675 and chr8:165,108,325, both of which were associated with M56, A3, A4, and AT. In addition, five associated SNPs with more than 10% of the phenotypic variation explained (PVE) were detected: chr1:4,535,637 for M56 with 16.55% PVE and A3 with 17.65% PVE, chr2:5,320,161 for AT with 21.07% PVE, chr2:14,522,675 for A4 with 10.04% PVE, and chr6:2,463,271 for A4 with 13.09% PVE (Table 2).

Table 2 Significant SNP of GWC and AUDDC identified by genome-wide association analysis
Fig. 3

The significant SNPs identified for GWC and AUDDC at different stages. A Distribution on the 10 chromosomes. Site color represents the -log10 P value. B Phenotypic variation explained by different effects. Add, additive effect; Dom, dominance effect; Epi, epistatic effect

In addition, 64 pairs of interacting SNPs were detected for GWC and 77 pairs for AUDDC, which together contained 162 independent SNPs (Table S1). Among them, 28 common pairs of interacting SNPs were associated with more than two individual traits, such as chr1:156254070_chr5:26,130,738, associated with M49, A3, and AT, and chr1:261928509_chr8:114,505,826, associated with M35, M42, and A1. Across stages, the total PVE varied widely for GWC (11.39–68.2%) and AUDDC (41.07–67.02%) (Fig. 3B). Epistatic effects played an important role in PVE, especially for M49 and M63, accounting for 28.05% and 11.39%, respectively. This suggests that GWC and AUDDC might be controlled by several major genes together with multiple minor genes, and that additive and epistatic effects might be the main genetic effects.

Relationship between significant loci and phenotype

To dissect the effect of the loci on the related phenotypes, we analyzed the phenotypic differences between the genotypes of the associated SNPs. We found that the favorable genotype of the co-located SNPs associated with more than three traits, or of the major SNPs with PVE > 10%, was always homozygous, such as CC at chr1:4,535,637, GG at chr2:14,522,675, TT at chr2:5,320,161, TT at chr2:9,065,055, AA at chr4:150,010,129, CC at chr6:2,463,271, and GG at chr8:165,108,325 (Fig. 4A–G). For the epistatic SNP pairs, individuals with a double homozygous genotype (TT-CC) at the SNP pair chr1:156254070_chr5:26,130,738 tended to have a lower GWC and AUDDC (Fig. 4H). Individuals with double homozygous (TT-CC) and single heterozygous (TT-CT and CT-CC) genotypes at the co-located epistatic SNP pair chr1:261928509_chr8:114,505,826 performed significantly worse in terms of GWC and AUDDC than those with the double heterozygous genotype (CT-CT) (Fig. 4I). For the co-located epistatic SNP pair chr1:267007323_chr2:1,263,421, the lowest GWC and AUDDC occurred in the double homozygous individuals (CC-GG) (Fig. 4J). These results suggest that a low GWC mainly requires the accumulation of homozygous genotypes, together with the interaction of a small portion of heterozygous genotypes with other loci. In the future, KASP markers can be developed from these favorable homozygous alleles to select low-GWC germplasm for breeding varieties suited to mechanized harvesting.

Fig. 4

Phenotypes of the different genotypes at the SNPs detected at multiple stages or with a high phenotypic variation explained (PVE > 10%). A SNP chr4:150,010,129. B SNP chr2:14,522,675. C SNP chr8:165,108,325. D SNP chr1:4,535,637. E SNP chr2:5,320,161. F SNP chr2:9,065,055. G SNP chr6:2,463,271. H Interaction SNP pair chr1:156254070_chr5:26,130,738. I Interaction SNP pair chr1:261928509_chr8:114,505,826. J Interaction SNP pair chr1:267007323_chr2:1,263,421. Phenotypic differences among genotypes were tested by Fisher LSD multiple comparison at a significance level of 0.01

Distribution of favorable alleles in group A and group B

To clarify the distribution of the favorable alleles in group A and group B, we calculated the favorable allele enrichment of the parent materials for GWC and AUDDC at different stages, taking the homozygous genotype with the lowest phenotypic level in the hybrid generation as the favorable allele type. The results showed that, for all materials, M35 (71.1%) and A1 (72.5%) had the highest enrichment of favorable alleles, while the enrichment at the other stages was relatively similar, from 41.1 to 56.9%, except for M63 (11.39%) (Figures S4 and S5 and Table S2). Furthermore, the enrichment of favorable alleles differed significantly between group A and group B only for M35 and A1 and not for the other traits. Overall, we recommend some promising inbred lines, including B056 (KB106), B018 (KB024), B151 (KB228), B022 (KB025), and A021 (2014KA60), which have the potential to reduce maize GWC through the enrichment of favorable alleles.

Candidate genes annotated by bioinformatics

To further infer the candidate genes for GWC and AUDDC, we screened the putative genes within 150 kb of the associated SNPs, which was the LD decay distance in the Shaan A and Shaan B populations. In total, 727 and 860 protein-coding genes were located in the confidence regions for GWC and AUDDC, respectively. Combined with the RNA-seq data for kernels at a later development stage in two inbred lines (A034:KA225 and B008:KB020) (Figure S6A), in which the GWC and AUDDC were evidently different (Qu et al. 2021), we detected 398 GWC-related genes and 457 AUDDC-related genes, with 137 common genes, that were differentially expressed in later kernel development (Tables S3 and S4 and Figure S6B–E).

Through GO enrichment analysis, we found that the 398 putative genes for GWC were enriched in the vitamin E metabolic pathway (GO:0046136, GO:1904965, and GO:1904966), leukotriene process (GO:0006691 and GO:0019370), and hydroxyl acid oxidase activities (GO:0008891, GO:0003973, GO:0052852, GO:0052853, and GO:0052854), which participate in photorespiration and regulate plant development and senescence (Fig. 5A). The 457 putative genes for AUDDC were enriched in the pathway of zeaxanthin epoxidase activity (GO:0052662 and GO:0052663), the metabolic processes of salicylic acid (GO:0046244) and cinnamic acid (GO:0009800 and GO:0009803), and the molecular functions of antheraxanthin epoxidase and zeaxanthin epoxidase activity (Fig. 5B). Most of them are related to abscisic acid synthesis and would promote seed development and fruit maturation.

Fig. 5

GO enrichment analysis of candidate genes. A GWC-related genes. B AUDDC-related genes. Enrichment score = (number of pathway genes/total number of pathway genes)/(number of candidate genes/number of annotated genes in maize genome)
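The enrichment score in the caption is a ratio of two proportions, sketched below in Python. The sketch assumes that "number of pathway genes" means the candidate genes annotated to the pathway; the function name and all counts are hypothetical.

```python
def enrichment_score(
    candidates_in_term: int,
    genes_in_term: int,
    n_candidates: int,
    n_annotated: int,
) -> float:
    """(candidates in term / all genes in term) / (candidates / annotated genes)."""
    if min(genes_in_term, n_candidates, n_annotated) <= 0:
        raise ValueError("denominators must be positive")
    return (candidates_in_term / genes_in_term) / (n_candidates / n_annotated)


# Hypothetical counts: 4 of the 20 genes in a GO term are among 398 candidates,
# drawn from an assumed background of 39,800 annotated genes.
score = enrichment_score(4, 20, 398, 39_800)
```

In this made-up case the term contains candidate genes about 20 times more often than expected from the background proportion, so scores well above 1 indicate over-representation.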

Interaction network predicted by the candidate genes

To gain an understanding of the regulation network for GWC and AUDDC, we constructed an interaction network on the STRING website (https://cn.string-db.org/) using the proteins encoded by the candidate genes that were both located in the GWAS confidence regions and differentially expressed in the RNA-seq data (Fig. 6 and Tables S3 and S4). This revealed that 76 of them interacted with each other and formed a large network, including NAD7 (NADH dehydrogenase subunit 7), PPI1 (microtubule-associated protein futsch-like isoform X1), ENO1 (enolase I), and H3 (histone H3), which are structural proteins or key enzymes that maintain the normal function of cells. Meanwhile, apoptosis-related autophagy family proteins (ATG4B, ATG8C, and ATG18A), auxin-related proteins (ABP1, ARF4, ARF23, GRF10, and IAA31), and transcription factors (MADS5, WRKY3, GATA4, and GATA12) were involved in the regulation network.

Fig. 6

The interaction network of the proteins encoded by all candidate genes. Yellow, proteins encoded by GWC candidate genes; green, proteins encoded by AUDDC candidate genes; red, proteins encoded by candidate genes for both traits

Beyond this, to explore the interactions between the identified epistatic pairs, SNPs involved in multiple epistatic effects were screened out for an interaction network analysis of the proteins encoded by their candidate genes. For one group of 13 SNPs involved in epistasis, the proteins encoded by the candidate genes at these sites interacted with one another, and interactions were also observed for another group of 12 SNPs (Figure S7). These results support the reliability of the epistatic interactions between SNPs identified by GWAS in this study.

Discussion

GWAS analysis for GWC and GDR in F1 population

With the development of agriculture, mechanized kernel harvesting has become a major production mode. To achieve a good harvest, low grain moisture at harvest has become a new breeding target all over the world (Chai et al. 2017). In past decades, many QTL were identified in genetic populations derived from bi-parental crosses (Austin et al. 2000; Capelle et al. 2010; Li et al. 2014; Liu et al. 2020). In 2021, Li et al. identified GRMZM5G805627 and GRMZM2G137211 as candidate genes underlying major QTL for grain moisture in maize by combining GWAS analysis, transcriptomic profiling, and gene editing. Later, Qu et al. (2022) conducted GWAS for grain moisture and the dehydration rate from 7 days after pollination (DAP) to 70 DAP and recommended a series of candidate genes, including Zm00001d047799 (ZmHSP5), by integrating multiomics analysis and mutants. However, most existing studies focus on natural populations or progeny populations derived from a few parents, and a large gap remains in dissecting the regulatory mechanism of GWC, including dominance and epistatic effects, especially in hybrid populations. In fact, in addition to allowing dominance and epistatic effects to be evaluated, a hybrid design can produce a mapping population of considerable size and wide genetic diversity in a relatively short time from a limited number of parental inbred lines. Xiao et al. (2021) analyzed the possible formation mechanism of dominance and epistatic effects by combining the mapping results for hybrids, inbred lines, and heterosis. Huang et al. (2015) analyzed 130 QTL related to 38 traits and found that most of the heterozygous QTL showed incomplete dominance.

In this study, a population of 442 F1 hybrids derived from 113 inbred lines was used to conduct GWAS for GWC and GDR after physiological maturity, and 188 unique SNPs with 26 common SNPs were identified. Among them, 162 SNPs regulated GWC and GDR through minor epistatic effects. Many of the SNPs were located in QTL that had been identified by previous researchers (Blanc et al. 2006; Wang et al. 2012; Xiang et al. 2012; Qian et al. 2016; Liu et al. 2020) (Table S5). For instance, the SNP located at chr2:14,522,675 and associated with A3, A4, AT, and M56 was also located in the confidence regions reported by Li et al. (2014) and Wang et al. (2012); the same was true for chr5:76,245,305, among others.

Identification of candidate genes for GWC and GDR

In previous studies, many QTL and associated SNPs were identified by linkage and association mapping (Austin et al. 2000; Capelle et al. 2010; Li et al. 2014, 2021b; Qu et al. 2022; Liu et al. 2020). Yet, only a few genes affecting GWC have been validated by gene editing or mutants, such as GAR2 (GRMZM2G137211), CRY1-9 (GRMZM5G805627), and ZmHSP5 (Zm00001d047799) (Li et al. 2021c; Qu et al. 2022). By combining GWAS and transcriptome analysis, we nominated 718 candidate genes, including HSP70-6 and HSP70-8, members of the heat shock protein 70 kDa family to which ZmHSP5 (Qu et al. 2022) belongs, along with ZmCRY1 (Zm00001d028434), a homolog of ZmCRY1-9 identified by Li et al. (2021c) (Table S4). These genes were enriched in the metabolic processes of vitamin E, leukotriene, salicylic acid, and cinnamic acid, and in the molecular functions of hydroxyl acid oxidase, antheraxanthin epoxidase, and zeaxanthin epoxidase activities.

In addition, three autophagy-related genes were identified: ATG8C (Zm00001d002257), ATG4B (Zm00001d026641), and ATG18A (Zm00001d011920) (Tables S3 and S4). Autophagy has been reported to participate in the senescence of tissues or organs and to transfer nitrogen from aging organs to grains (Yamada et al. 2009; Reyes et al. 2011; Wang et al. 2013; Li et al. 2015; Masclaux-Daubresse et al. 2017). Similarly, we detected many candidate genes involved in flower development (Zm00001d045231-MADS5), seed maturation (Zm00001d048369-COL3 and Zm00001d007107-COL13), and auxin metabolism (Zm00001d041711-ABP1, Zm00001d011953-ARF4, and Zm00001d041418-IAA31), which may play a key role in the middle stage of seed dehydration (Silva and Goring 2002; Haffani et al. 2006; Bai et al. 2009; Chen et al. 2020; Shen et al. 2020; Huang et al. 2021; Li et al. 2021a; Ghorbani et al. 2021; Ramirez-Ramirez et al. 2021; Mengarelli and Zanor 2021). It is worth noting that MADS family proteins have been reported to regulate IAA through ARF4 (Ge et al. 2016), which is consistent with the finding in this study that MADS5 interacted with IAA3 via CRR2 and ARF4 (Fig. 6). Furthermore, another putative gene, Zm00001d002517 near chr2:14,522,675, encodes C/IF2, a small protein that has been reported to regulate seed and fruit development and senescence by inhibiting sucrose hydrolysis (Qin et al. 2016; Tang et al. 2017; Zhang et al. 2018), as does the ARF4-encoding gene Zm00001d011953 near chr8:165,108,325. Therefore, we inferred that the regulation of autophagy and auxin plays an important role in GWC and GDR. This can provide a reference for determining the truly functional genes in further research.

Improvement of germplasm with low GWC for breeding

In maize production, germplasm and hybrids with a low GWC at harvest are well suited to mechanized operation. In traditional breeding, rapidly dehydrating maize was selected empirically from morphological characteristics, or GWC was determined by oven-drying; the former is quick and easy but has low accuracy, whereas the latter is accurate but time-consuming and laborious (Borras et al. 2003; Sala et al. 2007; Wang et al. 2012; Liu et al. 2020). Later, a more rapid GWC measurement method was developed, and many studies have used it to investigate the genetic mechanisms of GWC and GDR (Reid et al. 2010; Yang et al. 2010; Kebede et al. 2016; Qian et al. 2016; Zhou et al. 2018). However, the accuracy of the electrical devices used can still be biased by factors such as the measurement location, climate differences, and the operator. With the development of molecular technology, developing molecular markers for screening low-GWC materials will be an effective way to accelerate the breeding of maize germplasm suited to mechanization (Chai et al. 2012; Hao et al. 2014; Wang et al. 2016; Zhao et al. 2012).

By conducting GWAS and dissecting the genetic effects of the associated SNPs in the F1 hybrid population, we inferred that GWC and GDR were mainly affected by additive effects and could be stably inherited from parents to hybrids, which is consistent with previous studies (Crane et al. 1959; Filipenco et al. 2013; Li et al. 2020b, 2021b). Furthermore, for the identified SNPs with > 10% PVE or commonly associated with more than two traits, the homozygous genotype was the favorable one (Fig. 4). This suggests that molecular markers can be developed and applied to select directly for a low GWC or rapid dehydration in maize, especially markers for the SNPs with > 10% PVE (Fig. 4).

Conclusions

In this study, we conducted a genome-wide association analysis to determine the genetic characterization of GWC and GDR in a maize hybrid population with 442 F1 crosses, and we proposed that GWC and GDR were mainly affected by additive and epistatic effects. Based on 718 candidate genes, we speculated that autophagy and auxin regulation were key pathways affecting GWC, and we identified candidate genes related to them for further research and analysis. This research contributes to clarifying the genetic mechanisms of GWC and GDR, and it also provides information on elite germplasm for mechanized harvesting, which is useful for future maize breeding.