Introduction

Maize (Zea mays L.) is the most popular cereal crop and is widely cultivated for use as staple food, livestock feedstock, edible oil, and biofuel (Mackay 2009). Maize ranks second after wheat in terms of harvested area, but its production and yields have been rising globally because of increases in the harvested area of maize. World production, harvested area, and yield for maize were 1162.4 million metric tons, 202.0 million ha, and 5.8 t/ha, respectively, in 2020 (FAOSTAT 2020). Maize, a naturally cross-pollinated plant, undergoes pollination by other maize plants for over 97% of its kernels under field conditions (Brittan 2006). Maize hybrids typically exhibit greater yield and quality, such as uniform grain color and size, compared with open-pollinated varieties (OPVs) in open-field conditions owing to the effect of heterosis, which results in higher vigor and productivity (van Heerwaarden et al. 2009). Therefore, almost all commercially cultivated maize varieties are hybrids (Wagner et al. 2021). Among the different types of hybrids, such as single-cross, three-way, and double cross hybrids, single-cross hybrid refers to a hybrid maize variety produced by crossing two inbred lines, and this type is widely used in modern agriculture because of its high yield potential and uniformity compared with other types of hybrids (Duvick 2001; MacRobert et al. 2014). However, F1 seed productivity of a single-cross hybrid is lower than that of other hybrids because the female is an inbred line (MacRobert et al. 2014). To ensure effective hybrid development, it is important to possess sufficient understanding of the genetic diversity among maize inbred lines, as well as their breeding potential (Adu et al. 2021). This knowledge serves as a guide when selecting appropriate parental candidates for the desired hybrids. Consequently, development and selection of elite inbred lines are critical for the success of a hybrid breeding program. 
Normal maize, which is widely cultivated and utilized in food and feed worldwide, contains a starch that is approximately 25% amylose and 75% amylopectin (Nelson and Rines 1962). On the other hand, waxy maize, in which the starch is exclusively made up of amylopectin, is a distinct type of maize that is utilized in food production in China and South Korea (Devi et al. 2017). A large number of waxy maize inbred lines have been developed from various sources in South Korea. These inbred lines were either obtained from local farmers in South Korea and other countries or collected from the Genebank at the Rural Development Administration (RDA) in South Korea.

According to breeding practice, the performance of parental inbred lines is not always consistent with hybrid performance, i.e., elite hybrid varieties are not necessarily derived from elite parental inbred lines (Yu et al. 2020). Therefore, maize breeders and researchers should evaluate a parental inbred line by both its own performance and its potential to produce superior hybrids (Riedelsheimer et al. 2012). Information on the genetic diversity (GD) and relationships among diverse maize inbred lines has a significant impact on the improvement of new cultivars because it is useful for planning crosses for hybrid and inbred line development, assigning lines to heterotic groups, and protecting plant varieties (Hallauer et al. 1988; Pejic et al. 1998). Traditional methods, such as morphological data and pedigree records, have limitations in assessing GD and relationships among inbred lines because they are affected by environmental interactions and are labor intensive. Molecular markers can overcome these limitations, and several researchers have estimated GD and relationships using various molecular markers (Melchinger et al. 1992; Russell et al. 1997; Dao et al. 2014; Wang et al. 2008). Initial studies on predicting hybrid performance using molecular markers relied on calculating the genetic distance between lines. However, correlations between the GD of parents and hybrid performance were not significant in previous studies (Melchinger 1999; Jordan et al. 2003). Heterosis, also known as hybrid vigor, is a complex phenomenon that cannot be attributed to a single hypothesis. It is influenced by various factors including the genetic distance between the parents, the mode of reproduction, the specific traits being studied, and the environmental conditions in which parental lines perform well (Kaushik et al. 2018).

As a result of population growth and a decrease in cultivated land, food insecurity has become a pressing issue in recent years. Consequently, improving crop yield has become the most important aspect of maize breeding programs (Li et al. 2011). Thus, there has been an increasing focus on grain yield and yield-related traits of maize. Kernel number and weight are essential components of crop yield and are also key traits for domestication and breeding (Kesavan et al. 2013). Kernel number is generally considered the main factor determining grain yield because maize accommodates environmental variation by adjusting seed number (Borrás et al. 2004; Egli 1998). Although relatively stable in response to environmental variation, kernel weight can still have an impact on grain yield (Borrás and Gambín 2010; Peltonen-Sainio et al. 2007). In South Korea, the Maize Experiment Station operated by the Kangwon Agricultural Research and Extension Services has developed a substantial number of elite inbred lines. These elite inbred lines have been used as parental lines for superior single-cross hybrid cultivars in South Korea. To ensure the long-term success of maize breeding programs in South Korea, it is necessary to conduct a genetic characterization of these inbred lines because their GD and population structure have rarely been analyzed using molecular markers. Therefore, this study genotyped a number of waxy maize inbred lines along with other types of maize, such as popcorn and flint maize, using simple sequence repeat (SSR) markers and evaluated 100-kernel weight (100KW) as a component of grain yield for each inbred line. The objectives of this study were to investigate the GD and population structure of the waxy maize inbred lines, which were used for developing superior varieties, using SSR markers, and to confirm the SSR markers related to 100KW using association analysis in waxy maize inbred lines collected from South Korea.

Materials and Methods

Plant Materials and Phenotypic Analysis

This study used 41 elite maize inbred lines that were generated and obtained from the Maize Experiment Station, Gangwon Agricultural Research and Extension Service, Hongcheon. Although almost all the inbred lines were waxy maize, there were some popcorn and flint maize inbred lines. The inbred lines are utilized as parental lines to produce various F1 hybrid varieties (Table 1). For example, HW9 and HW3 served as parental lines for Mibaek2ho, which is the most popular waxy maize variety in South Korea (Park et al. 2007). Although the Gangwonchal and Saekchalgyo series are still unregistered, most of the maize varieties that were developed by crossing these inbred lines have been registered as cultivars in the Korea Seed & Variety Service (KSVS) and are being used in South Korea. To evaluate 100KW, ten individuals of each of the 41 elite maize inbred lines were grown in a completely randomized design with three replicates at a planting density of 70 × 25 cm at the College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do, in 2022. For association analysis, 100KW was measured on seeds taken from the middle portion of the topmost ear of each line, and the average of three replicates per line was used (Fig. 1).

Table 1 Main inbred lines and F1 hybrids developed in the Maize Experimental Station
Fig. 1 Bar plot of 100KW traits in the waxy maize inbred lines. The color of the bar plot represents the three repetition values of each inbred line

DNA Extraction and SSR Amplification

Genomic DNA for the waxy maize inbred lines was obtained from young leaves using the Dellaporta et al. (1983) method with minor modifications. In the analysis of GD, population structure, and the association between markers and 100KW traits in the waxy maize inbred lines, a total of 150 SSR markers were utilized, with an average of 15 loci per each of the 10 maize chromosomes (Table 3, Supplementary Table 1). Information on the SSR markers, including their chromosome location, and sequences of forward and reverse primers were obtained from MaizeGDB (http://www.maizegdb.org/).

SSR amplification was carried out using an Ex Taq polymerase chain reaction (PCR) kit (Takara, Ohtsu, Japan). For PCR amplification of the SSR loci, each reaction had a total volume of 20 µL and contained 20 ng of genomic DNA, 1 × Ex Taq buffer, 0.5 µM of forward and reverse primers, 0.2 mM dNTP mixture, and 1 unit of Ex Taq polymerase. The PCR protocol involved an initial denaturation step at 94 °C for 5 min, followed by a second step consisting of denaturation at 94 °C for 1 min, annealing at 65 °C for 1 min, and extension at 72 °C for 2 min. The annealing temperature was reduced by 1 °C after every cycle until it reached a final annealing temperature of 55 °C. The second step was then repeated 36 times. After the completion of these two steps, a final extension step was carried out at 72 °C for 5 min.
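The stepwise reduction of the annealing temperature described above follows the pattern commonly called touchdown PCR. As a minimal sketch, the annealing temperature of each cycle in the reduction phase can be listed as follows. The function name and its parameters are hypothetical, and the sketch covers only the reduction from 65 °C to 55 °C, not the subsequent repeated cycles.

```python
def touchdown_annealing(start_c=65, final_c=55, step_c=1):
    """Return the annealing temperature (deg C) used in each reduction cycle."""
    if step_c <= 0 or final_c > start_c:
        raise ValueError("expected step_c > 0 and final_c <= start_c")
    temps = []
    t = start_c
    while t > final_c:
        temps.append(t)
        t -= step_c
    temps.append(final_c)  # the last reduction cycle runs at the final temperature
    return temps


schedule = touchdown_annealing()
print(len(schedule), schedule[0], schedule[-1])
```

Under these assumed defaults the list has 11 entries, running from 65 down to 55 in 1 °C steps.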

Following PCR, DNA electrophoresis was conducted using a mini vertical electrophoresis system (MGV-202–33, CBS Scientific Company, San Diego, USA). Three µL of the PCR product was mixed with 3 µL of formamide loading dye (composed of 98% formamide, 0.02% bromophenol blue (BPB), 0.02% xylene C, and 5 mM NaOH). Two µL of the sample was loaded onto a 6% acrylamide‐bisacrylamide gel (19:1) in 0.5 × Tris–borate-EDTA (TBE) buffer and electrophoresed at 250 V for 40–60 min. The separated DNA fragments were then visualized by staining with ethidium bromide (EtBr).

Data and Statistical Analyses

The number of genotypes and alleles, GD, polymorphic information content (PIC), and major allele frequency (MAF) for the waxy maize inbred lines were calculated using PowerMarker software (Liu and Muse 2005). Genetic similarities (GS) between each pair of lines were calculated using the Dice similarity index (Dice 1945). The similarity matrix was then used to construct a dendrogram based on the unweighted pair group method with arithmetic mean (UPGMA), using SAHN clustering in NTSYS-pc (Rohlf 1998). The population structure of the waxy maize inbred lines was assessed using the model-based program STRUCTURE (Pritchard and Wen 2003). The software was run five times for each number of subgroups (K value) from 1 to 10, with a burn-in of 100,000 and a run length of 100,000 under an admixture model. The ΔK value of Evanno et al. (2005), which is based on the rate of change in the log probability of the data between successive K values, was calculated using STRUCTURE HARVESTER (http://taylor0.biology.ucla.edu/structHarvester/). Subgroups were assigned using the run with the maximum likelihood among the five runs at the estimated K; lines with membership probabilities of ≥ 0.80 were assigned to subgroups, and lines with membership probabilities of less than 0.80 were assigned to an admixed group (Stich et al. 2005). Association analysis was performed using TASSEL 3.0 (Bradbury et al. 2007), which was used to confirm marker-trait associations with a Q + K mixed linear model (MLM). For the Q + K MLM method, 100KW was treated both as a quantitative trait and as a qualitative trait; for the latter, all maize inbred lines were divided into low and high groups based on the average value. The Q + K MLM method was performed by combining the Q matrix derived from STRUCTURE and the K matrix derived from TASSEL at P < 0.05 (28.54). Furthermore, basic statistical analysis was performed using Microsoft Office Excel 2016. The difference between the low and high groups was tested using Student’s t-test at P < 0.01 in IBM SPSS Statistics version 21 (IBM Corp., Armonk, NY, USA).
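The ΔK statistic of Evanno et al. (2005) can be illustrated with a short sketch. It assumes the usual form ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], where L(K) is the mean LnP(D) across replicate runs and s is the standard deviation across those runs. The function name and the LnP(D) values are made up for illustration, and the internals of STRUCTURE HARVESTER may differ in detail.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K from replicate LnP(D) values.

    lnp_by_k maps K -> list of LnP(D) values from replicate runs.
    Delta K is defined only for K values that have both neighbours.
    The standard deviation at K must be non-zero.
    """
    avg = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in avg and k + 1 in avg:
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Made-up LnP(D) values for illustration only (five runs per K).
toy_runs = {
    1: [-5000, -5002, -4998, -5001, -4999],
    2: [-4500, -4501, -4499, -4502, -4498],
    3: [-4400, -4410, -4390, -4405, -4395],
    4: [-4350, -4370, -4330, -4360, -4340],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(best_k, {k: round(v, 2) for k, v in dk.items()})
```

With these toy values the largest ΔK falls at K = 2. ΔK cannot be computed for the smallest and largest K tested, because each lacks one neighbour.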

Results

Phenotypic Analysis for 100KW in Maize Inbred Lines

Phenotypic variations for 100KW in waxy maize inbred lines are shown in Fig. 1 and Table 2. The average value of 100KW across all of the waxy maize inbred lines was 19.8 ± 4.7 g, with a range of 12.5 to 29.4 g. Based on the average value (19.8 g) for all the inbred lines in this study, the inbred lines were divided into low and high groups. The low group consisted of 21 inbred lines with an average value of 100KW of 16.5 ± 2.0 g, ranging from 12.5 to 19.4 g, while the high group contained 20 inbred lines with an average value of 23.2 ± 2.4 g, ranging from 20.4 to 29.4 g (Fig. 1, Table 2). To confirm differences in 100KW variation between the low and high groups, a t-test was performed using the phenotypic data (Table 2), and this showed statistically significant differences between the two groups (t =  − 9.64, P < 0.001).
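As a rough check of the group comparison, the t statistic can be recomputed from the rounded summary values above. This sketch assumes that the ± values are standard deviations and that a pooled-variance (equal-variance) Student's t-test was used; the function name is hypothetical. Because the inputs are rounded, the result is expected to be close to, but not identical with, the reported t = −9.64.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic (equal variances) from summary stats."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Rounded group summaries taken from the text (low group first).
t_stat, df = pooled_t(16.5, 2.0, 21, 23.2, 2.4, 20)
print(round(t_stat, 2), df)
```

The sketch gives a t value of about −9.7 with 39 degrees of freedom, which is consistent with the reported statistic given the rounding of the means and standard deviations.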

Table 2 Results of 100KW evaluation for the waxy maize inbred lines and the low and high groups

Genetic Diversity Using SSR Markers Among Maize Inbred Lines

In this study, 150 SSR loci were used to assess GD indices, including the number of alleles and genotypes, GD, polymorphic information content (PIC), and major allele frequency (MAF), in the waxy maize inbred lines (Fig. 2, Table 3, Supplementary Table 1). This study found 591 alleles and 648 genotypes in the waxy maize inbred lines using the 150 SSR loci. The number of alleles per locus varied between 2 and 10, with an average of 3.94 alleles per locus, while the number of genotypes per locus varied from 2 to 11, with an average of 4.32 genotypes per locus. In addition, the average GD was 0.581, with a range of 0.049–0.844, and the average PIC was 0.522, with a range of 0.048–0.827. The average MAF was 0.528, with a range of 0.244–0.975 (Table 3, Supplementary Table 1). When the GD indices were compared among the ten chromosomes, chromosomes 2, 5, 6, and 7 showed relatively high numbers of both alleles and genotypes (Table 3). In particular, chromosome 4 exhibited the lowest average values for alleles, genotypes, GD, and PIC, while chromosome 7 showed the highest average number of alleles, and chromosome 2 showed the highest average number of genotypes, GD, and PIC values (Table 3).
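The per-locus indices can be illustrated with the textbook formulas GD = 1 − Σp_i² and PIC = 1 − Σp_i² − ΣΣ 2p_i²p_j² (i < j), where p_i is the frequency of allele i. This is a sketch under assumptions: PowerMarker's estimators may apply corrections not shown here, and the function name and allele calls are hypothetical.

```python
from collections import Counter


def diversity_indices(allele_calls):
    """Return (MAF, GD, PIC) for one locus from a list of allele calls."""
    counts = Counter(allele_calls)
    n = sum(counts.values())
    freqs = [c / n for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)
    maf = max(freqs)            # major allele frequency
    gd = 1.0 - sum_sq           # gene diversity (uncorrected)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = gd - cross            # polymorphic information content
    return maf, gd, pic


# Hypothetical biallelic locus: 39 calls of one allele and 1 of another.
maf, gd, pic = diversity_indices(["a"] * 39 + ["b"])
print(f"{maf:.3f} {gd:.5f} {pic:.5f}")
```

For this hypothetical locus with a MAF of 0.975, GD is roughly 0.049 and PIC roughly 0.048, which is in line with the extremes of the ranges reported above if the least diverse locus is biallelic.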

Fig. 2 Frequency of genotype and allele number (A), gene diversity index per locus (B) in the waxy maize inbred lines

Table 3 Genetic diversity index of each chromosome for 150 SSR markers in the waxy maize inbred lines

To clearly understand genetic variation in the waxy maize inbred lines of the low and high groups for 100KW trait, this study verified the allele and genotype numbers, GD, PIC, and MAF in the low and high groups. The total number of alleles was 549 and 540 with an average of 3.66 and 3.60 in the low and high groups, respectively. The total number of genotypes was 575 and 573 with an average of 3.83 and 3.82 in the low and high groups, respectively. Furthermore, the averages of the GD, PIC, and MAF values were 0.570, 0.510, and 0.537, respectively, in the low group; and these values were 0.561, 0.501, and 0.546, respectively, for the high group (Table 4).

Table 4 Comparison of total number of alleles and genotypes and genetic diversity index between low and high group for 100KW

Genetic Similarity and Kinship Between Parental Lines for Korean Maize Varieties

Pairwise genetic similarity coefficients among the waxy maize inbred lines ranged from 0.229 to 0.893 with an average of 0.385 (Supplementary Table 2), where values close to zero indicate lack of similarity, while those close to one indicate complete similarity. The minimum similarity value was observed in combinations of GP5 and 17CS5047 (0.229), and the maximum value was observed in combinations of HCW5 and 17CS8006 (0.893). Furthermore, two combinations (HCW5 and 17CS8006, HW17 and HW18) had values above 0.800. Out of all combinations, 746 combinations had values under 0.500, while the remaining 74 combinations ranged from 0.500 to 0.893 (Supplementary Table 2).
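The Dice coefficient for a pair of lines is 2a / (2a + b + c), where a is the number of bands (alleles) shared by both lines and b and c are the numbers present in only one of them. The sketch below uses hypothetical 0/1 band-presence vectors and a hypothetical function name; it also checks that 41 lines give 820 pairwise combinations, which matches the 746 + 74 combinations counted above.

```python
from math import comb


def dice_similarity(bands_a, bands_b):
    """Dice coefficient 2a / (2a + b + c) for two 0/1 band-presence vectors."""
    if len(bands_a) != len(bands_b):
        raise ValueError("vectors must have equal length")
    pairs = list(zip(bands_a, bands_b))
    shared = sum(1 for x, y in pairs if x == 1 and y == 1)
    only_a = sum(1 for x, y in pairs if x == 1 and y == 0)
    only_b = sum(1 for x, y in pairs if x == 0 and y == 1)
    denom = 2 * shared + only_a + only_b
    return 2 * shared / denom if denom else 0.0


# Hypothetical band-presence profiles for two lines (1 = allele present).
line_x = [1, 0, 1, 1, 0, 1]
line_y = [1, 1, 0, 1, 0, 1]
print(dice_similarity(line_x, line_y))
print(comb(41, 2))
```

In this toy case three bands are shared and one is unique to each line, so the coefficient is 6/8 = 0.75.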

The pairwise relative kinship coefficients among the waxy maize inbred lines ranged from 0.000 to 1.859 with an average of 0.401 (Supplementary Table 3), where values close to zero indicate a lack of relationship, while those close to two indicate complete relationship. Out of all the pairs, only one combination (HF12 and HW8) had a relative kinship value of zero, and seven combinations (HF12 and KL103, GP5 and HCW2, HW3 and HW16, GP5 and 17CS5047, HW16 and 17CS8006, HF12 and 15RS8056, HW12 and 12BS5076-8) had relative kinship values close to zero (0.020–0.050). Out of all the combinations, 775 combinations had values between 0.050 and 1.000, while the remaining combinations ranged from 1.000 to 1.859 (Supplementary Table 3).

Population Structure Analysis in Korean Waxy Maize Inbred Lines

To confirm the genetic structure and relationships among the waxy maize inbred lines in South Korea, this study used a distance-based dendrogram from an unweighted pair group method with arithmetic mean (UPGMA) analysis and the model-based STRUCTURE program to subdivide the inbred lines into appropriate subgroups. A distance-based dendrogram was constructed using the 150 SSR loci (Figs. 3 and 4, Table 1). All waxy maize inbred lines were classified into five subgroups at a genetic similarity of 35.1%. Group I consisted of 16 inbred lines, with nine low inbred lines (11BS8016-7, 12S8052, 15RS8002, HW8, KL103, 12BS5076-8, 15RS8039, 17YS8003, and HW16) and seven high inbred lines (HW1, HW7, HW4, 17YS6032, HW17, HW18, and HW11). Group II consisted of 20 inbred lines, composed of 10 low inbred lines (16CLP40, 17CS8006, HW19, HCW2, 15S8021-3, 16S8068-9, 15RS8056, 16CLP23, 17CS8067, and KW7) and 10 high inbred lines (14S8025, HCW4, HW3, HCW5, 17CS5047, HCW3, HW9, HW10, HW12, and HCW1). Groups III and V each had two inbred lines: two high lines (HW15 and HF12) in Group III and two low lines (GP3 and GP5) in Group V. Group IV had only one high inbred line (HF22) (Fig. 3, Table 1).

Fig. 3 UPGMA dendrogram based on waxy maize inbred lines using 150 SSR markers. Filled circle: high 100KW lines; unfilled circle: low 100KW lines

Fig. 4 The magnitude of ΔK as a function of K (A), and population structure pattern for K = 2 and 4 (B) of waxy maize inbred lines based on 150 SSR markers. Filled circle: high 100KW lines; unfilled circle: low 100KW lines

Based on the STRUCTURE results, this study applied the ad hoc measure ΔK because it was difficult to separate subgroups from the LnP(D) of the data. Although the highest ΔK value was obtained for K = 2 using the 150 SSR loci in all maize inbred lines, the inbred lines were not clearly distinguished based on 100KW. By setting a membership probability threshold of 0.8, all inbred lines were classified into three groups: Group I, Group II, and an admixed group (Fig. 4, Table 1). Group I consisted of six maize inbred lines, with three low inbred lines (16CLP40, 17CS8006, and HW19) and three high inbred lines (HCW4, HCW5, and HW3), whereas Group II contained 27 maize inbred lines, composed of 13 low inbred lines (11BS8016-7, 12BS5076-8, 12S8052, 15RS8002, 15RS8039, 16CLP23, 17CS8067, 17YS8003, GP3, GP5, HW16, KL103, and HW8) and 14 high inbred lines (17YS6032, HCW1, HCW3, HF22, HW1, HW10, HW11, HW12, HW15, HW17, HW18, HW4, HW7, and HW9). The admixed group contained eight maize inbred lines, with five low inbred lines (15RS8056, 15S8021-3, 16S8068-9, HCW2, and KW7) and three high inbred lines (14S8025, 17CS5047, and HF12) (Fig. 4, Table 1).
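The assignment rule used here, a membership probability of at least 0.80 for a subgroup and otherwise the admixed group, can be sketched as follows. The function name, line names, and Q values are hypothetical placeholders and are not taken from the STRUCTURE output of this study.

```python
def assign_group(memberships, threshold=0.80):
    """Return a label such as 'Group 1', or 'admixed' if no Q value reaches threshold."""
    best = max(range(len(memberships)), key=lambda i: memberships[i])
    if memberships[best] >= threshold:
        return f"Group {best + 1}"
    return "admixed"


# Hypothetical membership probabilities (Q values) at K = 2.
toy_q = {
    "line_A": [0.93, 0.07],
    "line_B": [0.12, 0.88],
    "line_C": [0.55, 0.45],
}
groups = {name: assign_group(q) for name, q in toy_q.items()}
print(groups)
```

The same function works unchanged for K = 4, since it only looks for the largest membership probability in each row; raising K tends to lower the largest Q value for some lines, which is one reason the admixed group can grow.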

This study confirmed the population structure at K = 4 in order to obtain a more detailed classification (Fig. 4, Table 1). All maize inbred lines were divided into four main groups and an admixed group. Group I included six maize inbred lines, consisting of three low inbred lines (17CS8006, HW19, and 16CLP40) and three high inbred lines (HW3, HCW4, and HCW5). Group II comprised only two maize inbred lines, HW18 and HW17. Group III consisted of 16 maize inbred lines, with nine low inbred lines (11BS8016-7, 12S8052, 15RS8002, 17YS8003, GP3, GP5, HW16, HW8, and KL103) and seven high inbred lines (17YS6032, HF12, HF22, HW1, HW15, HW4, and HW7). Group IV was composed of three maize inbred lines, consisting of one low inbred line (17CS8067) and two high inbred lines (HCW3 and HW9). The admixed group comprised 14 maize inbred lines, with eight low inbred lines (12BS5076-8, 15RS8039, 15RS8056, 15S8021-3, 16CLP23, 16S8068-9, HCW2, and KW7) and six high inbred lines (14S8025, 17CS5047, HCW1, HW10, HW11, and HW12) (Fig. 4, Table 1).

Association Analysis Using Q + K MLM

Association analysis between the 150 SSR markers and 100KW in the waxy maize inbred lines was performed using a population structure (Q) + kinship (K) mixed linear model (MLM). This study detected two marker-trait associations (MTAs) with 100KW as a qualitative character (umc2159 and umc1612) and three MTAs with 100KW as a quantitative character (umc2159, umc1716, and bnlg1246), involving four SSR markers associated with 100KW at P < 0.05 (Table 5). The MTAs were located on chromosome 4 (umc1612 and umc1716), chromosome 5 (umc2159), and chromosome 6 (bnlg1246). The phenotypic variation for each marker ranged from 11.7% (bnlg1246) to 36.6% (umc2159). Among these MTAs, umc2159 showed the highest phenotypic variation and was detected for both the qualitative and quantitative characters.

Table 5 Information on SSR markers using the Q + K MLM for 100KW in waxy maize inbred lines

Discussion

Investigating the characteristics of parental inbred lines is most important for the commercial success of any maize hybrid variety. Evaluating the productivity of inbred lines as parents of varieties in a production field is a means of estimating the commercial viability of maize hybrid production (Pinnisch et al. 2012). This study focused on investigating 100KW in waxy maize inbred lines. Among these inbred lines, the lowest 100KW values were detected in GP5 (12.5 g) and GP3 (12.6 g), which are parental lines for the commercial popcorn variety Oryun2ho in Korea. In the case of popcorn, kernel size is negatively correlated with popping expansion volume, indicating that a small kernel has better popping expansion (Kaur et al. 2021). The average 100KW in the waxy maize inbred lines was 19.8 g, which is similar to the 19.6 g of 40 waxy maize inbred lines and lower than the 25.3 g of 40 flint maize inbred lines in a study by Sa et al. (2015). Moreover, this average value was lower than the 23.1 g of 27 dent maize inbred lines from a US study (Pinnisch et al. 2012). To confirm statistical differences among the four maize populations mentioned above, this study performed an analysis of variance (ANOVA) (data not shown). Although the maize population in this study included different kernel types of maize, including popcorn (GP3 and GP5) and flint maize (HF12 and HF22), it also shared some inbred lines (HW3, HW4, HW7, HW8, and KW7) with the 40 waxy maize inbred lines of Sa et al. (2015). The results showed no statistically significant difference between the two waxy maize populations (P > 0.05). However, we found statistically significant differences between the waxy population in this study and both the flint population (Sa et al. 2015) and the dent population (Pinnisch et al. 2012) in the other studies (P < 0.05).
Actual kernel weight may be influenced by many factors, such as market demand and environmental conditions, and waxy maize genotypes had a yield 18–27% less than that of dent and flint maize (Solaimalai et al. 2020). However, the yields of newer waxy maize hybrids are better than those of older waxy hybrids through improvement of the waxy maize inbred lines in comparison with normal maize hybrids (Darrah et al. 2019).

Assessing genetic variation is essential for understanding GD and relationships and also for the development of new cultivars in crops. This study detected an average number of 3.94 alleles per locus and an average GD and PIC of 0.581 and 0.522, respectively, in waxy maize inbred lines (Table 3, Fig. 2). In a previous study of 40 waxy and 40 flint inbred lines, which is a similar number to the total of inbred lines in this study, an average of 6.34 alleles per locus and GD and PIC values of 0.66 and 0.62, respectively, were detected in the 40 waxy inbred lines (Sa et al. 2015). Moreover, an average of 6.54 alleles per locus with GD and PIC values of 0.69 and 0.65, respectively, was confirmed in the 40 flint inbred lines (Sa et al. 2015). In another study, an average of 3.62 alleles per locus was detected in 129 maize accessions from Agriculture and Agri-Food Canada (AAFC), using 105 SSR markers, with an average PIC value of 0.68 (Reid et al. 2011). In the International Maize and Wheat Improvement Center (CIMMYT), 137 maize accessions were analyzed using 79 SSR loci, and an average of 7.2 alleles per locus was detected, with an average PIC value of 0.64 (Xia et al. 2005). Furthermore, in a study of 165 Chinese waxy maize landraces and inbred lines, an average of 5.2 alleles per locus was confirmed across 20 SSR markers, with an average PIC value of 0.7 (Zheng et al. 2013). In a comparison of the GD index of the population in this study with that of previous studies, although the allele number was found to be similar to that of another study (Reid et al. 2011), the average allele number, GD, and PIC value in this study were lower than those in previous studies (Sa et al. 2015; Xia et al. 2005; Zheng et al. 2013; Luo et al. 2020). This result indicates the presence of narrow GD in the Korean waxy maize inbred lines used in this study. 
Among the 150 SSR markers in this study, 13 SSR markers exhibited more than 6 alleles per locus, and 79 and 56 SSR markers showed GD and PIC values higher than 0.6, respectively (Supplementary Table 1). These SSR markers, which have a higher allele number and GD index, have the potential to provide information for the identification and characterization of diverse maize lines in future molecular breeding programs.

To develop new hybrid varieties with high performance, it is best to select parental lines with better phenotypic performance and a wide genetic base that can increase genetic variation (Biswas et al. 2008; Ertiro et al. 2017). Several methods are available for selecting the optimal parental combination for new hybrids, including evaluation of pedigree relationships, assessment of phenotypic performance for desired traits, and analysis of genetic distances based on both phenotypic traits and molecular markers (Bertan et al. 2007). This study identified genetic distances using measures of genetic similarity and kinship among waxy maize inbred lines (Supplementary Tables 2 and 3). Among all possible combinations in this study, the pairs with close genetic distances were inbred lines selected from the same population, while the pairs with more distant relationships combined inbred lines selected from Korean landraces and unknown foreign populations. Moreover, this study confirmed the genetic similarity and kinship of 11 parental combinations for cultivars registered in the Korea Seed & Variety Service (KSVS) (http://www.seed.go.kr) (Supplementary Tables 2 and 3). The average values of similarity and kinship were 0.406 and 0.478, respectively, in the 11 parental combinations. Among these cultivars, the highest similarity and kinship values were confirmed in cultivar Oryun2ho (popcorn variety; GP5 and GP3), while the lowest value for similarity was detected in cultivar Arichal (waxy variety; HW12 and HW11) and the lowest value for kinship in Mibaekchal (waxy variety; HW4 and HW3) (Supplementary Tables 2 and 3). Previous studies have generally reported inconsistent correlations between the genetic distance of parental lines and the heterosis of progeny (Flint-Garcia et al. 2009). Similarly, this study did not detect any correlations between heterosis for 100KW and genetic distance factors, such as genetic similarity and kinship (data not shown).

Determining the population structure in a set of cultivars is very important for marker-assisted selection (MAS) and association analysis (Wang et al. 2008; Flint-Garcia et al. 2005). This study used two methods to evaluate population structure: a distance-based phylogenetic method and a model-based clustering method (Figs. 3 and 4, Table 1). The distance-based method divided the waxy maize inbred lines into five groups (Fig. 3, Table 1). At a genetic similarity of 37.3%, Group I was divided into three subgroups: Group I-1, consisting of nine inbred lines; Group I-2, consisting of five inbred lines; and Group I-3, consisting of two inbred lines. Similarly, at a genetic similarity of 39.5%, Group II was divided into two subgroups: Group II-1, consisting of 11 inbred lines, and Group II-2, consisting of nine inbred lines. Additionally, since the model-based method placed about 66% of the lines (27 inbred lines) in Group II at K = 2, these maize inbred lines were further subdivided at K = 4, which had the next highest ΔK value (Fig. 4). Although the number of maize inbred lines in the admixed group increased at K = 4 compared with K = 2, the maize inbred lines assigned to the main groups were subdivided into four subgroups instead of two groups (Fig. 4). In the results, the maize inbred lines of G1 at K = 2 were consistent with G1 at K = 4, and the inbred lines of G2 at K = 2 were subdivided into G2, G3, G4, and the admixed group at K = 4. Although G3 and the admixed group at K = 4 with the model-based method still showed inconsistencies with the distance-based method, the population structure patterns identified at K = 4 by the model-based method were in good agreement with the subgroups of the distance-based method (Table 1). The relationship between lines within the same group at K = 4 has been confirmed based on pedigree data.
HW3 was placed in G1 together with five other inbred lines, and these five inbred lines were near-isogenic lines generated from HW3. Moreover, the HW17 and HW18 inbred lines, which were selected from the same population derived from a foreign collection, were included in G2. Although the pedigrees of the inbred lines in G3 were complex or unknown, this group contained inbred lines generated from Korean landraces. The inbred line HW9 was placed in G4 together with two other inbred lines (HCW3 and 17CS8067), and these two inbred lines were generated from HW9.

In recent years, association analysis has become a more powerful tool than traditional quantitative trait locus (QTL) mapping for detecting molecular markers associated with specific morphological traits (Liu and Qin 2021). However, association analysis suffers from the problem of false positives (Type-I errors), which can result in invalid associations due to confounding effects from population structure and kinship (Zhang et al. 2010). In this study, a Q + K MLM, using both the Q and K matrices, identified five MTAs associated with 100KW involving four significant SSR markers: umc1716, umc1612, umc2159, and bnlg1246 (Table 5). Among these SSR markers associated with 100KW, umc1612 at GRMZM2G012391 (cyp37), which encodes cytochrome P450 37, was located in the middle of the long arm of chromosome 4 (bin 4.08). The cytochrome P450 enzyme family is involved in various physiological processes, including the regulation of seed size and development in Gramineae crops (Li and Wei 2020). Although umc2159 was detected for both the qualitative and quantitative characters of 100KW, it has not been fully characterized. However, as this marker was identified as a flanking marker for a QTL for kernel weight conditioned on kernel length in maize (Zhang et al. 2014), further analysis is needed to better understand this locus. The SSR marker umc1716 at GRMZM2G119393, which encodes a DNA-directed RNA polymerase II polypeptide, was located at the distal tip of the long arm of chromosome 4 (bin 4.11) (Haag et al. 2014). Moreover, bnlg1246 has been mapped to four different chromosomes: 3, 5, 6, and 8 (http://www.maizeGDB.org). Furthermore, bnlg1246 has been identified as a flanking marker for QTLs related to aboveground biomass dry weight traits on chromosome 5 in a previous study (Chen et al. 2008).