Introduction

Rice (Oryza sativa L.) is of considerable agricultural importance: it is a primary food source for approximately 50% of the global population and ranks second in crop consumption after wheat (Babu et al. 2014). It is a dietary staple for approximately 3.5 billion people and contributes about 20% of their caloric intake. Global rice consumption is projected to reach 852 million tonnes by 2035, a rise attributed to the population growth expected over the next quarter of a century (Khush 2013). In Egypt, rice is a primary source of sustenance for a population exceeding 105.82 million. Egypt ranks 18th among the world's rice-producing nations, and rice cultivation is concentrated in the lower region of the Nile River.

Nevertheless, rice cultivation in Egypt has been constrained by inadequate water resources. Moreover, the diversity of rice species has declined substantially over the past few centuries (Choudhary et al. 2013). The development of novel rice cultivars is therefore of paramount importance for meeting the demands of the global population and mitigating the challenges posed by biotic and abiotic stresses. Crop improvement strategies can result in higher yields and enhanced resilience against abiotic stressors (Rai et al. 2023). The success of breeding projects may depend on the genetic diversity within rice genotypes.

Consequently, the pursuit of greater cultivar variability requires examining and quantifying genetic diversity across a wide range of genetic lineages. The existing diversity within plant populations can be harnessed to develop cultivars with enhanced tolerance to drought and salinity, the abiotic stressors most prevalent in Egypt. Genetic diversity can be assessed by two distinct approaches: (1) agronomic characteristics and (2) DNA variation detected through molecular techniques.

The limitations of assessing genetic diversity from physiological traits include the extensive time required, the need for multiple sites and years, the influence of environmental factors, and insufficient resolving power to distinguish highly similar genotypes. In contrast, molecular DNA markers have yielded significant progress in evaluating genetic diversity in various crop species (Mourad et al. 2019; Mohanty et al. 2021; Naaz et al. 2022; Safhi et al. 2022; Ibrahim et al. 2023a, b). Microsatellites, also known as simple sequence repeats (SSRs), are highly polymorphic DNA markers widely employed in species diversity research (Abd El-Moneim 2021; Hassan and Hama-Ali 2022; Mesfer ALshamrani et al. 2022; Ibrahim et al. 2023a, b). Microsatellite markers are used to investigate genetic variability in rice because of their co-dominant nature, abundance in the genome, informativeness, repeatability, reliability, low cost, and high rate of polymorphism (Borba et al. 2010; Hassan and Hama-Ali 2022).

In previous research, cluster analysis was used to determine the relationships between genotypes and so characterize the diversity present. Advances in population structure analysis have led to software that gives a more thorough understanding of the genetic makeup of a population. A key program of this kind is STRUCTURE, which identifies clusters based on Hardy–Weinberg disequilibrium and linkage disequilibrium (LD) resulting from population admixture (Pritchard et al. 2000; Safhi et al. 2022). Taking LD among genotypes into account has improved the quality of clustering results (Falush et al. 2003; EL-Mansy et al. 2021; Essa et al. 2023). This approach supports the efficient use of genetic diversity in breeding programs that aim to develop novel cultivars carrying genes associated with higher productivity and greater resilience to biotic and abiotic stressors. This study aims to investigate the population structure and gene flow of 27 Egyptian rice genotypes using microsatellite markers, assess the level of genetic variability among these genotypes, and compare genetic characteristics between subpopulations.

Materials and methods

Plant materials

An extensive and diverse collection of 27 rice genotypes was used for the analysis. These genotypes were obtained from the Genetic Stocks Oryza Collection, the USA, and the Agricultural Research Center (ARC), Giza, Egypt. A list of the rice genotypes, accession number, source of seeds, and subspecies group is presented in Table S1.

Sample preparation and DNA extraction

Total genomic DNA was extracted from young leaves of 8-week-old seedlings grown in a greenhouse at the Faculty of Science, Arish University, in 2022. For each genotype, five individual seedlings were used as a genetic pool. DNA extraction was performed according to Salem et al. (2004).

PCR amplification of microsatellite markers

Twenty-three microsatellite markers representing chromosomes 1 to 12 were used to assay genetic variation in all genotypes, based on the Rice Genes Database (http://gramene.org). Detailed information on the microsatellite markers is given in Table S2. Each 25 μl reaction mixture comprised 50–100 ng template DNA, 0.2 mM dNTPs (Qiagen), 1.5 mM MgCl2 (Qiagen), 250 nM forward primer, 250 nM reverse primer (Applied Biosystems, ThermoFisher), 2.5 μl 10X PCR buffer, and 1 U Taq DNA polymerase (Qiagen). The PCR program was 5 min at 94 °C, followed by 35 cycles of 1 min at 94 °C, 1 min at the primer-specific annealing temperature, and 2 min at 72 °C, with a final extension of 5 min at 72 °C, as described by Wu and Tanksley (1993). The amplification products were resolved on 10% denaturing polyacrylamide gels (PAGE). The gels were scored as a binary data matrix, with the presence (1) or absence (0) of each allele recorded for every genotype.
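
The binary scoring step can be illustrated with a minimal Python sketch. The genotype labels and band sizes below are hypothetical and are not taken from Table S2, and the helper name `score_binary` is an assumption introduced only for illustration.

```python
from typing import Dict, List, Set, Tuple


def score_binary(calls: Dict[str, Set[int]]) -> Tuple[List[int], Dict[str, List[int]]]:
    """Convert band sizes (bp) observed at one locus into presence/absence scores.

    calls maps a genotype label to the set of band sizes seen on the gel.
    Returns the sorted list of alleles and, per genotype, a 1/0 row in that order.
    """
    alleles = sorted(set().union(*calls.values()))
    matrix = {
        genotype: [1 if allele in sizes else 0 for allele in alleles]
        for genotype, sizes in calls.items()
    }
    return alleles, matrix


# Hypothetical band sizes for a single marker in three genotypes.
calls = {"G1": {120}, "G2": {128}, "G3": {120, 128}}
alleles, matrix = score_binary(calls)

for genotype, row in matrix.items():
    print(genotype, row)
```

Each locus contributes one column per allele, so stacking the rows for all 23 markers would give a genotype-by-allele matrix of the kind used for the distance-based analyses.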

Genetic diversity estimation

A dendrogram of the rice genotypes was generated using the unweighted pair group method with arithmetic mean (UPGMA) in Power Marker, and DARWIN version 5 was used to calculate the variation in the experimental sample from the allele scoring data (Perrier and Jacquemoud-Collet 2006). A second dendrogram was generated by the unweighted neighbor-joining method using the Dice similarity index in DARWIN version 5 (Perrier and Jacquemoud-Collet 2006). The observed heterozygosity (Ho), expected heterozygosity (He), number of alleles per locus, number of polymorphic loci, and the Shannon (I) and Nei information indices (between and within populations) were determined using POPGENE version 1.32 (Falush et al. 2003; Yeh et al. 1997). The polymorphic information content (PIC), gene diversity (GD), and allelic frequency for all markers were analyzed with Power Marker version 3.25 (Lui 2005).
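
As a rough guide to what these per-locus statistics measure, the sketch below computes them from a vector of allele frequencies using the standard textbook definitions: Nei's gene diversity, the PIC of Botstein et al. (1980), the number of effective alleles, and Shannon's index. It is not the POPGENE or Power Marker implementation. The sample-size correction in `unbiased_he` is an assumption about how He may have been derived; the source does not state the estimator.

```python
import math
from itertools import combinations
from typing import Sequence


def gene_diversity(p: Sequence[float]) -> float:
    """Nei's gene diversity: 1 - sum(p_i^2)."""
    return 1.0 - sum(x * x for x in p)


def pic(p: Sequence[float]) -> float:
    """PIC (Botstein et al. 1980): 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2."""
    cross = sum(2.0 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return gene_diversity(p) - cross


def effective_alleles(p: Sequence[float]) -> float:
    """Number of effective alleles: 1 / sum(p_i^2)."""
    return 1.0 / sum(x * x for x in p)


def shannon_index(p: Sequence[float]) -> float:
    """Shannon's information index: -sum(p_i ln p_i)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def unbiased_he(h: float, n_diploid: int) -> float:
    """Assumed small-sample correction: h * 2n / (2n - 1) for n diploid individuals."""
    two_n = 2 * n_diploid
    return h * two_n / (two_n - 1)


# Hypothetical locus with two equally frequent alleles.
freqs = [0.5, 0.5]
print(gene_diversity(freqs), pic(freqs), effective_alleles(freqs), shannon_index(freqs))
```

Under that assumed correction with 27 diploid genotypes, a gene diversity of 0.466 corresponds to an He of about 0.475, which agrees with the values reported for RM151, although the agreement does not confirm the estimator used.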

Population structure analysis

The basic STRUCTURE model (Stanford University, Stanford, California) described by Pritchard et al. (2000) was used to estimate and analyze population structure under the admixture model with correlated allele frequencies (Falush et al. 2003; Hubisz et al. 2009). K was set from 1 to 10, with ten replicate runs for each value. The burn-in period was set to 20,000 iterations, followed by 200,000 Markov chain Monte Carlo (MCMC) iterations to estimate K. The Structure Harvester program was then used to determine the final number of subpopulations from the second-order rate of change of the likelihood, ΔK (Earl and Von Holdt 2012).
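
The ΔK criterion reported by Structure Harvester can be sketched in a few lines of Python. The sketch assumes the usual definition, ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), where L(K) is the mean ln P(D) over replicate runs. The log-likelihood values are hypothetical, and only three replicates per K are shown rather than the ten used in the study.

```python
from statistics import mean, stdev
from typing import Dict, List


def delta_k(lnp: Dict[int, List[float]]) -> Dict[int, float]:
    """Compute delta K for every K that has both neighbours K-1 and K+1.

    lnp maps each K to the ln P(D) values of its replicate STRUCTURE runs.
    """
    means = {k: mean(v) for k, v in lnp.items()}
    sds = {k: stdev(v) for k, v in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            second_order = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_order / sds[k]
    return result


# Hypothetical replicate log-likelihoods for K = 1..4.
runs = {
    1: [-1500.0, -1502.0, -1498.0],
    2: [-1200.0, -1201.0, -1199.0],
    3: [-1150.0, -1154.0, -1146.0],
    4: [-1130.0, -1135.0, -1125.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

ΔK is undefined for the smallest and largest K tested, so a range of 1 to 10 yields values for K = 2 to 9 only.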

Analysis of molecular variance (AMOVA) and Venn diagram

ARLEQUIN version 3.114 was used to perform the analysis of molecular variance (AMOVA). Venny 2.1 was used to generate the Venn diagram and to determine the germplasm shared among populations and hierarchical groups (Oliveros 2007). Principal component analysis (PCA), performed in R, was used to illustrate the relationships among the studied rice genotypes.

Results

Genetic variability of microsatellite loci among rice genotypes

A total of 23 rice-specific microsatellite markers were used to evaluate genetic variability. All markers were polymorphic, and the dataset contained no duplication. The analysis revealed 106 distinct alleles among the 27 rice genotypes. The number of alleles per locus ranged from two (RM144, RM151, RM215, and RM277) to eight (RM11), with an average of 4.61. The amplified fragments ranged in length from 78 to 264 base pairs (Table 1).

Table 1 Genetic features of 23 microsatellite markers used for assessing genetic diversity in 27 rice genotypes

The efficacy of the microsatellite markers was evaluated using the number of observed alleles (N), number of effective alleles (Ne), polymorphic information content (PIC), Nei's gene diversity (GD), and Shannon's information index (I) across all genotypes (Table 1). The number of effective alleles (Ne) per locus ranged from 1.87 to 4.53, with a mean of 2.93. Gene diversity per locus varied between 0.466 (RM151) and 0.779 (RM19), and He ranged from 0.475 (RM151) to 0.794 (RM19). The Shannon (I) and Nei indices ranged from 0.659 to 1.690 and from 0.466 to 0.779, with mean values of 1.193 and 0.631, respectively. PIC values varied from 0.358 (RM151) to 0.751 (RM19), with an average of 0.580. RM19 was therefore the most informative marker, with the maximum PIC value (0.751) and six distinct alleles across the genotypes (Table 1). A significant, strong Pearson correlation was found between the number of alleles and both PIC (r = 0.785**, P < 0.01) and gene diversity (r = 0.713**, P < 0.01) (Figures S1 and S2).

Population structure analysis

The STRUCTURE program was used to assess the genetic structure of the 27 rice genotypes from the microsatellite genotyping data, and it gave the most dependable differentiation of the genotypes. The analysis was based on the 106 alleles distributed among the 27 genotypes and provides a basis for establishing the population structure of the Egyptian rice genotypes. The optimal number of subpopulations (K) was determined with Structure Harvester; the results (Fig. 1) indicated K = 2 as the most suitable value. The subpopulation assignments at K = 2 were therefore selected (Fig. 1).

Fig. 1

The distribution of delta K (ΔK) for different numbers of subpopulations, used to determine the true number of subpopulations (K). The sharp peak of ΔK at K = 2 suggests two subpopulations

The red cluster (P1) comprises twelve rice genotypes, while the green cluster (P2) comprises fifteen. Regarding the microsatellite diversity parameters, observed heterozygosity (Ho) and the number of effective alleles (Ne) did not differ appreciably among P1, P2, and the admixed group P1P2 (Table 2). Expected heterozygosity (He) was relatively higher in P2 than in P1 and P1P2. When K was set to 3, the 27 genotypes were classified into three subpopulations (P1, P2, and P3) comprising 11, 5, and 11 genotypes, respectively, and the number of admixed genotypes exceeded that observed at K = 2. Figure 2 shows the subpopulations at both K values. At K = 2, all genotypes assigned to P1 and P2, apart from the admixed genotypes, were deemed pure because their membership probability exceeded 0.90 (Fig. 2).

Table 2 Model-based analysis of populations and their admixtures

Fig. 2

Population structure estimates for 27 Egyptian rice genotypes using microsatellite markers. a Two sub-populations identified with STRUCTURE analysis. b Three sub-populations identified with STRUCTURE analysis. The y-axis represents sub-group membership, whereas the x-axis represents genotype

Additionally, to analyze the diversity within subpopulations, the scoring data were aggregated separately for each group (P1, P2, and the admixed group P1P2). The mean number of distinct alleles was slightly higher in P2 (5.93) than in P1 (5.10) and P1P2 (5.33). He was likewise slightly higher in P2 (0.779) than in P1 and P1P2 (Table 2). The He values obtained suggest a likelihood of outbreeding of approximately 78% during seed propagation (Table 2).

Genetic diversity and phylogenetic analysis

This study examined rice genotypes from Egypt using STRUCTURE analysis, focusing on the two subpopulations P1 and P2 identified with the 23 microsatellite markers. To verify the genetic relationships, an unweighted neighbor-joining tree was also constructed using the DARWIN software (Fig. 3). Consistent with the STRUCTURE analysis, the germplasm accessions were grouped into two distinct clusters: blue (12 genotypes) and red (15 genotypes). The phylogenetic tree therefore supports the STRUCTURE analysis, as it reveals two distinct clusters consistent with the previous findings.

Fig. 3

Unrooted neighbor-joining tree of the 27 rice genotypes based on data from 23 microsatellite markers. Cluster I is shown in blue and cluster II in red. Bootstrap values are shown at the middle of the branches. The numbers at the branch tips indicate the genotype numbers

The analysis of molecular variance (AMOVA) was calculated from the microsatellite data using ARLEQUIN version 3.1, considering variation both within and among subpopulations. Table 3 displays the AMOVA results for the 23 microsatellite markers. The allelic distance matrix was used as input for the calculation of F-statistics. According to Table 3, 34.22% of the molecular variance can be attributed to differences between the two subpopulations, while 65.78% can be attributed to differences within populations. It is anticipated that there will be no individual variation within single populations. A statistically significant fixation index (FST) of 0.342 was observed (P < 0.001), along with a gene flow (Nm) value of 0.481.
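
The relationship between these figures can be checked with a short Python sketch. It assumes that FST is the among-population share of the total variance and that Nm follows Wright's island-model approximation, Nm = (1 − FST) / (4 FST). The source does not state which formula was used, but this one reproduces the reported value.

```python
def fst_from_amova(among_pct: float, within_pct: float) -> float:
    """FST as the among-population fraction of total molecular variance."""
    return among_pct / (among_pct + within_pct)


def gene_flow_nm(fst: float) -> float:
    """Wright's island-model approximation (assumed): Nm = (1 - FST) / (4 * FST)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


# Variance components reported in Table 3 of the source.
fst = fst_from_amova(34.22, 65.78)
nm = gene_flow_nm(0.342)

print(round(fst, 3), round(nm, 3))
```

Because Nm falls as FST rises, an FST of 0.342 necessarily gives an Nm below 1 under this approximation.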

Table 3 Analysis of molecular variance (AMOVA) of 27 rice genotypes based on 23 microsatellite markers

Because the neighbor-joining clusters and the STRUCTURE-based populations showed a comparable pattern, their co-linearity was examined using a Venn diagram of the 27 genotypes. Of all genotypes, 37% were shared between POPI and cluster I, and 55.6% were shared between POPII and cluster II (Fig. 4). Only 7.4% of genotypes were assigned to POPI but placed in cluster II. Overall, the Venn diagram showed close agreement between the cluster grouping and the STRUCTURE-based subpopulations, which strengthens the validity of the findings.
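
The co-linearity percentages are overlaps between two partitions of the same 27 genotypes, expressed as a share of the total. The Python sketch below uses hypothetical genotype numbers, chosen only so that the overlap sizes (10, 15, and 2) match the reported percentages. The actual memberships are those shown in Fig. 4.

```python
from typing import Dict, Set, Tuple

# Hypothetical assignments: genotype numbers are placeholders, not the real IDs.
structure_pops: Dict[str, Set[int]] = {
    "POPI": set(range(1, 13)),    # 12 genotypes
    "POPII": set(range(13, 28)),  # 15 genotypes
}
nj_clusters: Dict[str, Set[int]] = {
    "I": set(range(1, 11)),       # 10 genotypes
    "II": set(range(11, 28)),     # 17 genotypes
}


def colinearity(pops: Dict[str, Set[int]],
                clusters: Dict[str, Set[int]]) -> Dict[Tuple[str, str], float]:
    """Percentage of all genotypes shared by each population/cluster pair."""
    total = len(set().union(*pops.values()))
    return {
        (pop, cluster): 100.0 * len(members & clustered) / total
        for pop, members in pops.items()
        for cluster, clustered in clusters.items()
    }


shares = colinearity(structure_pops, nj_clusters)
for pair, pct in shares.items():
    print(pair, round(pct, 1))
```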

Fig. 4

Estimation of co-linearity between neighbor-joining-based clusters and model-based sub-populations using Venn diagram. The total number of identical germplasms between cluster and sub-population is shown in the figure. The percentage of co-linearity is reported in parentheses

Principal component analysis (PCA)

Principal component analysis (PCA) illustrated the relationships and grouping of the examined genotypes (old and modern groups) based on the molecular data. As shown in Fig. 5, PC1 explains 17.4% of the total variance and PC2 a further 10.2%. Genotypes of the modern group were situated between those of the old group. A total of 27 components were identified, but only the first four (PC1 to PC4) were used to represent the variation. According to Table 4, these four components had the largest eigenvalues: 18.413, 10.759, 7.618, and 7.031, respectively. PC1 accounted for 17.37% of the overall variability, PC2 for 10.15%, PC3 for 7.18%, and PC4 for 6.633%.
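
The percentage attributed to each component is its eigenvalue divided by the total variance. The sketch below recomputes the percentages from the eigenvalues reported in Table 4. The total variance of 106 is an assumption inferred here from the ratio of the reported eigenvalues to their percentages. It is not stated in the source.

```python
from typing import List


def percent_variance(eigenvalues: List[float], total_variance: float) -> List[float]:
    """Share of total variance explained by each component, in percent."""
    return [100.0 * value / total_variance for value in eigenvalues]


# Eigenvalues of PC1..PC4 as reported in the source.
eigenvalues = [18.413, 10.759, 7.618, 7.031]
# Assumed total variance, inferred from the reported percentages.
TOTAL_VARIANCE = 106.0

percents = percent_variance(eigenvalues, TOTAL_VARIANCE)
cumulative = sum(percents)

for index, pct in enumerate(percents, start=1):
    print(f"PC{index}: {pct:.2f}%")
print(f"Cumulative: {cumulative:.2f}%")
```

Under this assumption the four components together explain roughly 41% of the variation, so most of the variance lies in the remaining components.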

Fig. 5

Principal component analysis of the studied genotypes (old and modern groups) based on the molecular attributes

Table 4 The cumulative variation, percentage, and eigenvalue of the twelve resultant components

Discussion

Understanding genetic diversity and the evolutionary relationships among genotypes is essential in breeding and improvement programs for strategic crops such as rice. Despite globally active breeding, genetic enhancement initiatives, and diverse genotypes, rice has a restricted genetic background because the same parents are used frequently (Bharadwaj et al. 2013; Safhi et al. 2022; Ibrahim et al. 2023a, b). It is therefore imperative to assess the genetic diversity of germplasm regularly in order to identify the alleles that contribute to higher productivity and resilience against biotic and abiotic stresses. The assessment of genetic diversity is a crucial means of introducing novel genetic material. Selection and hybridization give plant breeders valuable insights, enabling them to develop novel and versatile cultivars, and their practice broadens the genetic diversity of rice (Glaszmann et al. 2010; Mesfer ALshamrani et al. 2022; Essa et al. 2023).

The present investigation examined rice genotypes to understand their genetic diversity, population structure, and gene flow, with the aim of supporting breeding and crop improvement efforts. Plant genetic resources (PGRs) are crucial in breeding and crop enhancement initiatives and are considered the best representatives of a wide range of genetic material. Nonetheless, prior research and an understanding of genetic diversity are needed to use these PGRs effectively (Bueno et al. 2019; Safhi et al. 2022; Soliman et al. 2023). Over the past few decades, molecular marker methods have advanced notably and have proven effective in assessing the extensive genetic diversity of germplasm resources (Mesfer ALshamrani et al. 2022; Naaz et al. 2022). Based on prior molecular marker data, genetic diversity was assessed using a panel of 23 microsatellite markers. All of the markers were polymorphic and were consequently used to investigate genetic diversity. A total of 106 alleles were observed across the 27 genotypes, averaging 4.61 per locus. Previous research on microsatellites in rice reported comparable outcomes (Ahmed et al. 2019; Naaz et al. 2022; Hassan and Hama-Ali 2022).

Botstein et al. (1980) state that a marker with a polymorphic information content (PIC) value greater than or equal to 0.5 is highly informative, a marker with a PIC value between 0.25 and 0.5 is moderately informative, and a marker with a PIC value below 0.25 is minimally informative. The PIC values of the 23 microsatellite loci ranged from 0.358 to 0.751, with an average of 0.580, suggesting that these markers are highly informative. PIC evaluates the usefulness of microsatellite loci and their characteristics for detecting genotype differences (Maclean et al. 2013; EL-Mansy et al. 2021; Abd El-Moneim et al. 2021; Essa et al. 2023). Consequently, the average PIC value of approximately 0.6 indicates that the rice genotypes used in this investigation exhibited substantial genetic diversity. Of the 23 microsatellite loci examined, six (RM5, RM11, RM19, RM161, RM413, and RM474) showed notably high PIC values, ranging from 0.703 to 0.751. Including these highly polymorphic microsatellite loci can potentially help expand the genetic diversity of existing genotypes. Singh et al. (2013) and Naaz et al. (2022) reported a similar average PIC value. PIC values are influenced by several parameters, such as the breeding method of the species, the level of genetic variability in the selected genotypes, the size of the population under study, the genotyping technique used, and the location of primer sites within the genome (Singh et al. 2013; Naaz et al. 2022).

The PIC values of the current genotypes span a wide range, from 0.358 to 0.751, with an average of 0.580. These findings align with the results reported by Babu et al. (2014) in their investigation of 82 rice genotypes. Previous research estimated PIC values of 0.240 (Anandan et al. 2016), 0.416 (Nachimuthu et al. 2015), 0.560 (Pathaichindachote et al. 2019), 0.630 (Aljumaili et al. 2018), and 0.738 (Wang et al. 2014).

The microsatellite loci varied considerably in the number of alleles, from 2 alleles for four loci (RM144, RM151, RM215, and RM277) to 8 alleles for RM11, with an average of 4.61 alleles per locus. Thomson et al. (2009) reported comparable results. Maclean et al. (2013) documented the count of unique alleles for 11 microsatellite loci (RM5, RM55, RM118, RM133, RM154, RM215, RM271, RM277, RM413, RM433, and RM474) within a set of 82 rice genotypes. Three of these loci (RM133, RM154, and RM271) showed an equivalent count of unique alleles (Babu et al. 2014). The present study found a significant, strong correlation between the number of alleles and both PIC (r = 0.785**) and gene diversity (r = 0.713**). This finding offers an additional explanation for the notable genetic variability observed among the 27 rice genotypes. These microsatellite loci are therefore highly informative and can be used effectively to investigate genetic variation.

This study used 106 alleles from 23 microsatellite loci to assess the population structure of 27 Egyptian rice genotypes. Prior research established microsatellites as highly polymorphic DNA markers in plants (Naaz et al. 2022; Hassan and Hama-Ali 2022). At the highest ΔK value, the 27 rice genotypes were separated into two distinct subpopulations, P1 and P2. These subpopulations exhibited significant admixture variations, contributing to a considerable variance. Singh et al. (2013) reached comparable findings. The AMOVA revealed a statistically significant level of genetic variability within populations, amounting to 65.78%, whereas the variation among subpopulations was lower, at 34.22%. The genetic variation observed among genotypes can potentially be ascribed to gene flow facilitated by the movement of seeds (Dhanapal et al. 2015). Seed exchange among farmers contributes to the diversity of local germplasm. Hence, model-based population structure analysis facilitates the identification and mapping of significant agronomic traits and enables the examination of recombination patterns through genome-wide association analysis.

Consequently, the dispersion of alleles across populations increases irrespective of their geographical proximity (Louette et al. 1997). Choudhary et al. (2013) observed a substantial divergence among five rice populations, amounting to 92.12%, whereas the genetic diversity within each population was comparatively low, at only 7.88%. Among 375 rice genotypes, genetic diversity was distributed as follows: 4% among the 12 populations, 70% among individuals, 25% within individuals, and 1% among regions (Singh et al. 2013). The current investigation shows that the considerable polymorphism observed among genotypes in P1 and P2 has the potential to enhance genetic diversity in breeding programs (Thomson et al. 2009).

In AMOVA, FST quantifies population differentiation by evaluating genetic structure. Wright (1965) contributed significantly to genetics by using FST values to identify genetic differentiation. The three levels of differentiation defined in previous research (Wright 1978) are low (FST = 0.00–0.05), moderate (FST = 0.05–0.15), and high (FST > 0.30). According to Frankham et al. (2010), populations can be considered significantly differentiated when FST exceeds 0.15. In the present study, the FST value of 0.342 indicates high differentiation between the two subpopulations. Verma et al. (2019), Gouda et al. (2020), and Suvi et al. (2020) reported significant genetic differentiation (FST) values of 0.827, 0.490, and 0.407, respectively, when examining subpopulations. In addition, Nm is a valuable measure for assessing the relative importance of gene flow and genetic drift in evolutionary differentiation (Hassan and Hama-Ali 2022). Gene flow between populations within a species maintains genetic diversity, and more frequent gene flow contributes to increased genetic variation (Slatkin 1994). Increased gene flow is therefore expected to reduce genetic differentiation while enhancing genetic diversity (Fu et al. 2016). The Nm value (0.481) observed in our study is well below the critical value of 1, suggesting substantial genetic divergence between the two subpopulations (P1 and P2). According to Slatkin (1985), an Nm value of less than one signifies restricted gene flow between populations.

The neighbor-joining tree showed patterns congruent with the population structure. The classification of all 27 genotypes into two distinct groups of similar genotypes in the tree was consistent with the findings of the STRUCTURE analysis. The unweighted neighbor-joining tree depicts the level of genetic diversity among the rice genotypes. Moreover, the co-linearity analysis showed agreement between the cluster grouping and the STRUCTURE-based subpopulations: subpopulation P1 shares 37% of the genotypes with cluster I in the dendrogram, and subpopulation P2 shares 55.6% with cluster II. This agreement provides further evidence supporting the results obtained from the cluster and structure-based population analyses. It also highlights the efficacy of microsatellite-based genotyping as a reliable approach for investigating genetic variation in rice.

Furthermore, the population structure analysis revealed admixed genotypes (specifically genotypes 16, 17, and 21) that are distinctly positioned in the dendrogram. This observation underscores the possibility of gene flow between these genotypes, even in the presence of geographic barriers. Gene flow may occur through the exchange of seeds for cultivation among growers. Consequently, the model-based population structure analysis in the present study may contribute to mapping important agronomic traits and examining recombination patterns through genome-wide association analysis.

A unique allelic pattern was observed in each subpopulation. The number of alleles (Na) and the number of effective alleles (Ne) varied somewhat between the two groups, which suggests that different alleles are present within each subpopulation. This could result from mutation, recombination events, or natural selection acting on different alleles in each subpopulation. No heterozygosity was detected at any microsatellite locus, which can be explained by the high degree of homozygosity in the genotypes (Yun et al. 2020). Homozygosity, the condition in which an individual carries two identical alleles at a gene or genetic marker, can arise through self-pollination. Because rice is self-pollinating, each genotype is expected to show only a single allele at each microsatellite locus, so the level of observed heterozygosity is expected to be null. Within each subpopulation, individuals are therefore likely to carry the same two alleles at each microsatellite locus, leading to a unique allelic pattern in each group (Uddin et al. 2022).

Nevertheless, the expected heterozygosity values covered a substantial range, from 0.475 to 0.794. Expected heterozygosity (He) is a fundamental parameter in population genetics that provides valuable insight into genetic diversity (Yu et al. 2013). Shannon's information index (I) was calculated for all 23 microsatellite loci. The I values, ranging from 0.659 to 1.690, suggest that all loci are informative, owing to the genetically diverse population used in the current investigation. Furthermore, I and Ne showed a significant, strong positive correlation (r = 0.941**).

Subpopulation P1 contains genotypes originating from four distinct released groups, which makes it a highly valuable reservoir of genetic diversity in rice. Using these genotypes will be advantageous in future breeding programs that aim to broaden the genetic diversity of rice. Multi-parent advanced generation inter-crosses (MAGIC) can potentially benefit marker-assisted selection and genome-wide association studies because of their enhanced diversity. The AMOVA, FST, and Nm results were the primary sources of information for addressing the issue of limited gene flow in rice imports within the region. It is conceivable that a substantial proportion of genotypes within the population are highly similar, and it is also plausible that seed sharing among farmers is not widespread. Farmers do not receive seeds from governmental entities and consequently reserve a portion of their seeds each year for sowing in the following year. Principal component analysis (PCA) is a widely used unsupervised learning technique that reduces the dimensionality of a dataset and extracts the most significant information for subsequent analysis (Iqbal et al. 2014).

Conclusions

The results suggest that microsatellite markers are valuable tools for assessing genetic diversity among rice cultivars. Selecting genotypes as potential donors in a breeding strategy aimed at enhancing specific attributes in rice requires examining the genetic diversity among individuals within populations. It is also crucial to understand the distribution of alleles within subpopulations, as it provides insight into the valuable loci that can be used to investigate genetic diversity. The AMOVA, FST, and Nm findings were the primary sources of information for addressing the issue of inadequate gene flow in rice imports in the region.