Introduction

Sugarcane is a major agricultural cash crop grown in almost 110 countries worldwide. China is the third largest sugarcane producer, and Guangxi province plays a major role in its sugar production (Li and Yang 2015; Verma et al. 2020; 2021). Sugarcane industries in Guangxi account for 58.9% of China's sugarcane production and produced around 6.34 million tonnes in the crushing season of 2018/19 (Wang 2018; Chen et al. 2020). Sugarcane also has great potential as a major feedstock for biofuel production globally. It is considered among the best options for producing biofuels because of its outstanding biomass production capacity, high carbohydrate content and good energy input/output ratio. To increase the production of biofuels, it is very important to produce improved sugarcane varieties with better biomass degradability (Hoang et al. 2015; Ali et al. 2019; Wu et al. 2019). The complexity and size of the sugarcane genome are a major obstacle to genetic improvement. Genetic diversity can be determined by various approaches, including morphological traits, pedigree records and molecular markers.

Sugarcane breeding contributes to most of the sugarcane production, but it is a long process that generally takes about 12 years. Some intermediate materials that may have significant value in research and development may be eliminated during the breeding process. In addition, new parental materials are important for the development of modern varieties (Deng et al. 2004; Wu et al. 2008, 2019; Medeiros et al. 2020). Sugar content has higher heritability than cane yield and is expressed far more stably in the sugarcane breeding process (Jackson and McRae 2001; Todd et al. 2020).

Nowadays, the use of molecular markers for the evaluation of genetic diversity is drawing the attention of researchers (Rao et al. 2016; Wu et al. 2019). Intersimple sequence repeat (ISSR) and simple sequence repeat (SSR) markers have been used to analyze the diversity and genetic background of sugarcane populations, such as the parents in resource nurseries (Liu et al. 2015; Ali et al. 2019; Medeiros et al. 2020). Breeders have traditionally used morphological traits to identify the relationships among varieties; however, morphological traits are strongly affected by plant development and the environment. Identification based on morphological traits alone is therefore not reliable (Wang et al. 2009; Tew and Pan 2010; Ahmad et al. 2018). Molecular markers are an accurate and suitable technique to determine the genetic diversity of sugarcane cultivars and species (Silva et al. 2012; You et al. 2013; Santos et al. 2014; Wu et al. 2019; Medeiros et al. 2020).

Presently, a large number of molecular marker systems have been developed for use in sugarcane. Simple sequence repeats (SSRs) have been shown to be more efficient markers for breeding programs because they are abundant, require only small amounts of DNA, are co-dominant and reliable, and can detect multiple alleles (Powell et al. 1996; Pan 2016; Ali et al. 2019; Wu et al. 2019). SSRs are categorized into mono-, di-, tri-, tetra-, penta- or hexa-nucleotide SSRs based on the number of bases in the repeat unit, and into perfect, imperfect and compound SSRs, which display perfect repetitions, interruption by non-repeat nucleotides and two or more tandem motifs, respectively. SSR markers can also be sorted into genomic SSRs and expressed sequence tag (EST) SSRs according to their source. SSRs can be classified as nuclear (nuSSR), mitochondrial (mtSSR) or chloroplast SSRs (cpSSR) according to their location in the genome. Most genomic SSRs are nuclear SSRs (Soranzo et al. 1999; Weising and Gardner 1999; Selkoe and Toonen 2006; Ahmad et al. 2018).

SSR markers have been used mainly to study the genetic diversity and population structure of sugarcane (Nayak et al. 2014; Liu et al. 2018; Ali et al. 2019; Medeiros et al. 2020), and for varietal identification, genetic mapping (Marconi et al. 2011; Pan 2016) and genetic association (Banerjee et al. 2015; Ukoskit et al. 2019; Wu et al. 2019). In particular, fluorescence-labeled SSR markers combined with high-performance capillary electrophoresis (HPCE) have shown better performance in the genotyping of polyploid sugarcane, owing to their higher accuracy and better detection power (Fu et al. 2016; Ali et al. 2017, 2019; Ahmad et al. 2018; Xu et al. 2018; Wu et al. 2019).

Simple sequence repeat markers, with their high stability, large numbers and high polymorphism, are efficient for evaluating sugarcane germplasm in China and other countries (Pan 2006; Chen 2009; Wu et al. 2019; Medeiros et al. 2020). Yu et al. (2018) concluded that the genetic base of the core parents in China was narrow because of the limited number of parents. Genetic diversity among commonly used parents and genotypes has been analyzed with SSR markers in various countries (Liu et al. 2015; Rao et al. 2016; Wu et al. 2019; Pocovi et al. 2020).

The objective of this study is to compare, by means of Jaccard's genetic similarity coefficient calculated from SSR markers, the early-maturing intermediate materials of the GT series with the commonly used parents from the Guangxi Parental Resource Nursery, China. Understanding the genetic relationship between the high-sugar materials and the commonly used parents could help in developing these high-sugar materials as hybrid parents for the selection and development of novel high-sugar varieties.

Materials and Methods

Plant Materials and DNA Extraction

In this study, carried out in October 2014, a total of sixty-four sugarcane genotypes were tested. Twenty-three early-maturing high-sucrose clones of the GT series with Brix higher than that of ROC22 (17.73%) were collected from the Sugarcane Research Institute, Guangxi Academy of Agricultural Sciences (GxAAS), Nanning, Guangxi, China, and forty-one commonly used parents from the Sugarcane Cross Breeding Center germplasm were collected from the Chinese Academy of Agricultural Sciences, Hainan, China. The sixty-four genotypes were divided into five groups: 23 high-sucrose clones of the GT series (HSGT) and 41 commonly used parents, comprising 7 CP series clones from the USA, 9 ROC series clones from the Taiwan Sugar Research Institute, Taiwan, China, 14 mainland China (MLCH) series clones including some GT series, and 11 others from other countries (Table 1).

Table 1 Place of origin of 23 high-sucrose clones from GT series and 41 commonly used parents from CP series, ROC series, mainland China and other places used in Guangxi, China

DNA was extracted by the SDS method with minor modifications, following Huang et al. (2010) and Gao et al. (2012). Photosynthetically mature sugarcane leaves (200 mg) were collected from the different clones, ground separately to a fine powder in liquid nitrogen and transferred to 2-ml sterilized tubes containing 1 ml of pre-warmed SDS buffer (1.5% SDS, 100 mM Tris, 20 mM EDTA, 500 mM NaCl). The genomic DNA was then extracted following the traditional method.

SSR Analysis

Twenty-three SSR primer pairs were used to determine the diversity among high-sucrose clones from the GT series and commonly used parents. Genomic primer sequences followed the International Federation of Sugarcane Biotechnology guidelines (Cordeiro et al. 2000), and EST-SSR primer sequences were obtained from the literature (Oliveira et al. 2009).

Polymerase chain reaction (PCR) amplification was performed in a total volume of 20 μL containing 2 μL of 10 × buffer (including 2 mmol/L MgCl2), 1 μmol/L each of the forward and reverse primers, 0.4 μL of dNTPs (0.2 mmol/L), 1 μL of template DNA (30 ng/L), 0.2 μL of Taq polymerase and 14.4 μL of ddH2O. PCR was carried out with an initial denaturation at 95 °C for 5 min, followed by 35 cycles of denaturation at 94 °C (30 s), annealing at 53 °C (30 s) and extension at 72 °C (1 min), and a final extension at 72 °C for 5 min (T-gradient 96 thermocycler, Biometra, Germany). The PCR products were electrophoresed at 120 V in a 7% polyacrylamide gel for 90 min and photographed under UV light using a gel documentation system.
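The thermal profile can be written down as a small data structure to check the programmed run time. This is only a sketch: it assumes that the 35 cycles apply to the denaturation, annealing and extension steps, and the variable and function names are illustrative rather than part of any thermocycler software.

```python
# Thermal profile as read from the text. Assumption: the 35 cycles repeat
# the three middle steps (denaturation, annealing, extension).
INITIAL_DENATURATION = ("initial denaturation", 95, 5 * 60)  # (step, deg C, seconds)
CYCLE_STEPS = [
    ("denaturation", 94, 30),
    ("annealing", 53, 30),
    ("extension", 72, 60),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 5 * 60)


def total_hold_seconds():
    """Sum of the programmed hold times, ignoring ramping between steps."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


print(total_hold_seconds() / 60, "min of programmed hold time")
```

Under this reading the holds add up to 80 min; the actual run is longer because of temperature ramping.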

The Genetic Diversity of the Five Populations

The proportion of polymorphic loci (PPL), Shannon's diversity index (I), Nei's gene index (H), the observed number of alleles (Na) and the effective number of alleles (Ne) were calculated with POPGENE v. 1.32 to evaluate the levels of genetic diversity among the populations (Francis et al. 1999). Principal coordinate analysis (PCoA) was carried out to determine the genetic relationships among populations. PCoA, analysis of molecular variance (AMOVA) and the calculation of Nei's genetic distance and similarity were carried out using GenAlEx 6.5 (Peakall and Smouse 2012). The number of groups was estimated with STRUCTURE version 2.3.4 (Evanno et al. 2005), and the results were summarized using STRUCTURE HARVESTER (Earl and VonHoldt 2012).
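As an illustration of how such indices can be obtained from 0/1 band scores, the sketch below treats each band as a dominant marker at a two-allele locus in Hardy-Weinberg equilibrium. That model is a simplifying assumption (sugarcane is polyploid), the function names are hypothetical, and the calculation is not claimed to reproduce POPGENE's exact procedure.

```python
import math


def locus_stats(band_scores):
    """Return (Ne, H, I) for one locus from its 0/1 band scores.

    Assumption: dominant marker, two alleles, Hardy-Weinberg equilibrium,
    so the null-allele frequency is the square root of the band-absent
    proportion.
    """
    absent = band_scores.count(0) / len(band_scores)
    q = math.sqrt(absent)
    p = 1.0 - q
    homozygosity = p * p + q * q
    ne = 1.0 / homozygosity                              # effective number of alleles
    h = 1.0 - homozygosity                               # Nei's gene diversity
    i = sum(-f * math.log(f) for f in (p, q) if f > 0)   # Shannon's index
    return ne, h, i


def population_summary(matrix):
    """Average the per-locus statistics over a genotype x locus 0/1 matrix."""
    loci = [list(col) for col in zip(*matrix)]
    stats = [locus_stats(col) for col in loci]
    # A locus is counted as polymorphic here if the band is neither fixed nor absent.
    polymorphic = sum(1 for col in loci if 0 < sum(col) < len(col))
    n = len(loci)
    return {
        "PPL": 100.0 * polymorphic / n,
        "Ne": sum(s[0] for s in stats) / n,
        "H": sum(s[1] for s in stats) / n,
        "I": sum(s[2] for s in stats) / n,
    }


# Hypothetical scores: 4 genotypes x 2 loci (not data from this study).
example = [[1, 1], [1, 1], [1, 1], [0, 1]]
print(population_summary(example))
```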

Data Analysis

The bands generated by the twenty-three primers in the sixty-four genotypes were analyzed for genetic diversity. All segregating bands were scored manually as 1 for presence and 0 for absence. The genetic similarity (GS) based on Jaccard's coefficient (Jaccard 1908) was calculated using NTSYS-pc 2.10d. Cluster trees were constructed by the unweighted pair-group method with arithmetic mean (UPGMA) using DPS software (Tang et al. 2013).
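For two genotypes, Jaccard's coefficient is the number of bands present in both divided by the number of bands present in at least one. A minimal sketch of this calculation on 0/1 band profiles is given below; the clone names and profiles are hypothetical, and the code is an illustration rather than the NTSYS-pc implementation.

```python
def jaccard(a, b):
    """Jaccard similarity between two equal-length 0/1 band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    present = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Convention chosen here: two profiles without any band score 0.
    return shared / present if present else 0.0


def similarity_matrix(profiles):
    """Map each unordered pair of genotype names to its Jaccard similarity."""
    names = sorted(profiles)
    return {
        frozenset((m, n)): jaccard(profiles[m], profiles[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


# Hypothetical band profiles (not data from this study).
profiles = {
    "clone_A": [1, 1, 0, 1, 0],
    "clone_B": [1, 0, 0, 1, 1],
    "clone_C": [0, 1, 1, 0, 0],
}
gs = similarity_matrix(profiles)
print(gs[frozenset(("clone_A", "clone_B"))])
```

Shared absences (0 in both profiles) do not enter the coefficient, which is why it is commonly preferred for presence/absence marker data.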

Results

Cluster Analysis

The cluster analysis of the 64 genotypes revealed the general structure between HSGT (23) and the commonly used parents (41) from the germplasm of Guangxi sugarcane parents (Fig. 1; Table 1). At a similarity coefficient of about 0.61, all genotypes could be divided into 10 clusters. Cluster I included most of the HSGT (21 of the 23 clones) together with 8 MLCH, 4 ROC series and 3 other parents from the germplasm of Guangxi sugarcane parents. This indicated that these HSGT were closely related to the parents in cluster I, so crosses between them should be planned with care when HSGT are used as parents in the future. In particular, ROC22 accounts for the largest sugarcane acreage and is the main hybrid parent in China, but it has a high usage rate and low combining ability for Brix (Wu et al. 2019). Therefore, it is suggested that fewer crosses relying on it as the high-sugar parent should be made in future. In contrast, GT08-509 (X), YC89-7 (IX), POJ2827 (VIII), NcO293 and CP84-1198 (VII) and Zanz74-141 and CP81-2149 (VI) were far from the other genotypes. Using YC89-7 (IX), POJ2827 (VIII), NcO293 and CP84-1198 (VII) and Zanz74-141 and CP81-2149 (VI) as hybrid parents might therefore make it difficult to obtain high-sucrose offspring, whereas GT08-509, as an HSGT, should be fully utilized for its high sucrose and distinct genetic background in crosses with the parents in future.
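Dividing genotypes into clusters at a given similarity level corresponds to stopping UPGMA (average-linkage) merging once no pair of clusters reaches that level. The sketch below illustrates this on a hypothetical four-genotype similarity table; the names and values are invented, and the code is not the DPS implementation.

```python
def upgma_clusters(names, sim, threshold):
    """Merge clusters by average linkage until no pair reaches `threshold`.

    `sim` maps frozenset({a, b}) to the pairwise similarity of a and b.
    """
    clusters = [[name] for name in names]

    def average_similarity(c1, c2):
        total = sum(sim[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = average_similarity(clusters[i], clusters[j])
                if s > best:
                    best, pair = s, (i, j)
        if best < threshold:
            break  # cutting the dendrogram at this similarity level
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical similarities (not data from this study).
sim = {
    frozenset(("g1", "g2")): 0.9, frozenset(("g3", "g4")): 0.8,
    frozenset(("g1", "g3")): 0.5, frozenset(("g1", "g4")): 0.4,
    frozenset(("g2", "g3")): 0.5, frozenset(("g2", "g4")): 0.4,
}
print(upgma_clusters(["g1", "g2", "g3", "g4"], sim, threshold=0.61))
```

Because average-linkage similarities never increase as merging proceeds, stopping at the first merge below the threshold gives the same groups as cutting the full dendrogram at that level.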

Fig. 1 The UPGMA dendrogram of 23 high-sucrose clones from GT series and 41 commonly used parents from CP, ROC, Mainland China and others used in Guangxi, China

SSR Analysis

A total of 23 primer pairs were selected from the SSR primer collection to detect polymorphism in the 64 sugarcane genotypes (Table 2). A total of 309 bands were amplified by the 23 primers, with an average of about 13.48 polymorphic bands per primer (range 7–22 bands). Two hundred ninety-four of the 309 bands were polymorphic, a polymorphic rate of 94.8%. Eleven primers reached a polymorphic rate of 100%, including mSSCIR1, mSSCIR21, mSSCIR3, mSSCIR43, mSSCIR66, mSSCIR9, mSSCIR9 and SMC851MS. The highly polymorphic profiles obtained with these primers show that they can be used to assess genetic diversity in the 64 genotypes.

Table 2 Twenty-three simple sequence repeats (SSR) primers used for the detection of polymorphism in sugarcane clones

Polymorphism analysis of the 5 groups (the 23 high-sucrose clones of the GT series, HSGT, and the 41 commonly used parents from the CP series, ROC series, MLCH and others) with the 23 primers showed little difference in the number and rate of polymorphic bands. HSGT nevertheless showed the highest number of polymorphic bands (285) and the highest polymorphic rate (95.3%) among the 5 groups (Table S1). This indicates that the 23 primers worked well in all 5 groups. The sharing status of amplification bands (Table S2) shows the bands amplified in common by pairs of groups. HSGT shared 4 bands with the MLCH parent group, while the CP series and others parent groups shared 6 bands.

Genetic Similarity Coefficient Analysis

According to the statistics of the bands amplified from the SSR loci, the genetic similarity coefficient among the 64 genotypes ranged from 0.460 to 0.881, with an average of 0.613, which shows that the genotypes differed genetically to some degree. The minimum genetic similarity coefficient (0.460) was found between Q208 and GT08-509, which were therefore the most distantly related pair. The highest genetic similarity coefficient (0.881) was found between GT07-548 and GT06-1857, which were the most closely related pair. Among the 23 clones in the HSGT group, the average GS was 0.633, indicating certain genetic differences among these clones. HSGT therefore has great potential in the development of hybrid parents. The GS of the 23 HSGT clones with the 7 CP series parents (0.604) and with the other parents (0.608) was higher than the GS within the CP series (0.581) and within the others group (0.607), respectively (Table S3). This indicated that HSGT was more closely related to the CP series and to the parents from other countries in the Guangxi germplasm than the members of those groups were to each other. Since the 1950s, the CP series, characterized by high sugar, has played an essential role in the breeding of sugarcane parents in China (Qi et al. 2012; Liu et al. 2015). The genetic similarity of the 23 HSGT clones was highest with the ROC series (0.622). Presently, ROC22 and ROC16 account for the largest plantation area, more than 80% in China (Liu et al. 2018). ROC varieties have therefore been widely developed and utilized in sugarcane hybridization. Guangxi province produces five of the ten most widely grown commercial varieties in China, including GT29, GT42, GT49, LC05-136 and ROC series offspring. Genetic similarity information between HSGT and the parent groups should guide breeders in planning crosses for high-sugar offspring and in using HSGT as hybrid parents.
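Group-level GS values such as those above can be understood as averages of pairwise coefficients, taken either over all pairs within one group or over all pairs drawn from two groups. The sketch below shows this under that assumption; the genotype names, group assignments and similarity values are hypothetical.

```python
from itertools import combinations, product


def mean_gs(sim, group_a, group_b=None):
    """Mean pairwise similarity within `group_a`, or between two groups.

    `sim` maps frozenset({a, b}) to the pairwise similarity of a and b.
    """
    if group_b is None:
        pairs = list(combinations(group_a, 2))   # within-group pairs
    else:
        pairs = list(product(group_a, group_b))  # between-group pairs
    return sum(sim[frozenset(pair)] for pair in pairs) / len(pairs)


# Hypothetical similarities for two groups of two genotypes each.
sim = {
    frozenset(("g1", "g2")): 0.9, frozenset(("g3", "g4")): 0.8,
    frozenset(("g1", "g3")): 0.5, frozenset(("g1", "g4")): 0.4,
    frozenset(("g2", "g3")): 0.5, frozenset(("g2", "g4")): 0.4,
}
group_x, group_y = ["g1", "g2"], ["g3", "g4"]
print(mean_gs(sim, group_x))           # within group_x
print(mean_gs(sim, group_x, group_y))  # between group_x and group_y
```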

Genetic Diversity and Relationships Among Genotypes

The mean values of the proportion of polymorphic loci (PPL), the observed number of alleles (Na), the effective number of alleles (Ne), Nei's gene index (H) and Shannon's diversity index (I) for the five populations are presented in Table 3. At the species level, PPL, Na, Ne, H and I were 89.64%, 1.89 ± 0.03, 1.47 ± 0.01, 0.28 ± 0.01 and 0.43 ± 0.01, respectively.

Table 3 Genetic diversity of five populations of sugarcane based on 309 SSR loci

Among the five populations, the total gene diversity index (Ht) and the within-population gene diversity index (Hs) were 0.21 ± 0.03 and 0.15 ± 0.01, respectively. The genetic differentiation index among the five populations (Gst) was 0.28. Analysis of molecular variance (AMOVA) showed that a very large proportion of the genetic variation (99%) occurred within the populations, whereas only 1% was observed among the five populations (p < 0.001; Table 4). The average Nei's genetic distance and genetic similarity index were 0.07 and 0.93, respectively (Table S4).
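Nei's coefficient of gene differentiation relates the two diversity indices as Gst = (Ht − Hs)/Ht. The short check below applies this standard formula to the rounded Ht and Hs values given in the text; the function name is illustrative.

```python
def gst(ht, hs):
    """Nei's coefficient of gene differentiation, (Ht - Hs) / Ht."""
    return (ht - hs) / ht


# Rounded values as reported in the text.
print(round(gst(0.21, 0.15), 2))
```

With the rounded inputs the result is about 0.29, close to the reported 0.28; the small difference presumably reflects rounding of Ht and Hs.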

Table 4 Analysis of molecular variance (AMOVA) for 5 populations of sugarcane based on 309 SSR loci

Principal Coordinate Analysis

The principal coordinate analysis helps to illustrate the genetic relationships among the sugarcane parents as individual units and was calculated from the SSR data matrix of all 64 sugarcane accessions in the 5 populations of the present study (Fig. 2). According to the principal coordinates (Fig. 2), HSGT was located far from the others and CP series populations, with which its GS was relatively small (0.608 and 0.604, respectively).
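The core of a principal coordinate analysis is to double-centre the squared distance matrix and extract its leading eigenvectors. The sketch below computes only the first axis with power iteration in plain Python. It assumes a distance matrix (for example 1 − GS) whose leading eigenvalue is positive, uses a hypothetical three-point example, and is not claimed to match the GenAlEx procedure.

```python
import math


def pcoa_first_axis(dist, iterations=500):
    """Return (eigenvalue, coordinates) of the first principal coordinate."""
    n = len(dist)
    # Gower double-centring of -0.5 * d^2.
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
         for i in range(n)]
    # Power iteration; assumes the dominant eigenvalue of b is positive.
    v = [float(i + 1) for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    # Scale the unit eigenvector by sqrt(eigenvalue) to obtain coordinates.
    return eigenvalue, [x * math.sqrt(eigenvalue) for x in v]


# Hypothetical distances: three points spaced one unit apart on a line.
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
eigenvalue, coords = pcoa_first_axis(dist)
print(eigenvalue, coords)
```

For this toy example the first axis recovers the positions of the three points on the line, up to an arbitrary sign.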

Fig. 2 Principal coordinate analysis (PCoA) of sugarcane parents using five populations of sugarcane based on genetic similarity

STRUCTURE Analysis

STRUCTURE analysis showed that the delta K displayed peaks at K = 3 (Fig. 3a and b). This indicates that the 64 individuals were clustered into 3 groups with 3 colors. Each individual is represented by a vertical colored line. The same color of different individuals indicates that they belong to the same cluster.
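The delta K statistic of Evanno et al. (2005) divides the absolute second-order change in the log-likelihood between successive K values by the standard deviation of the log-likelihood at K across replicate runs. The sketch below takes the second difference of the mean log-likelihoods, which is a simplification assumed here; the log-likelihood values are invented for illustration, and the code is not the STRUCTURE HARVESTER implementation.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map each interior K to delta K.

    `lnp_by_k` maps K to the list of log-likelihoods from replicate runs.
    Each K needs at least two replicates with non-identical values.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the smallest and largest K
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical log-likelihoods, two replicate runs per K (not study data).
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-700.0, -702.0],
    4: [-690.0, -692.0],
    5: [-685.0, -687.0],
}
dk = delta_k(runs)
print(max(dk, key=dk.get))  # K with the highest delta K
```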

Fig. 3 Cluster of 64 individuals made by STRUCTURE for K = 3

Discussion

Constant efforts are being made in China to improve sugarcane production, particularly cane yield and sugar content. Sugarcane breeding programs have enhanced these efforts to a great extent. Breeding programs based on genetic diversity are currently a focus of agricultural research. In breeding programs, hybrid varieties backcrossed with their parents have produced offspring with higher sugar content (Chen et al. 2009; Aitken et al. 2018; Medeiros et al. 2020). Breeding based on molecular markers helps to overcome the limitations of conventional breeding techniques and to understand the genetic susceptibility of hybrids to the various biotic and abiotic factors that affect plant growth and development (Govindaraj et al. 2011; Moore et al. 2013; Ahmad et al. 2018). For a successful breeding program, it is vital that the parents are genetically divergent and the offspring tolerant to various stress factors (You et al. 2013; Neto et al. 2020). Microsatellite or SSR markers are a recent development in breeding research associated with crop improvement, and sugarcane breeding programs have found SSR markers to be more useful for identifying hybrid parents (Santos et al. 2014; Manechini et al. 2018; Ahmad et al. 2018). SSR markers show high variability, wide genomic distribution, co-dominant inheritance, high reproducibility, a multiallelic nature and specific chromosomal locations, and hence are gaining importance in breeding programs, especially in sugarcane (Neto et al. 2020; Medeiros et al. 2020).

In the present study, Jaccard's genetic coefficient was used to compare the genetic diversity of sugarcane hybrids. The sixty-four sugarcane genotypes showed high polymorphism when tested for genetic similarity using SSR markers. HSGT showed the highest number of polymorphic bands (285) and a polymorphic rate of 95.3% among the five groups. Among the 23 clones in the HSGT group, the average GS based on the Jaccard similarity coefficient was 0.633, which indicated certain genetic differences among these clones. Studies on SSR-based breeding programs have shown that contamination in the pedigree is possible even among progenies that inherited parental traits. It was also observed that SSR markers can be used to differentiate true hybrids from contaminants (Santos et al. 2014; Parthiban et al. 2018). Wu et al. (2019) applied SSR markers to the management of parental germplasm in sugarcane (Saccharum spp. hybrids) breeding programs. SSR markers are more suitable for the identification of parent clusters in breeding programs (Chen et al. 2009; You et al. 2013; Ahmad et al. 2018; Ali et al. 2019). Based on Jaccard's coefficient, GT07-548 and GT06-1857 were highly similar and can therefore be regarded as a poor parental combination for the breeding program.

The present study also concluded that the 23 primers showed significant results in terms of measurement of genetic diversity and mapping with all the five groups, which was in line with the previous studies (Chen et al. 2009; Ahmad et al. 2018). These primers could be effectively used for sugarcane breeding programs to identify the genetically divergent parents. Researchers have shown that primer polymorphism enhances the efficacy of inter-specific hybrid identification (Yang et al. 2006; Saha et al. 2017; Manechini et al. 2018). The PCR-based DNA markers make it possible to analyze the degree of genetic variability that occurs among conventional progenitor species and commercial cultivars in sugarcane breeding trials (Singh et al. 2011; You et al. 2016; Ahmad et al. 2018; Medeiros et al. 2020).

Genetic similarity information between HSGT and the parent groups could help breeders to develop high-sugar offspring varieties using HSGT as hybridization parents. GT08-509, as an HSGT, should be fully utilized for its high sucrose and distinct genetic background in crosses with the parents in the near future. Compared with the other populations, HSGT and the ROC series had the largest GS (0.622) and a small Nei's genetic distance (0.048). At the same time, according to STRUCTURE (Fig. 3b), both HSGT and the ROC series showed a high proportion of blue bars, indicating that the ROC group and HSGT are similar. HSGT was genetically distant from two parent populations, the CP series and others (Fig. 2). In the Guangxi breeding program, breeders should make more crosses between high-sucrose materials and genetically distant parents to obtain offspring with high sucrose and strong genetic diversity.

In conclusion, the analysis of variations in SSR fragments provides a useful tool for determining diversity to develop plant breeding strategies. In the coming years, the acceptance and use of SSR-based markers will increase significantly in the breeding of sugarcane. Identifying useful SSRs is critical, but in sugarcane, this can be a prolonged and complex process because the sugarcane genome is highly complex. These markers may be used for the construction of a genetic map in sugarcane. Further work on crosses between and within the groups identified in this study may provide useful strategies for identifying favorable genes and alleles in newly developed sugarcane varieties.