Introduction

Sorghum (Sorghum bicolor (L.) Moench) is the fifth most economically important cereal crop after maize (Zea mays L.), wheat (Triticum aestivum L.), rice (Oryza sativa L.), and barley (Hordeum vulgare L.), and it is widely grown under adverse environmental conditions (Adjei-Gyapong et al. 2016; Faostat 2017; Hussain et al. 2020; Prasad et al. 2020; Scavo and Mauromicale 2020; Singh and Husen 2020; Stat 2020). Its desirable traits include stem sugar, lignocellulosic biomass, resilience to biotic and abiotic stresses, and high grain yield (Yadav et al. 2019).

As a conventional C4 crop, it has mostly served as a source of food and beverages for humans (Cisse et al. 2018; Galassi et al. 2020; Girard and Awika 2018; Nasidi et al. 2019; Nyoni et al. 2020; Taylor 2019; Taylor and Duodu 2019; Visarada and Aruna 2019). Both grain and stem are also used for other bio-industrial products (Konwar et al. 2018; Liu et al. 2020; Panda 2017). Additionally, its stem can be used as forage, and its grain is a key ingredient of premium alcohols (Elangovan et al. 2020; Li and Li 2020; Mengistu et al. 2020; Shimizu et al. 2020; Yesmin et al. 2020). It is regarded as a diploid model for energy crops such as polyploid sugarcane, because its genome of about 750 Mb is three to four times smaller than that of maize (Ananda et al. 2020; Yadav et al. 2020; Yang et al. 2020).

A genetic map is a tool used for tagging, cloning, and mapping quantitative and qualitative genes, and it plays a key role in marker-assisted breeding (MAB) programs (Baidyussen et al. 2020; Ji et al. 2017; Maina Assane Mamadou, 2020). High-density genetic maps of sorghum can be used for genome comparison, mining of useful genes, and gene mapping (Gelli et al. 2017; Hu et al. 2019; Ji et al. 2017; Mace et al. 2019; Mathur et al. 2017; Zhang et al. 2019). By comparing homology across plant species, genes for disease and insect resistance, stress tolerance, sugar concentration, and biological yield can be identified and mapped to chromosomes, laying the foundation for the cloning and application of those genes (Nundwe 2018). High-density genetic mapping is of great importance for increasing the statistical power and accuracy of gene and QTL detection (Boyles et al. 2019; Govindarajulu et al. 2020; Habyarimana et al. 2019; Ji et al. 2017; Kajiya-Kanegae et al. 2020; Kang et al. 2020; Kong et al. 2020; Mace et al. 2019; Miao et al. 2020).

Early sorghum maps were developed primarily with labor-intensive or dominant markers such as RFLP (restriction fragment length polymorphism), AFLP (amplified fragment length polymorphism), and RAPD (random amplified polymorphic DNA) (Nadeem et al. 2018; Nguyen 2019; Sengar 2018). These maps were instrumental in sorghum gene (QTL) mapping, comparative genomics, and genetic studies (Balakrishna et al. 2020; Cuevas et al. 2016; Disasa et al. 2017; Disasa et al. 2018; Kong et al. 2018; McCormick et al. 2016; Woldesemayat et al. 2018). However, such marker systems provide limited marker numbers, show dominant inheritance, and are not reproducible across maps. More informative marker types can address these limitations.

Simple sequence repeat (SSR) markers, which offer high reproducibility, co-dominant inheritance, multi-allelic diversity, and genome-wide abundance, have replaced dominant markers in linkage mapping owing to the rapid advancement of sequencing and genotyping technologies (Derese 2017; Schnaithmann 2016). SSR markers were first used to detect and identify polymorphism between groups and were then used to build sorghum genetic maps containing large numbers of SSR markers (Disasa et al. 2016; Kumar et al. 2020a, b; Ma et al. 2020; Nahas et al. 2020; Raza et al. 2020; Zuo et al. 2020). Several maps built with, or primarily based on, SSR markers have been developed and used in sorghum gene (QTL) mapping, genome evolution, molecular genetics, and MAB (Disasa et al. 2017; Disasa et al. 2018; Jadhav et al. 2019; Kajiya-Kanegae et al. 2020; Kang et al. 2020; Kiranmayee et al. 2020; Maina Assane Mamadou 2020; Takele et al. 2022).

More recently, genotyping-by-sequencing (GBS) has emerged as an effective genotyping system capable of identifying, sequencing, and genotyping thousands of markers across almost any genome and population of interest (Ara et al. 2020; Boopathi 2020; Jaganathan et al. 2020; Kausch et al. 2019; Khangura 2019; Nadeem et al. 2018; Sahu et al. 2020; Scheben et al. 2017). Next-generation sequencing (NGS) can detect differences in DNA sequence with high precision, which has made it widely used for plant and animal genetic analysis (Ansari et al. 2020; Gonzalz et al. 2020; Santos-Silva et al. 2020). Using this approach to scan the entire sorghum genome is of great importance for the development of high-density markers and for gene mining in sorghum breeding. This study aims to create a high-density linkage map with single nucleotide polymorphisms (SNPs) obtained via NGS technology. The linkage map is intended to guide gene exploration and to lay a foundation for marker-assisted breeding.

Materials and methods

Recombinant inbred line population and trial evaluation

In this work, 139 recombinant inbred lines (RILs) from a bi-parental mapping population derived from sweet sorghum and grain sorghum were examined. The mapping population was developed by crossing sweet sorghum (Gambella, the pollen source) with grain sorghum (Sorcoll 163/07, the female parent), followed by repeated selfing for six generations. Sorcoll 163/07 is an inbred line characterized by drought tolerance potential, with an average plant height of 224.5 cm, stem diameter of 15.5 mm, and °Brix of 6.8%. Gambella is a farmer's preferred genotype released as an improved variety by EIAR and is characterized by its sweet stem, with an average plant height of 192.3 cm, stem diameter of 14.0 mm, and °Brix of 14.4%. Seeds from the progenies were used for both phenotyping and sequencing. The 139 RILs derived from the two parents were planted in the trial field at Ambo University’s Guder campus during the main and off-seasons of 2018 and 2019. The site is located at 37°46′E, 8°58′N, at an elevation of 1900 m.a.s.l., and is characterized by a mid agro-ecology. These coordinates were recorded using a portable Global Positioning System (GPS) device. Annual rainfall ranged from 800 to 3194 mm.

For both environments/seasons, an alpha-lattice design with three replications was used. All RILs and their parents were arranged in two rows per plot. Each plot was 3.0 m long, with a spacing of 0.75 m. Replications and blocks were separated by 1.5 and 2.0 m, respectively. Twenty-five days after emergence, plants were manually thinned to a spacing of 0.2 m. Five plants were randomly chosen from each row for the traits to be assessed. Diammonium phosphate (DAP) was applied at planting at the recommended rate of 100 kg/ha, and the same rate of urea was added as a split application. All recommended agronomic practices were applied, and the incidence of diseases and pests was checked periodically. After flowering, the heads of previously marked plants were covered with paper bags to protect them from bird damage and from pollen contamination by nearby plants. All yield and yield-related data were collected following the sorghum agro-morphological descriptor (IBPGR 1993).

Phenotyping

Days to flowering were recorded when roughly half of the plants in a plot had reached the half-flowering stage. Similarly, days to maturity were recorded as the number of days until seeds on 50% of the plants in a plot had developed a black layer on the lower third of the panicle. Plant height (in cm) was measured from the ground to the tip of the head. Panicle length was measured as the distance from the base of the panicle to its tip. A digital caliper (LCD stainless electronic ruler micrometer) was used to measure head diameter and stem diameter. Four measurements were made in each plot, and an average was calculated for each trait. Panicle weight, grain per plant (g), and thousand seed weight (g) were measured using a sensitive balance.

Phenotypic data analysis

SAS 9.3 (https://www.sas.com/) was used to perform the analysis of variance across the progeny. The PROC GLM procedure was used to estimate variance for each trait. Normality and homogeneity of the trait data for the combined season were tested using Minitab version 16 (Minitab 2010). Means were separated using Duncan’s multiple range test. Pearson correlation coefficients were also calculated for each pair of traits using the PROC CORR procedure.
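As an illustration of the correlation step, the Pearson coefficient between two traits can be computed as in the following minimal Python sketch. It is not the SAS PROC CORR code used in the study; the function name `pearson_r` and the example values are hypothetical and serve only to show the formula.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length trait vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant trait")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical plot means for two traits (illustration only, not study data)
panicle_length = [40.0, 45.0, 50.0, 55.0]
grain_per_plant = [120.0, 135.0, 150.0, 165.0]
print(round(pearson_r(panicle_length, grain_per_plant), 3))
```

In practice the coefficient would be computed for every pair of traits, which is what a correlation matrix reports.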

Genotyping

Seeds of the 139 RILs (F6:8) and the two parents (Gambella and Sorcoll 163/07) were planted in the greenhouse at the World Agroforestry Center (ICRAF) headquarters in Nairobi, Kenya. Leaf tissue was collected from two- to three-week-old seedlings, and genomic DNA was extracted using a Promega genomic DNA extraction kit. Genomic DNA quality was assessed on a 0.8% agarose gel stained with GelRed® (Biotium, USA), and genomic DNA was quantified using a Qubit® 2.0 (Life Technology, Grand Island, NY).

GBS library construction and SNP calling

Genomic DNA of each sample was digested with the PstI and MspI restriction enzymes following the method of Poland et al. (2012). Briefly, the GBS libraries were constructed in 96-plex. DNA sequencing was carried out on the Illumina Genome Analyzer at Cornell University, USA. Data collection, analysis, and management were carried out using several software packages, including TASSEL v.5, which was used to call polymorphic SNPs across both parents and their progeny. Only SNPs with a minor allele frequency higher than 0.05 and less than 10% missing data were used for this research.
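The two filtering thresholds stated above (minor allele frequency higher than 0.05 and less than 10% missing data) can be illustrated with a short Python sketch. This is not the TASSEL pipeline itself. The function name, the single-allele coding of homozygous RIL calls, and the `"N"` code for a missing call are assumptions made for illustration.

```python
from collections import Counter

MISSING = "N"  # assumed code for a missing genotype call


def snp_passes_filter(calls, min_maf=0.05, max_missing=0.10):
    """Return True if a SNP has MAF > min_maf and a missing rate < max_missing.

    `calls` holds one allele call per line (e.g. "A" or "G"), or MISSING.
    """
    if not calls:
        return False
    observed = [c for c in calls if c != MISSING]
    missing_rate = 1 - len(observed) / len(calls)
    if missing_rate >= max_missing:
        return False
    counts = Counter(observed)
    if len(counts) < 2:
        return False  # monomorphic among the observed calls
    # The minor allele is the second most frequent allele
    minor_count = sorted(counts.values(), reverse=True)[1]
    return minor_count / len(observed) > min_maf


# Hypothetical SNP scored in 100 lines: 60 "A", 35 "G", 5 missing
example = ["A"] * 60 + ["G"] * 35 + [MISSING] * 5
print(snp_passes_filter(example))
```

Applying the function to every SNP and keeping those that return True would reproduce the kind of filter described, under the stated assumptions about data coding.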

High-density linkage map

Before creating the linkage map, all heterozygous calls were treated as missing data, and the high-density linkage map was then constructed in stages. In the first stage of linkage mapping, the BIN tool implemented in ICiMapping software v4.1 (Wang et al. 2016) was used to bin all 1082 SNP markers based on their segregation pattern (Winfield et al. 2016). After binning, markers were grouped using a logarithm of the odds (LOD) threshold of > 2.5. The genomic positions of the SNP markers obtained during SNP calling were used to assign linkage groups (LGs). LGs with fewer than five markers were considered unlinked and were excluded from further map construction. LGs from the same chromosome were merged.
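To make the LOD grouping criterion concrete, the sketch below computes a generic two-point LOD score from the number of lines in which two markers carry alleles from different parents. It is a textbook binomial formulation and only an assumption about how a grouping threshold of 2.5 might be applied. The calculation inside ICiMapping, which accounts for the RIL population structure, may differ, and the function name is hypothetical.

```python
import math


def two_point_lod(n_recombinant, n_total):
    """Generic two-point LOD: log10 of L(theta_hat) / L(theta = 0.5).

    theta_hat is the observed fraction of recombinant lines. No adjustment
    for recombination accumulated over generations of selfing is applied.
    """
    if n_total <= 0 or not 0 <= n_recombinant <= n_total:
        raise ValueError("need 0 <= n_recombinant <= n_total and n_total > 0")
    theta = n_recombinant / n_total
    if theta >= 0.5:
        return 0.0  # no evidence of linkage
    n_parental = n_total - n_recombinant
    log_l_hat = 0.0
    if n_recombinant > 0:
        log_l_hat += n_recombinant * math.log10(theta)
    if n_parental > 0:
        log_l_hat += n_parental * math.log10(1 - theta)
    log_l_null = n_total * math.log10(0.5)
    return log_l_hat - log_l_null


# Hypothetical marker pair: 10 recombinant lines out of 100 scored
lod = two_point_lod(10, 100)
print(lod > 2.5)  # would this pair be placed in the same group?
```

Marker pairs whose LOD exceeds the threshold are treated as linked, and groups are formed from chains of linked pairs.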

The order of markers within each LG was determined using the RECORD (REcombination Counting and ORDering) algorithm. Recombination frequencies between markers were converted to centimorgans (cM) using the Kosambi mapping function. The initial linkage map was then used to check the markers for genotyping errors, using the functions for duplicate lines, segregation distortion, switched alleles, and single and double crossovers implemented in ICiMapping v4.1.
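The Kosambi conversion referred to above can be written explicitly as d = 25 ln((1 + 2r) / (1 − 2r)), where r is the recombination frequency and d is the map distance in cM. The Python sketch below implements this standard formula and its inverse; the function names are chosen for illustration and are not taken from ICiMapping.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination frequency r (0 <= r < 0.5)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Inverse conversion: recombination frequency for a distance in cM."""
    if d_cm < 0:
        raise ValueError("distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


# A recombination frequency of 0.10 corresponds to slightly more than 10 cM
print(round(kosambi_cm(0.10), 2))
```

For small r the distance in cM is close to 100r, and the two diverge as r approaches 0.5 because the function allows for crossover interference.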

In the second stage of linkage mapping, i.e. after error correction and removal of low-quality markers and genotypes, genotypic data from 139 lines with 1082 filtered SNP markers were used in ICiMapping v4.1 to build the final linkage map (Wang et al. 2016). Ten LGs, corresponding to the ten sorghum chromosomes, were defined at a LOD threshold of > 2.5. Using the RECORD algorithm, the 1082 SNP markers were ordered across the 10 chromosomes, and the marker order was then fine-tuned by rippling with the sum of adjacent recombination fractions (SARF) criterion and a window size of 7. Genetic distances between SNP markers were calculated from the recombination rates using the Kosambi mapping function. Once the best marker order had been selected, ICiMapping v4.1 was used to construct and draw the high-density linkage map (Wang et al. 2016).
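The rippling step can be pictured as sliding a window along the marker order and keeping, within each window, the arrangement that minimizes SARF. The Python sketch below is a simplified illustration under that assumption. It tries every permutation inside the window, which may not match the exact search used in ICiMapping, and the function names and the pairwise recombination matrix `rf` are hypothetical.

```python
from itertools import permutations


def sarf(order, rf):
    """Sum of adjacent recombination fractions for a marker order."""
    return sum(rf[a][b] for a, b in zip(order, order[1:]))


def ripple(order, rf, window=7):
    """Slide a window along the order, keeping the lowest-SARF permutation."""
    order = list(order)
    window = min(window, len(order))
    for start in range(len(order) - window + 1):
        head, tail = order[:start], order[start + window:]
        best = min(
            permutations(order[start:start + window]),
            key=lambda perm: sarf(head + list(perm) + tail, rf),
        )
        order[start:start + window] = best
    return order


# Hypothetical example: four markers at 0, 10, 20, and 30 cM, with pairwise
# recombination fractions approximated as distance / 100
positions = [0, 10, 20, 30]
rf = [[abs(a - b) / 100 for b in positions] for a in positions]
print(ripple([0, 2, 1, 3], rf, window=3))
```

A window of 7 means 5040 candidate arrangements per window position, which is why rippling is used to refine an existing order rather than to search all orders of a chromosome.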

Quantitative trait loci mapping

To map QTL associated with grain per plant, 1000 seed weight, panicle diameter, and panicle length, a total of 1082 polymorphic SNP markers were used across the 139 RILs. QTL were detected and mapped using ICiMapping software version 4.1 (Wang et al. 2016). The ICIM-ADD method in ICiMapping was chosen as the mapping method for locating QTL. A probability of 0.05 was used in stepwise regression, and the scanning step for ICIM-ADD was set at 1.0 cM. All QTL are named using the standard nomenclature (Wang et al. 2016). A stable QTL is one that shows effects in more than two seasons. A major QTL was defined as a QTL with a LOD value > 2.5 that explained at least 10% of the phenotypic variance.
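The LOD and phenotypic-variance criteria above amount to a simple filter on the QTL table. The following Python sketch shows one way to apply it, reading the 10% criterion as "at least 10%". The `Qtl` record, its field names, and the example entries are hypothetical and are not results from this study.

```python
from dataclasses import dataclass


@dataclass
class Qtl:
    name: str
    lod: float
    pve: float  # phenotypic variance explained, in percent


def is_major(qtl, lod_threshold=2.5, pve_threshold=10.0):
    """A QTL counts as major if LOD > threshold and PVE >= threshold."""
    return qtl.lod > lod_threshold and qtl.pve >= pve_threshold


# Hypothetical records for illustration only
candidates = [
    Qtl("qExample1", lod=3.1, pve=12.0),
    Qtl("qExample2", lod=3.1, pve=7.0),
    Qtl("qExample3", lod=2.0, pve=15.0),
]
print([q.name for q in candidates if is_major(q)])
```

Keeping the thresholds as parameters makes it easy to see how the set of major QTL changes under a stricter LOD cutoff.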

Results

The genetic map was built from a sorghum RIL population of 139 lines derived from a Gambella × Sorcoll 163/07 cross. Sorcoll 163/07 is a grain sorghum, and Gambella is a sweet sorghum. The two parents differed markedly in their phenotypic traits. Their progeny were therefore well suited to polymorphic marker screening and linkage group construction.

SNP marker detection

Of the 1702 markers screened, 1082 (63.57%) were polymorphic between the two mapping parents. A further 607 (35.66%) could not distinguish between the two parents and were classified as monomorphic markers, whereas the remaining 13 (0.76%) did not show distinct and reproducible polymorphic patterns between the two parents. All 1082 polymorphic markers were assigned to the high-density linkage map.
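The marker counts reported above can be checked with a few lines of arithmetic. The Python sketch below uses only the counts given in the text (1702 screened, of which 1082, 607, and 13 fall into the three classes). The helper name and the class labels are illustrative.

```python
def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to two decimals."""
    return round(100 * part / whole, 2)


screened = 1702
counts = {"polymorphic": 1082, "monomorphic": 607, "not reproducible": 13}

# The three classes should account for every screened marker
assert sum(counts.values()) == screened

for label, n in counts.items():
    print(label, n, percent(n, screened))
```

The three classes sum to the number of markers screened, so the percentages describe a complete partition of the marker set.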

High-density linkage map

The linear arrangement of markers on each chromosome was established from the genetic distances between neighboring markers. The 1082 filtered SNP markers were grouped into 10 chromosomes, and all 1082 markers were placed on the final genetic map (Fig. 1). The total map length was 2174.50 cM, with an average marker distance of 2.01 cM, and the genetic length of individual chromosomes ranged from 72.91 to 444.02 cM. Chromosome 4 was the longest of the ten chromosomes (444.02 cM) and carried 166 of the 1082 markers, giving an average distance of 2.67 cM between neighboring markers (Table 1). In contrast, chromosome 8 was the shortest of the ten chromosomes, measuring only 72.91 cM, with an average gap of 2.25 cM between neighboring markers. It contained 100 markers, or 9.24% of all markers (Fig. 2).
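The summary statistics in this paragraph follow from simple ratios. The Python sketch below assumes that average marker distance is computed as map length divided by the number of markers, which reproduces the whole-map value of 2.01 cM and the 2.67 cM figure for chromosome 4. The function names are illustrative.

```python
def average_marker_distance(length_cm, n_markers):
    """Average marker distance, taken here as map length / number of markers."""
    return length_cm / n_markers


def marker_share(n_markers, n_total):
    """Percentage of all mapped markers located on one chromosome."""
    return 100 * n_markers / n_total


# Values taken from the text above
print(round(average_marker_distance(2174.50, 1082), 2))  # whole map
print(round(average_marker_distance(444.02, 166), 2))    # chromosome 4
print(round(marker_share(100, 1082), 2))                 # chromosome 8
```

Dividing by the number of intervals (markers minus one) instead would give slightly larger averages, so the convention should be stated whenever maps are compared.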

Fig. 1

Graphical representation of the high-density linkage map constructed from genotyping-by-sequencing-derived SNPs in recombinant inbred lines from a cross between Gambella and Sorcoll 163/07

Table 1 Marker statistics of the linkage map constructed from RIL derived from a cross between Gambella and Sorcoll 163/07
Fig. 2

The high-density genetic map of sorghum constructed from genotyping-by-sequencing-derived SNPs in recombinant inbred lines derived from a cross between Gambella and Sorcoll 163/07

The proportion of marker intervals less than or equal to 5.0 cM (interval ≤ 5 cM) was used to describe the degree of linkage between markers; it ranged from 86.8 to 95.8% across chromosomes, with an average of 91.34% (Table 1). On average, 108.2 markers were assigned to each chromosome, and the average chromosome length was 253.27 cM (Table 1). Chromosome 8 had the largest gap, 29 cM. This wide gap lay between 190.7 and 251.5 cM, and about 90% of the intervals on that chromosome were 5 cM or less. Chromosome 5 (128.40 cM) had an average distance of 2.25 cM between neighboring markers, with a 9.3 cM gap at its end. The highest proportion of intervals ≤ 5 cM was 95.8%, indicating strong consistency in marker assignment (Table 2).
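The interval statistics described above (the share of intervals of 5 cM or less, and the largest gap) can be derived directly from the ordered marker positions on a chromosome. The Python sketch below shows the calculation; the function name and the example positions are hypothetical and are not taken from Table 1 or Table 2.

```python
def interval_summary(positions_cm, threshold_cm=5.0):
    """Return (percent of intervals <= threshold, largest interval) in cM."""
    positions = sorted(positions_cm)
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if not gaps:
        raise ValueError("at least two marker positions are required")
    share = 100 * sum(g <= threshold_cm for g in gaps) / len(gaps)
    return share, max(gaps)


# Hypothetical marker positions on one chromosome (illustration only)
share, largest_gap = interval_summary([0, 2, 4, 9, 38])
print(share, largest_gap)
```

Reporting both numbers is useful because a chromosome can have a high share of short intervals and still contain one large gap that limits QTL resolution in that region.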

Table 2 Marker Interval of the linkage map constructed from RIL derived from a cross between Gambella and Sorcoll 163/07

Phenotypic trait analysis

The yield and yield components of the sorghum RIL population and its parents in the main season and off-season are given in Tables 3 and 4, respectively. Significant variation was observed for yield and its components during the off-season compared with the main season. Panicle length ranged from 26 to 80 cm, with a mean of 48.96 cm, during the rainy season, and from 24 to 78 cm, with a mean of 46.45 cm, during the off-season. Grain per plant also differed significantly between the two seasons; it ranged from 98.16 to 185.9 g, with a mean of 156.31 g. The values obtained for grain per plant, 1000 seed weight, panicle diameter, and panicle length were higher in the Gambella parent than in Sorcoll 163/07 in both seasons. Grain per plant was 190.7 and 187.89 g for Gambella and 100 and 88.9 g for Sorcoll 163/07 during the rainy season and off-season, respectively. In the combined season, the mean values were 27.71, 94.39, 97.28, and 35.89 for panicle length, panicle diameter, grain per plant, and thousand seed weight, respectively (Supplementary data Table 3). The mean value of each trait in the RIL population fell between those of the two parents, as expected for quantitative traits. The absolute values of skewness and kurtosis for yield and its components in the different seasons were lower than 1, consistent with a normal distribution. Therefore, this population was suitable for QTL analysis of yield.
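The normality check described above (absolute skewness and kurtosis below 1) can be sketched as follows. This Python example uses simple moment-based estimators of skewness and excess kurtosis. Statistical packages such as SAS or Minitab typically apply bias-corrected versions, so values may differ slightly. The function names and example data are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a sample."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("shape statistics are undefined for constant data")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


def roughly_normal(values, limit=1.0):
    """Rule of thumb: |skewness| and |excess kurtosis| both below `limit`."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) < limit and abs(kurt) < limit


# Hypothetical trait values (illustration only)
print(roughly_normal([1, 2, 2, 3, 3, 3, 4, 4, 5]))
```

Such a rule of thumb is only a screening step; a formal normality test would normally accompany it.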

Table 3 Yield and yield components of the sorghum recombinant inbred line population and its parents in the rainy season
Table 4 Yield and yield components of the sorghum recombinant inbred line population and its parents in the off-season

Mapping of QTL

Using ICiMapping software version 4.1, QTL for yield and its components were mapped for both the main growing season and the off-season (Meng et al. 2015). Nine QTL were detected in the combined-season analysis, which included both off-season and main-season conditions. Table 5 and Figs. 2, 3, and 4 present the interval, effect, and contribution of the additive QTL for yield and yield components in the two seasons.

Table 5 Linkage groups of quantitative trait loci detected at the Guder campus related to yield and its components (rainy season and Off-season)
Fig. 3

Detected quantitative trait loci for yield and yield component traits in sorghum at LOD threshold 2.5 (Off-season)

Fig. 4

Detected quantitative trait loci for yield and yield component traits in sorghum at LOD threshold 2.5 (Rain-season)

Panicle length

For panicle length, additive QTL were found in both the rainy season and the off-season (Table 5 and Figs. 3 and 4). One QTL was found during the off-season and two during the main season, giving three additive QTL across both seasons. These three QTL, located on chromosomes 9 and 10, had positive additive effects on panicle length ranging from 0.47 to 3.84 cm. In the combined season, two QTL were detected on chromosomes 9 and 10. One had a positive effect and the other a negative effect, with additive effects ranging from −3.48 to 3.84 cm.

Panicle diameter

Table 5 lists the additive QTL detected for panicle diameter. Two additive QTL were detected on chromosomes 8 and 9 during the main season, whereas one additive QTL was detected on chromosome 8 during the off-season (Fig. 4). Additionally, three additive QTL on chromosomes 1, 4, and 9 were detected in the combined season, all with negative additive effects on panicle diameter ranging from −3.62 to −1.36 cm (Table 5).

Grain per plant

Under both rainy and off-season conditions, three additive QTL for grain per plant were found. In the off-season, the three additive QTL were located on chromosomes 1 and 9, with additive effects ranging from −0.35 to 2.11 g, and explained 6.67 to 13.88% of the phenotypic variance (Fig. 4). In the rainy season, three QTL were likewise located on chromosomes 1 and 9, explaining 6.02 to 10.29% of the phenotypic variance, with one positive and two negative additive effects. In the combined season, three additive QTL for grain per plant were also found. These three QTL had additive effects ranging from −4.41 to 3.37 g and were located on chromosomes 2, 4, and 9. Two had positive effects and one a negative effect, with an average phenotypic variance explained of 7.67% (Figs. 5 and 6).

Fig. 5

Detected quantitative trait loci for yield and yield component traits in sorghum at LOD threshold 2.5 (Combined-season)

Fig. 6

The QTL map on the high-density linkage map with 1082 SNP markers, based on the 139 lines of the sorghum RIL population. QTL names begin with q (for QTL), followed by the trait abbreviation: PL = panicle length, PD = panicle diameter, GPP = grain per plant, and TSW = thousand seed weight

Thousand seed weight

For thousand seed weight, two additive QTL were detected in each of the main season and the off-season. These QTL, located on chromosomes 6 and 9 in both seasons, all had negative additive effects on thousand seed weight, ranging from −0.72 to −0.29 during the main season (Table 1 and Fig. 3). Similarly, two additive QTL with positive and negative effects ranging from −3.08 to 2.9 g were detected on chromosomes 1 and 9 in the combined season (Table 6).

Table 6 Linkage groups, location, and flanking markers of quantitative trait loci detected at the Guder campus related to yield and its components (Combined season)

Discussion

High-density linkage map

Combining GBS with biparental mapping is increasingly used to create high-density linkage maps and to analyze complex traits (Hirannaiah 2019; Baillo et al. 2020; Lopez et al. 2017; Mace et al. 2019; Sukumaran et al. 2016; Tefera 2019; Varoquaux et al. 2019). This study used GBS and biparental mapping to create a high-density linkage map of sorghum from two distinct parents, Gambella (sweet sorghum) and Sorcoll 163/07 (grain sorghum). A RIL population was developed, and a high-density linkage map covering 10 chromosomes with 1082 markers was created. The GBS-based approach improved marker identification and marker quality, resulting in a dense genetic map. Previous sorghum maps were less reliable because of insufficient markers and fragmented chromosome segments.

This study created a genetic map of sorghum using the full sequence of the sorghum genome. High-quality markers were distributed across the 10 chromosomes, with a total map length of 2174.76 cM and chromosome lengths ranging from 72.53 to 444.02 cM. The total length of the linkage map was greater, and the average marker distance smaller, than in previous analyses (Menz et al. 2002; Li et al. 2010). The 1082 markers used in the map were distributed across the 10 chromosomes of the sorghum genome, with chromosome 4 having the highest percentage of markers (15.34%) and chromosome 5 the lowest (5.27%) (Table 1). Chromosome 5 thus carried fewer markers than chromosome 4, which may indicate reduced diversity on chromosome 5 (Pootakham et al. 2015; Semagn et al. 2006).

The reduced marker density in parts of this bi-parental map may be due to the two parental lines used to establish the population sharing many chromosome regions that are identical by descent (Zhang et al. 2010; Guan et al. 2011; Zhang et al. 2013; Li et al. 2014; Lin et al. 2015). Genetic maps previously created in sorghum using GBS methods have a lower marker density than the map produced in this investigation, which has a total length of 2174.76 cM (Zhang et al. 2015). The distribution of SNP markers is not random, with some regions having more markers and others fewer. The majority of intervals (91.34%) were less than or equal to 5.0 cM in each linkage group, while only 11.66% of intervals across all chromosomes were longer than 5.0 cM. Chromosomes 3, 7, 8, and 10 contained gaps longer than 20.0 cM, the longest being 29 cM at the distal end of chromosome 8. More markers shared between different sorghum maps are needed to fill these gaps and achieve more thorough coverage of the sorghum genome.

Ten linkage groups, corresponding to the ten chromosomes, were constructed, and the 1082 markers were distributed among them to create the final linkage map. Linkage groups were merged so that their number matched the number of chromosomes (Li et al. 2010). Discrepancies among sorghum linkage maps may be due to incomplete marker sets, uneven marker distribution across chromosomes, or insufficient linkage between markers on different chromosome arms.

Mapping of QTL for yield and yield components during main and off-season

Water stress is a major environmental factor affecting sorghum productivity. Studies have identified quantitative trait loci for moisture stress in sorghum, but no generally stable QTL has been reported (Khangura 2019; Disasa et al. 2018). Most earlier investigations have relied on additive QTL, which makes it crucial to find stable QTL. The QTL detected for sorghum yield and yield components during the main growing season were distinct from those detected during the off-season. This could be due to different gene expression patterns between the two seasons. The additive effects of the off-season QTL for yield and yield components varied depending on the trait, consistent with the sorghum water stress breeding approach. One QTL on chromosome 9 for yield and thousand seed weight is near the QTL described in other studies (Balsalobre 2017; Menz et al. 2002).

Sorghum breeding aims to maximize performance during both the main season and the off-season. Quantitative trait loci analysis revealed that most QTL remained constant across the two seasons, indicating stable genetic expression. Some QTL were expressed differently in different seasons, consistent with previous research (Ji et al. 2017; Takele et al. 2022). This finding suggests that there is potential for targeted breeding to enhance specific traits in either the main season or the off-season. By understanding genetic variation and its seasonal expression, breeders can select for desired traits more strategically and develop sorghum varieties tailored to perform well in specific seasons. It also highlights the importance of considering both the main season and the off-season in sorghum breeding programmes, as each offers distinct opportunities for genetic improvement. With further research and advances in molecular techniques, breeders can improve sorghum productivity throughout the year and contribute to sustainable agriculture.

Conclusion

The main conclusion of the study is that genotyping-by-sequencing (GBS) technology enabled the construction of a high-density linkage map for sorghum. This linkage map provides valuable insights into the genetic architecture underlying yield and its components in sorghum. The study identified several quantitative trait loci (QTL) associated with yield-related traits, namely panicle length, panicle diameter, grain per plant, and thousand seed weight. These QTL can serve as potential targets for marker-assisted selection (MAS) in sorghum breeding programmes aimed at improving yield performance. Additionally, the study highlights the importance of incorporating genomic tools and techniques into crop improvement strategies to accelerate the development of high-yielding sorghum varieties.