Introduction

Cotton (Gossypium spp.) is an important crop grown in the United States and many parts of the world for its spinnable fiber. Known to the agricultural world as “White Gold” for its soft fluffy staple fiber, cotton is the major source of natural fiber for the textile industry. With the advent of highly sophisticated and efficient spinning technologies, there is a need to improve the quality of cotton fiber. Improvement of any crop requires diverse genetic resources. However, the worldwide Upland cotton gene pool is genetically impoverished due to a series of bottlenecks imposed by polyploid formation, domestication and migration (Paterson et al. 2004).

The narrow genetic base of cotton has been considered one of the major obstacles to cotton improvement (Paterson et al. 2004). In addition, overexploitation of a few genetic backgrounds by breeders may have contributed to a plateau of yield and fiber quality traits (Esbroeck and Bowman 1998). Such a narrow genetic base might result in a crop being highly vulnerable to stresses (McCarty et al. 1998a). A decline in the genetic diversity of Upland cotton, and the need to broaden the genetic base of cotton germplasm useful for the improvement of lint yield, fiber quality and tolerance to biotic or abiotic stresses, have been widely reported (Bowman 2000; Esbroeck et al. 1999; Gutierrez et al. 2002; Iqbal et al. 2001).

One solution to the problem of cotton genetic vulnerability might be the exploration of exotic genotypes (Paterson et al. 2004). Incorporating favorable alleles, genes or gene complexes from wild relatives or accessions has been a high strategic priority for practical improvement of crops like rice, wheat, maize, barley, sorghum, millet, cassava, potato, chickpea, cowpea, lentil, soybean, bean, pigeon pea, banana and groundnut (Hajjar and Hodgkin 2007). Research has shown that primitive accessions of cotton are highly diverse and have useful genetic variability (Meredith 1991; McCarty et al. 1995, 1998a, b; McCarty and Jenkins 1992, 2001). In practice, however, utilization of genetic variability from primitive cotton accessions has been limited due to their photoperiod response. Conversion of primitive, usually short-day flowering, genotypes into day-neutral forms by repeated backcrossing has been undertaken to reduce the obstacles to using this germplasm (McCarty and Jenkins 2001).

The development of molecular markers has accelerated the process of selection and improvement of traits of interest. Conventional breeding has played an important role in the improvement of yield and fiber quality in Upland cotton. The progress made by such conventional schemes could be supplemented and expedited by the use of molecular markers. Since the first restriction fragment length polymorphism (RFLP) linkage map of cotton was reported (Reinisch et al. 1994), many genetic maps have been developed for Gossypium intra-specific and inter-specific populations (Zhang et al. 2002; Wright et al. 1999; Ulloa et al. 2002; Shen et al. 2005, 2007; Qin et al. 2009; Jiang et al. 1998, 2000; Muhammad et al. 2011; Ulloa and Meredith 2000; Wang et al. 2006), and quantitative trait loci (QTLs) linked to different fiber quality, yield and yield attributes, and other agronomic and economic traits have been mapped. Identification of molecular markers linked to these QTLs or traits of interest may accelerate the selection and breeding of these traits.

In this study, we have selected microsatellite markers from “hotspots” for fiber quality traits on the basis of meta-analysis of QTLs in cotton (Rong et al. 2007) and a whole genome marker map of cotton based on the D-genome sequence of G. raimondii (Wang et al. 2013). Association of these markers with six different fiber quality traits was studied using a number of populations and significant associations are reported.

Materials and methods

Plant materials

We selected three converted exotic accessions, MDN063, MDN101 and MDN257, based upon their phenotypes for yield, lint percentage, boll size, micronaire, 2.5% span length (fiber length), elongation, and strength (Table 1). MDN063 was converted from race T0063, which was collected in the state of Chiapas, Mexico. It contributed large and positive additive genetic variation to progeny of crosses with Deltapine 16 (DPL 16) for increased boll size, reduced micronaire (indicative of finer fiber) and greater fiber elongation (Table S1). It ranked first among 79 lines tested for lowest micronaire and its progeny also ranked first among 79 progeny populations tested. It was also highly ranked for boll size and fiber elongation.

Table 1 Origin and characteristics of day-neutral converted exotic G. hirsutum genotypes

MDN101 was converted from race T101, which was collected in the state of Jutiapa, Guatemala. It contributed large and positive additive genetic variation to progeny of crosses with DPL16 for increased fiber length, higher lint percentage, increased fiber strength, and reduced plant height (Table S1). Its progeny ranked first of the 79 populations tested for a “selection index” reflecting yield, fiber strength and lint percentage (McCarty et al. 1996). MDN257 was converted from race T257, which was collected in the state of Oaxaca, Mexico. It ranked first among 79 progenies tested for fiber length and its progeny also ranked first among 79 progeny populations tested. Although this genotype is below average for some key attributes (Table S1), it brings additional botanical and geographic diversity to the sampling of genotypes. Four elite US genotypes, PD94042, DES56, PMHS200 and Acala Maxxa, respectively sampling the Eastern, Delta, Plains and Acala production regions of the US cotton genetic pool, were evaluated for effects of QTLs from the exotic lines.

Development of experimental populations

Crosses were made in 2010–2011 in the greenhouse between the day-neutral exotic lines (MDN063, MDN101 and MDN257) and four elite Upland cotton genotypes (DES56, Acala Maxxa, PD94042 and PMHS200) in different combinations (Fig. 1). Five populations were created in 2010 and their corresponding F2 and F2:3 progenies were planted in the field in 2011 and 2012, respectively. Eight populations were created in 2011 and their corresponding F2 and F2:3 progenies were planted in the field in 2012 and 2013, respectively. In addition, there were two planting dates in 2011 for F2 progenies of three populations (MDN101 × PD94042, MDN101 × DES56 and MDN063 × DES56) created in 2010 (Fig. 1). The first planting was done during the last week of May and the second planting was done in the first week of July, both by transplanting of small seedlings. Seeds were planted in the greenhouse and transplanted to the field in 2011 to avoid excessively dry field conditions and to obtain a sufficient number of F2 plants. In 2012, however, field conditions were favorable and hence F2 seeds were planted directly in the field. Phenotypic measurements were taken from all F2 and F2:3 progenies over those 3 years (2011–2013). Bolls were hand harvested and collected separately for individual crosses. All samples were hand-ginned and delinted.

Fig. 1

Development of experimental populations. a List of populations created in 2010 with their corresponding F2 and F2:3 progenies in 2011 and 2012. The numbers of F2 and F2:3 progenies are shown in brackets. PD:1 and PD:2 represent the first and second planting dates, respectively. b List of populations created in 2011 with their corresponding F2 and F2:3 progenies in 2012 and 2013

The F1 progenies and the parents were planted at The University of Georgia Plant Sciences Farm, Watkinsville, Georgia. As many flowers as possible from each individual F1 plant were self-pollinated, and DNA of F1 plants was checked with several microsatellite markers to verify hybridity. Selfed bolls from verified F1 plants were hand collected separately for F2 seed. Selfed bolls from the F2 plants were also collected individually to obtain F2:3 seed. F2:3 progeny plants were grown in 2012–2013 by planting 10 F2:3 seeds in unreplicated plots each 1 m in length. Replications of all parental genotypes were included in each field trial. Cultivation practices were consistent with regional recommendations, including irrigation, fertilization and pesticide application.

Sample collection and data analysis

Fiber samples from F2 plants comprised up to 25 open-pollinated bolls from individual plants. For the F2:3 progeny plots, up to 25 bolls were collected to sample all the plants in each plot. After ginning, samples were sent to Cotton Inc. (Cary, NC) for testing of six fiber quality traits [micronaire (MIC), upper half mean length or fiber length (UHM), fiber uniformity index (UI), fiber strength (STR), fiber elongation (ELO) and short fiber content (SFC)]. Data were analyzed using R statistical software. Single marker analyses (SMA) for fiber quality traits were done using Windows QTL Cartographer 2.5 (Wang et al. 2012).
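As an illustration of what a single marker analysis tests, the sketch below regresses a trait on marker genotype and reports the slope and the proportion of phenotypic variance explained (R²). It is not the Windows QTL Cartographer implementation. The 0/1/2 genotype coding (copies of the exotic-parent allele at a codominant SSR locus), the function name and the example values are all assumptions made for this example, and no P value is computed.

```python
def single_marker_regression(genotypes, phenotypes):
    """Least-squares regression of a trait on marker genotype.

    genotypes:  0, 1 or 2 copies of the exotic-parent allele (assumed coding);
                None marks a missing score.
    phenotypes: trait values for the same plants, in the same order.
    Returns (slope, r_squared).
    """
    # Keep only plants with both a genotype and a phenotype.
    pairs = [(g, p) for g, p in zip(genotypes, phenotypes)
             if g is not None and p is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three scored plants")

    mean_g = sum(g for g, _ in pairs) / n
    mean_p = sum(p for _, p in pairs) / n
    sxx = sum((g - mean_g) ** 2 for g, _ in pairs)
    syy = sum((p - mean_p) ** 2 for _, p in pairs)
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("marker or trait does not vary in this sample")

    slope = sxy / sxx                # trait change per exotic allele
    r_squared = sxy ** 2 / (sxx * syy)  # fraction of variance explained
    return slope, r_squared


# Hypothetical scores for six F2 plants (not data from this study).
marker = [0, 0, 1, 1, 2, 2]
strength = [30.0, 30.2, 30.1, 30.3, 30.2, 30.4]
slope, r2 = single_marker_regression(marker, strength)
print(f"effect per allele: {slope:.3f}; variance explained: {r2:.1%}")
```

Under this coding, a positive slope means the exotic-parent allele increases the trait value; whether that is favorable depends on the trait (for example, lower MIC and lower SFC are preferred).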

The number of marker loci tested for association with phenotypes was different in different populations. For each trait in each population, there were the same number of statistical tests as the tested loci. Under such conditions, the experiment-wise Type I error rate would be much higher than the nominal significance rate of any single test (0.05). To mitigate this multiple comparison problem, Bonferroni correction (Significant P value ≤ 0.05/N, where N is the number of statistical tests/amplified loci used in each population) was used to obtain an appropriate significance threshold (Table S7). Associations surpassing this threshold were considered significant associations and those that surpassed the nominal significance rate of 0.05 but could not reach the more stringent Bonferroni corrected threshold were considered nominal associations.
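The thresholding rule described above can be sketched as follows. The function name and the example value of N are hypothetical (the actual per-population numbers of tests are in Table S7); only the rule itself, P ≤ 0.05/N for significant and P < 0.05 for nominal, is taken from the text.

```python
def classify_association(p_value, n_tests, alpha=0.05):
    """Label a marker-trait association using a Bonferroni threshold.

    n_tests is the number of loci tested for the trait in one population.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")

    bonferroni_threshold = alpha / n_tests
    if p_value <= bonferroni_threshold:
        return "significant"      # survives Bonferroni correction
    if p_value < alpha:
        return "nominal"          # below 0.05 but above the corrected threshold
    return "not significant"


# Hypothetical example: 20 loci tested in one population, so the
# corrected threshold is 0.05 / 20 = 0.0025.
for p in (0.0017, 0.03, 0.2):
    print(p, classify_association(p, n_tests=20))
```

Because N differs among populations, the same P value can be significant in one population and only nominal in another.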

In many cases, Bonferroni correction is too stringent, especially when the number of hypotheses to be tested is high. It is a useful tool to reduce Type I error (false positives), but it also increases the chances of false negatives. We have therefore referred to those associations that could not reach the cut-off level as nominal associations. This allows us to better compare and contrast associations across multiple backgrounds, planting dates and growing seasons/years to reflect QTL stability. Furthermore, our study explores the ‘hotspots’ for fiber quality QTLs using highly diverse sets of parents. As such, both significant and nominal associations could be informative in explaining the contributions of the exotic parents to elite genetic backgrounds. Most of the focus of this paper remains on the associations meeting Bonferroni significance; however, nominal associations are also discussed when they provide support regarding stability and replicability of the potential QTLs.

DNA extraction and SSR assays

Genomic DNA was isolated from young unopened leaves from each F2 plant using a modified cetyltrimethylammonium bromide (CTAB) method (Paterson et al. 1993). A total of 1–2 g of fresh tissue was ruptured and ground in an Omni Bead Ruptor 24 Homogenizer (Omni International) in a 1:1 mixture of cotton lysis buffer and extraction buffer. After leaving the tubes in a 65 °C water bath for about an hour, the extraction materials were purified twice with 800 µl of chloroform-isoamyl alcohol. DNA was then precipitated with 500 µl of isopropanol, cleaned with two washes of 75% ethanol (500 µl each) and centrifuged. The clean dried DNA was dissolved in 200 µl of TE buffer. The extracted DNA was checked for quality and quantity with a NanoDrop 2000 (Thermo Scientific) and stored at −20 °C.

We selected eighteen “hotspot” regions from the cotton genome based on their richness in fiber quality QTLs (Table S6) to select microsatellite markers for genotyping of the populations. A total of 720 microsatellite primer pairs from these regions were used to screen for polymorphism among the parents of the mapping populations. PCR was performed in 96- and 384-well plates. Reaction mixtures contained 1 μL each of MgCl2 (3 mM), dNTPs (2 mM), and 10× PCR buffer (100 mM Tris–HCl at pH 9, 500 mM KCl, and 15 mM MgCl2), 0.2 μL of Taq DNA polymerase (1 U), 3.8 μL of ddH2O, and 2 μL of template DNA (15 ng/μL) to constitute a total volume of 10 μL. Typical cycling conditions for PCR were 95 °C for 4 min in the first step, followed by six cycles of 94 °C for 40 s, a gradient reaction from 58 to 52 °C each for 1 min, and 72 °C for 1 min. The third step was 35 cycles of 94 °C for 40 s, 55 °C for 1 min, and 72 °C for 1 min. After the last cycle, reactions were incubated at 72 °C for a final extension period of 10 min before cooling to 4 °C. Amplicons from PCR were resolved by 10% nondenaturing polyacrylamide gel electrophoresis (PAGE) and visualized by staining with silver nitrate (Bassam and Gresshoff 2007).

Results

Performance of parents, F2 and F2:3 progenies

The mean parental values for six fiber quality traits evaluated in this study are shown in Supplemental Table S2. Parents displayed wide variation for these traits over 3 years (Figs. 2, 3; Table S2). MDN101, which was selected for UHM, showed better fiber lengths than DES56 and PMHS200 in 2011. Likewise, MDN063, selected for MIC and ELO, showed better fiber elongation than DES56 in both 2011 and 2012. It also showed better performance than Acala Maxxa for MIC in 2011. The mean F2 and F2:3 values for different fiber quality traits in our experimental populations are presented in Supplemental Tables S3 and S4, respectively. The absolute values of skewness for all six fiber quality traits in all the populations were less than 1 (not shown), indicating approximately normal distributions of the traits.
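The skewness check mentioned above can be sketched as follows. The estimator used in the study is not stated; this example assumes the moment-based (Fisher-Pearson) coefficient, and the function names and the cut-off argument are hypothetical.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("all observations are identical")
    return m3 / m2 ** 1.5


def passes_skewness_check(values, limit=1.0):
    """True when |skewness| is below the limit (1, as in the text above)."""
    return abs(skewness(values)) < limit
```

An absolute skewness below 1 is a rule of thumb for rough symmetry rather than a formal test of normality.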

Fig. 2

Distribution of populations (F2 generation) created in 2010 for six different fiber quality traits. Y-axis shows the range for each trait and x-axis shows different populations: (1) MDN101 × PD94042, (2) MDN101 × DES56, (3) MDN101 × PMHS200, (4) MDN101 × Acala Maxxa, (5) MDN063 × DES56. Average phenotypic values are shown by arrows for exotic parents (MDN101 or MDN063) and wedges for elite parents (PD94042, DES56, PMHS200 or Acala Maxxa)

Fig. 3

Distribution of populations (F2 generation) created in 2011 for six different fiber quality traits. Y-axis shows the range for each trait and x-axis shows different populations: (1) MDN101 × PD94042, (2) MDN101 × DES56, (3) MDN101 × PMHS200, (4) MDN101 × Acala Maxxa, (5) MDN063 × DES56, (6) MDN063 × Acala Maxxa, (7) MDN257 × DES56, and (8) MDN257 × Acala Maxxa. Average phenotypic values are shown by arrows for exotic parents (MDN101, MDN063 or MDN257) and wedges for elite parents (PD94042, DES56, PMHS200 or Acala Maxxa)

The genetics and correlation among fiber quality traits

Our results revealed strong positive correlation between UHM and UI; UHM and STR; and UI and STR, for all individual populations (data not shown) as well as when data from all populations were pooled (Table 2). MIC and UHM were negatively correlated. SFC% was negatively correlated with all other fiber quality traits for all the populations. A strong negative correlation between UI and SFC was revealed in all the populations.
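For readers who wish to reproduce this kind of trait-by-trait comparison, a minimal sketch of a correlation coefficient is given below. The text does not state which coefficient was used; Pearson's r is assumed here, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two traits measured on the same plants."""
    if len(xs) != len(ys):
        raise ValueError("both traits need the same number of observations")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a trait with no variation has no defined correlation")
    return sxy / math.sqrt(sxx * syy)
```

Applied to UI and SFC values from one population, a value close to −1 would correspond to the strong negative correlation described above.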

Table 2 Correlation among six fiber quality traits (pooled)

We also calculated the heritability for the fiber quality traits on the basis of regression of trait values of F2:3 progenies on trait values of corresponding F2 progenies. Medium to highly significant positive regression values (0.38–0.80) were found for all traits except UI (0.26–0.35) in all the populations (Table 3).
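The offspring-on-parent regression described above can be sketched as follows: each F2 plant is paired with the mean of its F2:3 progeny plot, and the least-squares slope is taken as the regression value. The function name and example numbers are hypothetical, and any further rescaling of the slope is not applied here because the text does not specify one.

```python
def offspring_parent_slope(f2_values, f23_means):
    """Slope of the regression of F2:3 plot means on F2 plant values."""
    if len(f2_values) != len(f23_means):
        raise ValueError("each F2 plant needs one matching F2:3 plot mean")
    n = len(f2_values)
    if n < 3:
        raise ValueError("need at least three F2 / F2:3 pairs")
    mean_x = sum(f2_values) / n
    mean_y = sum(f23_means) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_values)
    if sxx == 0:
        raise ValueError("F2 values do not vary")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_values, f23_means))
    return sxy / sxx


# Hypothetical fiber length values (mm) for four F2 plants and their
# F2:3 plots; these are not data from this study.
f2_uhm = [28.0, 29.0, 30.0, 31.0]
f23_uhm = [28.5, 29.0, 29.5, 30.0]
print(round(offspring_parent_slope(f2_uhm, f23_uhm), 2))
```

A slope near 1 means that differences among F2 plants are largely transmitted to their progeny, whereas a slope near 0 means they are mostly environmental.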

Table 3 Heritability of different fiber quality traits based on regression of F2:3 progenies on F2 progenies

Genomic distribution of polymorphic SSR markers

We screened 720 SSR primer pairs selected from eighteen “hotspot” genomic regions based on the whole genome marker map of cotton based on G. raimondii (Wang et al. 2013). Among the 720 SSR primer pairs tested for polymorphism, 113 (15.69%) were polymorphic between different parental combinations, listed in Table S5. Polymorphic markers were identified for all hotspots except “hotspot VIII”. For two QTL hotspots (III and XV), only one polymorphic SSR marker was identified. For all other hotspots, multiple polymorphic markers were identified.

Association of polymorphic markers with fiber quality traits

A total of 134 marker-trait associations were identified for the six fiber quality traits (Table S6). A total of 15 of these associations remained significant after Bonferroni correction (Table 4). Two significant associations were identified for MIC, explaining 12–16% of total phenotypic variance. Similarly, three significant associations were identified for UHM (18–33%); four for UI (13–46%); one for STR (10%); three for ELO (40–47%) and two for SFC% (19–37%).

Table 4 Markers significantly associated with different fiber quality traits and their contribution to phenotypic variation

Twenty-three, seventeen, twelve, twenty, nineteen and twenty-four nominal associations were identified for MIC, UHM, UI, STR, ELO and SFC% respectively. The proportion of phenotypic variance explained by these associations ranged from 2 to 45%. Among nominal associations that did not meet the Bonferroni significance threshold, some gained support from multiple independent discoveries (Table S6). For example, the association of BNL3359_270 with MIC was identified in the same population (MDN101 × Acala Maxxa) in both 2011 and 2012. Similarly, NAU5120_450 was associated with UI for both 2011 planting dates in the population MDN063 × DES56. Some of these associations were identified in multiple populations. For example, NAU1042_270 was associated with UHM in two populations, MDN101 × PD94042 and MDN101 × PMHS200 in 2011; while MUSB1020_650 was associated with STR in two populations, MDN063 × DES56 and MDN063 × Acala Maxxa in 2012.

Associations across planting dates, years and genetic backgrounds

Some associations were stable across planting dates, years and/or genetic backgrounds (Table 5). Marker locus DPL0279_190 was associated with MIC for the population MDN101 × Acala Maxxa in both 2011 and 2012, albeit both of these associations were only nominal. Likewise, NAU1366_450 was strongly linked with ELO for MDN101 × DES56 in both 2011 and 2012. Marker locus NAU5120_450 was strongly linked with UI and SFC in the population MDN063 × DES56 for two planting dates in 2011. The association of NAU5120_450 with UI survived the Bonferroni correction for the second planting date.

Table 5 Significance of marker-trait associations by year, genetic background, planting dates and fiber quality traits

Alleles from the exotic parent MDN101, the only one tested in all four cultivar backgrounds, were significant for different traits in different backgrounds (Table 5). Marker locus NAU3820_120 from the exotic parent was significant in two genetic backgrounds PMHS200 (nominally) and DES56 (after Bonferroni correction) for STR. Similarly, NAU1042_220 from the same exotic parent was nominally associated with UHM in PMHS200 and PD94042 backgrounds. Likewise, BNL1317_230 from the same exotic parent was nominally associated with STR in two different genetic backgrounds DES56 and PMHS200.

Some marker loci were significantly associated with multiple traits in the same population (Table 5). For example, BNL1317_220 was significantly associated with UHM, STR and SFC for MDN101 × PMHS200 in the year 2012. Similarly, BNL3359_390 was significantly associated with UHM, UI, STR and SFC for MDN257 × DES56 in the year 2012. Likewise, DPL0270_150 was significantly associated with ELO, MIC, UHM and UI for MDN101 × DES56 in the year 2011. Among these associations, only the association of DPL0270_150 with UHM in the MDN101 × DES56 population survived Bonferroni correction, the rest being only nominally significant.

Discussion

Knowledge of the genetic diversity within a crop gene pool is of great importance. The probability of recovering superior genotypes in segregating generations is greater when combinations of parents having different complementary alleles are taken as a basis. Obtaining such superior genotypes requires sufficient diversity between parents to obtain favorable new allele combinations (Rotili et al. 2012), but not so much diversity that favorable alleles come with unfavorable linkage drag. In different crops, this ‘balance’ may involve the use of different types of crosses, with appreciable evidence in cotton suggesting that overexploitation of a few genetic backgrounds by breeders may have contributed to a plateau of yield and fiber quality traits (Esbroeck and Bowman 1998), motivating exploration of more exotic germplasm.

Use of diverse parents

In our study, diverse parental lines were used to create experimental populations. Parents not only showed diverse phenotypes but also differed by more SSR marker polymorphisms than most elite cottons. Among the 15 associations meeting the Bonferroni threshold for significance, four involved favorable alleles from exotic parents for different fiber quality traits. Exotic parents generally showed positive additive effects for traits for which they were selected based on prior information (McCarty et al. 1996). For example, MDN101 showed positive additive effects for STR and MDN257 for UHM. Furthermore, positive additive effects (Bonferroni significant) were observed not only for the phenotypes for which the parents were selected, but also for other phenotypes scored in our study. MDN101, which was selected for UHM and STR, contributed favorable alleles for ELO. Exotic parents also contributed favorable alleles to multiple genetic backgrounds; some of these associations were nominal and some met the Bonferroni standard. Marker locus NAU3820_120, which was contributed by MDN101 for STR, met the Bonferroni standard in the DES56 background while it was nominal in the PMHS200 background. Although not all the favorable alleles met the Bonferroni standard, evidence that some exerted favorable effects in multiple backgrounds reduces the likelihood that they are false positives, and indicates the usefulness of the parents in those backgrounds for different fiber quality traits.

Use of diverse parents allowed us to identify large numbers of nominal associations between polymorphic SSR markers and the six fiber quality traits. Of the 134 associations identified in the study, there were 67 (50%) in which favorable alleles were from the exotic parent. While only four of the 67 favorable alleles survived the Bonferroni correction, these results support the hypothesis that exotic G. hirsutum lines confer favorable alleles to elite cottons that can contribute to fiber quality improvement. Moreover, one would postulate that exotic alleles may also improve other traits of agronomic interest, like disease resistance, insect tolerance or drought resistance, which might have eroded from the cultivated gene pool of cotton owing to decades of selection for high yield and fiber quality.

Contribution of exotic parents

The three exotic parents (MDN063, MDN101 and MDN257) used in our study were selected based on the additive genetic variance they contributed to crosses with DPL16 for various traits (Table S1, summarized from McCarty et al. 1996). Our results show that these parents contributed positive additive effects across different genetic backgrounds not only for the fiber quality traits they were selected for, but also for other fiber quality traits (Tables 4, S6).

MDN063, selected for its positive effect on reduced (improved) MIC and increased ELO in crosses with DPL16, contributed favorable alleles (albeit only nominal) for these two traits in crosses with DES56 and for MIC in the cross with Acala Maxxa, supporting the selection of this parent for these traits. It also contributed favorable alleles (nominal) for other traits like STR, UI and SFC% (in crosses with DES56 and Acala Maxxa) and for UHM in crosses with DES56. Of the 19 associations where this parent was the source of favorable alleles, 15 were in the DES56 background. This parent therefore looks promising for introducing useful alleles into this particular background.

MDN101 was selected for UHM and STR as it contributed positive additive genetic variance for these traits in crosses with DPL16. Our results show that it contributed positive additive effects for UHM in crosses with PMHS200 and for STR in crosses with PD94042, DES56 and PMHS200, supporting the contribution of this parent to the selected traits. Among the favorable alleles contributed by this parent, the association of NAU3820_120 with STR and the association of NAU5465_220 with ELO met the Bonferroni standard for statistical significance in the DES56 background. In addition, it showed positive additive effects in nominal associations for fiber elongation in all four genetic backgrounds and for fiber fineness (in crosses with PD94042, DES56 and Acala Maxxa), fiber uniformity ratio (in the cross with DES56) and for short fiber content across all four backgrounds. Of the 35 associations in which this parent was the source of favorable alleles, 13 were in the DES56 background, 9 in the PMHS200 background, 7 in the Acala Maxxa background, and 6 in the PD94042 background. Thus, this parent shows promise for introducing favorable alleles into all four genetic backgrounds tested here to represent the US Upland cotton gene pool.

MDN257 was selected for UHM for its positive additive effect in crosses with DPL16. Two favorable alleles from this parent, DPL0378_510 and JESPR37_1000, which showed significant association with UHM (surviving the Bonferroni correction), contributed positive additive effects for this trait. In addition, it showed nominal positive additive effects for fiber elongation and short fiber content (in crosses with DES56 and Acala Maxxa) and for fiber fineness, fiber strength and fiber uniformity ratio in the cross with DES56. Of the 13 associations in which this parent was the source of favorable alleles, 10 were in the DES56 background. Thus, this exotic parent also shows promise for introducing favorable alleles into DES56 and its many derived breeding lines and cultivars.

Performance of elite parents

Elite parents contributed favorable alleles for the six fiber quality traits in 11 of the 15 significant associations and 67 of 134 nominal associations. DES56, which was involved in the largest number of crosses, was involved in 11 of the 15 significant associations and contributed favorable alleles in 7 of these 11 associations. PMHS200 and Acala Maxxa contributed favorable alleles in 2 significant associations each. PD94042 was not involved in any significant associations. All elite parents, however, contributed favorable alleles in nominal associations.

Distribution of significant associations

Among the 15 marker-trait associations that met the Bonferroni significance threshold, 12 included the exotic parent MDN101, wherein the parent contributed favorable alleles in 4 cases. Two significant associations included the exotic parent MDN257, with the parent contributing favorable alleles in both cases; and one significant association included the exotic parent MDN063 where the parent did not contribute any favorable alleles. For 10 of the 15 significant associations, the associated marker loci were located in the D sub-genome and 5 were located in the A sub-genome. Similarly, 3 of the 4 favorable marker alleles were located in the D sub-genome and only one in the A sub-genome. While the sample sizes are small, these patterns continue to support the hypothesis that the D sub-genome, from an ancestor that did not produce spinnable fiber, nonetheless makes an important contribution to the fiber quality of elite tetraploid cottons (Rong et al. 2007; Jiang et al. 1998).

The 15 significant associations were distributed among nine of the eighteen QTL hotspots. Five of these QTL hotspots (I, VI, VIII, XIII, XVI) fall in the D sub-genome while the rest (IV, XI, XIV, XVII) fall in the A sub-genome. The largest number of associations was identified for hotspot XIII, where UI, ELO and SFC were significantly associated with NAU5120_170. This hotspot has been reported to contain QTLs for MIC, UHM, UI, STR, ELO and SFC (Wang et al. 2013; Rong et al. 2007). Two significant associations were identified in hotspot IV (A sub-genome), which has been reported to contain QTLs for UHM, UI, STR and SFC. One of the associations identified in this hotspot was for UI, corroborating previous results, while the other was for ELO, adding to the rich cluster of fiber quality QTLs. Findings in all other hotspots supported those reported in Rong et al. (2007) and Wang et al. (2013). Thus, exploration of fiber quality QTL hotspots with selected polymorphic markers in a diverse set of populations supported previous findings and added to the diversity of fiber quality QTLs reported in these hotspots.

We also performed a BLAST search for all the associated markers (both significant and nominal) against the “Combined Gossypium Gene, CDS and Unigene” database on the cottongen.org website. The standard nucleotide BLAST (blastn) integrated into the website was used to search each marker sequence against the database. A strict E-value threshold of 1e-10 was used to look for matches and the best hit was selected. Seventeen markers were found in or near genic regions. Most of these regions were annotated against the G. raimondii genome assembly. The list of all hits, along with the gene name, gene code, genomic position, BLAST score and E-value, is presented in Table S8.
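The hit-filtering step can be sketched as follows, reading the stated threshold as an E-value of 1e-10. The criterion for "best hit" is not specified in the text; this example assumes lowest E-value, with ties broken by highest bit score. The tuple layout, function name and example identifiers are hypothetical and do not reflect the CottonGen output format.

```python
def best_hits(hits, max_evalue=1e-10):
    """Keep one best hit per marker among hits passing the E-value cut-off.

    hits: iterable of (marker, subject, evalue, bitscore) tuples.
    Returns {marker: (subject, evalue, bitscore)}.
    """
    best = {}
    for marker, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue  # fails the significance threshold
        current = best.get(marker)
        # Prefer the lower E-value; break ties with the higher bit score.
        if current is None or (evalue, -bitscore) < (current[1], -current[2]):
            best[marker] = (subject, evalue, bitscore)
    return best


# Hypothetical hits for two markers (not results from this study).
example = [
    ("marker_1", "gene_A", 1e-20, 100.0),
    ("marker_1", "gene_B", 1e-30, 150.0),
    ("marker_2", "gene_C", 1e-5, 40.0),
]
print(best_hits(example))
```

Markers with no hit at or below the cut-off are simply absent from the result.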

Reproducibility of associations

Of the 134 nominal associations identified in our study, some (including significant associations) were identified in multiple populations (Table S6). For example, the association of BNL1317_230 with ELO was identified in two populations—MDN101 × PD94042 and MDN101 × PMHS200. Similarly, the association of NAU2152_240 with MIC was identified in populations MDN063 × DES56 and MDN257 × DES56. SFC% was associated with NAU5120_450 in MDN101 × Acala Maxxa as well as in MDN063 × DES56. The same marker locus was also associated with UI in these two populations and these associations were significant after Bonferroni correction. Similarly, NAU1042_270 was associated with UHM in populations MDN101 × PD94042 and MDN101 × PMHS200. These results highlight some specific QTLs that show promise of reproducibility across multiple genetic backgrounds relevant to multiple US production regions.

The present study adds further to knowledge of cotton fiber quality “QTL hotspots” identified largely based on interspecific crosses, in particular showing that some account for intraspecific variation in Upland cotton (G. hirsutum). For example, our study built on numerous prior findings including marker trait association with UHM on chromosome D08 (Shen et al. 2005, 2007; Shao et al. 2014; Qin et al. 2015; Chen et al. 2009; Cai et al. 2014); for UHM on chromosome A03 (Qin et al. 2009; Paterson et al. 2003; Jiang et al. 1998; Cai et al. 2014); for ELO on chromosome 5 (Tang et al. 2015; Qin et al. 2009; Paterson et al. 2003) and chromosome 26 (Tang et al. 2015); for STR on chromosome 14 (Tan et al. 2015; Paterson et al. 2003); for UI and SFC% on chromosome 26 (Wang et al. 2015; Tang et al. 2015; Sun et al. 2012); for SFC% on chromosome 23 (Shao et al. 2014); MIC on chromosomes 5 and 16 (Yang et al. 2015; Tan et al. 2015).

Stability of the associations

Although the number of significant associations was small following the highly stringent Bonferroni correction, some associations (significant or nominal) showed stability across planting dates, years or genetic backgrounds. Of special interest are stable associations in which the exotic parent contributed a favorable allele to the elite background. For example, the association of STR with an allele of marker locus NAU3820_120, contributed by the exotic parent MDN101, was significant in the DES56 background (P value of 0.0017) and nominal in the PMHS200 background (P value of 0.03). The same parent also contributed positive and favorable alleles in association with UHM in two different genetic backgrounds, PMHS200 and PD94042. MDN101 contributed favorable alleles to all four cultivars representing the respective US cotton production regions, showing promise for introducing genetic diversity and favorable alleles of potential value across the US Cotton Belt.

Some associations were stable across multiple years, while others were stable across different planting dates. For example, the association of NAU5120_450 with both UI and SFC% was identified in the population MDN063 × DES56 in both 2011 planting dates. DPL0279_190 was associated with MIC in the population MDN101 × Acala Maxxa in both 2011 and 2012. Stability of associations over planting dates or years provides early evidence of their stability across variable environments. However, more testing is warranted; for example, MIC, which reflects a combination of fiber maturity and fineness, can be highly influenced by the environment, plant spacing, and boll sampling techniques.

Implications in cotton breeding

The exotic parents displayed a wide range of phenotypic performance among themselves and in combination with the elite cultivars. While some exotic parents performed better than the elite parents for a few traits (MDN101 had better UHM than DES56 and PMHS200 in 2011; MDN063 had better ELO than DES56 in both 2011 and 2012), elite parents performed better for most traits in most years. Nevertheless, exotic parents contributed favorable alleles (both significant and nominal) for all fiber quality traits.

Medium to high regression coefficients (0.38–0.80) were identified for all six fiber quality traits in most populations, indicating high heritability of these traits and the suitability of early-generation evaluation of these populations for identifying diagnostic DNA markers. Thus, we screened the F2 populations with about 113 polymorphic microsatellite markers selected from hotspots for fiber quality QTLs and identified significant as well as nominal associations. In exploring selected genomic regions in populations created using diverse and exotic parents, we were able to identify associations in the selected hotspot regions. These associations further support many previous findings (Rong et al. 2007; Said et al. 2013; Wang et al. 2013) that fiber quality QTLs are clustered rather than randomly distributed in the cotton genome.
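As an illustration of how a regression coefficient of this kind can be computed, the following minimal Python sketch regresses progeny means on parental values by ordinary least squares. It assumes a regression of F2:3 family means on the corresponding F2 plant values, a form that this section does not spell out. The function name `regression_slope` and all trait values are hypothetical and are not data from the study.

```python
def regression_slope(parent_values, offspring_means):
    """Ordinary least-squares slope of offspring means on parent values.

    In a parent-offspring regression, this slope is commonly used as an
    indicator of heritability for the trait.
    """
    n = len(parent_values)
    if n != len(offspring_means):
        raise ValueError("inputs must have the same length")
    if n < 2:
        raise ValueError("at least two parent-offspring pairs are required")
    mean_x = sum(parent_values) / n
    mean_y = sum(offspring_means) / n
    # Sum of squares of x and sum of cross-products of x and y.
    ss_xx = sum((x - mean_x) ** 2 for x in parent_values)
    ss_xy = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(parent_values, offspring_means))
    if ss_xx == 0:
        raise ValueError("parent values show no variation")
    return ss_xy / ss_xx


# Hypothetical fiber length values (mm) for five F2 plants and the
# means of their F2:3 progeny rows; illustrative only.
f2_plants = [26.4, 27.1, 27.9, 28.6, 29.8]
f23_means = [27.0, 27.2, 27.9, 28.1, 29.0]

slope = regression_slope(f2_plants, f23_means)
print(f"Regression coefficient: {slope:.2f}")
```

A slope near 1 would indicate that differences among parents are largely transmitted to their progeny, whereas a slope near 0 would indicate that selection on single-plant values in an early generation is unlikely to be effective.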

Even with small numbers of markers from the fiber QTL hotspots, we identified large numbers of associations for different fiber quality traits in selected regions of the cotton genome. Increasing the number of DNA markers within these regions, as well as selecting markers throughout the genome, could result in the identification of larger numbers of significant associations. Comparing associations identified through markers targeted to certain regions with associations identified through markers selected evenly from the whole genome would also allow us to test the hypothesis that the non-random concentrations of QTLs found in interspecific cotton crosses (Rong et al. 2007) translate well to these intraspecific crosses. Progress towards low-cost discovery of SNPs (Kim et al. 2015) may make this goal realistic in the near future.

Conclusion

In this study, three exotic G. hirsutum accessions converted to day-neutral flowering were used as parents in crosses with four elite US cultivars to develop experimental populations in two different years in different combinations. We studied the F2 and F2:3 generations of these populations for six fiber quality traits and used 113 microsatellite markers selected from 18 QTL “hotspot” regions in the cotton genome to identify significant associations of the primer pairs with fiber quality traits. Our results showed useful contributions of the exotic parents, across the different genetic backgrounds used in this study, both for the fiber quality traits for which they were selected and for other fiber quality traits that were scored. Some significant and some nominal associations were identified in different populations, in different years and across different backgrounds for the same exotic parents. These associations could represent potential QTLs for the selected traits in selected cotton production regions. By increasing the marker density and using interval mapping in these regions, potential QTLs with significant effects could be identified.