Introduction

The genus of Gossypium (cotton) consists of at least 45 diploid and 5 allotetraploid species grouped into nine genome types with the designations A, B, C, D, E, F, G, K, and AD (Campbell et al. 2010; Fryxell 1979; Percival et al. 1999; Wendel et al. 2009). The AD-genome tetraploid cottons originated from the hybridization between A-genome and D-genome diploid cottons approximately 1–2 million years ago (MYA) (Beasley 1940; Cronn et al. 1999; Wendel and Cronn 2003). Within the Gossypium genus, only two diploid species (2n = 2X = 26) from the Old World, G. herbaceum L. (A1) and G. arboreum L. (A2), and two tetraploid species (2n = 4X = 52) from the New World, G. hirsutum L. (AD1) and G. barbadense L. (AD2), are cultivated to produce the world’s leading natural fiber and the second most important oilseed crop (Campbell et al. 2010; Kantartzi et al. 2009; Lee 1984; Ulloa et al. 2013; Wendel et al. 1992; Percy et al. 2014). The complexity of Gossypium genomes has afforded research opportunities on the evolution and diversity among diploid (Ma et al. 2010) and allotetraploid species (Brubaker et al. 1999; Jiang et al. 1998; Paterson et al. 2012). Compared to other crops (Gao et al. 2013; Huang et al. 2012; Paran and Zamir 2003), the cotton research community has less information and inadequate knowledge of the intraspecific genetic variation within Gossypium species, including the two cultivated tetraploid species (G. hirsutum and G. barbadense), that would allow precise manipulation of agronomic traits (Abdurakhmonov et al. 2012; Campbell et al. 2010; Kantartzi et al. 2009; Percy et al. 2014). Understanding the underlying genetic control of biological processes for favorable traits is relevant to a range of research objectives. Such understanding has potential applications for cotton crop improvement (An et al. 2010; Cao et al. 2014; Lacape et al. 2013; Ning et al. 2014; Rong et al. 2007; Ulloa et al. 2005, 2013).

Genetic mapping with polymerase chain reaction (PCR)-based markers, such as simple sequence repeats (SSR) and single-nucleotide polymorphisms (SNP), has facilitated portable applications across different mapping populations and research facilities (Guo et al. 2007; Lacape et al. 2009; Nguyen et al. 2004; Reddy et al. 2001; Van Deynze et al. 2009; Xiao et al. 2009; Yu et al. 2012a). To date, several thousands of SSR and SNP markers have been developed in cotton (Blenda et al. 2012; Fang and Yu 2012; Guo et al. 2007; Lacape et al. 2009; Yu et al. 2012a) and they have been surveyed for polymorphism against a 12-genotype germplasm standard (diversity panel) of six Gossypium species (Yu et al. 2012b). The localization of SSR and SNP markers to all 26 chromosomes and the creation of high-density genetic maps for the tetraploid cotton genome provide opportunities to identify genetic factors controlling and regulating the expression of traits of interest (Lacape et al. 2013; Li et al. 2013; Marathi et al. 2012; Zhu et al. 2011). However, due to the allopolyploid nature of the cotton genome, certain chromosomal structures remain to be resolved. These include deletions, duplications, or rearrangements that may be specific to species, or to particular crosses within or between species (Brubaker et al. 1999; Guo et al. 2007; Lacape et al. 2009; Rong et al. 2004; Ulloa et al. 2013; Wang et al. 2013; Yu et al. 2012a).

Most traits of interest in crop plants are likely the product of complex biological processes in which multiple genes and environments interact to quantitatively determine phenotypic expression (An et al. 2010; Collard and Mackill 2008; Huang et al. 2012; Lacape et al. 2010, 2013; Paran and Zamir 2003; Paterson et al. 1988, 1991; Rong et al. 2007). A quantitative trait locus (QTL) mapping approach can be informative for studying quantitative inheritance, for detecting genomic regions associated with morphological, yield and fiber quality traits, and for identifying molecular markers linked to the genes conferring the phenotypic expression of the important traits (Draye et al. 2005; Kohel et al. 2001; Lacape et al. 2005; Li et al. 2013; Luan et al. 2009; Park et al. 2005; Shen et al. 2005; Song and Zhang 2009; Ulloa et al. 2007; Zhu et al. 2011). Molecular markers located in the cotton genome are used to address fundamental questions regarding the genetic factors for these traits (Lacape et al. 2010, 2013; Mohan et al. 1997). Markers can be used to improve elite cotton cultivar productivity, fiber and seed quality properties, and stress tolerance through marker-assisted selection (MAS) (Cao et al. 2014; Said et al. 2013). Using MAS, the breeding process can be accelerated, labor costs and time reduced, and selection effectiveness increased in identification of improved genotypes (Buckler et al. 2009; Ulloa et al. 2010, 2011).

Many cotton QTL mapping studies were conducted using either F2 or BC1 populations (An et al. 2010; Draye et al. 2005; Jiang et al. 1998; Kohel et al. 2001; Lacape et al. 2005; Li et al. 2013; Shen et al. 2005). These QTL studies also employed either single marker regression without a genetic map or interval mapping with an established linkage group (An et al. 2010; Li et al. 2013). In some cases, multiple environments were used but QTL analyses were conducted on the basis of individual locations (Lacape et al. 2010). Recently, an interspecific backcross inbred line population was used to identify 67 QTLs for fiber properties and yield components (Yu et al. 2013). Another study used a random-mated recombinant inbred population to identify 131 QTLs for fiber quality with insignificant interaction between genotype and environment (Fang et al. 2014). In our study, we used a recombinant inbred line (RIL) population derived from an interspecific cross between Upland G. hirsutum acc. TM-1 and Pima G. barbadense acc. 3-79 to provide better estimates of QTLs. Since their initial development, and at different generations, these RILs have facilitated QTL mapping of important traits with immortal and genetically stable plant materials that were phenotyped simultaneously across multiple growth environments or geographical locations (Frelichowski et al. 2006; Lacape et al. 2010, 2013; Marathi et al. 2012; Park et al. 2005; Wang et al. 2012a). In addition, these two mapping parents (TM-1 and 3-79) are currently being sequenced to produce reference genomes for eventual analysis of all potential genes of interest in cotton (unpublished data). Our single seed descent (SSD) RIL progeny of an interspecific hybrid also gave us access to larger numbers of DNA markers that were polymorphic between the two parents (two species standards). The utilization of diverse production environments increases our capacity to elucidate the phenotypic differences present in morphological traits of agronomic importance, yield components and differences in fiber quality under a wide range of cultivation conditions.

Our objectives in this study were to map genomic loci for plant architecture (PA, 4), yield component (YC, 6), and fiber property or quality (FP, 14) traits measured in three environments using the QTL mapping approach, drawing on a recently developed saturated genetic map of the tetraploid cotton genome and its molecular markers (Yu et al. 2012a; Fang and Yu 2012) and on the RIL population derived from a cross between TM-1 and 3-79. The 24 cotton traits from the RIL population were obtained in three diverse production environments [College Station F&B Road (FB) TX, Brazos Bottom (BB) TX, and Shafter (SH) CA]. This study was part of our efforts to integrate cotton genetic and genomic information, which includes many genetic and cytogenetic stocks and isogenic mutant lines already developed with the two cotton genetic standards (TM-1 and 3-79).

Materials and methods

Plant materials

As an immortal mapping population maintained by USDA-ARS in College Station, TX, USA, 186 RILs were developed by SSD selfing of individual plants of an interspecific F2 population derived from a cross of the cultivated cotton species G. hirsutum (acc. TM-1) and G. barbadense (acc. 3-79) (Kohel et al. 2001; Yu et al. 2012a). TM-1 is a long-term inbred (>40 years) derived from the Upland cotton cultivar Deltapine 14 (Kohel et al. 1970), and it is the basis for a series of near-isogenic lines (NILs) for various traits. 3-79 originated as a doubled haploid line of the extra long staple (ELS) cultivar 3-79. The cultivar 3-79 predates any known human-directed introgression with G. hirsutum and thus is a “pure” representative of the G. barbadense species (Niles and Feaster 1984). TM-1 and 3-79, whose genomes are currently being sequenced (unpublished data), are recognized as the global genetic standards of two Gossypium species that have many unique contrasting phenotypes (high productivity and wide adaptability for TM-1, and superior fiber properties for 3-79), allowing us to identify those genomic regions that underlie genetic control of PA, YC, and FP traits. Difficulties with stand establishment resulted in different numbers of RIL progeny at the different environment locations, and different numbers of progeny that yielded data for all traits (Tables 1, 2, 3). These difficulties have been reported in similar studies with other interspecific populations (Lacape et al. 2010, 2013; Marathi et al. 2012; Yu et al. 2013; Zhu et al. 2011).

Table 1 RIL phenotypic values of plant architecture (PA) traits
Table 2 RIL phenotypic values of yield component (YC) traits
Table 3 RIL phenotypic values of fiber property (FP) traits

Field phenotypic evaluations

The RIL progeny (F7 on average at the time of this study), together with the parents TM-1 and 3-79, were germinated in pellets in a greenhouse at each study location in 2005. After the development of the true leaves, plants from each greenhouse were transplanted into the cotton production fields at College Station F&B Road (FB) TX, Brazos Bottom (BB) TX, and Shafter (SH) CA, respectively. Based on soil type and other conditions, the FB location is described as an upland area not adapted for commercial cotton production; it requires supplemental irrigation and has low yield potential. Although only about 10 miles away, the BB location is an alluvial river bottom area with supplemental irrigation and high yield potential. The SH location is a desert environment with a sandy-loam soil field site that uses furrow irrigation for production and has high yield potential. Following our standard genetic research procedure, transplants were used for stand establishment to make the most of the limited seed available. At each location, the parents and RIL progeny were grown in single-row plots 5 m long with 1-m spacing between rows, averaging 2–5 plants per plot, in an incomplete randomized block design with two replications per environment (Fehr 1991; Li et al. 2012).

Data from four morphological or PA traits (Table 1) were collected from these three different environments: plant height (cm), number of nodes, average internode length (cm), and main stem diameter (mm) at the cotyledonary nodes. In addition, data from six YCs (Table 2) were collected: seed weight per plant (g), lint weight per plant (g), lint percent, gin turnout, seed index, and lint index. Lint percent was calculated as lint weight/(lint weight + seed weight) × 100. Gin turnout was the total lint weight recovered, expressed as the portion of ginned lint in the whole harvested seed-cotton sample. Seed index was based on the weight of 100 seeds. Lint index was calculated based on the lint obtained from 100 seeds and the average yield obtained per seed for each RIL. Moreover, lint obtained from hand-harvested open bolls from the two replicate plots per RIL at each location was used to measure 14 FPs (Table 3) with the advanced fiber information system (AFIS®, USTER, Charlotte, NC, USA) instrument in the fiber laboratory at Cotton Incorporated, Cary, NC, USA. Open bolls were harvested multiple times to prevent any field deterioration because the diverse maturity of the interspecific segregating material did not allow boll sampling as in a uniform breeding nursery with uniform genotypes. AFIS, a more accurate fiber measuring technology, was used in this study because the smaller sample weight needed for AFIS allowed more individuals of this genetic mapping population to be measured than HVI technology, which requires a larger sample (at least 20 g of fiber). The following 14 traits were collected: nep size g (no. g), number of neps (g−1), number of seed coats, upper quartile of fiber length by weight (mm), short fiber content by weight g (kg−1), average length of all fibers by weight (mm), 5.0 % fiber span length (mm), 2.5 % fiber span length (mm), VFM percent, fiber fineness (mTx), immature fiber content by weight g (kg−1), maturity ratio (unit), mean tenacity (unit), and mean elongation (unit).
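
The lint percent formula given above can be written as a short function. This is only an illustrative sketch: the function name and the example weights are hypothetical and are not taken from the study data.

```python
def lint_percent(lint_weight_g: float, seed_weight_g: float) -> float:
    """Lint percent = lint weight / (lint weight + seed weight) x 100."""
    seed_cotton_g = lint_weight_g + seed_weight_g
    if seed_cotton_g <= 0:
        raise ValueError("combined lint and seed weight must be positive")
    return lint_weight_g / seed_cotton_g * 100.0


# Hypothetical plant: 30 g of lint and 50 g of seed.
print(lint_percent(30.0, 50.0))  # 37.5
```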

This RIL population was previously used for developing a saturated genetic map (Yu et al. 2012a). The number of RIL progeny evaluated depended on seed viability and germination at each location. Problems of fertility and seed production in interspecific progeny are well known. The numbers of RILs used in each location for different traits were as follows: for PA traits, 175 (FB), 146 (BB) and 170 (SH); for YC, 145 (FB), 95 (BB) and 152 (SH); and for FP, 110 (FB), 40 (BB) and 98 (SH) (Tables 1, 2, 3).

Differences among observed traits of the RILs within each location and between locations were evaluated for each experiment using PROC GLM (SAS, ver. 9.2, SAS Institute, Cary, NC, USA). Mean separations in the various examinations of main effects were conducted using the Waller–Duncan k-ratio procedure (Ott 1988). Correlation analyses were performed to examine the similarity of responses of the RIL entries at the different environments. All correlations were performed using PROC CORR (SAS ver. 9.2, SAS Institute, Cary, NC, USA).
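
The between-environment correlation step can be sketched in a few lines of Python. This is not the SAS code used in the study; it assumes Pearson coefficients (the PROC CORR default) and a hypothetical data layout in which each environment is a dictionary mapping a RIL identifier to its trait mean. Only RILs with data in both environments enter the calculation, since the numbers of RILs differed among locations.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlate_environments(env_a, env_b):
    """Correlate one trait between two environments over their common RILs.

    env_a and env_b map a RIL identifier to the trait mean observed in that
    environment (hypothetical layout). Returns (r, number of common RILs).
    """
    common = sorted(set(env_a) & set(env_b))
    r = pearson_r([env_a[k] for k in common], [env_b[k] for k in common])
    return r, len(common)
```

For example, `correlate_environments(fb_height, sh_height)` would return the correlation of plant height between FB and SH together with the number of RILs shared by the two datasets, where `fb_height` and `sh_height` are hypothetical dictionaries.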

Marker data and linkage map

Our previously published linkage maps that saturate the tetraploid cotton genome with more than 2,500 SSR and SNP markers were used in this study (Yu et al. 2012a; Fang and Yu 2012). The individual marker scores of the RILs that were evaluated phenotypically in the field served as a foundation matrix to generate linkage groups and trait association. The present study was the first to include the SNP markers that are mapped to the tetraploid cotton genome. Many markers were developed from the Unigenes of cotton expressed sequence tags (ESTs) and they would offer new opportunities for further functional analysis of cotton traits.

Quantitative trait locus analyses

Single marker analysis was conducted using a nonparametric mapping test [Kruskal–Wallis analysis (K*)] equivalent to a one-way analysis of variance (Van Ooijen 2004). In the nonparametric analysis, no assumptions were made about the probability distribution(s) of each quantitative trait, and if the data were distributed normally, the nonparametric test was nearly as powerful as parametric methods. All markers genotyped on the RIL population, regardless of their linkages, were used in the nonparametric test. Markers were tested at each locus separately without the use of the linkage map. QTL analyses were also conducted using MapQTL 5.0 with the interval mapping and MQM QTL model mapping procedures (Van Ooijen 2004; Arends et al. 2010). Threshold values for LOD were determined empirically after 1,000 permutation tests for all traits (Churchill and Doerge 1994). The threshold for a QTL was set at P < 0.05 for the nonparametric test and at LOD ≥ 3.0 for the MQM analysis.
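
The single marker test described above can be illustrated with a small, self-contained sketch of the Kruskal–Wallis statistic for one marker. It is not the MapQTL implementation. It assumes that each RIL carries one of two homozygous genotype classes, coded here with the hypothetical labels "h" and "b", and it takes the P value from the chi-square approximation with one degree of freedom, which is rough for small samples.

```python
import math
from collections import Counter, defaultdict


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_marker(genotypes, phenotypes):
    """Kruskal-Wallis H and approximate P value for one two-class marker.

    RILs with a missing phenotype (None) or a genotype other than "h"/"b"
    are skipped.
    """
    pairs = [(g, p) for g, p in zip(genotypes, phenotypes)
             if g in ("h", "b") and p is not None]
    values = [p for _, p in pairs]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two scored RILs")
    rank_sums = defaultdict(float)
    counts = Counter()
    for (g, _), r in zip(pairs, average_ranks(values)):
        rank_sums[g] += r
        counts[g] += 1
    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / counts[g] for g in counts) - 3 * (n + 1)
    # Correct H for tied phenotype values.
    ties = sum(t ** 3 - t for t in Counter(values).values())
    correction = 1.0 - ties / (n ** 3 - n)
    if correction == 0.0:  # every phenotype value identical
        return 0.0, 1.0
    h = max(h / correction, 0.0)
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value
```

Applying `kruskal_wallis_marker` to every marker in turn, without reference to map position, mirrors the locus-by-locus scan described in the text; markers with P < 0.05 would be flagged as putative.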

PA and YC traits are known to be affected by the environments, which tend to produce genotype (G) × environment (E) interactions. In addition, although FP traits are more stable with high heritability, fiber from open bolls was obtained from multiple harvests to prevent any field deterioration because of the diverse maturity of the interspecific segregating material, which is typical of these types of interspecific genome mapping populations. This boll harvesting process did not allow uniformity as in most breeding nurseries. Means were obtained across locations (FB, BB, and SH) from common RILs and were used to examine trait effects and G × E interactions (data not shown). However, QTL analyses were performed on the RIL population from the individual datasets of the three locations (BB, FB, and SH). QTL comparisons were made among all locations after the individual analyses.

A strong QTL is named as follows. If the trait is an FP, the first letter of the QTL name is a capital ‘F’; if the trait is a YC, the first letter is the capitalized first part of the trait name. The second letter, also capitalized, comes from the second part of the trait name, and the third letter is a lowercase letter from the rest of the trait name. These three letters are followed by ‘Qtl’ and then by ‘c’ (for chromosome) and the chromosome number. Finally, an underscore precedes the QTL number, which is followed by a lowercase letter (e.g., 1h or 1b) denoting the parent of this interspecific RIL population that contributed the positive allele, or high value, of the QTL phenotypic effect [G. hirsutum (h) or G. barbadense (b)].
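
To make the naming scheme concrete, the following sketch splits a QTL name into its parts. The regular expression, the function name, and the field names are illustrative assumptions inferred from the description above and from names such as FAlQtlc05_1b used in this paper; they are not part of any published tool.

```python
import re
from typing import NamedTuple

# Parent contributing the high-value allele, keyed by the final letter.
PARENT_BY_ALLELE = {"h": "G. hirsutum (TM-1)", "b": "G. barbadense (3-79)"}

# Three-character trait code + "Qtl" + "c" + chromosome + "_" + number + h/b.
QTL_NAME = re.compile(
    r"^(?P<trait>[A-Za-z0-9]{3})Qtlc(?P<chrom>\d{1,2})_(?P<number>\d+)(?P<allele>[hb])$"
)


class QtlName(NamedTuple):
    trait_code: str
    chromosome: int
    qtl_number: int
    high_allele_parent: str


def parse_qtl_name(name: str) -> QtlName:
    """Split a strong-QTL name into trait code, chromosome, number, parent."""
    match = QTL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized QTL name: {name!r}")
    return QtlName(
        trait_code=match.group("trait"),
        chromosome=int(match.group("chrom")),
        qtl_number=int(match.group("number")),
        high_allele_parent=PARENT_BY_ALLELE[match.group("allele")],
    )
```

Under these assumptions, `parse_qtl_name("FMtQtlc05_1h")` yields the trait code `FMt`, chromosome 5, QTL number 1, and G. hirsutum (TM-1) as the parent contributing the high-value allele.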

Results

Phenotypic variation

The mapping parents, TM-1 and 3-79, and their RIL progeny were examined for the 24 traits under three diverse production environments: FB, BB, and SH. Contrasting phenotypic differences were observed between TM-1 and 3-79, on respective average, 85.57 and 96.77 cm for plant height (Ph), 28.50 and 26.75 for number of nodes (NA), 3.23 and 3.89 cm for internode length (InL), 28.65 and 24.80 mm for main stem diameter (MsD), 177.65 and 118.17 g for seed weight per plant (Swp), 100.22 and 64.18 g for lint weight per plant (Lwp), 44.00 and 43.00 for lint percent (LP), 0.36 and 0.33 for gin turnout (GT), 14.85 and 11.28 for seed index (SI), 8.37 and 6.20 for lint index (LI), 824.33 and 669.33 g (no. g) for nep size (Ns), 129.83 and 144.33 g−1 for number of neps (Nn), 33.00 and 5.50 for number of seed coats (SCN), 32.76 and 36.32 mm for upper quartile of fiber length by weight (UQL), 62.8 and 59.0 g (kg−1) for short fiber content by weight g (SFC), 21.34 and 22.61 mm for average length of all fibers by weight (ALFw), 36.83 and 40.64 mm for 5.0 % fiber span length (5.0L), 39.62 and 43.94 mm for 2.5 % fiber span length (2.5L), 1.62 and 1.51 for VFM percent (VFM), 174 and 149 mtx for fiber fineness (FTX), 5.73 and 7.32 kg−1 for immature fiber content by weight g (IFC), 0.95 and 0.91 for maturity ratio (MR), 21.2 and 29.86 for mean tenacity (MT), and 6.5 and 7.79 for mean elongation (ME).

Although mean separations in the various examinations of main effects showed differences among the traits of the RILs within each location and between locations, the overall responses of the RIL entries were similar across the different environments (detailed data not shown). A phenotypic data summary of means, standard deviations, and ranges for the PA, YC, and FP traits observed in the RIL population is presented in Tables 1, 2 and 3, respectively. Although the data of the above 24 traits were collected at different time intervals among the different environments, significant phenotypic differences (P < 0.05) were observed among the RILs for all traits. While the RIL plants showed a normal distribution for the PA traits, they grew much taller, with longer but fewer internodes, in the California location (SH) than in the two Texas locations (FB and BB) (Table 1). For example, the plant height of the RIL population had the highest mean value at the SH environment (100.20 cm) and the lowest at the FB environment (54.78 cm). Average internode length ranged from 0.14 to 7.28 cm among the three different locations FB, BB, and SH. For the YC traits, the lint weight per plant ranged from 0.10 to 103.21 g, and seed index ranged from 6.2 to 19.2 (Table 2). For the FP traits, the 2.5 % fiber span length ranged from 27.5 to 47.5 mm, and immature fiber content by weight ranged from 4.3 to 12.2 (g/kg) (Table 3). The phenotypic variation also included poor or no germination from some of the RILs under the field conditions, and lack of fiber production, causing challenges for complete data analyses.

Identification of QTLs and QTL clusters

The profound phenotypic variability of the RIL population was reflected in the identification of large numbers of putative QTLs for the PA, YC, and FP traits (Fig. 1; Supplementary Tables S1–S3). More than 600 putative QTLs, many expressed in multiple environments, were detected by single marker analysis (P < 0.05) using a nonparametric mapping test (Fig. 1; complete data not shown). For the PA traits, we detected more than 23 putative QTLs at FB, 17 QTLs at BB, and 22 QTLs at SH, explaining from 5 to 15 % of the phenotypic variation and involving 23 of the 26 cotton chromosomes. For the YC traits, we detected more than 32 putative QTLs at FB, 12 QTLs at BB, and 35 QTLs at SH, explaining from 5 to 31 % of the phenotypic variation and involving 25 of the 26 cotton chromosomes. For the FP traits, we detected more than 125 putative QTLs at FB, 110 QTLs at BB, and 135 at SH, explaining from 7 to 62 % of the phenotypic variation and involving all the 26 cotton chromosomes. More than 500 putative QTLs by trait group from the three environments were detected by interval mapping: around 62 for PA, 79 for YC, and 360 for FP (data not shown). The MQM QTL model mapping analysis identified and confirmed 72 strong QTLs from the above putative QTLs with a LOD > 3.0 threshold value at each environment (Table 4; Fig. 1). For comparison between environments, however, putative QTLs with P < 0.05 and LOD > 1.7 were included in this study (Supplementary Tables S1–S3).

Fig. 1

Genomic locations of putative QTL loci among 26 allotetraploid cotton chromosomes that are presented in 13 At and Dt subgenome homoeologous pairs. All SSR and SNP markers shown on the right were previously developed and genetically mapped using the same TM-1 × 3-79 RIL mapping population (Yu et al. 2012a; Fang and Yu 2012). The position of the markers shown in Kosambi (1944) centiMorgan (cM) on the left reflects the calculation of marker distance and order, and chromosome orientation. A line bar connects a marker that is putatively associated with a single trait or multiple traits, and only one trait is listed if more than one environment location shows a response to the QTL effect. The number (in parentheses) following the trait abbreviation indicates the environment location (1-FB, 2-BB, and 3-SH). A description of the individual trait abbreviations is presented in Table 1. An asterisk denotes a strong QTL locus with a LOD score greater than 3.0 (Table 4; Supplementary Tables S1–S3)

Table 4 MapQTL mapping output showing the statistics of strong QTLs (LOD > 3.0) associated with cotton traits detected by the MQM QTL model mapping procedure

A total of 428 putative QTLs were significant (P < 0.05 or LOD ≥ 2.0) at one location and they were distributed in 159 genomic regions across the tetraploid cotton genome with several clusters located in a few chromosomes (Fig. 1; Supplementary Tables S1–S3). These included QTL clusters for fiber length on chromosome 10 (A10) and fiber maturation on chromosomes 5 (A05) and 15 (D01). The QTLs for most traits were largely located on non-homoeologous chromosomes of the A and D subgenomes. However, the pair of homoeologous chromosomes 5 (A05) and 19 (D05) harbored 12 and 6 putative fiber QTLs, respectively (Fig. 1; Supplementary Table S3). For the 72 strong QTLs with a LOD score greater than 3.0, the pattern of being located largely on non-homoeologous chromosomes is similar to that of the putative QTLs, while four and five strong QTL clusters were located on homoeologous chromosomes 5 (A05) and 19 (D05), respectively (Fig. 1; Table 4).

Comparison of putative QTLs among the environments and between the subgenomes

When putative QTLs identified using the MQM model procedure were compared among FB, BB and SH environments, 25 of the 62 PA QTLs were identified in more than one environment. The 25 putative QTLs were associated with the same trait or with another PA trait with at least P < 0.05 or LOD ≥ 2.0 at one location. Nineteen putative QTLs on six chromosomes belonged to the At subgenome (chromosomes 1–13), representing a 72.7 % contribution for the At subgenome, while six putative QTLs on three chromosomes belonged to the Dt subgenome (chromosomes 14–26), representing a 27.3 % contribution for the Dt subgenome (Fig. 1; Supplementary Table S1). When putatively identified QTLs of YC traits were compared among the three environments, 60 of the 79 YC putative QTLs were identified in more than one environment. The 60 putative QTLs were associated with the same trait or with another YC trait with at least P < 0.05 or LOD ≥ 2.0 at one location. Twenty-six putative QTLs on 10 chromosomes belonged to the At subgenome, representing a 46.2 % contribution for the At subgenome, while 44 QTLs on 9 chromosomes belonged to the Dt subgenome, representing a 53.8 % contribution for the Dt subgenome (Fig. 1; Supplementary Table S2). When putatively identified QTLs of FP traits were compared among the three environments, 343 of the 360 FP putative QTLs were identified in more than one environment. The 343 putative QTLs were associated with the same trait or with another FP trait with at least P < 0.05 or LOD ≥ 2.0 at one location. One hundred seventy-two putative QTLs on the 13 chromosomes belonged to the At subgenome, representing 50.4 % of the At subgenome contribution, and 171 putative QTLs on the 13 chromosomes belonged to the Dt subgenome, representing 49.6 % of the Dt subgenome contribution (Fig. 1; Supplementary Table S3). 
Overall, we observed almost even contribution to YC and FP traits from At and Dt subgenomes but significantly more contribution to PA traits from At than Dt subgenome for the detected QTLs (Fig. 1). Furthermore, the putative QTLs for the various traits tend to reside at or near the same locus for certain chromosomes from at least two different environments on this RIL population, confirming a cluster network with alleles showing heterogeneous phenotypic effects.

Phenotypic effect of QTLs under multiple environments

Although heterogeneous QTL results were observed, suggesting a complex network of genes for the variation of PA, YC, and FP traits, the nonparametric mapping (Van Ooijen 2004) compiled with the MQM test identified several DNA markers that are associated with strong QTLs (LOD > 3.0) for different traits or for same traits under multiple environments with P < 0.05 (Fig. 1; Table 4; Supplementary Tables S1–S3).

SSR marker TMB 1496 located on chromosome 5 (A05) explained up to 12.4, 15.5, 14.5, and 40.3 % of the variation in four fiber properties: average length of all fibers by weight, upper quartile of fiber length by weight, 5.0 % fiber span length, and mean tenacity, respectively. Alleles with the large effect on average length of all fibers by weight (FAlQtlc05_1b), upper quartile of fiber length by weight (FUqQtlc05_1b), and 5.0 % fiber span length (F5lQtlc05_1b) were contributed by 3-79, while that on mean tenacity (FMtQtlc05_1h) was contributed by TM-1 (Table 4). SSR marker TMB0694b located on chromosome 15 (D01) explained up to 18.1 and 25.6 % of the lint weight per plant (LWpQtlc15_1b) variation of the YC traits in FB and SH environments, respectively. SSR marker TMB0400 located on chromosome 21 (D11) explained 18.9 and 16.0 % of seed weight per plant (SWpQtlc21_1b) variation in FB and BB environments, respectively. Alleles with the large effect on both lint weight per plant (LWpQtlc21_1b) and seed weight per plant (SWpQtlc21_1b) were contributed by 3-79 (Table 4; Supplementary Table S2). SNP marker UCcgs10033_191 on chromosome 11 (A11) was associated with putative QTLs that contributed about 7–8 % of the phenotypic variation of lint percentage, gin turnout, and lint index at the SH environment (Supplementary Table S2). SSR marker MUCS145b on chromosome 13 (A13), associated with a strong QTL for fiber fineness of the FP traits, explained up to 12.5 and 27.5 % of the fiber fineness (FFiQtlc13_1b) variation at the FB and SH environments, respectively (Table 4; Supplementary Table S3). In addition, a major contribution to fiber fineness variation was also made by NAU2140, located on chromosome 5 (A05), explaining from 17 to 24 % of the phenotypic variation in BB and SH environments (Supplementary Table S3).

Based on this study, QTLs for specific traits, discerned in at least two different environments, tended to reside at or near the same locus of the same chromosomes. Progeny homozygous for alleles identified by the markers on the chromosomes from the parents (TM-1 and 3-79) generally agreed with their contribution to YC and FP traits in their RIL progeny. The observation of improved progeny carrying alleles from both parents is an indication of additive effect of the contributing genes that could be subject to MAS.

Discussion

The identification of DNA markers that are associated with traits of agronomic significance provides a valuable tool for understanding the genetic control in the plant genome and for facilitating marker-assisted breeding of the crop plant (Collard and Mackill 2008; Lacape et al. 2013; Said et al. 2013). In this study, QTL mapping was conducted under three diverse production environments (FB, BB, and SH) to identify the genetic factors or genomic loci for PA, YC, and FP with an interspecific cotton (G. hirsutum acc. TM-1 × G. barbadense acc. 3-79) RIL population. New QTLs unique to this TM-1 × 3-79 RIL mapping population and/or a particular environmental condition were detected while others appeared to express across multiple environments and/or different mapping populations (Lacape et al. 2010, 2013; Rong et al. 2007; Song and Zhang 2009). Identification of 159 genomic regions or QTL-bearing markers from more than 2,500 mapped SSR and SNP loci of the tetraploid cotton genome would facilitate further dissection of genetic factors underlying these important traits and MAS for the cotton crop.

Although this interspecific RIL population has been much used in linkage mapping efforts, the present investigation is one of the first studies to use it to identify and locate the genomic loci for all three sets of morphological, yield, and fiber traits in the tetraploid cotton genome. This TM-1 × 3-79 RIL population is a genetic standard resource that has been distributed by USDA-ARS among the cotton research community (Park et al. 2005; Wang et al. 2012a). Another major interspecific cotton RIL population used to investigate such traits is derived from a cross between Guazuncho 2 (G. hirsutum) and VH8-4602 (G. barbadense) (Lacape et al. 2010). In that study, subsets of 140 RILs, originating from some of approximately 600 F2 plants, were used for trait observation among different environments (Lacape et al. 2013). In addition, an interspecific backcross inbred population of 146 lines was used to identify QTLs for fiber properties and yield components (Yu et al. 2013). In our present study, the parental genotypes (TM-1 and 3-79) used in making this population represent highly divergent global genetic standards for G. hirsutum and G. barbadense species, respectively, to maximize the chance of QTL identification for all these traits. Characterizing the phenotypic variation of morphological traits, yield components, and fiber quality properties facilitates the utilization of this population by other researchers who wish to further examine the co-segregation of any type of molecular markers and phenotypes (Wang et al. 2012a; Said et al. 2013). Obtaining the information that quantifies the phenotypic variation within the TM-1 × 3-79 RIL population allows us to determine what morphological, yield and fiber quality traits may be improved through introgression using the RIL population.

While our interspecific RIL population exhibited profound variation in many traits including PA, YC, and FP, some individual RILs of the population have reduced fertility and productivity, as commonly observed in RIL and other populations of interspecific origin (Lacape et al. 2010, 2013). In this study, the allotetraploidy of the cultivated parental cottons (G. hirsutum and G. barbadense) provides an opportunity to investigate putative QTLs or genetic factors between the homoeologous chromosomes (Buyyarapu et al. 2013; Xu et al. 2008a). The two subgenomes of the allotetraploid cottons may contribute differently but complementarily to functional networks of cotton genes (Xu et al. 2010).

A problem arising from the variable fertility of RILs has been seed shortages, compounded by the poor germination of some RILs. Despite these difficulties, such segregating populations have been used to map cotton yield and quality QTLs, and comparable data were obtained for traits across multiple environments in other studies (Lacape et al. 2013; Yu et al. 2013). RIL populations are usually extended beyond the F6-F9 generations, which provides better genetic uniformity of trait response (homozygous alleles) and reduces external or environmental effects relative to early generations. For traits such as fiber yield, however, the estimates may not be highly precise because of the limited information in this area. Nevertheless, the general information is important for breeders who are using interspecific introgression breeding via molecular markers. The two parental species used in our study represent unique contrasting yield and quality traits that are desired in the other species. G. hirsutum cotton is known for its high yield and wide adaptation, while G. barbadense cotton is known for its superior fiber quality (length, fineness, and strength). Even though the data for the 24 traits were collected at different time intervals among the different environments, significant phenotypic differences (P < 0.05) were observed among the RILs for all traits within each location. RIL plants showed a normal distribution for traits with a large environmental effect, such as PA. As expected, plants grew much taller, with longer but fewer internodes, in the California location than in the two Texas locations (Table 1), and the lint weight per plant ranged from 0.10 to 103.21 g and seed index from 6.2 to 19.2 (Table 2). Although data were collected from 2–5 RIL plants per row, phenotypic variation also included poor or no germination of some RILs under field conditions and a lack of fiber production, which posed challenges for complete data analyses.

The identification of QTLs for these traits would advance our understanding and inform future introgression breeding efforts, which have not been very successful in cotton. For more effective and precise use of the information in cotton molecular breeding, however, further research at a higher resolution of the genome sequence is needed to validate and exploit the putative QTLs identified in the present study. This is especially important for complex YC traits, which can be strongly affected by the environment.

The data from our study indicate that most of the 428 putative QTLs for the traits measured are located on non-homoeologous chromosomes of the tetraploid cotton genome. A study using different populations and linkage maps analyzed 432 cotton fiber QTLs that were largely found in non-homoeologous regions of the tetraploid subgenomes (Rong et al. 2007). In another study with an interspecific RIL population, Lacape et al. (2010) analyzed 651 cotton fiber QTLs, and their results largely agree with the findings of Rong et al. (2007). Both studies reported the exceptional pair of chromosomes 5 (A05) and 19 (D05), in which we identified 12 and 6 putative QTLs (of which 4 and 5 QTLs were strongly expressed), respectively. Said et al. (2013) recently compiled over 1,000 reported QTLs from dozens of cotton studies, which showed an uneven distribution of QTLs in the cotton genome. Of particular interest were specific 20-cM genomic regions of these two chromosomes that are hotspots for Micronaire. Our result supports and augments the prior evidence for this particular pair of gene-enriched chromosomes (Lacape et al. 2010; Said et al. 2013). In chromosome 5, several gene islands for fiber initiation and fiber fineness were previously reported (Rong et al. 2007; Xu et al. 2008b). DNA markers associated with such gene islands or QTL clusters would be valuable for further investigations including MAS.

Many putative QTLs associated with different, correlated traits, or with the same trait measured in different ways, co-resided in the same genomic regions, and these QTLs tended to be expressed in a particular environmental condition (Fig. 1). For example, QTLs for fiber fineness (FFiQtlc19_1b), immature fiber content (FIfQtlc19_1b), and maturity ratio (FMrQtlc19_1h) co-resided in a genomic region of chromosome 19 (D05), associated with SSR marker BNL3875 at the SH location (Table 4). Another SSR marker, CIR176, on the same chromosome 19 (D05) is associated with the putative QTLs conferring six FP traits: mean elongation, 2.5 % fiber span length, 5.0 % fiber span length, upper quartile of fiber length, average length of all fibers, and nep size (Supplementary Table S3). These putative QTLs are primarily expressed for fiber traits at the BB location. In this study, DNA markers used to identify cotton QTLs include those that were developed from bacterial artificial chromosome (BAC) clones (Yu et al. 2012a). For example, a TM-1 BAC-derived marker, TMB0799, on chromosome 12 (A12), is strongly associated with the QTLs conferring five FP traits primarily at the SH location: fiber fineness (FFiQtlc12_1h), short fiber content (FSfQtlc12_1b), average length of all fibers (FAlQtlc12_1b), immature fiber content (FIfQtlc12_1b), and maturity ratio (FMrQtlc12_1h) (Fig. 1; Table 4). With the development of an integrated physical map and genome sequence map of the tetraploid cotton chromosome 12 (A12), future research with this BAC-derived marker would shed more light on the QTL cluster or gene island to better understand the genetic mechanisms underlying these fiber traits (Buyyarapu et al. 2013; Xu et al. 2008a, b).
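
The co-residence of QTLs at a shared marker can be illustrated with a minimal sketch that groups QTL names by their associated marker and reports markers carrying two or more QTLs. This is not the analysis pipeline used in this study. The marker, chromosome, and QTL names are the examples cited above, whereas the function name `cluster_by_marker` and the `min_qtls` threshold are hypothetical choices made only for illustration.

```python
from collections import defaultdict

# (marker, chromosome, QTL name) triples copied from the examples in the text.
ASSOCIATIONS = [
    ("BNL3875", 19, "FFiQtlc19_1b"),
    ("BNL3875", 19, "FIfQtlc19_1b"),
    ("BNL3875", 19, "FMrQtlc19_1h"),
    ("TMB0799", 12, "FFiQtlc12_1h"),
    ("TMB0799", 12, "FSfQtlc12_1b"),
    ("TMB0799", 12, "FAlQtlc12_1b"),
    ("TMB0799", 12, "FIfQtlc12_1b"),
    ("TMB0799", 12, "FMrQtlc12_1h"),
]


def cluster_by_marker(associations, min_qtls=2):
    """Group QTLs by (marker, chromosome); keep groups with >= min_qtls QTLs.

    Hypothetical helper: the threshold is an illustrative assumption,
    not a criterion taken from the study.
    """
    groups = defaultdict(list)
    for marker, chromosome, qtl in associations:
        groups[(marker, chromosome)].append(qtl)
    return {key: qtls for key, qtls in groups.items() if len(qtls) >= min_qtls}


clusters = cluster_by_marker(ASSOCIATIONS)
for (marker, chromosome), qtls in sorted(clusters.items()):
    print(f"{marker} (chromosome {chromosome}): {len(qtls)} QTLs: {', '.join(qtls)}")
```

In practice, a grouping keyed on map position (for example, a cM window) rather than on a single marker name would be closer to how QTL clusters are usually delimited; the marker-keyed version is kept here because it uses only information stated in the text.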

As shown in this study and others, both superior and inferior parents may possess genetic factors that control the performance of contrasting traits and that are otherwise undetectable by phenotype alone (Paterson et al. 1991; Tanksley and McCouch 1997; Wang et al. 2012a). TM-1 alleles at the TMB1496 locus contributed to fiber tenacity (FMtQtlc05_1h), and 3-79 alleles at the TMB0400 locus contributed to lint weight (LWpQtlc21_1b) and seed weight (SWpQtlc21_1b), which is counterintuitive for these parents based on phenotypic observation. To further dissect the 159 genomic regions that harbor the 428 putative QTLs identified in this study, for eventual use in cotton molecular breeding programs, a graphical display of each QTL or QTL cluster for cotton PA, YC, and FP traits needs to be developed, and a subset of the selected TM-1 × 3-79 RILs that best represent genomic regions of interest needs to be identified for detailed characterization.

The functional network of QTL clusters in cultivated cottons is complex, and distinguishing orthologs from paralogs is challenging in the absence of complete genome sequences. This study is the first to include SNP markers to identify the QTLs in cotton. For example, an SNP marker UCcg11114_430 mapped on chromosome 8 (A08) is associated with the putative QTLs conferring number of neps, immature fiber content, maturity ratio, and nep size (Fig. 1; Supplementary Table S3). Another SNP marker UCcg10645_71, mapped approximately 20 cM apart from TMB0799 (discussed above) on chromosome 12 (A12), is associated with the putative QTLs conferring average length of all fibers, upper quartile of fiber length, 5.0 % fiber span length, and short fiber content (Fig. 1; Supplementary Table S3). With increasing numbers of SNP markers being developed in cotton, profound diversity can be exploited between TM-1 and 3-79 that contrast in the traits of agronomic importance (Elshire et al. 2011).

The recent development of a complete genome sequence in cotton provides an unprecedented opportunity to scan or browse the QTL regions in fine detail (Li et al. 2014; Paterson et al. 2012; Wang et al. 2012b). The markers associated (putatively or strongly) with any QTLs can be aligned to the genome sequence (Gao et al. 2013). Because the parental lines of our TM-1 × 3-79 RIL mapping population represent the global genetic standards of G. hirsutum and G. barbadense, respectively, the international cotton genome research community is collaborating closely to sequence both Gossypium tetraploid reference genomes (i.e., TM-1 and 3-79). Assembly and annotation of the two parental genome sequences, along with re-sequencing of selected progeny RILs, would facilitate detailed structural, functional, and evolutionary analyses of the cotton QTLs (Gao et al. 2013; Lacape et al. 2010, 2013; Rong et al. 2007; Said et al. 2013; Ulloa et al. 2013; Wang et al. 2012a). More precise studies of the putative QTLs identified in this study and in many others should follow once the tetraploid cotton genome sequence becomes available.