Introduction

Cotton (Gossypium hirsutum L.) fiber yield and quality that depend greatly on fiber development remain the most important targets of worldwide cotton breeding programs. The demands of the global market for increased uniformity, strength, extensibility, and other fiber quality traits justify new and innovative approaches toward understanding the molecular and physiological mechanisms of fiber development. Cotton fiber development is a complex process that involves fiber initiation, elongation (primary wall synthesis), wall thickening (secondary wall synthesis) and desiccation (maturation) (Basra and Malik 1984).

The seed coat of cultivated cotton is covered with lint and fuzz: lint is the unique textile fiber, and fuzz is the short fiber that remains on the seed coat after ginning. Some spontaneous cotton mutants have lint but no fuzz (the “naked seed” cottons), and other mutants lack both fuzz and lint (Musaev and Abzalov 1972; Nadarajan and Rangasamy 1988; Zhang and Pan 1991; Du et al. 2001). McLendon (1972) and Balls (1912) stated that fuzziness was dominant over the naked seed trait. Thadani (1923, 1925) also reported monogenic dominant inheritance of the fuzziness trait in Upland cotton and used the symbols N-n to describe the alleles. Fuzz and lint development in crosses of spontaneous fuzzless/lintless with fuzzy/linted mutant cottons has also been genetically analyzed by Kearny and Harrison (1927), Carver (1929), Ware (1929), Griffee and Ligon (1929), Harland (1939), Ware et al. (1947), Richmond (1947), Musaev and Abzalov (1972), Kohel (1972), Kohel (1973), Nadarajan and Rangasamy (1988), Narbuth and Kohel (1990), Zhang and Pan (1991), Du et al. (2001), and Turley and Kloth (2002), suggesting the existence of dominant and recessive loci controlling the naked seed trait in cotton. Recently, several cotton fiber development mutants have been comparatively analyzed and mapped using RFLP probes, which explained the genetics of fiber mutants from a more molecular perspective (Rong et al. 2005).

In Uzbekistan, the genetic basis of fiber development has been extensively studied in populations derived from fuzzless/lintless and fuzzy/linted cotton lines. Musaev and Abzalov (1972) suggested that two major genes, Ft1 and Ft2, control the fuzz around the micropyle, whereas fuzz on the chalazal and lateral parts of the seed is controlled by a third gene, Fc, in complementary interaction with other genes. They also suggested an additional inhibitor gene, I, which in the dominant homozygous and heterozygous states blocks the function of all fuzz genes, resulting in dominant fuzzlessness. Moreover, two categories of genes for lint formation were suggested: (1) basic polygenes LiA-liA and LiD-liD; and (2) additional polygenes LiA1, LiA2 (from the A-genome) and LiD1, LiD2 (from the D-genome), which serve as enhancers of the basic lint genes (Musaev 1979). The fuzz genes (Ft1 and Ft2) contribute to lint yield through their pleiotropic interaction with lint genes. The inhibitor gene (I) in the dominant state has a negative pleiotropic effect on lint genes while inhibiting all major fuzz genes (Musaev 1979).

As a result of the above-mentioned studies of fiber development genes, a unique F10–12 RIL population segregating for lint and fuzz development was developed (Musaev et al. 2000). This RIL collection is a unique genetic resource for studying lint development loci in cotton because the RIL individuals represent a wide range of segregation (0–48%) for the lint percentage trait; it therefore has potential for studying the molecular determinants of fiber development using molecular markers. DNA-marker technology has become an important tool for mapping and quantifying the effects of loci responsible for the development of quantitative traits in plants. DNA markers linked to fiber traits have the potential to increase the efficiency of various breeding strategies through marker-assisted selection (MAS) programs. Simple sequence repeat (SSR) polymorphisms are considered to be among the markers of choice for genome mapping in many crops because they are (1) PCR-based, (2) usually co-dominant, (3) usually multiallelic, and (4) widely dispersed throughout the genome (Tanksley and McCouch 1997; Reddy et al. 2001; Qureshi et al. 2004). SSR markers have proven useful for resolving differences between individuals, mapping traits of interest in experimental populations, analyzing genome evolution and comparative genomics, and efficiently utilizing wild and primitive germplasm resources in MAS (Bell and Ecker 1994; Akkaya et al. 1995; McCouch et al. 1997). A number of cotton SSR-containing loci, including more than 500 BNL SSRs (Brookhaven National Laboratory, NY, USA; Blenda et al. 2006), 309 JESPR SSRs (Reddy et al. 2001) and 418 CIR SSRs (Nguyen et al. 2004; Lacape et al. 2005), have been identified and made available for cotton genome research. In addition, EST-specific microsatellite markers (Qureshi et al. 2004; Han et al. 2004; Han et al. 2006; Blenda et al. 2006) and SSRs derived from bacterial artificial chromosomes (BACs) (Yu et al. 2002; Yu et al. 2004a; Blenda et al. 2006) have recently been identified in cotton. The development of additional SSR markers is in progress. For example, the Cotton Genetics and Breeding Initiative of Cotton Incorporated, USA is supporting the development of additional SSR markers for the cotton research community (Yu et al. 2004b; Blenda et al. 2006; http://www.mainlab.clemson.edu/cmd). These SSR markers can be used to tag fiber trait-associated genes using a suitable mapping population segregating for fiber traits. EST-SSR markers have the potential to be particularly useful for tagging fiber QTLs since many of the available EST-SSRs were developed from genes expressed in developing cotton fibers, and are thus a priori associated with genes that function during fiber development.

The objective of this research was to identify SSR markers associated with lint percentage QTL in a G. hirsutum intraspecific recombinant inbred line (RIL) population. Here we report several useful SSR markers associated with the cotton lint percentage trait and the chromosomal locations of some of these SSRs. The outcomes of this research should be useful for understanding fiber development in cotton, and the markers identified in this study can be used in map-based cloning of fiber quality genes and in marker-assisted selection (MAS).

Materials and methods

Plant material

We used 76 intraspecific RILs of G. hirsutum derived from a cross between line L–70 (fuzzless/lintless, with 0% lint on cottonseed) and line L–47 (fuzzy/linted, with 41–43% lint on cottonseed). This RIL collection was developed in Uzbekistan, and the genetic inheritance of the lint and fuzz development traits has been studied in F2 and backcross populations (Musaev 1979). Seed of the seventy-six individual RILs, developed through self-pollination over the F10–12 generations by Dr. Musaev and his group, was obtained from the cotton genetic collection at the National University at Tashkent, Uzbekistan (Musaev et al. 2000). These RILs segregated for lint percentage phenotypes from 0 to 48% and represented mostly the extreme phenotypic classes (0–8% and 36–48%) together with some individuals from the middle phenotypic classes (10–26% lint) (Fig. 1, shown with arrows). RILs were re-grown for phenotypic analysis at the field station of the Institute of Genetics and Plant Experimental Biology (Tashkent, Uzbekistan). For each RIL, 25 fully opened bolls were randomly chosen from a group of 12–15 plants and harvested manually. The lint fiber was ginned manually, and lint percentage was calculated as the ratio of lint weight to seedcotton weight.
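As a small illustration of the lint percentage calculation described above, the following Python sketch computes the ratio of lint weight to seedcotton weight. The sample labels and weights are hypothetical and are not data from this study.

```python
def lint_percentage(lint_weight_g: float, seedcotton_weight_g: float) -> float:
    """Lint percentage = lint weight / seedcotton weight x 100."""
    if seedcotton_weight_g <= 0:
        raise ValueError("seedcotton weight must be positive")
    if not 0 <= lint_weight_g <= seedcotton_weight_g:
        raise ValueError("lint weight must lie between 0 and the seedcotton weight")
    return 100.0 * lint_weight_g / seedcotton_weight_g


# Hypothetical (lint weight, seedcotton weight) pairs in grams, for illustration only.
samples = {"lintless-like": (0.0, 95.0), "linted-like": (42.0, 100.0)}
for name, (lint, seedcotton) in samples.items():
    print(name, round(lint_percentage(lint, seedcotton), 1))
```

Seedcotton weight here means the combined weight of lint and seed before ginning, so the value is bounded between 0 and 100.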

Fig. 1

Range of trait phenotypes represented by the mapping set of RIL individuals. (a) Phenotypic distribution, with the P1 and P2 parents showing the two extreme phenotypes; (b) mature cotton bolls from some representatives of the RIL population; white arrows point to the parental phenotypes. The middle-phenotype lines genotyped in the mapping set of 76 RILs are shown with arrows

Genotyping analysis

Leaf samples were collected from the RIL plants and the two parental lines and stored at −80°C until genomic DNA preparation. Genomic DNA of the RIL individuals and parental lines was extracted from leaf tissue according to Dellaporta et al. (1983). A total of 304 SSR markers, including eighty-five EST-SSR primer pairs (Qureshi et al. 2004; Blenda et al. 2006), one hundred and twenty-three JESPR SSR primer pairs (Reddy et al. 2001), and ninety-six BAC-end derived SSR primer pairs from the BAC library of TM–1 (TMB) (Yu et al. 2002; Yu et al. 2004a; Blenda et al. 2006; www.mainlab.clemson.edu/cmd), were genotyped in the parental lines to find potentially polymorphic markers (Table 1). SSR primer pairs that were polymorphic between the parents were then genotyped in the set of 76 RILs representing mostly the two extreme (0–8% and 36–48% lint) and some middle (10–26% lint) phenotypic classes for lint percentage (Fig. 1). Microsatellite genotyping analyses were performed as described by Reddy et al. (2001). SSR loci were amplified from genomic DNA using a hot-start PCR protocol. Amplification reactions were performed in 25 μl volumes containing 2.2 μl of 10× PCR buffer with MgCl2, 0.4 μl of BSA, 0.2 μl of a 25 mM mix of dATP, dGTP, dTTP, and dCTP, 2.0 μl of 25 ng/μl of each of the reverse and forward primers, and 1 μl of 25 ng/μl template DNA. Then, 0.2 U of Taq DNA polymerase (Sigma or Orbigen) was added to the reaction at the annealing temperature of the first cycle. Amplifications were carried out with an initial denaturation at 95°C for 3 min, followed by 45 cycles of 94°C for 1 min, 50°C for 1 min (annealing) and 72°C for 2 min (extension). A final 5 min extension at 72°C was then performed. Polymorphism of SSR amplification products was revealed using polyacrylamide gel (PAG) and agarose gel systems. 
In the PAG system, samples were electrophoresed at 20 V/cm in a mega gel dual adjustable vertical gel rig (42 cm high × 50 cm wide × 1 mm thick; CBS Scientific, USA) containing 6% polyacrylamide (29:1) in 1X TBE buffer, and then visualized after staining with ethidium bromide. In the agarose system, samples were electrophoresed on a 16 cm long horizontal gel (Stratagene, USA) containing 2% standard agarose plus 2% Metaphor® agarose (Amresco, USA) at 5.3 V/cm in 0.5X TBE buffer (45 mM Tris-Borate, 1 mM EDTA, pH 8) with buffer chilling to 4°C. Gels were stained with ethidium bromide and photodocumented using an Alpha Imager 3400 (Innotech Inc., USA).

Table 1 SSR markers screened over parental lines

Molecular mapping

Genotypic data obtained in the RILs were correlated with phenotypic data on lint percentage. Associations of polymorphic DNA markers with the lint percentage trait were determined by single marker analysis (SMA) using MapQTL® 4.0 (Van Ooijen et al. 2002) and QGene 3.0 software (Nelson 1997). Chi-square analysis of marker segregation was performed using QGene (Nelson 1997). A permutation analysis with 10,000 shuffles was carried out to obtain stable estimates for markers associated with lint percentage QTL (Churchill and Doerge 1994). The entire permutation analysis in QGene was repeated 10 times, and the average critical threshold values from these runs were calculated and used to evaluate the significance of marker associations. Further, to better estimate the number of loci governing lint percentage, we conducted interval mapping and multiple quantitative trait mapping (MQM) using MapQTL® 4.0 (Van Ooijen et al. 2002). The map file for interval and MQM mapping was generated using JoinMap® 3.0, and linkage group maps created at LOD = 2.0, LOD ≥ 3.0 and LOD = 6.0 were used for further analyses (Van Ooijen et al. 2002). For MQM mapping, marker cofactors were selected automatically with the cofactor selection option of MapQTL, and markers retained in the backward elimination test at a significance level of P < 0.001 were used as cofactors. Interval mapping results from MapQTL were also verified with QGene (data not shown) (Nelson 1997).
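The permutation approach of Churchill and Doerge (1994) mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the QGene implementation: the function names, the use of a one-way ANOVA F-statistic as the marker statistic, and the genotype and phenotype values are all hypothetical.

```python
import random


def f_statistic(genotypes, phenotypes):
    """One-way ANOVA F-statistic of phenotype grouped by marker genotype."""
    groups = {}
    for genotype, value in zip(genotypes, phenotypes):
        groups.setdefault(genotype, []).append(value)
    n, k = len(phenotypes), len(groups)
    grand_mean = sum(phenotypes) / n
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_between += len(values) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in values)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_threshold(genotypes, phenotypes, n_shuffles=10_000, alpha=0.05, seed=1):
    """Empirical (1 - alpha) quantile of F when phenotypes are shuffled over genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_f = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real marker-trait association
        null_f.append(f_statistic(genotypes, shuffled))
    null_f.sort()
    cutoff = min(n_shuffles - 1, int((1 - alpha) * n_shuffles))
    return null_f[cutoff]


# Hypothetical marker genotypes and lint percentages, for illustration only.
genotypes = ["A"] * 10 + ["B"] * 10
phenotypes = [0.0, 1.5, 2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 8.0, 2.5,
              36.0, 37.5, 39.0, 40.5, 42.0, 43.5, 45.0, 46.5, 48.0, 41.0]
observed = f_statistic(genotypes, phenotypes)
threshold = permutation_threshold(genotypes, phenotypes, n_shuffles=2000)
print("marker exceeds empirical threshold:", observed > threshold)
```

Repeating the call with different seeds and averaging the returned thresholds would mirror the repeated-run averaging described in the text.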

Identification of chromosomal location

The monosomic F1 substitution stock used in this study has 25 chromosomes of the recipient (TM–1) and 26 chromosomes from the donor (G. barbadense or G. tomentosum), respectively (Stelly et al. 2005). Monotelodisomic F1 stocks lack only one arm of a chromosome from the recipient, TM–1. In other words, when an aneuploid BC0F1 cytogenetic stock (CS-B line) showed the heterozygous-banding pattern, such that both TM–1 and G. barbadense or G. tomentosum alleles were amplified, it was considered that the marker locus was not located in the particular missing chromosome or chromosome arm of the substitution line (BC0F1). However, when the cytogenetic stock exhibited a hemizygous-banding pattern, in which only the TM-1 band was missing, it was considered that the allele was located on that missing chromosome or chromosome arm of the aneuploid plant.

DNAs from very young leaves of G. hirsutum TM-1, G. barbadense 3–79, G. tomentosum, as well as cytologically identified monotelodisomic and monosomic substitution lines (BC0F1) for different chromosomes and chromosome arms of G. barbadense and G. tomentosum were used to identify the chromosomal location of the linked SSR markers following the overall deletion analysis strategy of Karaca et al. (2002). These aneuploid substitution lines were developed and provided by Dr. D.M. Stelly at Texas A&M University, College Station, Texas (Stelly et al. 2005).

EST-SSR sequences analyses

Marker sequences, in particular the sequences from EST-SSRs were compared with NCBI sequence database (http://www.ncbi.nlm.nih.gov) and TAIR database (www.arabidopsis.org) using BLASTN and BLASTX algorithms (Altschul et al. 1997).

Results

SSR-marker polymorphism

Eighty-five EST-SSR primer pairs (Qureshi et al. 2004) were screened over the parental lines (L-70 and L-47) (Fig. 1) to find potentially polymorphic markers that could be used to identify lint percentage QTLs. Seventy-six (89%) of the 85 MGHES (EST-SSR) primer pairs produced PCR products, while 9 EST-SSR primer pairs did not amplify. PCR products of sixty-two of these 76 markers were monomorphic, while 14 (18%) MGHES markers (MGHES-11a, 16, 24, 31, 38b, 40, 46, 55, 63, 65a, 66, 77, 57, and 54) revealed polymorphism between the parents.

One hundred and twenty-three JESPR SSR primer pairs (Reddy et al. 2001) were also tested in the parental lines; 105 (85%) primer pairs amplified products of approximately the expected size (Brooks 2001). Of these 105 well-amplified SSRs, twenty (19%) JESPR primer pairs (JESPR-121, 122, 128, 153, 220, 33, 42, 95, 56, 264, 292, 309, 211, 154, 156, 174, 181, 186, 19, 65) were polymorphic between the parents.

Ninety-six BAC-end derived SSR primer pairs from the BAC library of TM-1 (Yu et al. 2002; Yu et al. 2004a; http://www.mainlab.clemson.edu/cmd) were also genotyped in the parental lines. Of the 96 primer pairs, 15 did not amplify the parental genomic DNAs, and 81 (84%) amplified PCR products. Eight (10%) BAC-derived SSR primer pairs (TMB0409, TMB0206, TMB0029, TMB0400, TMB0564, TMB0366, TMB0471, TMB0119) were polymorphic between the parents. In total, we identified 42 (13%) polymorphic SSR primer pairs out of the 304 SSR primer pairs tested in the original parental lines (Table 1). These 42 informative SSR primer pairs amplified a total of 121 different alleles (20 primer pairs amplified two alleles, 9 amplified three alleles, 11 amplified four alleles and 2 amplified five alleles), with an average of about 3 SSR alleles per primer pair.
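The allele and primer-pair totals above can be checked with a short tally. The counts are taken directly from the text; the variable names are only illustrative.

```python
# Number of primer pairs that amplified a given number of alleles (from the text).
pairs_by_allele_count = {2: 20, 3: 9, 4: 11, 5: 2}

# Polymorphic and screened primer pairs per marker collection (from the text).
polymorphic = {"MGHES": 14, "JESPR": 20, "TMB": 8}
screened = {"MGHES": 85, "JESPR": 123, "TMB": 96}

informative_pairs = sum(pairs_by_allele_count.values())
total_alleles = sum(n_alleles * n_pairs for n_alleles, n_pairs in pairs_by_allele_count.items())
mean_alleles = total_alleles / informative_pairs

print(informative_pairs, total_alleles, round(mean_alleles, 2))
print(sum(polymorphic.values()), sum(screened.values()))
```

The mean works out to roughly 2.9 alleles per informative primer pair, which the text rounds to 3.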

Genetic association of SSRs with the lint percentage trait

To associate the products of these 42 polymorphic SSR primer pairs with fiber traits, the primer pairs were screened over the 76 intraspecific RILs of G. hirsutum segregating for lint percentage, and the polymorphic marker genotypes were correlated with the lint percentage phenotypes of the RILs (Fig. 1b). Single marker analysis (SMA) using the Kruskal–Wallis non-parametric test and simple marker regression was used to study the association between the SSR markers and lint percentage. The Kruskal–Wallis rank sum test orders individuals according to phenotype and then sorts them, one locus at a time, by marker genotype (Lehmann 1975). Although it is considered the least stringent of the test statistics, we first used the Kruskal–Wallis test to increase the likelihood of detecting significant QTLs, because few polymorphic markers were detected between the parents. SMA using the Kruskal–Wallis test in MapQTL revealed that 17 SSR markers (6 MGHES, 5 JESPR, and 6 BAC-end derived TMB SSRs) were significantly associated with lint percentage at P = 0.05–0.0001 (Table 2). These results suggest that the 17 SSR markers may be closely associated with cotton lint percentage genes affecting fiber development. However, SMA using QGene, which applies simple marker regression, determined that 11 of the 17 SSR markers were significantly associated with lint percentage QTL, with LOD scores above 2.5 and P-values from 0.0007 to <0.00001 (Table 2). Chi-square (χ2) segregation analysis demonstrated that all of these strongly associated SSR markers (LOD > 2.5) except JESPR-65 deviated significantly from the expected marker segregation ratio (1:1) in the RIL population. JESPR-65 detected several heterozygous genotypes in the RIL population and deviated significantly from the expected 1:2:1 ratio for co-dominant markers (χ2 = 19.96 > 5.99 with df = 2, α = 0.05) (Table 2).
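The two tests used in this paragraph can be illustrated with a minimal Python sketch. It is not the MapQTL or QGene implementation: the function names are hypothetical, the Kruskal–Wallis statistic is computed without a tie correction, and the genotype counts and phenotypes are invented for illustration. The critical value 5.99 (df = 2, α = 0.05) is the one quoted in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(genotypes, phenotypes):
    """Kruskal-Wallis H: rank individuals by phenotype, then sum ranks per genotype."""
    ranks = average_ranks(phenotypes)
    n = len(phenotypes)
    rank_sums, counts = {}, {}
    for genotype, rank in zip(genotypes, ranks):
        rank_sums[genotype] = rank_sums.get(genotype, 0.0) + rank
        counts[genotype] = counts.get(genotype, 0) + 1
    spread = sum(rank_sums[g] ** 2 / counts[g] for g in counts)
    return 12.0 / (n * (n + 1)) * spread - 3 * (n + 1)


def chi_square(observed, ratio):
    """Goodness-of-fit chi-square of observed genotype counts against a ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data, for illustration only.
h_value = kruskal_wallis_h(["A", "A", "A", "B", "B", "B"], [1, 2, 3, 4, 5, 6])
chi_value = chi_square([30, 10, 36], [1, 2, 1])  # 76 lines against 1:2:1
print(round(h_value, 3), round(chi_value, 2), chi_value > 5.99)
```

H is referred to a chi-square distribution with (number of genotype classes − 1) degrees of freedom, which is why both statistics are compared against chi-square critical values.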

Table 2 Association of SSR markers with lint percentage QTL(s) in single marker analyses

Although the SMA results suggested that several SSR markers are associated with loci that might contribute to lint percentage in cotton, the small sample size and the distributional properties of quantitative traits (Churchill and Doerge 1994) led us to estimate critical threshold values and compare them with the marker statistics in order to evaluate the significance of the marker associations found in SMA. Permutation analyses with 10,000 shuffles of the 17 SSR markers significantly associated with lint percentage in SMA demonstrated that 16 SSR markers were significant at the α = 0.05 critical point; that is, the F-statistics of the unshuffled markers exceeded the average critical threshold values from permutation. Further, 12 of the 17 SSR markers showed stable association at the α = 0.01 critical threshold, and the F-values of 5 SSR markers were significant at 100% probability (Fmax), exceeding the average critical threshold estimates of the permutation test (Table 3). We also compared the marker F-statistics obtained in SMA with genome-wide threshold values and observed that only the MGHES-55 marker exceeded the average critical ‘experimentwise’ value at maximum probability (Fmax), demonstrating the power of MGHES-55 to detect a lint percentage QTL in the cotton genome (Table 3). The permutation results demonstrated that at least 12 SSR markers significant at the 99% critical threshold have a stable association with lint percentage QTL(s), and these markers should be useful for understanding the molecular basis of fiber development.

Table 3 The results of permutation analysis with 10,000 shuffles using QGene software

Interval and multiple QTL-mapping

We identified several linkage groups at different LOD stringencies using JoinMap. All eleven SSRs, except JESPR-65, that were significantly associated with lint percentage QTL(s) in SMA formed one linkage group at LOD = 2.0, covering a genetic distance of 42 cM. At LOD ≥ 3.0, however, this linkage group shortened to a sparser group of nine markers spanning 41 cM. Raising the LOD threshold to 6.0 generated an even sparser linkage map of 5 markers covering 20 cM. Very strong linkage was observed among TMB0409, TMB0119 and TMB0366, with recombination frequencies in the range 0.01–0.05 (Fig. 2). The map information was used in interval and MQM mapping of fiber QTLs.
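To relate the recombination frequencies quoted above to map distances in centimorgans, the two standard mapping functions can be sketched as follows. The text does not state which mapping function was used in JoinMap, so both Haldane and Kosambi are shown as an assumption; at recombination frequencies this small they give nearly the same distance.

```python
import math


def haldane_cm(r: float) -> float:
    """Haldane map distance (cM) for recombination frequency r, assuming no interference."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi map distance (cM) for recombination frequency r, allowing interference."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# The range of recombination frequencies reported for TMB0409, TMB0119 and TMB0366.
for r in (0.01, 0.05):
    print(r, round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```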

Fig. 2

Interval mapping results with the linkage group at LOD = 2 (A). Multiple quantitative trait mapping (MQM) results with the linkage groups at LOD ≥ 3.0 (B) and LOD = 6 (C), generated using MapQTL

Interval mapping with the LOD = 2.0 linkage group, representing 11 SSR markers spanning 42 cM, revealed that all 11 markers had a significant association with the lint percentage trait in cotton (Fig. 2a), as each showed a LOD score of 2.0 or above (Van Ooijen 1999). Average ‘comparisonwise’ threshold values (Table 4) from 10,000 shuffles revealed that all eleven markers had stable LOD values at α = 0.05 (average critical LOD threshold of 1.79) and 9 markers (MGHES-55, TMB0366, TMB0409, MGHES-31, TMB0119, JESPR-153, TMB0471, TMB0029, MGHES-46) had stable LOD values at α = 0.01 (average empirical LOD threshold of 2.07). This indicates that at least 9 markers with stable LOD scores at the α = 0.01 critical threshold have a significant QTL effect. Since interval mapping usually ignores the effects of additional mapped or not yet mapped QTLs (Jansen and Stam 1994), simultaneous mapping of multiple QTLs (MQM) gives more efficient and accurate results (Knapp 1991; Haley and Knott 1992; Jansen 1993; Jansen and Stam 1994). MQM results with the LOD ≥ 3.0 linkage group of nine SSR markers and MGHES-55 as a cofactor revealed one highly significant lint percentage QTL coinciding with the region between TMB0471 and MGHES-31, which explained about 23–59% (LOD = 4.92–12.37) of the phenotypic variation. Another QTL region, between markers MGHES-31 and TMB0366, explained 5.4–12.5% of the phenotypic variation and could be considered a minor-effect QTL (Fig. 2b). However, the effect of this QTL region increased markedly in the absence of the QTL region around MGHES-55: it accounted for 23.3–40.4% of the trait variation in MQM mapping using the LOD = 6.0 linkage group of 5 markers spanning 20 cM, where MGHES-31 was used as a cofactor (Fig. 2c).

Table 4 Estimated LOD threshold values for markers in interval mapping

Chromosomal location

We assigned several of the fiber trait-associated SSR markers to specific chromosomes using aneuploid cytogenetic stocks. TMB0366 amplified three products of 201 bp, 206 bp and 216 bp in TM-1 and two products of 201 bp and 216 bp in G. barbadense. All of the aneuploid chromosome substitution F1 lines, except the aneuploid line for the long arm of chromosome 26 (designated 26 Sh), had the heterozygous phenotype for the polymorphic TMB0366 marker, similar to a normal F1 plant. However, the aneuploid chromosome substitution F1 plant deficient for the long arm of chromosome 26 (26 Sh) showed only the G. barbadense alleles (201 bp and 216 bp) and lacked the 206 bp allele of TM-1, suggesting that the polymorphic TMB0366 locus (206 bp in TM-1) is located on the long arm of chromosome 26 (Fig. 3). This result indirectly confirmed that the lint percentage QTL associated with the TMB0366 locus is also located on the long arm of chromosome 26. We also determined the chromosomal locations of several other lint percentage-associated markers following a similar strategy. The polymorphic 204 bp marker of TMB0471 was assigned to chromosome 12 using G. barbadense aneuploid lines, while the polymorphic markers of JESPR-153 (95 bp) and TMB0029 (210 bp) were assigned to chromosome 18 using aneuploid lines of G. barbadense and G. tomentosum and CS-B stocks, respectively. JESPR-122, which amplified a 198 bp band polymorphic between TM-1 and 3–79, was assigned to chromosome 23 using G. barbadense aneuploid stocks. We could not determine the chromosomal locations of the other SSR markers implicated in our study, either because they were monomorphic among the TM-1, G. tomentosum and G. barbadense parental lines or because, where polymorphism existed, we could not detect missing bands in any of the aneuploid lines in our gel system. 
This suggests that the location of these not-assigned SSR markers is on other chromosomes/arms not represented in the cytogenetic stocks.
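The deletion-analysis logic used for TMB0366 can be written out as a short Python sketch. The band sizes come from the text; the function name, the stock labels other than 26 Sh, and the band sets assigned to those other stocks are hypothetical placeholders for lines that showed the heterozygous pattern.

```python
def stocks_missing_allele(tm1_specific_bp, bands_by_stock):
    """Return the aneuploid F1 stocks in which the TM-1-specific band is absent.

    A missing TM-1 band (hemizygous pattern) places the locus on the chromosome
    or chromosome arm that the stock lacks; a heterozygous pattern does not.
    """
    return [stock for stock, bands in bands_by_stock.items()
            if tm1_specific_bp not in bands]


tm1_bands = {201, 206, 216}         # TMB0366 products in TM-1
barbadense_bands = {201, 216}       # TMB0366 products in G. barbadense
tm1_specific = (tm1_bands - barbadense_bands).pop()  # the polymorphic 206 bp band

bands_by_stock = {
    "euploid F1 (hypothetical label)": {201, 206, 216},
    "other aneuploid F1 (hypothetical label)": {201, 206, 216},
    "26 Sh": {201, 216},
}
print(tm1_specific, stocks_missing_allele(tm1_specific, bands_by_stock))
```

An empty result for a polymorphic marker would correspond to the unassigned markers discussed above, where no stock in the panel showed a missing band.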

Fig. 3

Capillary electrophoresis results, showing the chromosomal positions of TMB0366 (TMHNO3). (a) TM1; (b) G. barbadense; (c) Cytogenetic line with substitution of chromosome 16; (d) Cytogenetic line with substitution of long arm of chromosome 26 (26sh). Arrows show the location of SSR PCR products

Sequence blast analyses

Since EST-SSRs are a priori associated with functional genes, the sequences of the EST-SSRs significantly associated with lint percentage in this study were searched against GenBank using BLAST to check whether they have homology with known functional plant genes that would support the marker-trait association results. BLAST analysis of the flanking regions of the MGHES-55 EST sequence, the SSR marker most strongly associated with lint percentage QTL(s), revealed significant similarity to cotton boll abscission zone ESTs (GenBank accession numbers AI055070 and AI055346; BLAST E-value = 6e−032) deposited in the NCBI database (http://www.ncbi.nlm.nih.gov) (Fig. 4a). BLAST analysis of MGHES-31, another marker associated with lint percentage QTL, against Arabidopsis protein sequences in the TAIR database (www.arabidopsis.org) revealed that this EST has strong structural similarity to an unknown Arabidopsis protein, MBK23.17, with a BLAST E-value of 7e−17 (Fig. 4b). Multiple sequence alignment of the genomic sequence of MBK23.17 (GenBank accession number AB005233.1) and the MGHES-31 EST fragment (GenBank accession number AW186938.1) also demonstrated significant homology at the nucleotide and putative protein sequence levels, corresponding to the first and second exons of MBK23.17 (Fig. 4b, c). We did not find the CAT repeat site (a histidine repeat motif) in the Arabidopsis MBK23 clone, implying that the histidine repeat motif is specific to the cotton MGHES-31 locus. Moreover, the MGHES-31 EST showed significant similarity to microtubule-binding and synaptonemal complex proteins (data not shown). More importantly, several EST-SSR sequences that also showed significant association with lint percentage in this study (MGHES-31, MGHES-46, and MGHES-16) showed nucleotide sequence similarity to 6–10 dpa fiber EST sequences of G. hirsutum and G. arboreum (GenBank accession numbers BF278023, BG446437 and AI731036, respectively) found in the NCBI database (http://www.ncbi.nlm.nih.gov). These similarities are expected, since the EST-SSRs used in this study were developed from cDNAs of fiber-related tissues (74%) as well as cotton boll and cottonseed tissues (26%) (Qureshi et al. 2004).

Fig. 4

Multiple sequence alignment for (a) MGHES-55 and cotton boll abscission ESTs–AI055070 and AI055346 refer NCBI accession number for cotton boll abscission ESTs; (b) protein and (c) nucleotide sequence alignment of MGHES-31 (acc. no. AW186938.1) and MBK23.17 of A. thaliana (acc. no. AB005233). Sequence alignments were generated using ClustalX 1.8 for Macintosh

Discussion

The cotton fiber is derived from a single cell of maternal origin that initiates from the ovule at or just before anthesis. Long lint fibers initiate between the day of anthesis and 4–6 days post anthesis (dpa); whereas, short fuzz fibers (10 mm) are initiated in the 4–10 dpa period (Basra and Saha 1999). Researchers have reported several factors that play an important role in the fiber initiation process; for instance, the involvement of the vacuolar H+-ATPase and the plasma membrane H+ATPase in fiber cell expansion by controlling cell turgor have been reported by Wilkins and Jernstedt (1999). Involvement of sucrose synthase Suc gene in fiber development has also been reported (Nolte et al. 1995). Recent research has included an analysis of the gene expression profiles of developing fiber cells (Pear et al. 1996; Ruan et al. 2001; Asnaghi et al. 2001; Wu et al. 2001; Ruan et al. 2003; Suo et al. 2003; Ji et al. 2003; Giband et al. 2003; Arpat et al. 2004; Li et al. 2005; Lee et al. 2006). These studies identified potential roles for several known pathways and genes in fiber development, including auxin signaling genes, the MAP-kinase (MAPK) pathway genes, sucrose synthase (Suc), cellulose synthase (GhCesA), the transcription factor GhMyb109 and genes associated with cytoskeleton development. Expansin and profilin genes may also play important roles in fiber initiation and development (Orford and Timmis 1998; Ji et al. 2003). Lee et al. (2006) identified several potential genes for fiber initiation and development that were down-regulated either in non-fiber tissues or in fibers of N1N1, n2n2, and LiLi fiber mutants in TM-1 isogenic background using oligo-gene microarrays. 
Although these genes that are specifically expressed in developing fibers are undoubtedly important candidate genes for fiber initiation and development traits, there is still no evidence that the expression of any particular gene plays an essential role in cotton fiber development, indicating the need for further studies in this direction.

An alternative avenue for tagging important fiber development genes is the identification of molecular markers linked to fiber-specific QTLs. This approach complements the above-mentioned efforts to tag fiber development genes through gene expression profiling. Seven cotton fiber mutants have recently been comparatively analyzed and genetically mapped using RFLP markers (Rong et al. 2005); the fiber development genes Li1, Li2, N1, n2, Fbl and SMA-4 were mapped with RFLP markers and assigned to chromosomes based on linkage map information (Rong et al. 2004). However, there is a need for additional portable PCR-based DNA markers for future molecular breeding programs in cotton. We examined publicly available markers from the JESPR, BAC-SSR and MGHES microsatellite marker collections to tag lint percentage genes using a set of unique intraspecific RILs derived from a cross between fuzzless/lintless and fuzzy/linted Upland cotton lines. These RILs, with lint percentage ranging from no fiber to 48% lint, provided an opportunity to identify QTL(s) associated with lint percentage.

Efficient detection of QTLs and accurate estimation of QTL effects both require large sample sizes (Soller et al. 1976; Beavis 1994). However, to reduce the number of samples to be genotyped, selective genotyping can be used, in which molecular marker genotyping is applied to individuals representing the two extremes of the segregating population (Lander and Botstein 1989; Darvasi and Soller 1992; Darvasi 1997). In our study, we genotyped mostly the extreme phenotypic classes (0–8% and 36–48% lint) and some middle phenotypic classes (10–26% lint) using SSRs. We found 17 SSR markers (six MGHES, five JESPR, and six BAC-end derived TMB SSRs) significantly associated with lint percentage QTLs in the single marker analyses. Permutation analyses of these SSR markers revealed that at least 12 SSRs have stable estimates at the empirical threshold of α = 0.01, demonstrating the usefulness of these markers for understanding the molecular basis of the lint percentage trait.

Although very strong linkage was observed between some SSR markers (e.g. TMB0409, TMB0119 and TMB0366), the overall map information is not strongly supported and should be interpreted with caution, since several SSR markers assigned to different chromosomes (e.g. TMB0471, JESPR-122 and TMB0366) mapped together at close genetic distances (Fig. 2). Constructing linkage groups with few polymorphic markers and a limited sample size is subject to constraints (Young 2000). However, to gain some insight into the number of QTL regions responsible for lint development, we identified a linkage group of eleven SSR markers at LOD = 2.0 and a linkage group of nine SSR markers at the LOD ≥ 3.0 threshold, where all SSR markers within the linkage groups were empirically associated with the lint percentage trait in SMA. Interval mapping revealed that at least 9 markers have stable LOD scores, indicating significant associations with lint percentage QTL(s). Multiple QTL mapping with the LOD ≥ 3.0 linkage group revealed at least two highly significant fiber development QTL, one in the region between TMB0471 and MGHES-31 and one between MGHES-31 and TMB0366. These regions are good candidate loci contributing to fiber development in cotton. These results are consistent with the classical view that a few QTL regions control lint/fuzz development in cotton (R. J. Kohel, personal communication).

Several of these markers could themselves be candidate loci that contribute to cotton fiber development. For example, MGHES-31, MGHES-46 and MGHES-16 are possible candidate loci contributing to fiber initiation and development, since they showed significant homology with 6–10 dpa fiber EST sequences of G. hirsutum and G. arboreum when compared with sequences in the NCBI database. Additionally, BLAST analyses of some EST-SSR sequences revealed significant homology with sequences of known function. The flanking regions of the MGHES-55 EST-SSR had significant similarity to a cotton boll abscission zone EST, while MGHES-31 had significant similarity to microtubule-binding and synaptonemal complex proteins, which indirectly indicates the putative biological activity of these ‘candidate gene’ loci. This suggests that abscission-like genes may play an important role in lint development in cotton, a possibility that has yet to be studied further. There is evidence that genes associated with cytoskeleton development, including tubulins and actins, play an important role in fiber cell development (Ji et al. 2003; Li et al. 2005). Furthermore, the structural homology, at both the protein and nucleotide levels, of MGHES-31 with an unknown expressed protein of Arabidopsis, MBK23.17, points to the importance of this unknown factor in both plant families; this homology could help expedite the determination of the possible biological function of MGHES-31 through gene-knockout technologies in Arabidopsis. In addition, several BAC-derived SSRs that showed stable association with lint percentage QTL(s), such as TMB0471, TMB0366, TMB0409 and TMB0119, may be useful for identifying physical genomic contigs that contain fiber development genes. These markers may serve as starting points for future positional cloning of fiber development genes in cotton, with their subsequent integration into genetic linkage maps.

Our results demonstrated that marker TMB0366, associated with a lint percentage QTL, was located on the long arm of chromosome 26. Another lint percentage-associated marker, TMB0471, was assigned to chromosome 12. Chromosomes 12 and 26 are considered homeologous chromosomes in tetraploid cotton (AD genome) (Endrizzi et al. 1984). Kohel (1973) reported that fuzzlessness genes in cotton were located on the homeologous chromosomes 12 and 26. We assigned two other lint percentage-associated SSR markers, JESPR-153 and TMB0029, to chromosome 18. Several morphological trait genes associated with flower color and open bud were located on linkage group XII, or chromosome 18 (Endrizzi et al. 1985). Zhang et al. (2002) reported that chromosomes 5 and 18 form homeologous linkage groups based on molecular marker analysis. Rong et al. (2004) observed seven duplicated loci on chromosome 18, suggesting that this chromosome has undergone extensive duplication. Lacape et al. (2003) reported the existence of dense marker regions on chromosome 18 covering 162.6 cM with 37 loci. A fiber length QTL was located on chromosome 18 (Kohel et al. 2001), as were QTLs for fiber strength and fiber color yellowness (Paterson et al. 2003). Moreover, one of the lint percentage QTL-associated SSRs in this study, JESPR-153, has also been assigned to chromosome 18, where it marks a fiber elongation QTL (Shen et al. 2005), implying the significance of chromosome 18 in lint development. According to Turley and Kloth (2002), at least five chromosomes (12, 17, 18, 20 and 26) might carry loci for fuzz development. In our study, the lint percentage-associated SSR markers were assigned to chromosomes 12, 18, 23 and 26. Saha et al. (2006) reported that chromosomes 18 and 26 Lo had QTLs with positive and negative additive effects, respectively, on lint percentage in cotton. We assigned another lint percentage-associated marker, JESPR-122, to chromosome 23, suggesting the existence of fiber trait loci on other chromosomes of cotton. There is further evidence for additional fiber loci elsewhere in the genome: the Ligon lintless-1 mutant phenotype was assigned to chromosome 22 (Karaca et al. 2002; Rong et al. 2005). Chromosomal localization of the other potential markers in this study will be necessary to identify the chromosomes responsible for lint development in cotton.

In conclusion, our efforts to tag lint percentage trait genes should be useful for understanding the molecular basis of fiber cell growth in cotton. The identified SSR markers that are significantly associated with lint percentage QTL(s) should help breeders transfer these loci from donor RIL individuals into highly adapted and productive elite cotton cultivars through MAS programs.