Introduction

Cotton is one of the most important crops in the world, providing natural fibre used in many facets of daily life. The most extensively cultivated cotton species are the allotetraploids Gossypium hirsutum and G. barbadense, which account for ∼95% and 2% of total production worldwide, respectively (National Cotton Council, http://www.cotton.org). G. hirsutum is known for its high yield and G. barbadense for its high fibre quality (Pang et al. 2012). The contrasting traits of the two species make them ideal for studying the genetic basis of yield and fibre quality in cotton (Mei et al. 2004; Lacape et al. 2005, 2010; He et al. 2007; Yu et al. 2013). However, because yield and fibre quality in cotton are complex quantitative traits, studies on the genetic basis of these traits rely on genetic linkage maps.

The first interspecific linkage map was constructed with restriction fragment length polymorphism (RFLP) markers (Reinisch et al. 1994). However, a shortage of RFLP probes has restricted their worldwide use. Fortunately, PCR-based markers, especially simple sequence repeats (SSRs), have been developed rapidly. There are now 17,448 publicly available SSR markers in cotton (http://www.cottongen.org), which facilitate linkage map development and QTL mapping in cotton worldwide. EST-SSRs account for nearly half of all SSRs. Several high-density interspecific linkage maps (Rong et al. 2004; Yu et al. 2011; Zhao et al. 2012; Fang and Yu 2012) and intraspecific linkage maps have been developed in cotton. In addition, many QTL mapping studies for yield and fibre quality have been conducted with both interspecific (Mei et al. 2004; Lacape et al. 2005, 2010; He et al. 2007; Yu et al. 2013) and intraspecific populations (Ulloa et al. 2005; Zhang et al. 2005, 2009, 2012; Shen et al. 2006; Wang et al. 2006; Qin et al. 2008; Wu et al. 2009; He et al. 2011; Liu et al. 2012; Sun et al. 2012). These studies have contributed much to the understanding of the complex genetic control of cotton yield and fibre quality.

Recently, the focus of QTL mapping in cotton has shifted to the transcriptional level. Liu et al. (2009) applied cDNA-amplified fragment length polymorphisms (AFLPs) to construct fibre transcriptome groups at the secondary cell wall (SCW) thickening stage, based on an interspecific backcross (BC1) of G. hirsutum × G. barbadense. They mapped 78 transcript-derived fragments (TDFs) into eight transcriptome groups, and detected two significant QTLs, FS1 and FS2, which explained 16.08% and 15.87% of the fibre strength variance, respectively. Liu et al. (2011) constructed a transcriptome map based on cDNA-AFLPs, using an immortalized F2 (IF2) population of the cotton hybrid Xiangzamian 2 (G. hirsutum). A total of 302 TDFs were mapped onto 26 linkage groups, and 71 QTLs for yield and yield component traits were detected across four environments, with 13 QTLs identified in at least two environments. Claverie et al. (2012) used quantitative cDNA-AFLP to monitor variation in the expression level of cotton fibre elongation and secondary cell wall thickening transcripts in a population of interspecific G. hirsutum × G. barbadense recombinant inbred lines (RILs). Two-thirds of the normalized intensity ratios of TDFs were mapped to between 1 and 6 eQTLs.

Genomewide transcriptome maps form the foundation of gene mapping and cloning as well as comparative genomics. Transcriptome maps are usually constructed using cDNA-AFLPs, since the cDNA-AFLP technique is a rapid and practical method for directly mapping expressed genes (Brugmans et al. 2002). Transcriptome maps have been successfully constructed by cDNA-AFLP in Arabidopsis and potato (Brugmans et al. 2002; Ritter et al. 2008) as well as in cotton (Liu et al. 2009, 2011; Claverie et al. 2012). The cDNA-AFLP technique is high-throughput; however, the procedure is complex. cDNA-SRAP, a technique similar to but simpler than cDNA-AFLP, has been applied to construct a transcriptome map in Brassica oleracea (Li et al. 2003).

Alongside these two techniques, constructing transcriptome maps using EST-SSRs is also a practical method of directly mapping expressed genes. Although EST-SSRs are developed from ESTs, they are typically applied in DNA research as they are simpler and easier to work with. In fact, EST-SSRs can be directly used to amplify cDNA to construct SSR-based transcriptome linkage maps. There are many EST-SSRs in different plants, with more than 8000 EST-SSRs in cotton (http://www.cottongen.org). Thus, EST-SSRs are a valuable resource for constructing transcriptome maps. Generally, EST-SSRs were first applied in DNA-based genetic maps, so many of them have already been mapped to chromosomes. When they are applied in transcriptome maps, the linkage groups can easily be assigned to the corresponding chromosomes. In contrast, most TDFs from cDNA-AFLPs and cDNA-SRAPs have not been mapped to chromosomes, since they are hard to assign (Brugmans et al. 2002; Li et al. 2003; Ritter et al. 2008; Liu et al. 2009, 2011).

In this study, we utilized the cDNA samples obtained from developing fibres from an F2 population of Emian22 × 3-79 at five days post anthesis (DPA) as templates to explore the possibility that a transcriptome map could be constructed by amplifying cDNA using EST-SSRs. We have shown that this practical and simple method for constructing transcriptome linkage maps will facilitate eQTL mapping in cotton, which will help to analyse the genetic basis of economically desirable traits at the transcriptional level.

Materials and methods

The mapping parents were the G. hirsutum cultivar ‘Emian22’ and the G. barbadense accession ‘3-79’, which were described in previous work with the BC1 mapping population (Yu et al. 2011). To construct a transcriptome linkage map, a new F2 population was developed. The F2 population (∼200 individuals) was planted in the experimental field of Huazhong Agricultural University, Wuhan, Hubei, China in 2009. However, only 69 plants were used to construct the map, as enough samples for RNA extraction could be obtained only from these plants. A previous study found that the differences in developing fibres at five DPA were obvious between the mapping parents (Liu et al. 2013). Therefore, RNA was extracted from developing fibres at five DPA using the method described by Zhu et al. (2005). First-strand cDNA was synthesized using 3 μg of RNA from each sample, following the protocol provided with the SuperScript® III RT kit (Invitrogen, Carlsbad, USA).

Polymorphism screening

All the markers used in this study were obtained from the previously published BC1 linkage map (Yu et al. 2011) and had therefore already been mapped to chromosomes. The markers on the BC1 map included both gSSRs and EST-SSRs. The EST-SSRs were directly used to screen for polymorphisms in the cDNA of developing fibres at five DPA. The sequences of the gSSRs were blasted against the cotton ESTs to identify SSRs derived from transcribed sequences, and these SSRs were then also used to screen for polymorphisms.

PCR amplification and electrophoresis

Polymerase chain reaction (PCR) was conducted in a 20 μL volume containing 25 ng DNA, 0.2 μmol L−1 forward primers, 0.2 μmol L−1 reverse primers, 1× buffer, 1.5 mmol L−1 MgCl2, 0.3 mmol L−1 dNTPs and 0.5 units of Taq polymerase. The PCR profile consisted of an initial denaturation at 94°C for 2 min followed by 30 cycles of denaturation at 94°C for 1 min, annealing at 55°C for 1 min, and extension at 72°C for 1 min, with a final incubation at 72°C for 10 min. First, the amplification products obtained with all of the primers were genotyped using the SSR analysis protocol on 6% denaturing polyacrylamide gels at room temperature (Lin et al. 2005). Next, the monomorphic PCR products were separated based on single-strand conformation polymorphisms (SSCPs) on 6% nondenaturing polyacrylamide gels at a constant 8 W for ∼16 h at 4°C.
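As a quick arithmetic check on the reaction setup, the following Python sketch converts the stated final concentrations into absolute amounts per 20 μL reaction and sums the programmed hold times of the cycling profile. The identifiers (`amount_in_reaction`, `PROFILE`) are illustrative and are not part of the original protocol. Ramping time between temperature steps is ignored, so an actual run would take somewhat longer.

```python
# Illustrative sketch: amounts per reaction and programmed cycling time.
# All identifiers are hypothetical; the numbers come from the protocol text.

REACTION_VOLUME_UL = 20.0


def amount_in_reaction(conc_per_litre: float,
                       volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Return concentration * volume.

    With the concentration in umol/L and the volume in uL the result is
    in pmol; with the concentration in mmol/L the result is in nmol.
    """
    return conc_per_litre * volume_ul


primer_pmol = amount_in_reaction(0.2)  # each primer at 0.2 umol/L -> pmol
mgcl2_nmol = amount_in_reaction(1.5)   # MgCl2 at 1.5 mmol/L -> nmol
dntp_nmol = amount_in_reaction(0.3)    # dNTPs at 0.3 mmol/L -> nmol

# (step description, minutes per repeat, number of repeats)
PROFILE = [
    ("initial denaturation, 94 C", 2, 1),
    ("denaturation, 94 C", 1, 30),
    ("annealing, 55 C", 1, 30),
    ("extension, 72 C", 1, 30),
    ("final incubation, 72 C", 10, 1),
]

# Sum of programmed hold times only (no ramping between steps).
programmed_minutes = sum(minutes * repeats for _, minutes, repeats in PROFILE)

print(f"each primer: {primer_pmol:.1f} pmol, "
      f"MgCl2: {mgcl2_nmol:.1f} nmol, dNTPs: {dntp_nmol:.1f} nmol")
print(f"programmed hold time: {programmed_minutes} min")
```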

Transcriptome linkage map construction

After polymorphism screening, all the polymorphic markers were used to genotype the entire F2 population. If a primer detected multiple loci, letters (a, b, c...) were assigned to the loci in order of descending fragment size. The linkage map was constructed with JoinMap 3.0 (Stam 1993) using a logarithm of odds (LOD) threshold of 4.0 and a maximum recombination fraction of 0.4. Map distances were calculated in centiMorgans (cM) using the Kosambi mapping function (Kosambi 1944). Linkage groups were assigned to the corresponding chromosomes using mapped SSRs (http://www.cottongen.org).
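For readers unfamiliar with the Kosambi mapping function, the sketch below shows the standard conversion between a recombination fraction r and a map distance d in cM, d = 25 ln[(1 + 2r)/(1 − 2r)], together with its inverse. It illustrates the formula only and is not a description of JoinMap's internal estimation procedure; the function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in cM for recombination fraction r (Kosambi 1944)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse of kosambi_cm: recombination fraction for a distance in cM."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


# Distance corresponding to the maximum recombination fraction used (0.4).
max_pairwise_cm = kosambi_cm(0.4)
print(f"r = 0.4 corresponds to {max_pairwise_cm:.2f} cM")
```

Under this function, the recombination fraction threshold of 0.4 corresponds to a pairwise distance of roughly 55 cM, which gives a sense of how loosely linked two adjacent loci could be and still be considered.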

Functional annotation of SSR-ESTs

The SSR sequences were functionally annotated using Blast2GO (Conesa et al. 2005; Götz et al. 2008) with default parameters. First, blastx analysis was carried out with an E-value of 10⁻⁵ against the nr protein database in GenBank, and the sequences with >80% identity were retained. Next, GO-mapping, annotation analysis, InterProScan and InterProScan GOs were performed. Finally, GO-Slim (http://www.geneontology.org/GO.slims.shtml) was carried out to acquire specific GO terms.
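The retention criteria above can be sketched as a simple filter over BLAST tabular output (the standard 12-column `-outfmt 6` layout, in which column 3 is percent identity and column 11 is the E-value). This is only an illustration of the two thresholds, not Blast2GO's own implementation. It assumes the E-value of 10⁻⁵ acts as an upper cutoff, and the sequence names in the example are invented.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity (column 3 of outfmt 6)
    evalue: float    # E-value (column 11 of outfmt 6)


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated BLAST outfmt 6 lines, skipping blanks and comments."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]), float(fields[10])))
    return hits


def filter_hits(hits: Iterable[Hit],
                max_evalue: float = 1e-5,
                min_identity: float = 80.0) -> List[Hit]:
    """Keep hits with E-value <= max_evalue and identity > min_identity."""
    return [h for h in hits if h.evalue <= max_evalue and h.identity > min_identity]


# Hypothetical example lines (made-up identifiers and values).
EXAMPLE = [
    "EST_A\tprot_1\t92.5\t120\t9\t0\t1\t360\t10\t129\t3e-40\t160",
    "EST_B\tprot_2\t65.0\t100\t35\t0\t1\t300\t5\t104\t1e-12\t70.1",
    "EST_C\tprot_3\t88.0\t30\t3\t0\t1\t90\t1\t30\t0.002\t35.0",
]

retained = filter_hits(parse_tabular(EXAMPLE))
print([h.query for h in retained])
```

In the example, only the first hit passes both thresholds: the second fails on identity and the third on E-value.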

Results and discussion

Marker polymorphism

In this study, the mapping parents were the same as those used to construct the DNA-based linkage map (Yu et al. 2011). The EST-SSRs in the DNA-based linkage map were directly used to construct the transcriptome map. To map more markers, genomic SSRs in the map were blasted against the cotton ESTs to identify transcribed markers. Finally, 1270 transcript-derived SSRs were included in this study (table 1). The samples used for transcriptome map construction were cDNAs from developing fibres at five DPA, as it was confirmed that a great number of TDFs differed between the mapping parents at this stage (Liu et al. 2013). After polymorphism detection, only 303 primers (23.86%) showed polymorphism, 267 (21.02%) produced amplification products but lacked polymorphism, and 700 (55.12%) did not amplify the target (table 1). Amplification probably failed because the transcripts targeted by these SSRs are tissue specific and were not present in developing fibres at five DPA.
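The screening percentages can be re-derived from the counts given in the text. The short Python check below does so. The dictionary keys are descriptive labels chosen for this sketch, not category names from table 1.

```python
# Counts taken from the text; labels are illustrative.
SCREENING_COUNTS = {
    "polymorphic": 303,
    "amplified, monomorphic": 267,
    "no amplification": 700,
}

total_primers = sum(SCREENING_COUNTS.values())

# Percentage of all screened primers in each category, to two decimals.
screening_percent = {
    label: round(100.0 * count / total_primers, 2)
    for label, count in SCREENING_COUNTS.items()
}

print(total_primers)
for label, pct in screening_percent.items():
    print(f"{label}: {pct}%")
```

The three categories sum to the 1270 transcript-derived SSRs screened, and the computed percentages agree with those reported.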

Table 1 Polymorphisms of different SSRs in developing fibres at five DPA between mapping parents.

Transcriptome linkage map construction

The cDNA samples obtained from the F2 population were genotyped with the 303 polymorphic primers; however, some primers generated ambiguous gel bands and were discarded from subsequent analyses. The remaining 244 primers produced clearly scorable gel bands with 279 polymorphic loci; among them, 58 (20.79%) were 3-79 dominant loci, 18 (6.45%) were Emian22 dominant, and 203 (72.76%) were codominant loci. Generally, SSRs are codominant. The genes corresponding to the codominant markers are expressed in both parents. However, variations in the microsatellite region could occasionally cause alterations in the translated products, which could result in trait variations. The genes corresponding to dominant markers are expressed in only one parent, which could alter the gene network of the trait and result in trait variations.

After linkage analyses, 242 loci were mapped into 32 linkage groups, with the remaining 37 loci unmapped. The total length of the transcriptome linkage map was 1938.72 cM (table 2; figure 1). The longest linkage group was 132.74 cM with 15 loci (LG05/Chr05), while the shortest linkage group was 2.33 cM with two loci (LG31/Chr25); the average length of a linkage group was 60.58 cM. The average marker interval distance was 8.01 cM, ranging from 0.01 cM (LG16/Chr11) to 36.12 cM (LG11/Chr09). The AT genome included 18 linkage groups with a total of 114 loci; the total length was 1041.68 cM, and the average marker interval distance was 9.14 cM. By comparison, the DT genome included 14 linkage groups with a total of 128 loci; the total length was 897.04 cM and the average marker interval distance was 7.01 cM. The genomewide distribution of EST-SSRs implied that fibre development in cotton involves genes distributed throughout the entire genome.
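The summary statistics in this paragraph can be cross-checked with a few lines of Python. In the sketch below, the average marker interval is computed as total length divided by the number of loci, which is the definition that reproduces the reported values; dividing by the number of intervals (loci minus linkage groups) would give larger figures. The class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapSummary:
    label: str
    linkage_groups: int
    loci: int
    length_cm: float

    @property
    def mean_interval_cm(self) -> float:
        # Length per locus; this matches the values reported in the text.
        return self.length_cm / self.loci

    @property
    def mean_group_length_cm(self) -> float:
        return self.length_cm / self.linkage_groups


# Subgenome figures taken from the text.
AT = MapSummary("AT", linkage_groups=18, loci=114, length_cm=1041.68)
DT = MapSummary("DT", linkage_groups=14, loci=128, length_cm=897.04)

# Whole-map figures derived by summing the two subgenomes.
WHOLE = MapSummary(
    "whole map",
    linkage_groups=AT.linkage_groups + DT.linkage_groups,
    loci=AT.loci + DT.loci,
    length_cm=AT.length_cm + DT.length_cm,
)

for summary in (AT, DT, WHOLE):
    print(f"{summary.label}: {summary.linkage_groups} groups, "
          f"{summary.loci} loci, {summary.length_cm:.2f} cM, "
          f"mean interval {summary.mean_interval_cm:.3f} cM")
```

Summing the two subgenomes recovers the 32 linkage groups, 242 loci and 1938.72 cM reported for the whole map.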

Figure 1 Interspecific transcriptome linkage map of fibre development in cotton constructed using EST-SSRs.

Table 2 Basic information on the F2 transcriptome linkage map based on developing fibres at five DPA.

We found that dominant markers were absent on Chr02, Chr04, Chr16, Chr17, Chr21 and Chr25 (table 3). More chromosomes lacked dominant markers in the DT genome than in the AT genome; however, there were more dominant markers overall in the DT genome, indicating that the dominant markers were clustered on particular chromosomes of the DT genome, especially Chr15. The uneven distribution of EST-SSRs on the chromosomes implied that different chromosomes played different roles in developing fibres at five DPA. This result also provided evidence for the existence of tissue-specific transcriptomes.

Table 3 Distribution of dominant markers in the transcriptome map.

This map is suitable for mapping eQTLs. Unfortunately, most of the cotton bolls were sampled for RNA extraction to construct the transcriptome map; thus, not enough bolls remained to phenotype fibre quality. However, when transcriptome maps are constructed from permanent populations, it is feasible to map eQTLs to understand the genetic basis of cotton fibre traits at the transcriptional level. The eQTLs can also be compared with previous QTLs based on DNA markers to identify reliable QTLs.

Functional annotation of SSR-ESTs

After blasting against the nr protein database, 53 of the 242 SSR-ESTs with >80% identity were functionally annotated, and they were classified into 10 categories (figure 2). Except for the 16% with unknown function, the top functions were transcription activity (15%) and transportation activity (14%). Some SSR-ESTs were involved in fibre differentiation (e.g. NAU3495), fibre initiation (e.g. NAU3496) and fibre elongation (e.g. MUSS250). These results were in accordance with the sampling phase (developing fibres at five DPA) as five DPA is the overlapping period of fuzz cell differentiation, lint initiation and elongation (Lee et al. 2007).

Figure 2 Functional annotation of mapped SSR-ESTs.

In conclusion, an EST-SSR-based interspecific transcriptome linkage map of fibre development in cotton was constructed. This new approach to constructing transcriptome linkage maps makes it easy to assign linkage groups to chromosomes and should facilitate eQTL mapping in plants.