Introduction

Cotton is a worldwide cash crop that supplies indispensable natural fibre for the textile industry. Upland cotton (Gossypium hirsutum L.) has a high lint yield and accounts for more than 95% of world cotton production (Li et al. 2016b). Compared with upland cotton, extra-long staple cotton (G. barbadense L.), also known as Pima, Egyptian or Sea-Island cotton, has high fibre quality, but its low yield limits the areas where it is grown (Said et al. 2015b). Plant height (PH) is one of the main factors affecting light interception and photosynthesis in the canopy, and it plays a key role in overall photosynthate partitioning and harvest index in several crops (Teng et al. 2013). In crop domestication and improvement, modification of plant architecture is increasingly the focus of breeding efforts, i.e. ‘ideal-type breeding’, which has been widely employed to enhance resistance to lodging in various environments and to improve photosynthesis in several crop species (Chen et al. 2018; Liu et al. 2018). For example, the adoption through hybrid breeding of compact and moderately short cultivars that are better suited to a high planting density has significantly improved maize productivity (Du et al. 2018b; Peiffer et al. 2014). Similarly, decreased height in rice and wheat strengthens resistance to lodging caused by heavy rain and high wind, which contributed to the high harvest index and high yield of the ‘green revolution’ (Peng et al. 1999; Sasaki et al. 2002). However, in cotton, short plants do not always equate to high yield, because the optimal height depends on the environment and cultural practices. For example, in the USA, the Texas High Plains requires short plants because of the windy environment and the use of cotton strippers for harvest, whereas in the Southeastern USA, where the climate is humid with a great deal of rain, the crop is harvested with mechanical pickers, which require tall plants. Currently, increased demand for harvest mechanization in China makes it necessary for cotton researchers to breed plants with a suitable height (Su et al. 2018). Hence, understanding the genetic basis of PH will be useful for effectively improving plant architecture adapted to mechanical harvest.

The restricted recombination events and low genetic diversity in designed populations add to the difficulty of breeding by hybridization. To improve accuracy and shorten breeding cycles, marker-assisted selection (MAS) is increasingly applied in cotton crossbreeding (Cao et al. 2014; Huang et al. 2018; Jia et al. 2016; Wen et al. 2018). For MAS, a genetic map of high quality with a high marker density is very important as it lays the foundation for future studies, such as mapping reliable quantitative trait loci (QTLs) of important agronomic traits and further identifying candidate genes within these QTL regions. Following the construction of the first genetic map of allotetraploid cotton, numerous studies have focused on genetic map construction and QTL mapping (e.g., Shang et al. 2016; Sun et al. 2017; Yu et al. 2007, 2013; Zhang et al. 2015c; Zhu et al. 2018). To date, many genetic maps have been constructed using simple sequence repeat (SSR) markers. However, the low polymorphic level of SSR markers is a bottleneck for constructing a fine SSR-based genetic map on the genome scale. With the rapid development of DNA sequencing technology, single nucleotide polymorphism (SNP) markers with high genetic diversity at the genome-wide level have been widely used to construct genetic linkage maps. Accordingly, the use of SNP markers is an effective way to construct a high-quality and high-density genetic linkage map (HDGM) for detecting QTLs. Specific-locus amplified fragment sequencing (SLAF-seq) represents one of the genotyping-by-sequencing (GBS) methods and can be used to develop large quantities of genome-wide SNP markers, as detailed by Sun et al. (2013). To date, the HDGMs of various species have been successfully constructed using the SLAF-seq technique (e.g., Ji et al. 2017; Keerio et al. 2018; Tao et al. 2017; Zhang et al. 2015a, 2017).

PH is a quantitative trait affected by genotype, environment, genotype × environment interaction, and the stage of plant development. Although PH QTLs have been mapped onto cotton chromosomes such as At5 (4 QTLs), At9 (4 QTLs), At11 (6 QTLs), Dt1 (5 QTLs), Dt3 (8 QTLs), Dt5 (4 QTLs) and Dt12 (5 QTLs) (Jia et al. 2016; Said et al. 2015a; Su et al. 2018; Sun et al. 2017), few candidate genes associated with PH have been mined within these QTL regions. Plant hormones such as auxin (IAA) and gibberellic acid (GA) are well known to play key roles in plant growth and development (Friml et al. 2003; Kimura and Kagawa 2006; Palme et al. 2006; Sasaki et al. 2002). Both GA biosynthesis and GA signalling influence PH. For example, the recessive allele of the sd1 gene, which encodes a mutant oxidase involved in gibberellin biosynthesis, leads to short stature in rice (Monna et al. 2002; Sakamoto et al. 2004). IAA is mainly synthesized in young leaves and apical organs and promotes root elongation, stem growth and flower differentiation. At the cellular level, the uneven distribution of three types of IAA membrane carriers, namely a number of P-glycoprotein ATP-binding cassette transporters, IAA efflux transporter (PIN) family members, and the influx carrier AUX1 protein family, leads to the directional transport of IAA (Wang et al. 2018a, b). PIN proteins reportedly play a rate-limiting role in IAA transport and participate in PH development (Petrásek et al. 2006). In recent studies, some members of the PIN family were found to influence PH; for example, overexpression of ZmPIN1a led to dwarfing in maize (Du et al. 2018b), and overexpression of OsPIN2 led to a shorter PH in rice (Chen et al. 2012). However, no study has reported whether any plant hormone biosynthetic genes are associated with PH in cotton.

To provide additional information for breeding programmes, the genetic basis of PH must be further investigated in cotton. In this study, we used an interspecific backcross inbred line (BIL) population of 250 lines derived from a cross between upland cotton (CRI36) and Egyptian cotton (Hai7124). The objectives were to construct a fine interspecific genetic linkage map in cotton based on 7,709 high-quality SNP markers, and to detect QTLs for PH in seven environments and across the tests using the best linear unbiased prediction (BLUP) method. As a result, a stable QTL for PH, qPH-Dt1-1, which was also reported in previous studies (Jia et al. 2016; Said et al. 2015a), was mapped in multiple environments. Through gene annotation and differential gene expression analysis in different tissues and in genotypes differing in PH, one candidate gene in the qPH-Dt1-1 region, GhPIN3, which encodes an IAA efflux carrier family protein, was identified and functionally verified using virus-induced gene silencing (VIGS) in the two parental lines (CRI36 and Hai7124) used to develop the BIL population. GhPIN3 represents the first candidate gene for PH identified through QTL mapping in cotton.

Methods and materials

Plant materials

An interspecific BIL population of 250 lines was utilized in this study. The BILs were produced by a cross between G. barbadense Hai7124 and G. hirsutum CRI36, using CRI36 as the recurrent parent for backcrossing with the F1 to produce BC1F1, followed by seven generations of selfing. The seeds of Hai7124 were obtained from the National Cotton Germplasm Collections of the Low-temperature Germplasm Gene Bank, Institute of Cotton Research, Chinese Academy of Agricultural Sciences (CRI-CAAS); and the detailed information of Hai7124 was described by Zhao et al. (2013). The CRI36 was bred by CRI-CAAS. The two parents and 250 BC1F7 lines were planted in seven environments in four locations: the experimental farm, CRI-CAAS, Anyang (Henan Province, 36.06°N, 114.49°E) with four tests including one each in 2015 and 2016 and two in 2017 (one in south farm and another in east farm); Sanya (Hainan Province, 18.41°N, 109.20°E) in 2016; Alaer (Xinjiang Uyghur Autonomous Region, 40.55°N, 81.28°E) in 2016; and Weixian (Hebei Province, 36.95°N, 115.46°E) in 2017. The cotton seeds were hill-sown by hand with two replications in a randomized complete block design in each environment and covered with plastic mulch applied directly by a machine during the growing season. In Anyang, Weixian and Sanya, approximately 16 plants per 4-m-long row were retained, and the row spacing was 0.80 m. In Alaer, where a high-density seeding rate was used, approximately 44 plants per 5-m-long row were retained, and the row spacing was 0.38 m. Crop management practices followed local recommendations for the production area. The use of the two cotton production systems (i.e. normal and high plant density) allowed detection of consistent QTLs for PH between the two systems.

Phenotypic measurement and analysis

In all the environments, PH was measured as the distance in cm from the soil line of the plant to the top of the apical bud at reproductive maturity. Six normally growing plants in each replication were randomly selected for height testing. The R software package ‘lme4’ was used to calculate the BLUPs for PH across the seven environments (Liu et al. 2016a). A statistical analysis including the analysis of variance (ANOVA) of PH for the BIL population across environments was performed using SPSS 23.0 (Li et al. 2016a). QTL IciMapping 4.1.0.0 software was used to estimate broad-sense heritability (H2) of PH across environments (Meng et al. 2015).
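The heritability estimate can be illustrated with a minimal sketch. It assumes the common entry-mean formula H2 = Vg / (Vg + Vge/e + Ve/(e·r)), where e is the number of environments and r the number of replications; the exact estimator used by QTL IciMapping is not stated here, and the variance components below are hypothetical values chosen only for illustration.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean broad-sense heritability from variance components.

    Assumed formula (not taken from the paper):
        H2 = Vg / (Vg + Vge / e + Ve / (e * r))
    """
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    phenotypic = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / phenotypic


# Hypothetical variance components; seven environments and two
# replications match the field design described above.
h2 = broad_sense_heritability(var_g=80.0, var_ge=40.0, var_e=60.0,
                              n_env=7, n_rep=2)
print(f"H2 = {h2:.2%}")
```

Averaging over more environments and replications shrinks the non-genetic terms in the denominator, which is why multi-environment trials tend to give higher entry-mean heritability than single tests.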

DNA and SLAF library preparation

The young leaves of the two parents and 250-BIL population were sampled in July 2015 and stored in a refrigerator at −70 °C. Using a miniprep method, the genomic DNA of the two parents and each BIL was extracted. Several modifications of the SLAF-seq strategy were adopted in the library construction. First, the reference genomic data of TM-1 were used to perform a pilot experiment, and the quantities of markers produced by different endonucleases were assessed with an in-silico simulation software (Zhang et al. 2015b). According to the result of the SLAF pilot experiment, the SLAF library was constructed, and the HaeIII endonuclease was selected to digest the genomic DNA of the two parents and 250-BIL population. Detailed information on the SLAF-seq strategy was provided in a previous study (Zhang et al. 2015a).

Genotyping of SLAF markers

High-quality SNP markers were detected and genotyped through detailed processes as described in previous studies (Liu et al. 2016b; Sun et al. 2013). First, low-quality reads (quality score < 30e) were excluded, and the remaining reads were assigned to each offspring through duplex barcode sequences. Then, five nucleotide bases at the terminal position of each remaining read were trimmed off. Finally, 100-bp paired-end clean reads, acquired from each sample, were aligned to the reference genome of TM-1 (Zhang et al. 2015b) using Burrows–Wheeler Alignment (BWA) software (Li and Durbin 2009). Sequences from each offspring located to a common physical position with identities greater than 95% were regarded as one SNP locus (Zhang et al. 2015a). Using the software GATK, SNP markers in all of the SLAF loci between the two parents were detected. As the length of each SLAF was approximately 200 bp, three or more SNPs in one SLAF showing high heterozygosity would lead to a low accuracy of genotyping in cotton. Accordingly, SLAFs with high heterozygosity were discarded. The detailed procedures of SLAF genotyping were described by Zhang et al. (2016). The criteria used to define the SLAF repetitiveness and polymorphism were reported in previous studies (Zhang et al. 2015a, 2016). SLAF repeats were excluded, and polymorphic SNPs were consequently used. Ultimately, only common SLAFs in both the BIL population and the parents were used.

Linkage map construction and quality assessment

The TM-1 reference genome database was used to construct the linkage map of the BIL population, as described in detail by Zhang et al. (2015a). Correcting genotyping errors and ordering the SLAF sequences along the chromosomes were conducted by HighMap software (Van Ooijen 2011; Liu et al. 2014). In addition, SMOOTH was used to correct errors based on the contribution of parental genotypes (Van et al. 2005), and incorrect genotypes were filtered with a K-nearest neighbour algorithm (Van et al. 2005). Skewed markers as defined below were supplemented into the linkage map by a multipoint maximum likelihood (MML) method. The map distances were calculated by the Kosambi mapping function. The Chi-square test was used to identify the markers with segregation distortion. As described in a previous study, the segregation distortion markers (SDMs) (0.001 ≤ p ≤ 0.05) were acceptable in the HDGM (Zhang et al. 2016). The SNP markers in the linkage map were aligned to the genome with the local BLAST method. The collinear coefficient of markers in each chromosome was analysed by CIRCOS 0.66 software. A recombination hotspot (RH) was defined as a genetic distance between two adjoining markers greater than 20 cM per Mb (Zhang et al. 2015a).
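Two of the calculations above can be sketched briefly: the Kosambi mapping function, which converts a recombination fraction into a map distance, and the recombination hotspot criterion of more than 20 cM per Mb between adjoining markers. The function names and example coordinates are hypothetical, and the sketch is not the HighMap implementation.

```python
import math


def kosambi_cm(r):
    """Map distance in cM from a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse Kosambi function: recombination fraction from cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


def is_hotspot(cm_a, cm_b, bp_a, bp_b, threshold=20.0):
    """True if the interval between two adjoining markers exceeds
    `threshold` cM per Mb (the 20 cM/Mb cut-off is from the text)."""
    mb = abs(bp_b - bp_a) / 1e6
    if mb == 0:
        raise ValueError("markers share the same physical position")
    return abs(cm_b - cm_a) / mb > threshold


# Hypothetical adjoining markers: 3 cM apart over 0.1 Mb = 30 cM/Mb.
print(round(kosambi_cm(0.10), 2))
print(is_hotspot(10.0, 13.0, 1_000_000, 1_100_000))
```

The hotspot test requires both a genetic and a physical position for each marker, which is why the markers are first aligned to the reference genome.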

QTL analysis

PH from the seven individual environments and its BLUP across the tests were used for QTL analysis, and QTLs were identified with the inclusive composite interval mapping (ICIM) method in IciMapping 4.1.0.0 software (Meng et al. 2015). A stringent logarithm of odds (LOD) threshold was calculated by a permutation test with the following parameters: the number of permutations was 1,000; the type I error rate was 0.05; the PIN (the probability for entering variables in stepwise regression) was 0.001; and the mapping step was 1.0 cM. QTLs detected at the same position in two or more environments with a LOD score > 2.5 were also considered significant (Shang et al. 2015a). The 95% confidence interval of a QTL was defined as the map interval corresponding to a 1-LOD decline on either side of the peak (Liu et al. 2019; Yu et al. 2012). QTLs for PH detected in two or more environments were considered ‘stable’ when their confidence intervals overlapped (Shi et al. 2016; Yu et al. 2012; Zhang et al. 2015c). QTLs were named according to Sun et al. (2013). A positive additive effect of a QTL indicates that the allele increasing PH derives from the male parent Hai7124; otherwise, the allele derives from the female parent CRI36.
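The 1-LOD confidence interval and the overlap criterion for stable QTLs can be sketched as follows. The LOD profiles are hypothetical values on a 1.0-cM grid, and the interval is read off the grid without interpolation, so this is an approximation rather than the IciMapping procedure.

```python
def one_lod_interval(positions, lods, drop=1.0):
    """Return (left, right) map positions bounding the region around
    the LOD peak where LOD stays within `drop` of the peak value."""
    if not positions or len(positions) != len(lods):
        raise ValueError("positions and lods must be non-empty and equal in length")
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]


def intervals_overlap(a, b):
    """True if two closed intervals (start, end) share any position."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical LOD scans of one chromosome in two environments.
positions = [float(p) for p in range(9)]          # cM, 1.0-cM step
env1 = [0.5, 1.2, 2.8, 3.9, 4.6, 4.0, 3.1, 1.5, 0.4]
env2 = [0.3, 0.9, 1.4, 2.2, 3.0, 3.6, 3.8, 3.1, 1.0]

ci1 = one_lod_interval(positions, env1)
ci2 = one_lod_interval(positions, env2)
print(ci1, ci2, intervals_overlap(ci1, ci2))
```

Under the criterion described above, the two hypothetical QTLs would count as one stable QTL because their intervals overlap.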

Candidate gene identification

To obtain potential candidate genes, gene sequences within stable QTLs were extracted. The TAIR database was employed for the annotation of gene functions (https://www.arabidopsis.org/index.jsp). To reveal the general pattern of expression of the candidate genes, the transcriptome sequencing data for different tissues (ovule, fibre, root and stem) of TM-1 were employed as a reference (Zhang et al. 2015b). The apical bud, young stems, roots and leaves of the short-statured recurrent parent CRI36 and the tall-statured parent Hai7124 at the three-true leaf stage and apical organ mixture samples (young stems, apical buds and young leaves) of eight BILs (including four short-statured and four tall-statured lines) at the flowering stage were sampled to determine the expression level of candidate genes by qRT-PCR.

Functional analysis of GhPIN3 in cotton

For the virus-induced gene silencing (VIGS) assay, primers VIGsGhPIN3-SpeI-F (5′- ATGCCTGCAGACTAGTGTCGCCATATTTGCCGTTCC -3′) and VIGsGhPIN3-AscI-R (5′- TAGACCTAGGGGCGCGCCCCGATTTCAGCGTCGGTTTC -3′) were used for VIGS vector construction. A 455-bp specific fragment of GhPIN3 was cloned and inserted into the pCLCrVA vector. The resulting pCLCrVA:GhPIN3 was co-infiltrated with pCLCrVB via Agrobacterium tumefaciens into cotyledons of cotton seedlings as described in a previous study (Su et al. 2018). Plants co-inoculated with pCLCrVB and empty pCLCrVA were used as the negative control (CK). The cotton plants were grown in a controlled phytotron (16 h light/8 h dark, 25 °C) (Gu et al. 2014). To assess the silencing efficiency, qRT-PCR was performed to examine the expression level of GhPIN3 in young leaves of each plant (pCLCrVA:GhPIN3 and CK) 30 d after virus inoculation. The PH of CK and pCLCrVA:GhPIN3 plants was measured at 30, 40, 50 and 60 d.

Results

PH variation of parents and the BIL population

The PH values of the two parents and the BIL population were measured in seven environments. Across the seven environments, PH in the BILs ranged from 31.67 to 125.00 cm, with an average of 74.14 cm, and the mean PH of the parents CRI36 and Hai7124 was 72.23 and 85.74 cm, respectively (Table 1). The PH of Hai7124 was significantly (p < 0.01) greater than that of CRI36, and the skewness values in each environment indicated that PH followed an approximately normal distribution in the BIL population (Fig. 1, Table 1). Furthermore, PH showed transgressive segregation in the BIL population relative to the parents Hai7124 and CRI36. Analysis of variance revealed significant variation in PH (p < 0.01) due to environment, genotype and genotype × environment interaction (Supplemental Table S1). Nevertheless, the broad-sense heritability estimate (H2) for PH, i.e. the percentage of the total phenotypic variance accounted for by the genotypic variance, was 88.01%, suggesting that PH was highly heritable (Supplemental Table S1).

Table 1 Plant height (PH) of backcross inbred lines (BILs) of Hai7124 × CRI36 hybrids and their parents
Fig. 1 Frequency distribution of plant height of 250 BILs in different environments. 15Aync, 16Aync and 17Aync represent the environment of Anyang south farm in 2015, 2016 and 2017, respectively; 16Hnsy and 16Xjal represent the environment of Sanya and Alaer in 2016; 17Aydc and 17Hbwx represent the environment of Anyang east farm and Weixian in 2017

Statistics of the SLAF-seq data

Via SLAF sequencing, we acquired approximately 152 GB of initial DNA sequence data containing 1,136.19 M paired-end reads, and the size of each read ranged from 264 to 364 bp. The SLAF numbers of CRI36 and Hai7124 were 468,850 and 500,626 (Supplemental Figure S1a), and their average sequencing depths were 25.03× and 21.20× with 5.46% and 5.81% of genome coverage ratio, respectively (Supplemental Figure S1b). In the BIL population, the number of SLAFs ranged from 275,045 to 527,319, with an average of 352,920 (Supplemental Figure S1a). The average sequencing depth was 11.12×, with 4.14% of the average genome coverage ratio (Supplemental Figure S1b). The average Q30 ratio was 91.23%, indicating that the sequences were of high quality, and the guanine-cytosine (GC) content was 39.70%.

Information of the SNP markers

From the SLAF-seq data, we detected 255,795 SNP markers between the two parents CRI36 and Hai7124, and all of the markers could be grouped into four segregation patterns: nn × np, lm × ll, hk × hk and aa × bb (Supplemental Table S2). Only aa × bb markers with an average sequencing depth ≥ 20× in the parents and ≥ 5× in the offspring were used for further analysis. After the aa × bb markers with 20% missing data in the BILs were filtered out, 10,694 markers were retained. Following further filtering of the SDMs, 7,709 markers were retained and used to construct the genetic map (Fig. 2). The 7,709 SNP markers could be divided into six mutation types: 2,728 and 2,730 SNP markers were of the G/A and T/C types, respectively, and only 351 SNP markers were of the G/C type (Supplemental Table S3). These results showed that, in cotton, G/A and T/C substitutions were more common, whereas G/C substitutions were rare.
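The two filtering steps can be illustrated with a minimal sketch that drops markers with too much missing data and then classifies the remainder by a Chi-square test for segregation distortion. Several points are assumptions for illustration only: the missing-data cut-off is read as more than 20%, the expected ratio of CRI36 to Hai7124 homozygotes is taken as 3:1 (the expectation for a BC1-derived inbred population, ignoring residual heterozygosity), and the genotype counts are hypothetical. The p-value thresholds follow the SDM definition given in the methods.

```python
import math


def chi_square_p(n_a, n_b, expected_ratio=(3.0, 1.0)):
    """Chi-square statistic and p-value (1 degree of freedom) for
    observed counts n_a, n_b against an expected ratio."""
    total = n_a + n_b
    if total == 0:
        raise ValueError("no genotyped lines")
    share_a = expected_ratio[0] / (expected_ratio[0] + expected_ratio[1])
    exp_a, exp_b = total * share_a, total * (1.0 - share_a)
    chi2 = (n_a - exp_a) ** 2 / exp_a + (n_b - exp_b) ** 2 / exp_b
    # For 1 degree of freedom the survival function is erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def classify_marker(n_a, n_b, n_missing, max_missing=0.20):
    """Label a marker by missing rate, then by segregation distortion."""
    n_total = n_a + n_b + n_missing
    if n_missing / n_total > max_missing:       # assumed reading of the 20% rule
        return "dropped_missing"
    _, p = chi_square_p(n_a, n_b)
    if p < 0.001:
        return "dropped_distorted"
    if p <= 0.05:
        return "acceptable_sdm"                 # 0.001 <= p <= 0.05
    return "normal"


# Hypothetical counts for four markers scored in 250 lines.
for counts in [(188, 62, 0), (170, 80, 0), (125, 125, 0), (100, 40, 110)]:
    print(counts, classify_marker(*counts))
```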

Fig. 2 The distribution of SNP markers in each of the 26 linkage groups/chromosomes. Acceptable segregation distortion markers (0.001 < p < 0.05) on the map are highlighted in red

Genetic map construction in the BIL population

A total of 7,709 high-quality SNP markers were mapped to 26 chromosome-corresponding linkage groups which spanned a total genetic distance of 3,433.24-cM. The 13 chromosomes in the A subgenome contained 4,641 markers with a 1,893.68-cM genetic distance, and the other 13 chromosomes in the D subgenome included 3,068 markers with a 1,539.56-cM genetic distance. The genetic distance and marker number varied among chromosomes. The longest linkage group, At1, spanned a 221.99-cM genetic length with 622 markers. In contrast, the shortest linkage group, Dt5, spanned only a 59.97-cM genetic length with 489 markers. Moreover, the linkage group At5 harboured the most markers (658), while At6 had the fewest SNP markers (52). Among the 26 linkage groups, the maximum gap between two adjacent markers varied from 2.52 cM on Dt5 to 19.43 cM on At6 (Table 2).
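As a worked example of the map statistics, the pooled marker interval (total genetic length divided by marker number) can be computed for each subgenome and for the whole map from the figures reported above. The helper name is hypothetical. A pooled ratio of this kind need not equal the mean of per-linkage-group values, which weights a sparse group such as At6 as heavily as a dense one.

```python
def cm_per_marker(length_cm, n_markers):
    """Pooled average interval: genetic length divided by marker count."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return length_cm / n_markers


# (genetic length in cM, number of markers), as reported in the text.
MAP_SUMMARY = {
    "A subgenome": (1893.68, 4641),
    "D subgenome": (1539.56, 3068),
}

total_length = sum(length for length, _ in MAP_SUMMARY.values())
total_markers = sum(count for _, count in MAP_SUMMARY.values())

densities = {name: cm_per_marker(length, count)
             for name, (length, count) in MAP_SUMMARY.items()}
densities["Whole map"] = cm_per_marker(total_length, total_markers)

for name, value in densities.items():
    print(f"{name}: {value:.3f} cM per marker")
```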

Table 2 Detailed information of the SNP markers in the genetic map

The quality assessment of the genetic map

Collinearity results showed that the SNP markers on 26 linkage groups had high levels of coverage across the cotton genome (Supplemental Figure S2). Moreover, the Spearman correlation coefficient for each of the 26 linkage groups ranged from 0.72 to 1.00, and the average coefficient was 0.92 (Supplemental Table S6). This result suggested that the order of most of the markers in the linkage groups has a high consistency with those in the TM-1 reference genome. Several chromosomes, namely, At4, At5, At6, At9 and Dt2, with some deviation, were identified by the collinearity analysis (Supplemental Table S6). A total of 22 linkage groups were identified with RHs (Table 2). Dt5 had the largest number (80) of RHs, while Dt8 harboured only two RHs (Supplemental Table S7). Chromosomes At8, At9, Dt12 and Dt13 did not harbour any RHs (Table 2).

QTLs for PH in the BILs and previous studies

In total, ten QTLs for PH were detected on nine chromosomes across the seven environments and with BLUP. Among the ten QTLs, four and six were identified on the A subgenome and D subgenome, respectively. Five QTLs (qPH-At3-2, qPH-At5-1, qPH-At10-1, qPH-Dt11-1 and qPH-Dt13-1) had positive additive effects from Hai7124, while the other five QTLs (qPH-At3-1, qPH-Dt1-1, qPH-Dt3-1, qPH-Dt9-1 and qPH-Dt12-1) had negative additive effects from Hai7124 (Table 3). Importantly, four QTLs, qPH-At3-2, qPH-At5-1, qPH-Dt1-1 and qPH-Dt3-1, were stable, as they were identified in at least two tests or in one test and with the BLUP of the seven tests (Fig. 3, Table 3); the other six QTLs were detected in one environment or with BLUP only and explained 6.84–14.92% of the phenotypic variance (PV) (Supplemental Figure S3, Table 3). We further compared these ten QTLs with PH-associated QTLs reported in previous studies. Two of the stable QTLs had been reported in previous QTL linkage or association studies (Table 3). First, the SNP markers Marker22291 and Marker22588 were found to be associated with PH (Jia et al. 2016), and their physical positions, ranging from 26.90 to 32.30 Mb on the Dt1 chromosome, fell within the interval of the stable QTL qPH-Dt1-1 (Supplemental Table S5, Table 3). Second, the SNP markers Marker25959 and Marker26500 and the markers D03_31584163 and D03_32132408 were reported to be associated with PH in a 137-RIL population and in 355 upland cotton accessions, respectively (Jia et al. 2016; Su et al. 2018); they were anchored to 31.58–38.69 Mb on the Dt3 chromosome and overlapped with the other stable QTL, qPH-Dt3-1 (Supplemental Table S5, Table 3). The stable QTLs qPH-Dt1-1 and qPH-Dt3-1 were detected under both normal and high planting density regimes (Aync and Xjal) (Table 3).

Table 3 Summary of plant height (PH) QTLs identified in different environments and using the best linear unbiased prediction (BLUP)
Fig. 3 Mapping information for the stable QTLs. a and b show the stable QTL in At3 detected in 16Hnsy and 17Aync, respectively; c and d show the stable QTL in At5 detected in 17Aync and with BLUP, respectively; e, f and g show the stable QTL in Dt1 detected in 15Aync, 17Aydc and 16Xjal, respectively; h, i, j and k show the stable QTL in Dt3 detected in 16Xjal, 16Aync, BLUP and 17Hbwx, respectively; the black dotted line represents LOD = 3.50 or 2.50; the red line represents the percentage of phenotypic variance explained; and the blue line represents the LOD value

Prediction of candidate genes in stable QTLs

To identify candidate genes associated with PH, we combined the genes located within the stable QTL regions with the functional annotation of their orthologues in Arabidopsis and with transcriptome data. In total, 1,220 genes were located in the regions of the four stable QTLs. Of these genes, 956 had annotation information (Supplemental Table S8). As PH is mainly determined by stem growth, we focused on genes preferentially expressed in young stems, with significantly higher expression levels than in other important tissues (ovule, fibre and root) in cotton. This resulted in the selection of 19 genes (Fig. 4a). Among these 19 genes, Gh_D01G1471 was located in qPH-Dt1-1 (Supplemental Table S8). The amino acid sequence of Gh_D01G1471 had the highest identity (78%) with AtPIN3, which encodes an IAA efflux carrier protein in Arabidopsis. Hence, we named Gh_D01G1471 GhPIN3. To verify whether GhPIN3 was associated with PH, we performed qRT-PCR analysis, which revealed that GhPIN3 exhibited a significantly higher expression level in apical buds, young stems and leaves of CRI36 than in those of Hai7124, whereas its expression level did not differ significantly between the roots of CRI36 and Hai7124 (Fig. 4b). Moreover, we randomly selected eight lines (four short and four tall) from the 250 BILs, and the expression level of GhPIN3 in the four short-statured lines was significantly higher than that in the four tall-statured lines (Fig. 4c). The linear regression equation between GhPIN3 expression level (X) and PH (Y) of the eight lines was Y = −10.91X + 103.64, with r = 0.64 (Fig. 4c). From these data, we hypothesized that the expression level of GhPIN3 is negatively related to PH.

Fig. 4 Expression level and function analysis of GhPIN3 in cotton. a Transcript profiles of promising genes for TM-1 ovule (O), fibre (F), root (R) and stem (S). b Expression patterns of GhPIN3 in different tissues of CRI36 and Hai7124 at the three-leaf stage, detected by qRT-PCR. c Expression patterns of GhPIN3 in mixed apical organs of eight BILs in the flowering period, detected by qRT-PCR; red and blue dots indicate the four tall-statured and four short-statured lines, respectively. d The mean expression level of GhPIN3 in control check (CK) and pCLCrVA:GhPIN3 (VG) plants of CRI36, detected by qRT-PCR. e The mean PH of control check (CK) and pCLCrVA:GhPIN3 (VG) plants of CRI36 at 30, 40, 50 and 60 days after the cotyledons were infected with virus. f The mean expression level of GhPIN3 in control check (CK) and pCLCrVA:GhPIN3 (VG) plants of Hai7124, detected by qRT-PCR. g The mean PH of control check (CK) and pCLCrVA:GhPIN3 (VG) plants of Hai7124 at 30, 40, 50 and 60 days after the cotyledons were infected with virus. h PH phenotype of CRI36 plants 60 days after the cotyledons were infected with virus. i PH phenotype of Hai7124 plants 60 days after the cotyledons were infected with virus. * and ** indicate significant differences at p = 0.05 and 0.01, respectively

The function of GhPIN3 in cotton

To further validate the role of GhPIN3 in cotton plant height development, we constructed the pCLCrVA:GhPIN3 recombinant virus vector to suppress the expression of GhPIN3 in the recurrent parent CRI36, and the empty pCLCrVA virus vector was used as the control (CK). Four pCLCrVA:GhPIN3-silenced CRI36 plants and two CRI36 CK plants were examined, and qRT-PCR was used to measure the expression level of GhPIN3 in these plants. The mean expression level in the pCLCrVA:GhPIN3-silenced CRI36 plants was approximately 32.76% lower than that in the CRI36 CK (Fig. 4d). The mean PH of pCLCrVA:GhPIN3-silenced CRI36 plants was significantly higher than that of CRI36 CK plants at all four time points (30, 40, 50 and 60 d), and the difference was approximately 10.33 cm at the 60-d time point (Fig. 4e, h). These results were consistent with those of the second VIGS experiment using the other parent, Hai7124. We examined three Hai7124 CK plants and six pCLCrVA:GhPIN3-silenced Hai7124 plants with qRT-PCR. The mean expression level in the pCLCrVA:GhPIN3-silenced Hai7124 plants was significantly lower, by 26.89%, than that in the Hai7124 CK (Fig. 4f). Similar to CRI36, the mean PH of pCLCrVA:GhPIN3-silenced Hai7124 plants was significantly higher than that of CK plants at all four time points (30, 40, 50 and 60 d), and the difference was approximately 16.17 cm at the 60-d time point (Fig. 4g, i). From these results, we inferred that GhPIN3 contributes to PH in cotton.

Discussion

A fine genetic map plays a crucial role in identifying QTLs for traits of interest. Numerous genetic maps have been constructed with PCR-based markers in cotton, such as the genetic map harbouring 155 SSR markers and spanning a total length of 959.4 cM constructed by Sun et al. (2011). Shang et al. (2015b) used 581 SSR markers to construct a genetic map, and the average distance between two adjacent markers was approximately 6.39 cM. Liu et al. (2015) constructed a genetic map with 1,675 SSR markers that spanned a total genetic distance of 3,338.2 cM. However, the number of SSR markers is not enough to construct a high-density genetic map in cotton. With the use of advanced sequencing methods, the marker density of cotton genetic maps has increased remarkably. Jia et al. (2016) used 6,295 SNP markers and 139 SSR markers to construct a genetic map with a total length of 4,071.98 cM. Zhang et al. (2016) used 5,521 SNP markers to construct a genetic map encompassing 3,259.37 cM with an average of 0.78 cM between adjacent markers. Sun et al. (2017) used 3,978 SNP markers to construct a genetic map that spanned a total genetic distance of 2,480 cM. Therefore, sequencing methods are more efficient for constructing genetic maps. In the present study, SNP markers were used to construct an HDGM through the SLAF-seq method. In total, 7,709 high-quality markers were mapped to 26 linkage groups. Each linkage group contained 52–658 markers, and the length of these groups ranged from 59.97 to 221.99 cM. The average marker density of the 26 linkage groups was approximately 0.67 cM per SNP; moreover, the average marker density on 22 linkage groups was lower than 0.5 cM per SNP, such as on Dt5, At4 and At9, where the average marker density was 0.12, 0.23 and 0.26 cM per SNP, respectively. The average ratio of gaps < 5 cM on all chromosomes was 97.27%, and the collinearity analysis validated the reliability of the constructed map. These results suggest that this high-density genetic map contributed to the reliable QTL analysis for PH in this study.

China is the largest cotton producer among more than 80 cotton-producing countries, while the planting areas for cotton in China have significantly decreased in the past decade due to the high labour force demand, especially for harvesting seed cotton by hand. To boost harvest efficiency, the mechanized harvesting of cotton is gradually spreading in the main cotton production areas. Variability in the PH of cotton cultivars is one of the main obstacles to expand mechanization. Moreover, PH is one of the main factors influencing the yield in crops, and this trait has been extensively researched in maize, rice, wheat, Brassica napus and other crops (Chen et al. 2012; Schiessl et al. 2015; Würschum et al. 2017; Xing et al. 2015). To date, most of the published genetic studies on cotton have focused on yield and fibre quality traits, whereas few studies have focused on PH. In the present study, a total of ten QTLs were identified using the fine genetic map. Among the ten QTLs, eight were identified for the first time, and the other two QTLs overlapped with PH-associated QTLs detected in previous studies. In particular, the gene Gh_D03G0922, located in qPH-Dt3-1, has been verified as responsible for PH in upland cotton by Su et al. (2018). These results suggest that more novel and reliable QTLs that contribute to PH were obtained in this study. Combining the genes in stable QTLs with functional annotation of orthologues in Arabidopsis and transcriptome data is an efficient method to mine candidate genes. Using this method, the functions of several previously undescribed causal genes were identified in cotton (Du et al. 2018a; Ma et al. 2018a, b), and the GhPIN3, located in qPH-Dt1-1, was mined in this study. When the GhPIN3 was suppressed in CRI36 and Hai7124, the pCLCrVA:GhPIN3-silenced plants were significantly taller than the CK plants. These results showed that GhPIN3 participates in PH development. 
This study provides the first HDGM for an interspecific BIL population of cotton constructed by SLAF-seq. The map contains a total of 7,709 high-quality SNP markers covering a genetic distance of 3,433.24 cM. Using this fine HDGM, QTLs for PH were identified across seven environments and with BLUP. In total, ten QTLs related to PH were identified: three of these were identified in the previous studies, and the other seven were first found in the present study. Through a combination of transcriptome and qRT-PCR analyses, the candidate gene GhPIN3, located in the stable QTL qPH-Dt1-1, was identified. Further analysis by VIGS showed that GhPIN3 influenced PH in cotton. The detected QTLs and the candidate gene GhPIN3 lay a foundation for breeding cotton with an ideal PH.

Author contribution statement

SXY, JFZ and JWY conceived and designed the experiments. YHG, GYL, JL, YPC, XZ, XLL, DL, XSZ, JKS and SRT performed the field cultivation of cotton plants and PH measurements. JFZ, WFP, MW and QFM participated in the experimental design, and drafting and revision of the manuscript. JJM performed the experiments and wrote the manuscript. All the authors read and approved the final manuscript.