Introduction

The extrusion of toxins and other substances is a vital life process in organisms. The identified transporter families include several large superfamilies, such as the major facilitator superfamily (MFS), the ATP-binding cassette (ABC) superfamily, the drug metabolite transporter (DMT) superfamily, the resistance/nodulation/division (RND) superfamily and the multidrug and toxic compound extrusion (MATE) family [1]. The MATE gene family is widely distributed in prokaryotes and eukaryotes and plays a key role in the export of toxins and other substrates [2]. MATE transport proteins are composed of 400–700 amino acids, and most of them contain 12 transmembrane helices, with 40% similarity among their amino acid sequences. The extrusion activity of the MATE family depends on a transmembrane electrochemical gradient [3]. To date, 17 MATE transporters identified from 11 species of bacteria have been shown to be involved in the efflux of foreign substances and in drug resistance. In mammals, MATE transporters excrete a variety of cationic substances in the liver and kidney to adjust the drug concentration in plasma [4]. In plants, many functional studies and expression analyses of MATE genes have been reported. The MATE gene family has expanded through tandem and segmental duplication in both rice and Arabidopsis [5]. Genome-wide analyses of the MATE family in maize and soybean identified MATE genes that respond to aluminum toxicity [6, 7], and other MATE genes are involved in endogenous and exogenous detoxification mechanisms associated with secondary metabolites such as alkaloids, flavonoids and anthocyanins [8,9,10].

Recently, a genome-wide analysis of the MATE family in diploid cotton (Gossypium raimondii and Gossypium arboreum), including expression levels under salt, cadmium and drought stresses, was published [11], but no such work has been reported in tetraploid cotton (Gossypium hirsutum). In the available reports, most MATE genes mainly function in responses to abiotic stress in plants, including diploid cotton, and only a few reports indicate that MATE genes are involved in the transport of secondary metabolites. Our previous study found that GhTT12 (GenBank accession no. KF240564), a MATE family gene cloned from upland cotton (G. hirsutum), is involved in the secondary active transport of proanthocyanidin (PA) as a secondary carrier [12]. In upland cotton varieties with naturally brown fiber (brown cotton), the major component of the fiber pigment has been shown to be PA [13, 14].

PA, also known as condensed tannin, is a flavonoid derived from the flavonoid or anthocyanin branch of the phenylpropanoid pathway and occurs as a specialized metabolite in many higher plants, where it mediates interactions between the plant and its environment [15]. PA plays important roles in resistance to disease and herbivores and in seed dormancy and viability [16, 17]. PA is an oligomer of flavan-3-ol units and is a prominent compound in the seed coat, leaf, fruit, flower and bark of many plant species [18,19,20]. PAs and their presumed precursors, such as catechin and epicatechin, are reported to be beneficial for cardiovascular health, immunity and life span [21, 22]. Although transporters that move the flavan-3-ol precursors of PA biosynthesis into the vacuole have been identified [23, 24], the mechanism of PA transport is not yet understood.

Gossypium hirsutum is an allotetraploid cotton containing the AA and DD genomes. It was formed about 1–2 million years ago, and the donors of the A and D chromosomes are widely assumed to be G. arboreum and G. raimondii, respectively [25, 26]. Using the sequenced genomes of these cotton species [26, 27], a whole-genome analysis was carried out on the sequence characteristics, phylogenetic evolution and expression patterns of the MATE family in upland cotton. The results provide a theoretical basis for further analyzing the function of MATE genes and improving the fiber pigment stability of brown cotton.

Materials and methods

Plant materials

Upland cotton (G. hirsutum) was grown on the farm of Anhui Agricultural University, Hefei, China. RNA was isolated from cotton fibers at 5, 10, 15, 20 and 25 days post anthesis (DPA) for this study.

Database search and sequence retrieval

The upland cotton genome sequences were downloaded from the COTTON GENOME PROJECT (http://cgp.genomics.org.cn/) to identify GhMATEs, and local databases were then constructed with DNATOOLS. The conserved MatE domain (PF01554), defined by a hidden Markov model (HMM), was obtained from the Pfam database [28] and was used as the query in a BLASTP search of the cotton protein database. All identified proteins were checked for the MatE domain with Pfam (http://pfam.sanger.ac.uk/) [29] and SMART (http://smart.embl-heidelberg.de/) [30]. Finally, all candidate MATE sequences were aligned using ClustalW [31], and potentially redundant genes were removed. The remaining MATE genes were used for further analysis. The ExPASy program (http://www.expasy.org/tools/) [32] was used to calculate the molecular weight and isoelectric point.

Phylogenetic analysis

Multiple sequence alignments were performed with ClustalX [31]. The phylogenetic tree was constructed with MEGA5.0 using the maximum likelihood (ML) method with partial deletion for gaps and missing data. The statistical reliability of each node was evaluated by bootstrap analysis with 1000 replicates [33].

Gene structure and conserved motif analyses

To examine the structure of GhMATEs, the corresponding genomic sequences were analyzed with GSDS (http://gsds.cbi.pku.edu.cn/) [34]. MEME (http://meme.sdsc.edu/meme4.3.0/intro.html) [35] was used to detect conserved motifs, which were then assessed with the SMART (http://smart.embl-heidelberg.de/) and Pfam (http://pfam.sanger.ac.uk/search) programs according to a previous study [36].

Chromosomal location and gene duplication

The chromosomal locations of GhMATEs were drawn with MapInspect software (http://www.plantbreeding.wur.nl/uk/software_mapinspect.html). Following previous reports, MCScanX (http://chibba.pgml.uga.edu/mcscan2/) was used to identify duplications of GhMATEs [37, 38]. BLASTP with an E-value cutoff of 1 × 10^−10 was used to search against the protein sequences of the whole cotton genome. Syntenic regions were then identified with the default parameters, and the duplicated MATEs in these regions were classified by duplication type (collinear, tandem, etc.) using a Perl script.
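The E-value filtering step described above can be illustrated with a minimal Python sketch. It assumes that the BLASTP results are in the standard 12-column tabular format (the text does not state the output format), that hits at or below the cutoff are kept, and that self-hits are discarded. The function name and the protein identifiers are hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-10):
    """Keep non-self hits whose E-value is at or below the cutoff.

    Assumes tab-separated BLAST tabular output in which column 1 is the
    query, column 2 the subject and column 11 the E-value.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query != subject and evalue <= max_evalue:
            kept.append((query, subject, evalue))
    return kept


# Hypothetical hits: one strong hit, one self-hit, one weak hit.
example_lines = [
    "GhMATE_a\tGhMATE_b\t85.0\t450\t60\t2\t1\t450\t1\t448\t1e-120\t420",
    "GhMATE_a\tGhMATE_a\t100.0\t450\t0\t0\t1\t450\t1\t450\t0.0\t900",
    "GhMATE_a\tGhMATE_c\t30.0\t120\t80\t4\t5\t124\t9\t128\t1e-5\t48",
]
hits = filter_blast_hits(example_lines)
print(hits)
```

Only the first hypothetical hit survives the filter. Whether self-hits should be removed depends on the downstream tool, so that choice is an assumption of this sketch.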

Detection of orthologous gene pairs

The position of every GhMATE was marked on its chromosome using a Perl script. OrthoMCL (http://orthomcl.org/orthomcl/) was used to identify orthologous MATE genes among G. hirsutum, G. arboreum and G. raimondii [39]. Circos (http://circos.ca/) was used to visualize the relationships between the orthologous gene pairs among G. hirsutum, G. arboreum and G. raimondii [40].

Expression analysis of GhMATEs

To clarify the expression patterns of the MATE genes in upland cotton, transcriptome data were obtained from PLEXdb (http://www.plexdb.org/index.php), and data for cotton fibers at different developmental stages (0 DPA, 5 DPA, 10 DPA, 20 DPA) were selected for further analysis. A heatmap was constructed with R/Bioconductor (http://www.bioconductor.org/).

RNA isolation and qRT-PCR

To confirm the expression of GhMATEs, total RNA was prepared using the RNAprep Pure Plant Kit (Tiangen, Beijing), and genomic DNA was removed with DNase I. First-strand cDNA was synthesized from the isolated RNA by reverse transcription using the TIANGEN FastQuant RT Kit (Tiangen, Beijing). qRT-PCR was conducted on an ABI PRISM 7500 real-time PCR system (Applied Biosystems, USA) with GhUBQ7 (GenBank accession no. DQ116441) as an internal control. Each reaction contained 10 µL of SYBR Premix Ex Taq II (2×), 1.0 µL of cDNA sample, and 0.5 µL of forward and reverse primers in a final volume of 20 µL. Each pair of primers was designed from the nucleotide sequence of the gene with Primer Express 3.0 software (Applied Biosystems, USA). The primer sequences were 5′-GTATTGGTTCTCTGGTGACG-3′ and 5′-GTAAGAGCAGCACCAAGTAG-3′ for GhTT12, and 5′-GAAGGCATTCCACCTGACCAAC-3′ and 5′-CTTGACCTTCTTCTTCTTGTGCTTG-3′ for GhUBQ7. PCR was performed as follows: 50 °C for 2 min, 95 °C for 30 s, 95 °C for 15 s, and 40 cycles of 60 °C for 34 s, 72 °C for 20 s. The specificity of the reactions was verified by melting curve analysis. The 2^−ΔΔCT method was used to calculate the relative gene expression levels [41].
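For orientation, the 2^−ΔΔCT calculation can be sketched as follows. ΔCT is the CT of the target gene minus the CT of the internal control, and ΔΔCT is the ΔCT of the sample minus the ΔCT of a calibrator sample. The sketch assumes roughly 100% amplification efficiency for both primer pairs, and the CT values are hypothetical rather than measured data.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^-ddCT method.

    Assumes the target and reference amplicons amplify with
    comparable, near-100% efficiency.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: target gene against the internal control,
# in a test sample and in a calibrator sample.
fold_change = relative_expression(
    ct_target=24.0, ct_reference=18.0,
    ct_target_calibrator=26.0, ct_reference_calibrator=18.0,
)
print(fold_change)  # 4.0
```

With these hypothetical values, ΔΔCT is −2, so the target is expressed four times more strongly in the sample than in the calibrator.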

Determination of the PA content

According to the report [42], 100 mg of cotton fibers at 5, 10, 15, 20 and 25 DPA were placed in a precooled mortar and ground in 5 mL of 80% methanol. The samples were transferred into 10 mL centrifuge tubes, sonicated for 30 min and centrifuged at 5000 rpm and 4 °C for 10 min. The supernatant contained the soluble PA. The pellet was resuspended in 5 mL of methanol containing 1% (v/v) HCl, incubated in a water bath at 60 °C for 1 h and centrifuged at 5000 rpm and 4 °C for 10 min. This supernatant contained the insoluble PA. The PA content was determined by spectrophotometry according to a reported method [43], with catechin as the standard for the calibration curve. A 200 µL aliquot of PA extract and 2800 µL of reaction mixture (methanol solution containing 5% HCl and 0.1% DMACA) were added to a 10 mL centrifuge tube and allowed to react at room temperature for 15 min, after which the absorbance was measured at 643 nm. The PA content was calculated from the standard curve.
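The standard-curve calculation can be illustrated with a short Python sketch. It assumes a linear relationship between absorbance at 643 nm and catechin concentration, and that the standards are assayed in the same 200 µL plus 2800 µL format as the extracts, so that no extra dilution factor is needed. The calibration points, the sample absorbance and the function names are hypothetical; the 5 mL extraction volume and 0.1 g sample mass follow the protocol above.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pa_content(absorbance, slope, intercept,
               extract_volume_ml=5.0, sample_mass_g=0.1):
    """PA content in µg catechin equivalents per g of fiber."""
    # Invert the standard curve to get µg catechin equivalents per mL extract.
    concentration = (absorbance - intercept) / slope
    # Scale by the extraction volume and the mass of fiber extracted.
    return concentration * extract_volume_ml / sample_mass_g


# Hypothetical catechin standards (µg/mL) and their absorbance readings.
standard_conc = [0.0, 10.0, 20.0, 40.0]
standard_abs = [0.0, 0.1, 0.2, 0.4]
slope, intercept = fit_line(standard_conc, standard_abs)

content = pa_content(0.25, slope, intercept)
print(round(content, 1))  # 1250.0
```

The result is expressed as catechin equivalents because catechin serves as the standard. Readings outside the range of the standards would need dilution rather than extrapolation.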

Arabidopsis transformation and GUS assay

The recombinant plasmid pCambia1304-GhTT12 was transformed into Agrobacterium tumefaciens EHA105, and Arabidopsis transformation was performed by the floral dip method. The transformed plants were selected on MS medium supplemented with 50 mg/L hygromycin. The positive plants were transferred to soil in the greenhouse at 25 °C under a 16 h light/8 h dark photoperiod and were further confirmed by PCR. Finally, T2 transgenic lines with single-copy insertions were used for the GUS assay according to the instructions of the GUS staining kit (Solarbio, Beijing). Moreover, seeds from transformed plants were collected for qRT-PCR and the determination of PA content, with seeds from wild-type plants as a control.

Results

Identification of MATE genes in upland cotton genome

A total of 112 candidate MATE genes were initially identified through the HMM profile of the MatE domain, of which 40 redundant genes were removed. The remaining 72 putative MATE genes in G. hirsutum each contained a conserved MatE domain. These genes were designated GhMATE1-72 according to their chromosomal locations in G. hirsutum (Table S1). The encoded proteins ranged from 313 to 602 aa in length, with an average of 488 aa, compared with 400 to 700 aa in A. thaliana, suggesting that similar length variation exists in the MATE family of both species. The predicted molecular weights of the GhMATE proteins ranged from 33.5 to 66 kDa, and the predicted pI values ranged from 4.97 to 10.24. The subcellular locations of these GhMATEs were predicted as follows: plasma membrane (73.61%), chloroplast (8.33%), cytoplasm (8.33%), vacuole (5.56%), nucleus (1.39%), mitochondria (1.39%) and extracellular (1.39%).
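As a quick consistency check, the reported percentages can be converted back to gene counts. The sketch below uses only the numbers given above and assumes that each percentage was computed against the 72 retained genes.

```python
candidates = 112
redundant = 40
retained = candidates - redundant  # genes kept after removing redundancy

# Predicted subcellular locations, in percent of the retained genes.
percentages = {
    "plasma membrane": 73.61,
    "chloroplast": 8.33,
    "cytoplasm": 8.33,
    "vacuole": 5.56,
    "nucleus": 1.39,
    "mitochondria": 1.39,
    "extracellular": 1.39,
}

# Convert each percentage to the nearest whole number of genes.
counts = {loc: round(retained * pct / 100) for loc, pct in percentages.items()}

for loc, n in counts.items():
    print(f"{loc}: {n}")
print("total:", sum(counts.values()))
```

Under this assumption the percentages correspond to 53, 6, 6, 4, 1, 1 and 1 genes, which sum to 72.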

Phylogenetic analyses of the MATE family in upland cotton

A combined phylogenetic tree of 117 MATE proteins, including 72 GhMATEs and 45 reported plant MATEs, was constructed (Fig. 1). The MATE proteins were preliminarily divided into four clades designated C1–C4. Clade C1 was further divided into five subgroups, named C1-1, C1-2, C1-3, C1-4 and C1-5. Subgroup C1-1 consisted of 17 members, including 9 GhMATE proteins and 8 reported MATE transporters. There were 9 GhMATE proteins and only one known MATE protein in the C1-2 subgroup. There were 7 proteins in the C1-3 subgroup, including 3 GhMATE proteins. Subgroup C1-4 contained 3 GhMATE members and 7 known MATE proteins. Subgroup C1-5 had 6 GhMATE proteins and one known MATE protein. Clade C2 contained 15 GhMATE members in addition to several reported MATE proteins. Clade C3 contained 20 GhMATE proteins and 4 known MATE proteins. Clade C4 had 24 MATE members, including 7 GhMATE proteins and 17 reported MATE proteins from other plant species.
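The clade membership above can be tallied to confirm that it accounts for all 72 GhMATEs. In the sketch below, the number of reported MATE proteins in clade C2 is not stated in the text ("several"), so it is left as None and inferred from the total of 45 reported proteins; that inferred value is an assumption of the sketch, not a figure from the study.

```python
# (GhMATE count, reported MATE count) per clade or subgroup, as stated above.
# None marks the one count that the text gives only as "several".
groups = {
    "C1-1": (9, 8),
    "C1-2": (9, 1),
    "C1-3": (3, 4),  # 7 proteins in total, 3 of them GhMATEs
    "C1-4": (3, 7),
    "C1-5": (6, 1),
    "C2": (15, None),
    "C3": (20, 4),
    "C4": (7, 17),
}
TOTAL_GHMATE = 72
TOTAL_REPORTED = 45

gh_total = sum(gh for gh, _ in groups.values())
reported_known = sum(rep for _, rep in groups.values() if rep is not None)
c2_reported_implied = TOTAL_REPORTED - reported_known

print("GhMATEs across clades:", gh_total)
print("Reported MATEs implied for C2:", c2_reported_implied)
```

The GhMATE counts sum to 72 (30 in C1, 15 in C2, 20 in C3 and 7 in C4), and the totals imply 3 reported MATE proteins in clade C2.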

Fig. 1

The phylogenetic tree of the cotton MATE family. The phylogenetic tree was constructed by MEGA 5.0 using the maximum likelihood (ML) method. Bootstrap values in percentage (1000 replicates) are indicated on the nodes. Different subfamilies are highlighted using different colors (C1 in pink, C2 in green, C3 in gray, C4 in yellow and C5 in blue), and subgroups are marked with black arcs outside of the circular tree. (Color figure online)

Gene structures and motifs analyses of GhMATEs

To gain more detailed knowledge of the structural diversity of the MATE genes, their structures were analyzed by comparing the CDS of each MATE gene in upland cotton. The results revealed that 11 GhMATE genes (GhMATE7, 8, 20, 42, 45, 50, 51, 53, 57, 68 and 70) had no introns, while the remaining GhMATE genes contained 1–14 introns (Fig. 2). In addition, the GhMATE genes in subfamily C4 contained the largest numbers of introns. For example, GhMATE19, GhMATE29, GhMATE54 and GhMATE72 had 12 introns, and GhMATE49 had 11 (Fig. 2). Although the numbers of exons and introns varied among some genes of the same subfamily (for example, GhMATE1 had one more intron than GhMATE37, and GhMATE6 had two more introns than GhMATE39), these genes displayed a similar exon–intron structure. Conserved motifs of the MATE proteins were identified through motif analysis with Pfam and SMART2.0 (Table S2). A previous study reported that motifs 3, 4, 13 and 14 encode the MATE domain and play a role in MATE proteins. In this study, all GhMATE proteins contained one or more MATE motifs (Fig. 3). For example, subfamily C4 proteins contained motifs 13 and 14, while proteins of the other subfamilies contained motifs 3 and 4. Motif compositions were similar among closely related members, which suggests functional similarity among MATE proteins within the same subfamily. Additionally, some subfamily-specific motifs were detected; for example, motif 17 existed only in subfamily C4, while some motifs (motifs 6, 7 and 9) were distributed in almost all subfamilies. Although their functions are unknown, these motifs may be important for the functions of GhMATE proteins.

Fig. 2

The gene structures of 72 GhMATEs in cotton. Yellow boxes represent exons (CDS). Black lines represent introns. Blue boxes indicate UTR. The scale on the bottom is in the unit of kilobase (kb). The genes are listed according to the order of subfamily C1–C5 from the phylogenetic tree, and different subfamilies are highlighted in different colors. (Color figure online)

Fig. 3

Protein motifs of the MATE family in cotton. The motifs of MATE proteins in cotton are shown as colored boxes. The lengths of the motifs differ. The GhMATE proteins are listed according to the order of subfamily C1–C5 from the phylogenetic tree, and different subfamilies are highlighted in different colors. (Color figure online)

Chromosomal location and gene duplication of GhMATE genes

To determine the organization and distribution of GhMATEs on the chromosomes of upland cotton, a physical map was constructed. The results showed that the 72 GhMATE genes were non-randomly distributed on 19 chromosomes (Fig. S1). For example, Dt/chromosome 9 contained 11 GhMATE genes, and Dt/chromosome 5 contained 6. By contrast, only one GhMATE gene was located on each of Dt/chromosome 10, At/chromosome 8 and At/chromosome 11. In addition, a relatively high density of GhMATE genes was found at specific positions on some chromosomes, such as the top of Dt/chromosome 5 and the middle of At/chromosome 9.

Gene duplication, which is one of the primary driving forces in the evolution of genomes, was also analyzed in this study. Among the 72 identified GhMATE genes, 5 duplicated gene pairs out of 27 sister pairs were identified in G. hirsutum (Fig. 4). All of them were located in segmentally duplicated regions and none was in tandem repeats, indicating that segmental duplication might play the main role in the expansion of the upland cotton MATE family.

Fig. 4

Distribution and duplication of the MATE genes in upland cotton. The GhMATEs are shown on 16 chromosomes of the upland cotton genome. Numbers along each chromosome box represent sequence lengths in megabases (Mb). Duplicated genes are linked by lines.

Identification of orthologous MATE genes in different cotton species

To further interpret the evolutionary history of the MATE family, orthologous MATE genes need to be identified. A comparative analysis was therefore performed to identify orthologous MATE genes among G. hirsutum, G. arboreum and G. raimondii. A total of 42 collinear gene pairs, attributable to ancient tetraploidization, were found within the cotton genomes, including 28 orthologous gene pairs between G. hirsutum and G. raimondii and 14 between G. hirsutum and G. arboreum (Fig. 5). This difference is perhaps due to a closer relationship between G. hirsutum and G. raimondii than between G. hirsutum and G. arboreum. Meanwhile, 14 GhMATE genes were not found in duplicated blocks, which suggests that independent duplication events occurred in addition to the whole-genome duplication event during evolution.

Fig. 5

Microsynteny of the MATE genes in different cotton species. The G. hirsutum, G. arboreum and G. raimondii chromosomes are shown in red, yellow and blue, respectively. Numbers along each chromosome box indicate sequence lengths in Mb. Black lines represent the syntenic relationships between MATE genes. (Color figure online)

Microarray analysis of GhMATE genes

Based on genome-wide transcript data for upland cotton, the expression patterns of GhMATE genes were investigated. The results indicated that most GhMATE genes exhibited distinct time-specific expression patterns (Fig. 6). Of the 72 GhMATE genes, some were highly expressed at one or more developmental stages, while others showed little or no expression at individual stages of fiber development. Among the highly expressed GhMATE genes, the majority were expressed at 10 DPA. Furthermore, the numbers of GhMATE genes expressed at different stages were similar. Most duplicated GhMATE genes shared a similar expression pattern. The expression patterns of GhMATE genes suggest that they play different roles at different stages of cotton fiber development, which is consistent with the results of the phylogenetic and protein motif analyses.

Fig. 6

Expression profiles of the MATE genes at different fiber development stages of upland cotton. Each small colored box represents one data point. Colors indicate gene expression levels, and the scale on the right shows the absolute expression value for each color. The numbers on the left represent the relationships between genes, which are divided into 22 clusters. The samples of cotton fibers are as follows: 0 day (0 DPA), 5 days (5 DPA), 10 days (10 DPA), and 20 days (20 DPA). (Color figure online)

According to the expression patterns of GhMATE genes from subgroup C1-4 at different fiber development stages (0 DPA, 5 DPA, 10 DPA and 20 DPA), the relative expression levels of GhMATE12, GhMATE16 and GhMATE38 were similar to that of GhTT12, which has a significant impact on the transport of PAs. Expression was highest at 10 DPA and gradually decreased afterwards (Fig. 6). The PA content was also maximal at 10 DPA (Fig. 7), indicating that the accumulation of PAs was positively correlated with the expression of GhTT12 and its family members and suggesting that the members of subgroup C1-4 might play an important role in the transport of PAs.

Fig. 7

The content of PAs at different fiber development stages in upland cotton. a The content of soluble PAs, b the content of insoluble PAs

GUS assay and expression patterns of GhTT12 in Arabidopsis

To elucidate the spatial and temporal expression patterns of GhTT12, stable GhTT12 transgenic Arabidopsis plants were assayed. Histochemical GUS staining of homozygous single-insertion T2 lines showed that the GUS gene was expressed in the different tissues tested (roots, stems and leaves) (Fig. 8).

Fig. 8

The GUS staining of Arabidopsis thaliana. a The plant of Arabidopsis; b the root of Arabidopsis; c the leaf of Arabidopsis; d the stem of Arabidopsis

The expression of GhTT12 in the seed coat of transgenic Arabidopsis was 4.8 times that in the tt12 mutant (Fig. 9a). The total PA content in the seed coat of transgenic Arabidopsis was 5.7 times that of the mutant, while that of the wild type was 13.1 times that of the mutant (Fig. 9b). These results suggest that GhTT12 plays an essential role in the transport and accumulation of proanthocyanidins.

Fig. 9

Spatial and temporal expression patterns of GhTT12 in Arabidopsis. a Relative expression of GhTT12 in seed coat, b the PA content of seed coat

Discussion

In previous studies, genome-wide analyses and functional studies of some MATE family members have been conducted in plants, but most MATE genes mainly function in responses to abiotic stress [6, 7, 11, 44, 45]. Only a few reports indicate that MATE genes are involved in the transport of secondary metabolites [8,9,10]. In particular, a genome-wide study of the MATE family has not been reported in upland cotton (G. hirsutum), and it is not known whether MATE genes are involved in the transport of PAs during plant growth and development in brown cotton. The whole genome sequence of upland cotton (TM-1) was not completed until 2015, and its availability made it possible to predict the evolutionary history of the MATE gene family in upland cotton.

In the present study, a systematic and comprehensive whole-genome analysis of the MATE family was carried out in upland cotton (G. hirsutum), covering phylogeny, gene structure, conserved motifs, chromosomal location, gene duplication, expression profile and subcellular localization. The number of MATE genes in upland cotton (72 MATE genes) was higher than that in Arabidopsis (55 MATE genes) [44] and rice (56 MATE genes) [45] but lower than that in soybean (117 MATE genes) [7], which indicates that the MATE family in upland cotton has expanded compared with those in Arabidopsis and rice. Polyploidization produces a whole-genome duplication, and the subsequent diploidization process is characterized by rapid genomic reorganization and massive gene loss [46,47,48]. Gene duplication is one of the primary driving forces in the evolution of genomes, and it commonly occurs as segmental or tandem duplication [49, 50]. Interestingly, many more GhMATE genes were involved in segmental duplication than in tandem duplication, suggesting that segmental duplications are the main contributors to the expansion of the MATE family in upland cotton. The number of orthologous genes between G. hirsutum and G. raimondii was greater than that between G. hirsutum and G. arboreum, which provides a new resource for understanding the evolution of the MATE gene family among different species.

Phylogenetic analyses revealed that the 72 GhMATE proteins were divided into four clades designated C1–C4. Clade C1 was further divided into five subgroups. In subgroup C1-1, AtFFT is a flavonoid transporter that affects flavonoid levels in Arabidopsis [51], and VvAM1 and VvAM3 are involved in the transport of acylated anthocyanins into vacuoles in grapevine [52, 53]. In subgroup C1-2, the only known MATE protein, OsMATE1, is involved in resistance to aluminum stress [45]. NtMATE1 and NtMATE2 in subgroup C1-3 can transport alkaloids from the cytosol into the vacuole in tobacco [54]. In subgroup C1-4, AtTT12, originally isolated during a screen for mutants with altered seed coloration, is thought to be the first MATE transporter shown to be involved in the transport of flavonoids. MtMATE1, localized in the tonoplast, is a functional ortholog of AtTT12 [23], and VvMATE1 plays a role in the accumulation of PAs [55]. AtALF5 in subgroup C1-5 is associated with abnormal lateral root formation and with sensitivity to various toxic compounds. Based on the functions of the known MATE transporters in subfamily C1, the MATE proteins in this clade might transport and accumulate flavonoids or alkaloids in plants. In clade C2, AtDTX1 mediates the efflux of toxic compounds such as plant-derived antibiotics and detoxifies heavy metals such as Cd2+ [44]. NtJAT1 is a secondary transporter that shows nicotine efflux activity in yeast and is involved in nicotine translocation [56]. Therefore, subfamily C2 might participate in the efflux of various compounds. In clade C3, SlMATE may function as a vacuolar transporter of anthocyanin in tomato leaves [57]. AtADS1 is a negative regulator associated with plant disease resistance [58]. AtZF14 is reported to be involved in iron homeostasis. Additionally, AtZF14 and AtADS1 also play a role in the regulation of hypocotyl cell elongation [59]. By reference to the 17 reported MATE proteins from other plant species, the 7 GhMATE proteins in clade C4 are inferred to be involved in Al detoxification/iron translocation in cotton.

The structural diversity of genes is considered to drive the evolution of gene families. Moreover, the differences in the characteristics of the subfamilies suggest that MATE members are functionally diversified in cotton. Introns are specifically inserted into the plant genome and retained during the course of evolution [60]. Therefore, it is speculated that the MATE coding regions gained or lost introns in a subfamily-specific manner. Furthermore, motif analysis revealed that different conserved motifs were present in each cotton MATE protein. However, the functions of these motifs in the MATE family are still unclear.

Gene expression patterns can help to elucidate the function of a gene. Genome-wide transcript profiling data of cotton ovules were used to examine the expression of GhMATEs. Almost one third of the 72 GhMATE genes exhibited little expression at some specific stages, which indicates that GhMATEs are expressed at specific developmental stages. Based on the phylogenetic tree, GhMATE12, GhMATE16 and GhMATE38 from subgroup C1-4 were selected for expression analysis by qRT-PCR in upland cotton. The results showed that these MATE genes had similar expression patterns. At the same time, the PA contents at different fiber development stages were closely related to the expression levels of MATE genes from subgroup C1-4 in upland cotton. In addition, GhTT12, which was reported to be involved in the transport of PA, was found to localize to the vacuolar membrane [12]. The results suggest that GhMATE12, GhMATE16 and GhMATE38, like GhTT12, may be involved in the vacuolar transport of PA. In summary, the present results improve the understanding of the molecular evolution of the MATE gene family in plants and of the functions of GhMATEs in upland cotton.