Introduction

Aquaporins (AQPs) belong to the major intrinsic protein (MIP) superfamily of integral membrane proteins present in bacteria, fungi, animals, and plants. These proteins play an important role in facilitating the diffusion of water and other small molecules such as glycerol, urea, ammonia (NH3), carbon dioxide (CO2), hydrogen peroxide (H2O2), boron, and silicon through cell membranes (Gomes et al. 2009; Maurel et al. 2015; Pommerrenig et al. 2015). Many studies have demonstrated the involvement of AQPs in different physiological processes such as the response to abiotic stresses (Khan et al. 2015; Zhu et al. 2019), photosynthesis, stomatal and mesophyll conductance (Perez-Martin et al. 2014), and seed longevity and viability (Mao and Sun 2015). Genome-wide analyses have been used to characterize and establish the number of homologous genes in many eudicot and monocot species, e.g., 35 in Arabidopsis thaliana (Johanson et al. 2001); 55 in Populus trichocarpa (Gupta and Sankararamakrishnan 2009); 71 in Gossypium hirsutum (Park et al. 2010); 47 in Solanum lycopersicum (Reuscher et al. 2013); 41 in Sorghum bicolor L. (Reddy et al. 2015); 28 in Beta vulgaris (Kong et al. 2017); 41 in Zea mays L. (Bari et al. 2018); and 34 in Citrus sinensis (Wei et al. 2019).

AQPs have highly conserved structures in plants, such as a molecular weight varying between 23 and 31 kDa, six transmembrane helices connected by five loops named A to E, and N- and C-terminal tails both exposed to the cytosol (Törnroth-Horsefield et al. 2006). The NPA (asparagine-proline-alanine) motifs in loops B and E are highly conserved. The conservation of specific amino acid residues at key positions in aquaporin sequences denotes a signature pattern in relation to structure, function, and substrate selectivity. These residues create the aromatic/arginine (ar/R) selectivity filter, which is characterized by four residues, one in helix 2 (H2), one in helix 5 (H5), and two in loop E (LE1–LE2), and acts as a size-exclusion barrier that regulates the transport specificity for water and small neutral solutes (Hub and de Groot 2008; Mitani-Ueno et al. 2011). Five key positions in the protein sequence (P1–P5), in which the residues are associated with function and with specific physicochemical properties for each subgroup, were termed Froger’s positions (Froger et al. 1998). Thus, the analysis of these positions in protein sequences may give clues about their specificity and functionality. Based on sequence homology, the aquaporins in plants can be classified into eight subfamilies. Five of them are found in higher plants: plasma membrane intrinsic proteins (PIPs), tonoplast intrinsic proteins (TIPs), nodulin 26-like intrinsic proteins (NIPs), small basic intrinsic proteins (SIPs), and uncategorized X intrinsic proteins (XIPs), the last two with still unknown function (Maurel et al. 2015). The other three subfamilies are the glycerol facilitator–like intrinsic proteins (GIPs) and the large intrinsic proteins (LIPs), found exclusively in algae and mosses, respectively (Hussain et al. 2019), and the hybrid intrinsic proteins (HIPs), found in lycophytes and mosses (Anderberg et al. 2012; Hussain et al. 2019).

At the molecular level, the response to water stress alters the expression of several genes, including those in the aquaporin family. In plants, aquaporins play key roles in regulating hydraulic conductance in roots and leaves, regulating water flow into and out of cells, and water transfer across membranes (Shekoofa and Sinclair 2018). The evaluation of aquaporin genes in contrasting Phaseolus vulgaris genotypes (i.e., tolerant and susceptible to water stress) indicated that the ability of the plant to conserve water during deficit in tolerant cultivars involves reduced expression of aquaporin-specific isoforms (Zupin et al. 2017). A global analysis of AQPs gene expression in C. sinensis submitted to water deficit revealed that isoforms from different aquaporin families had increased expression in leaves (Martins et al. 2015) indicating that each isoform responds specifically to the water deficit.

Despite its importance in plant physiology, little is known about the AQP superfamily in the Coffea genus (Santos and Mazzafera 2013; Miniussi et al. 2015). Coffee is one of the most important global agricultural export commodities. The perennial diploid species Coffea canephora Pierre ex A. Froehner (2n = 2x = 22) represents approximately 30% of world production. Abiotic stresses, especially water deficit, are major constraints for coffee production in most producing countries (Da Matta and Ramalho 2006). Coffee breeding programs have aimed at characterizing genotypes with differential water deficit responses to study physiological parameters involved in drought tolerance. For example, C. canephora clone 14 is characterized as tolerant, while clone 109A is considered susceptible to drought (Marraccini et al. 2012; Lima et al. 2015).

The assembled sequence of the C. canephora genome (Denoeud et al. 2014) allows the use of these data for the identification of new genes with biotechnological potential. Therefore, in this study, we present the first genome-wide analysis of aquaporin genes in C. canephora (CcAQPs). For this, we performed an in silico identification of AQPs genes and, subsequently, a comprehensive analysis of the phylogenetic relationships, gene structure, chromosomal distribution, transmembrane domains, conserved motifs, identification and characterization of NPA motifs, ar/R selectivity filter residues, and Froger’s positions. In parallel, we analyzed and discussed the gene duplication events and synteny in coffee AQPs. Additionally, we also investigated the expression profile of selected CcPIPs and CcTIPs isoforms in response to water deficit.

Materials and Methods

Database Search for Identification of CcAQP Genes

Firstly, to identify AQP family members, a genome-wide search was performed using “Aquaporin” as the keyword in the C. canephora database (http://coffee-genome.org; Denoeud et al. 2014). In order to perform comparative analyses, AQP sequences were also searched in other species. The C. arabica and C. eugenioides sequences were retrieved from the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov; Altschul et al. 1997), and the Arabidopsis thaliana, Solanum lycopersicum, Oryza sativa, and Solanum tuberosum sequences were downloaded from the Phytozome v12.0 platform (http://www.phytozome.net; Goodstein et al. 2011). In addition, to get a better comprehension of the exon/intron structure of each CcAQP gene, we compared the coding sequences (CDS) with their corresponding genomic DNA sequences using the online Gene Structure Display Server (GSDS; http://gsds.cbi.pku.edu.cn; Hu et al. 2015).

Phylogenetic Analysis of AQP Proteins

Multiple sequence alignment of the deduced AQP amino acid sequences from C. canephora, C. arabica, C. eugenioides, A. thaliana, S. lycopersicum, S. tuberosum, and O. sativa was performed using the ClustalW platform (http://www.genome.jp/tools-bin/clustalw; Thompson et al. 1994). The alignments were then used to construct phylogenetic trees with the MEGAX software (Kumar et al. 2018) by the maximum likelihood (ML) method using a bootstrap procedure with 1000 replicates. The CcAQPs sequences were assigned to different subfamilies by comparison with the other species.

Chromosomal Location and Identification of Duplication Events

The putative CcAQPs genes were mapped onto C. canephora chromosomes using the MapChart 2.2 software (Voorrips 2002). Reciprocal BLASTN was carried out to establish paralogous relationships within C. canephora. The hit threshold values were set as E-value < 1e−50, score > 200, and positive > 80% (Huang et al. 2016). The synonymous (Ks) and non-synonymous (Ka) substitution rates of paralogous and orthologous genes were analyzed with KaKs_Calculator 2.0 (Zhang et al. 2006). The approximate dates of the duplication events were calculated by the equation T = Ks/2λ, assuming an average synonymous substitution rate (λ) of 7.41 × 10^−9 (Yu et al. 2011).
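
The dating step can be illustrated with a minimal sketch, assuming that T is obtained in years and then converted to million years. The function name and the example Ks value are hypothetical and are not taken from the supplementary data; only the equation and the value of λ come from the text.

```python
# Illustrative sketch: date a duplication event from its Ks value.
# LAMBDA is the synonymous substitution rate quoted in the text (Yu et al. 2011).
LAMBDA = 7.41e-9


def duplication_age_mya(ks: float, rate: float = LAMBDA) -> float:
    """Return T = Ks / (2 * lambda), converted from years to million years."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    return ks / (2 * rate) / 1e6


# Hypothetical Ks value, chosen only to show the arithmetic.
example_ks = 0.10
print(f"Ks = {example_ks:.2f} -> {duplication_age_mya(example_ks):.2f} mya")
```

The factor of 2 reflects that both copies accumulate substitutions independently after the duplication.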

Synteny Analysis of AQP Genes in Coffee Species

To analyze the synteny relationship among AQPs genes, we used the orthologous genes of the three coffee species: C. arabica (allotetraploid species; 2n = 4x = 44), C. canephora, and C. eugenioides (diploid species; 2n = 2x = 22). The Circos software was used to represent the synteny among coffee species (http://circos.ca/; Krzywinski et al. 2009).

Subcellular Localization, Transmembrane Domains, and In Silico Transcriptional Profiling

The length, molecular weight (kDa), and isoelectric point (pI) of the putative CcAQPs proteins were calculated using the ProtParam tool (http://web.expasy.org/protparam). Subcellular localizations of CcAQPs were predicted using the Plant-mPLoc algorithm with default parameters (Chou and Shen 2010). In addition, to analyze the in silico expression of AQP isoforms in different organs and tissues of C. canephora, the RPKM (reads per kilobase per million mapped reads) values were retrieved from the Coffee Genome Hub database (http://coffee-genome.org; Denoeud et al. 2014). The data were compiled into a spreadsheet, log10 transformed, and used as input data for the in silico expression analysis using the Heatmapper algorithm (http://www1.heatmapper.ca; Babicki et al. 2016).
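
The log10 transformation of RPKM values might look like the following sketch. The pseudocount is an assumption: the text does not say how zero RPKM values were handled before transformation. The tissue values below are hypothetical.

```python
import math

# Assumption: a pseudocount of 1 keeps zero RPKM values defined after log10.
# The source does not state how zeros were treated.
PSEUDOCOUNT = 1.0


def log10_transform(rpkm_by_tissue: dict, pseudocount: float = PSEUDOCOUNT) -> dict:
    """Return log10(RPKM + pseudocount) for each tissue of one gene."""
    return {
        tissue: math.log10(value + pseudocount)
        for tissue, value in rpkm_by_tissue.items()
    }


# Hypothetical RPKM values for a single gene (not from the Coffee Genome Hub).
example = {"root": 99.0, "leaf": 9.0, "endosperm": 0.0}
for tissue, value in log10_transform(example).items():
    print(f"{tissue}: {value:.2f}")
```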

Motif Conservation Analysis of CcAQP Proteins

To gain more detailed information on conserved motifs, the deduced CcAQP protein sequences were analyzed using the MEME online program (Multiple Expectation Maximization for Motif Elicitation; http://meme.nbcr.net/meme/cgi-bin/meme.cgi; Bailey et al. 2006). The optimized parameters of MEME were as follows: the width of each motif ranged from 6 to 50, the maximum number of motifs to find was 20, and the other parameter settings were default values. The presence and number of transmembrane (TM) helices in the CcAQPs proteins were predicted using the TMHMM Server v.2.0 program (http://www.cbs.dtu.dk/services/TMHMM/). Additionally, the identification of conserved NPA motifs and ar/R selectivity filter residues (H2, H5, LE1, and LE2) and the analysis of Froger’s positions were performed by visual inspection of the alignment in comparison with AtAQPs. Since the XIP subfamily is not present in Arabidopsis, the conserved motifs of this subfamily in CcAQP were identified by comparison with sequences of R. communis (Zou et al. 2015) and G. hirsutum (Park et al. 2010).
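
The NPA motifs were identified here by visual inspection of the alignment. As a complementary illustration only, candidate NPA-like tripeptides can be located in a single sequence with a regular expression. The pattern below is an assumption that matches only N-P-x variants, so substitutions at the second position would be missed. The toy peptide is hypothetical and is not a CcAQP sequence.

```python
import re

# Assumption: candidate motifs are asparagine-proline followed by any residue,
# which covers NPA and third-position variants but not second-position changes.
NPA_LIKE = re.compile(r"NP[A-Z]")


def find_npa_like_motifs(protein: str) -> list:
    """Return (1-based position, tripeptide) for every N-P-x match."""
    return [(m.start() + 1, m.group()) for m in NPA_LIKE.finditer(protein.upper())]


# Hypothetical toy peptide used only to exercise the function.
toy_sequence = "SGAHLNPAVTFGLLSMNPVRSFG"
for position, motif in find_npa_like_motifs(toy_sequence):
    print(position, motif)
```

A scan like this can only flag candidates; assigning a match to loop B or loop E still requires the alignment and the predicted topology.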

Plant Materials and Water Deficit Treatments

The imposition of water deficit in C. canephora plants was performed according to Santos et al. (2015). In summary, twelve 18-month-old plants from two C. canephora clones, one characterized as drought-tolerant (clone 14) and the other as drought-susceptible (clone 109A), were used in this experiment (Lima et al. 2015; Marraccini et al. 2012). The leaf water status of coffee plants was monitored by thermocouple psychrometer chambers (model C-30, Wescor, Inc., Logan, UT, USA) coupled to a datalogger (Campbell Scientific, Inc., Logan, UT, USA, model CR-7). The stress conditions were the following: irrigated (control ± − 1.35 MPa), moderate stress (± − 2.35 MPa), severe stress (± − 4.3 MPa), and recovery (rehydrated plants 72 h after reaching the water potential established as the severe stress). Leaf samples were collected, frozen immediately in liquid nitrogen, and kept at − 80 °C until RNA extraction.

Total RNA Extraction and cDNA Synthesis

Total RNA extraction from C. canephora leaves was performed for the water deficit experiments using the PureLink RNA Reagent kit™ (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s recommendations. The integrity of total RNA was verified by electrophoretic separation on 1% (w/v) agarose gel. RNA concentration and purity were determined using a NanoDrop™ Lite Spectrophotometer (Thermo Fisher Scientific). Five micrograms of total RNA was used to synthesize complementary DNA (cDNA) using the SuperScript IV Reverse Transcriptase™ (Invitrogen) in a final volume of 20 μl following the manufacturer’s instructions. The cDNA products were diluted 1:10 before use in RT-qPCR.

Primer Amplification Efficiency and Relative Expression

Primers were designed outside of the conserved region for each CcAQP isoform using Primer Express™ v3.0.1 (Applied Biosystems). RT-qPCR reactions were performed using 5 μl of SYBR Green, 0.4 μl of sense and antisense primers (5 μM), and 1 μl of cDNA sample, in a final volume of 10 μl completed with autoclaved Milli-Q water. The amplification reaction was performed with three biological replicates (plants in each condition) and three technical replicates using the StepOnePlus™ Real-Time PCR System (Applied Biosystems) following the conditions: 40 cycles of denaturation at 95 °C for 2 min, annealing at 95 °C for 30 s, and extension at 60 °C for 30 s. The amplification efficiency of each primer pair was estimated using the LinRegPCR program (Ramakers et al. 2003). Relative expression was calculated by the ΔCt method (Pfaffl 2001), where the expression level of the target genes (CcAQPs) was normalized against CcUBQ10 as the reference gene (Marraccini et al. 2012). The significance of differences in expression of the CcAQPs genes between the different water regimes for each clone was tested using one-way ANOVA followed by the Tukey test (p < 0.05). Paired comparisons between the two clones (14 and 109A) within each water regime were performed by the t test (*p < 0.05).
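
A minimal sketch of a ΔCt calculation is given below, assuming a single amplification efficiency shared by the target and reference genes (2.0 means perfect doubling per cycle). How the LinRegPCR efficiencies entered the actual calculation is not specified in the text, so the `efficiency` parameter and the Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, efficiency: float = 2.0) -> float:
    """Return efficiency ** -(mean Ct of target - mean Ct of reference).

    Assumption: one efficiency value applies to both genes; 2.0 corresponds
    to 100% amplification efficiency.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return efficiency ** (-delta_ct)


# Hypothetical technical-replicate Ct values for one sample.
target_ct = [24.0, 24.0, 24.0]      # a CcAQP isoform
reference_ct = [20.0, 20.0, 20.0]   # the reference gene
print(relative_expression(target_ct, reference_ct))
```

A target that crosses the threshold four cycles after the reference is therefore estimated at 2^-4 of the reference transcript level under these assumptions.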

Results

Genome-Wide Identification of the CcAQP Gene Family and Phylogenetic Analysis

Thirty-three members of the AQP gene family were found in the C. canephora genome database, which were grouped into five major clusters by the phylogenetic analysis (Fig. 1). The CcAQPs genes were named based on their similarity and phylogenetic relationship with A. thaliana as described by Johanson et al. (2001) (Fig. S1). To assess phylogenetic relationships among the C. canephora AQPs and to group them into the well-established subfamilies, a phylogenetic tree was generated based on multiple sequence alignments of C. canephora, C. arabica, C. eugenioides, A. thaliana, S. lycopersicum, S. tuberosum, and O. sativa (Fig. 2). The 33 CcAQPs proteins were classified into five subfamilies: 7 PIPs, 9 TIPs, 11 NIPs, 3 SIPs, and 3 XIPs. The XIP subfamily is not present in Arabidopsis. It is characterized by a second NPA motif extended to NPARC, which is present only in this subfamily (Danielson and Johanson 2008). To verify the presence of this motif, the putative XIP sequences of C. canephora, C. arabica, and C. eugenioides were aligned, visually inspected, and compared with XIP sequences from other dicotyledonous species including R. communis (Zou et al. 2015) and G. hirsutum (Park et al. 2010). All members of the CcPIP subfamily showed a conserved structure of four exons (Fig. 1). Most members of the NIP subfamily were characterized by four or five exons, except for CcNIP4;2, which has only three exons. Most members of the TIP subfamily have three exons, while CcTIP1;2 has two exons and CcTIP4;1 has only one exon of 300 bp. The isoforms CcSIP1;1 and CcSIP2;1 contain three exons, whereas CcSIP2;2 has only one. The XIP subfamily presented a conserved three-exon structure (Fig. 1).

Fig. 1

Phylogenetic tree from CcAQPs of C. canephora generated by MEGAX using the ML method derived from the ClustalW alignment. Bootstrap analysis was performed with 1000 replicates; only bootstrap values higher than 50% are indicated above the branches. AQP gene structure is represented by exons and introns as dark gray boxes and light gray lines, respectively

Fig. 2

Phylogenetic tree of AQPs including: C. canephora, C. arabica, C. eugenioides, A. thaliana, S. lycopersicum, Oryza sativa, and S. tuberosum generated by MEGAX using the ML method derived from the ClustalW alignment. Bootstrap analysis was performed with 1000 replicates; only bootstrap values higher than 50% are indicated above the branches. Different colors indicate different groups of CcAQP genes

A detailed description of the physical and chemical properties of the CcAQPs is given in Table 1. The deduced full-length protein sequences ranged from 119 to 328 amino acids. Molecular weight varied from 12.36 to 35.10 kDa, and the isoelectric point (pI) ranged from 5.12 to 9.66. Most CcAQPs were predicted to have six transmembrane helices (TMHs), while CcNIP4;2 and CcTIP4;1 contain only two. Additionally, the amino acid sequences of the CcAQPs indicate that they are located in the cell membrane, in the vacuole or, simultaneously, in both (Table 1).

Table 1 AQPs genes identified in the C. canephora genome

Chromosomal Location, Duplication Events and Synteny Analysis of CcAQPs

CcAQP genes were mapped onto the 11 chromosomes of C. canephora (Fig. 3). The largest number of CcAQP genes (nine) mapped to Chr02, followed by six on Chr08; three each on Chr01, Chr04, and Chr07; two each on Chr06 and Chr09; and only one each on Chr03 and Chr10. In contrast, no putative AQP was found on Chr05. Two CcAQPs (CcTIP3;1 and CcSIP1;1) were assigned to the unsorted sequence scaffolds denominated ChrUn (for “unknown”) (Fig. 3).

Fig. 3

Distribution of CcAQP genes into the 11 C. canephora chromosomes. Vertical bars represent the chromosomes and numbers at the left indicate gene positions (the scale on the left is in megabases, Mb). The chromosome number is indicated on the top of each chromosome (vertical bar, numbered Chr01–Chr11). The chromosome designated as ChrUn is not a true chromosome, but unsorted sequence scaffolds as described by Denoeud et al. (2014). Blue lines are indicative of duplication events for CcPIP genes; green lines are indicative of duplication events for the CcTIP genes; red lines are indicative of duplication events for CcXIP genes and yellow lines are indicative of duplication events for CcNIP genes

We also estimated the non-synonymous (Ka) and synonymous (Ks) substitution rates to verify whether selection pressure participated in the expansion of CcAQP genes. Selection acting during duplication and divergence was assessed by estimating Ka, Ks, and their ratio (Ka/Ks) for five tandem and 28 segmental duplicated gene pairs of CcAQP (Supplementary Table S1). The results suggest that segmental duplication events of the CcAQP genes in C. canephora occurred from 0.74 mya (million years ago) (Ks = 0.02) to 10.91 mya (Ks = 0.27), whereas tandem duplication events ranged from 1.67 (Ks = 0.24) to 9.08 mya (Ks = 0.24).
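
For orientation, the conventional reading of a Ka/Ks ratio can be sketched as follows. The thresholds are the standard textbook convention (ratio above 1 suggests positive selection, below 1 purifying selection, close to 1 neutral evolution) and are an assumption here, since the text does not state the cut-offs that were applied. The function name, tolerance, and example values are hypothetical.

```python
def selection_regime(ka: float, ks: float, tolerance: float = 1e-6) -> str:
    """Classify a gene pair by its Ka/Ks ratio using the conventional reading."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


# Hypothetical Ka and Ks values, not taken from Supplementary Table S1.
print(selection_regime(ka=0.05, ks=0.25))
```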

AQP gene synteny among the three coffee species (C. canephora, C. eugenioides, and C. arabica) revealed the presence of 67 syntenic relationships: 4 SIPs, 20 NIPs, 14 TIPs, 12 PIPs, and 17 XIPs (Fig. 4 and Table S2). Most CcAQPs isoforms had a linear syntenic relationship with the AQPs in the C. arabica and C. eugenioides genomes, with the exception of CcSIP2;1, CcTIP1;1, CcTIP4;1, CcNIP2;1, CcNIP4;1, CcXIP1;1, and CcXIP2;1, which were syntenic with the corresponding subfamily but on different chromosomes (Table S2). When the analysis was performed pairwise, it revealed 55 syntenic events between C. canephora × C. arabica (Table S3) and 39 between C. canephora × C. eugenioides (Table S4). Of the 33 CcAQPs identified in this work, 31 (93.94%) were syntenic with CeAQPs and CaAQPs (Tables S3 and S4). The isoforms CcSIP2;2 and CcNIP3;1 showed no gene correspondence with any AQP in C. eugenioides or C. arabica (Table S2). Furthermore, the NIP subfamily presented the largest number of syntenic events, 18 (C. canephora × C. arabica) (Table S3) and 13 (C. canephora × C. eugenioides) (Table S4). We found that some syntenic AQP gene pairs shared the same C. canephora AQP member associated with C. eugenioides and C. arabica, suggesting that they may originate from a common ancestor.

Fig. 4

Synteny analysis for AQP genes among C. canephora, C. eugenioides, and C. arabica. CcChr1 to CcChr11 represented the 11 chromosomes in C. canephora. CeChr1 to CeChr11 represented the 11 chromosomes in C. eugenioides. CacChr1 to CacChr11 and CaeChr1 to CaeChr11 represented the 11 chromosomes in the C. canephora (Cac) and C. eugenioides (Cae) sub-genomes in C. arabica, respectively. The orthologous AQP genes were mapped onto chromosomes and linked to each other. Different colored lines in the interior represent different subfamilies of AQP genes

The isoforms CcTIP3;1 and CcSIP1;1 have not yet been positioned on the available C. canephora genome assembly v.01 and consequently were mapped on ChrUn (Fig. 3). Our synteny analysis revealed that CcTIP3;1 was syntenic with CeTIP3;1, CaTIP3;1, and CaTIP3;2, localized on chromosome 4, and that CcSIP1;1 was syntenic with CeSIP1;1, CaSIP1;1, and CaSIP1;2, localized on chromosome 6 of each genome (Table S2). Therefore, we suggest that CcTIP3;1 and CcSIP1;1 could be positioned on chromosomes 4 and 6 of C. canephora, respectively.

Motif Prediction

Fifteen conserved motifs were predicted as shown in Fig. 5. Motifs 1, 2, and 3 were widely found in the five subfamilies. Motif 4 is present in the PIP, TIP, and XIP subfamilies, and in four NIP members. Motifs 5, 6, and 7 are found exclusively in the XIP, NIP, and PIP subfamilies, respectively. Motif 8 is widely present in the PIP, XIP, and NIP subfamilies. Motif 9 is widely present in the PIP and TIP subfamilies, while motif 10 is limited to the PIP subfamily (Fig. 5).

Fig. 5

Motifs conserved among the CcAQPs proteins, where each motif is represented by different color blocks as indicated at the bottom of the figure. The same color block on different proteins indicates each particular motif

Protein Sequences Analysis

The amino acid sequences of the CcAQPs were aligned, and the transmembrane helices (TMHs), NPA motifs, ar/R selectivity filter, and Froger’s positions were visually inspected to gain some insight into the physiological role and substrate specificity of each aquaporin (Table 2 and Fig. S3). The number of transmembrane helices varied from 2 (CcTIP4;1 and CcNIP4;2) to 7 (CcTIP1;1, 2;1, 2;2, CcNIP4;1, CcXIP2;1, and 2;2), with the majority of CcAQPs (16) presenting six TMHs (Fig. S3).

Table 2 Amino acid composition of the NPA motifs, ar/R selectivity filters (H2, H5, LE1, LE2) and Froger’s positions (P1-P5) in C. canephora aquaporins

Analysis of amino acid residues (Table 2) revealed the conserved NPA motif in the first loop (LB) of all CcNIPs, CcPIPs, and CcTIPs. In the three CcXIPs, the alanine (A) residue in the third position was replaced by isoleucine (I). In the CcSIPs, the alanine (A) in the third position was replaced by threonine (T) in CcSIP1;1 and by leucine (L) in CcSIP2;1 and 2;2. The second NPA motif is conserved in the PIP, SIP, XIP, and TIP subfamilies, with the exception of CcTIP4;1, in which it is absent. On the other hand, the CcNIP subfamily showed greater diversity in the second NPA motif.

The amino acid residues of the selectivity filter (ar/R) revealed that the CcNIPs and CcTIPs were highly variable compared with the other subfamilies (Table 2). In CcSIPs, the selectivity filters were different from each other, with CcSIP2;2 presenting only the residue K in H2. On the other hand, CcPIPs were highly conserved (F-H-T-R), whereas CcXIPs presented variation only in H2 (V/L-F-A-R) (Table 2). Froger’s positions were highly conserved within CcXIPs, CcPIPs, and CcTIPs, with the only exceptions being CcPIP1;3 and CcTIP4;1 (Table 2). The positions P1, P2, and P5 were highly variable in CcNIPs, whereas P3 and P4 were conserved. CcSIP1;1 and 2;1 presented variation in the first two positions, while CcSIP2;2 showed residues only in P4 (E) and P5 (F) (Table 2).

Expression Analyses of CcAQP Genes in Different Tissues

The in silico expression patterns of the different members of the CcAQP gene family in C. canephora were determined using RNA-seq data generated from root, stamen, pistil, leaf, perisperm, and endosperm libraries (Denoeud et al. 2014). The RPKM values of each putative gene were displayed in a heatmap (Fig. 6) to infer their transcriptional profile in the different tissues. In summary, the genes CcPIP1;3, 2;2, 2;4, and CcTIP1;2 and 2;2 are highly expressed in all tissues, while most CcXIPs and CcNIPs show low expression levels, with the exception of CcXIP2;1 (pistils), CcNIP5;1 (roots), CcNIP4;1 (stamens), and CcNIP1;7 (perisperm). Other isoforms presented a specific pattern of expression according to the tissue analyzed, e.g., CcTIP1;1, 2;2, and 5;1 showed high expression only in roots, whereas CcTIP3;1 and CcNIP4;1 also presented noticeable expression in stamens (Fig. 6).

Fig. 6

Heatmap of transcription profiling of CcAQPs genes in various tissues (root, stamen, pistil, leaf, perisperm, and endosperm). The heatmap was generated based on the RNA-seq (RPKM) datasets available at http://coffee-genome.org; Denoeud et al. 2014. Green indicates tissues with low expression and red indicates high expression

Relative Gene Expression for Selected CcPIPs and CcTIPs in C. canephora Leaves Under Water Deficit

A total of five CcAQPs with high RPKM values in leaves (CcPIP1;2, 2;3, and 2;4 and CcTIP1;2 and 2;1) were selected randomly for further RT-qPCR analysis. The dissociation curve for each primer pair confirmed the presence of a single amplification product. Primer sequences, amplicon size, and amplification efficiency are shown in Table S5. Relative expression of each CcAQP was normalized against the expression of the CcUBQ10 ubiquitin gene (Marraccini et al. 2012).

In the leaves of plants submitted to water deficit, the relative expression of the CcPIP2;3 and CcPIP2;4 isoforms was significantly higher in clone 109A than in clone 14 under moderate and severe stress. The expression level of the isoform CcPIP1;2 also increased in clone 109A under severe stress and after re-irrigation (~ 1.5-fold) in relation to clone 14 (Fig. 7a). The expression of CcPIP2;3 and 2;4 increased ~ 2.3- and 1.5-fold in plants under moderate stress and ~ 2- and 3.6-fold under severe stress, respectively (Fig. 7b and c). The transcript abundance of CcPIP2;4 was lower than that observed for the other PIP isoforms; however, the difference in expression level between the susceptible (clone 109A) and the tolerant (clone 14) genotypes was greater (Fig. 7c). Clone 109A also showed a significant increase in relative expression of the two CcTIP isoforms (1;2 and 2;1) when the plants were subjected to severe water deficit, and also in the re-irrigated treatment for CcTIP1;2 (Fig. 7d and e).

Fig. 7

Relative expression of AQP genes in C. canephora leaves under different irrigation conditions. The expression of the target genes was calculated by ΔCt method using ubiquitin gene as internal reference. The values represent the mean (± SD) of three biological replicates for each treatment. Upper- and lower-case letters indicate comparisons between the different water regimes for clone 109A and 14, respectively (Tukey’s, p < 0.05). A paired comparison between the clones in each water regime was performed by t test (*p < 0.05)

Discussion

Aquaporins (AQPs) are water channel proteins that facilitate the rapid transport of water and small molecules across plant cell membranes (Maurel et al. 2015). In this study, we identified 33 AQPs in the C. canephora genome. The sequences were grouped into five subfamilies: 7 PIPs, 9 TIPs, 11 NIPs, 3 SIPs, and 3 XIPs. Previous studies have reported similar numbers of AQPs in Arabidopsis (35) and rice (33). The AQPs were classified into only four subfamilies in these species: 13 PIPs, 10 TIPs, 9 NIPs, and 3 SIPs in Arabidopsis (Johanson et al. 2001) and 10 NIPs, 10 TIPs, 11 PIPs, and 2 SIPs in rice (Sakurai et al. 2005). The XIP subfamily, characterized by the second NPA motif extended to NPARC (Danielson and Johanson 2008), is absent in monocotyledons and in members of the Brassica family (Bari et al. 2018; Sun et al. 2018), as well as in Medicago truncatula and Lotus japonicus (Deshmukh and Bélanger 2016). Here, we also identified a total of 52 AQPs in C. arabica and 37 in C. eugenioides that were classified into five subfamilies (Figs. S5 and S6). The larger number of AQPs found in C. arabica is presumably due to its ploidy level, since it is a tetraploid species that originated by natural hybridization between the diploids C. canephora and C. eugenioides (Lashermes et al. 1999).

Further analysis of the CcAQPs genes revealed that segmental duplication (28 gene pairs) played a more prominent role in the expansion of CcAQPs genes than tandem duplication (5 gene pairs). According to Cannon et al. (2004), segmental duplications may occur more frequently in plants due to polyploidization events, leading to a large accumulation of duplicated chromosomal blocks in the genome. On the other hand, tandem duplications can be an important mechanism for the adaptation of resistance genes (Flagel and Wendel 2009). Some studies have reported that such events contribute to enhanced stress tolerance in plants (Fukao et al. 2006; Zou et al. 2009).

Comparison of intron and exon arrangements (number and position) among the 33 CcAQPs revealed wide variation in gene structure, as observed in other plants such as Linum usitatissimum (Shivaraj et al. 2017), R. communis (Zou et al. 2015), and Beta vulgaris (Kong et al. 2017). The sequences CcNIP5;1, CcTIP3;1, and CcXIP1;1 and 1;2 present molecular masses lower than 20 kDa, which has not been observed in other species (Zou et al. 2015; Kong et al. 2017). Regarding the transmembrane prediction, most of them showed the typical six transmembrane helices (Fig. S4). However, two transmembrane helices were absent in the sequences of CcNIP3;1, CcNIP4;2, CcTIP4;1, and CcSIP2;2. Further studies may be necessary to assess whether the lack of these domains is responsible for their low expression in all tissues as detected in silico (Fig. 6).

In addition, the data from the in silico analysis showed that CcNIP genes have low expression levels and tissue type and/or cell specificity, as was also observed in other species (Liu et al. 2009; Alexandersson et al. 2010). CcNIP5;1 and CcNIP5;2 were specifically expressed in roots, NIP1;2 in leaves and endosperm, and NIP4;1 in stamens. Previous studies in Arabidopsis showed the specificity of NIP5;1 in roots (Takano et al. 2006), while NIP6;1 was expressed specifically within the vascular tissue from phloem of developing young shoots (Tanaka et al. 2008) and NIP7;1 in floral tissues (Li et al. 2011). Regarding the CcXIPs, which have been reported to have very low participation in water transport (Bienert et al. 2011), the in silico expression analysis revealed low abundance of the three isoforms, with the only exception of CcXIP2;1 in pistil. The CcPIPs and CcTIPs genes showed the highest mRNA expression in silico; however, it is noteworthy that PIP1;3, PIP2;2, and TIP1;2 were constitutively expressed in all tissues, in contrast to the relative tissue specificity found for other isoforms. Members of the PIPs and TIPs of C. arietinum were also highly expressed in different tissues and are regarded as constitutive transporters (Deokar and Tar’an 2016).

AQPs genes were highly syntenic among the three coffee species. We suggest that this is due to their evolutionary history. The C. canephora and C. eugenioides species diverged from a common diploid ancestor ~ 4.2 mya (Yu et al. 2011). This was evident when comparing the number of orthologous genes between these two species: circa 93% of the 33 AQPs genes identified in C. canephora were syntenic with C. eugenioides. Only two genes, CcSIP2;2 and CcNIP3;1, did not present syntenic events with other coffee species, suggesting that these genes originated after the divergence from the common diploid ancestor or were lost during the evolutionary process. It is also worth mentioning that these isoforms presented only four TMHs and low in silico expression. The high number of syntenic events (55) between C. arabica and C. canephora is also likely the result of their evolutionary history. The tetraploid species (C. arabica) consists of two sub-genomes, and it is believed that this species appeared approximately 0.666 mya through natural hybridization between C. eugenioides and C. canephora (Yu et al. 2011).

AQPs in plants mediate water permeability and facilitate the transport of some small neutral molecules such as glycerol, boric acid, silicic acid, urea, NH3, CO2, and H2O2 (Li et al. 2014; Afzal et al. 2016). Therefore, the predicted CcAQP sequences were visually inspected for the presence of characteristic and functionally important domains and motifs of the AQP subfamilies. The double NPA motifs, the ar/R selectivity filter, and Froger’s positions are closely related to the substrate specificity of AQP proteins, as indicated by atomic-resolution and molecular simulation studies. The duplication of the NPA motif is considered a basic feature of MIPs, but exceptions to this rule have already been identified in plant NIPs and SIPs (Forrest and Bhave 2007; Danielson and Johanson 2008; Deokar and Tar’an 2016). In C. canephora, the third position of the first NPA motif was replaced by T or L in all CcSIP members, whereas among the CcNIPs, only a few members showed a replacement (V/I) in the second NPA motif. However, incompletely conserved NPA motifs do not seem to affect the function of these proteins as water channels (Johanson and Gustavsson 2002; Yakata et al. 2007) and may be a consequence of adaptive responses to changing environments (Liu and Zhu 2010).

The NPA consensus sequences SGXHXNPAVT (Heymann and Engel 2000) and GXXXNPAR(S/D)XG (Zardoya and Villalba 2001) were observed in all CcPIPs, CcTIPs, and CcNIPs. Notably, the first NPA motif is markedly conserved in all NIP, TIP, and PIP isoforms in C. canephora, while the second NPA motif varies in NIPs, where it can be replaced by NVI or NPV. These replacements in NIP sequences are considered a signature for boron uptake (Hove and Bhave 2011).
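
The motif inspection described above was done visually; as a rough illustration, it could also be sketched programmatically. The snippet below is a minimal example, not the procedure used in this study: the function names and the two toy sequences are hypothetical, and the pattern only captures variants that retain Asn-Pro (e.g., NPA, NPV, NPT), so a variant such as NVI would need its own pattern.

```python
import re

# NPA-like boxes: Asn-Pro followed by any residue (NPA, NPV, NPT, NPL, ...).
NPA_LIKE = re.compile(r"NP[A-Z]")

def find_npa_like(seq):
    """Return (1-based position, tripeptide) for every NP? match in seq."""
    return [(m.start() + 1, m.group()) for m in NPA_LIKE.finditer(seq.upper())]

def classify(seq):
    """Summarize the first two NP? boxes; return None if fewer than two exist."""
    hits = find_npa_like(seq)
    if len(hits) < 2:
        return None
    first, second = hits[0][1], hits[1][1]
    return {
        "first": first,
        "second": second,
        "canonical": first == "NPA" and second == "NPA",
    }

# Toy fragments for illustration only; these are NOT real CcAQP sequences.
toy_pip = "MSGGHINPAVTFGAAAAGGANMNPARSFG"
toy_nip = "MSGAHFNPAVTIGAAAAGASMNPVRSLG"

print(classify(toy_pip))
print(classify(toy_nip))
```

In full-length proteins, incidental NP? tripeptides may occur outside loops B and E, so any hits from such a scan would still need to be checked against the predicted topology.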

The ar/R selectivity filters in CcPIPs were highly conserved, with the exception of CcPIP1;3 and CcPIP2;4, in which the first residue (F) was absent. PIPs are known to be excellent water channels, with their selectivity filters and amino acid composition remaining invariable as a result of strong selective pressure (Hooijmaijers et al. 2012), and with each amino acid deviation being considered a selective disadvantage for the plant (Diehn et al. 2015). The ar/R selectivity filters of the CcTIP subfamily members were more variable, which may reflect the different solutes transported by these proteins, such as urea (Liu et al. 2003), NH3 (Loqué et al. 2005), and H2O2 (Bienert et al. 2007). The signatures SGGH(V/L/F)NPAVT and G(G/A)SMNPARS(F/L)G and the residues TSAYW were observed in all CcTIPs and are related to NH3 uptake and water selectivity, while the signature (T/Q/F-A/S-A-YF-W/I) at Froger’s positions P1–P5, present in most CcPIPs and CcTIPs, is related to H2O2 transport (Hove and Bhave 2011). CcPIPs and CcTIPs also showed the residues SAFW at Froger’s positions P2–P5, which are signatures for CO2 and non-metal uptake (Hove and Bhave 2011).

Plant NIPs are responsible for the transport, absorption, and translocation of various uncharged metalloids (Bienert and Bienert 2017). In plants, the NIP subfamily is composed of three subgroups, NIPI, NIPII, and NIPIII, which differ in their substrate selectivity (Hove and Bhave 2011). The residues GSGR in the ar/R selectivity filter and LTAYF at Froger’s positions are signatures of silicon transporters (Hove and Bhave 2011). Although CcNIP2;1 presented the conserved GSGR in the ar/R selectivity filter, the motif at Froger’s positions was slightly different (LSAYV). Most CcNIPs showed the residues FSAYI(L/V) at Froger’s positions, which are related to urea and H2O2 transport (Hove and Bhave 2011).
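
The comparison of CcNIP2;1 with the silicon transporter signature can be made explicit position by position. The following is a minimal sketch that uses only the residues quoted above; the helper function and the dictionary layout are illustrative assumptions and are not part of the original analysis.

```python
ARR_LABELS = ["H2", "H5", "LE1", "LE2"]
FROGER_LABELS = ["P1", "P2", "P3", "P4", "P5"]

# Signature residues for silicon transporters as quoted in the text.
SILICON_SIGNATURE = {"arR": "GSGR", "froger": "LTAYF"}

# Residues reported in the text for CcNIP2;1.
CCNIP2_1 = {"arR": "GSGR", "froger": "LSAYV"}

def mismatches(query, reference, labels):
    """List (position label, reference residue, query residue) where they differ."""
    if not (len(query) == len(reference) == len(labels)):
        raise ValueError("query, reference, and labels must have equal length")
    return [(lab, r, q) for lab, r, q in zip(labels, reference, query) if r != q]

arr_diff = mismatches(CCNIP2_1["arR"], SILICON_SIGNATURE["arR"], ARR_LABELS)
froger_diff = mismatches(CCNIP2_1["froger"], SILICON_SIGNATURE["froger"], FROGER_LABELS)

print("ar/R differences:", arr_diff)
print("Froger differences:", froger_diff)
```

With these inputs, the ar/R filter matches at all four positions, and the Froger’s positions differ at P2 (T to S) and P5 (F to V).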

All members of the CcXIP subfamily showed the NPI motif in the first loop (LB), while the second motif (NPA) in loop LE was expanded to NPARC (Reuscher et al. 2013; Danielson and Johanson 2008). Despite these variations, the Asn and Pro residues remain perfectly conserved, maintaining the aqueous pore stabilization and water permeation functions (Murata et al. 2000). Regarding ar/R filters, XIPs in eudicotyledonous plants can be divided into four subclasses (Gupta and Sankararamakrishnan 2009), two of which have an ar/R signature similar to that of some plant NIPs, whereas the other two are more hydrophobic. In C. canephora, all XIPs presented the hydrophobic residues V/LFAR, suggesting that they do not work as water channels but as transporters of bulkier and more hydrophobic solutes such as urea, glycerol, and metalloids (Bienert et al. 2011). No XIP sequences of C. arabica and C. eugenioides were found in the NCBI database, probably because sequences are usually annotated based on the Arabidopsis genome, in which this subfamily is absent. All putative XIP sequences in the three coffee species possess the second NPA motif expanded to NPARC, which is the typical signature of the XIP subfamily (Fig. S2).

The SIPs in C. canephora were also highly variable and did not present a distinctive signature in the ar/R selectivity filter. In contrast, the residues at Froger’s positions P3 to P5 (AYW) were conserved in CcSIP1;1 and CcSIP2;1. These positions are highly conserved in other plants such as Ricinus communis (Zou et al. 2015), Cicer arietinum L. (Deokar and Tar’an 2016), Linum usitatissimum (Shivaraj et al. 2017), Sorghum bicolor (Kadam et al. 2017), and Citrus sinensis (Wei et al. 2019).

Finally, we analyzed the transcriptional profiles of five aquaporin isoforms (3 CcPIPs and 2 CcTIPs) in leaves of two contrasting C. canephora genotypes (clones 14 and 109A, characterized as tolerant and susceptible to water deficit, respectively) (Marraccini et al. 2012; Lima et al. 2015) subjected to water deficit. Severe water deficit significantly affected transcript accumulation of all five isoforms in both clones. Under that condition, clone 109A showed increased expression of all isoforms, while the expression of CcPIP2;3, CcTIP1;2, and CcTIP2;1 was reduced in clone 14. Similar findings were observed in two contrasting genotypes of common bean under water deficit, where isoforms of PIPs and TIPs showed lower gene expression levels in the drought-tolerant genotype (Zupin et al. 2017). In chickpea (Cicer arietinum), different PIP isoforms showed contrasting expression patterns: CaPIP2;4 expression increased in tolerant and decreased in susceptible genotypes, whereas CaPIP2;1 expression increased in the susceptible genotype (Hussain et al. 2019). In Chrysanthemum morifolium, the isoforms PIP1 and PIP2 both had their highest expression levels in leaves when compared with roots, stems, flowers, and flower buds (Zhang et al. 2019). Nevertheless, plants under saline stress showed differential expression levels of PIP1 and PIP2 in leaves (Zhang et al. 2019). A genome-wide analysis of the 35 AQPs in A. thaliana plants subjected to water deficit revealed that most of the PIP and TIP genes were downregulated, suggesting that decreasing gene expression might be a way for the plant to minimize water loss and maintain turgor in leaves (Alexandersson et al. 2005, 2010). On the other hand, a genome-wide analysis in C. sinensis revealed no expression differences of CsAQPs in leaves between sensitive and tolerant cultivars during drought (Wei et al. 2019), which shows that the transcriptional profiles of AQPs in plants are highly variable in response to stress, depending on the isoform, plant tissue, and stress level.

AQPs in plants could play an important role in regulating transpiration in dry soils and under high vapor pressure deficit, allowing plants to conserve water to sustain physiological processes (Shekoofa and Sinclair 2018). Studies have suggested that some AQP isoforms increase their transcription level to facilitate water transport, while others are downregulated to decrease membrane water permeability and avoid excessive water loss during stress (Zargar et al. 2017; Zhang et al. 2019).

Conclusion

Here, we identified 33 AQP genes in the available C. canephora genome. The evolutionary analysis revealed that the AQP genes are highly conserved among the three coffee species. Water deficit affected the transcript levels of three CcPIP and two CcTIP isoforms in leaves of contrasting C. canephora genotypes (clone 14, tolerant, and clone 109A, susceptible to water deficit stress). In particular, our findings suggest that the differential expression patterns of CcPIP2;3, CcTIP1;2, and CcTIP2;1 between tolerant and susceptible C. canephora clones make these genes potential candidates for further studies on their effects on the plant response to water deficit conditions. In addition, the analysis of the CcAQP amino acid sequences allows the construction of various hypotheses with agronomical relevance regarding the function of the CcAQPs in plant growth and yield. Therefore, the identification and full characterization of the AQP genes, and their isoforms, that are differentially regulated in response to water deficit are relevant for developing drought-tolerant C. canephora cultivars using biotechnological tools.

Data Archiving Statement

All data used in this work to perform the bioinformatics analyses are public. The gene accession numbers are available in supplementary Table S6. The present study did not itself produce any raw data.