Introduction

GDSL lipolytic enzymes, a family of lipid-hydrolyzing (lipolytic) enzymes, are widespread in bacteria and plants. Unlike classical lipolytic enzymes, whose active-site serine lies in a GxSxG motif near the center of the conserved sequence, GDSL lipolytic enzymes carry the active-site serine in a GDSL motif near the N-terminus (Upton and Buckley 1995). One subgroup of GDSL enzymes, the SGNH hydrolases, is characterized by the conserved residues Ser, Gly, Asn, and His in conserved blocks I, II, III, and V, respectively, and has a flexible active site that appears to change conformation in the presence of, and after binding to, different substrates (Akoh et al. 2004). As a result, GDSL-type esterases/lipases have multifunctional properties such as broad substrate specificity, regiospecificity, and stereoselectivity. These enzymes therefore have high potential for use in the hydrolysis and synthesis of important ester compounds of pharmaceutical, food, and biological interest.

Numerous GDSL lipase family members have been shown to have important functions in plants. Several GDSL esterases/lipases from plants, including rice, Arabidopsis, and maize, have been isolated, sequenced, and functionally characterized. Enod8, a GDSL-type esterase/lipase gene found in alfalfa (Medicago sativa) and the model legume Medicago truncatula, has acetyl and butyl esterase activity but no aliphatic esterase activity (Coque et al. 2008; Dickstein et al. 1993). In rapeseed (Brassica napus), a sinapine esterase (BnSCE3/BnLIP2) was found to have sinapine esterase activity of long-chain aliphatic esters (Clauss et al. 2008). A novel GDSL-type pepper lipase gene, CaGLIP1, showed hydrolytic activity for short- and long-chain p-nitrophenyl esters (Hong et al. 2008). A GDSL enzyme with fatty acyl-ester hydrolase activity was isolated and purified from the post-germinated seeds of sunflower (Helianthus annuus), and several GDSL enzymes with non-specific acyl-ester hydrolase activity were purified from soybean leaves, potato stems, and rice bran (Beisson et al. 1997). In Arabidopsis, some biological functions of GDSL esterase/lipase family members have been identified. AtFXG1 (α-fucosidase) was reported to be associated with the regulation of plant cell wall components (de la Torre et al. 2002). Yeast heterologously expressing AtGELP60 (AtLTL1) showed tolerance to LiCl stress (Naranjo et al. 2006). Epithiospecifier modifier 1 (ESM1) was found to suppress nitrile formation and increase isothiocyanate production; it plays a role in controlling glucosinolate hydrolysis and/or insect herbivory resistance (Zhang et al. 2006). Arabidopsis lipase-1 (Arab-1) is similar to the GDSL protein from Brassica napus and is characterized as an extracellular protein playing important roles in developmental processes (Mikleusevic et al. 2009). The extracellular lipase 4 (EXL4) improved hydration in the early pollination stage of Arabidopsis (Updegraff et al. 2009).
The cuticle destructing factor 1 (CDEF1) has cutinase activity and causes cuticular defects in Arabidopsis plants (Takahashi et al. 2010). GDSL-like lipase 1 (GLIP1) regulated systemic resistance associated with ethylene signaling in Arabidopsis (Kwon et al. 2009; Kim et al. 2013, 2014). GDSL-motif lipase 2 (GLIP2) plays an important role in plant immune responses and pathogen defense and is involved in resistance to Erwinia carotovora via the negative regulation of auxin signaling (Lee et al. 2009). Overexpression of five GDSL-type Seed Fatty Acid Reducer (SFAR) genes in Arabidopsis was found to reduce the total seed fatty acid content and change the seed fatty acid composition (Chen et al. 2012; Huang et al. 2015).

Comparative sequence and tree topology analyses of numerous GDSL-type esterases/lipases in land plants have shown the phylogenetic relationships between GDSL-type lipases from genome-sequenced species representing bryophytes, gymnosperms, monocots, and eudicots (Volokita et al. 2011). Nevertheless, the biological or molecular functions of only a few of these GDSL-type lipases have been identified in diverse plant species. In our previous study, we conducted phylogenetic relationship, protein motif architecture, and expression profiling analyses to predict the possible biological functions of rice OsGELP genes (Chepyshko et al. 2012). Recently, several GDSL-type lipases and lipid-modifying enzymes involved in disease resistance were functionally characterized in plants. The expression of these GDSL esterase/lipase genes in plants was found to be closely associated with morphogenesis and development (Chen et al. 2012; Chepyshko et al. 2012). Even though some Arabidopsis GDSL esterase/lipase genes have been identified, the physiological and molecular functions of the other genes in this family are not yet known. The roles of the large GDSL-type esterase/lipase gene family in Arabidopsis therefore merit investigation.

In this study, we identified 105 Arabidopsis thaliana GDSL-type esterase/lipase (AtGELP) genes. Phylogenetic relationship of AtGELPs compared with the additional 13 published GDSL esterases/lipases from other plant species was determined. Our findings revealed that Arabidopsis GDSL-type esterase/lipase gene family was divided into four clades that were related to morphological development, abiotic stress response, pathogen defense, and secondary metabolism. Therefore, we used knockout mutants to further characterize the biological functions of this gene family. We acquired 88 knockout homozygotes and determined the T-DNA insertion site from 199 putative AtGELP T-DNA mutant lines from the SALK database. In the phenotype analysis, functional redundancy was likely for many GDSL-type esterase/lipase genes in most knockout lines, resulting in no phenotypic differences from the wild type. Based on the analysis of phylogenetic relationship and stress responses of knockout mutants, we found that many genes in clade IIIa were involved in Pectobacterium carotovorum subsp. carotovorum (Pcc) responses. Our study findings provide a basis to select candidate Arabidopsis GDSL-type esterase/lipase genes for further functional studies.

Results

Identification and characterization of the GDSL-type lipase genes in Arabidopsis thaliana

Members of the Arabidopsis thaliana GDSL-type esterase/lipase gene family were identified by performing multiple database searches. In all, 105 putative GDSL-type esterase/lipase genes containing the GDSL domain were identified and assigned AtGELP numbers in consecutive order according to their gene numbers along the Arabidopsis genome. All genes in the TAIR database were annotated as GDSL-motif esterase/acyltransferase/lipase except AtGELP14, AtGELP28, AtGELP29, and AtGELP33 (Table S1). AtGELP14 has been identified as Arab-1, which belongs to the GDSL lipase family (Mikleusevic et al. 2009). AtGELP28, annotated in TAIR as anther-specific proline-rich protein APG precursor, was identified as seed fatty acid reducer 2 (SFAR2), which has been reported as a GDSL-type lipase (Chen et al. 2012). AtGELP29 is no longer listed as a coding gene in the TAIR database. AtGELP33 is an α-fucosidase (de la Torre et al. 2002). In the TAIR database, eight genes were described as members of the SGNH hydrolase-type esterase superfamily, a subgroup of the GDSL family (Akoh et al. 2004). Twenty-nine genes were assigned other names (Table S1).

The AtGELP genes were distributed across chromosomes I, II, III, IV, and V, with most located on chromosomes I and V. Specifically, 44 and 22 genes were located on chromosomes I and V, respectively, whereas 15, 16, and 8 genes were located on chromosomes II, III, and IV, respectively (Fig. 1). Large segmental duplications of chromosomal regions during evolution, followed by gene loss, small-scale duplications, and local rearrangements, produced the present complexity of the Arabidopsis genome. The large size of the GDSL-type lipase gene family indicates that it has evolved through numerous duplication events in Arabidopsis. By analyzing the segmental duplication events for Arabidopsis, we found that 11 pairs of segmentally duplicated genes (AtGELP1 and 54, AtGELP2 and 27, AtGELP3 and 39, AtGELP4 and 40, AtGELP19 and 65, AtGELP20 and 64, AtGELP21 and 63, AtGELP48 and 83, AtGELP60 and 93, AtGELP62 and 84, and AtGELP80 and 100) were retained during evolution (Fig. 1). By using the “Tandem Gene Duplications in Arabidopsis” tool at TAIR, we found 16 sets of tandemly duplicated AtGELP genes, each containing 2–9 genes (46 of 105 genes in total). Nine of the tandemly duplicated sets (AtGELP3 and 4, AtGELP15 and 16, AtGELP45 and 46, AtGELP49 and 50, AtGELP56 and 57, AtGELP70 and 71, AtGELP78 and 79, AtGELP87 and 88, and AtGELP102 and 103) contained two members. Five sets (AtGELP19, 20, and 21; AtGELP39, 40, and 41; AtGELP42, 43, and 44; AtGELP63, 64, and 65; and AtGELP84, 85, and 86) contained three members. One set (AtGELP22, 23, 24, and 25) contained four members, and another (AtGELP6, 7, 8, 9, 10, 11, 12, 13, and 14) contained nine members.
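The counts reported above can be cross-checked with a short tally. The sketch below simply transcribes the tandem sets and per-chromosome gene numbers from the text; the variable names are illustrative and are not part of the original analysis.

```python
# Tandemly duplicated AtGELP sets, transcribed from the text (gene numbers only).
tandem_sets = [
    (3, 4), (15, 16), (45, 46), (49, 50), (56, 57),
    (70, 71), (78, 79), (87, 88), (102, 103),        # nine two-member sets
    (19, 20, 21), (39, 40, 41), (42, 43, 44),
    (63, 64, 65), (84, 85, 86),                       # five three-member sets
    (22, 23, 24, 25),                                 # one four-member set
    (6, 7, 8, 9, 10, 11, 12, 13, 14),                 # one nine-member set
]

# Number of AtGELP genes per chromosome, as stated in the text.
genes_per_chromosome = {"I": 44, "II": 15, "III": 16, "IV": 8, "V": 22}

n_sets = len(tandem_sets)
n_tandem_genes = sum(len(s) for s in tandem_sets)
total_genes = sum(genes_per_chromosome.values())

# Each gene should belong to at most one tandem set.
all_members = [g for s in tandem_sets for g in s]
no_overlap = len(all_members) == len(set(all_members))

print(n_sets, n_tandem_genes, total_genes, no_overlap)
print(f"tandem fraction: {n_tandem_genes / total_genes:.1%}")
```

Under these transcribed values, the tally agrees with the 16 sets, 46 tandemly duplicated genes, and 105 genes in total given in the text.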

Fig. 1

Chromosomal distribution and segmental duplication events for Arabidopsis GDSL-type genes (AtGELPs). The five chromosomes are indicated by numerals, and the centromeric regions are shown using ellipses. The AtGELP genes are numbered as 1–105, and deduced chromosomal positions are marked by horizontal bars. The parentheses include gene numbers. The scale is megabases (Mb) and is adapted from the scale available on the TIGR database. Blue lines connect chromosomal segmental duplication events in AtGELP gene pairs. Each set of tandemly duplicated AtGELP genes is labeled with the same color

The characteristics of all 105 AtGELPs are listed in Table S2. The length of the AtGELP coding sequences ranges between 0.65 and 1.6 kb. A comparison of the genomic loci with their corresponding cDNA sequences revealed that 72 of the predicted AtGELP genes were supported by full-length cDNA. Five AtGELP genes (AtGELP17, AtGELP26, AtGELP39, AtGELP40, and AtGELP43) had alternative splicing forms. The gene models of 32 AtGELP genes remain to be verified because no full-length cDNA sequences are available (Table S2). By comparing the AtGELP cDNA sequences with the Arabidopsis genome, we found that the exon–intron organization of the putative AtGELP genes is considerably diverse (Table S3). The AtGELP genes contain 2–6 translated exons. Further, more than four-fifths (84.8%) of the gene family (89 members) contained more than four exons (Table S3).

The expression profile of AtGELP genes in various tissues

By combining the expressed sequence tag (EST) and microarray databases (eFG and Genevestigator), we determined the expression profile of AtGELP genes in several representative Arabidopsis tissues, including seedling, root, stem, rosette leaf, flower, silique, and seed. We found that most (101 of 105) of the AtGELP genes were expressed in different tissues and only four genes (AtGELP5, 29, 78, and 105) were not detected in any tissue (Fig. 2). We examined the expression of these four genes by RT-PCR in various tissues at different growth stages of Arabidopsis plants, but their expression was not detectable (data not shown). We therefore assume that these genes are either pseudogenes or expressed only at specific developmental stages or under special conditions.

Fig. 2

The expression patterns for AtGELP genes from microarray and expressed sequence tag (EST) databases. A positive signal is indicated by a colored box for the following tissues: pink for seedling, yellow for root, green for rosette leaf, red for flower, blue for silique, and brown for seed. A white box indicates that no expression was detected. The number on the left indicates the class of expression pattern, and the black points on the right show positive expression for each gene. For the expression profiles from the Genevestigator and eGF microarray databases, read levels of more than 250 were scored as positive transcripts. The expression patterns from the EST database were based on the numbers of ESTs found in different libraries; numbers of more than one were scored as positive transcripts

According to their expression patterns, the AtGELP genes can be classified into three types (Fig. 2). The largest type included 52 AtGELP genes that were expressed in particular organs; for example, AtGELP12, 34, 35, 87, 88, and 90 were expressed only in the reproductive organs (siliques and seeds). The second type comprised 20 AtGELP genes that were expressed in all tissues. Interestingly, the third type included 29 AtGELP genes that had very specific expression; for example, AtGELP21, 27, 50, 58, and 65 showed tissue-specific expression in the roots, and sixteen genes were exclusively expressed in flowers. Overall, the expression data suggested that AtGELP genes show diverse expression patterns and might play crucial roles in various developmental stages in Arabidopsis.

Phylogenetic classification and functional analysis of AtGELP genes

To elucidate the roles of AtGELPs, we analyzed the phylogenetic relationship among AtGELPs and 13 other plant GDSL esterase/lipase genes with known functions and constructed a tree by using the Molecular Evolutionary Genetics Analysis (MEGA4) program and the neighbor-joining (NJ) method (the Jones–Taylor–Thornton model) based on the alignment of their protein sequences. The conserved alignment regions, including special features of the GDSL esterases/lipases, such as blocks I, II, III, and V, were aligned (Akoh et al. 2004). The alignment results revealed that seven AtGELP genes (AtGELP25, 29, 31, 34, 57, 77, and 84) had likely lost some common conserved GDSL enzyme blocks, as well as other shared regions, during evolution. Therefore, they were eliminated from further phylogenetic analyses. In all, 98 AtGELPs and 13 other plant GDSL esterase/lipase genes were included in the phylogenetic tree. The AtGELP gene family was found to be divided into four clades (Fig. 3). Clades I, II, III, and IV comprised 37, 27, 11, and 28 AtGELP genes and were further divided into 4, 3, 2, and 3 subclades, respectively. Orthologs and close homologous proteins were assembled together in the same subclades of the phylogenetic tree. The four clades were well supported by the bootstrap values (more than 59%; Fig. 3). Functional annotations and clade assignments of 15 AtGELP family genes and 13 non-Arabidopsis GDSL orthologues or homologues are listed in Table S4. The phylogenetic analysis and the identified functions of those plant GDSL lipases/esterases suggested that members of the GDSL-type gene family in different subclades have diverse biological and molecular functions.

Fig. 3

An unrooted tree was constructed based on multiple sequence alignment of AtGELPs and 13 elucidated plant GDSL esterase/lipase protein sequences (AAE, AgaSGNH, AmGDSH1, BnSCE3/BnLIP2, BrSIL1, CaGL1, CaGLIP1, Enod8, JNP1, LAE, MaAChE, Salicornia AChE, and ZmAChE; marked with asterisks) by using ClustalW program and neighbor-joining (NJ) method with 1000 bootstrap replicates. The GDSL-type esterase/lipase clades were labeled at the left part of the tree. Subclades are numbered at the right part of the tree and marked with alternating tones of a background to facilitate subclade identification. Node numbers lower than 50 are not shown

The NJ tree was constructed by aligning the highly conserved amino acid sequence regions. We considered the possibility that functionally important motifs involved in substrate specificity, as well as in protein structure ordering and arrangement, might lie outside the GDSL esterase/lipase consensus regions, such as conserved blocks I, II, III, and V. Therefore, the Multiple Em for Motif Elicitation (MEME) search tool was used to analyze the putative motifs (Bailey et al. 2009). In all, 55 motifs with statistical significance (E-value) were found among the AtGELPs and the non-Arabidopsis plant GDSL esterase/lipase proteins (Fig. S1). Protein clusters with homologs and/or orthologs in the same subclade shared more than one additional motif besides the GDSL conserved domains. Specific motifs were found in the different subclades of the phylogenetic tree, and the clusters within individual clades shared similar specific motifs. For example, clade III shared specific motif 34, and most members in clade IVc contained specific motifs 16, 21, and 24 (Fig. S1). The results of the motif analysis were consistent with those of the phylogenetic analysis. Thus, the proteins within the subgroups that shared these motifs likely had similar functions.

Identification and analysis of T-DNA insertional mutants

To identify the roles of the AtGELP gene family, we screened putative T-DNA insertion lines for the AtGELP genes from the T-DNA express database of the SALK Institute Genome Analysis Laboratory (SIGnAL; http://signal.salk.edu). The seeds of 199 putative T-DNA insertion lines for 85 AtGELP genes were acquired from the Nottingham Arabidopsis Stock Center and Arabidopsis Biological Resource Center (ABRC) seed stocks. These putative T-DNA insertion lines harbored predicted T-DNA insertions in the promoter (31 lines), 5ʹ untranslated region (UTR; 4 lines), exon (98 lines), intron (49 lines), or 3ʹ UTR (2 lines; Table 1). The atgelp mutant lines were genotyped both by kanamycin or phosphinothricin (PPT) resistance and by genomic PCR with gene-specific primers and T-DNA left-border primers (Table S4). The integration sites of these putative T-DNA mutants were confirmed by genomic PCR with primers specific for the AtGELP genes combined with a primer for the left border of the T-DNA (Table S5). In this way, 123 T-DNA insertion lines for 71 AtGELP genes were confirmed to harbor T-DNA insertions in the specific AtGELP genes (Table 1). The insertion sites of these mutants were located in the promoter (23 lines), 5ʹ UTR (one line), exon (66 lines), and intron (35 lines). For about 22% (44/199) of the atgelp mutant lines, the integration site matched the site predicted in the SALK database (Table 1). Southern blotting was used to determine the copy number of T-DNA insertions in these atgelp mutants (Fig. 4; Table 1). Among the confirmed T-DNA mutants, 26 lines for 23 AtGELP genes and 22 lines for 21 AtGELP genes harbored one and two T-DNA integrations, respectively. The remaining mutants carried 3–8 inserts.
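The PCR-based genotyping described above can be summarized as a simple decision rule. The sketch below assumes the common two-reaction design, in which one reaction uses a gene-specific primer pair flanking the insertion (wild-type allele) and the other pairs a left-border primer with a gene-specific primer (insertion allele). The text does not spell out the calling rule, so the function name and the rule itself are illustrative assumptions.

```python
def call_genotype(wt_band: bool, tdna_band: bool) -> str:
    """Call a genotype from two PCR results (assumed two-reaction design).

    wt_band:   product from the gene-specific primer pair flanking the
               predicted insertion site (amplifies the intact allele).
    tdna_band: product from the T-DNA left-border primer combined with a
               gene-specific primer (amplifies the insertion allele).
    """
    if wt_band and tdna_band:
        return "heterozygous"
    if tdna_band:
        return "homozygous insertion"
    if wt_band:
        return "wild type"
    return "no call"  # neither reaction worked; repeat the PCR


# Example: a plant showing only the left-border product.
print(call_genotype(wt_band=False, tdna_band=True))
```

Only plants called homozygous for the insertion would then be carried forward to the RT-PCR test for loss of the transcript.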

Table 1 Characteristics of T-DNA insertion lines of Arabidopsis GDSL-type esterase/lipase family members
Fig. 4

Identification of the number of T-DNA inserts in AtGELP T-DNA insertion mutants. Genomic DNA of 14-day-old atgelp mutant seedlings was used to determine the number of T-DNA insertions by Southern blotting with a digoxigenin (DIG)-labeling detection system. Genomic DNA was digested with a restriction enzyme (HindIII or BamHI). a SALK mutants detected with a probe against the NPTII gene (kanamycin resistance gene); b SAIL mutants detected with a probe against the BAR gene. The atgelp numbers refer to the specific T-DNA mutant lines shown in Table 1. The insertion number of each atgelp mutant is listed in Table 1

To obtain atgelp knockout mutant lines, we performed RT-PCR to determine the RNA expression levels in the AtGELP T-DNA insertion lines (Fig. 5), choosing tissues according to the microarray expression patterns (Fig. 2). The RT-PCR analysis of the homozygous atgelp plants in different tissues, such as seedlings, flowers, and seeds, showed that 88 atgelp T-DNA mutant lines were null alleles (Fig. 5). These atgelp T-DNA knockout mutant lines were used to elucidate the physiological roles of the Arabidopsis GDSL-type esterase/lipase gene family.

Fig. 5

Analysis of AtGELP expression in AtGELP T-DNA insertion mutants. a–c RT-PCR analysis of AtGELP expression in 14-day-old seedlings, flowers, and seeds, respectively. The upper bands represent the specific AtGELP gene expression of T-DNA insertion lines and wild type (WT). The lower bands show actin that was run as an RNA-competence control

Phenotypic alterations in the growth of the AtGELP T-DNA knockout mutants

To investigate the physiological roles of the AtGELP gene family in Arabidopsis, we examined the phenotypic characteristics related to plant growth and development in the AtGELP knockout mutants. The complete collection of homozygous T-DNA lines was analyzed for seedling growth, rosette growth, inflorescence emergence, and flower and seed development and appearance. The growth of most atgelp mutants was not significantly different from that of the wild-type plants in MS media or soil (data not shown). Because Arabidopsis GDSL-type esterase/lipase genes belong to a large family, we suppose that functional redundancy might be responsible for the absence of obvious growth defects relative to the wild type in most AtGELP knockout lines.

Conditional phenotypic alterations in the T-DNA knockout mutants

The expression and phylogenetic analyses suggested that the AtGELP gene family might be involved in various developmental stages or stress conditions in Arabidopsis (Fig. 2 and Table S4). Therefore, AtGELP transgenic plants were subjected to abiotic (salt and osmotic) and biotic (Pectobacterium carotovorum subsp. carotovorum; Pcc) stresses to determine the role of AtGELP genes in response to stress. We assessed the responses of AtGELP knockout mutants (atgelp72-1, atgelp72-2, atgelp79-1, atgelp79-2, atgelp79-3, atgelp104-1, atgelp104-2, and atgelp104-3) under salt (LiCl, KCl, and NaCl) and osmotic (glucose and mannitol) stress. None of the mutants differed significantly from the wild-type plants in their germination and growth phenotypes on MS medium with or without salt during a 2-week observation period (data not shown). Under osmotic treatments, the germination rates of AtGELP72 knockout lines (atgelp72-1 and atgelp72-2) were inhibited significantly more by high glucose concentrations than those of the wild-type seedlings (Fig. 6a). In contrast, the germination rates of the AtGELP72 overexpression lines (AtGELP72-10 and AtGELP72-11) were higher than those of the wild-type plants subjected to 5% glucose. However, no difference was found between the AtGELP72 transgenic and wild-type plants during germination after treatment with mannitol (Fig. 6a). The results were consistent with those of our previous studies (Huang et al. 2015).

Fig. 6

Conditional phenotypic alterations in the AtGELP T-DNA knockout mutants. a Germination rate of wild type and AtGELP72 transgenic plants in MS medium with or without 5% glucose or 5% mannitol. The atgelp72-1 and atgelp72-2 represent the knockout mutants, whereas AtGELP72-10 and AtGELP72-11 refer to AtGELP72-overexpressing transgenic plants; WT is Col-0 wild type. b, c Responses of T-DNA knockout mutants to Pcc infection compared with the wild type. Disease development was determined in 10-day-old seedlings inoculated with 5 × 10^8 CFUs Pcc for 6 h. b Disease rate (0–3) was calculated for the atgelp20-1, atgelp97-1, atgelp97-2, and atgelp65-1 mutant lines, as well as for the AtGELP65 overexpression line (AtGELP65-2), compared with that in the wild type. c Bacterial growth quantification from disrupted leaves of atgelp mutants was determined after Pcc inoculation. Data represent average values ± SD

According to previous reports, many genes in phylogenetic clade III (Table S4), such as AtGELP20 (GLIP2), AtGELP97 (GLIP1), and AtGELP63 (ESM1), are related to biotic stress responses (Kim et al. 2014, 2013; Kwon et al. 2009; Lee et al. 2009; Zhang et al. 2006). To investigate the roles of the AtGELP genes in clade III in response to biotic stress, we used Pcc, which causes soft rot disease, to infect the AtGELP knockout mutants in phylogenetic clade III (atgelp19-1, atgelp19-2, atgelp20-1, atgelp21-1, atgelp24-1, atgelp24-2, atgelp63-1, atgelp65-1, atgelp97-1, and atgelp97-2). The knockout mutants of AtGELP19, 21, 24, and 63 did not differ significantly from the wild type in their response to infection, whereas atgelp20-1, atgelp65-1, atgelp97-1, and atgelp97-2 showed more serious symptoms after Pcc infection (data not shown). Furthermore, the knockout mutants of AtGELP20, 65, and 97 (atgelp20-1, atgelp65-1, atgelp97-1, and atgelp97-2) and an AtGELP65 overexpression line (AtGELP65-2) were used to study the response to the Pcc pathogen. The disease rate and CFUs (colony-forming units) of atgelp20-1, atgelp65-1, atgelp97-1, and atgelp97-2 were higher than those of the wild-type plants, whereas those of AtGELP65-2 were lower (Fig. 6b, c). The atgelp20-1 line is a knockout mutant of GLIP2 (defined as glip2-4) with enhanced disease symptoms in response to E. carotovora (Pectobacterium carotovorum) (Lee et al. 2009). The response of the AtGELP20 knockout mutant (atgelp20-1) to the Pcc pathogen in our study (Fig. 6b, c) is consistent with that reported by Lee et al. (2009). These results suggested that the genes classified into clade IIIa might play crucial roles in biotic stress responses.

Discussion

In the present study, we found that AtGELP genes have multiple molecular and physiological functions. Based on the phylogenetic analysis, combined with protein motif architectures and expression profiling, we suggest that AtGELPs are involved in plant development and morphogenesis, abiotic stress responses, defenses, and secondary metabolism (Table S4). According to the gene expression profiling, AtGELP genes showed diverse expression patterns (Fig. 2). Previous studies suggested that the particular expression location of AtGELPs might be related to their functions in various developmental stages in Arabidopsis. For example, AtGELP42 and AtGELP83 were expressed in flower and promoted pollen hydration on the stigma and pollen tubes to facilitate penetration in the stigma (Takahashi et al. 2010; Updegraff et al. 2009). However, no obvious difference was found between most AtGELP knockout lines and wild-type plants under normal growth conditions (data not shown). Our results suggested that large segmental and tandem duplications of Arabidopsis GDSL-type esterase/lipase genes gave rise to most AtGELP gene family members (Fig. 1). Tandem duplication leads to paralogous relationships, which are supported by the phylogeny of AtGELP gene pairs (Figs. 1, 3). Because of these large segmental and tandem duplications, many GDSL-type esterase/lipase genes may be functionally redundant, which would explain why most knockout lines showed no difference in growth phenotype from the wild type. For example, five GDSL-type esterase/lipase genes (AtGELP26, AtGELP28, AtGELP54, AtGELP72, and AtGELP80; SFAR1-5) are redundantly downregulated by DELLA and are involved in reducing fatty acid storage as well as composition in matured Arabidopsis seeds (Chen et al. 2012).

The functions of some AtGELP genes have been proposed or identified (Table S4). For example, AtGELP39–44 (extracellular lipase 1–6; EXL1–6) are classified into subclades Ia and Ib and are possibly involved in pollination and fertilization. Among them, AtGELP42 is identified as extracellular lipase 4 (EXL4), which improved hydration in the early pollination stage of Arabidopsis (Updegraff et al. 2009). AtGELP28 and AtGELP59 (SFAR2 and 3) in subclade Ib were shown to have crucial functions in plant development, morphogenesis, and glucose stress tolerance (Chen et al. 2012). AtGELP60 (AtLTL1), which confers tolerance to LiCl, is classified in subclade IIa (Naranjo et al. 2006). AtGELP80 (SFAR5) confers tolerance to high glucose stress, and CaGL1 plays important roles in wound stress (Chen et al. 2012; Kim et al. 2008); both are included in subclade IIb (Fig. S2). These findings indicate that several genes in clade II are involved in some stress responses. Clade II also contains AtGELP83 (CDEF1) and JNP1, which have esterase activity that affects pollination (Kram et al. 2008; Takahashi et al. 2010). Five members (GLIP1-5) in clade IIIa were proposed to have lipase activity. In particular, several proteins in clade III of the phylogenetic tree were found to be involved in biotic stress responses (Fig. 3 and Table S4), such as AtGELP97 (GLIP1), AtGELP20 (GLIP2), AtGELP63 (ESM1), and BrSil. AtGELP20 (GLIP2) plays a role in resistance to Erwinia carotovora (Pectobacterium carotovorum) by suppressing the auxin response (Lee et al. 2009), whereas AtGELP97 (GLIP1) regulates plant systemic immunity via ethylene signaling (Kim et al. 2014, 2013; Kwon et al. 2009). AtGELP63 (ESM1) plays a role in resistance to insect herbivory (Zhang et al. 2006). BrSil in Chinese cabbage is strongly induced by salicylic acid and Pseudomonas syringae pv. tomato (Lee 2003). These findings suggest that the genes in clade III play crucial roles in pathogen responses.
Clade IV is the largest group that includes several acetylcholinesterases (ACHEs). Many members of this group play a key role in plant secondary metabolism. AtGELP26 and 72 (SFAR1 and 4) affect fatty acid components of seeds, whereas AtGELP33 (AtFXG1), an α-L-fucosidase, is related to xyloglucan metabolism and cell wall composition; they are also included in clade IV (Chen et al. 2012; de la Torre et al. 2002).

Because the functions of many Arabidopsis GDSL-type esterase/lipase genes are unknown, the functional classification of Arabidopsis GDSL family members based on the phylogenetic tree is not consistent with that reported previously. However, the phylogenetic tree provides a platform for researchers to select candidate genes for further detailed functional study. According to the phylogenetic tree (Fig. 3) and Table S4, seven members in phylogenetic clade III were selected to test the response to biotic stress. Our results suggest that three members (AtGELP20, 65, and 97) in clade IIIa are involved in tolerance to the Pcc pathogen (Fig. 6). Previously reported plant GDSL orthologues or homologues with known biochemical functions were found to show broad substrate specificity in different clades of the Arabidopsis GDSL family (Fig. 3 and Table S4). The physiological substrates for the vast majority of characterized plant GDSLs are unknown. A few studies showed that most substrates used to determine the molecular functions of the plant GDSL family were artificial substrates, and whether this specificity has biological significance is not yet known. The actual physiological and molecular functions in different clades of the Arabidopsis GDSL family need to be verified by further laboratory studies.

In the present study, phylogenetic analysis provided information regarding the possible roles of GDSL enzymes in Arabidopsis. The confirmed SALK T-DNA data also provide a basis for identifying candidate Arabidopsis GDSL-type esterase/lipase genes to further investigate the molecular and physiological functions of AtGELP genes. Our study introduces a fundamental framework for further studies on the molecular and biological functions of AtGELP family members.

Materials and methods

Bioinformatic analysis

By using the Basic Local Alignment Search Tool (BLAST; https://blast.ncbi.nlm.nih.gov/Blast.cgi) algorithms BLASTP and TBLASTN available on the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/) as well as The Arabidopsis Information Resource (TAIR; https://www.arabidopsis.org/) database and the Arabidopsis Gene Regulatory Information Server (AGRIS; http://arabidopsis.med.ohio-state.edu/), we initially queried the amino acid sequence of GLIP1 (NM_123464), a GDSL-type esterase/lipase gene, and its lipase GDSL domain, followed by a search for the conserved amino acid sequences of esterase/lipase.

To elucidate the entire family of genes encoding GDSL-type esterase/lipase proteins in the Arabidopsis genome, we searched available Arabidopsis databases by using multiple BLAST algorithms with full-length GLIP1 (374 amino acids) and its lipase GDSL domain as the query sequences, followed by additional searches with related sequences. Pair-wise sequence comparisons were performed to exclude redundant entries from the initial data set; such redundancy is frequently caused by multiple identification numbers for the same DNA or protein sequence in the databases. Subsequently, Pfam Hidden Markov models (local models) were used to analyze the conserved domains and motifs of putative Arabidopsis GDSL-type lipase genes. Arabidopsis nucleotide and protein sequences as well as information regarding gene structure were obtained from the Munich Information Center for Protein Sequences Arabidopsis thaliana Database (MIPS; http://mips.gsf.de), The Institute for Genomic Research (TIGR; http://www.jcvi.org/arabidopsis/qpcr/) Arabidopsis thaliana Database, and the Arabidopsis thaliana Plant Genome Database (AtPGD; http://plantgdb.org/AtGDB/).
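The redundancy-removal step can be illustrated with a minimal sketch. It groups database hits that carry different identifiers but the same protein sequence, using exact matching after light normalization as a simplification of the pair-wise comparisons described above. The identifiers and sequences in the example are hypothetical placeholders, not real database entries.

```python
from collections import defaultdict


def collapse_redundant(entries):
    """Group identifiers that point to the same protein sequence.

    entries: iterable of (identifier, sequence) pairs from the BLAST hits.
    Returns a dict mapping each distinct normalized sequence to the list of
    identifiers that share it, in first-seen order.
    """
    groups = defaultdict(list)
    for identifier, sequence in entries:
        # Normalize case, surrounding whitespace, and a trailing stop symbol.
        key = sequence.strip().upper().rstrip("*")
        groups[key].append(identifier)
    return dict(groups)


# Hypothetical hits: two identifiers for one sequence, plus one distinct hit.
hits = [
    ("hit_A", "MGDSLVDTGN"),
    ("hit_B", "mgdslvdtgn*"),
    ("hit_C", "MASSLVDAGN"),
]
unique = collapse_redundant(hits)
print(len(unique), "distinct sequences from", len(hits), "hits")
```

Near-identical entries, such as splice variants or sequences with minor database discrepancies, would still need an identity-threshold comparison, which this sketch does not attempt.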

Expression analysis of Arabidopsis GDSL-type esterase/lipase genes

We used Arabidopsis MPSS (http://www.hsls.pitt.edu/obrc/index.php?page=URL1152132364), the Arabidopsis Functional Genomics Consortium (AFGC) microarray expression database (Wu et al. 2001) located at the Stanford Microarray Database (http://genome-www5.stanford.edu/MicroArray/SMD/), Genevestigator (Hruz et al. 2008) (https://www.slideshare.net/joachimjacob/genevestigator), and EST data to detect the expression patterns of AtGELP genes. The locus names of AtGELP genes were used to query the MPSS database containing the signature information of the GDSL-type esterase/lipase genes. EST data were obtained from UniGene of NCBI, TIGR, and TAIR (Wu et al. 2001; Zimmermann et al. 2004a, b). For the expression profiles from Genevestigator and the eGF microarray database, read levels greater than 250 were identified as positive transcripts. The expression patterns for AtGELP genes from the EST database were based on the numbers of ESTs found in different libraries; genes with more than one EST were identified as positive transcripts.
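The two cutoffs described above can be written as a simple filter. The following is a minimal sketch, not the procedure actually used to query the databases; the function name and the tissue labels are hypothetical, and only the cutoff values (250 for microarray read levels, one for EST counts) come from the text.

```python
# Cutoffs taken from the text; everything else is illustrative.
MICROARRAY_CUTOFF = 250  # read levels greater than this count as positive
EST_CUTOFF = 1           # EST counts greater than this count as positive


def positive_tissues(profile, cutoff):
    """Return the sorted tissue/library names whose value exceeds the cutoff.

    `profile` maps a (hypothetical) tissue or library name to a read level
    or an EST count for one AtGELP locus.
    """
    return sorted(name for name, value in profile.items() if value > cutoff)


# Example with made-up values for a single locus.
microarray_profile = {"root": 300, "leaf": 250, "flower": 1200}
est_profile = {"silique": 1, "seedling": 3}

microarray_hits = positive_tissues(microarray_profile, MICROARRAY_CUTOFF)
est_hits = positive_tissues(est_profile, EST_CUTOFF)
```

Both comparisons are strict, so a read level of exactly 250 or a single EST is not scored as a positive transcript.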

Phylogenetic analysis of GDSL-type esterase/lipase and motif search

Sequences were aligned using ClustalW in the Molecular Evolutionary Genetics Analysis (MEGA4) program to summarize the evolutionary relationships within the plant GDSL-type esterase/lipase family. Phylogenetic trees were inferred using the neighbor-joining (NJ) functionality of ClustalX. The unrooted phylogenetic trees were constructed using the NJ method (the Jones–Taylor–Thornton model). The NJ topologies were generated as the consensus of 1000 bootstrap alignment replicates. Bootstrap values are shown at the nodes; values lower than 50 are not shown.
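The reporting rule for bootstrap support can be illustrated as follows. This is a minimal sketch under stated assumptions: it assumes that the number of bootstrap replicates recovering each clade is already known (the tree building itself was done in MEGA4 and ClustalX and is not reproduced here), and the function name and clade members are hypothetical.

```python
def bootstrap_labels(clade_counts, n_replicates=1000, cutoff=50):
    """Convert per-clade replicate counts to percent support labels.

    `clade_counts` maps a clade (a frozenset of taxon names) to the number
    of bootstrap replicates in which that clade was recovered. Clades with
    support below `cutoff` percent receive no label, matching the rule that
    values lower than 50 are not shown.
    """
    labels = {}
    for clade, count in clade_counts.items():
        support = 100.0 * count / n_replicates
        if support >= cutoff:
            labels[clade] = round(support)
    return labels


# Example with made-up counts out of 1000 replicates.
counts = {
    frozenset({"geneA", "geneB"}): 987,
    frozenset({"geneC", "geneD"}): 412,
}
labels = bootstrap_labels(counts)
```

In this example only the first clade is labeled (99), because the second has 41.2% support.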

To identify the putative conserved motifs, we used the Multiple Em for Motif Elicitation (MEME) search tool (Bailey et al. 2009). As in our previous study on the rice OsGELP gene family, the motif search assumed zero or one occurrence per sequence; the maximum number of motifs was 55, and the optimum motif width was ≥8 and ≤15 (Chepyshko et al. 2012).
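For readers who run MEME locally, the search criteria above correspond to a small set of command-line options. The sketch below only assembles the argument list and does not run anything. The text does not state whether the web server or the command-line version was used, so the command-line form is an assumption, and the file and directory names are hypothetical.

```python
def meme_args(fasta_path, out_dir):
    """Build a MEME command line matching the stated search criteria.

    -mod zoops  : zero or one occurrence per sequence
    -nmotifs 55 : maximum number of motifs
    -minw/-maxw : motif width between 8 and 15
    The input path and output directory are placeholders.
    """
    return [
        "meme", fasta_path,
        "-protein",
        "-oc", out_dir,
        "-mod", "zoops",
        "-nmotifs", "55",
        "-minw", "8",
        "-maxw", "15",
    ]


# Hypothetical file names; pass the list to subprocess.run to execute it.
args = meme_args("atgelp_proteins.fasta", "meme_out")
```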

Plant materials and growth conditions

To determine the biological functions of AtGELP genes, we ordered seeds for specific lines from the Nottingham Arabidopsis Stock Center (NASC; http://www.arabidopsis.info) and Arabidopsis Biological Resource Center (ABRC; https://www.arabidopsis.org/servlets/Order?state=catalog) (Ohio State University, Columbus, OH). These putative T-DNA insertion lines of AtGELP genes were screened from the Arabidopsis T-DNA insertion database of the SALK Institute Genome Analysis Laboratory (SIGnAL; http://signal.salk.edu/cgi-bin/tdnaexpress). To maximize the likelihood of disrupting gene function, lines were chosen according to the position of the T-DNA insertion: T-DNA insertional lines of the Columbia (Col-0) ecotype with exon insertions were selected first. When no such line was available, lines with predicted intron, promoter, or terminator insertions were selected.
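The selection rule for insertion lines can be sketched as a two-tier filter. This is an illustration only: the class, field names, and line identifiers are hypothetical. Two points are assumptions not stated in the text, namely that the Col-0 restriction also applies to the fallback tier and that intron, promoter, and terminator insertions are treated as equally acceptable.

```python
from dataclasses import dataclass

# Regions accepted only when no exon insertion is available (from the text).
FALLBACK_REGIONS = {"intron", "promoter", "terminator"}


@dataclass(frozen=True)
class InsertionLine:
    line_id: str   # hypothetical identifier
    ecotype: str   # e.g. "Col-0"
    region: str    # predicted insertion site: "exon", "intron", ...


def select_lines(candidates):
    """Prefer Col-0 exon insertions; otherwise fall back to other regions."""
    col0 = [c for c in candidates if c.ecotype == "Col-0"]
    exon_lines = [c for c in col0 if c.region == "exon"]
    if exon_lines:
        return exon_lines
    # Assumption: the fallback tier is not ranked internally.
    return [c for c in col0 if c.region in FALLBACK_REGIONS]


candidates = [
    InsertionLine("LINE_A", "Col-0", "promoter"),
    InsertionLine("LINE_B", "Col-0", "exon"),
    InsertionLine("LINE_C", "Col-0", "intron"),
]
chosen = select_lines(candidates)
```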

For the 35S::AtGELP72 and 35S::AtGELP65 sense constructs, a gel-purified fragment containing the entire coding region of AtGELP72 (restriction sites: XbaI-SalI) or AtGELP65 (restriction sites: BamHI-SalI) was inserted into the pCAMBIA1300-modified vector (CLONTECH, Palo Alto, CA). The constructs were introduced into Agrobacterium tumefaciens strain LBA4404 by electroporation and transformed into wild-type plants by using the floral dip method (Clough and Bent 1998). For the phenotypic investigation, T3 homozygous lines were used.

Arabidopsis plants of the ecotype Col-0 were used. For phenotypic evaluations of plant growth and development, Arabidopsis plants with T-DNA mutations were used; seeds were imbibed for 3 days and then grown in soil at 22 °C under long-day conditions with a 16-h light/8-h dark cycle. For in vitro growth of Arabidopsis, seeds were surface sterilized and sown on Murashige and Skoog (MS) medium (Duchefa) solidified with 0.7% phytoagar (Duchefa). After the seeds were sown, the plates were incubated at 4 °C in the dark for 3 days and then transferred to a growth chamber at 22 °C under a 16-h light/8-h dark cycle. The growth and morphology of the leaves, seeds, siliques, and flowers of the T-DNA mutants were compared with those of the wild-type plants. For osmotic and salt stress studies, Arabidopsis seeds were surface sterilized, sown on MS medium (pH 5.7) supplemented with 1% sucrose and 0.112 g/L vitamin B5 and amended with or without NaCl (100 or 150 mM), LiCl (20 or 30 mM), mannitol (150 or 200 mM), or glucose (150 or 200 mM), and then imbibed for 3 days. The growth conditions were as described above. For Pcc infection, seeds were sterilized and sown onto MS plates, grown for 7 days, and transferred to fresh MS plates for another 3 days. The 10-day-old seedlings were infected with 5 × 10^8 CFUs of Pcc for 3–6 h. The disease rate (0–3) and CFUs were determined. The disease rate was scored according to the severity of disease symptoms: 0, no symptoms; 1, one rotted leaf; 2, two rotted leaves; and 3, soft rot symptoms over almost the whole plant. For the bacterial growth assay, infected seedlings were weighed and homogenized in 1 mL ddH2O to determine the number of CFUs per mg of tissue. The experiment was repeated at least three times.
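The disease scale and the CFU normalization can be expressed as two small functions. This is a minimal sketch rather than the scoring procedure actually used. The 0 to 3 scale and the 1 mL homogenate volume come from the text; the dilution factor, the plated volume, and the handling of more than two rotted leaves without whole-plant rot are assumptions introduced for illustration.

```python
def disease_score(rotted_leaves, whole_plant_rot=False):
    """Map symptoms to the 0-3 disease scale.

    0: no symptoms; 1: one rotted leaf; 2: two rotted leaves;
    3: soft rot over almost the whole plant.
    Assumption: more than two rotted leaves without whole-plant rot is
    still scored 2, since the text does not cover that case.
    """
    if whole_plant_rot:
        return 3
    if rotted_leaves <= 0:
        return 0
    return min(rotted_leaves, 2)


def cfu_per_mg(colonies, dilution_factor, plated_volume_ml, tissue_mg,
               homogenate_volume_ml=1.0):
    """Estimate CFUs per mg of tissue from a plate count.

    The homogenate volume defaults to the 1 mL of ddH2O given in the text;
    the dilution factor and plated volume are hypothetical parameters.
    """
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * homogenate_volume_ml / tissue_mg


# Made-up example: 50 colonies from 0.1 mL of a 1000-fold dilution,
# obtained from 20 mg of seedlings homogenized in 1 mL.
example_load = cfu_per_mg(50, 1000, 0.1, 20.0)
```

With these illustrative numbers the estimate is about 25,000 CFUs per mg of tissue.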

Isolation of genomic DNA and PCR

Correct insertions of the T-DNA in the lines were determined using genomic PCR. Genomic DNA was isolated from individual plants that belonged to the respective T-DNA insertional lines and amplified by PCR with a set of primers (Table S5) that included a gene-specific primer and a T-DNA-specific primer (left border primer of T-DNA insertion from SIGnAL; http://signal.salk.edu/tdnaprimers.2.html) to determine the presence of the T-DNA insertion. PCR products were sequenced, and the accurate insertional site of T-DNA was confirmed. PCR with two gene-specific primers spanning the proposed insertional site was used to identify non-disrupted alleles. Plants for which PCR with a gene-specific primer and a T-DNA-specific primer yielded a product, whereas PCR with two gene-specific primers did not yield a product, were considered homozygous insertional lines; this was confirmed in plants from the subsequent generation. Homozygous insertional lines in the subsequent generations were identified by antibiotic selection and genomic PCR.
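The genotyping logic of the two PCR reactions can be summarized as a decision table. The sketch below is illustrative, with a hypothetical function name. Only the homozygous call is defined explicitly in the text; the heterozygous and wild-type calls follow the usual interpretation of this two-reaction design and are labeled here as assumptions.

```python
def call_genotype(tdna_product, gene_specific_product):
    """Call a genotype from the presence or absence of two PCR products.

    tdna_product          : gene-specific primer + T-DNA left border primer
    gene_specific_product : two gene-specific primers spanning the site
    """
    if tdna_product and not gene_specific_product:
        # Stated in the text: insertion present, no intact allele detected.
        return "homozygous insertion"
    if tdna_product and gene_specific_product:
        # Assumption: both an inserted and an intact allele are present.
        return "heterozygous"
    if gene_specific_product:
        # Assumption: only the intact allele is present.
        return "wild type"
    # Neither reaction gave a product, e.g. failed PCR or poor DNA.
    return "no call"


calls = [call_genotype(t, g) for t, g in
         [(True, False), (True, True), (False, True), (False, False)]]
```

As in the text, a homozygous call from a single plant would still need to be confirmed in the subsequent generation.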

Southern blot

Southern blotting was used to determine the number of T-DNA insertions in the mutant plants. DNA blotting was conducted according to Solanas and Escrich (1997) with minor modifications, and signals were detected using the digoxigenin (DIG)-labeling detection system. For this, 10 μg of genomic DNA was digested with a restriction enzyme (HindIII or BamHI), separated using 0.8% agarose gel electrophoresis, and then transferred onto a nylon membrane. The membrane was incubated with blocking solution (Roche) for 1 h, and then hybridized to a DIG-labeled single-stranded DNA probe overnight at 60 °C. Low-stringency washes in 2 × SSC/0.1% SDS for 15 min at 37 °C were performed twice, followed by high-stringency washes with 0.2 × SSC/0.1% SDS at 68 °C. DIG detection was performed as per the DIG Application Manual from Roche Co. (South San Francisco, CA, USA).

RT-PCR

Total RNA was isolated from plant tissues by using the RNeasy Plant Mini Kit (Qiagen), and Moloney murine leukemia virus (M-MLV) reverse transcriptase was used to generate complementary DNA (cDNA). For the reverse transcription reactions, 2 µg of total RNA extracted from 14-day-old seedlings or flowers was incubated with 0.5 µg Oligo (dT)15 primer in a total reaction mixture of 15 µL at 70 °C for 5 min. After the mixture was chilled on ice for 5 min, the following components were added (final concentrations or amounts): 1 × RT buffer (Promega, Madison, Wisconsin, USA), 2.5 mM MgCl2, 0.5 mM dNTPs, 25 units of a ribonuclease inhibitor, and 200 units of Superscript RT; the mixture was then incubated at 42 °C for 60 min. After incubation at 70 °C for 15 min, 4 units of RNase H (Promega) were added, and the reaction mixture was incubated at 37 °C for 20 min. The expression of AtGELPs was determined using 1 μL of cDNA from the RT reactions, and the specific primer pair for AtGELPs (Table S5) was used to amplify the full-length cDNA of AtGELPs by using PCR.