Abstract
The squash (Cucurbita maxima) phloem exudate-expressed aspartic proteinase inhibitor (SQAPI) is a novel aspartic acid proteinase inhibitor, constituting a fifth family of aspartic proteinase inhibitors. However, a comparison of the SQAPI sequence to the phytocystatin (a cysteine proteinase inhibitor) family sequences showed ∼30% identity. Modeling SQAPI onto the structure of oryzacystatin gave an excellent fit; regions identified as proteinase binding loops in cystatin coincided with regions of SQAPI identified as hypervariable, and tryptophan fluorescence changes were also consistent with a cystatin structure. We show that SQAPI exists as a small gene family. Characterization of mRNA and clone walking of genomic DNA (gDNA) produced 10 different but highly homologous SQAPI genes from Cucurbita maxima and the small family size was confirmed by Southern blotting, where evidence for at least five loci was obtained. Using primers designed from squash sequences, PCR of gDNA showed the presence of SQAPI genes in other members of the Cucurbitaceae and in representative members of Coriariaceae, Corynocarpaceae, and Begoniaceae. Thus, at least four of seven families of the order Cucurbitales possess member species with SQAPI genes, covering ∼99% of the species in this order. A phylogenetic analysis of these Cucurbitales SQAPI genes indicated not only that SQAPI was present in the Cucurbitales ancestor but also that gene duplication has occurred during evolution of the order. Phytocystatins are widespread throughout the plant kingdom, suggesting that SQAPI has evolved recently from a phytocystatin ancestor. This appears to be the first instance of a cystatin being recruited as a proteinase inhibitor of another proteinase family.
Introduction
An aspartic acid proteinase inhibitor (SQAPI) has been characterized from the phloem exudate of squash (Cucurbita maxima Duchesne) (Christeller et al. 1998; Farley et al. 2002). The protein has no sequence homology to the other four families of proteinaceous aspartic proteinase inhibitors (PIs) that have been identified and cloned to date: the potato plant Kunitz inhibitors (Mares et al. 1989), the Ascaris inhibitors (Martzen et al. 1990), the yeast inhibitor IA3 (Schu et al. 1991), and the pig serpin inhibitor (Mathialagan and Hansen 1996). SQAPI also has very different properties from the wheat inhibitor, which has only been partially biochemically characterized (Galleschi et al. 1993). The five cloned aspartic acid PIs show no homology to each other, although the Kunitz and serpin aspartic PIs belong to already established families of PIs (Rawlings et al. 2004; http://merops.sanger.ac.uk/).
Several PIs that have been exclusively identified in plants are often highly restricted in their distribution to a single family (Christeller and Laing 2005). The squash serine PI family appears to be limited to Cucurbitaceae, the PI II family to Solanaceae, the trypsin/α-amylase inhibitor family to Graminaceae, and the mustard seed inhibitor family to Brassicaceae. Such a limited distribution among plant families may indicate a recent evolutionary history. Two inhibitor families found only in plants have a broader distribution (Christeller and Laing 2005), with the plant Kunitz and the Bowman-Birk inhibitors being found in legumes and cereals and possibly other species. Only the PI I family, serpins and cystatin inhibitors, have homologues outside the plant kingdom and are widely distributed in plants (Christeller and Laing 2005).
However, the distribution and evolutionary origin of SQAPI are unknown and its phylogenetic relationship to other genes is unreported. Within the order Cucurbitales, there is a preponderance of taxa belonging to the Cucurbitaceae family, many of which are economically important. However, there are several other families within Cucurbitales, including Begoniaceae, Corynocarpaceae, and Coriariaceae. Molecular phylogenetic relationships have been established for Cucurbitales (Zhang et al. 2006).
Two additional widespread features of PIs are the presence of small gene families of each inhibitor within most genomes and hypervariation within the active site contact regions between orthologous inhibitors (Hill and Hastie 1987; Laskowski et al. 1987a,b). These features are considered to be due to the function of these inhibitors as resistance factors against secreted proteinases of pests, parasites, and pathogens and the intense evolutionary pressure generated by these interorganism protein-protein interactions (Christeller 2005; Creighton and Darby 1989). We previously identified two isoinhibitors of SQAPI from squash cDNAs (Christeller et al. 1998), but the extent of the gene family (paralogues) remains undetermined and hence the possibility of hypervariability within this inhibitor family is unknown.
In this paper, we investigate the evolution of this novel inhibitor and present evidence that indicates that SQAPI evolved not from other aspartic PIs but from a phytocystatin cysteine PI.
Materials and Methods
Plant Material
Squash, zucchini (Cucurbita pepo), cucumber (Cucumis sativus), watermelon (Citrullus lanatus), green nutmeg melon (Cucumis melo), bitter melon (Momordica cochinchinensis), and large bottle gourd (Lagenaria siceraria) were obtained as seeds from commercial sources or as gifts. White bryony (Bryonia dioica) was obtained as tubers from the New Zealand Department of Conservation from an infestation in Maungaweka. These materials were grown in a glasshouse at Palmerston North and harvested as required. Tissues from the southern cucurbit (Sicyos australis) were a gift from AgriGenesis Ltd., Auckland, New Zealand; tissues from tutu (Coriaria arborea) and karaka (Corynocarpus laevigatus) were collected from plants growing in their natural environment near Palmerston North; and begonia (Begonia rex) plants were purchased from a local nursery. Material used in this study is listed in Table 1.
Prediction of Secondary Structure
Predictions of SQAPI secondary structure were based on tools provided by SwissProt using an NMR structure of a rice cystatin (1EQK [Nagata et al. 2000]) as template (Guex and Peitsch 1997; Peitsch 1995; Schwede et al. 2003).
Fluorescence Spectroscopy
Fluorescence spectra were collected on a Perkin Elmer LS 50B spectrophotometer at room temperature, with a scan speed of 50 nm min−1, slit widths of 2.5 nm, excitation at 295 nm (for specific excitation of tryptophan), and emission from 300 to 400 nm. All data are the average of nine scans after subtraction of the fluorescence caused by the solvent alone. Recombinant SQAPI (HDVA isoform) was prepared as described previously (Farley et al. 2002). L-Tryptophan and porcine pepsin A were from Sigma Chemical Co. (St. Louis, MO, USA).
Identification of Squash SQAPI Gene Family Paralogues
Genomic DNA was extracted (Doyle 1990) from leaves of glasshouse-grown C. maxima var Supermarket Hybrid squash plants. The DNA was quantified fluorometrically using the Picogreen DNA kit (Molecular Probes Ltd., Eugene, OR, USA) at excitation and emission wavelengths of 480 and 520 nm, respectively. Clones were then obtained by PCR using nested primers (Siebert et al. 1995) as follows: SP1, 5′-TGACCTGCGTCTACACCAGCCAAGAT-3′; and SP2, 5′-CATGCTGTTTCAGTGCGAACTCTGCT-3′.
To confirm the approximate number of clones, Southern hybridization was performed (Sambrook et al. 1989) using 10 μg of squash genomic DNA. Aliquots of DNA were digested separately with 15 units of XhoI and BamHI and electrophoresed overnight in a 0.8% TAE agarose gel. Capillary transfer of DNA to Hybond N+ nylon membranes was done under alkaline conditions and the DNA fixed by UV cross-linking. The probe comprising the SQAPI coding sequence amplified from squash genomic DNA was randomly labeled with 32P α-dCTP using the Rediprime II system (GE Healthcare, CT, USA) according to the manufacturer’s instructions. Blots were hybridized at 62°C, then washed twice in 3× SSC and once in 0.5× SSC at 65°C.
Identification of Cucurbitales SQAPI Gene Orthologues
Genomic DNA from the leaves of each plant was extracted using the Nucleon extraction and purification kit (GE Healthcare) following their standard protocol for extraction from 100-mg samples. This DNA was then used as a PCR template with forward and reverse nested primers based on the 5′ and 3′ ends of SQAPI (Christeller et al. 1998). PCR was carried out using Platinum Taq (Invitrogen) for 30 cycles with an annealing temperature of 48°C. The four primers used were as follows (5′ to 3′):
- F1: ATG GTT GAT TTT CCA CAC ATG
- F2: CCA GCC ATC GGT GAA GTG ATA
- R2: GAG CTT CAG TGA ATT ATC TGA A
- R1: AGC TTA GAA AAG AGG AAC GAA AG
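As a rough check on how the four primers relate to the 48°C annealing temperature, their lengths and approximate melting temperatures can be estimated. The sketch below uses the Wallace rule of thumb (2°C per A or T, 4°C per G or C), which is an assumption introduced here for illustration and is not a calculation reported in this study; the function name `wallace_tm` is likewise hypothetical.

```python
# Primer sequences as listed above (5' to 3'); spaces are only for readability.
PRIMERS = {
    "F1": "ATG GTT GAT TTT CCA CAC ATG",
    "F2": "CCA GCC ATC GGT GAA GTG ATA",
    "R2": "GAG CTT CAG TGA ATT ATC TGA A",
    "R1": "AGC TTA GAA AAG AGG AAC GAA AG",
}

def wallace_tm(primer):
    """Approximate Tm (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    This rule of thumb is an assumption for illustration only; it is not
    the method used by the authors to choose the annealing temperature.
    """
    seq = primer.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

for name, primer in PRIMERS.items():
    length = len(primer.replace(" ", ""))
    print(f"{name}: {length} nt, approximate Tm {wallace_tm(primer)} deg C")
```

Estimates of this kind are only a guide; the low annealing temperature used here is consistent with amplifying divergent orthologues from species other than squash.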
PCR products were gel purified (QIAgen GmbH) and sequenced using the Capillary ABI3730 Genetic Analyser (Applied Biosystems Inc.). Unique predicted peptide sequence orthologues (Fig. 4) were used for subsequent phylogenetic analyses.
Phylogenetic Analyses
Alignment of SQAPI sequences was conducted using ClustalX (Thompson et al. 1997). Phylogenetic trees were generated using programs implemented within PHYLIP (Felsenstein 2002) and trees were subsequently rendered by Treeview (Page 1996).
Preparation of Anti-SQAPI Antibody
Polyclonal antibodies were raised in a New Zealand white rabbit by multiple subcutaneous injections of purified recombinant SQAPI (HDVA isoform) prepared as described previously (Farley et al. 2002). The initial immunization was by injection of 215 μg of His-tagged recombinant SQAPI emulsified in complete Freund’s adjuvant. Booster injections of 108 μg of His-tagged recombinant SQAPI and 75 μg recombinant SQAPI were administered on days 27 and 49, respectively. Serum was collected on day 59 and partially purified by ammonium sulfate precipitation before use.
Identification of SQAPI Protein
Phloem exudate was collected as previously described (Murray and Christeller 1995) and mixed with SDS-PAGE sample buffer. Plant tissues were ground in 50 mM Tris-HCl, pH 7.5, containing 20 μM E-64 and 5% PVPP (Sigma Chemical Co.). After centrifugation, extracts were dialyzed against three changes of 5 mM Tris-HCl, pH 7.5, and freeze-dried. The residue was suspended in a minimum amount of water and SDS-PAGE sample buffer. Samples were heated at 70°C for 10 min, then centrifuged and run on a NuPAGE 10% gel (Invitrogen, Carlsbad, CA, USA) in MES-Tris-EDTA-SDS buffer (pH 7.3). The gel was blotted onto an Immobilon-P membrane (Millipore Corp., Milford, MA, USA) in 0.5% disodium tetraborate/20% methanol and incubated with rabbit anti-SQAPI polyclonal antibody in PBS-T. SQAPI was detected using goat anti-rabbit IgG-alkaline phosphatase and NBT/BCIP tablets (Sigma-Aldrich, WI, USA). In the case of Begonia rex, leaves were extracted in ∼2 vol of 0.75% lactic acid with 0.02% Tween 20, then centrifuged, the pH was adjusted to 7.6 using Tris base, the mixture was concentrated to 0.2 vol using a centrifugal concentrator, and 1 ml was applied to a G75 Superdex column (GE Healthcare) equilibrated with 0.1 M Tris-HCl, pH 8.0. Fractions were assayed for pepsin inhibitory activity and active fractions combined, concentrated 10-fold, and assayed (see below).
Cloning of SQAPI Variants and Assay of Pepsin Inhibitory Activity
SQAPI was recloned into pET30 (Laing et al. 2004) from four selected clones isolated as above. SQAPI from expressed clones and from phloem extracts was assayed as described previously (Christeller et al. 1998; Farley et al. 2002).
Results
Identification of Phytocystatins as the Nearest Homologues to SQAPI and Prediction of the Tertiary Structure of SQAPI
BLAST (BLASTP, TBLASTN) (Altschul et al. 1997) searches in GenBank (including EST sequences) using SQAPI sequences indicated that the nearest match to the SQAPI sequence was the phytocystatin family. For example, the closest translated gene sequence to SQAPI was Prunus persica (peach) BAC clone 82I18 (GenBank accession AC154901), the translation of which gave 33% identical sequence over the full length of SQAPI and 51% similar sequence. BLAST of the translation of this particular Prunus clone back into the GenBank protein sequence database hit only cystatin clones from a wide range of species, and the Prunus sequence showed the diagnostic cystatin motifs (Turk and Bode 1991). A functionally verified cystatin (Rassam and Laing 2004) gave an amino acid identity value of 21% and a similarity of 43%.
Pairwise comparisons of SQAPI with Arabidopsis thaliana putative PI peptide sequences using the Smith-Waterman local alignment program gave similar identities to those described above over the full length of SQAPI to Arabidopsis cystatins, but only gave alignments over 11 to 24 residues with other Arabidopsis putative PIs (e.g., serpins, Bowman-Birk, proteinase inhibitor 2, and mustard seed families). One exception was a putative serpin inhibitor (AT3G45220; 393 residues), which had 20% identity over 73 residues. Thus there is a strong contrast between cystatins, which align over nearly their full length with SQAPI, and other PIs, which not only are not detected by BLAST searches but also show little global alignment to SQAPI. In addition, SQAPI and cystatins are of a similar number of residues in length, in contrast to most of the other PIs compared.
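The identity and similarity percentages quoted in these comparisons can be computed from any pairwise alignment. The following is a minimal sketch, assuming the two sequences are already aligned with "-" marking gaps. The similarity groups are a simplified assumption and are not the substitution matrix used by BLAST or the Smith-Waterman program, and the two short fragments are hypothetical, not real SQAPI or cystatin sequence.

```python
# Assumed, simplified grouping of chemically similar amino acids.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]

def _similar(a, b):
    """True if two residues are identical or share an assumed similarity group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)

def identity_similarity(seq1, seq2):
    """Return (% identity, % similarity) over the aligned length, gaps included."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = len(seq1)
    pairs = list(zip(seq1, seq2))
    ident = sum(a == b and a != "-" for a, b in pairs)
    simil = sum(a != "-" and b != "-" and _similar(a, b) for a, b in pairs)
    return 100.0 * ident / n, 100.0 * simil / n

# Hypothetical aligned fragments used only to exercise the function.
ident, simil = identity_similarity("IAEFALKQ-A", "LARFAVEEHN")
print(f"{ident:.0f}% identity, {simil:.0f}% similarity over 10 aligned positions")
```

Because gapped columns are counted in the denominator, the same pair of sequences gives different percentages when gaps are restricted or unrestricted, which is why the alignment length is reported alongside each value.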
We used the structure of a rice phytocystatin (GenBank accession number and Protein data bank ID 1EQK), which had been determined in solution using NMR, as a template on which to model SQAPI using Swissmodel (Schwede et al. 2003). These two proteins show 26% identity and 44% similarity over 104 residues when gaps are unrestricted, with lower percentages when gaps are restricted for modeling purposes (23%/44% over 100 residues; Fig. 1). The identical and similar residues were evenly distributed along the full sequence. The model fitted well to cystatin (Fig. 2), predicting the same two loops and the α-helix region found in cystatin (Turk and Bode 1991). The sequences of the SQAPI loops were quite distinct from the cystatin loops having a tryptophan in the center of loop 1 and no tryptophan (found in cystatin) in loop 2. However, the consensus sequence for the α-helix found in the cystatin, (LVI)-(AGT)-(RKE)-(FY)-(AS)-(VI)-X-(EDQV)-(HYFQ)-N (Margis et al. 1998) was also consistent with the SQAPI sequence at the predicted α-helix (IAEFALKQHA) in eight of nine residues (equating I with L in position 6). The two gaps in the modeled alignment (Fig. 2) occurred in the predicted noncontact loop between β-sheet 2 and β-sheet 3 (missing larger loop in SQAPI due to two extra residues in the SQAPI sequence compared to the rice cystatin; Fig. 2) and in the relatively unstructured sequence leading up to the α-helix (where SQAPI has one less residue; not visible in Fig. 2). While some structural discrepancies were reported by Whatcheck (Hooft et al. 1996), these were probably a function of the original rice cystatin template. For example, the Ramachandran Z-score was −6.163 for the rice cystatin and −5.4 for the modeled structure. These discrepancies were not taken further at this stage.
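The comparison of the predicted SQAPI α-helix with the cystatin consensus can be reproduced position by position. The sketch below encodes the consensus exactly as quoted above and counts matches for IAEFALKQHA; the function name and the `equate_leu_ile` flag are illustrative assumptions, and the flag treats L as equivalent to I wherever I is allowed, which changes the outcome only at position 6.

```python
# Consensus for the cystatin alpha-helix as quoted in the text (Margis et al. 1998).
# "X" means any residue and is not counted as a defined position.
CONSENSUS = ["LVI", "AGT", "RKE", "FY", "AS", "VI", "X", "EDQV", "HYFQ", "N"]
SQAPI_HELIX = "IAEFALKQHA"

def count_matches(seq, consensus, equate_leu_ile=False):
    """Return (matches, defined_positions) for seq against the consensus."""
    matches = defined = 0
    for residue, allowed in zip(seq, consensus):
        if allowed == "X":
            continue  # wildcard position, not scored
        defined += 1
        if equate_leu_ile and "I" in allowed:
            allowed += "L"  # treat L as equivalent to I where I is allowed
        if residue in allowed:
            matches += 1
    return matches, defined

strict = count_matches(SQAPI_HELIX, CONSENSUS)
relaxed = count_matches(SQAPI_HELIX, CONSENSUS, equate_leu_ile=True)
print("strict:", strict, "relaxed:", relaxed)
```

With L and I equated, eight of the nine defined positions match, the single mismatch being the final residue (A in SQAPI where the consensus has N).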
Fluorescence Spectroscopy of the SQAPI-Pepsin Complex
The predicted tertiary structure for SQAPI places the only tryptophan found in SQAPI in the first active site turn, suggesting that this tryptophan may be intimately involved in binding with the target pepsin. The emission wavelength maximum for the single tryptophan residue (residue 60) in recombinant SQAPI (HDVA isoform) was 350 nm in 50 mM acetic acid (pH 3.2) and 351 nm in 50 mM ammonium bicarbonate (pH 8.0) (Fig. 3). For comparison, the emission wavelength maximum for the amino acid tryptophan in solution was 354 nm at pH 3.2 and 356 nm at pH 8.0. The fluorescence of the SQAPI tryptophan residue was almost completely quenched following formation of a complex with pepsin. The difference spectrum for the complex was essentially indistinguishable from the difference spectrum for the pepstatin-pepsin complex, whereas the fluorescence of an equivalent concentration of tryptophan was easily observed.
SQAPI Is a Small Gene Family in Squash
In order to determine the size of the SQAPI gene family in squash, we searched for homologues in this species. The existence of a small squash gene family was confirmed by two lines of evidence: first, by isolation of different genes by PCR, from both cDNA and genomic DNA, and, second, by Southern blotting of genomic DNA. Two variants had previously been obtained by PCR from cDNA of squash tissue (GenBank AAC39473 and AAC39474) (Christeller et al. 1998). Additionally, seven more complete clones were identified using a reverse primer based on the 3′ terminal sequence of the above isolated cDNAs (Siebert et al. 1995). Translations of these clones are listed in Fig. 4. In total, 9 distinct variants have been cloned from C. maxima var Supermarket Hybrid, showing changes in 15 residues of 103.
The nonconservative residue changes are mainly toward the N-terminal half of the SQAPI sequences (Fig. 4). For example, C. maxima 6 and 7 differ from the other C. maxima SQAPI sequences in having a proline at residue 12 instead of the alanine found in the others (P12A). Similarly, C. maxima 2 and 5 differ from the other SQAPIs by having a proline at residue 6 (P6A) and a glycine at residue 15 (G15E), while C. maxima 2, 5, and 9 have an N at residue 3 and other SQAPIs have a D (N3D). Conservative changes also occur further along the SQAPI sequence. Five squash clones have the sequence HWD at residues 59–61, while four have DWN. In another distinctive change, C. maxima 8 has an extra amino acid insertion at position 24 but is otherwise identical to C. maxima 3.
Southern blots provided supporting evidence for a small SQAPI gene family (Fig. 5). The data show five distinct bands when the DNA is restricted with XhoI and over six bands when restricted with BamHI. The increase in band numbers is expected with BamHI because the SQAPI sequence has a BamHI site in the middle, whereas XhoI cuts only outside the sequences we have cloned. The results suggest that between 5 and 10 loci for SQAPI alleles occur in squash.
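The reasoning behind the band counts can be illustrated with a toy digest: an enzyme that cuts within the probed sequence splits each gene copy into two hybridizing fragments, whereas an enzyme that cuts only outside it leaves one. The sketch below is a simplified model, and every coordinate in it is hypothetical rather than taken from the squash genome.

```python
def hybridizing_fragments(length, cut_sites, probe_start, probe_end):
    """Count restriction fragments of a linear DNA that overlap the probe region.

    Coordinates are 0-based and half-open; cut_sites are cleavage positions.
    All numbers used below are hypothetical.
    """
    bounds = [0] + sorted(cut_sites) + [length]
    fragments = zip(bounds[:-1], bounds[1:])
    return sum(1 for start, end in fragments
               if start < probe_end and end > probe_start)

# One hypothetical 10-kb locus with the probed coding sequence at 4000-4300.
outside_cutter = [1000, 8000]        # sites flank the gene (the XhoI-like case)
inside_cutter = [1500, 4150, 9000]   # one site within the gene (the BamHI-like case)

print(hybridizing_fragments(10000, outside_cutter, 4000, 4300))
print(hybridizing_fragments(10000, inside_cutter, 4000, 4300))
```

In this model the outside cutter gives one hybridizing fragment per locus and the inside cutter gives two, so the number of bands seen with the inside cutter overestimates the number of loci unless fragments comigrate.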
Identification of other Cucurbitales SQAPI Genes
We then searched for SQAPI homologues in other plant species. SQAPI genes were identified in all the Cucurbitales species tested (Table 1) by PCR, using either single or nested primer pairs, and subsequent sequencing of the PCR products. Our survey covered representative species from four of seven families within the order Cucurbitales, these families representing over 99% of the species in the order. The data show that SQAPI is present beyond the family Cucurbitaceae, indicating that evolution of SQAPI preceded speciation within the order Cucurbitales. Furthermore, sequencing revealed at least one major product from each species, and in several instances, two or three different genes were identified. Thus the data show evidence for small gene families of this inhibitor to be widely distributed in the order. BLASTP searches of the All GenBank database (which excludes ESTs) and of the GenBank EST database, including the HortResearch apple EST database, failed to find any non-Cucurbitales sequences with significant homology to SQAPI (E < 0.018) that did not have cystatin signature motifs (Margis et al. 1998).
SQAPI Expression as a Protein in Phloem
The presence of highly homologous SQAPI genes across the Cucurbitales does not necessarily show that all these genes are transcribed, translated, or active. The presence of SQAPI protein in the phloem of related cucurbits should be detectable by Western blotting and cross-reaction with a SQAPI polyclonal antibody, as the predicted protein sequences (Fig. 4) are very similar. It proved extremely difficult to obtain phloem exudate from the majority of species. However, SQAPI was identified in some Cucurbitaceae by Western blot (data not shown). Bands were observed of varying intensity in the 10-kDa region for phloem exudate collected from the fruit of C. maxima, C. moschata, C. pepo, C. sativus, L. aegyptica, and L. siceraria. The molecular masses of the L. siceraria and C. moschata proteins were slightly greater than those of the other species (data not shown). We could detect the presence of SQAPI using pepsin inhibition assays of crude phloem extracts in L. aegyptica, C. sativus, L. siceraria, C. moschata, Citrullus lanatus, and C. maxima extracts (Table 2). While we could not detect inhibitory activity in C. pepo or C. moschata, this was probably because little to no protein was detected in these extracts (Table 2). In addition, a partially purified aspartic PI activity could be detected in Begonia rex (Table 2), with a molecular mass of ∼11 kDa (data not shown). Three C. maxima variants of SQAPI (GenBank AAC39473, AAT67162, and AAC39474) have been cloned, expressed, purified, and evaluated for inhibitory activity as described in Materials and Methods. All showed strong inhibitory activity (Table 2).
Phylogenetic Analysis of Squash Paralogues
To further understand the relationships between the isolated members of the small gene family of SQAPI, we carried out a phylogenetic analysis of the sequences in Fig. 4 beginning at residue Gly18 in the alignment. Using the neighbor-joining method based on Jones-Taylor-Thornton protein distances, a phylogeny was produced that reveals at least four paralogous clusters of genes. Phylogenies based on DNA sequences using parsimony and distance methods revealed similar groupings; however, the relations among the groups and the position of B. rex 2 were not stable across the methods (data not shown). Groups II, III, and IV contain a member from a family other than the Cucurbitaceae, while group I only contains Cucurbitaceae members. Cucurbitaceae (19 genes) has members in all groups, whereas Corynocarpaceae (3 genes) has members in two groups. Begoniaceae (2 genes) and Coriariaceae (1 gene) are each represented in a single group. Because it is probable that these plant families each possess small gene families of SQAPI that we have not detected, the distribution of genes in Fig. 6 is certainly incomplete, although the number of groups may be more accurate and correlates with the suggested number of loci.
The plant cystatins, selected either from Arabidopsis or from other cystatins referred to in this paper, showed much greater difference between different members compared to SQAPI members, although this may be biased by the length difference between SQAPI and cystatins.
Hypervariability in Transcribed Amino Acid Residues in SQAPI Genes
PIs generally exhibit hypervariability at their contact residues due to coevolutionary pressures from their cognate proteinases (Christeller 2005; Creighton and Darby 1989). Cystatins are proposed to have three contact points with papain, their model target proteinase: a glycine near the N terminus and the two loops discussed above (Turk and Bode 1991). We identified these positions on SQAPI aligned with 1EQK_A (Figs. 1 and 2) and examined the sequences of these postulated contact points for variability compared with other sections of SQAPI. In the N-terminal region there are four glycine residues in SQAPI, three of which are invariant, with several other residues (12, 13, 15, and 17) showing variability between SQAPI clones (Fig. 4). The second contact region occurs at residues 57–61, where again there is significant variability around an invariant W. We have suggested that this W is possibly involved in inhibitor activity. The third region occurs at residues 87–92 and again includes variant residues.
These three regions of SQAPI were analyzed for hypervariability as described by Creighton and Darby (1989), whereby the numbers of amino acid replacements per amino acid per gene analyzed are compared for variable and nonvariable regions and hypervariability is detected with ratios significantly >1. We used this method as we wished to count the maximum possible changes as observed in our data, rather than the extent of change as would be measured by amino acid diversity. The three variable regions chosen contain five or six amino acids, compatible with cystatin inhibitor loop sizes (Turk and Bode 1991). Residues 1–5 were excluded from the analysis because we found that these residues were commonly cleaved in mature molecules and do not affect protease binding (Christeller et al. 1998). The data in Table 3 show that hypervariability is established for these three regions since the function divergence ratios (FDR) are much >1 and are similar to FDR values observed in other protease inhibitors (Creighton and Darby 1989). In addition, the predicted α-helix region of SQAPI showed a reduced FDR value compared with the three identified regions or the rest of SQAPI (Table 3).
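The ratio described above can be sketched on a toy alignment. The code below is an illustration under stated assumptions, not the exact procedure of Creighton and Darby (1989): replacements in each alignment column are counted as the number of distinct residues minus one, and the rate is expressed per residue per sequence. The alignment and the chosen loop columns are hypothetical.

```python
def replacement_rate(alignment, columns):
    """Replacements per residue per sequence over the given column indices.

    Assumption: replacements in a column are counted as the number of
    distinct residues minus one (the minimum needed to explain the column).
    """
    n_seqs = len(alignment)
    total = sum(len({seq[c] for seq in alignment}) - 1 for c in columns)
    return total / (len(columns) * n_seqs)

def divergence_ratio(alignment, region):
    """Ratio of the replacement rate inside `region` to the rate outside it."""
    length = len(alignment[0])
    inside = [c for c in range(length) if c in region]
    outside = [c for c in range(length) if c not in region]
    return replacement_rate(alignment, inside) / replacement_rate(alignment, outside)

# Hypothetical toy alignment; columns 4-7 play the role of a contact loop.
toy = [
    "MKAVHWDTLLGA",
    "MKAVDWNTLLGA",
    "MKAVHWDSLLGA",
    "MRAVDWNALLGA",
]
ratio = divergence_ratio(toy, region=range(4, 8))
print(f"divergence ratio for the toy loop: {ratio:.1f}")
```

A ratio well above 1, as in this toy case, is the pattern interpreted as hypervariability; the sketch does not handle the case where the region outside the loop has no replacements at all, which would make the ratio undefined.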
While there is evidence for hypervariability in the three regions of SQAPI thought to be involved in contact with the protease, there is no evidence for hypervariability over the whole SQAPI protein as estimated by the ratio of amino acid replacements per replacement site (Ka) over the number of silent changes per silent site (Ks) (Ka/Ks). The overall value of Ka/Ks for the SQAPI dataset is 0.355. A Ka/Ks <1 is generally indicative of a protein being under functional constraint, presumably in the case of SQAPI to maintain its structure and PI activity.
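For readers unfamiliar with Ka/Ks, the sketch below shows one simplified way such a ratio can be estimated for a pair of aligned coding sequences. It follows the general idea of the Nei-Gojobori counting approach with a Jukes-Cantor correction, which is an assumption made here because the text does not state which method was used. It handles only codon pairs that differ at a single position, counts changes to stop codons as nonsynonymous, and uses hypothetical sequences.

```python
from itertools import product
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def syn_sites(codon):
    """Number of synonymous sites in a codon (each position contributes 0 to 1)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    s += 1 / 3
    return s

def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return -0.75 * log(1 - 4 * p / 3)

def ka_ks(seq1, seq2):
    """Return (Ka, Ks) for two aligned, gap-free coding sequences of equal length."""
    codons1 = [seq1[i:i + 3] for i in range(0, len(seq1), 3)]
    codons2 = [seq2[i:i + 3] for i in range(0, len(seq2), 3)]
    s_sites = sum(syn_sites(c) for c in codons1 + codons2) / 2
    n_sites = 3 * len(codons1) - s_sites
    s_diff = n_diff = 0
    for c1, c2 in zip(codons1, codons2):
        changed = sum(a != b for a, b in zip(c1, c2))
        if changed > 1:
            raise ValueError("this sketch handles at most one change per codon")
        if changed == 1:
            if CODON_TABLE[c1] == CODON_TABLE[c2]:
                s_diff += 1
            else:
                n_diff += 1
    return jukes_cantor(n_diff / n_sites), jukes_cantor(s_diff / s_sites)

# Hypothetical sequences: two silent changes and one amino acid replacement.
ka, ks = ka_ks("ATGGCTAAAGGTCTG", "ATGGCCAGAGGTCTA")
print(f"Ka/Ks = {ka / ks:.3f}")
```

A value below 1, as in this toy pair, means that replacement changes have accumulated more slowly per site than silent changes, which is the basis for interpreting the whole-protein value as functional constraint.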
Discussion
The Predicted Three-Dimensional Structure of SQAPI, Hypervariability, Binding Loop Identification, and Homology with Cystatin
Hypervariability is well documented in many PI families as positive Darwinian selection at regions of the molecules where interaction with proteinases occurs (Christeller 2005; Creighton and Darby 1989). The involvement of the two interacting molecules from two different organisms in a pathogenic, parasitic, or food source relationship is central to this phenomenon and represents a case of evolutionary warfare (Creighton and Darby 1989). Table 3 shows the higher variability between different SQAPIs in predicted proteinase contact regions compared with other regions with predicted backbone structure function. It has been found that regions of hypervariability within a PI indicate the presence of external loops that are involved in proteinase binding (Hill and Hastie 1987; Laskowski et al. 1987a,b). That our predicted contact regions are hypervariable supports the concept that these regions of SQAPI are also involved in these protein-protein interactions, despite no direct structural information being available. This is also supported by the fact that these hypervariable regions in SQAPI aligned with predicted papain contact regions of cystatins. SQAPI shows 23%–33% identity to phytocystatins and 45%–54% similarity. Interestingly, high homology is detected (30% identity and 47% similarity over 101 residues including gaps) to a Cucurbitaceae cystatin, the C. sativus cystatin (GenBank BAA28867). This level of homology is a very good indication of related proteins. For example, the identity and similarity between the cucurbit cystatin and cucurbit SQAPI are similar to those between apple and pear cystatins (GenBank accession numbers AAO18638 and AAB71505; 30%/55%; alignment not shown). These homologies indicate that SQAPI may have evolved from a phytocystatin, the latter being widespread throughout the plant kingdom, at a late stage in Angiosperm evolution.
Tryptophan Fluorescence of SQAPI
The invariant single tryptophan in SQAPI aligns and models with the contact loop sequence QVVSG on the oryzacystatin both in the threaded structure and in BLAST alignments with other phytocystatins. Consistent with a relatively exposed location for the single tryptophan residue, the emission wavelength maximum of the SQAPI tryptophan fluorescence was found to be almost identical to that of free tryptophan but fluorescence emission difference spectrum analysis shows complete quenching on binding of SQAPI to pepsin (Fig. 3). Furthermore, kinetic studies with the 60W/A mutant and SQAPI in which the single tryptophan residue was specifically oxidized with N-bromosuccinimide (data not shown) have also implicated this invariant tryptophan as a contact residue. Together, these three independent observations strongly suggest that we have correctly identified a binding loop of SQAPI. Structural and sequence similarity also supports the identification of the N-terminal region and the second identified loop as important in inhibitor binding as predicted by hypervariability.
The Cucurbita maxima SQAPI Gene Family
Southern blotting of C. maxima genomic DNA with restriction enzymes XhoI and BamHI produced estimates of a gene family of at least five genes (Fig. 5). However, about two bands in each blot were broad and dense, suggesting that more than a single species might be present, although matching the probe to the loci might also affect this interpretation. Because BamHI cuts in the middle of the SQAPI gene sequence, it would very probably produce multiple bands of very similar size if the gene family has evolved by gene duplication to produce tandem or higher-level contiguous genes. Gene duplication as an evolutionary mechanism is very common in many inhibitor gene families and has been well reviewed (Laskowski and Kato 1980; Rawlings et al. 2004). The most extreme examples are genes in the PI II family and cystatin family that have been characterized with varying numbers of domains, from one through eight (Choi et al. 2000; Walsh and Strickland 1993). Cloning from cDNA and gDNA, while not necessarily exhaustive, suggests that at least nine distinct paralogous genes are present. Thus, although the two approaches give similar but not identical answers, neither is authoritative and neither can provide an absolute value for the size of the gene family. The consistency in the data is further underlined by noting that cloning identifies individual genes, whereas Southern blots identify gene loci, and alleles if the loci are polymorphic. We can reasonably deduce from the data that we are dealing with a gene family of about nine genes (although we cannot necessarily distinguish alleles from loci), with at least two paralogous variants in each cluster (Fig. 6). This result indicates that evolution of paralogues has continued during Cucurbitaceae evolution.
Phylogeny
We were able to clone SQAPI gene homologues from all Cucurbitales specimens examined. This includes specimens from four families within the order representing over 99% of the species in the order. These data indicate that SQAPI evolved before the evolutionary separation of families. The phylogenetic tree (Fig. 6) provides evidence that the evolution of paralogues had also occurred in the Cucurbitales ancestor. The tree shows the existence of four paralogue clusters, all of which contain Cucurbitaceae genes and three of which contain non-Cucurbitaceae genes.
While cluster I only contains Cucurbitaceae sequences, we cannot be sure that any clusters of paralogues are uniquely Cucurbitaceae because it is probable that small gene families exist in all these species and we have merely isolated representative genes. This conjecture is supported by the isolation of multiple genes (two or three) from three specimens. However, the appearance of Corynocarpus genes in two highly separated clusters indicates that at least gene duplication had occurred in the Cucurbitales ancestor. The tree also shows the clear-cut separation of SQAPI from the cystatins.
As the angiosperm phylogeny becomes increasingly certain, it may be possible to trace the appearance of relatively newly evolved genes like SQAPI by identifying their presence or absence in sister orders. The current status of angiosperm phylogeny suggests that Rosales and Fagales are the two sister orders to Cucurbitales (Zhang et al. 2006). We believe it is likely that SQAPI evolved comparatively recently because aspartic protease inhibitors are rare in nature, and the SQAPI gene has not been identified in any fully sequenced genomes, including plant genomes. Neither has it been found in any order other than the Cucurbitales in genes deposited in GenBank, including EST collections, or in the HortResearch apple (Rosales) EST database of over 150,000 ESTs (Newcomb et al. 2006). In addition, we have been unable to detect pepsin inhibitory activity in apple tissue samples (unpublished data).
Our data suggest not only that SQAPI has evolved recently from the older, widely distributed cystatin family, but also that it has utilized the cystatin inhibitory mechanism. The cystatin mechanism, which relies on steric hindrance by insertion of its inhibitory contact residues into the active site crevice of cysteine proteinases, does not directly interact with the proteinase nucleophilic sulfhydryl. It remains to be seen exactly how SQAPI interacts with aspartic proteases. The protein-protein interactions for two protein aspartic inhibitors, PI-3 and IA3, have been characterized (Li et al. 2000; Ng et al. 2000) and are quite distinct from that proposed for SQAPI.
Nevertheless, evolution of a protease inhibitor of one family from that of a different family is not without precedent; indeed, it appears to be a frequently occurring evolutionary mechanism. Examples include serine PIs of the serpin family recruited to cysteine proteinase inhibition (Komiyama et al. 1994) and to aspartic proteinase inhibition (Mathialagan and Hansen 1996), serine PIs of the seed Kunitz family recruited to cysteine proteinase inhibition (Krizaj et al. 1993) and to aspartic proteinase inhibition (Mares et al. 1989), and cysteine PIs of the thyropin family recruited to aspartic proteinase inhibition (Lenarcic and Turk 1999).
References
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
Choi D, Park JA, Seo YS, Chun YJ, Kim WT (2000) Structure and stress-related expression of two cDNAs encoding proteinase inhibitor II of Nicotiana glutinosa L. Biochim Biophys Acta 1492:211–215
Christeller JT (2005) Evolutionary mechanisms acting on proteinase inhibitor variability. FEBS J 272:5710–5722
Christeller JT, Laing WA (2005) Plant serine proteinase inhibitors. Protein Peptide Lett 12:439–447
Christeller JT, Farley PC, Ramsay RJ, Sullivan PA, Laing WA (1998) Purification, characterization and cloning of an aspartic proteinase inhibitor from squash phloem exudate. Eur J Biochem 254:160–167
Creighton TE, Darby NJ (1989) Functional evolutionary divergence of proteolytic enzymes and their inhibitors. Trends Biochem Sci 14:319–324
Doyle A (1990) Isolation of plant DNA from fresh tissue. Focus 12:13–15
Farley PC, Christeller JT, Sullivan ME, Sullivan PA, Laing WA (2002) Analysis of the interaction between the aspartic peptidase inhibitor SQAPI and aspartic peptidases using surface plasmon resonance. J Mol Recognit 15:135–144
Felsenstein J (2002) PHYLIP: Phylogeny Inference Package, version 3.6, Department of Genome Sciences, University of Washington, Seattle
Galleschi L, Friggeri M, Repiccioli R, Come D, Corbineau F (1993) Aspartic proteinase inhibitor from wheat: some properties. In: Proceedings of the Fourth International Workshop on Seeds: Basic and Applied Aspects of Seed Biology, Angers, France, 20–24 July 1992. ASFIS, Paris, Vol 1, pp 207–211
Guex N, Peitsch MC (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modelling. Electrophoresis 18:2714–2723
Hill RE, Hastie ND (1987) Accelerated evolution in the reactive centre regions of serine protease inhibitors. Nature 326:96–99
Hooft RWW, Vriend G, Sander C, Abola EE (1996) Errors in protein structures. Nature 381:272
Komiyama T, Ray C, Pickup D, Howard A, Thornberry N, Peterson E, Salvesen G (1994) Inhibition of interleukin-1 beta converting enzyme by the cowpox virus serpin CrmA. An example of cross-class inhibition. J Biol Chem 269:19331–19337
Krizaj I, Drobnic-Kosorok M, Brzin J, Jerala R, Turk V (1993) The primary structure of inhibitor of cysteine proteinases from potato. FEBS Lett 333:15–20
Laing WA, Barraclough D, Bulley S, Cooney J, Wright M, Macrae E (2004) A specific L-galactose-1-phosphate phosphatase on the path to ascorbate biosynthesis. Proc Natl Acad Sci USA 101:16976–16981
Laskowski M, Kato I (1980) Protein inhibitors of proteinases. Annu Rev Biochem 49:593–626
Laskowski M Jr, Kato I, Ardelt W, et al. (1987a) Ovomucoid third domains from 100 avian species: isolation, sequences, and hypervariability of enzyme-inhibitor contact residues. Biochemistry 26:202–221
Laskowski M Jr, Kato I, Kohr WJ, Park SJ, Tashiro M, Whatley HE (1987b) Positive darwinian selection in evolution of protein inhibitors of serine proteinases. Cold Spring Harbor Symp Quant Biol 52:545–553
Lenarcic B, Turk V (1999) Thyroglobulin type-1 domains in equistatin inhibit both papain-like cysteine proteinases and cathepsin D. J Biol Chem 274:563–566
Li M, Phylip LH, Lees WE, Winther JR, Dunn BM, Wlodawer A, Kay J, Gustchina A (2000) The aspartic proteinase from Saccharomyces cerevisiae folds its own inhibitor into a helix. Nat Struct Biol 7:113–117
Mares M, Meloun B, Pavlik M, Kostka V, Baudys M (1989) Primary structure of the cathepsin D inhibitor from potatoes and its structural relationship to soybean trypsin inhibitor family. FEBS Lett 251:94–98
Margis R, Reis EM, Villeret V (1998) Structural and phylogenetic relationships among plant and animal cystatins. Arch Biochem Biophys 359:24–30
Martzen MR, McMullen BA, Smith NE, Fujikawa K, Peanasky RJ (1990) Primary structure of the major pepsin inhibitor from the intestinal parasitic nematode Ascaris suum. Biochemistry 29:7366–7372
Mathialagan N, Hansen TR (1996) Pepsin-inhibitory activity of the uterine serpins. Proc Natl Acad Sci USA 93:13653–13658
Murray C, Christeller JT (1995) Purification of a trypsin inhibitor (PFTI) from pumpkin fruit phloem exudate and isolation of putative trypsin and chymotrypsin inhibitor cDNA clones. Biol Chem Hoppe Seyler 376:281–287
Nagata K, Kudo N, Abe K, Arai S, Tanokura M (2000) Three-dimensional solution structure of oryzacystatin-I; a cysteine proteinase inhibitor of the rice, Oryza sativa L. japonica. Biochemistry 39:14753–14760
Newcomb RD, Crowhurst RN, Gleave AP, Rikkerink EH, Allan AC, Beuning LL, Bowen JH, Gera E, Jamieson KR, Janssen BJ, Laing WA, McArtney S, Nain B, Ross GS, Snowden KC, Souleyre EJ, Walton EF, Yauk YK (2006) Analyses of expressed sequence tags from apple (Malus x domestica). Plant Physiol 141:147–166
Ng KK, Petersen JF, Cherney MM, Garen C, Zalatoris JJ, Rao-Naik C, Dunn BM, Martzen MR, Peanasky RJ, James MN (2000) Structural basis for the inhibition of porcine pepsin by Ascaris pepsin inhibitor-3. Nat Struct Biol 7:653–657
Page RDM (1996) TREEVIEW: An application to display phylogenetic trees on personal computers. Comput Appl Biosci 12:357–358
Peitsch MC (1995) Protein modeling by E-mail. Bio/Technology 13:658–660
Rassam M, Laing WA (2004) Purification and characterization of phytocystatins from kiwifruit cortex and seeds. Phytochemistry 65:19–30
Rawlings ND, Tolle DP, Barrett AJ (2004) Evolutionary families of peptidase inhibitors. Biochem J 378:705–716
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
Schu P, Suarez Rendueles P, Wolf DH (1991) The proteinase yscB inhibitor (PB12) gene of yeast and studies on the function of its protein product. Eur J Biochem 197:1–7
Schwede T, Kopp J, Guex N, Peitsch MC (2003) SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Res 31:3381–3385
Siebert PD, Chenchik A, Kellogg DE, Lukyanov KA, Lukyanov SA (1995) An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids Res 23:1087–1088
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882
Turk V, Bode W (1991) The cystatins: protein inhibitors of cysteine proteinases. FEBS Lett 285:213–219
Walsh TA, Strickland JA (1993) Proteolysis of the 85-kilodalton crystalline cysteine proteinase inhibitor from potato releases functional cystatin domains. Plant Physiol 103:1227–1234
Zhang L-B, Simmons MP, Kocyan A, Renner SS (2006) Phylogeny of the Cucurbitales based on DNA sequences of nine loci from three genomes: Implications for morphological and sexual system evolution. Mol Phylogenet Evol 35:305–322
Acknowledgments
We thank Mr. Andrew Clarke (Massey University, Palmerston North) for cucurbit seeds, Dr. Tony Lough (AgriGenesis Ltd., Auckland), Mr. Tom Rouse (Department of Conservation, Palmerston North) for plant material, Mr. Jonathon Proctor (Tanenuiarangi Manawatu Inc.) for advice on plant collection, and Gareth Cochran (Massey University, Palmerston North) for taking care of the plants. Finance was provided by the Public Good Science Fund of New Zealand (Contracts CO6X0207 and CO6X0220).
Additional information
[Reviewing Editor: Dr. Antony Dean]
About this article
Cite this article
Christeller, J.T., Farley, P.C., Marshall, R.K. et al. The Squash Aspartic Proteinase Inhibitor SQAPI Is Widely Present in the Cucurbitales, Comprises a Small Multigene Family, and Is a Member of the Phytocystatin Family. J Mol Evol 63, 747–757 (2006). https://doi.org/10.1007/s00239-005-0304-z
DOI: https://doi.org/10.1007/s00239-005-0304-z