Introduction

Protein families consist of proteins with similar biochemical function and three-dimensional structure. The primary sequences of proteins in a family contain conserved amino acid residues, which usually serve important roles in determining the protein function and structure. However, during divergent evolution, conserved residues are not invariable, and may be replaced to create a new function. As a result, when subfamilies within a protein family are compared, some amino acid residues are found to be replaced in moving from one subfamily to another, while remaining invariant within each subfamily. Such amino acid residues specifically conserved within a subfamily are known to be involved in determining the functional specificity of the subfamily (Innis et al. 2000; Pritchard and Dufton 1999; Sowa et al. 2001), such as substrate specificity. Thus, replacement of conserved residues may lead to diversification and the production of a family with a new function. On the other hand, mutations of conserved residues generally have deleterious effects on the protein’s function and/or stability (Guo et al. 2004). These observations indicate that mutations in the conserved residues may alter the functional specificity or eliminate the mutated proteins from the evolutionary tree.

During molecular evolution, compensatory mutations are known to suppress deleterious mutational effects in functional and/or structural properties of proteins (Kimura 1991; Kondrashov et al. 2002). This compensatory mechanism has been investigated empirically. For example, the complete loss of lysozyme activity in a defective T4 lysozyme mutant with substitution of the Trp residue in the hydrophobic core could be restored through other compensatory mutations (Jucovic and Poteete 1998), and similar observations were made with structural destabilization of p53 protein mutants (Nikolova et al. 2000). These results indicate that there may be compensatory mutations that can neutralize the deleterious effects caused by replacement of highly conserved residues. To experimentally investigate the presence of such compensatory mutations, we focused on the most strictly conserved residue within the WW domain family, and examined whether and to what degree defective function and structure caused by replacement of this residue can be compensated by second-site mutations. This is important to understand the possible path taken in the evolution of proteins when a conserved residue is replaced.

The WW domain is one of the smallest protein modules (30–40 residues), consisting of a three-stranded antiparallel β-sheet structure, which mediates protein–protein interaction by recognizing proline-rich peptide sequences (Bork and Sudol 1994; H. I. Chen and Sudol 1995; Macias et al. 1996). The WW domain carries two highly conserved Trp residues separated by a stretch of 20–22 amino acids, for which it is named. In particular, the N-terminal Trp is almost perfectly conserved among more than 2500 WW domains determined to date, as revealed by analysis using the SMART server (Schultz et al. 1998). A previous study on the human Yes-associated protein (hYAP) WW domain demonstrated the functional and structural importance of the two highly conserved Trp residues (Koepf et al. 1999a; Macias et al. 1996) and indicated that the N-terminal Trp17 is involved in the hydrophobic patch and is essential for maintaining the overall β-sheet structure (Fig. 1). Indeed, replacement of Trp17 with Phe resulted in a significant loss of structural integrity and reduced binding affinity (K_d = 15.1 μM) to the PY ligand (EYPPYPPPPYPSG) compared to the wild type (K_d = 5.9 μM) (Koepf et al. 1999a). Thus, the W17F mutant of the hYAP WW domain lacks the strictly conserved residue and the overall structural integrity, and exhibits reduced binding function (Koepf et al. 1999a).

Fig. 1

The primary and three-dimensional structures of the hYAP WW domain. (A) Three-dimensional structure of the wild-type hYAP WW domain (V11–R43) bound to the gPY ligand (GTPPPPYTVG) (Macias et al. 1996). Side chains of Trp17, Lys21, and Gln35 are depicted. Trp17 is the primary mutated residue highly conserved in WW domains, and Lys21 and Gln35 are the residues substituted most frequently in the enriched clones. The figure was prepared using PyMOL (http://www.pymol.org). (B) The sequence of the wild-type WW domain used in this study. Residues that form a hydrophobic patch, important for stabilizing the overall structure of the domain, are shown in boldface, and residues that make direct contact with the PY ligand are underlined. The two highly conserved residues, Trp17 and Trp39, are indicated by filled triangles above the sequence. Lys21 and Gln35, the residues substituted most frequently in the enriched clones, are indicated by stars above the sequence. The regions forming β-sheets (β1, β2, and β3) and the flexible regions are indicated below the sequence by arrows and dashed lines, respectively

In this study, we investigated whether and to what extent decreased binding function and structural disorder in the W17F mutant of the hYAP WW domain can be restored by second-site mutations introduced through an evolutionary process. Using ribosome display (Hanes and Plückthun 1997; Matsuura et al. 2007), we performed a selection experiment with the W17F mutant as the starting point and selected for variants (second-site revertants) that bound the PY ligand with improved affinity while preserving the W17F mutation. After four rounds of selection, we obtained a second-site revertant whose affinity to the PY ligand was 9-fold higher than that of the original W17F mutant and even 3.6-fold higher than that of the wild type. We also evaluated the binding specificity and structural properties of the selected revertants and found that the binding was highly specific for the recognition motif of the PY ligand. Moreover, the selected revertants were found to have a decreased apparent molecular weight and increased secondary structure content of the WW fold. Our results suggest that deleterious effects caused by mutations in highly conserved residues occurring through divergent evolution not only can be restored but can be improved even further by compensatory mutations. In addition, the functional significance of unstructured proteins, such as the W17F mutant, in protein evolution is discussed.

Materials and Methods

Synthesis of PY Ligands

Two types of PY ligand were used for selection and binding experiments: ePY (EYPPYPPPPYPSG) (Sudol 1998) and gPY (GTPPPPYTVG) (H. I. Chen and Sudol 1995). All peptides were synthesized by Gene Design Inc. (Osaka, Japan) with standard Fmoc protocols, purified by reverse-phase HPLC, and confirmed by MALDI-TOF mass spectrometry. Biotinylation of ePY ligand was performed by labeling the N-terminal amine of the ligand with EZ-link NHS-SS-biotin (Pierce, Rockford, IL, USA) in accordance with the manufacturer’s instructions.

DNA Constructs

Plasmid pRD-WW is a plasmid encoding human Yes-associated protein WW domain (Koepf et al. 1999a) and a spacer sequence for ribosome display format (Matsuura and Plückthun 2003). Plasmid pRD-W17F encoding the W17F mutant of the WW domain was generated by site-directed mutagenesis using the primers W17F+ (5′-TCTGCCAGCAGGTTTCGAAATGGCAAA-3′) and W17F– (5′-TTTGCCATTTCGAAACCTGCTGGCAGA-3′) with a QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, CA, USA) in accordance with the manufacturer’s instructions. To prepare the randomly mutated library of the W17F mutant, the DNA encoding the W17F gene was amplified by error-prone PCR using ΔTth DNA polymerase (Toyobo, Osaka, Japan) and the primers WW-forward (5′-ACGACAAAGCCATGGGTGGT-3′) and WW-reverse (5′-ATATAGGATCCACTGGTCGGGGCTGTGACG-3′) as described previously (Arakawa et al. 1996). The PCR product was then digested and ligated into the vector pRD-n1n2_2 (Matsuura and Plückthun 2003) using the NcoI/BamHI restriction sites. The ligation products were directly PCR amplified using the primers SDA-pqe (5′-AGACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTTTAAGAAAGAGGAGAAATTAACTATGAGA-3′) and pqe1- (5′-GATCTATCAACAGGAGTCCAAGCTCA-3′). The products were subsequently amplified again by PCR with the primers T7B (5′-ATACGAAATTAATACGACTCACTATAGGGAGACCACAACGG-3′) and T3te_pD- (5′-CGGCCCACCCGTGAAGGTGAGCC-3′) to introduce the 5′ and 3′ stem loops and to make the final construct for in vitro transcription (Matsuura and Plückthun 2003). PCR was performed using Ex Taq polymerase (TaKaRa, Otsu, Japan). The PCR products were used directly for in vitro transcription using T7 RNA polymerase and purified as described previously (Hanes and Plückthun 1997).

Ribosome Display

In vitro translation was carried out using an E. coli S-30 extract system (H. Z. Chen and Zubay 1983) or PURE system: an E. coli-based reconstituted in vitro translation system (Post Genome Institute, Tokyo) (Shimizu et al. 2001), essentially as described previously (Hanes and Plückthun 1997). Following a 7-min translation reaction at 37°C, the reaction was stopped by fourfold dilution with ice-cold wash buffer (WBKT; 50 mM Tris-acetate, pH 7.5, 150 mM NaCl, 50 mM magnesium acetate, 0.5 M KCl, 0.1% Tween 20) supplemented with 2.5 mg/ml heparin and 0.5% BSA. After centrifugation at 14,000g for 5 min, the supernatant containing ternary complexes of mRNA, encoded protein, and ribosomes was placed on ice and used for subsequent experiments.

Microtiter plates (Maxisorb; Nunc, Roskilde, Denmark) were coated with 100 μL of 2 μg/ml neutravidin (Pierce) dissolved in phosphate-buffered saline (PBS; 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 2 mM KH2PO4, pH 7.4) at 4°C overnight, and then blocked by incubation with 5% BSA for 2 h at room temperature. After three washes with 300 μl of PBST (PBS supplemented with 0.1% Tween 20), the plates were incubated with 100 μl of 0.5 μM biotinylated ePY ligand (biotin-EYPPYPPPPYPSG) for 0.5 h at room temperature. After five washes with 300 μl of ice-cold WBKT, aliquots of 100 μl of the supernatant containing the ternary complexes were applied to each well of the microtiter plates, which were then shaken at 4°C for 1 h. After five washes with 300 μl of WBKT, the retained ternary complexes were eluted with 200 μl of ice-cold elution buffer (50 mM DTT, 50 μg/ml Saccharomyces cerevisiae RNA in WBKT), which cleaves the S-S bond between the biotin moiety and the PY ligand, for 0.5 h at 4°C, and the released mRNA was purified using the SV Total RNA Isolation System (Promega, Madison, WI, USA).

The recovered mRNA was reverse transcribed and amplified by PCR using Taq polymerase (Promega) as described previously (Matsuura and Plückthun 2003). The PCR products were analyzed by agarose gel electrophoresis and used directly, or purified from the gel and reamplified if necessary. The resulting product was digested with NcoI and BamHI and ligated again into the vector pRD-n1n2_2 using the NcoI/BamHI restriction sites. The mRNA was prepared from the ligation product as described above and then used for subsequent panning rounds. It should be noted that errors can be introduced not only in the first round but also during the subsequent rounds, but with lower error rates. Although the ribosome display experiments were carried out using two different in vitro translation systems, S30 extract and the PURE system, we found no differences between the sequences of the selected clones, but the enrichment factor judged by the band intensity after RT-PCR was markedly better with the PURE system (data not shown).

Expression and Purification of the Selected Clones

The DNA pool after selection rounds was cloned into the NcoI/BamHI sites of pQE-pDNB (Matsuura and Plückthun 2003) to add the His-tag at the N-terminus used as an affinity tag for purification and an enterokinase cleavage site to remove the His-tag if necessary. E. coli XL1-blue (Stratagene) harboring the plasmid encoding the recombinant protein was grown in LB medium containing ampicillin (50 μg/ml) at 37°C to an OD600 of 0.5–1.0. Expression was induced by addition of 1 mM IPTG followed by an additional 3-h incubation at 37°C. The proteins were purified by immobilized metal ion affinity chromatography (IMAC) using Ni-NTA superflow (Qiagen, Hilden, Germany) under denaturing conditions, in accordance with the manufacturer’s instructions. Although all WW variants existed in the soluble fraction, they were purified under denaturing conditions because the variants may be sensitive to protease digestion. Purified proteins were then successfully refolded by dialyzing against HBST (10 mM HEPES, 150 mM NaCl, 3 mM EDTA, 0.005% Tween 20). When necessary, the His-tag was cleaved from the target protein by enterokinase (EKMax; Invitrogen, San Diego, CA, USA). Finally, the cleaved WW domains were purified to homogeneity by gel filtration chromatography on a HiLoad 16/60 Superdex 30 column (Amersham Bioscience, Piscataway, NJ, USA) using HBST as a running buffer for the subsequent BIAcore experiment. The purity was confirmed by SDS-PAGE and MALDI-TOF mass spectrometry (Voyager-DE STR; Applied Biosystems, Foster City, CA, USA). Note that Tween20 was removed from the sample prior to mass spectrometry by dialysis.

Surface Plasmon Resonance (SPR)

SPR experiments were carried out on a BIAcore 3000 instrument (BIAcore, Uppsala, Sweden). All experiments were performed at 25°C in HBST at a flow rate of 20 μl/min. The biotinylated ePY ligand was immobilized on an SA chip (BIAcore) in accordance with the manufacturer’s instructions. The amount of PY ligand immobilized on the flow cells was between 50 and 250 resonance units (RU). An uncoated flow cell was used as a reference cell. It should be noted that some variants showed stronger responses in the reference cell than those in the PY ligand immobilized cell, resulting in negative responses. This may have been because of enrichment of molecules such as streptavidin binders. Dissociation constants (K_d) were determined by measuring the RU at equilibrium at protein concentrations ranging from 0.1 to 20 μM (seven different concentrations were used for each experiment). Assuming 1:1 binding of the analyte and the ligand, the RU at equilibrium (R_eq) obtained with different concentrations of analyte (C) was fitted to the following equation:

$$ R_{eq} = \frac{R_{\max} C}{K_{d} + C} \tag{1} $$

where R_max is a constant and K_d (M) is the dissociation constant. Data are presented as the averages of measurements in three independent flow cells (Table 1).
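As an illustration of how Eq. 1 can be fitted to equilibrium responses, the following is a minimal sketch and not the fitting procedure actually used in this study, which the text does not specify. It exploits the fact that Eq. 1 is linear in R_max for a fixed K_d, so that only K_d needs a one-dimensional search (here a golden-section search on log10 K_d, which assumes a single minimum within the search bounds). The function names, the search bounds, and the seven concentrations in the example are hypothetical; the data are synthetic, generated from the wild-type K_d of 5.9 μM quoted in the Introduction.

```python
import math

def r_eq(c, r_max, k_d):
    """Eq. 1: equilibrium response for 1:1 binding at analyte concentration c (M)."""
    return r_max * c / (k_d + c)

def best_r_max(concs, responses, k_d):
    """For a fixed K_d, Eq. 1 is linear in R_max, so least squares has a closed form."""
    f = [c / (k_d + c) for c in concs]
    return sum(fi * ri for fi, ri in zip(f, responses)) / sum(fi * fi for fi in f)

def sse(concs, responses, k_d):
    """Sum of squared residuals at a given K_d, with R_max set to its optimum."""
    r_max = best_r_max(concs, responses, k_d)
    return sum((r - r_eq(c, r_max, k_d)) ** 2 for c, r in zip(concs, responses))

def fit_kd(concs, responses, lo=1e-9, hi=1e-3, iters=60):
    """Golden-section search over log10(K_d) between lo and hi (M); bounds are assumptions."""
    a, b = math.log10(lo), math.log10(hi)
    g = (math.sqrt(5) - 1) / 2
    x1, x2 = b - g * (b - a), a + g * (b - a)
    f1, f2 = sse(concs, responses, 10 ** x1), sse(concs, responses, 10 ** x2)
    for _ in range(iters):
        if f1 < f2:
            # Minimum lies in [a, x2]: shrink from the right.
            b, x2, f2 = x2, x1, f1
            x1 = b - g * (b - a)
            f1 = sse(concs, responses, 10 ** x1)
        else:
            # Minimum lies in [x1, b]: shrink from the left.
            a, x1, f1 = x1, x2, f2
            x2 = a + g * (b - a)
            f2 = sse(concs, responses, 10 ** x2)
    k_d = 10 ** ((a + b) / 2)
    return k_d, best_r_max(concs, responses, k_d)

# Synthetic example: seven hypothetical concentrations between 0.1 and 20 uM.
example_concs = [0.1e-6, 0.3e-6, 1e-6, 3e-6, 5e-6, 10e-6, 20e-6]
example_resp = [r_eq(c, 100.0, 5.9e-6) for c in example_concs]
fitted_kd, fitted_rmax = fit_kd(example_concs, example_resp)
print(f"K_d = {fitted_kd * 1e6:.2f} uM, R_max = {fitted_rmax:.1f} RU")
```

With real sensorgram data, the fit would be repeated for each flow cell and the resulting K_d values averaged, as described above.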

Table 1 Dissociation constants (μM) of WW domain variants to PY ligands

Solution affinities of WW variants for five gPY ligands were measured by competition BIAcore analysis. Competition experiments were carried out using the same SA chip coated with ePY ligand as described above. Samples of 5 μM WW variant were coincubated with gPY ligand at a series of concentrations (0.01–100 μM) for at least 1 h at 4°C before injection. Plateau responses (R_eq) of sensorgrams were plotted against the corresponding gPY ligand concentration (L_i), and the dissociation constant in solution (K_i) was calculated using Eq. 2:

$$ R_{eq} = \frac{R_{\max}}{1 + \dfrac{2K_{d}}{\sqrt{(L_{i} - C + K_{i})^{2} + 4K_{i}C} - (L_{i} - C + K_{i})}} \tag{2} $$

where C (M) is the concentration of the WW variant (here 5 μM), K_d (M) is the dissociation constant between the WW variant and the ePY ligand estimated using Eq. 1, R_max is a constant, and K_i (M) is the dissociation constant in solution. Errors for all measurements ranged between 5% and 20% (Table 1).
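Eq. 2 is Eq. 1 evaluated at the free (uncomplexed) WW concentration, which follows from the mass balance of a 1:1 WW–gPY complex in solution. The sketch below is a hedged illustration of that relationship and is not taken from the original analysis; the function names are hypothetical and the numerical values in the example are placeholders chosen only to exercise the functions.

```python
import math

def free_ww(l_i, c, k_i):
    """Free WW concentration (M) when total WW (c) is mixed with total competitor (l_i).

    Positive root of F^2 + (l_i - c + k_i) * F - k_i * c = 0 (1:1 mass balance).
    """
    b = l_i - c + k_i
    return (math.sqrt(b * b + 4.0 * k_i * c) - b) / 2.0

def r_eq_competition(l_i, c, k_i, k_d, r_max):
    """Eq. 2: plateau response in the presence of competitor at total concentration l_i."""
    b = l_i - c + k_i
    denom = math.sqrt(b * b + 4.0 * k_i * c) - b
    return r_max / (1.0 + 2.0 * k_d / denom)

# Placeholder values (assumptions): 5 uM WW variant, K_d = 5.9 uM, K_i = 123 uM.
c, k_d, k_i, r_max = 5e-6, 5.9e-6, 123e-6, 100.0
for l_i in (0.0, 1e-6, 10e-6, 100e-6):
    print(f"L_i = {l_i * 1e6:6.1f} uM -> R_eq = {r_eq_competition(l_i, c, k_i, k_d, r_max):.2f} RU")
```

Without competitor (L_i = 0), the free concentration equals C and Eq. 2 reduces to Eq. 1; as L_i increases, the plateau response falls, and K_i is the value that best reproduces the measured decline.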

Biophysical Characterization of the Selected Variants

Circular dichroism (CD) spectroscopy was performed with a Jasco J-720WI spectropolarimeter. All measurements were performed at 25°C. The data were normalized to molar ellipticity with a path length of 0.1 cm. The His-tagged WW domain variants were dissolved in TBS (25 mM Tris-HCl, 137 mM NaCl, 2.7 mM KCl, pH 7.4) to a final concentration of 50 μM.
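The text does not state the conversion used for the normalization. A minimal sketch follows, assuming the standard conversion from observed ellipticity in millidegrees to molar ellipticity, [θ] = θ_obs / (10 · c · l), with c in mol/L and l in cm. The function name and the 10 mdeg reading are hypothetical; the concentration (50 μM) and path length (0.1 cm) are those given above.

```python
def molar_ellipticity(theta_mdeg, conc_molar, path_cm):
    """Convert observed ellipticity (mdeg) to molar ellipticity (deg cm^2 dmol^-1).

    Assumes the standard relation [theta] = theta_obs / (10 * c * l),
    with c in mol/L and l in cm; this is not stated in the source.
    """
    return theta_mdeg / (10.0 * conc_molar * path_cm)

# Hypothetical reading of 10 mdeg at the conditions used in this study.
print(molar_ellipticity(10.0, 50e-6, 0.1))
```

Dividing the result by the number of peptide bonds would give the mean residue ellipticity instead, should that normalization be preferred.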

Gel filtration experiments were performed with a SMART system using a Superdex 75 column (Amersham Biosciences) at room temperature. The absorbance data were recorded at 280 nm. A total volume of 25 μl of 10 μM sample was injected and run at a flow rate of 80 μl/min using HBST as a running buffer. Standards used for molecular weight calibration were bovine albumin (66 kD), carbonic anhydrase (29 kD), cytochrome c (12.4 kD), and aprotinin (6.5 kD) (all from Sigma, St. Louis, MO, USA).

Results

Directed Evolution of the W17F Mutant of the WW Domain

To prepare the W17F mutant gene library, we introduced random mutations into the full length of the W17F gene using error-prone PCR. We found an average of 3.2 amino acid changes per clone in the initial library generated by error-prone PCR. This library was then subjected to selection experiments using ribosome display based on binding to the PY ligand, ePY (EYPPYPPPPYPSG). In ribosome display, rare molecules in the initial library (i.e., those with improved binding) are enriched, and thus the proportion of such molecules is expected to increase through selection rounds. After four rounds of selection and amplification, we observed a clear increase in the amount of recovered mRNA, indicating enrichment of molecules with improved binding. It should be noted that errors could be introduced during the amplification step between the rounds of selection, as we used Taq polymerase for amplification, which does not exhibit a high fidelity (see Materials and Methods for details).

After the fourth round, the enriched pool was cloned and isolated clones were subjected to sequence analysis. None of 51 clones sequenced contained stop codons, deletions, or insertions. No mutations were observed at position F17, indicating that all of the selected clones still lacked the Trp residue in the N-terminus. It should be noted that ribosome display experiments were performed using two different in vitro translation systems: E. coli S30 extract and the PURE system (see Materials and Methods). Nevertheless, we obtained similar clones containing the same prevalent mutations in these two independent experiments, indicating that these prevalent mutations are not neutral but are likely to exhibit compensatory effects. Five of 51 clones were found to have identical mutations, which we named sk1 (K21R/Q35R/M50V) (Fig. 2B), while most of the other clones were different from each other. Several prevalent mutations were identified among the mutations in the above 51 clones. The most frequent mutations were K21R and Q35R, which were found in 29 (56.9%) and 14 (27.4%) of the 51 clones, respectively (Fig. 1B), and were also present in the enriched clone sk1. Both mutated residues are located in the β-sheet structure region, and the Q35 residue makes direct contact with the PY ligand (Macias et al. 1996), while the K21 residue does not (Fig. 1A). We also found prevalent M50V and N51D mutations in the flexible region of the C-terminus of the WW domain (Fig. 1B), which were found in 14 (27.4%) and 6 (11.8%) of the 51 clones examined, respectively.

Fig. 2

Binding properties and the amino acid sequences of the wild type (WT), W17F mutant, and selected variants by ribosome display. The parent W17F mutant is indicated with an arrow. (A) Binding of the wild type, W17F mutant, and 16 variants obtained after four rounds of ribosome display to the ePY ligand evaluated by SPR. Each bar represents the binding signal (RU) at equilibrium with samples at 10 μM. (B) Amino acid sequences of the wild type, W17F mutant, and 16 selected variants. Sequences are aligned according to the RU shown in (A). The regions forming β-sheets are shown (β1, β2, and β3) in boxes. Boldface, underlining, filled triangles, and stars indicate the residues involved in hydrophobic core formation, residues that make direct contact with the gPY ligand, two highly conserved Trp residues, and residues substituted most frequently in the enriched clones, respectively

PY Ligand Binding Properties of Selected Variants

To evaluate the restoration of binding function, we chose 16 representative clones, including sk1, which have different combinations of second-site mutations, and performed preliminary binding analysis by surface plasmon resonance (SPR) (Fig. 2). All 16 clones were purified via the His-tag and then used for the binding assay. Binding assays were carried out by injecting 10 μM samples onto an ePY ligand immobilized sensor chip and the responses at equilibrium were measured. As shown in Fig. 2A, seven selected protein variants showed higher signals than the W17F mutant, among which five variants, including sk1, exhibited even higher signals than the wild type. As the variants containing K21R, Q35R mutations, and mutations in the C-terminal flexible region tended to exhibit stronger signals (Fig. 2B), these mutations are likely to contribute to the restoration of binding affinity. Thus, the rounds of ribosome display successfully enriched variants that showed improved binding in comparison with the parent W17F molecule. About half of the variants showed less binding than W17F, among which four of the selected variants showed negative signals. These are likely to be molecules that do not bind to the target ligand, for example, molecules adsorbed nonspecifically to the solid surface or binding to streptavidin or BSA.

The top four variants (pe2, pe1, sk1, and se3), the wild type, and the W17F mutant were further subjected to experiments to determine the dissociation constant (K_d) after removing the His-tag by enterokinase cleavage. In agreement with the results of the previous binding assay, all variants showed improved affinities to ePY ligand as compared to the W17F mutant (Table 1). The affinities of sk1, pe1, and pe2 were even higher than that of the wild type. The best improvement achieved (pe2) amounted to 9-fold and 3.6-fold compared to the W17F mutant and the wild type, respectively. All three variants with affinities higher than the wild type contained both the K21R and the Q35R mutations (Fig. 2B).

We further characterized the binding specificities of the selected variants by measuring solution affinities for various PY ligands by competition BIAcore analysis (Nieba et al. 1996). The hYAP WW domain is known to bind to proline-rich peptides by recognizing the PPxY motif (where x can be any amino acid residue) (H. I. Chen and Sudol 1995). Here we used gPY ligand (GTPPPPYTVG) and its variants carrying P3I, P6W, P4S, or P4W mutations, termed gPY-P3I, -P6W, -P4S, and -P4W, respectively (Table 1). The gPY ligand was chosen, as comprehensive mutational analysis of this ligand and its binding to the hYAP WW domain have been reported previously (H. I. Chen et al. 1997), and we therefore had sufficient information to design the gPY variants used in this study. The mutations in gPY-P3I and -P6W are outside the recognition motif PPxY, whereas those in gPY-P4S and -P4W are within the recognition motif (Table 1).

Binding affinities of WW variants for each PY ligand are plotted on each axis of a radar plot in Fig. 3, so that the binding properties of each protein appear as a pentagon. The wild type showed a pentagon distorted from equilateral, with bias toward gPY, P6W, and P3I, indicating that its binding is specific for the PPxY sequence, as reported previously (H. I. Chen and Sudol 1995). Unlike the wild type, the W17F mutant showed a nearly equilateral pentagonal plot, indicating that it has broad specificity compared to the wild type. The selected variants excluding se3 showed similar distorted plots, indicating that they have improved affinity to gPY, P6W, and P3I, similar affinity to P4S, and reduced affinity to P4W compared with the W17F mutant. These results indicate that the selected variants acquired a high degree of specificity for the PPxY sequence.

Fig. 3

Radar plot of binding specificities of the selected variants (sk1, se3, pe1, and pe2), wild type, and W17F mutant. The values of –log K_d for each PY ligand are plotted on each axis. The K_d values are listed in Table 1. Arrows show the directions of changes in binding affinity of the selected variants from the W17F mutant. The binding specificity can be evaluated from the shape of the pentagon. See text for details

To quantitatively define the binding specificity for the PPxY motif, we next calculated two ratios of K_d between the gPY and the gPY-P4S or -P4W, both of which contain the mutation within the PPxY motif (Table 1). Binding specificity can be defined as the ratio of K_d between the sequences with the lowest and highest affinity, as described previously for DNA binding specificity (Bulyk et al. 2001). For example, the specificity of the wild-type WW domain for the PPxY motif can be defined as 9 = (1052/123) = (K_d for gPY-P4S/K_d for gPY) or 20 = (2444/123) = (K_d for gPY-P4W/K_d for gPY), while that of the W17F mutant is 3 = (823/272) or 4 = (1047/272), respectively (Table 1). A larger ratio indicates higher specificity, and thus binding of the W17F mutant to the PY ligand is less specific than that of the wild type. All of the selected variants exhibited increased specificity relative to the W17F mutant (Table 1). These results indicate that by replacing the highly conserved Trp with Phe in the hYAP WW domain, both the binding affinity and the specificity for the PY ligand were decreased, and the introduction of compensatory mutations through evolutionary experiments recovered not only the affinity but also the binding specificity.
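The specificity ratios quoted above can be reproduced directly from the K_d values given in the text. The short sketch below does only that arithmetic; the dictionary layout and function name are illustrative choices, and only the wild type and the W17F mutant are included because the values for the selected variants appear in Table 1 rather than in the text.

```python
# K_d values quoted in the text (same units as Table 1).
KD = {
    "wild type": {"gPY": 123, "gPY-P4S": 1052, "gPY-P4W": 2444},
    "W17F": {"gPY": 272, "gPY-P4S": 823, "gPY-P4W": 1047},
}

def specificity(kd_by_ligand, mutated_ligand, reference="gPY"):
    """Ratio of K_d for a motif-disrupting ligand to K_d for the reference ligand."""
    return kd_by_ligand[mutated_ligand] / kd_by_ligand[reference]

for protein, kds in KD.items():
    for ligand in ("gPY-P4S", "gPY-P4W"):
        print(f"{protein}: K_d({ligand})/K_d(gPY) = {specificity(kds, ligand):.1f}")
```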

Biophysical Properties of Selected Variants

The W17F mutant is largely unfolded, as suggested by the lack of downfield or upfield resonances in the 1H nuclear magnetic resonance (NMR) spectrum typically observed with native protein (Koepf et al. 1999a). We carried out a gel filtration experiment and also performed CD spectroscopy to investigate how the structural features of the functionally restored proteins have been altered in comparison with the parent W17F mutant.

The results of the gel filtration experiments indicated that all six proteins were monomeric, and thus the increase in affinity was not due to oligomerization (Fig. 4). The elution volume of the wild-type protein was somewhat lower than expected for natural globular proteins of this size. From the elution volume of the wild type and those of the molecular weight standards, the apparent molecular weight of the wild type (actual MW: 6.6 kDa) was estimated to be 30% larger (8.7 kDa) than that of native globular proteins of the same size, most likely because of the flexible region in both termini of the protein as seen in the NMR structure (Macias et al. 1996). In agreement with the previous report that the W17F mutant is largely unstructured, the elution volume of the W17F mutant was smaller than that of the wild type, and the apparent molecular weight of the W17F mutant was 90% (12.5 kDa) greater than that of a globular protein of the same size. The peaks of the selected variants (sk1, se3, pe1, and pe2) were distributed between those of the wild type and the W17F mutant, and thus the selected variants had lower apparent molecular weights than the unstructured W17F mutant and were therefore more compact than the parent molecule, despite the lack of the Trp17 residue constituting the hydrophobic core in the wild type. Among the selected variants, pe2 showed the largest elution volume, which was almost the same as that of the wild type.
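Apparent molecular weights of this kind are commonly obtained from a calibration line relating log10(MW) of the standards to their elution volumes. The sketch below illustrates that approach under stated assumptions and is not the authors' procedure: the elution volumes are placeholders (the actual volumes are not given in the text), the function names are hypothetical, and only the standard molecular weights and the 6.6, 8.7, and 12.5 kDa values come from the source.

```python
import math

def fit_calibration(standards):
    """Least-squares line log10(MW) = a + b * Ve from (elution_volume, MW) pairs."""
    xs = [ve for ve, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def apparent_mw(ve, a, b):
    """Apparent molecular weight read off the calibration line at elution volume ve."""
    return 10 ** (a + b * ve)

def percent_excess(apparent, actual):
    """How much larger (in %) the apparent MW is than the actual MW."""
    return 100.0 * (apparent / actual - 1.0)

# Standards: MW (kDa) from Materials and Methods; elution volumes (ml) are PLACEHOLDERS.
standards = [(1.10, 66.0), (1.30, 29.0), (1.50, 12.4), (1.66, 6.5)]
a, b = fit_calibration(standards)
print(f"slope = {b:.2f} log10(kDa)/ml (negative: larger proteins elute earlier)")

# Values reported in the text: actual 6.6 kDa; apparent 8.7 kDa (wild type), 12.5 kDa (W17F).
print(f"wild type: {percent_excess(8.7, 6.6):.0f}% larger")
print(f"W17F:      {percent_excess(12.5, 6.6):.0f}% larger")
```

Because the slope is negative, a more expanded (less compact) protein elutes earlier and is assigned a larger apparent molecular weight.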

Fig. 4

Gel filtration analysis of the selected variants (sk1, se3, pe1, pe2), wild type, and W17F mutant. The absorbance data were recorded at 280 nm. Protein concentration was 10 μM in all cases. Elution volumes of the molecular weight standards are shown at the top: carbonic anhydrase (29 kD), cytochrome c (12.4 kD), and aprotinin (6.5 kD)

We used far-UV CD spectroscopy to compare the secondary structures of the selected variants, the wild type, and the W17F mutant in the presence and absence of the ePY ligand (Fig. 5). The spectrum of the wild-type WW domain exhibited the maximum peak at 230 nm as reported previously (Koepf et al. 1999a), resulting from the aromatic contribution to the far-UV CD spectrum (Krittanai and Johnson 1997; Woody 1994). This distinctive feature has therefore been used as a qualitative signature of the three-stranded β-sheet of the WW fold (Dalby et al. 2000; Jiang et al. 2001; Koepf et al. 1999b). The spectra for the W17F mutant and the selected variants showed blue-shifted peaks at 225 nm with peak intensities that were different from each other (Fig. 5A). Shifted peaks at 225–230 nm are often seen in other WW domains (Kraemer-Pecore et al. 2003; Przezdziak et al. 2006). The differences in peak wavelength between the wild type and the selected variants, including the W17F mutant, are likely due to the changes in the environment of the aromatic residues resulting from the lack of N-terminal Trp in these proteins. An NMR study showed that the W17F mutant folds into a well-defined WW domain structure upon ligand binding (Koepf et al. 1999a), and in the CD spectra this resulted in an increase in the peak at 225 nm (see below). Therefore, the shifted peaks at 225 nm can also be used as a signature of the folded WW domain structure.

Fig. 5

Far-UV CD spectra of the selected variants (sk1, se3, pe1, pe2), wild type, and W17F mutant at 25°C in the (A) absence or (B) presence (100 μM) of the PY ligand. In (A), the spectrum of the W17F mutant measured at 80°C is shown with a dashed line. Protein concentration was 50 μM in all cases

The W17F mutant yielded a significant decrease in the peak intensity compared to the wild type, as reported previously (Koepf et al. 1999a). Upon thermal denaturation at 80°C, this peak disappeared completely, resembling a typical random-coil spectrum (Fig. 5A). Thus, the W17F mutant had decreased β-sheet structure of the WW domain, but not a typical random-coil structure, as determined by CD spectroscopy. The spectra of the selected variants (sk1, pe1, and pe2) showed higher intensities of the peaks around 225 nm compared with the W17F mutant, suggesting an increase in their β-sheet structure in comparison to the W17F mutant. The variant se3 had a smaller peak at 225 nm than the W17F mutant, possibly because the se3 sequence contained a mutation at Ile7, which covers the hydrophobic patch and is therefore essential for folding (Macias et al. 1996). Addition of the ePY ligand to the selected variants resulted in an increase in positive ellipticity in the 225- to 230-nm range characteristic of a folded WW domain (Fig. 5B), indicating conformational changes when bound to the PY ligand, as seen in both the wild type and the W17F mutant.

Discussion

In the process of molecular evolution, compensatory mutations are thought to act as a salvage mechanism against deleterious changes in functional properties of proteins (DePristo et al. 2005; Jucovic and Poteete 1998; Kimura 1991). In the present study, we examined whether and to what degree the defective W17F mutant of the WW domain, which carries a substitution at a strictly conserved residue, could be restored through second-site mutations. The results indicated that only a few second-site mutations, such as K21R/Q35R, were sufficient to restore the reduced binding function of the W17F mutant and even to exceed that of the wild type. In addition, improvements in binding specificity and structural properties accompanying functional restoration were also seen, as observed previously (Eaton et al. 1995; Ruan et al. 1998; Starovasnik et al. 1997). Thus, we have shown that second-site mutations not only can compensate for lowered function and structural loss due to replacement of conserved residues, but also can create proteins with improved functionality over the original protein. These observations suggest a possible path taken in protein evolution when a conserved residue is replaced: a protein that has diverged through a deleterious mutation at a conserved residue is not necessarily eliminated but may be maintained through compensatory mutations.

Compensatory mutations are usually defined as a pair of substitutions at different sites in a protein that are individually harmful but in combination restore the original function. This type of effect has been reported to take place between amino acid residues in close proximity in the folded state of a protein (Kimura 1991). In contrast, none of the selected variants in the present study carried mutations at or near the Phe17 residue. In a previous study using T4 lysozyme, most of the functional revertants carried primary-site mutations (Jucovic and Poteete 1998). However, we did not find any primary-site mutations, including simple F17W reversion, probably because F17W reversion requires two specific transversions in the DNA sequence, the occurrence of which is rare. In the wild-type structure, the second-site mutations that were preserved among selected variants, such as K21R, Q35R, and the C-terminal mutations, were all located on the surface of the WW domain and relatively remote from the site of the primary W17F mutation in the hydrophobic patch (Fig. 1A). These mutations are therefore likely to have compensatory effects independent of the site of primary mutation, i.e., additive effects (Wells 1990). It is also possible that such mutations are “global suppressor” mutations, which are known to suppress more than one primary mutation at different sites (Poteete et al. 1997; Shortle and Lin 1985). Among the preserved mutations, only the Q35 residue comes into direct contact with the PY ligand in the three-dimensional structure of the wild-type protein (Macias et al. 1996). Therefore, although the physicochemical details have yet to be determined, the Q35R mutation appears to contribute greatly to the functional restoration in the selected variants.
In contrast, the K21 and C-terminal residues are not located in positions where they can make contact with the hydrophobic core or the ligand in the wild-type three-dimensional structure. Interestingly, improvements in structural properties, such as the amount of secondary structure and the apparent molecular radius, occurred without mutation of residues related to structural stability, such as those in the hydrophobic core. Further investigations are required to determine the individual contributions of these mutations and their effects.
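
The statement above that F17W reversion requires two specific transversions can be checked directly against the standard genetic code, in which Phe is encoded by TTT or TTC and Trp only by TGG. The following is a minimal illustrative sketch, not part of the experimental procedure; the function names are hypothetical, and because the Phe codon actually used in the W17F construct is not stated here, both Phe codons are considered.

```python
# Standard genetic code entries for the two residues involved.
PHE_CODONS = ("TTT", "TTC")
TRP_CODONS = ("TGG",)

PURINES = {"A", "G"}  # C and T are pyrimidines


def classify(a, b):
    """Classify a single-base change a -> b as a transition or transversion."""
    if a == b:
        raise ValueError("no base change")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition;
    # a change between the two classes is a transversion.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def changes(src, dst):
    """List (codon position, from, to, type) for every differing base."""
    return [
        (i + 1, s, d, classify(s, d))
        for i, (s, d) in enumerate(zip(src, dst))
        if s != d
    ]


for phe in PHE_CODONS:
    for trp in TRP_CODONS:
        print(phe, "->", trp, changes(phe, trp))
```

For either Phe codon, the second and third codon positions must both change to G, and each change crosses from a pyrimidine to a purine, which is consistent with the two-transversion requirement described in the text.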

In this study, the W17F mutant, which is defective owing to replacement of a highly conserved residue, was found to have relaxed binding specificity compared with the wild-type protein (Fig. 3). As a result, the W17F mutant showed a higher binding affinity than the wild type toward gPY-P4W, which does not have the consensus sequence of the PY ligand, PPxY. These observations suggest that mutation of the Trp17 residue, a conserved residue in the WW domain, disrupts the structure but generates functional promiscuity through relaxation of binding specificity. Recent studies have also shown that proteins with structural plasticity display functional promiscuity. For example, an antibody with conformational diversity showed multispecificity, and natively unfolded proteins, a number of which have recently been found in eukaryotic cells, have been shown to be multifunctional (James et al. 2003; Tompa et al. 2005). Functional promiscuity generated by such structural plasticity may be an important property of a protein as a starting point for the evolution of proteins with new functions (James and Tawfik 2003).

The results of the present study show that even when a deleterious mutation occurs in a highly conserved residue of a protein during divergent evolution, the resulting variant may be maintained, and its function may even be improved, through compensatory mutations. In addition, proteins whose structure has been disrupted by mutation in conserved residues may become sources of divergent evolution, leading to the generation of families with new functions through functional promiscuity. The above observations indicate that new functional proteins with novel binding specificities can be generated from the W17F mutant. For example, in comparison with the wild type, the W17F mutant is more likely to evolve toward proteins with novel binding specificity for P4W in evolutionary experiments. This may hold when selecting for ligands with structures that are markedly different from the original PY ligand, as the wild type would require global rearrangement of its structure to fit the novel ligand. On the other hand, there may be cases in which the wild type is preferred when selecting for ligands with a conformation similar to that of the original PY ligand, as the wild type already has a rigid structure and requires only a few amino acid changes in the ligand-binding surface for improved affinity. Further investigations will help to clarify the quantitative effects of the structural plasticity of the unfolded molecule on the ability to evolve new functions.