1 Selenocysteine in Chemical Protein Synthesis

The 21st naturally encoded amino acid selenocysteine (Sec, U), the Se-containing analog of cysteine (Cys, C), is present in the three domains of life, Bacteria, Archaea and Eukarya, in the form of selenoproteins. These proteins exploit the unique chemical properties offered by Sec, which is typically located in the protein’s active site. While Sec and Cys are isosteric, key differences in their chemistries allow Sec to perform unique activities within the cell. Compared to sulfur, Se is a better nucleophile and electrophile in nucleophilic exchange reactions [1, 2] due to its higher polarizability. The lower pKa of its selenol compared to the thiol of Cys (5.5 vs. 8.7) [3] means that Sec is mostly deprotonated at physiological pH (selenolate), which further enhances its nucleophilicity. In addition, the considerably lower reduction potential of Sec [4, 5] as compared to Cys means that selenols are easily oxidized upon air exposure and that the resulting diselenides are more resistant to reduction [6, 7].
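
As a rough illustration of the pKa argument above, the Henderson-Hasselbalch relation gives the fraction of each side chain that is deprotonated at a given pH. The short Python sketch below uses the pKa values quoted in the text (5.5 and 8.7). The pH of 7.4 and the function name are assumptions made for illustration, and side-chain pKa values inside a folded protein can differ substantially from these model values.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an acid present as its conjugate base.

    Follows from the Henderson-Hasselbalch relation:
    [A-]/[HA] = 10**(pH - pKa)  =>  fraction = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 7.4        # assumed physiological pH (not specified in the text)
PKA_SEC = 5.5   # selenol pKa quoted in the text
PKA_CYS = 8.7   # thiol pKa quoted in the text

sec = fraction_deprotonated(PKA_SEC, PH)
cys = fraction_deprotonated(PKA_CYS, PH)
print(f"Sec present as selenolate at pH {PH}: {sec:.1%}")
print(f"Cys present as thiolate at pH {PH}:   {cys:.1%}")
```

Under these assumptions, nearly all of the Sec side chain is in the reactive selenolate form, whereas only a few percent of Cys is present as thiolate.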

The chemical similarities between Sec and Cys have been applied, most notably, in chemical protein synthesis. Because the different mechanisms of Sec incorporation in bacteria and mammals [8] make recombinant expression of selenoproteins inefficient, chemical protein synthesis (or semi-synthesis) has become a robust and versatile method for accessing selenoproteins. This method is based extensively on two technologies: solid-phase peptide synthesis (SPPS) [9] and chemical ligation of unprotected peptides, most commonly native chemical ligation (NCL) [10, 11]. Regarded as one of the main tools used in the synthesis of moderately sized proteins (up to ~300 amino acids) [12, 13], classical NCL is initiated when the C-terminal thioester of one peptide undergoes transthioesterification upon attack by the N-terminal Cys of another peptide. A subsequent S→N acyl shift affords a native peptide bond at the ligation site (Fig. 7.1) [10]. In 2001, improved access to selenoproteins was enabled when three separate research groups reported the expansion of NCL to Sec in addition to Cys (Fig. 7.1) [14–16]. While one study suggested that the low pKa of the selenol could be exploited for chemoselective ligation at acidic pH [14], others observed slow Sec ligations due to the sensitivity of selenols to air oxidation and the stability of diselenide bonds in solution [15, 16]. This was later supported by evidence that a free selenol is necessary to initiate ligation [17]. Sec-mediated ligation has also been applied to expressed protein ligation (EPL) [18, 19], in which an expressed segment with a C-terminal intein is converted to a C-terminal thioester that then undergoes NCL with a synthetic peptide. Such an approach was used to prepare a variant of the copper-binding protein azurin, in which the Cys ligand of copper was replaced with Sec (Cys112Sec) [20].

Fig. 7.1 Native chemical ligation (NCL) at Cys or Sec

Due to the rarity of naturally occurring selenoproteins and both chemical and monetary hurdles in accessing selenols during SPPS [21], few researchers in the field investigated Sec in chemical protein synthesis for some time. The breakthrough that refueled interest in the use of Sec in chemical protein synthesis was the development of selective deselenization of Sec to Ala in the presence of unprotected Cys residues [17]. Deselenization’s precursor, desulfurization, expanded the available ligation sites in peptide synthesis by allowing ligation at the rare amino acid Cys; subsequent thiol removal provides the native Ala, which is a much more common residue [22]. As originally suggested [22], this approach was later expanded to other ligation sites (Phe [23], Val [24, 25], Lys [26, 27], Leu [28, 29], Pro [30, 31], Thr [32], Arg [33], Asp [34], and Trp [35]), provided a synthetic “thiolated” amino acid was accessible.

However, both metal-based desulfurization [22] and radical desulfurization removed all unprotected Cys residues in the sequence (Fig. 7.2a) [36]. Only Cys residues that were orthogonally protected with groups appropriate for the specific method could be preserved [37, 38]. Incorporation of thiol protecting groups on amino acid side chains and their subsequent removal inevitably lowered synthetic yields [39].

Fig. 7.2 Deselenization vs. desulfurization reactions. (a) The deselenization of Sec produces Ala while leaving unprotected Cys residues unaffected, whereas desulfurization converts all unprotected Cys residues to Ala. (b) While deselenization under anaerobic conditions produces Ala, under oxygen saturation or in the presence of the mild oxidant oxone, Ser is selectively produced

In contrast to these desulfurization procedures, deselenization is a chemo- and enantioselective reaction (Fig. 7.2a). It enabled conversion of Sec to Ala via incubation in buffer with tris(2-carboxyethyl)phosphine (TCEP) alone, leaving any unprotected Cys residues in the sequence unaffected [17]. As suggested in the original work [17], unique amino acids with β-selenols [40, 41] have also been synthesized to expand ligation sites to other amino acids besides Ala.

In describing selective deselenization, the authors highlighted Se's ability to form a radical species as a key source of its unique reactivity, and noted that no radical initiator was needed for the reaction to proceed to completion [17]. Further evidence for a radical mechanism was presented in later studies, which showed that the reaction rate increases under radical-promoting conditions, such as heat and UV irradiation, as well as in the presence of radical initiators [42]. Conversely, the common radical scavenger sodium ascorbate slowed, and even halted, the reaction [42].

Recent studies also indicated that selective deselenization may be applied to expand ligation reactions to Ser [42, 43]. First described as an unwanted side reaction of deselenization [17], conversion of Sec to Ser was eliminated under anaerobic conditions [42]. This indicated that the Ser product formed via a similar radical mechanism in the presence of molecular oxygen. This was supported by a study of the deselenization of β-selenol-phenylalanine [40], wherein the selenol-to-alcohol conversion was enhanced due to the stability of the benzylic radical formed during deselenization. In support of this hypothesis, saturation with O2 [42] or addition of the mild oxidant oxone [43] under deselenization conditions led exclusively to formation of the Ser product. This approach was chemoselective in the presence of reactive amino acid side chains and in the synthesis of the glycoproteins MUC5AC and MUC4 [43]. Notably, as Ser is one of the most common amino acids in proteins, this constitutes a major step forward in the field of chemical protein synthesis.

To apply selective deselenization in the synthesis of longer proteins (>150 AA), which require the ligation of multiple segments, selenazolidine (Sez) [44] has recently been used as a masked precursor for N-terminal Sec. Sez, the seleno-analog of thiazolidine (Thz) [45], has proven especially useful in the synthesis of proteins with non-strategically placed Cys residues. A key element of this approach is the facile and smooth conversion of Sez to Sec using MeONH2 at low pH. Selective deselenization after NCL affords the natural Ala at the ligation site [44].

Sez was utilized in the total chemical synthesis of a 125-residue protein, human phosphohistidine phosphatase 1 (PHPT1), and of a protein variant containing an unnatural His analog, β-thienyl-L-alanine (Thi), PHPT1 (His53Thi) (Fig. 7.3) [44]. PHPT1’s three Cys residues are located toward the C-terminus (Cys69, Cys71 and Cys73) and, as such, are not strategically placed for multiple ligation steps. Due to both the protein’s size and sequence, PHPT1 served as an appropriate model to demonstrate Sez’s utility for future syntheses [44].

Fig. 7.3 Chemical protein synthesis utilizing Sez as a masked Sec for proteins with non-strategically placed Cys residues. This approach was used recently in the chemical synthesis of two analogs of the protein PHPT1

Replacing sulfur with Se in NCL can also be achieved by replacing the C-terminal peptide thioester with a selenoester. A facile ligation at the Pro–Cys junction, which is typically extremely slow, was executed using a preformed prolyl selenoester peptide. Through a comparative study, α-selenoesters were found to be at least two orders of magnitude more reactive acyl donors than peptide α-thioesters [46]. Recently, a rapid and additive-free ligation between a C-terminal peptide selenoester and an N-terminal Sec-peptide was demonstrated. This ligation was completed in minutes, even at sterically hindered ligation junctions [47].

As described here, the use of Se in chemical protein synthesis has increased as it provides access not only to selenoproteins, but also to a variety of previously difficult-to-access sequences. We envision that these synthetic tools will enable easy synthesis of selenoproteins and will be of great utility in protein chemistry in general.

2 Selenocysteine in Oxidative Protein Folding

Sec incorporation, which has been utilized in chemical protein synthesis to prepare selenoproteins and access long, difficult sequences, has also been applied in the world of protein folding. Sec's low redox potential [4, 5] and pKa [3], as well as its increased nucleophilicity and electrophilicity [1, 2], can enhance thiol-disulfide-like exchange reactions that are essential for protein folding.

Proper folding is critical for protein function. Under suitable in vitro conditions, proteins fold spontaneously into their three-dimensional native states, as all the information required for this process is contained in the primary amino acid sequence [48]. However, for many Cys-rich proteins, the number of possible disulfides increases substantially and, accordingly, the folding process becomes more complicated, with an increased risk of non-native disulfide bond formation (scrambled isomers) and/or the formation of “trapped” intermediates [49–51]. In addition, many Cys-rich proteins also have an increased risk of aggregation due to intermolecular disulfide bond formation. Replacing native Cys residues in the protein with Sec has been demonstrated as a promising strategy to simplify oxidative folding of Cys-rich peptides in vitro without changing their native conformations or biological activities.
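
To make the combinatorial argument concrete, the following Python sketch counts the fully oxidized disulfide isomers available to a peptide with a given number of Cys residues, and the count that remains when one Cys pair is replaced by Sec. It rests on two simplifying assumptions that are ours, not the cited studies': only fully oxidized species are counted, and the diselenide is taken to form exclusively. The function names are hypothetical.

```python
from math import factorial


def disulfide_isomers(n_cys: int) -> int:
    """Number of ways to pair n_cys residues into n_cys/2 disulfides.

    For 2n residues this is (2n)! / (2**n * n!), i.e. 1, 3, 15, 105, ...
    Only fully oxidized isomers are counted.
    """
    if n_cys < 0 or n_cys % 2:
        raise ValueError("n_cys must be a non-negative even number")
    n_pairs = n_cys // 2
    return factorial(n_cys) // (2 ** n_pairs * factorial(n_pairs))


def isomers_with_fixed_diselenide(n_cys_total: int) -> int:
    """Isomer count after replacing one Cys pair with Sec.

    Idealization: the Sec pair always forms the diselenide, so only the
    remaining n_cys_total - 2 Cys residues are free to pair with each other.
    """
    return disulfide_isomers(n_cys_total - 2)


for n in (4, 6, 8):
    print(f"{n} Cys: {disulfide_isomers(n)} isomers, "
          f"{isomers_with_fixed_diselenide(n)} with one diselenide fixed")
```

Under these assumptions, a peptide with two disulfides drops from three possible isomers to one, and a peptide with three disulfides drops from fifteen to three, which is one way to rationalize the reduced scrambling reported for seleno-analogs.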

The use of Cys-to-Sec substitution in protein folding was first applied to study the 21-residue endothelin-1 (ET-1) [52], a peptide with potent vasoconstrictor activity and two disulfide bonds. An analog was prepared via SPPS in which the residues of one native disulfide pair were replaced with Sec residues. Notably, the preferred formation of the diselenide bond reduced the likelihood of formation of scrambled isomers. The function and structure of the folded seleno-analog, [Sec3,Sec11,Nle7]ET-1, was almost indistinguishable from that of wild type ET-1.

After Sec’s utility in ET-1 folding was established without affecting bioactivity, apamin, an 18-residue toxin with four Cys residues, was studied. Although its native structure was known, its folding pathway was ambiguous. Accordingly, three apamin analogs were synthesized [53, 54] with strategic Cys-to-Sec substitutions mimicking the three possibilities for the first disulfide bond formation: crossed (globule), parallel (ribbon), or consecutive (bead). Through oxidative folding, the diselenide intermediate was trapped and, using CD and NMR analysis, the isomer most likely to lead to the native structure was predicted. Thus, through Cys-to-Sec substitution, a greater understanding of the folding pathway of apamin was reached [52].

Sec incorporation has also been applied to proteins with folded structures that are notoriously complex. In 2006, this technique was applied by synthesizing α-selenoconotoxins as an alternative to naturally occurring conotoxins with therapeutic properties. The synthetic seleno-analogs were more stable than wild type, implying that they could be useful in designing stable scaffolds for peptide-based drugs [55]. Sec incorporation was also used to investigate the conotoxin ω-GVIA [56, 57], whose native structure contains three disulfides and for which yields of the natively folded protein tended to be low. To investigate the effect of Cys-to-Sec substitution on ω-GVIA’s folding, Bulaj synthesized three analogs, each of which had one native disulfide replaced with a diselenide [56]. C8U/C19U ω-GVIA folded with the greatest efficiency. Substituting a diselenide for a native disulfide in GVIA had no detectable effect on function, while markedly accelerating the kinetics of folding and increasing the yield of the properly folded analogs [56].

This strategy has been extended to other disulfide-rich peptides with medical applications, such as μO-conotoxin MrVIB, which targets voltage-gated sodium channels (VGSCs). The availability of μO-conotoxin MrVIB is limited not only by its scarcity in natural sources, but also by its hydrophobic nature and tendency to aggregate during isolation and folding. Similar to previous studies, synthetically prepared seleno-analogs of MrVIB with native diselenides [58] showed a decrease in the number of isomers formed with incorrect crosslinks and an increased yield of the natively folded analogs. As an added benefit, a study of the biological activity of the seleno-MrVIB analogs showed that in TTX-R sodium channels, C2U/C20U MrVIB exhibited slightly higher potency and greatly enhanced selectivity when compared to wild type MrVIB [58].

Cys-to-Sec substitution was also applied to the folding of bovine pancreatic trypsin inhibitor (BPTI), a 58-residue protein with three disulfide bonds (5–55, 14–38, 30–51). BPTI’s folding occurs via a bifurcated pathway whose intermediates contain only native disulfide bonds, and is considered a model for many other protein folding pathways [50, 51]. In contrast to previous Cys-to-Sec studies, a non-native disulfide was replaced with a diselenide in the hope of altering the population of intermediates. During folding, C5U/C14U BPTI indeed bypassed two long-lived intermediates present in the folding of wild type BPTI, N*(5–55;14–38) and N′(14–38;30–51), and, as a result, the rate of folding was greatly enhanced [66]. Further studies [67] showed that replacing only one Cys with Sec, rather than a disulfide pair, successfully accelerated folding. However, the intermediate N*(5–55;14–38) was still present during folding. Finally, replacing the native 14–38 disulfide crosslink with a diselenide was also successful in accelerating folding. However, the solvent-exposed selenols caused the protein to undergo aggregation and lowered the overall yield.

In addition to employing Sec substitution as an intramolecular catalyst for protein folding, small-molecule diselenides have been used as intermolecular catalysts for oxidative protein folding. Because they are easily re-oxidized after reduction, diselenides have long been known to enable thiol/disulfide exchanges, even when present in catalytic amounts [59]. As such, a variety of small-molecule diselenides have been developed and used in the folding of challenging proteins both in vitro and in vivo. The most studied molecule, selenoglutathione (GSeSeG) (Fig. 7.4), is the seleno-analog of natural glutathione, which acts as a redox buffer in vivo and in vitro [6, 7, 60, 61]. Folding studies with GSeSeG have been performed with a variety of biologically relevant proteins such as BPTI, hirudin, lysozyme, human epidermal growth factor and interferon α-2a. This technology has also been applied in the folding of more challenging proteins such as bovine serum albumin, which contains 17 disulfide bonds, and the antigen-binding (Fab) fragment of the antibody MAK33 [61]. In all cases, the diselenide molecules facilitated the folding process of these proteins and resulted in higher yields, perhaps through rescuing trapped intermediates [62]. Recently, a new class of small-molecule diselenides, either commercially available or readily prepared (Fig. 7.4), was found to be as effective as GSeSeG in promoting oxidative protein folding [7, 63]. As difficult-to-fold proteins are revisited or discovered, the chemistry of selenols in the form of Sec residues and small seleno-molecules will continue to contribute to the field of protein folding.

Fig. 7.4 Reduced (GSeH) and oxidized (GSeSeG) selenoglutathione, the selenium analogs of the common folding redox buffer components GSH and GSSG. Other, smaller diselenides, 1–4, have recently been tested as additives for oxidative protein folding

3 SEP15 and SELM in In Vivo Protein Folding

Sec is not only utilized to optimize protein folding in vitro, but is also suspected to play a role in protein folding pathways in vivo. Of the 25 known genes encoding human selenoproteins [64], seven encode proteins located in the endoplasmic reticulum (ER): the 15 kDa selenoprotein (SEP15), selenoprotein M (SELM), SELS, SELK, SELN, SELT and DIO2 [65]. Because protein folding and disulfide bond formation occur in the ER, it has been proposed that these selenoproteins, or at least some of them, are involved in the formation, isomerization (shuffling) or reduction of incorrect disulfide bonds of proteins undergoing folding in the cell. Two of the better-studied selenoproteins in the ER are SEP15 and SELM. These proteins share 31% sequence identity and both have structures that are highly similar to the thioredoxin fold [66]. However, their exact biological functions are still unknown.

SEP15 was first described in human T-cells and its tissue distribution suggests it is most common in prostate and liver [67, 68]. This 162-residue protein contains Sec in a conserved CXXC-like motif found in the thiol-disulfide family of oxidoreductases. Interestingly, the motif in SEP15, CXU, is one amino acid shorter than similar motifs in comparable proteins. While SEP15 is found in the ER, it does not itself contain an ER retention sequence. A major breakthrough in uncovering its probable function was published in 2001, when SEP15 was found to form a tight 1:1 complex (KD of 20 nM) with the ER protein UDP-glucose:glycoprotein glucosyltransferase (GT) [69]. GT is responsible for specific glycosylation of misfolded proteins in the ER [70], which are then folded by the calnexin/calreticulin glycoprotein folding system [69]. SEP15’s Cys-rich domain at its N-terminus, a domain conserved from plants to humans, is an essential element of the SEP15:GT complex [71]. Therefore, SEP15’s presence in the ER can be attributed to its interaction with GT.

Based on the above findings, it was proposed that SEP15 plays a role in the detection and refolding of glycoproteins with erroneously formed disulfide bonds. Through utilization of its CXU motif as a reductant or oxidant, refolding of the misfolded glycoprotein marked by GT is initiated [71]. It has also been suggested that SEP15 could be related to the regulation of the enzymatic activity of GT [71].

The distantly related, homologous selenoprotein, SELM, was discovered after SEP15, but even less is known about its activity [72]. Similar to SEP15, SELM contains the redox-active motif CXXU. While it is expressed as a 145-residue protein, the mature SELM consists of 122 amino acids after its N-terminal 23-amino acid signal peptide is cleaved in the cell [72]. In contrast to SEP15, SELM is found mostly in the brain [72], contains an ER retention sequence at the C-terminal end [72], and does not interact with GT [71].

Additional evidence for thioredoxin-like activity of SEP15 and SELM was found in the NMR structures of wild type SEP15 from Drosophila melanogaster and the U48C mutant of SELM from Mus musculus [73]. The study revealed that the two proteins share the characteristic α/β-fold of the thioredoxin superfamily. Furthermore, the active-site Sec residues of SEP15 and SELM were in locations similar to that of the catalytic Cys in thioredoxin [73].

The exact roles of SELM and SEP15 are still unknown (see Chap. 19), but possible functions have been suggested. Localization in the ER, which is an oxidizing environment responsible for disulfide bond formation, and the presence of a redox motif similar to that of thiol-disulfide oxidoreductases indicate their potential thiol-disulfide oxidoreductase activity [74]. Additionally, the identity of the X amino acids in the CXXC-like motif can determine both the redox potential of the enzyme and its function as a thiol-disulfide reductase, oxidase, or isomerase [75].

While the role of Se in the active site of SEP15 and SELM is still a mystery, evidence of the impact of Sec in a similar context was presented in a study of the thiol-disulfide oxidoreductase, glutaredoxin 3 (Grx3) [5]. Through synthesis and analysis of the wild type and three seleno-analogs of Grx3, it was shown that Cys-to-Sec substitution in the active site motif CXXC influenced the redox potential dramatically, leading to an average difference of ~73 mV and an almost two orders of magnitude higher rate of thiol–disulfide exchange reactions [5]. The latter result was supported by studies of a Grx1 seleno-analog [76]. These findings provide suggestive implications toward the function of selenoproteins with similar redox motifs.
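
To give a sense of scale for a redox potential difference of ~73 mV, the Nernst relation can be used to convert it into a free-energy change and an equilibrium ratio. The Python sketch below is only a back-of-the-envelope estimate: it assumes a two-electron dithiol/disulfide-type couple and a temperature of 25 °C, neither of which is stated in the text, and the function names are hypothetical.

```python
from math import exp

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def delta_g_kj_per_mol(delta_e_volts: float, n_electrons: int = 2) -> float:
    """Free-energy change for a potential difference: dG = -n * F * dE."""
    return -n_electrons * F * delta_e_volts / 1000.0


def equilibrium_ratio(delta_e_volts: float, n_electrons: int = 2,
                      temp_kelvin: float = 298.15) -> float:
    """Equilibrium constant from the Nernst relation: K = exp(n*F*dE / (R*T))."""
    return exp(n_electrons * F * delta_e_volts / (R * temp_kelvin))


DELTA_E = 0.073  # V, the ~73 mV average difference quoted in the text

print(f"dG ~ {delta_g_kj_per_mol(DELTA_E):.1f} kJ/mol")
print(f"K  ~ {equilibrium_ratio(DELTA_E):.0f}")
```

With these assumed parameters, a 73 mV shift corresponds to roughly 14 kJ/mol and an equilibrium ratio of a few hundred, which illustrates why such a difference is described as dramatic.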

Determining the redox potential of selenoproteins in general, and that of SEP15 and SELM in particular, will be a fascinating step forward in our understanding of their cellular functions. Future studies on these ER-resident selenoproteins should be directed to provide supporting evidence for their proposed function in protein folding.

4 Concluding Remarks

The impact of Sec chemistry in the world of protein synthesis has been wide and deep. Sec incorporation via solid-phase peptide synthesis has opened the door to the study of synthetic selenoproteins, while a fuller understanding of the different chemistries of selenols and thiols has enabled the discovery and optimization of selective deselenization in the presence of unprotected Cys residues. The additional development of selenazolidine as a masked Sec precursor allowed for the synthesis of yet longer proteins with non-strategically placed Cys residues. The incorporation of Sec into standard chemical biology techniques can provide access to proteins, both with and without Se, that were previously challenging to synthesize.

In addition, Sec incorporation has afforded a more thorough study of protein folding and activity. Therapeutic peptides, whose Cys-rich sequences’ complicated folding pathways and structures led to notoriously low yields during synthetic production, have seen dramatic improvement of yield and efficiency through Cys-to-Sec substitution methods. Small diselenide molecules with straightforward synthetic schemes have also been shown to enhance protein folding, even when present in catalytic amounts. These findings provide insights not only to Cys-rich peptides in vitro, but also promise to elucidate the role of Sec-containing peptides in vivo, such as the natural ER-resident selenoproteins SELM and SEP15. As the world of selenopeptide chemistry continues to advance, we expect it to promote great strides in the world of chemical biology.