Introduction

The thioredoxin (Trx) fold consists of a four-stranded β-sheet surrounded by three α-helices (Fig. 1) [1]. The Trx superfamily comprises proteins with different functions, characterized by the presence of at least one Trx fold [2–4]. The main components of the superfamily are the ubiquitous Trxs and glutaredoxins (Grxs) [5], the bacterial disulfide bond forming (Dsb) proteins [6], the eukaryotic protein disulfide isomerase (PDI) and its homologs [7], the thermophilic protein disulfide oxidoreductases (PDOs) [8], and the eukaryotic flavoprotein quiescin-sulfhydryl oxidase (QSOX) [9, 10], all involved in thiol-disulfide exchange reactions. The superfamily (SCOP: http://scop.mrc-lmb.cam.ac.uk/scop/) also includes glutathione S-transferase (GSH-transferase), which catalyzes the conjugation of GSH to electrophilic substrates, the bacterial Escherichia coli arsenate reductase (ArsC) [11], catalyzing the reduction of arsenate, the glutathione peroxidase (GPX) and the peroxiredoxins (Prxs), which are involved in the reduction of hydroperoxides [12], and finally the chloride intracellular channel (CLIC) proteins and the copper-ion binding protein Sco1 [13, 14].

Fig. 1 The Trx fold and its variations in different components of the Trx superfamily. The basic Trx fold of Grxs is shown in a; b–f show modifications of the Trx fold in other thiol oxidoreductases: b Trx, c A. pernix PDO-N unit, d yeast PDI a domain, e S. solfataricus Bcp1 and f GSH-transferase. Additional secondary structural elements with respect to the basic Trx fold are shown in black. β-sheets are drawn as arrows and α-helices as cylinders

These proteins do not present a high level of sequence similarity, although they show considerable structural similarity thanks to the presence of the common Trx fold. However, multiple variations of this fold can be observed (Fig. 1). Grx is the only member of the family with the basic Trx fold; an additional β-strand and α-helix are inserted at the N-terminus in Trxs [1, 3]. An insertion of secondary structure elements between the second β-strand and the second α-helix is very common in several proteins, such as Prxs, GPXs, ArsC, and DsbA-like enzymes. In the latter two cases, this insertion consists of four or five α-helices. Further modifications, such as insertions at the C-terminus, can be observed in GSH-transferase [15].

Members of the Trx superfamily involved in the thiol-disulfide exchange reaction all have an active site containing the CXXC motif and share the same catalytic mechanism. In the CXXC motif, always located at the N-terminus of helix α1, the N-terminal cysteine, referred to as the nucleophilic cysteine [13], is deprotonated at physiological pH, largely exposed and consequently hyper-reactive. By contrast, the C-terminal cysteine is buried and is usually protonated [13]. Other hallmarks of this subgroup are the following conserved residues: (1) a cis-proline located in the loop region before strand β3 and juxtaposed to the active site, which is implicated in substrate binding [16, 17]; (2) a conserved proline in the middle of helix α1, which introduces a kink in the helix; (3) charged residues in the vicinity of the active site, which are implicated in proton-transfer reactions required for the redox mechanism [18, 19]. The nature of the amino acids between the two cysteines varies considerably among the members of the superfamily, influencing the pKa of the nucleophilic cysteine and consequently the redox properties of these proteins [20]. Indeed, the redox potential ranges from −270 mV of the reductant Trx to −95 mV of the oxidant DsbL [21, 22], leading to several functional differences [23]. A recent work by Ren et al. [23] showed that protein functions are also influenced by the loop containing the cis-proline: modifications in the residue preceding the cis-proline play a crucial role both in the redox properties and in regulating the ability to interact with substrates.
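The link between the pKa of the nucleophilic cysteine and its reactivity can be made concrete with the Henderson–Hasselbalch relation. The following minimal Python sketch computes the fraction of a thiol present as the reactive thiolate at a given pH. The function name and the pKa values in the usage section are illustrative assumptions, not values taken from the cited studies.

```python
def thiolate_fraction(pka: float, ph: float = 7.0) -> float:
    """Fraction of a thiol group that is deprotonated (thiolate) at a given pH.

    Henderson-Hasselbalch: [S-] / ([SH] + [S-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Hypothetical pKa values chosen only to illustrate the trend:
    # the lower the pKa, the larger the thiolate population at neutral pH.
    for label, pka in [("lowered pKa (assumed)", 5.0),
                       ("intermediate pKa (assumed)", 7.0),
                       ("unperturbed-like pKa (assumed)", 9.0)]:
        print(f"{label}: {thiolate_fraction(pka):.3f} thiolate at pH 7.0")
```

Under these assumptions, a shift of two pH units in pKa changes the thiolate population by roughly two orders of magnitude, which is why small changes in the residues between the two cysteines can have large functional consequences.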

The canonical CXXC motif can be modified with regard to the number and position of the cysteines involved in catalysis, with consequent variations in catalytic function. For example, the C-terminal cysteine can be substituted by other residues, as in monothiol Grxs [1]; the two cysteines can be separated by three or four residues, as in Sco1 [24]; or both cysteines can be substituted by other residues, with consequent loss of redox activity, as in calsequestrin or some PDI homologs [2, 9, 10, 25, 26].
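The motif variants described above can be expressed as simple sequence patterns. The sketch below, in Python, classifies an active-site segment and scans a sequence for canonical CXXC motifs. The function names, the category labels, and the toy sequence are assumptions introduced for illustration; real motif annotation would also take structural context into account.

```python
import re

# Patterns over one-letter amino acid codes; "X" in CXXC means any residue.
CANONICAL = re.compile(r"C[A-Z]{2}C")            # CXXC
SPACED = re.compile(r"C[A-Z]{3,4}C")             # cysteines three or four residues apart
MONOTHIOL = re.compile(r"C[A-Z]{2}[ABD-Z]")      # C-terminal cysteine replaced
NO_CYS = re.compile(r"[ABD-Z][A-Z]{2}[ABD-Z]")   # both cysteines replaced


def classify_active_site(segment: str) -> str:
    """Assign an active-site segment to one of the variants named in the text."""
    s = segment.strip().upper()
    if CANONICAL.fullmatch(s):
        return "canonical CXXC"
    if SPACED.fullmatch(s):
        return "cysteines separated by three or four residues"
    if MONOTHIOL.fullmatch(s):
        return "monothiol (C-terminal cysteine replaced)"
    if NO_CYS.fullmatch(s):
        return "both cysteines replaced (redox inactive)"
    return "unclassified"


def find_cxxc(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based position, motif) for every CXXC occurrence, overlaps included."""
    # A lookahead is used so that overlapping motifs are all reported.
    return [(m.start(), m.group(1))
            for m in re.finditer(r"(?=(C[A-Z]{2}C))", sequence.upper())]


if __name__ == "__main__":
    print(classify_active_site("CPYC"))   # Grx-type sequence quoted in the text
    print(classify_active_site("CGFS"))   # a monothiol-type segment
    print(find_cxxc("AAWCGHCKAA"))        # toy sequence, not a real protein
```

Note that the monothiol and cysteine-free patterns are only meaningful when applied to a segment already known to align with the active site, since on their own they would match almost any stretch of sequence.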

Taken together, these observations highlight the fact that the Trx fold is very versatile. Such versatility makes the proteins of the Trx superfamily suitable to perform several catalytic functions in numerous processes, ranging from protein folding to detoxification and metabolite synthesis [27].

During evolution, the assembly of different Trx fold domains has often been used to build new proteins. In particular, several examples are available where the duplication of the Trx fold has occurred generating multi-domain proteins [14]. However, although the Trx superfamily has received widespread attention in the literature, few studies have made a comparative structural and functional analysis of the multi-domain proteins. Thus, to provide insights into the mutual interaction of different Trx domains within a multi-domain structure and how their combination can influence physiological functions, this review will focus on the most representative proteins of the Trx superfamily containing at least two catalytically active Trx domains: the eukaryotic PDIs [28], the thermophilic PDOs [29] and the hybrid Prxs [30].

An overview of the PDI family

To fold in the oxidative environment of the endoplasmic reticulum (ER), proteins need to form the correct disulfide bonds. For this purpose, the cell produces a large number of proteins, belonging to the protein disulfide isomerase (PDI) family, which catalyze the oxidation and isomerization of disulfide bonds [7, 28]. This protein family comprises PDI and PDI-like proteins, with four members in Saccharomyces cerevisiae and 19 members in mammals so far identified [7] (Fig. 2). The reason for such a large number of PDI family members, and how they cooperate in protein folding, remains unclear. One hypothesis is that each family member has the ability to act on a specific set of substrates and/or that each enzyme catalyzes a precise type of reaction [39, 40].

Fig. 2 The human PDI family. Catalytic domains are shown in white, non-catalytic ones in dark grey. The x and transmembrane regions are shown in black and light grey, respectively. PDB codes are given only for mammalian and fungal members of the family

PDI family members differ in tissue distribution, substrate specificity, and ability to catalyze dithiol-disulfide exchange reactions [41], but all contain an ER-localization motif and at least one Trx-like domain, which can be either catalytic (hereafter indicated as domain a) or non-catalytic (hereafter indicated as domain b) (Fig. 2). The CXXC motif is a hallmark of the catalytic domains. If it is in the reduced state, substrate disulfide can be reduced (Fig. 3a); if it is in the oxidized state, the disulfide can be transferred to the substrate protein and the PDI active site released in its reduced state (Fig. 3b). The isomerization of wrongly formed disulfides (Fig. 3c) can occur directly or indirectly, via several reduction–oxidation cycles, until the substrate disulfide becomes sufficiently stable to resist reduction by PDIs [7].

Fig. 3 Schematic representation of reactions catalyzed by PDI (substrates in white, PDI in black): a reduction, b oxidation, c isomerization. The last reaction can occur directly (C1) or indirectly, via several reduction–oxidation cycles (C2)

Many research efforts have been devoted in recent years to understanding the structure and enzymatic properties of PDIs. Below, the most abundant and best characterized members of the family, namely PDI, Erp57, and Erp72, will be reviewed both from a structural and functional point of view.

PDI

PDI is the best characterized member of the PDI family and is highly conserved in all eukaryotic cells [28]. Besides its role in protein-folding catalysis, PDI operates as a chaperone by preventing aggregation of proteins that do not contain cysteine residues [42], is the β-subunit of prolyl-4-hydroxylase [43], is a subunit of microsomal triglyceride transfer protein [44] and is involved in the regulation of NAD(P)H oxidase [45].

In vitro PDI catalyzes the reduction, oxidation, and isomerization of disulfide bonds in a large range of substrates [46–48], while in vivo it is more likely to act only as an isomerase and as an oxidase [28]. Since the latter activity leaves PDI in the reduced form, a further source of oxidizing equivalents is necessary to complete the catalytic cycle. The proteins responsible for this oxidation in vivo are Ero1-Lα and Ero1-Lβ in mammalian cells [49, 50], and Ero1p [51–53] or Erv2p [54–56] in yeast. These proteins couple disulfide bond formation to the reduction of molecular oxygen, with the help of the cofactor flavin adenine dinucleotide (FAD) [52, 53]. However, it cannot be excluded, as was thought for several years, that GSSG may also provide oxidizing equivalents to PDI. Indeed, GSSG together with GSH forms a redox buffer that helps to maintain ER redox homeostasis and can influence the in vivo redox state of PDI [57].

Few PDI natural substrates have so far been identified [58–62], even if the promiscuity of the enzyme in vitro is expected to be reflected also in vivo [28].

PDI is a multi-domain protein with two catalytic domains, a and a′, each containing the WCGHC sequence, separated by two non-catalytic domains b and b′. In addition, between b′ and a′ there is a linker region named x, and at the C-terminus a highly acidic extension is present, termed c, which contains the ER-localization motif (Fig. 2).

Catalytic assay experiments on PDI mutant forms and on individual domains or their combinations have provided insights into the contribution of the catalytic domains to the different functions of the enzyme [63–69]. These studies demonstrated that domains a and a′ operate independently, in that each catalytic center alone imparts activity to the enzyme [63, 69], even though mammalian and yeast PDI differ in the contribution of each domain to overall catalytic activity. Specifically, in yeast PDI in vitro experiments, where the Ero1p pathway was reconstituted, the a domain functioned mainly as an isomerase, while a′ presented mostly an oxidative activity [70]. These results were also in agreement with previously reported in vivo experiments [69, 71]. More ambiguous results were obtained for human PDI. In this case, in vitro experiments showed that at substrate concentrations near the KM the two active sites presented comparable isomerase activity [63], while at saturating concentration of substrate the isomerase activity was mainly carried out by the a domain [65]. Subsequent experiments which reconstituted the Ero1-Lα/PDI oxidative folding system demonstrated surprisingly that both oxidase and isomerase activity were mainly performed by the a′ domain, and that this domain was preferentially oxidized by Ero1-Lα [72]. In the same study, the authors showed also that the functional asymmetry of the active sites was not to be ascribed to their intrinsic catalytic properties, but rather to the specific order of the two domains in the full-length PDI [72]; indeed if the position of the two catalytic domains was swapped, domain a became more active than a′ [72].

Insights into molecular determinants of substrate recognition were also obtained, showing that the b′ domain was sufficient for binding small peptides and domains a and a′ were necessary for binding the larger proteins [73]. Within the b′ domain, the proposed ligand-binding site is a small hydrophobic pocket, defined by residues Leu242, Leu244, Phe258, and Ile272 (numbering refers to human PDI). They are all implicated in substrate binding with a major role for Ile272 [74].

These studies were in line with other experiments showing that simple oxidation reactions required only a and a′, simple isomerization reactions required a linear combination of b′ with either a or a′, while complex isomerization reactions were only catalyzed by full length PDI (excluding the c region) [75].

Although several structures of individual PDI domains have been determined (Fig. 2), only the three-dimensional structure of the full-length yeast PDI (PDB code: 2b5e) showed how the individual domains were arranged with respect to one another [76]. Analysis of this structure revealed a monomeric enzyme in which the four domains were organized in a twisted ‘U’ shape, with a and a′ active sites facing each other and domains b and b′ forming a rigid base [76] (Fig. 4). The inner surface of both domains b and b′ comprises the large hydrophobic region involved in substrate interaction [73]. The c region is located at the top of domain a′ pointing toward the external part of the U-shaped molecule [76]. The a and a′ catalytic domains present a structure very similar to Trx, while domains b and b′ adopt some variations with respect to it. Domains a and a′ share with other Trx family members all the structural features and residues important for catalysis. However, one of their unique features is the presence of an arginine residue (corresponding to Arg126 in domain a and Arg471 in domain a′), which is critical for the catalytic function [77, 78]. Indeed, this residue resolves the paradox of the different requirement for a high pKa (when the oxidation of the substrate has to take place) and a low pKa (when the reoxidation of the PDI occurs to complete the catalytic cycle) of the C-terminal cysteine. Specifically, this arginine undergoes a conformational change, from an ‘out’ to an ‘in’ conformation, to destabilize and stabilize the thiolate form of the C-terminal cysteine [77, 78].

Fig. 4 Overall fold of yeast PDI. The a domain is shown in red, the b domain in blue, the b′ domain in green, the x region in orange, the a′ domain in cyan and the C-terminal region in magenta. Cysteine residues of catalytic sites are colored in yellow and shown in stick representation

Interestingly, the two active sites were found in the structure in two different redox states: a oxidized and a′ reduced. This finding was in agreement with redox potentials of two active sites (−188 and −152 mV for domains a and a′, respectively) and with theoretical pKa calculations which indicated that the pKa of the N-terminal cysteine of the a domain was slightly higher than that of a′ [79].
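The agreement between the observed redox states and the measured potentials can be checked with a short calculation based on the Nernst relation. The Python sketch below estimates the equilibrium constant for disulfide transfer between two thiol/disulfide couples from their redox potentials. It assumes a two-electron exchange, a temperature of 25 °C, and that the quoted potentials refer to comparable standard conditions; none of these conditions is specified in the text, and the function name is hypothetical.

```python
import math

R = 8.314462618    # gas constant, J mol^-1 K^-1
F = 96485.33212    # Faraday constant, C mol^-1


def exchange_equilibrium_constant(e_donor_mv: float, e_acceptor_mv: float,
                                  temperature_k: float = 298.15,
                                  n: int = 2) -> float:
    """K for: donor(reduced) + acceptor(oxidized) <-> donor(oxidized) + acceptor(reduced).

    K = exp(n * F * (E_acceptor - E_donor) / (R * T)), with potentials given in mV.
    """
    delta_e_v = (e_acceptor_mv - e_donor_mv) / 1000.0
    return math.exp(n * F * delta_e_v / (R * temperature_k))


if __name__ == "__main__":
    # Potentials quoted in the text for the yeast PDI a and a' domains.
    k = exchange_equilibrium_constant(e_donor_mv=-188.0, e_acceptor_mv=-152.0)
    print(f"K (a reduced + a' oxidized -> a oxidized + a' reduced) = {k:.1f}")
```

Under these assumptions K is greater than 1, so the state with a oxidized and a′ reduced is thermodynamically favored, in line with what was observed in the crystal structure. The same function can be applied to the −270 mV and −95 mV extremes quoted for Trx and DsbL to see how strongly the two ends of the superfamily differ.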

More recently, the full-length yeast PDI has been crystallized (PDB code: 3boa) at a different temperature and analysis of this structure has revealed a dimeric arrangement of the enzyme, with the extensive intermolecular contacts between the two monomers primarily mediated by their b′ domains [80]. Comparison with the previous structure shows that in this case a drastic rotation and translation of the a domain takes place, with the result that the two active sites no longer face each other and the substrate-binding site in the b′ domain becomes buried. This conformational change is allowed principally by the presence of a flexible loop connecting the a and b domains [80]. These data suggest that PDI can exist in two different states: the monomeric form represents the ‘active’ state, because both active sites and the substrate binding site are accessible, while the dimeric form represents the ‘inactive’ state of the enzyme, with the two active sites far apart and the substrate binding site buried [80]. Further support for this hypothesis comes from the observation that the recombinant human PDI b′x domain can assume two conformations corresponding to different oligomeric states. In the first conformation, corresponding to a monomeric form, the x region interacts with domain b′, ‘capping’ its hydrophobic site, while in the second conformation, corresponding to a dimeric form, the x region is far removed from the b′ domain, which becomes less buried and can promote the formation of the dimer [81–83].

Taken together, these data suggest that PDI is a flexible molecule and this flexibility is important to perform enzymatic activity. In line with this observation, recently reported small-angle X-ray scattering (SAXS) and spectroscopic experiments on PDI from Humicola insolens demonstrated that oxidation of the a′ domain causes a change in the relative orientation of a′ and b′ domains, with a consequent modification in the exposure of the hydrophobic surface responsible for substrate binding [34]. On the basis of these data, it has been proposed that when the a′ domain transfers its own disulfide bond to the substrate, positioned on the hydrophobic surface of the binding site, the enzyme adopts a ‘closed’ conformation, liberating the oxidized substrate [34].

Erp57

Erp57 is another member of the PDI family characterized in depth in recent years, which was shown to efficiently catalyze both in vivo [59, 60, 84, 85] and in vitro [78, 86, 87] disulfide reduction, disulfide isomerization, and dithiol oxidation in substrate proteins [7]. As already observed for PDI, recent experiments suggest that the oxidizing equivalent required for oxidation could be provided in vivo by Ero1Lα [70, 85].

Erp57 is the closest homolog of PDI, sharing with it 27% sequence identity and showing the same domain composition and the same active site motif in both catalytic domains (Fig. 2). Major differences between these two enzymes are located in the c region, which is extremely acidic in PDI, while it contains several lysine residues in Erp57 [28].

Several biochemical studies have identified the substrate specificity of Erp57: it specifically catalyzes the isomerization of non-native disulfide bonds in glycoproteins with unstructured disulfide-rich domains [85, 88]. To meet this challenge, Erp57 interacts with two ER resident lectin-like chaperones, calnexin (CNX) and calreticulin (CRT). CNX is membrane-bound while CRT is a soluble luminal protein [89] and both have been demonstrated to be directly associated with glycoproteins [39, 86, 88, 90, 91]. Erp57 forms with these two lectins distinct 1:1 complexes with dissociation constants in the micromolar range [86, 92, 93]. NMR and mutagenesis studies have identified the b′ domain of Erp57 and the P-domain of both lectins, a proline-rich arm-like domain, as the main players in the formation of these complexes [92, 94–96].
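To give a sense of what a micromolar dissociation constant implies for a 1:1 complex, the Python sketch below solves the binding equilibrium exactly for given total concentrations of the two partners. The function name, the Kd value and the concentrations in the usage section are hypothetical and serve only as an illustration; they are not data from the cited studies.

```python
import math


def complex_concentration(p_total: float, l_total: float, kd: float) -> float:
    """Equilibrium concentration of a 1:1 complex PL.

    Solves Kd = (P_total - PL) * (L_total - PL) / PL for PL, taking the
    physically meaningful (smaller) root of the quadratic.
    All quantities must be expressed in the same concentration unit.
    """
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


if __name__ == "__main__":
    kd_um = 5.0  # assumed Kd in micromolar, for illustration only
    for total_um in (1.0, 5.0, 50.0):
        pl = complex_concentration(total_um, total_um, kd_um)
        print(f"{total_um:5.1f} uM of each partner: "
              f"{100.0 * pl / total_um:.0f}% in complex")
```

With a Kd in this range, the fraction of protein in complex depends strongly on the local concentrations of the two partners, which is consistent with a transient, readily exchangeable interaction.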

Erp57 also plays an important role in the immune system, forming part of the major histocompatibility complex (MHC) class I peptide loading complex, where it stabilizes the complex and facilitates the proper assembly of class I molecules, through the formation of a disulfide-linked adduct with tapasin, an MHC class I-specific chaperone [97, 98].

Mutagenesis studies provided insights into the contribution of the different catalytic domains to enzymatic activity. In particular, the observation that full-length Erp57, with the C-terminal cysteines of both active sites mutated to serines, partially retained reductase activity, while the recombinant a′c fragment, with the same mutation, was inactive, suggested that domains a and a′ were not functionally equivalent [99–101]. Redox properties of the catalytic sites of Erp57 were also characterized, showing redox potentials of −167 and −156 mV for domains a and a′, respectively [86]. These values were comparable with redox potentials determined for domains a and a′ of PDI [102], suggesting that these two proteins could possess similar catalytic properties. However, a quantitative comparative analysis of the activity of full-length Erp57 with human PDI revealed that Erp57 reduces and isomerizes less efficiently than PDI [86], probably due to a more efficient interaction of PDI with its substrates [86].

The first structural studies on Erp57 were performed by NMR experiments on the isolated a and a′ domains (PDB codes: 2alb; 2dmm) [103, 104], showing that they present a structure very similar to that of Trx. Subsequently, the crystal structure of the Erp57 bb′ fragment was solved at 2.0 Å resolution (PDB code: 2h8l) [93], and together with NMR experiments, provided important information on the CNX/CRT binding site. Specifically, it was shown that this site was mainly localized on the b′ domain and was characterized by a large patch of positively charged residues, which interacted with the negatively charged residues of the lectin P-domain [93]. These experiments refuted previous studies which indicated a role for the positively charged C-terminus of Erp57 in CNX binding [105]. Residues R282 and K274 of the b′ domain were identified as being mainly involved in this binding, while a minor contribution of K214 of the b domain was also demonstrated [93]. Interestingly, the corresponding surface in the b′ domain of yeast PDI is rich in negatively charged residues, in agreement with the lack of interaction with CNX [93]. These data, together with the observation that the b′ domain of PDI is directly involved in substrate binding [73], indicate that the b′ domain of these proteins has an important role in determining enzyme specificity.

Preliminary structural information on full-length Erp57 was derived from SAXS experiments, which allowed a model of the enzyme to be obtained [93]. Superposition of this model with the yeast PDI crystal structure [76] showed that the shape of Erp57 was strikingly similar to the U-shaped molecule of PDI, with the relative position of the domains conserved [93]. Surprisingly, in this model the CNX-binding site was placed far from catalytic thiols, on the outer surface of the base of the U formed by domains b and b′. Thus, it has been suggested that glycoproteins could be properly positioned close to the catalytic sites of the a and a′ domains thanks to the peculiar elongated arm-like shape of the P-domain, which curves around Erp57 [93]. Altogether, these studies suggest a mechanism in which the binding of Erp57 to CNX is necessary not only to bind substrates but also to position them properly next to the redox active site.

More recently, the crystal structure of the full-length Erp57 in complex with tapasin was solved (PDB code: 3f8u) [106], confirming that the overall arrangement of the four Trx-like domains of Erp57 resembles that observed in yeast PDI [76]. However, small differences in the relative orientations of these domains are observed, which result in a different distance between the two active sites (~34 Å for Erp57 and ~26 Å for PDI). In the complex structure, a disulfide bond is present between the N-terminal cysteine of the active site of the Erp57 a domain and tapasin Cys95 residue. Beyond this covalent bond, a large number of protein–protein interactions, involving both Erp57 catalytic domains, account for the high stability of this heterodimer [106]. It is worth noting that, despite the specificity of Erp57 for tapasin, interacting residues of Erp57 are conserved also in PDI. It has thus been suggested that the lack of interaction of PDI with tapasin could be attributed to the shorter distance between its a and a′ active sites with respect to that observed in Erp57. Although the aforementioned domain flexibility could allow PDI to have a range of distances and orientations of the catalytic domains, it may well not be enough to reach the right distance for proper interaction with tapasin [106].

Erp72

Erp72, unlike PDI and Erp57, contains five Trx-like domains (Fig. 2), three of which, namely , a and a′, present the WCGHC catalytic motif [107]. The domain is a rather unique feature of this enzyme and presents a higher sequence identity with the a domain (54%) compared to a′ (39%), suggesting that it could derive from duplication of the a domain during evolution [108]. At the N-terminus of the domain, a highly negatively charged sequence is present, which resembles the c region of PDI and has been reported to bind Ca2+ ions and interact with positively charged substrates [109].

Erp72 has been shown to catalyze in vitro the reduction, formation, and isomerization of disulfide bonds [110–112] and is supposed to have the same function in vivo [28, 48]. It has been detected in complex with various substrate proteins [62, 113–115] and often targets misfolded proteins with their consequent retention in the ER [114–116].

Erp72 is more similar to Erp57 (41% sequence identity, excluding domain from the alignment) than to PDI (30% sequence identity, excluding domain from the alignment). Even if many of the residues that are implicated in Erp57–CRT interactions are conserved in Erp72 [18], no experimental evidence has been found to suggest a direct interaction between Erp72 and CNX or CRT [28, 84]. However, recent studies demonstrated that Erp72 can functionally substitute in part for Erp57 in knockout cell lines [84].

Although several functional studies have been carried out on this protein, the participation of each catalytic domain in disulfide bond formation and isomerization remains to be determined. However, using site-directed mutagenesis experiments the function of individual domains in the insulin reduction assay system was recently identified [117]. These studies demonstrated that all three catalytic domains participate in catalyzing insulin reduction and, rather than simply having additive effects, act synergistically [117]. However, while the active site of the domain contributes mainly to catalytic activity, the active sites of the a and a′ domains mainly enhance the recognition and binding of substrate in the steady state [117].

To date, few structural studies have been performed on Erp72. In 2006, the NMR structure of the individual catalytic domains of this protein was deposited in the PDB (Fig. 2), although no paper has yet been associated with these structures. Only recently the crystal structure of the central non-catalytic domains (bb′) was solved at 1.92 Å resolution (PDB code: 3ec3) [108]. Analysis of the structure revealed as expected the presence of two Trx domains connected by a short linker. Observation of two perfectly superimposable bb′ fragments in the asymmetric unit together with NMR experiments strongly suggested that the b and b′ domains of Erp72 form a rigid pair and the inter-domain linker is not flexible. Structural comparison of the bb′ fragments of yeast PDI [76] and Erp57 [93] shows that the main differences are observed in the charge distribution on the protein surface. In particular, Erp72 lacks the exposed hydrophobic patch of yeast PDI involved in substrate interaction, thus suggesting that it is unlikely to bind unfolded substrates in the same manner as PDI. Moreover, of the three Erp57 basic residues fundamental for CNX binding, namely Lys214, Arg282, and Lys274, the first two are conserved and correspond to Lys364 and Lys435, while the third is substituted by a threonine. This substitution explains why Erp72 does not bind CNX and CRT and acts independently of the association with these two lectins [28, 84].

SAXS experiments were also performed to determine the overall shape and domain organization of the protein [108]. These studies suggested that the shape of the protein region containing domains a, b, b′ and a′ is very similar to the U-shaped structure of yeast PDI [76] and Erp57 [93, 106], while the additional domain was located above the a domain, resulting in a structure where the three catalytic domains were close to each other on the same side of the protein. However, alternative conformations characterized by different orientations of the a and domains were also possible [108].

PDOs: the discovery of a dual Trx fold in thermophilic microorganisms

While in the past it was largely accepted that disulfide bonds were rarely found in intracellular proteins, recent studies have strongly disputed this belief, demonstrating that intracellular proteins of some thermophilic bacteria and archaea are rich in disulfide bonds [118]. Moreover, a clear correlation between disulfide abundance and maximal growth temperature was also observed [119, 120], suggesting a critical role for disulfide bonds in thermal stabilization [118].

A potential role in disulfide bond formation in the intracellular proteins of thermophilic organisms, and consequently in adaptation to high temperature, has been recently attributed to a new protein family, termed PDO. Members of this protein family are characterized by a molecular mass of about 26 kDa and two Trx folds, one at the N-terminus and the other at the C-terminus, each containing the CXXC active site motif [121]. Some proteins of the family present additional cysteines involved in a structural disulfide bridge [122].

PDOs have so far been identified both in archaea such as Pyrococcus furiosus and Aeropyrum pernix and in thermophilic bacteria, such as Aquifex aeolicus, Dictyoglomus turgidum DSM 6724, and Roseiflexus sp. RS-1 (Fig. 5). Phylogenetic analysis has suggested that they first evolved in the Crenarchaeota, dispersing later into bacteria by horizontal gene transfer events (HGT) via the Euryarchaeota [123]. The preferential HGT generally observed between archaea and hyperthermophilic bacteria, such as Aquifex and Thermotoga [121, 124], further supports this evolutionary hypothesis.

Fig. 5 A representation of phyla in the kingdoms of Bacteria and Archaea based on PDO sequences annotated in the genomes

The first member of the PDO family was isolated in 1995 from the crude extracts of the hyperthermophilic archaeon P. furiosus [125]. This protein, termed PfPDO, showed two active sites, CQYC and a typical Grx sequence CPYC, and thioltransferase activity. It was thus considered a Grx-like protein [125, 126]. The resolution of its crystal structure (PDB code: 1a8l) opened up a completely new scenario, revealing structural details that correlated PfPDO with the multidomain PDI [127, 128]. So far several members of this family have been characterized in detail both at structural and functional level: one from a hyperthermophilic bacterium A. aeolicus [129], and four from the archaea A. pernix [79], P. furiosus [121, 127, 128], P. horikoshii [130], and Sulfolobus solfataricus [122, 131].

Insight into PDO functions

Although the characterized PDOs present a variety of sequences within the redox active site (Table 1), they are all active as dithiol oxidants of synthetic peptides, reductants of insulin disulfides and isomerases of disulfide bonds in scrambled ribonuclease [79, 122, 125, 129]. Mutagenesis studies performed on PDOs from P. furiosus (PfPDO), P. horikoshii (PhPDO), and S. solfataricus (SsPDO) have clarified the contribution of each active site to overall catalytic activity, showing that the C-terminal site has a fundamental role in the thiol-transferase activity, whereas both sites are indispensable for isomerase activity, where the two units are presumed to function synergistically [121, 122, 126, 130]. By contrast, mutagenesis studies performed on PDO from A. aeolicus (AaPDO) showed a different behavior: in this case each redox site was able to perform all catalytic activities operating independently, but their contribution was not equivalent. In particular, the C-terminal site was able to perform an activity comparable to the wild-type, while the N-terminal site presented a slightly lower activity [129].

Table 1 Comparison of active site sequences, estimated pKas, solvent accessibility and disulfide conformations for the cysteine residues of PDOs from different sources

A chaperone activity in the presence of ATP was determined for SsPDO [131], following the refolding of an alcohol dehydrogenase isolated from S. solfataricus. This activity, reported also for PDI [132], further supports the correlation between PDO and PDI. In agreement with this property, an ATPase activity was identified both for SsPDO (Pedone E. pers. comm.) and PfPDO [121]. In the latter case, a more detailed characterization showed that this activity has a pH optimum in the basic range and a temperature optimum of 90°C [121, 133].

FT-IR spectroscopy and molecular dynamics simulation studies, performed at different pHs and temperatures, were used to evaluate the effect of pH and temperature on the stability of PfPDO [134]. These studies demonstrated that at pH 10.0 and at temperatures between 90.0 and 99.5°C, the optimal conditions for ATPase activity, the protein undergoes partial unfolding with a concomitant relaxation of the tertiary structure, followed by a reorganization of part of the structure into a new β-conformation. The α-helix regions proved more affected by the increase in pH and temperature than the β-sheets [134].

Very little information is currently available on PDO substrates, while in several cases Prxs have been identified as Trx target proteins [135–137]. In line with these findings, two Prxs of S. solfataricus, bacterioferritin comigratory protein (Bcp)1 and Bcp4 [138], were recently reported to utilize the reconstituted SsPDO/SsTrx reductase (SsTr) (SSO2416) system for recycling, in place of the common Trx/Tr system [131], thus suggesting their potential role as PDO substrates. A similar recycling system had previously been characterized in P. horikoshii (PhPDO/PhTr) [130].

Many reports have highlighted the peculiar role of sulfhydryl groups in the oxidative stress response and mainly of the Grx/Trx system in the maintenance of cell redox homeostasis [131, 139]. The in vivo role of PDOs has also been investigated by analyzing the expression of SsPDO under oxidative stress [131]. However, only a very small increase in mRNA and protein levels has been observed in such conditions, suggesting no direct role of SsPDO in the oxidative stress response [131]. In agreement with these data, a recent transcriptomic, proteomic, and chemical reactivity analysis, performed in oxidative stress conditions, failed to show any up-regulation of SsPDO, unlike other antioxidant enzymes [140–143]. By contrast, similar studies conducted on P. furiosus showed that PfPDO is one of the most strongly up-regulated proteins in response to cold adaptation from 95 to 72°C [144]. Microarray experiments performed independently by adding elemental sulfur to growing P. furiosus cells also indicated the up-regulation of PfPDO [145].

Recently, in vitro transcription experiments allowed the identification in P. furiosus of a transcriptional repressor of pdo, namely Sulfur Response Regulator (SurR). The surR gene is positioned 132 bp downstream of pdo and is divergently oriented [146]. SurR seems to exert its transcriptional regulation in the absence of elemental sulfur, activating genes that are down-regulated during the primary elemental sulfur response and repressing genes, like pdo, that are up-regulated under the same conditions [146].

Altogether, these results reveal a complex scenario for PDOs, demonstrating that they can be finely regulated as shown in the anaerobic P. furiosus or constitutively expressed, as observed in S. solfataricus [131].

Insights into structural features of the PDO family

The crystal structures of PfPDO (PDB code: 1a8l; 1.9 Å resolution), AaPDO (PDB code: 2ayt; 2.4 Å resolution), and ApPDO (PDB code: 2hls; 1.9 Å resolution) revealed the presence of two structural units, one at the N-terminus and another at the C-terminus (Fig. 6), each consisting of a typical Trx fold with an additional α-helix inserted at the N-terminus [79, 127, 129]. In all the structures the two structural units superimpose well, despite a rather low sequence identity (about 20%). A homology model of SsPDO has also been built, confirming the general fold observed for the other PDOs [122]. In AaPDO and ApPDO the two active sites [79, 129] were accessible to the solvent and presented similar structural features, with comparable dihedral angles consistent with stable disulfide bonds (Table 1) [129]. By contrast, in PfPDO the two active sites showed strikingly different geometrical parameters and solvent accessibilities. Indeed, while the C-terminal site had a stable and exposed disulfide bond [127, 129], as shown in Table 1, the N-terminal site was completely buried and showed unusual dihedral angles, indicating the existence of a strong conformational strain [127]. Surprisingly, crystallographic data showed a greater mobility of the N-terminal segment with respect to the C-terminal one, in agreement with simulation studies conducted at different temperatures [124, 127, 134].

Fig. 6

Overall fold of ApPDO [79]. The N-terminal unit is shown in red, while the C-terminal one in cyan. Cysteine residues of catalytic sites are colored in yellow and shown in stick representation

In the Trx superfamily, the pKa values of the nucleophilic cysteines have been demonstrated to be related to the redox potential, hence to the distinct reactivities of the CXXC motif [78, 147, 148]. Thus, a theoretical pKa study for the active site cysteines of the three different members of the PDO family with known crystallographic structure, ApPDO, AaPDO, PfPDO, was carried out [79, 127, 129] (Table 1). In all cases, the pKa values obtained for the N-terminal cysteines for both PDO active sites were found to be lower than those determined for the C-terminal cysteines highlighting the higher reactivity of the N-terminal cysteines. Moreover, while for ApPDO and AaPDO pKa values of the first cysteine range between 9.2 and 9.5 for the N-terminal active site and between 6 and 7.3 for the C-terminal one, in PfPDO the first cysteine of the N-terminal active site shows a significantly higher pKa (Table 1). This result can be attributed to its low solvent exposure and is in agreement with above reported mutagenesis and kinetic studies, which indicated that this site does not perform any reductive or oxidative activity.
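
The link between pKa and reactivity can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a thiol present as the nucleophilic thiolate at a given pH. The following is a minimal sketch using only the pKa bounds quoted above for the first cysteine of the ApPDO and AaPDO active sites; the pH of 7.0 and all function and variable names are illustrative assumptions, and the thiolate fraction is only one of several determinants of reactivity.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Fraction of a thiol present as thiolate (S-) at a given pH,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa bounds quoted in the text for the first cysteine of each
# active site in ApPDO and AaPDO (theoretical estimates, Table 1).
PKA_BOUNDS = {
    "N-terminal active site": (9.2, 9.5),
    "C-terminal active site": (6.0, 7.3),
}

ASSUMED_PH = 7.0  # illustrative assumption, not a value from the source

for site, (pka_low, pka_high) in PKA_BOUNDS.items():
    # A higher pKa gives a smaller thiolate fraction at fixed pH.
    smallest = thiolate_fraction(pka_high, ASSUMED_PH)
    largest = thiolate_fraction(pka_low, ASSUMED_PH)
    print(f"{site}: thiolate fraction {smallest:.3f} to {largest:.3f} "
          f"at pH {ASSUMED_PH}")
```

Under these assumptions the calculation simply restates the trend in the text: the lower the pKa of a cysteine, the larger the share of it that is deprotonated at a given pH.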

Very little is known about the substrate binding site in PDOs [124], unlike PDI and the bacterial Dsbs. In 2006, Ladenstein and Ren built a model for the putative peptide binding site in PfPDO, comparing the crystal packing contacts, which can generally mimic substrate binding interactions [124], and the interaction region between human Trx and peptides from Ref-1 and NFkB [149, 150]. The authors observed that residues around the two PfPDO active sites form two grooves, denominated N and C, which could constitute two substrate binding sites [127]. The N groove is very narrow and deep and is delimited by residues belonging to both Trx units, while the C groove, whose shape is very similar to that of Trx, is much larger and is delimited only by residues of the C-terminal domain [127]. These residues are mainly hydrophobic and strictly conserved in all PDOs structurally characterized to date (Fig. 7). Interestingly, the hydrophobic nature of the C groove is a distinctive feature also of the PDI b′ domain [74].

Fig. 7

Sequence alignment of the characterized PDOs performed with ClustalW2. PhPDO: Pyrococcus horikoshii PDO; PfPDO: Pyrococcus furiosus PDO; AaPDO: Aquifex aeolicus PDO; ApPDO: Aeropyrum pernix PDO; SsPDO: Sulfolobus solfataricus PDO. Residues of the CXXC motif are boxed, residues constituting the N groove are bold, while residues forming the C groove are underlined

Similarities and differences between PDI and PDO: is PDO an ancestor?

In light of the findings reported above, PDOs and PDI clearly present many similarities. Indeed, all these enzymes are able to reduce, oxidize, and isomerize disulfide bridges with high catalytic efficiency. Moreover, they show two catalytically active Trx-like domains and present a hydrophobic region for substrate recognition. However, a detailed comparison also highlights the presence of major differences. First of all, although in PDI it is clearly recognized that each active site has a different role in catalytic activity, the data available on PDOs do not define the same role for each active site of every member belonging to the family. Furthermore, these enzymes present a different structural organization; in particular, in PDOs the two Trx units are packed together, whereas in PDI they are structurally separated [8], although the distance between the two active sites is comparable (about 20 Å). Finally, in PDOs the substrate binding site is located in the same domain as the active site, whereas in PDI this region is situated in an additional non-active Trx domain. Altogether these data suggest that PDOs could be considered precursors of PDI, representing a simpler version of the eukaryotic enzyme, even though the observed differences in these two biological systems indicate very different mechanisms of action.

Heterodimerization of redox-active Trx fold: hybrid Prxs

Variability among Trx-like proteins can also arise from the fusion of genes encoding different Trx folds [30]. The hybrid Prxs are a peculiar subgroup of the Prx family [151], in which the Prx domain is fused with a Grx domain. These proteins represent a solution adopted by different bacteria to optimize hydroperoxide reduction [152–156].

Prxs are ubiquitous enzymes that degrade hydroperoxides and alkyl hydroperoxides to defend the cells from oxidative stress, and in eukaryotic cells they also play a key role in regulating H2O2-mediated cell signaling [157, 158]. To date, a large number of crystal structures of Prxs have been solved, showing that they all contain a canonical Trx fold with additional secondary structure elements, such as an extension of the N-terminal region, an insertion between β2 and α2, and sometimes an extension of the C-terminal region [12]. Prxs can be divided into 2-Cys and 1-Cys types based on the number of cysteine residues involved in catalysis [157]. The first step in the catalytic mechanism, common to 2-Cys and 1-Cys Prxs, is the nucleophilic attack of a conserved cysteine located in the N-terminal region, called the peroxidatic cysteine (CPSH), on the peroxide and its consequent oxidation to cysteine sulfenic acid (CPSOH). In 2-Cys Prxs the CPSOH resolution depends on a second redox-active cysteine, termed the resolving cysteine (CRSH). Through a conformational change, from a fully folded to a locally unfolded form, a disulfide bond is formed connecting the two redox-active cysteines located in the same monomer (atypical 2-Cys Prxs) or in two different subunits (typical 2-Cys Prxs). In 1-Cys Prxs the CRSH is absent and the CPSH is regenerated from CPSOH by a mechanism not yet well elucidated. In this case, the CPSH is stabilized by hydrogen bond formation with a conserved histidine, which prevents its overoxidation to sulfinic and sulfonic acids [157].
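
The first two steps described above can be summarized as a reaction scheme. This is only a schematic restatement of the text: ROOH is used here as generic notation for a hydroperoxide (an assumption of this sketch, with R = H for H2O2), the subscripts P and R denote the peroxidatic and resolving cysteines, and the second step applies to 2-Cys Prxs only.

```latex
\begin{align*}
\mathrm{C_P{-}SH} + \mathrm{ROOH} &\longrightarrow \mathrm{C_P{-}SOH} + \mathrm{ROH} \\
\mathrm{C_P{-}SOH} + \mathrm{C_R{-}SH} &\longrightarrow \mathrm{C_P{-}S{-}S{-}C_R} + \mathrm{H_2O}
\end{align*}
```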

The structure of the active site is highly conserved among Prxs. Indeed, the CPSH is always located in a loop-helix region exposed to the solvent and surrounded by three conserved residues: Pro38, Thr42, and Arg112 (the numbering refers to Bcp1 from S. solfataricus) [151]. The first residue restricts the solvent accessibility and protects the reactive sulfenic acid intermediate from further oxidation, Thr42 facilitates the proper position of the CPSH, allowing an unidentified catalytic base to extract a proton and finally Arg112 lowers the pKa of CPSH.

Different thiol redox systems are used to regenerate Prxs, the main one being the flavoprotein Tr coupled to Trx [157]. A variety of other recycling systems [159, 160] have also been described, such as the alkyl hydroperoxide reductase (AhpF) system, in which AhpF represents the fusion between Tr and Trx domains and regenerates the Prx AhpC [161, 162], the previously described Tr/PDO system associated with the reduction of Bcps in S. solfataricus [138, 151], and the GSH reductase/GSH/Grx system involved in the regeneration of a Prx from poplar [163].

Grxs are small (9–15 kDa) thioltransferases characterized by a Trx fold and an active site containing a CXXC motif in dithiol Grxs or a CXXS motif in monothiol Grxs [1, 164]. They specifically use GSH for their regeneration and catalyze the reduction of proteins with disulfide bridges and GSH-containing mixed disulfide bonds. Their role spans a great number of functions related to defense against oxidative stress, but also to protein folding, sulphur assimilation, coordination of [2Fe–2S] cluster, regulation of cellular differentiation, and transcription in eukaryotic cells [164].
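
The distinction between dithiol (CXXC) and monothiol (CXXS) active-site motifs lends itself to a simple sequence scan. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, not taken from any protein discussed here, and a four-residue pattern of this kind occurs frequently by chance, so a match is a candidate to be checked against structural or alignment data and not proof of a redox-active site.

```python
import re

# Lookahead pattern so that overlapping candidate motifs are all reported.
# "C..C" corresponds to CXXC and "C..S" to CXXS.
MOTIF = re.compile(r"(?=(C..[CS]))")


def find_active_site_motifs(sequence: str):
    """Return (1-based position, motif, class) for every CXXC/CXXS match."""
    hits = []
    for match in MOTIF.finditer(sequence.upper()):
        motif = match.group(1)
        kind = "dithiol (CXXC)" if motif.endswith("C") else "monothiol (CXXS)"
        hits.append((match.start() + 1, motif, kind))
    return hits


# Hypothetical toy sequence, invented for illustration only.
TOY_SEQUENCE = "MKTACPYCVRAAKGFSCGFSE"

for position, motif, kind in find_active_site_motifs(TOY_SEQUENCE):
    print(position, motif, kind)
```

The lookahead is used so that a cysteine closing one motif can also open the next candidate, which matters for cysteine-rich stretches.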

In the hybrid Prxs, the Prx domain is always localized at the N-terminus, while the Grx domain is located at the C-terminus and is involved in the regeneration of the Prx domain. These proteins have so far been characterized in the ancestral microorganism Chromatium gracile [152] and in pathogenic bacteria such as Vibrio cholerae [155], Haemophilus influenzae [153, 154], and Neisseria meningitidis [156] (Fig. 8). In addition, genomic analysis of other pathogenic bacteria and cyanobacteria highlights both the diffusion of these types of proteins among pathogens and their ancestral role in the detoxification from hydroperoxides (Fig. 8). The multiple sequence alignment, reported in Fig. 8, shows that in all these proteins the Prx domain resembles the Type II Prx [151], while the Grx domain resembles E. coli Grx3 [166]. Hybrid Prxs can be divided into two subgroups on the basis of the number of conserved cysteines in the Prx domain: members of the first subgroup possess two conserved cysteines, while members of the second have only one conserved cysteine (Fig. 8).

Fig. 8

ClustalW2 multiple alignments of hybrid Prxs from C. gracile [165], V. cholerae AAF95778, N. meningitidis NP_273984.1, and H. influenzae AAC22230.1 with homologous hybrid Prxs from pathogens (Yersinia aldovae ZP_04621281.1; Serratia proteamaculans YP_001480997.1; Erwinia pyrifoliae YP_002647197.1) and cyanobacteria (Anabaena variabilis Q3M358; Synechococcus sp YP_002370726.1), with Type II Prx YP_353982.1 from R. sphaeroides and Grx3 NP_418067.1 from E. coli. Boxed regions indicate the Prx and Grx domains; the two regions are separated by a linker. The active site cysteines of Prx domains are in bold, while the conserved CXXC motif of Grx domains is underlined

The only proteins of the first subgroup to be characterized are from C. gracile (CgPrx), V. cholerae (VcPrx), and N. meningitidis (NmPrx). CgPrx was demonstrated to be involved, together with NADH, a glutathione amide reductase (GAR), and a glutathione amide (GASH) molecule, in the detoxification of H2O2 and small alkyl hydroperoxides in C. gracile, an anaerobic prototroph sulphur-oxidizing bacterium [152]. This protein contains two cysteines in the Prx domain, at positions 50 and 75, but no direct information is available on their catalytic role.

The first information on the catalytic mechanism of this subgroup comes from a kinetic study performed on VcPrx [155]. This protein reduces lipid hydroperoxides such as linoleic hydroperoxide, suggesting its action in vivo as a lipid hydroperoxide peroxidase, but unlike CgPrx, the recycling system is GSH supported. VcPrx has been shown to exist as a monomer and to function as a Type II 2-Cys atypical Prx, with Cys51 and Cys77 acting as CPSH and CRSH, respectively. A direct exchange of reducing equivalents between the Prx active site and that of Grx is thought to perform the complete catalytic mechanism.

More detailed information is available on the catalytic mechanism of the second subgroup, thanks to the crystal structure of the hybrid Prx from H. influenzae (HiPrx) (PDB code: 1nm3) [153] (Fig. 9). HiPrx has two redox active sites: one in the Prx domain, which contains a single cysteine (Cys49), corresponding to CPSH, and the other in the Grx domain, containing two cysteines, Cys180 and Cys183, in the CXXC motif [154]. Analysis of the crystal structure reveals that the Prx domain (residues 3–162) presents the canonical Trx fold with the addition of several insertions common to other Prxs, while the Grx domain (residues 171–214), joined to the Prx domain through a linker loop, is characterized by the basic Trx fold (Fig. 9). The protein presents a tetrameric organization, which is achieved mainly by two strong subunit contacts: the first is between two Prx domains and the second between two Grx domains. In this structural organization the active sites of the Prx and Grx domains of two different monomers come close to each other, allowing the regeneration of the peroxidase activity of the enzyme. Indeed, after the oxidation of the CPSH, a direct electron transfer may occur from the CXXC redox site of the Grx domain of one monomer to the CPSOH of the Prx domain of another monomer.

Fig. 9

Overall fold of HiPrx. The Prx domain is shown in green, the linker region in blue and the Grx domain in magenta. The cysteine residues of catalytic sites are colored in yellow and shown in stick representation

The same quaternary organization has also been hypothesized for NmPrx [156], which belongs to the first subgroup of hybrid Prxs. The catalytic activity of the full-length enzyme and of the two separated domains was tested, showing that the fused enzyme is more efficient than the reconstituted system. These results indicate that the hybrid enzyme can optimize both the peroxidase activity and the reduction reaction that recycles the Prx domain [156].

Taken together, these data show that the aforementioned pathogens have acquired a key enzyme to defend themselves against both the attack of the human antimicrobial system and oxidizing environments such as the nasopharynx, where H. influenzae and N. meningitidis can localize.

Concluding remarks

The Trx fold is a widespread and versatile protein scaffold. Various insertions are possible on this structural theme, giving rise to different proteins, which span from the simple Grxs to the more complex GPXs or PDIs. The different structural complexity of these proteins reflects a variety of catalytic functions, which range from dithiol-disulfide exchange reactions to hydroperoxide reduction. Analysis of the sequences of Trx-like proteins in all the kingdoms of life indicates that while most members contain just one copy of the Trx fold, some classes contain multiple copies. PDI represents the most interesting example of the presence of multiple copies of the same Trx fold in a single protein. Within the overall multi-domain organization, each catalytic domain specializes in a different dithiol-disulfide exchange reaction, while the non-catalytic domains are simply involved in substrate recognition. A simpler form of PDI is represented by the prokaryotic PDOs, in which there are only two Trx units packed together, and such a specific division of the functions is not possible. Finally, within the same multi-domain structure, combinations of different Trx domains with different functions may also be found. The hybrid Prxs are an example in which a Prx, involved in hydroperoxide reduction, is fused with a Grx, responsible for its regeneration.

Taken together, these examples show that the presence of two or more catalytically active Trx folds in the same protein generally makes it possible either to perform different catalytic functions or to fine-tune substrate specificity, thus representing a very useful strategy for improving enzyme catalytic performance.