1 Introduction

Heparan sulfate proteoglycans (HSPGs) are ubiquitous components of the extracellular matrix (ECM), where they mediate diverse structural and signaling interactions between cells and proteins of the ECM [1]. HSPGs comprise a core transmembrane, membrane-anchored, or extracellular protein attached to one or more chains of the glycosaminoglycan (GAG) polysaccharide heparan sulfate (HS). Interactions between HSPGs and their binding partners primarily occur via the HS chains which decorate the core HSPG protein [2].

The chemical composition of HS is complex and dynamically regulated in response to stimuli via a process of continual turnover [3,4,5]. HS composition has been shown to vary in relation to development [6,7,8], cancer stage [9,10,11], and general age [8, 12]. While biosynthesis of HS is a multistep process involving the concerted action of a host of polymerases, sulfotransferases, and epimerases [13], breakdown of HS in mammals is primarily carried out by a single enzyme – heparanase (HPSE) [14, 15].

HPSE is an endo-acting glycoside hydrolase, which cleaves within long HS chains to release product fragments of HS ~5–7 kDa in size [14]. The HS degrading activity of heparanase is essential for ECM remodeling, affecting diverse processes such as inflammation, angiogenesis and cell migration [16,17,18]. HPSE activity can also release growth factors sequestered within networks of HS, which subsequently promote angiogenesis and wound healing [19]. Whilst normal HPSE function is essential for physiological processes which involve ECM remodeling, the HS degrading capability of HPSE can also be co-opted by cancerous cells to promote malignant growth and dissemination. Accordingly, upregulation of heparanase is a hallmark of aggression and metastasis in a wide range of cancers [20,21,22,23,24].

A full summary of the many functions of HPSE in health and disease is beyond the scope of this article, and will be covered elsewhere in this book. Instead, we aim here to provide a structure/function-centric review of HPSE, drawing from insights gained from crystal structures of HPSE and its related proteins. From these, we hope to provide the reader with an appreciation of the structural features that underlie the many biological and biochemical insights obtained from decades of research on HPSE.

2 Heparan Sulfate – The Biochemical Basics

Chemically, HS is a linear glycosaminoglycan polysaccharide composed of alternating 1,4 linked units of hexuronic acid (HexUA) and glucosamine (GlcN) [25]. HS chains can display high complexity due to the number of permutations possible for the core HexUA and GlcN building blocks. The HexUA of HS can be either β-D-glucuronic acid (GlcUA) or α-L-iduronic acid (IdoUA), and GlcN can be either N-acetyl-α-D-glucosamine (GlcNAc) or N-sulfo-α-D-glucosamine (GlcNS). These core residues are further decorated by varying degrees of O-sulfation (Fig. 5.1a).

Fig. 5.1

(a) Chemical structures of HexUA and GlcNX building blocks of HS, with possible sites of sulfation shown. (b) Representative HSPG illustrating the domain structure of HS chains, and the predominant disaccharide units found within NA and NS domains. Mixed NS/NA domains separating NS and NA domains have not been shown here
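The combinatorial complexity described above can be made concrete with a short enumeration. The following is a minimal sketch rather than a model of HS biosynthesis: it assumes that HexUA may carry a 2-O-sulfate and that GlcN may carry 3-O- and/or 6-O-sulfates, and it ignores rarer modifications. The function name `disaccharides` and the naming scheme for the units are illustrative choices only.

```python
from itertools import product

# Assumed modification sites (simplified; rarer variants are ignored):
# HexUA may carry a 2-O-sulfate; GlcN may carry 3-O- and/or 6-O-sulfates.
HEXUA = ["GlcUA", "IdoUA"]
GLCN = ["GlcNAc", "GlcNS"]


def disaccharides():
    """Yield every HexUA-GlcN unit allowed by the simplified rules above."""
    flags = (False, True)
    for hexua, s2, glcn, s6, s3 in product(HEXUA, flags, GLCN, flags, flags):
        hexua_name = hexua + ("(2S)" if s2 else "")
        marks = [m for m, present in (("3S", s3), ("6S", s6)) if present]
        glcn_name = glcn + (f"({','.join(marks)})" if marks else "")
        yield f"{hexua_name}-{glcn_name}"


units = list(disaccharides())
print(len(units))            # 32 theoretical disaccharide units
print(units[0], units[-1])   # least and most sulfated unit in this scheme
```

Under these assumptions a chain of n disaccharides has 32^n theoretical sequences. Real HS is far less diverse than this upper bound suggests, because the biosynthetic enzymes act in a coupled, domain-wise fashion, but the count illustrates why a single chain can present many distinct binding motifs.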

Biosynthesis of HS is non-templated, allowing HS composition to vary substantially along a single polysaccharide chain (typical HSPGs contain HS chains between 40–300 sugar units (20–150 nm) in length) [1]. Variations in HS structure occur across broad macromolecular regions, leading to the formation of N-acetyl (NA) domains (characterized by poorly sulfated GlcNAc-GlcUA repeats) and N-sulfo (NS) domains (characterized by highly sulfated GlcNS-IdoUA repeats), separated by mixed NS/NA domains (Fig. 5.1b). This structural heterogeneity is crucial for HS function, enabling a single polysaccharide chain to interact with a host of different binding partners. HS heterogeneity is also of central importance for its breakdown by HPSE. As will be further discussed below (Sect. 5.4.2.), cleavage of HS by HPSE is limited to only certain GlcUA residues within the sugar chain, depending on the local sulfation pattern around the target site.

3 Historical Developments in HPSE Research

3.1 Identification of a Specific Heparan Sulfate Degrading Enzyme

The existence of a specific mammalian HS degrading factor was first demonstrated in 1975 by Ogren and Lindahl [26], and Höök et al. [27], who described the isolation of enzyme preparations from mouse mastocytoma and rat liver respectively, which were capable of degrading heparin and HS to low molecular weight fragments (heparin is structurally similar to highly sulfated HS). These studies were closely followed by reports of similar heparan sulfate degrading activities in a number of different cell and tissue types (Vlodavsky et al., Chap. 1 in this volume).

Heparan sulfate degrading activity in platelets was first demonstrated by Wasteson et al., who found that cultured human glial cells exposed to platelet lysates released low molecular weight HS into their culture medium [28]. Similar HS degrading activity was subsequently identified in placental tissue by Klein and von Figura [29]. Nicolson and coworkers demonstrated that B16 mouse melanoma cells utilized a HS degrading enzyme to assist with breakdown of ECM-like barriers in vitro [30], and that the HS degrading capabilities of B16 subpopulations positively correlated with their metastatic potential in vivo [31]. This direct relation between heparanase activity and metastatic potential in cancer cells was further demonstrated by Vlodavsky et al., who showed that the poorly metastatic T-lymphoma cell line Eb and its spontaneous highly metastatic variant ESb differed strongly in their ability to degrade HSPGs [32].

A number of observations from these early studies have since become recognized as hallmarks of HPSE activity. Chemical analysis of enzymatically degraded HS products found that cleavage occurred only at the glucuronic acid of HS, not at the glucosamine, indicating that the responsible enzyme was a glucuronidase [26, 29, 33]. Enzymatic HS cleavage was also found to be limited, leading to the formation of intermediate-sized oligosaccharide products resistant to further degradation, consistent with an endo-glucuronidase that targets specific HS sites [26, 28, 30,31,32]. Although commonalities between these early studies indicated researchers were studying the same enzyme activity, it would take more than a decade for the enzyme responsible to be identified unambiguously.

3.2 Isolation of Heparanase Enzyme and Cloning of the HPSE Gene

The identity of the HS degrading enzyme was controversial for a number of years, with proteins ranging from 8 kDa to 137 kDa mass being reported as possessing HPSE activity [34,35,36]. These discrepancies were resolved in the late 1990s, following several independent reports describing the purification of the same HS degrading protein from various sources. Goshen et al. first reported the purification of a ~50 kDa HS degrading enzyme from human placenta [37], followed by Freeman and Parish, who isolated an enzyme of similar size and biochemical profile from platelets [38]. These reports were closely followed by seminal studies from Toyoshima and Nakajima, Vlodavsky et al., Kussie et al. and Hulett et al., who all carried out peptide sequencing of the isolated HS degrading protein, and used this information to identify and clone the responsible HPSE gene [39,40,41]. These groups all noted the strange observation that whilst the HPSE gene encoded a ~65 kDa protein, purified HPSE appeared to be ~50 kDa in size, with its N-terminus apparently beginning at Lys158. Furthermore, expression of the full HPSE gene was found to be required for activity, with expression of the sequence corresponding to the ~50 kDa subunit alone failing to endow cells with HS degrading activity [41].

The discrepancy between HPSE gene and protein size was resolved by Fairbanks et al., who demonstrated the existence of a previously undetected 8 kDa subunit in HPSE purified from platelets [42]. This 8 kDa subunit was found to tightly associate with the 50 kDa subunit, only being separable under denaturing conditions, indicating the existence of a non-covalently associated heterodimer. MALDI-TOF analysis identified the 8 kDa subunit of HPSE as Gln36-Glu109, corresponding to an N-terminal fragment encoded by the HPSE gene. Based on these results, Fairbanks et al. proposed the now widely accepted maturation pathway of the HPSE protein. HPSE is initially expressed as a single chain pre-proenzyme (pre-proHPSE), comprising an N-terminal signal peptide (Met1-Ala35), followed by the 8 kDa (Gln36-Glu109) and 50 kDa (Lys158-Ile543) subunit sequences, separated by a 6 kDa linker peptide (Ser110-Gln157). Loss of the signal peptide from pre-proHPSE following signal peptidase cleavage [43] leads to formation of the inactive HPSE proenzyme (proHPSE). Active HPSE is only produced following proteolytic excision of the 6 kDa linker peptide from proHPSE, leading to formation of the mature enzyme, which exists as a non-covalent heterodimer of 50 kDa and 8 kDa subunits (Fig. 5.2).

Fig. 5.2

HPSE biogenesis pathway. Steps pertinent to baculoviral expression of pro- and mature HPSE in insect cells are highlighted in red
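The segment boundaries of the maturation pathway can be checked with a small bookkeeping sketch. The residue ranges are those given in the text, with the linker taken as Ser110-Gln157, i.e. ending immediately before Lys158 where the 50 kDa subunit begins. The class `Segment`, the helper names, and the average residue mass of 110 Da are assumptions introduced for illustration; the mass estimate covers the polypeptide backbone only and ignores glycosylation.

```python
from dataclasses import dataclass

AVG_RESIDUE_MASS_DA = 110  # rough average per residue; an assumption


@dataclass(frozen=True)
class Segment:
    name: str
    start: int  # first residue, 1-based, inclusive
    end: int    # last residue, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1

    @property
    def approx_mass_kda(self) -> float:
        # Backbone-only estimate; N-glycans are not included.
        return self.length * AVG_RESIDUE_MASS_DA / 1000


PRE_PRO_HPSE = [
    Segment("signal peptide", 1, 35),
    Segment("8 kDa subunit", 36, 109),
    Segment("linker", 110, 157),
    Segment("50 kDa subunit", 158, 543),
]


def is_contiguous(segments) -> bool:
    """True when each segment starts right after the previous one ends."""
    return all(a.end + 1 == b.start for a, b in zip(segments, segments[1:]))


def pro_hpse(segments):
    """proHPSE: pre-proHPSE minus the signal peptide."""
    return [s for s in segments if s.name != "signal peptide"]


def mature_hpse(segments):
    """Mature HPSE: the two subunits left after linker excision."""
    return [s for s in segments if s.name.endswith("subunit")]


print(is_contiguous(PRE_PRO_HPSE))
for seg in PRE_PRO_HPSE:
    print(f"{seg.name:15s} {seg.start:3d}-{seg.end:3d} "
          f"{seg.length:3d} aa  ~{seg.approx_mass_kda:.1f} kDa")
```

With these ranges the segments tile residues 1-543 without gaps or overlaps. The backbone estimate for the large subunit falls noticeably below 50 kDa, which is in keeping with this subunit being N-glycosylated, although the crude per-residue average used here should not be over-interpreted.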

3.3 Production of Homogeneous Recombinant HPSE

Structural biology studies rely on the availability of large amounts of purified homogeneous protein. In this regard, the production of recombinant HPSE presents an unusual challenge, due to the complex process of HPSE maturation. Recombinant expression of HPSE in mammalian cells often leads to a mixture of both 65 kDa proHPSE as well as mature HPSE heterodimer [39, 41], rendering these expression systems unsuitable for structural biology purposes.

Protein production in insect cells using the baculovirus expression vector system (BEVS) [44] has proven an invaluable tool for the study of recombinant HPSE. BEVS is a bipartite gene expression platform utilizing recombinant baculovirus for gene delivery and cultured insect cells for protein production. Because insect cells are eukaryotic animal cells (albeit non-mammalian cells), gene expression using BEVS usually allows for the faithful reproduction of native mammalian protein folds. Serendipitously, insect cells lack the cellular apparatus to carry out proHPSE maturation, thus precluding the production of pro- and mature HPSE mixtures [41]. Although this feature of insect cell protein production provides an obvious route towards proHPSE, the problem of accessing mature HPSE was not addressed until McKenzie et al. demonstrated that co-expression of the 50 kDa and 8 kDa subunits under two different promoters led to co-translational association of the two subunits, allowing for direct access to mature HPSE [45] (Fig. 5.2).

A markedly different approach to tackling the HPSE linker problem was reported by Nardella et al. [46], who engineered HPSE to replace the linker region with much shorter sequences. Expression of engineered proHPSE in which the linker sequence was replaced by either an artificial (GSGSGS) repeat or the analogous sequence from a Hirudinaria manillensis hyaluronidase (AFKDKTP) gave a single chain variant of HPSE with activity comparable to wild type enzyme. The key role of the 6 kDa linker peptide in controlling (pro)HPSE activity is discussed further below (Sects. 5.4.3 and 5.5.1).

4 Heparanase – Insights from Crystal Structures

We reported the crystal structure of mature HPSE in 2015, revealing the overall 3-dimensional protein structure of HPSE, and also (via several ligand complexes), the mode of interaction between HPSE and its substrates [47]. This was followed by the solving of the proHPSE crystal structure in 2017 [48]. In this section of the review, we aim to provide an overview of the main insights into the HPSE structure from these two studies, and how the structural features of HPSE relate to its biochemical and biological properties.

4.1 3-Dimensional Structure of Mature HPSE

Several features were immediately apparent upon initial solving of the HPSE crystal structure (PDB accession code: 5E8M). The HPSE protein comprises two major domains: a predominant (β/α)8 barrel domain, flanked by a smaller β-sandwich domain. The 8 kDa HPSE subunit contributes a single β-sheet towards the β-sandwich domain, as well as the first β-α-β elements of the (β/α)8 domain, with the rest of the protein structure being contributed by the 50 kDa subunit. Such a division of structure between the 8 kDa and 50 kDa subunits of HPSE was postulated by Nardella et al., based upon the predicted secondary structure elements within the HPSE sequence [46]. The (β/α)8 barrel domain is commonly found in glycoside hydrolases, and usually contains the active site of these enzymes [49]. Visual inspection of the (β/α)8 barrel of HPSE revealed a clear cleft in the domain, spanning ~10 Å in diameter, suggesting a binding site for chains of HS. This cleft was lined with a number of basic Arg and Lys residues, which are commonly found in HS interacting protein domains [50,51,52,53,54,55]. Notably, HS binding “domains” (HBDs) I (Lys158-Asp162) and II (Pro271-Met278), previously identified by Levy-Adam et al. [56], were found to lie around the HPSE binding cleft, supporting a role for these two domains in facilitating HPSE-HS interactions (Fig. 5.3).

Fig. 5.3

Three-dimensional structure of unliganded human HPSE, showing ‘top’ (left) and ‘side’ (right) views. The 50 kDa subunit is colored in blue, and the 8 kDa subunit is colored in yellow (colors correspond with Fig. 5.2). Two domains can be discerned in the HPSE structure, a (β/α)8 barrel domain containing the HS-binding cleft, and a smaller β-sandwich domain. HBDs I and II identified by Levy-Adam et al. [56] are highlighted in pink, other basic residues around the binding cleft are highlighted in cyan. A putative NLS sequence in the β-sandwich domain is highlighted in red. N-glycans are shown in green

N-glycosylation of HPSE is known to be essential for its proper cellular trafficking, and its secretion by cells into the extracellular space [57]. Of the six N-glycosylation sites predicted by analysis of the HPSE sequence, five were visible in the crystal structure of unliganded HPSE, although endoglycosidase H digestion carried out prior to protein crystallization meant that most of these were only visible as a single N-linked GlcNAc.

One of the more curious findings of HPSE biology has been the discovery of HPSE in cell nuclei, where it appears to co-localise with highly transcribed euchromatin regions of the genome [58]. Nuclear HPSE can alter the expression of tumor-promoting genes such as matrix metalloproteinase 9 (MMP-9), vascular endothelial growth factor (VEGF) and hepatocyte growth factor (HGF) [58,59,60,61]. Two putative nuclear import signals were noted by Schubert et al. in the HPSE sequence: residues 271–277 (PRRKTAK) and residues 427–430 (KRRK) [62]. While the Pro271-Lys277 sequence forms an alpha helix (and also corresponds to HBD II), Lys427-Lys430 appears in the HPSE crystal structure as a disordered loop near the β-sandwich domain of HPSE. Lack of secondary structure renders this loop free to interact with the importin machinery involved in nuclear trafficking [63] and is thus consistent with a role for Lys427-Lys430 as a nuclear import signal.

4.2 Structural Insights into HPSE Substrate Interactions

The defining feature of HPSE mediated HS cleavage is the high degree of sequence discrimination displayed by HPSE, rendering only certain sites in a HS chain susceptible to enzymatic attack. This behavior is in marked contrast to the bacterial heparin lyases, which carry out a much more complete breakdown of HS with little regard for the sequence of the sugar chain [64,65,66]. The high specificity of HPSE cleavage was noted as early as the 1970s, with observations that HPSE mediated cleavage of HS produces products intermediate in size between the initial substrate and fully depolymerized HS, and that these oligosaccharide products are resistant to further hydrolysis by HPSE [26, 28, 30,31,32].

Although sulfation of HS substrates had long been suspected to be important for recognition and cleavage by HPSE, early studies on this topic were hampered by a lack of pure enzyme preparations and chemically defined HS substrates. In this light, the pioneering 1999 study by Pikas et al. on HPSE cleavage site specificity stands as an impressive milestone given the state of HPSE research at the time [67]. Subsequent advances in cloning and recombinant expression of purified HPSE, and the advent of chemoenzymatic HS synthesis have contributed to several reassessments on this topic (summarized in Table 5.1) [68,69,70,71]. Whilst studies of HPSE cleavage site specificity agree on a few central points: that HPSE is an endo-β-D-glucuronidase, and that sulfation around the HPSE target site is essential for cleavage, the finer details of substrate recognition, especially regarding the specific sulfation patterns required for cleavage, have been a source of disagreement.

Table 5.1 Summary of studies on HPSE-HS cleavage site specificity

Structural biology can help to address such questions regarding enzyme/substrate specificities, via the direct visualization of enzyme-substrate complexes. Given the heterogeneous nature of HS itself, the development of specific chemoenzymatically driven HS oligosaccharide synthesis by Petersen and Liu was a crucial foundation for our work to characterize well-defined HPSE-HS complexes [70].

We utilized 3 distinct synthetic, commercially available HS oligomers with different sulfation states to probe HPSE-HS interactions with both non-sulfated and sulfated substrates. M04 S00a and M04 S02a contain no sulfates and 1 N-sulfate, respectively, and are not HPSE substrates. In contrast, M09 S05a contains 4 N-sulfates and 1 O6-sulfate, endowing this oligosaccharide with a consensus HPSE cleavage site (Fig. 5.4a). Soaking crystals of HPSE with these defined HS oligosaccharides enabled the capture of HPSE-HS complexes in crystallo, allowing the molecular basis for interactions between HPSE and its substrates to be mapped (Fig. 5.4b).

Fig. 5.4

(a) HS and heparin oligosaccharides used to obtain ligand complexes with HPSE. Carbohydrate symbol nomenclature as in Fig. 5.1. M09 S05a contains a consensus HPSE cleavage site – highlighted in the red box, with the cleaved bond indicated by the red arrow. pNP - para-nitrophenol. (b) Ribbon and surface figure of an M04 S00a oligosaccharide bound within the active site cleft of HPSE (grey sticks). HBDs and other basic residues around the HPSE binding cleft are highlighted pink and cyan respectively

4.2.1 HPSE Interactions at the −1 Subsite

The −1 subsite of a glycosidase enzyme is the position occupied by the sugar which directly undergoes glycosidic bond cleavage by the enzyme [72]. In all HPSE-HS complexes we obtained (PDB accession codes: 5E97, 5E98, 5E9B; complexes with M04 S00a, M04 S02a and M09 S05a respectively), the −1 subsite was occupied by a GlcUA, making identical interactions to the enzyme active site in all cases, illustrating the invariant nature of GlcUA binding at this position. Glycosidases such as HPSE utilize two key catalytic residues to facilitate substrate hydrolysis, a nucleophile and a general acid/base (detailed reviews of glycosidase mechanisms can be found in Refs. [73,74,75,76,77]). GlcUA at the −1 subsite of HPSE positions its anomeric center proximal to the catalytic residues Glu343 (nucleophile) and Glu225 (acid/base), in a position ready to undergo attack by the enzyme.

The HPSE −1 enzyme subsite is also characterized by a dense network of H-bonding interactions, made to the C6 carboxylate of the GlcUA from Gly349, Gly350, and Tyr391. These H-bonding interactions appear to be highly conserved amongst HPSE and its homologs [78, 79], and likely function as a specificity filter to recognize and bind GlcUA over superficially similar sugars such as glucose (Fig. 5.5).

Fig. 5.5

HPSE-HS interactions at the −1 enzyme subsite with M04 S00a. For clarity, only the −1 subsite ligand atoms have been shown

4.2.2 HPSE Interactions at the −2 Subsite

Whilst −1 subsite interactions in HPSE were observed to be invariant between M04 S00a, M04 S02a and M09 S05a complexes, differences at the −2 subsite of HPSE could be discerned, highlighting the interactions employed by HPSE to recognize different HS sulfation patterns. M04 S00a, which contains no sulfation and is not cleaved by HPSE, places its −2 GlcNAc N-acetyl moiety near residues Ala388-Tyr391 and Asn64, making direct H-bonds to Tyr391, Asn64, and an ordered water molecule. M04 S02a, which differs from M04 S00a by the presence of an N-sulfate, places its −2 GlcNS in the same orientation as the GlcNAc of M04 S00a. However, the larger N-sulfate of GlcNS can make an additional H-bonding interaction to the backbone amide of Gly389, thus rationalizing the preferred interaction of HPSE with GlcNS at the −2 subsite.

The role of O6 sulfation at the −2 subsite was probed by the M09 S05a complex, which showed that the O6 sulfate of GlcNS(6S) was placed towards the ‘upper’ portion of the −2 subsite, proximal to some of the basic residues lining the substrate binding cleft (Lys158 and Lys159). Although we could not observe ordered interactions between O6 sulfate and these basic residues, non-directional electrostatic interactions likely play a role in stabilizing the −2 subsite complex between HPSE and O6 sulfated HS substrates (Fig. 5.6).

Fig. 5.6

HPSE-HS interactions at the −2 enzyme subsite with M04 S00a, M04 S02a, and M09 S05a. For clarity, only the −2 subsite ligand atoms have been shown

4.2.3 HPSE Interactions at the +1 Subsite

The primary disadvantage of employing HS oligosaccharides to generate HPSE complexes is the propensity of the enzyme to turn over substrates that match the requirements for HS cleavage. Thus, whilst M04 S00a and M04 S02a were observed in crystallo to place non-hydrolysed pNP groups at the +1 subsite of HPSE, the presence of a consensus HPSE cleavage site in M09 S05a left the +1 subsite of the enzyme in this complex poorly occupied, due to enzymatic cleavage of the aglycon fragment.

To circumvent this problem, we turned to heparin, a close structural analog of HS, and a known inhibitor of HPSE activity. Soaking HPSE crystals with a heterogeneous heparin dp4 oligosaccharide (obtained through heparin lyase cleavage of polymeric heparin; Fig. 5.4a) yielded a structure with interpretable heparin dp4 electron density within the HPSE active site cleft (PDB accession code: 5E9C). This observed density likely corresponded to a minor component of the heparin dp4 mix, as it was substantially weaker than the electron density observed for the pure HS oligosaccharides. However, this heparin dp4 density spanned the −2, −1 and (crucially) +1 positions of the HPSE active site, thus providing insight into the nature of HPSE +1 subsite interactions.

HPSE +1 subsite interactions with heparin dp4 were broadly similar to those observed at the −2 subsite with M09 S05a, except the helical nature of HS and heparin substrates reversed the roles of N- and O6 sulfates at the +1 subsite. Analogous to the role of −2 subsite N-sulfate, we observed H-bonds from +1 subsite O6 sulfate to the sidechain and backbone amide of Gln270. Electron density for the +1 subsite N-sulfate of dp4 was too poor to be observed directly. However, a +1 subsite N-sulfate could only plausibly be modeled towards the ‘top’ of the HPSE binding cleft, in position to make electrostatic contacts with the basic residues lining this region (Arg303 and Arg232) (Fig. 5.7).

Fig. 5.7

HPSE-HS interactions at the +1 enzyme subsite with heparin dp4. For clarity, only the +1 subsite ligand atoms have been shown

Taken together, the combined structural data from complexes with M04 S00a, M04 S02a, M09 S05a, and heparin dp4 indicate that HS sulfates in the ‘upper’ portion of the HPSE binding cleft (−2 subsite O6 sulfate, +1 subsite N sulfate) electrostatically interact with the basic residues around the cleft. In contrast, sulfates ‘lower’ in the HS-binding cleft appear to make direct H-bonding interactions with HPSE residues and ordered water molecules. Our structures indicate a ‘dual mode’ of interaction between HPSE and its substrates, with the ‘lower’ H-bonds likely acting as specificity filters for sulfation (due to the directional nature of H-bonding), while the ‘upper’ electrostatic interactions stabilize the binding of HS within the active site cleft (Fig. 5.8).

Fig. 5.8

HPSE-HS interactions across the −2, −1 and +1 subsites of the HPSE binding cleft, as mapped by complexes with HS oligosaccharides and heparin dp4. The site of enzymatic cleavage is shown (//). H-bonding interactions between HPSE and HS are illustrated in blue. Basic residues in place to make electrostatic interactions with HS substrates are shown in red. Nuc. – catalytic nucleophile (Glu343). (a./b.) – catalytic acid/base (Glu225)
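The trisaccharide recognition logic summarized above can be sketched as a toy scan over an HS chain. This is an illustration of the structural observations only, not a validated predictor of cleavage: the class `Sugar`, the function `score_site`, and the equal weighting of the four sulfate contacts are assumptions introduced here. The chain is written from the non-reducing to the reducing end, so the −2 sugar precedes the −1 GlcUA and the +1 sugar follows it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sugar:
    kind: str                 # "GlcUA", "IdoUA", "GlcNAc" or "GlcNS"
    o6_sulfate: bool = False  # only meaningful for GlcN residues


def score_site(chain, i):
    """Toy score for cleavage of the bond following chain[i] (the -1 sugar).

    Returns None if chain[i] is not a GlcUA flanked by -2 and +1 residues.
    Each contact counts equally, which is an arbitrary simplification.
    """
    if i < 1 or i + 1 >= len(chain) or chain[i].kind != "GlcUA":
        return None
    minus2, plus1 = chain[i - 1], chain[i + 1]
    score = 0
    score += minus2.kind == "GlcNS"  # 'lower' H-bond: -2 N-sulfate to Gly389
    score += plus1.o6_sulfate        # 'lower' H-bond: +1 O6 sulfate to Gln270
    score += minus2.o6_sulfate       # 'upper' electrostatics: Lys158/Lys159
    score += plus1.kind == "GlcNS"   # 'upper' electrostatics: Arg303/Arg232
    return score


# Hypothetical chain: a sulfated stretch followed by an unsulfated one.
chain = [
    Sugar("GlcNS", o6_sulfate=True), Sugar("GlcUA"),
    Sugar("GlcNS", o6_sulfate=True), Sugar("IdoUA"),
    Sugar("GlcNAc"), Sugar("GlcUA"), Sugar("GlcNAc"),
]

candidates = [(i, s) for i in range(len(chain))
              if (s := score_site(chain, i)) is not None]
print(candidates)  # -> [(1, 4), (5, 0)]
```

In this hypothetical chain the GlcUA at index 1 sits in a fully sulfated context and receives the maximum score, the GlcUA at index 5 lies in an unsulfated context and scores zero, and the IdoUA at index 3 is never considered because only GlcUA is accepted at the −1 subsite. The sketch deliberately omits any contribution from sulfates outside the −2 to +1 window.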

4.2.4 Beyond the +1 Subsite

Although the crystal structures of HS oligosaccharide complexes point to a maximally favored trisaccharide cleavage site, they do not rationalize all findings from biochemical studies of HPSE cleavage specificity. Observations that HS hexasaccharides are preferentially cleaved by HPSE over shorter tetrasaccharides [69], and the ability of HPSE to cleave substrates lacking −2 or +1 subsite O-sulfation, but containing −3 or +2 sulfation [71], hint at interactions beyond the core −2, −1 and +1 subsites that were not captured in our HPSE crystal structure complexes.

The most likely candidates for mediating additional interactions between HPSE and HS are the HBDs postulated by Levy-Adam et al., which may help to bind sulfates outside of the core trisaccharide cleavage site. Modeling studies of HPSE with either the highly sulfated HPSE inhibitor SST0001 [80], or the synthetic HS pentasaccharide fondaparinux [81], suggest that a +2 subsite GlcUA(2S) could interact with basic residues in HPSE HBD2, and thus contribute toward enzyme-substrate binding [82]. It may be the case that HPSE-HS interactions under native contexts are less strictly defined than those captured by static crystal structures, with more distant sulfates potentially being able to compensate for lack of sulfation around the core trisaccharide cleavage site.

4.3 3-Dimensional Structure of proHPSE

Proteolytic excision of the 6 kDa linker peptide of proHPSE is required for its maturation to HPSE, indicating a role for this peptide in inactivating proHPSE towards HS substrates. Based on the positions of the 8 kDa and 50 kDa subunit C- and N-termini (respectively) in mature HPSE, we postulated that the 6 kDa linker peptide of proHPSE would likely lie near the HPSE substrate binding cleft, implying a steric occlusion mechanism for inactivation. This steric mechanism for proHPSE inactivation by its 6 kDa linker peptide was confirmed by the 2017 crystal structure of proHPSE (PDB accession code: 5LA4). Broadly speaking, the proHPSE 6 kDa linker peptide forms a predominantly α-helical domain which sits ‘atop’ the HPSE binding cleft, thereby preventing the HPSE active site from binding HS [48]. Occlusion of HS-binding appears to be the only mechanism whereby the 6 kDa linker peptide inactivates HPSE (Fig. 5.9). Indeed, protein engineering to ‘shrink’ the proHPSE linker peptide produces an enzyme with HS degrading activity similar to wild type HPSE [46].

Fig. 5.9

Three-dimensional structure of human proHPSE, showing ‘top’ (left) and ‘side’ (right) views. The 6 kDa linker peptide of HPSE (green) sterically occludes the HS-binding cleft (compare with Fig. 5.3). HBDs are highlighted in pink. Tyr156 and Gln157, which form part of the CTSL cleavage site involved in HPSE maturation, are highlighted in red. CTSL cleavage occurs between Gln157 and Lys158 (part of HBD I). A ‘binding pocket’ structure can be discerned on the surface of proHPSE, and is shown here in complex with a glucuronidase specific activity-based probe (grey sticks)

ProHPSE readily binds to cell surface HSPGs and undergoes internalization and trafficking to the lysosome, whereupon it undergoes processing to produce mature HPSE [83,84,85]. This process of proHPSE sequestration has been proposed to contribute to aggression and metastasis in cancer cells, by providing a mechanism for these cells to capture extracellular proHPSE and increase their own stores of mature HPSE. Although the substrate binding cleft is occluded in proHPSE, HBD1 and HBD2 remain freely accessible on the surface of proHPSE, and electrostatic interactions between these HBDs and cell surface HSPGs may facilitate proHPSE binding to cell surfaces, with subsequent internalization and processing. Proteolytic processing of proHPSE is mediated by cathepsin L (CTSL), with one key cleavage occurring between Gln157 and Lys158, directed by CTSL recognition of the nearby Tyr156 residue [86, 87]. Tyr156-Lys158 in the proHPSE structure reside within a highly disordered turn towards the end of the 6 kDa linker sequence, where they would be freely accessible for interaction with CTSL (or another protease).

One of the most surprising discoveries upon solving of the proHPSE structure was that the 6 kDa linker peptide only obscures part of the HPSE binding cleft, with a substantial ‘binding pocket’ still remaining on the protein surface. Whilst large HS substrates are occluded from proHPSE, the ‘binding pocket’ of proHPSE renders its active site residues fully accessible to smaller ‘monosaccharide’ like molecules. We confirmed the catalytic competency of proHPSE (at least towards artificial substrates) by labeling the protein with aziridine based activity-based probes, which are highly activated substrate mimics we have previously utilized to study many classes of glycoside hydrolase (PDB accession code: 5LA7; Fig. 5.9) [88,89,90,91]. It remains to be seen whether there are biologically relevant substrates in vivo which are turned over by the proHPSE ‘binding pocket’, or whether this motif is an evolutionary relic from an ancestral enzyme (discussed further in Sect. 5.5.1.).

5 HPSE Within the Broader CAZy Classification

As an enzyme which catalyzes the hydrolytic breakdown of a carbohydrate substrate, HPSE falls within the general enzyme class known as the glycoside hydrolases (or glycosidases). Glycoside hydrolases are a diverse group of enzymes, which facilitate the hydrolytic breakdown of carbohydrate-containing biomolecules (e.g., glycoproteins, polysaccharides, small molecule glycoconjugates) in varied contexts across all domains of life [92].

Reflecting the central importance of carbohydrate-containing molecules in biology, it has been estimated that ~1–3% of the protein-coding genome of a (non-archaeal) organism corresponds to enzymes involved in carbohydrate processing (both for synthesis and breakdown) [93]. The Carbohydrate Active enZymes (CAZy) classification aims to classify these carbohydrate processing enzymes into sequence-based families [93,94,95,96,97,98,99,100,101,102,103,104]. Given that protein sequence largely dictates structure and function, CAZy families typically contain enzymes with similar structural folds and enzyme mechanisms, although the specific substrates processed by enzymes within a family can vary. Under the CAZy classification, HPSE belongs to the GH79 family, itself further classified into the broader GH-A clan (clans are based on groupings of GH families with similar overall topologies and conservation of active site residues) [100, 105]. The GH79 family primarily consists of retaining β-D-glucuronidases, although the substrate contexts of these glucuronic acid residues are diverse, including HS [47, 79], but also chondroitin sulfate [79], hyaluronic acid [46], β-D-glucuronides linked to plant arabinogalactan proteins [106, 107], and small molecule β-D-glucuronide glycoconjugates [108].

5.1 Structural Determinants of exo vs. endo-Glycosidase Activity in the GH79 Family

The GH79 family contains representatives of both endo-acting and exo-acting β-D-glucuronidases, raising the question of how a single ‘scaffold’ can be adapted to process substrates in either an exo- or endo-acting fashion. To date, three GH79 enzymes have been structurally characterized: HPSE [47, 48], AcGH79 from Acidobacterium capsulatum [78], and BpHep from Burkholderia pseudomallei [79]. In keeping with the scope of CAZy classification, there is substantial sequence and structural homology between these three enzymes and all three act as β-D-glucuronidases, although the natural substrates of the two bacterial enzymes are not known.

One area of major variability in GH79 family enzymes is a loop which we have termed the ‘exo-pocket’ loop, which connects the second β-strand of the (β/α)8 barrel domain to the second α-helix. Comparison of the three structurally characterized GH79 enzymes demonstrates that the ‘exo-pocket’ loop can vary dramatically in size, and appears to act as a key structural determinant of whether an enzyme of the GH79 family behaves as an exo- or an endo-acting glycosidase.

5.1.1 AcGH79

The crystal structure of the exo-acting β-D-glucuronidase AcGH79 was reported by Michikawa et al. in 2012, and was the first enzyme of the GH79 family to be structurally characterized (PDB accession code: 3VNY) [78]. Although the function of AcGH79 in its native biological context is not well understood, the authors determined that this enzyme could not hydrolyze 4-O-methyl GlcUA containing substrates. AcGH79 may be involved in the catabolism of plant arabinogalactan proteins, which typically contain both GlcUA and 4-O-methyl GlcUA substitutions on the main arabinogalactan polymer [106, 107].

The ‘exo-pocket’ loop of AcGH79 is 23 residues long, extending from Phe86 to His108 (limits defined on the basis of homology to BpHep and proHPSE; Fig. 5.10). This sequence adopts an extended turn that occludes the ‘rear’ face of the AcGH79 active site, delimiting an exo-acting substrate binding pocket that can only accommodate a single GlcUA residue. Discrimination of 4-O-methyl GlcUA vs. GlcUA is facilitated by Glu45, Pro104, and His327, which together form a tight binding pocket around O4 of GlcUA that leaves no room for a further methyl substituent.

Fig. 5.10

(a) ‘Exo-pocket’ loop structures for AcGH79, BpHep and proHPSE (highlighted green) showing their role in delineating an exo-acting binding pocket structure in AcGH79 and proHPSE, or alternatively, an endo-acting binding cleft structure in BpHep. Proteolytic removal of the ‘exo-pocket’ loop of proHPSE (i.e. the 6 kDa linker peptide) reveals the endo-acting binding cleft of mature HPSE. (b) Clustal Omega [149] alignments of AcGH79, BpHep, and HPSE showing the variation in ‘exo-pocket’ loop lengths between these three proteins.

5.1.2 BpHep

BpHep was the first endo-acting GH79 enzyme to be structurally characterized, and the second structurally characterized GH79 enzyme overall (PDB accession code: 5BWI) [79]. BpHep is an endo-β-D-glucuronidase which can degrade both heparan sulfate and chondroitin sulfate, suggesting it may be a general glycosaminoglycan breakdown enzyme. Saturation-transfer difference NMR binding experiments using defined HS oligomers suggest that BpHep prefers to interact with HS cleavage sites rich in GlcNAc rather than GlcNS, indicating a different HS substrate specificity to that displayed by HPSE.

Compared to AcGH79, the ‘exo-pocket’ loop of BpHep is substantially shorter (Gly92 to Asp99; 8 residues long; Fig. 5.10), and is not long enough to occlude any part of the enzyme active site. Instead, this very short ‘exo-pocket’ loop leaves the BpHep active site open, revealing an extended endo-acting binding cleft that is well suited for interaction with glycosaminoglycan substrates.

5.1.3 (pro)HPSE

ProHPSE to HPSE maturation provides the most direct example of the role of the ‘exo-pocket’ loop in controlling exo-/endo-activity in GH79 family enzymes. The ‘exo-pocket’ loop of (pro)HPSE (residues 110 to 157; 48 residues long) directly corresponds to the 6 kDa linker peptide and is substantially larger than the corresponding ‘exo-pocket’ loop sequences in AcGH79 and BpHep (Fig. 5.10). As mentioned in Sect. 5.4.3, the 6 kDa linker of proHPSE forms an alpha-helical domain that acts as a direct steric block ‘above’ the HPSE binding cleft, preventing interaction of the enzyme with HS substrates. Removal of the proHPSE linker is required for unmasking of the mature HPSE binding cleft, and can be considered analogous to the effect of the minimal ‘exo-pocket’ loop sequences of BpHep [79] and engineered single chain HPSE mutants [46].
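The loop lengths quoted for the three GH79 enzymes follow from inclusive residue counting, which can be checked with a minimal sketch such as the one below. The residue limits are taken from the text; the function names and the average residue mass of ~110 Da are illustrative assumptions (a common rule of thumb, not a value from this chapter), so the estimated masses are only rough guides and will differ from sequence-derived values.

```python
# Inclusive residue limits of the 'exo-pocket' loop, as quoted in the text.
EXO_POCKET_LOOPS = {
    "AcGH79": (86, 108),    # Phe86 to His108
    "BpHep": (92, 99),      # Gly92 to Asp99
    "proHPSE": (110, 157),  # corresponds to the 6 kDa linker peptide
}

# Assumption: rule-of-thumb average mass per amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def loop_length(start: int, end: int) -> int:
    """Number of residues in the inclusive range start..end."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def approx_mass_kda(n_residues: int) -> float:
    """Rough peptide mass in kDa from the residue count (approximation only)."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def summarize() -> dict:
    """Map each enzyme to (loop length, approximate loop mass in kDa)."""
    summary = {}
    for name, (start, end) in EXO_POCKET_LOOPS.items():
        n = loop_length(start, end)
        summary[name] = (n, round(approx_mass_kda(n), 1))
    return summary


if __name__ == "__main__":
    for name, (n, kda) in summarize().items():
        print(f"{name}: {n} residues, roughly {kda} kDa")
```

Under these assumptions the 48-residue proHPSE loop comes out at roughly 5 kDa, the same order as the 6 kDa linker described in the text; the exact figure depends on the actual amino acid composition.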

Comparison of proHPSE with AcGH79 and BpHep places the exo-glycosidase-like binding pocket of proHPSE into an understandable evolutionary context (see Sect. 5.4.3). We hypothesize that expansion of the ‘exo-pocket’ loop sequence from an ancestral exo-acting GH79 enzyme led to formation of the 6 kDa proHPSE linker, without formal loss of the exo-acting binding pocket architecture. The retention of such exo-glycosidase-like structural features on proHPSE raises the question of whether there are genuine substrates that are processed by this protein species in vivo. In support of this possibility, mature HPSE has been demonstrated to possess exo-glycosidase activity against terminal glucuronides within certain HS contexts [109]. There is no structural reason why proHPSE would not also possess such activity, and thus it may play a role in, e.g., trimming terminal glucuronides from certain HS chains.

6 Concluding Remarks and Future Challenges

HPSE has captured the interest of researchers for over four decades, with efforts to understand its function ranging from fundamental biochemical studies on HPSE enzymatic activity to complex biomedical characterizations of its role in cancer and other diseases. Structural studies of HPSE and proHPSE provide a framework on which to place these biochemical and biomedical insights, allowing them to be related to features on the HPSE protein itself.

There are still a number of unresolved challenges in the HPSE field, which will likely be the subject of substantial research efforts in the coming years.

Most pressingly, despite intense interest in HPSE as an anti-cancer target, there are few effective HPSE inhibitors known, and none in use clinically. Various small molecule HPSE inhibitors have been reported, based on benzoxazole, furanylthiazole [110], isoindole [111], benzimidazole [112, 113], and other scaffolds. However, none of these small molecule inhibitors appear to have progressed beyond initial enzyme inhibition and invasion/angiogenesis studies. More recently, four HPSE inhibitors have entered clinical trials: PI-88 [114,115,116,117,118,119], SST0001 [82, 120,121,122], M402 [123, 124] and PG545 [125,126,127,128,129], although an interim analysis of PI-88 phase III clinical trials showed a failure to meet its primary endpoint (disease-free survival) [130] (Chhabra & Ferro; Noseda et al.; Hammond & Dredge, Chaps. 19, 21 and 22 in this volume). All HPSE inhibitors currently in clinical trials are highly sulfated oligosaccharide molecules, and of these only PG545 possesses a well-defined molecular structure. Such oligosaccharide-derived molecules are less likely to possess desirable pharmacokinetic properties, and a renewal of efforts to develop novel small molecule HPSE inhibitors may be timely.

The HPSE field also lacks a reliable, sensitive, and facile method for quantitation of HPSE enzymatic activity [131]. The development of routine activity assays, often relying on artificial chromogenic and fluorogenic substrates, has been essential for enzyme discovery and enzyme characterization efforts in the glycosidase field [132,133,134,135,136]. Robust assays are vital for effective inhibitor development, since a potential inhibitor cannot be quantitatively characterized without a suitable assay in which to measure its effect on enzyme activity. The lack of ‘gold standard’ assays for HPSE probably reflects the complex nature of its interaction with HS, which may be difficult to recapitulate in artificial substrates.

Finally, the discovery of a close homolog of HPSE, termed HPSE2, which can bind HS but lacks glycosidase activity [137], raises questions regarding the biological functions of HPSE2, and how they might relate to HPSE. HPSE2 expression inversely correlates with the size and grade of tumours [138, 139], and it appears to act as an anti-tumorigenic factor [140], possibly through antagonism of HPSE activity [141]. Biallelic mutations in HPSE2 have been linked to the rare genetic condition urofacial syndrome (UFS; also known as Ochoa syndrome) [142,143,144,145], a disease characterized by facial grimacing coupled to loss of adequate urinary voiding [146]. Such symptoms may indicate that HPSE2 plays a role in urinary tract and/or neurological development [147, 148] (Mckenzie; Roberts and Woolf, Chaps. 34 and 35 in this volume).

We anticipate that meeting the above (and other) challenges in the HPSE field will greatly benefit from an improved understanding of HPSE structure/function relationships. Structure-guided development of new methods to assess and modulate HPSE activity will doubtless lead to improvements in our ability to treat HPSE-driven cancers and other HPSE-related diseases. More fundamentally, improved molecular understanding of HPSE activity will also help us better understand the many varied roles of this enzyme in the regulation of HSPGs and the ECM.