5.1 Introduction

Posttranslational modifications (PTMs) control almost all (patho-)physiological processes in living systems by affecting protein folding, conformation, turnover, cell communication, localization and protein function in an orchestrated manner. Various PTMs can modulate a single protein, leading to an enormous number of protein isoforms, about three orders of magnitude higher than the number of genes encoded in the genome (Cox and Mann 2007; Witze et al. 2007). As a result, PTMs are pivotal in many diseases, such as cancer, Alzheimer’s disease, and diabetes. The most abundant and ubiquitous PTM is phosphorylation, which plays major roles in signal transduction and regulation of enzyme activity (Kamath et al. 2011). For example, abnormal phosphorylation causes the protein tau to aggregate in Alzheimer’s disease (Martin et al. 2011). Ubiquitination decides the fate of proteins, since covalent attachment of ubiquitin leads to proteasomal or lysosomal degradation of a protein. Other common and well-investigated PTMs are (a) glycosylation, affecting protein stability, solubility and cell-cell interactions, (b) acetylation, controlling cell signaling processes and gene expression by modification of histones and (c) methylation, which, like acetylation, affects gene expression by modifying histones (Kamath et al. 2011).

Proteolysis has emerged as one of the most prevalent PTMs, regulating numerous physiological processes such as development, immune response or blood clotting. Hence, it is not surprising that >560 proteases are encoded in the human genome, representing the second largest enzyme family in man (Puente et al. 2003). In general, proteases break down polypeptide chains by hydrolysis of peptide bonds (Lopez-Otin and Overall 2002). Proteolytic cleavage is of particular interest compared to most PTMs since it is an irreversible process occurring intra- and extracellularly. The “irreversible nature” associates proteolytic processing with fundamental steps in cell function. Consequently, proteolysis is a tightly regulated process that affects every protein either through limited proteolysis or terminal degradation.

Limited proteolysis describes a process in which proteins are functionally modified by a proteolytic cleavage yielding truncated, stable cleavage products. The functional consequences of limited proteolysis can be diverse. Proteins can be activated, inactivated or transferred to another cell compartment, thereby maintaining the proper operation of the cell machinery. Examples of limited proteolysis include the proteases involved in blood coagulation (Turk et al. 2012) and ADAM17 (a disintegrin and metalloproteinase 17), a membrane-bound protease that cleaves cell surface proteins such as cytokines (e.g. TNFα) and cytokine receptors (e.g. IL-6R and TNF-R), thereby regulating cell signaling (Scheller et al. 2011). Less specific proteases, including the cathepsins located in the lysosomal compartment and the proteasome, are involved in both limited proteolysis and protein turnover, thereby serving as a quality control by degrading denatured, misfolded and obsolete proteins (Wickner et al. 1999). Since 3–5 % of our cellular proteins are degraded and resynthesized every day (Ciechanover 2006), the rates of synthesis and degradation are increasingly important for fully appreciating cellular dynamics and for bridging the gap between transcriptome and proteome data (Jayapal et al. 2010).

An imbalance between proteases, protease inhibitors, and protease substrates can be deleterious, and many developmental disorders and diseases, including cancer and neurodegenerative diseases, are accompanied by dysregulation of proteolysis (Doucet et al. 2008; Quesada et al. 2009; Reiser et al. 2010). Proteases are therefore considered valuable and suitable drug targets (Drag and Salvesen 2010). To fully understand protease functions in physiology and pathology, and to exploit the therapeutic potential of protease inhibition, knowledge about the in vivo substrate repertoire of a protease is needed (auf dem Keller and Schilling 2010; Impens et al. 2010). The term “degradomics” summarizes all proteomic investigations and techniques regarding the genetic, structural and functional identification and characterization of proteases, and their substrates and inhibitors (Lopez-Otin and Overall 2002). Mass spectrometry-based proteomic techniques are applied to detect alterations in substrate abundance caused by the lack or overexpression of a protease, to identify the neo amino- and carboxy-termini generated by a protease, and to determine the protease cleavage site specificity. Moreover, activity-based probes (ABPs) specifically target active proteases and monitor their activity from the cellular to the organ level, demonstrating colocalization of proteases and their substrates (auf dem Keller and Schilling 2010).

Degradomic approaches have revealed that proteases do not operate in isolation. They build proteolytic systems comprising substrates and cleavage products, protease inhibitors and proteases that dynamically interact, thereby forming cascades, regulatory circuits and pathways. These dynamics form the so-called “protease web”, which is in constant flux (Butler and Overall 2009; Overall and Dean 2006). For instance, the aspartic protease cathepsin D degrades the protease inhibitor cystatin C (Laurent-Matha et al. 2012). Cystatin C not only targets cysteine cathepsins but also acts on matrix metalloproteinase (MMP)-2 (Butler and Overall 2009). Lower cystatin C levels potentially affect MMP-2 levels, which in turn will affect other proteases and protease inhibitors. This scenario illustrates that certain proteases are important nodes in this network, modulating and balancing the activity of proteases and protease inhibitors and enabling the cell to react to perturbations. The direct role of a protease is hard to elucidate in knockout or transgenic animal models, since a knockout of one protease often leads to unexpected downstream effects mediated by the protease web. The protease web becomes even more complex when PTMs are taken into account, since these can control the action of proteases in vivo, including regulation of gene expression, catalytic activity, protein interactions and degradation.

Mass spectrometry (MS) has become a powerful tool for qualitative and quantitative analysis of a large number of proteins and their PTMs in complex samples. The basis was laid by the invention of the soft ionization techniques matrix-assisted laser desorption/ionization (MALDI) (Tanaka et al. 1988) and electrospray ionization (ESI) (Fenn et al. 1989). MS analysis for protein quantification employs two main strategies: label-free methods, which directly compare intensity profiles from liquid chromatography-mass spectrometry (LC-MS) runs, and the use of distinct mass tags, in which samples (e.g. healthy versus diseased or wild-type versus gene knockout) are labeled with either the light or the heavy stable isotope and compared directly in one MS run (Ong and Mann 2005). The aim of a qualitative proteomic analysis is to identify all proteins present in a sample. The coverage of a proteomic sample depends on the sensitivity of the mass spectrometer. Important parameters are (a) the minimum amount of analyte that can be detected, (b) the dynamic range, i.e. the signal range in which analytes can be identified, and (c) the duty cycle, i.e. the number of fragmentation spectra that can be recorded per time frame (de Godoy et al. 2006). Importantly, a protein cannot be considered absent from the sample merely because it was not identified in a mass spectrometric analysis (Helsens et al. 2011). Protein modifications occur only in a subpopulation of the proteome. Mass spectrometry is therefore combined with separation and enrichment techniques to characterize peptides decorated with PTMs, e.g. phosphorylation (McNulty and Annan 2008), glycosylation (Geng et al. 2001), lysine acetylation (Choudhary et al. 2009) or ubiquitination (Peng et al. 2003).

In the following sections we give an overview of emerging degradomic techniques that enable the characterization of protease cleavage sites on a large, proteome-wide scale. We outline how identified cleavage sites can be validated by profiling of protease specificity and by monitoring of proteolysis with activity-based probes. Recent proteomic approaches highlight important nodes of the proteolytic network, illustrate how proteolysis is modulated by other PTMs, and emphasize its role in protein turnover.

5.2 Identification of Cleavage Sites by N- and C-Terminal Degradomics

5.2.1 Overview

In the field of proteolysis research, quantitative gel-based and gel-free proteomic strategies can monitor how differential proteolytic activity affects protein abundance on a proteome-wide scale (auf dem Keller and Schilling 2010). As a typical gel-based approach, two-dimensional gel electrophoresis in combination with MS is applied to determine protein abundance by comparing spot-staining intensity. The most common gel-free approaches determine alterations in protein abundance by comparing stable isotope labeled samples, allowing relative quantification in a single MS measurement. Several labeling strategies have been successfully applied, such as isotope-coded affinity tag (ICAT) (Gygi et al. 1999), isobaric tag for relative and absolute quantification (iTRAQ) (Ross et al. 2004) or stable isotope labeling by amino acids in cell culture (SILAC) (Ong et al. 2002). In a prototypical proteomic-degradomic experiment, a sample that was exposed to proteolysis is mixed with an equal amount of an unexposed control sample (e.g. protease inhibitor treated versus untreated, wild-type versus knockout). In the meantime, multi-tag labeling strategies have evolved, e.g. triple SILAC (Mann 2006) or iTRAQ, which allows up to eight different tags (Prudova et al. 2010).

Global quantitative comparisons are not suitable for the identification of direct cleavage sites: owing to sample complexity, internal peptides would overshadow terminal peptides in the MS analysis. Furthermore, such comparisons insufficiently profile subtle alterations caused by limited proteolysis. For the identification of cleavage events, terminomic techniques are applied instead. Since every cleavage event generates a neo-N-terminus and a neo-C-terminus on the cleavage products, terminal proteomic strategies (“terminomics”) select either N- or C-terminal peptides in order to enrich neo-N-terminal or neo-C-terminal peptides from complex peptide mixtures (Overall and Blobel 2007). This selection process reduces complexity by excluding the overwhelming amount of noninformative internal peptides. Enrichment of terminal peptides before MS analysis simplifies the proteome by using single terminal peptides for protein identification, which increases dynamic range and proteome coverage (Gevaert et al. 2007). A drawback of single peptide identification is that MS analysis may not identify all terminal peptides, because some are too long, too short or too hydrophobic, or ionize and fragment poorly (Huesgen and Overall 2012; Impens et al. 2010). Terminomic strategies targeting neo-N-termini and neo-C-termini will inevitably yield an increased number of so-called “one-hit wonders”, proteins that are identified by only one terminal peptide. Such identifications are often considered less reliable. Other PTM-focused approaches (e.g. for acetylation or phosphorylation) face the same situation. Numerous PTM proteomic studies have helped to increase confidence in MS-based peptide identifications that rest on “one-hit wonders”, which are increasingly regarded as an invaluable and rich source for the proteomic understanding of biological systems. In fact, as early as 2007 (prior to most degradomic techniques), it was argued that “one-hit wonders” should not be discarded, since a large amount of biologically valuable data would be lost (Higdon and Kolker 2007). To reduce the number of false positive protein identifications, statistical validation of the search engine based spectrum-to-peptide assignment is mandatory. Different strategies can be applied, such as the use of decoy databases (Elias and Gygi 2007), statistical validation by PeptideProphet (Keller et al. 2002) and requiring identification by at least two different MS/MS spectra (Huesgen and Overall 2012). Moreover, a customized scoring tool with high sensitivity, termed Peptizer, was developed to flag potential false positive protein identifications in large data sets (Helsens et al. 2008). Another challenge is handling the large amounts of data, which is addressed by tools such as TopFind, a knowledgebase linking protein termini with function (Lange et al. 2012; Lange and Overall 2011), or MEROPS, the peptidase database (Rawlings et al. 2010).
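The decoy database strategy mentioned above can be illustrated with a minimal sketch. It assumes that each peptide-spectrum match (PSM) carries a search engine score (higher is better) and a flag marking decoy hits, and it estimates the false discovery rate (FDR) as the number of decoys divided by the number of targets above a score cutoff. This is one common estimate, not necessarily the exact formula of the cited work, and the function name, example scores and thresholds are hypothetical.

```python
from typing import List, Tuple


def decoy_score_cutoff(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> float:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: (score, is_decoy) pairs, higher score meaning a better match.
    The FDR at a cutoff is estimated as decoys / targets at or above it.
    Returns infinity if no cutoff satisfies the requested FDR.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = float("inf")
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Accept this score as cutoff if the running FDR estimate is acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical PSM list: (score, is_decoy)
example_psms = [(90.0, False), (85.0, False), (80.0, False),
                (75.0, True), (70.0, False), (60.0, True)]
print(decoy_score_cutoff(example_psms, max_fdr=0.25))
```

Because terminomic data sets contain many single-peptide identifications, such a cutoff is typically applied at the peptide level before any protein-level interpretation.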

N-terminomic techniques are classified into positive and negative selection procedures. Positive selection techniques directly target terminal peptides by modifying neo-N-termini and mature N-termini with an affinity tag (e.g. through specific biotinylation), followed by digestion (e.g. with trypsin) and removal of internal peptides. By definition, positive selection schemes fail to enrich for naturally modified protein termini. This is a major caveat, given that terminal modifications are abundant in nature. Negative selection techniques overcome this problem by modifying all primary amines, including those of all neo-N-termini and mature N-termini in the sample. In the next step, a secondary digest (e.g. with trypsin) creates new primary amines selectively on internal peptides, which are subsequently removed from the mixture. In the end, only neo-N-termini, mature N-termini that originally carried free amines, and naturally blocked mature N-termini remain. Negative selection procedures enrich for neo-N-termini and mature N-termini more efficiently and allow complete proteome coverage by isolating naturally blocked N-termini as well (Huesgen and Overall 2012). In general, negative selection procedures might be preferred over positive selection procedures (Impens et al. 2010).
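The logic of negative selection can be sketched in silico. The example below assumes that all lysine side chains have been blocked, so that trypsin cleaves only after arginine, and it ignores further rules such as missed cleavages or proline effects. Peptides that start at position 0 of the (possibly truncated) protein keep their blocked α-amine and are retained, whereas internal peptides expose a new free α-amine and are removed. The function names and the sequence are hypothetical.

```python
from typing import List, Tuple


def digest_after_arginine(sequence: str) -> List[Tuple[int, str]]:
    """Cleave after every arginine and return (start_index, peptide) pairs.

    Simplifying assumption: lysines are chemically blocked, so trypsin
    only cuts C-terminal to arginine.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue == "R" and i + 1 < len(sequence):
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    peptides.append((start, sequence[start:]))
    return peptides


def negative_selection(sequence: str) -> List[str]:
    """Keep only peptides whose alpha-amine was blocked before digestion.

    Internal peptides (start > 0) gain a free alpha-amine during the
    digest and are therefore depleted.
    """
    return [pep for start, pep in digest_after_arginine(sequence) if start == 0]


# Hypothetical cleavage product starting at its neo-N-terminus
fragment = "AKGSRLLKDRTTE"
print(digest_after_arginine(fragment))
print(negative_selection(fragment))
```

In this toy digest only the peptide carrying the original (blocked) N-terminus survives the depletion step, which mirrors why naturally blocked termini are also recovered.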

C-termini are less accessible for analysis than N-termini and so far have rarely been targeted to determine protease derived cleavage sites. The underlying problem for enrichment of C-termini is the lack of chemical methods to specifically modify terminal carboxyl groups. C-terminomic techniques normally follow a negative selection procedure. C-terminomic strategies are required to examine C-terminal processing by carboxypeptidases. They are useful to validate cleavage sites identified in N-terminomic approaches and uncover cleavage sites that are overlooked by N-terminomic approaches, e.g. N-termini closely located to the substrate C-terminus, which are too short for LC-MS/MS analysis.

5.2.2 Isolation of N-Termini: Positive Selection Procedures

5.2.2.1 Biotinylation of Protein N-Termini Using Subtiligase

In 2008 the group of J. A. Wells engineered a subtiligase enzyme to specifically biotinylate primary amines of neo-N-termini and unprotected mature N-termini (Mahrus et al. 2008). During the biotinylation procedure, a biotinylated peptide ester containing a cleavage site for the highly specific tobacco etch virus (TEV) protease reacts with subtiligase, forming an intermediate product (Fig. 5.1). In a second reaction, subtiligase transfers the biotinylated peptide onto the N-terminal α-amines, but not onto ε-amines of lysine side chains. Following trypsin digestion, biotinylated N-termini are bound to immobilized streptavidin and subsequently eluted by digestion with TEV protease. After TEV cleavage the N-termini retain a Ser-Tyr dipeptide, which helps to confirm true positives during MS analysis.

Fig. 5.1 Outline of the procedure for biotinylation of protein N-termini using subtiligase

This N-terminomic technique has been successfully applied to identify 333 caspase-like cleavage sites on 292 protein substrates in etoposide-treated apoptotic Jurkat cells (Mahrus et al. 2008). Moreover, it has been used in combination with SILAC for the discovery of caspase-1 substrates in THP-1 human monocytic leukemia cells exposed to inflammatory stimuli (Agard et al. 2010). Owing to the low efficiency of subtiligase N-terminal labeling, this technique requires large amounts of starting material (circa 50–100 mg) (Agard and Wells 2009) and may introduce a selection bias through subtiligase specificity (Huesgen and Overall 2012).

5.2.2.2 Biotinylation of Protein N-Termini Post Lysine Guanidation

In 2007 the group of G. S. Salvesen developed a positive selection procedure based on the selective guanidation of lysine ε-amines at the protein level using O-methylisourea (Timmer et al. 2007). O-methylisourea does not react with the α-amines present at neo-N-termini and mature unprotected N-termini, which are then chemically biotinylated (Fig. 5.2). Following tryptic digestion, biotinylated N-termini are captured by immobilized streptavidin and released by reduction of the disulfide bond that links the biotin tag to the peptide. The released N-termini carry a thioacyl modification and are analyzed by MS.

Fig. 5.2 Outline of the procedure for biotinylation of protein N-termini post lysine guanidation

This method has been used for the discovery of mitochondrial transit peptides from Escherichia coli, yeast, mouse and human cells as well as for a study of structural and kinetic parameters that determine specificity of caspase-3 and GluC (Timmer et al. 2007, 2009).

5.2.2.3 Labeling of Protein N-Termini with iTRAQ Reagents Post Lysine Guanidation

In 2007 the group of G. S. Salvesen published a second method for the identification of neo-N-termini and mature unprotected N-termini. In this approach, lysine ε-amines are first selectively guanidated, and α-amines are then labeled with iTRAQ reagents at the protein level (Enoksson et al. 2007) (Fig. 5.3). Following tryptic digestion, N-termini are selected by an in silico analysis based on two rounds of MALDI-TOF/TOF-MS/MS. In detail, a first MS/MS analysis screens for N-termini that show signature iTRAQ ions in their fragmentation spectra. These can only be observed at the MS/MS level, since iTRAQ tags give rise to characteristic reporter ions upon fragmentation. N-termini showing iTRAQ signatures are then subjected to a second, more in-depth MS/MS analysis that identifies the actual peptide.

Fig. 5.3 Outline of the procedure for labeling of protein N-termini with iTRAQ reagents post lysine guanidation

To validate the procedure, a set of recombinant Escherichia coli proteins with predicted caspase-3 cleavage sites was used. A protein mixture was treated with active or inactive caspase-3 and labeled with two different iTRAQ reagents. Ten cleavage sites, all corresponding to caspase-3 consensus, could be identified (Enoksson et al. 2007). Since this approach disregards removal of internal peptides, analyzed samples are highly complex. Moreover its usage is restricted to MALDI mass spectrometers. Identification of terminal peptides through MS2 reporter fragments was also used by Tholey and coworkers in a PICS-like approach (biochemical strategy to investigate protease specificity) (Jakoby et al. 2012).

5.2.2.4 N-CLAP: Biotinylation of Protein N-Termini Using Edman Chemistry

In 2009 the group of S. R. Jaffrey introduced a technique called N-terminalomics by chemical labeling of the alpha-amine of proteins (N-CLAP) (Xu et al. 2009a). N-CLAP takes advantage of Edman degradation chemistry to selectively label unprotected α-amines of proteins with a cleavable biotin (Fig. 5.4). In detail, phenyl isothiocyanate (PITC) is employed to block all primary amines at the protein level. In the next step, trifluoroacetic acid (TFA) is used to initiate an intramolecular cyclization on the PITC-blocked N-terminal amino acid, which breaks the peptide bond between the first and second amino acid. Since PITC-blocked ε-amines on lysine side chains cannot undergo this reaction, primary α-amines are generated only on N-termini, which are thereby shortened by one amino acid. These are specifically biotinylated using sulfo-NHS-SS-biotin. Following tryptic digestion, biotinylated N-termini are bound to immobilized avidin and eluted by reducing the disulfide bond of the cleavable biotin tag.

Fig. 5.4 Outline of the N-CLAP procedure

This technique was used to characterize proteolytic cleavage events associated with methionine aminopeptidases and signal peptide peptidases, as well as proteins that are proteolytically cleaved after cisplatin-induced apoptosis (Xu et al. 2009a). A disadvantage of this approach is its incompatibility with chemical stable isotope labeling. In addition, enriched N-termini are shortened by one amino acid compared to the true N-termini. Incomplete modification of α-amines by PITC can result in biotinylation of the first N-terminal amino acid, which makes the assignment of the N-terminus ambiguous.

5.2.3 Isolation of N-Termini: Negative Selection Procedures

5.2.3.1 Combined Fractional Diagonal Chromatography

In 2003 Gevaert et al. developed the widely applied N-terminomic technique combined fractional diagonal chromatography (COFRADIC) (Gevaert et al. 2003). During this procedure all primary amines are acetylated at the protein level (Fig. 5.5). Following tryptic digestion, peptides are separated by reversed-phase high-performance liquid chromatography (RP-HPLC), typically generating 12–15 fractions containing N-terminal and internal peptides. Only internal peptides possess primary α-amines, which are labeled in a second step with 2,4,6-trinitrobenzenesulfonic acid (TNBS). This label strongly increases the hydrophobicity of these peptides. In a second RP-HPLC run, TNBS-labeled internal peptides show a hydrophobic shift and can be discarded. N-terminal peptides elute at the same gradient concentration as in the first sorting step and can be further fractionated. The number of identified N-terminal peptides can be increased by performing strong cation exchange (SCX) chromatography at low pH (Staes et al. 2008).

Fig. 5.5 Outline of the COFRADIC procedure

This method, using a negative selection procedure, allows for identification of neo-N-termini and mature unmodified as well as modified N-termini. Note that in vivo acetylated N-termini can be distinguished from in vitro acetylated N-termini, because trideutero-acetylation can be used. Therefore COFRADIC was successfully applied to determine the N-terminal acetylation status of human and yeast protein N-termini (Arnesen et al. 2009). Although based on a complex chromatography scheme, COFRADIC has been widely used for N-terminal analysis, also in combination with stable isotope labeling (18O isotope labeling, SILAC) (Impens et al. 2008; Vande Walle et al. 2007).

5.2.3.2 Isolation of N-Termini by Phospho Tagging

In 2012 the group of Ad P.J.M. de Jong published a new technique to analyze the global N-proteome (Mommen et al. 2012). During this procedure all primary amines are dimethylated with formaldehyde at the protein level (Fig. 5.6). Tryptic digestion then generates new primary amines on internal peptides, which are modified using the phospho tagging (PTAG) reaction. The resulting phosphopeptides are depleted by TiO2 affinity chromatography. The flow-through contains neo-N-termini as well as naturally modified N-termini, which are analyzed by LC-MS/MS. This approach was used to isolate N-termini from bacteria and yeast. More than 700 N-termini were identified in Neisseria meningitidis and more than 900 N-termini in Saccharomyces cerevisiae.

Fig. 5.6 Outline of the PTAG strategy

5.2.3.3 Terminal Amine Isotope Labeling of Substrates

In 2010 the group of C. M. Overall developed terminal amine isotope labeling of substrates (TAILS) as a simple and highly efficient negative selection strategy (Kleifeld et al. 2010). In the first step all primary amines are labeled with different stable isotope variants of formaldehyde, allowing for differential quantification between two samples (light: protease present; heavy: control) (Fig. 5.7). In the next step the samples are combined and subjected to tryptic digestion. Internal peptides now contain primary amines and are specifically extracted from the complex peptide mixture using a highly specific, aldehyde-functionalized, amine-reactive hyperbranched polyglycerol polymer (HPG-ALD polymer). The polymer covalently binds internal tryptic peptides and, owing to its high molecular weight, can easily be removed by filtration with a spin filter device. The flow-through contains neo-N-termini, mature unmodified N-termini and naturally modified N-termini. The N-termini are then quantified according to their relative abundance in the protease-treated and the control sample.

Fig. 5.7 Outline of the N-TAILS procedure

This N-terminomic approach has been widely applied for the identification of cleavage sites in complex samples. TAILS identified MMP-2 substrates in secretome samples derived from MMP-2−/− mouse embryonic fibroblasts (MEFs) incubated with recombinant MMP-2. It was also applied for the discovery of cathepsin substrates in cell-based systems (Tholen et al. 2011, 2013a) as well as in tissues from knockout and wild-type mice (Tholen et al. 2013b). Moreover, amyloid precursor protein (APP) was identified as a substrate for meprin β by TAILS (Jefferson et al. 2011). TAILS has since been combined with iTRAQ reagents, allowing the analysis of up to four or even eight samples in one experiment (Prudova et al. 2010). Additionally, a statistics-based platform for quantitative N-terminome analysis and identification of protease cleavage products has been established (auf dem Keller et al. 2010). TAILS requires low sample amounts, and internal peptides are removed highly efficiently by an aldehyde polymer that is commercially available.
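The quantification step of TAILS can be sketched as a simple ratio classification. The example assumes the labeling scheme given above (light: protease present; heavy: control) and uses a twofold change (log2 cutoff of 1.0) as a purely illustrative threshold. The cutoff, function name and intensities are hypothetical and do not reproduce the statistics-based platform cited above.

```python
import math


def classify_terminus(light: float, heavy: float, log2_cutoff: float = 1.0) -> str:
    """Classify an N-terminal peptide by its light/heavy intensity ratio.

    light: intensity in the protease-treated sample
    heavy: intensity in the control sample
    log2_cutoff: hypothetical threshold on |log2(light/heavy)|
    """
    if light <= 0 and heavy <= 0:
        return "not quantified"
    if heavy <= 0:
        return "enriched"   # seen only with protease: candidate neo-N-terminus
    if light <= 0:
        return "depleted"   # seen only in control: terminus lost upon cleavage
    log2_ratio = math.log2(light / heavy)
    if log2_ratio >= log2_cutoff:
        return "enriched"
    if log2_ratio <= -log2_cutoff:
        return "depleted"
    return "unchanged"      # e.g. mature N-termini unaffected by the protease


# Hypothetical light/heavy intensities for four N-terminal peptides
for name, (light, heavy) in {
    "neo_N_term": (800.0, 100.0),
    "mature_N_term": (100.0, 110.0),
    "lost_N_term": (50.0, 400.0),
    "singleton": (300.0, 0.0),
}.items():
    print(name, classify_terminus(light, heavy))
```

Peptides classified as enriched are candidates for protease-generated neo-N-termini, while unchanged ratios are expected for termini present equally in both samples.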

5.2.4 Isolation of C-Termini

5.2.4.1 COFRADIC-Based Enrichment of C-Terminal Peptides

In 2010 the group of K. Gevaert published an enrichment strategy for C-termini by SCX, based on peptide charge at low pH (Van Damme et al. 2010). In the first step all proteins are S-alkylated and all primary amines are blocked by (trideutero-)acetylation (Fig. 5.8). The proteins are then digested with trypsin, which, because lysine side chains are acetylated, can only cleave after arginine residues. The resulting peptide mixture consists of α-amino-blocked N-terminal peptides, α-amino-free internal peptides and α-amino-free C-terminal peptides. All of these end with an arginine residue except the C-terminal peptides. In the next step this peptide mixture is passed through an SCX column at low pH, where all internal peptides are captured owing to their higher positive charge, whereas α-amino-blocked N-terminal peptides and α-amino-free C-terminal peptides pass through. These enriched N- and C-terminal peptides are then separated by RP-HPLC. Afterwards, C-terminal peptides are butyrylated and separated again in a second RP-HPLC run under identical conditions. Butyrylated C-termini elute later than in the first RP-HPLC run, are collected in distinct fractions and are analyzed by LC-MS/MS. This approach identified 334 neo-C-termini generated by granzyme B and 16 neo-C-termini generated by carboxypeptidase A4 in human cell lysates.

Fig. 5.8 Outline of the procedure for COFRADIC-based enrichment of C-terminal peptides
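The charge-based SCX separation in this procedure can be made concrete with a rough estimate of peptide charge at low pH. The sketch assumes that carboxyl groups are protonated (uncharged), that acetylated lysines carry no charge, and that each free α-amine, arginine and histidine contributes one positive charge. Counting histidine is an assumption added here, and the peptide sequences and function names are hypothetical.

```python
def charge_at_low_ph(sequence: str, alpha_amine_blocked: bool) -> int:
    """Approximate net positive charge of a peptide at low pH.

    Assumptions: lysines are acetylated (neutral), carboxyl groups are
    protonated (neutral), Arg and His are each +1, a free alpha-amine is +1.
    """
    charge = 0 if alpha_amine_blocked else 1
    charge += sequence.count("R") + sequence.count("H")
    return charge


def passes_scx(sequence: str, alpha_amine_blocked: bool) -> bool:
    """Peptides with a net charge of at most +1 are assumed not to be retained."""
    return charge_at_low_ph(sequence, alpha_amine_blocked) <= 1


# Hypothetical peptides from a protein digested after arginine only
n_terminal = ("MSTAKLR", True)    # acetylated alpha-amine, ends with Arg: +1
internal = ("LGEVAKR", False)     # free alpha-amine, ends with Arg: +2
c_terminal = ("SDELKF", False)    # free alpha-amine, no Arg: +1

for label, (seq, blocked) in [("N-terminal", n_terminal),
                              ("internal", internal),
                              ("C-terminal", c_terminal)]:
    print(label, charge_at_low_ph(seq, blocked), passes_scx(seq, blocked))
```

Under these assumptions only the internal peptide reaches a charge of +2 and is captured, which is why N- and C-terminal peptides co-elute in the flow-through and must be separated afterwards by RP-HPLC.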

5.2.4.2 C-Terminal Amine-Based Isotope Labeling of Substrates

In 2010 Schilling et al. introduced C-terminal amine-based isotope labeling of substrates (C-TAILS), a targeted approach for the enrichment of C-terminal peptides from complex samples (Schilling et al. 2010). In this negative selection procedure, all primary amines are first modified by reductive dimethylation using formaldehyde (Fig. 5.9). In the next step all carboxyl groups are protected by carbodiimide-mediated condensation with ethanolamine. Following tryptic digestion, the newly generated α-amines are dimethylated. N-terminal and internal tryptic peptides, which carry unprotected C-termini, are then removed by coupling to the high-molecular-weight polymer poly-allylamine. Ultrafiltration separates the uncoupled, blocked C-terminal peptides, which are subsequently analyzed by LC-MS/MS.

Fig. 5.9 Outline of the C-TAILS procedure

C-TAILS has been used for the identification of native protein C-termini together with neo-C-termini. More than 100 cleavage sites were identified in an Escherichia coli proteome exposed to GluC. This approach can be used in combination with stable isotope formaldehyde-based labeling. A detailed protocol was recently published (Schilling et al. 2011c).

5.2.5 Identification of Cleavage Sites by Non-selection Procedures

5.2.5.1 The Protein Topography and Migration Platform

In 2008 the group of B. F. Cravatt developed the protein topography and migration platform (PROTOMAP), an approach for the quantitative comparison of biological samples with a focus on proteolytic processing (Dix et al. 2008). This technique does not enrich for C-terminal or N-terminal peptides. In the first step an experimental and a control sample are separated independently by one-dimensional (1D) SDS-PAGE. Each lane is sliced into gel bands at fixed intervals. Peptides are released by tryptic digestion and analyzed using LC-MS/MS. Altered proteolysis of a given protein between the two samples is indicated by reduced abundance at a given molecular weight and the appearance of cleavage fragments at lower molecular weight. Since the samples are not mixed but analyzed independently, PROTOMAP employs spectral counting for label-free quantification, with peptide abundances being mapped onto protein sequences to locate cleavage sites.

This technique is very powerful and has been successfully applied to map proteolytic processing in the intrinsic apoptosis pathway in Jurkat T cells. Many known caspase-mediated proteolytic events were validated and 150 additional proteins cleaved during apoptosis were reported. In PROTOMAP, cleavage events leading to only a slight shift in molecular weight may not be detected. In addition, the exact cleavage site is not determined, although this information may be crucial for assigning a cleavage event to a putative protease.
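The idea of comparing gel migration by spectral counting can be illustrated with a small sketch. It assumes that, for one protein, spectral counts are available per gel band, with band 0 at the top of the lane (highest molecular weight), and it summarizes migration as the count-weighted mean band index. This summary statistic and all names and counts are hypothetical simplifications and are not the algorithm of the PROTOMAP software.

```python
from typing import List


def mean_band(counts: List[int]) -> float:
    """Spectral-count-weighted mean band index (0 = top of the gel lane)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("protein has no spectral counts in this lane")
    return sum(index * count for index, count in enumerate(counts)) / total


def migration_shift(control: List[int], treated: List[int]) -> float:
    """Positive values indicate migration toward lower molecular weight."""
    return mean_band(treated) - mean_band(control)


# Hypothetical spectral counts per gel band for one protein
control_lane = [0, 12, 1, 0, 0]   # mostly full-length protein in band 1
treated_lane = [0, 3, 0, 6, 4]    # fragments appear in bands 3 and 4

print(round(migration_shift(control_lane, treated_lane), 3))
```

A pronounced positive shift flags the protein as a candidate for proteolytic processing. Mapping the peptides found in each band onto the protein sequence would then narrow down the region containing the cleavage site.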

5.3 Validation of Cleavage Sites

5.3.1 Overview

The identification of cleavage sites by terminomic techniques yields potential protease substrates, which have to be validated in a subsequent step. Before performing post-screening experiments such as biochemical and cell-biological assays, cleavage sites that are physiologically irrelevant should be excluded. Especially if a proteome treated with a recombinant protease is compared to an untreated proteome, cleavage events can occur in vitro that would not have occurred in vivo. During sample preparation cell compartments can be disrupted, resulting in cleavage of a protein that would not encounter the test protease in vivo. Moreover, it is important to use physiological concentrations of the test protease, since higher concentrations could increase the number of cleavage events. It is equally important to ensure that incubation conditions reflect the in vivo situation and do not induce partial unfolding of proteins, which would enhance the accessibility of cleavage sites (Huesgen and Overall 2012).

A major challenge is to distinguish cleavage sites directly generated by the protease under investigation from indirect cleavage events stemming from further proteases. Since proteases are organized in complex networks, the so-called “protease web”, changes in protease activity of one protease will affect the activity of other proteases resulting in downstream effects like altered gene expression or protein turnover (Butler and Overall 2009; Overall and Dean 2006). This is of particular importance for the analysis of complex samples, for example for the comparison of protease-knockout and control samples or protease overexpression and control samples. In general, a combination of a terminomic technique and a quantitative proteome comparison is advised to distinguish direct and indirect cleavage events. In some cases, altered abundance of a proteolytically processed N-terminus reflects altered abundance of the corresponding protein rather than impaired proteolytic processing. Moreover the usage of different experimental setups like a combination of overexpression and silencing of a test-protease can prevent the identification of “false” cleavage events (Doucet et al. 2008).

To distinguish direct cleavage sites from downstream effects, knowledge about the sequence specificity of the protease of interest is useful. Several techniques have been developed to characterize substrate specificity, such as substrate phage and bacterial display, peptide microarrays or positional scanning peptide libraries (auf dem Keller and Schilling 2010; Poreba and Drag 2010). Proteomic identification of protease cleavage sites (PICS) is a fast and effective alternative to these techniques (Schilling and Overall 2008) and will be introduced in the next section. In addition to knowledge about the site specificity of a protease, knowledge about its subcellular localization can help to discriminate direct and indirect effects of altered protease activity. To test for colocalization of a protease and its potential substrate, activity-based probes (ABPs), molecules that bind irreversibly to active proteases, are useful tools to monitor protease activity in vivo.

5.3.2 Proteomic Identification of Protease Cleavage Sites

In 2008, Schilling and Overall published a technique called proteomic identification of protease cleavage sites (PICS) for the characterization of prime and non-prime side specificity as well as subsite cooperativity in one experiment (Schilling and Overall 2008). A PICS experiment starts with generating a peptide library by digesting a proteome (e.g. cell lysate) with a specific protease such as trypsin, chymotrypsin or GluC (Fig. 5.10a). Subsequently, all primary amines and sulfhydryl groups of the resulting peptides are chemically protected, and the library is then exposed to the protease of interest (Fig. 5.10b). Newly generated primary amines, which mark the prime-side cleavage products, are biotinylated with sulfo-NHS-SS-biotin, enriched using streptavidin, released by disulfide bond reduction and analyzed by LC-MS/MS. The corresponding non-prime side sequences are derived bioinformatically by database searches, thereby revealing the exact position of the cleavage site. PICS determines protease specificity using natural sequence diversity and enables the corroboration of cleavage sites observed in a cellular context. A detailed protocol was published recently (Schilling et al. 2011b), as was a web-based data-analysis resource termed WebPICS (Schilling et al. 2011a).
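The bioinformatic step, in which the non-prime side sequence is inferred from an identified prime-side peptide, can be illustrated with a short Python sketch. It is not the WebPICS implementation: the function name, the toy protein and the fixed window of six residues are assumptions, and a real analysis would additionally restrict the non-prime sequence to the boundaries of the library peptide.

```python
def infer_cleavage_sites(prime_peptide, proteins, n_nonprime=6):
    """Locate a prime-side peptide in a protein database.

    Returns a list of (protein name, 0-based index of P1', non-prime sequence).
    `proteins` maps hypothetical protein names to sequences.
    """
    hits = []
    for name, sequence in proteins.items():
        start = sequence.find(prime_peptide)
        while start != -1:
            # Residues directly preceding the peptide form the non-prime side.
            nonprime = sequence[max(0, start - n_nonprime):start]
            hits.append((name, start, nonprime))
            start = sequence.find(prime_peptide, start + 1)
    return hits


if __name__ == "__main__":
    toy_database = {"toy_protein": "MKTAYIAKQRGLSVDE"}  # invented sequence
    for name, position, nonprime in infer_cleavage_sites("GLSVDE", toy_database):
        print(f"{name}: {nonprime} | GLSVDE (P1' at index {position})")
```

A peptide that matches several proteins or several positions yields several hits, which reflects the ambiguity that a database search has to resolve or report.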

Fig. 5.10 Outline of the PICS procedure. (a) Library generation. (b) Cleavage site screen

PICS has been widely applied to determine cleavage site specificity for proteases of all classes. Among others, the site specificities of MMP-2, caspases 3 and 7, cathepsins B, L, S, K and G, HIV protease 1, thrombin and elastase have been profiled. PICS has also been used to validate cleavage sites identified by TAILS (Tholen et al. 2011). In this study, more than 1,500 protein N-termini were identified, most of which did not match the cathepsin L specificity determined by PICS. This result indicates that altered cathepsin L activity affects numerous proteases and protease inhibitors, leading to downstream effects. PICS revealed that cathepsin L specificity and the specificities of cathepsins B and S share some features (Biniossek et al. 2011). For instance, all three cathepsins prefer glycine residues in the P1 and P1′ positions. Another study revealed the cleavage site specificity of several members of the astacin metalloprotease family. A strong specificity for aspartate residues in the P1′ position was observed for meprin α, meprin β and LAST_MAM (Becker-Pauly et al. 2011). Interestingly, cleavage site specificity was also influenced by proline in the P2′ or P3′ position, which represents an example of subsite cooperativity. Here, the obtained specificities validated TAILS data and the results of other biochemical approaches, revealing processing of vascular endothelial growth factor-A (VEGF-A) by meprin α and processing of the serine protease pro-kallikrein 7 by meprin β.
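Positional preferences such as those described above are typically summarized as amino acid frequencies per subsite. The following Python sketch shows the idea for the P1 and P1′ positions only. The function name is an assumption and the four cleavage sites are invented toy data, not measured cathepsin or meprin specificities.

```python
from collections import Counter


def position_frequencies(cleavage_sites):
    """Relative amino acid frequencies at P1 and P1'.

    `cleavage_sites` is a list of (non-prime sequence, prime sequence) pairs;
    P1 is the last non-prime residue and P1' the first prime residue.
    """
    counts = {"P1": Counter(), "P1'": Counter()}
    for nonprime, prime in cleavage_sites:
        if not nonprime or not prime:
            continue  # skip incomplete sites
        counts["P1"][nonprime[-1]] += 1
        counts["P1'"][prime[0]] += 1
    frequencies = {}
    for position, counter in counts.items():
        total = sum(counter.values())
        frequencies[position] = (
            {aa: n / total for aa, n in counter.items()} if total else {}
        )
    return frequencies


if __name__ == "__main__":
    toy_sites = [("AAG", "GLK"), ("LVG", "ASR"), ("KLA", "GTR"), ("PQG", "DLK")]
    for position, freqs in position_frequencies(toy_sites).items():
        print(position, freqs)
```

Extending the same counting to P6 through P6′, and comparing against the natural amino acid abundance of the library, would give the kind of specificity profile that PICS analyses report.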

5.3.3 Activity-Based Probes

Proteomic techniques determine protein abundance, but do not account for changes in protease activities. Monitoring protease activity is fundamental to fully understanding protease action, but determining how proteases carry out their biological functions is challenging, since proteases can be regulated by small molecules, interactions with other proteins or PTMs (Drag and Salvesen 2010; Shen 2010). Moreover, many proteases are synthesized as inactive zymogens that are activated by proteolysis or conformational changes. For instance, cysteine cathepsins are processed in a pH-dependent manner in the lysosomal compartment (Conus and Simon 2010). ABPs can help to distinguish active and inactive proteases, thereby revealing the localization of protease activity. ABPs are now also applied for biomarker discovery, drug screening and in vivo imaging (Deu et al. 2012; Fonovic and Bogyo 2007).

ABPs are small molecules that irreversibly bind to active proteases, but not to inactive or inhibited proteases (Kidd et al. 2001; Liu et al. 1999). The general structure of ABPs consists of a chemically reactive group termed a “warhead”, a spacer region that targets the probe to the target protease, and a tag, normally a fluorescent dye or an affinity tag such as biotin (Deu et al. 2012) (Fig. 5.11). ABPs operate by irreversible binding of the warhead to the active site nucleophile of the target protease. The tag allows for visualization, and thus for demonstrating colocalization of protease activity with the candidate substrate, and/or for isolation of the ABP-protease complex, identifying the active protease in complex samples (Doucet and Overall 2008).

Fig. 5.11 Schematic structure of activity-based probes (ABPs); adapted from Paulick and Bogyo (2008)

The main application of ABPs is the visualization of protease activity using modern imaging techniques, revealing the localization and distribution of active proteases in cells and in vivo. This is of special interest for cancer research and treatment, since almost all human cancer tissues show increased protease activity (Blum 2008). In mice developing pancreatic cancer, but also in other cancer tissues, ABPs targeting cysteine cathepsins have been used to noninvasively detect cathepsin activity (Blum et al. 2005, 2007; Joyce et al. 2004). Moreover, caspase activity has been monitored in multiple mouse models to study the kinetics of apoptosis (Edgington et al. 2009, 2012). All these studies also highlight the potential of ABPs to monitor the response of protease activity to drug treatment, which could be a helpful tool to keep tumor growth under surveillance. Most ABPs target cysteine or serine proteases, since the development of ABPs for metalloproteases or aspartic proteases is more challenging. In these two classes, an activated water molecule rather than an active site residue carries out the nucleophilic attack on the amide bond, so that the ABP cannot bind covalently to an active site nucleophile (Doucet and Overall 2008). ABPs for metalloproteases have nevertheless become available; they contain an additional photoactivatable group that anchors the ABP to the protease via an amino acid outside the active site (Saghatelian et al. 2004). In summary, there are ABPs targeting almost all protease classes, such as metalloproteases (Chan et al. 2004; Saghatelian et al. 2004), threonine proteases (Kessler et al. 2001; Wang et al. 2003), cysteine proteases (Greenbaum et al. 2000; Thornberry et al. 1994) and serine proteases (Kidd et al. 2001; Williams et al. 1989). A major challenge remains the development of probes that selectively bind to one member of a protease class, but some new approaches have emerged (Blair et al. 2007; Hagel et al. 2011). Another limitation of ABPs is that their production requires extensive knowledge of organic chemistry and considerable synthesis effort, because the probes have to be stable, non-toxic and suitable for in vivo use.

Besides in vivo imaging applications, ABPs have been combined with mass spectrometry to isolate active proteases from complex proteomes for identification and quantification, a technique known as ABPP-MudPIT (activity-based protein profiling combined with multi-dimensional protein identification technology) (Speers and Cravatt 2009). In these approaches, lysates of cells or tissues are typically incubated with a probe targeting a certain protease class; the ABP-protease complexes are then isolated by affinity purification and identified by MS analysis. ABPs have been used to profile metalloprotease activity in several cell lines (Saghatelian et al. 2004). More than 20 metalloproteases were identified by applying a cocktail of metalloprotease-directed probes to cell proteomes followed by LC-MS/MS (Sieber et al. 2006). Additionally, the role of cysteine proteases in tumorigenesis, microbial pathogenesis or apoptosis was investigated using ABPs (Berger et al. 2006; Paulick and Bogyo 2008; Puri and Bogyo 2009).

5.4 Proteolysis Regulation Networks: Exemplary Studies

Proteases are involved in a multitude of physiological reactions, from simple digestion of food proteins to highly regulated systems. Maintaining homeostasis in living systems requires strict control of all physiological processes. Importantly, most of these processes are directly or indirectly interconnected. Also, sequential effects are interdependent, meaning that diverse initial events can result in the same downstream process. Thus, the term networking of proteases and inhibitors better depicts the current state of our knowledge on proteolytic regulation.

Proteases create a complex network of sequential activation and inhibition episodes. The main agents in this system are: (i) zymogens, (ii) activated proteases, and (iii) protease inhibitors. On the most basic level, zymogen activation is a proteolytic process, hence representing an initial layer of functional protease interconnectedness.

Maintaining the balance between active and non-active forms of proteases as well as their inhibitors is crucial for appropriate cellular processes. Many proteases act in cascades, which integrate and amplify primary proteolytic “signals”. Thus, they yield dominant downstream effects that often overshadow the initiating proteolytic events. Most cascades are tightly controlled to sustain an appropriate homeostasis within the proteolytic system (Garcia-Verdugo et al. 2010; Le Magueresse-Battistoni 2007).

The most prominent example of a zymogen activation cascade is the coagulation system, which sequentially recruits factors required for blood clotting. The coagulation factors are mostly serine proteases (Morrissey 2012; Ott 2011; Zogg and Brandstetter 2009). Two parallel pathways are involved in the cascade, depending on the initiation source. In the extrinsic pathway, the blood clotting process begins when the vessel wall is injured and membrane-bound tissue factor forms a complex with factor VII from the circulating blood, which is thereby converted to its active form FVIIa, finally leading to activation of factor X. In the intrinsic pathway of coagulation, a sequence of reactions leading to fibrin formation begins with the contact activation of factor XII and also results in the activation of factor X. This protease activation cascade is shown in Fig. 5.12.
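The convergence of both pathways on factor X can be made explicit by treating the cascade as a directed graph. The Python sketch below uses a simplified textbook version of the cascade in which cofactors and feedback activation by thrombin are omitted. The intermediate steps of the intrinsic pathway and the steps downstream of factor X are supplied for the example rather than taken from the text above, and all names are chosen only for illustration.

```python
from collections import deque

# Simplified activation edges: each active protease maps to what it activates.
ACTIVATES = {
    "TF-FVIIa": ["FXa"],   # extrinsic pathway
    "FXIIa": ["FXIa"],     # intrinsic pathway (contact activation)
    "FXIa": ["FIXa"],
    "FIXa": ["FXa"],
    "FXa": ["thrombin"],
    "thrombin": ["fibrin"],
}


def downstream(start, graph=ACTIVATES):
    """Return all species reachable from `start` by breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    shared = downstream("TF-FVIIa") & downstream("FXIIa")
    print("Shared downstream species:", sorted(shared))
```

The intersection of the two reachable sets contains activated factor X and everything below it, which is the common pathway on which both initiation routes converge.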

Fig. 5.12 Blood clotting cascade (blood coagulation pathway). Roman numerals represent coagulation factors; a, active form; asterisk, activated by thrombin; TF, tissue factor

In addition to these prototypical cases, recent data indicate widespread networking and regulation of proteolytic activities among proteases of different kinds.

Protease networking is illustrated by the interplay between members of the papain subfamily of cysteine proteases, which consists of 11 human cysteine cathepsins (Rawlings et al. 2006), and their physiological inhibitors. In the present compilation, a detailed review on cysteine cathepsins is provided in the sections “Cathepsins: getting in shape for lysosomal proteolysis” and “Exploring systemic functions of lysosomal cysteine proteases: the perspective of genetically modified mouse models”. As for most other protease families, cathepsin activity is regulated on several levels. Cathepsins are strictly regulated by endogenous inhibitors and impaired regulation leads to various malignancies (Glondu et al. 2002; Hu et al. 2008; Sevenich et al. 2010). Endogenous cysteine cathepsin inhibitors include extracellular cystatin and intracellular stefins (Dubin 2005). Overexpression of cystatin C, cystatin M and stefin/cystatin A in tumor cell lines has shown that cysteine cathepsins have functional roles in growth, invasion and metastasis of tumor cells of both epithelial and mesenchymal origins. On the other hand, cystatin M inhibition significantly increased the enzymatic activities of cathepsins B and L and legumain (Li et al. 2005; Sokol and Schiemann 2004; Vigneswaran et al. 2006; Zhang et al. 2004). Other studies showed that cystatin M as well as cystatin C also inhibit legumain (Rawlings et al. 2006). Regulation, and hence interplay on the enzyme-inhibitor axis was further shown in studies using the mouse model of multistage pancreatic islet cell carcinogenesis. As an example selective cathepsin S deficiency impaired angiogenesis and tumor cell proliferation, angiogenic islet formation, and the growth of solid tumors; whereas the absence of its endogenous inhibitor cystatin C had the opposite result (Wang et al. 2006).

Compensatory effects are also found for cysteine cathepsins and constitute another aspect of protease networking. Cathepsins B and Z are the only carboxypeptidases among the cysteine cathepsins (Klemencic et al. 2000). During breast cancer development in the PymT mouse model, the combined loss of cathepsins B and Z exerts synergistic anticancer effects. Single deficiencies show partial or full reciprocal compensation (Sevenich et al. 2010), and deletion of either cathepsin B or Z alone produces comparatively mild phenotypes. In fact, a more substantial anti-cancer phenotype occurred in the double knockout than in single-knockout or wild-type mice (Sevenich et al. 2010; Vasiljeva et al. 2006). Thus, only the combined loss of both proteases led to significant reductions in tumor and metastatic burden.

Additionally, migratory and invasive properties of tumor cells with different cathepsin B and Z genotypes were investigated in vitro. Notably, cathepsin B or cathepsin Z deficient tumor cells showed less invasive phenotypes compared to wild-type controls. However, the invasiveness of the cathepsin B and Z double-deficient cells was most impaired, indicating a synergistic effect of cathepsin B and Z on cancer cell invasion by proteolytic matrix remodeling (Sevenich et al. 2010).

These results demonstrate that although cathepsins B and Z are critical for early tumor formation and affect this process independently, the loss of cathepsin Z function in promotion of established tumors is completely compensated by cathepsin B alone or in combination with proteases other than cysteine cathepsins, whereas cathepsin Z at least partially compensates for a lack of cathepsin B (Sevenich et al. 2010).

A recently discovered example of protease networking is degradation of cystatin C by cathepsin D, a ubiquitously expressed aspartic endoprotease. Yeast 2-hybrid screening using human cathepsin D as a bait and a cDNA library isolated from normal human breast tissue identified cystatin C as a putative cathepsin D interaction partner. It has been confirmed that both cathepsin D precursor and cystatin C interact extracellularly at neutral pH. Transcriptome analysis showed that the cystatin C extracellular levels are cathepsin D dependent. Degradomic analysis demonstrated that cathepsin D cleaves cystatin C at acidic pH at multiple sites. Additionally, system-wide effects of cathepsin D depletion were observed. Upon cathepsin D silencing, cystatin C was not degraded and the extracellular activity of cysteine cathepsins was decreased (Laurent-Matha et al. 2012).

Cystatin M/E is the endogenous inhibitor of cathepsin L and legumain. In mice, cystatin M/E deficiency is embryonically lethal. Lethality is rescued by additional depletion of cathepsin L. Cystatin M/E can regulate both intracellular and extracellular processing and activation of legumain. Prolegumain was found to be highly secreted into the conditioned media of legumain-overexpressing cells. Additionally, increased processing of cathepsin L to the two-chain form has been shown in legumain-expressing cells, whereas this processing was inhibited in cystatin M/E-overexpressing cells (Smith et al. 2012).

Interestingly, loss of legumain does not rescue the cystatin M/E-knockout phenotype. By generating mice lacking both cystatin M/E and cathepsin L, it was shown that cathepsin L deficiency rescues the lethal phenotype of cystatin M/E-knockout mice. In the same study, it was also shown that cathepsin D (unlike cathepsin L) is able to process legumain; in addition, some autoactivation was observed, which makes the interplay between proteases even more complex (Zeeuwen et al. 2010). This is an example of how sophisticated in vivo studies can untangle a protease-inhibitor network.

Networking in the regulation of protease activity is also illustrated by examples within the metalloproteinase family. Inhibitors often target multiple proteases, so alterations in inhibitor levels can affect seemingly unrelated proteases. Tissue inhibitors of metalloproteases (TIMPs) are a case in point: each of the four TIMPs targets a different array of metalloproteases, including different MMPs and ADAMs (Murphy 2011). Moreover, non-canonical inhibitory profiles are beginning to emerge. For example, cystatin C (the prototypical cysteine protease inhibitor) also exerts inhibitory activity on meprin metalloproteases (Hedrich et al. 2010).

Activation of MMP-2 involves a ternary complex consisting of proMMP-2, another metalloprotease (typically MT1-MMP) and TIMP-2 (Butler et al. 1998). This trimeric complex is formed on the cell surface through an interaction between the carboxy-terminal domain of TIMP-2 and the hemopexin domain of proMMP-2 (Strongin et al. 1993). Notably, the local concentration of TIMP-2 in the tissue is crucial for complex formation and for the regulation of MMP-2 cleavage: if it is too low, insufficient proMMP-2 is brought to the cell surface, whereas too high a TIMP-2 concentration inhibits all of the MT1-MMP and thus blocks the initial cleavage. Interestingly, TIMP-2 knock-down cells with different expression levels of MT1-MMP did not produce any active MMP-2, which indicates that MT1-MMP absolutely requires TIMP-2 to catalyze the conversion of MMP-2 to the fully active form (Butler et al. 1998). Notably, in this context a protease inhibitor augments protease activity rather than repressing it, thereby adding further complexity to proteolytic systems in vivo.
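The dependence of MMP-2 activation on an intermediate TIMP-2 concentration can be illustrated with a deliberately crude toy model in Python. It is not a kinetic model from the cited studies. It assumes that TIMP-2 binds MT1-MMP tightly in a 1:1 ratio, that each TIMP-2-bound MT1-MMP acts as a receptor for proMMP-2 (assumed to be in excess), and that the initial cleavage requires a neighbouring TIMP-2-free MT1-MMP, so that the relative activation scales with the product of both populations. All names and units are arbitrary.

```python
def relative_mmp2_activation(timp2, mt1_mmp):
    """Toy estimate of proMMP-2 activation (arbitrary units).

    Assumptions (not from the source): stoichiometric 1:1 binding of TIMP-2
    to MT1-MMP, proMMP-2 in excess, activation ~ receptors x free MT1-MMP.
    """
    if timp2 < 0 or mt1_mmp < 0:
        raise ValueError("amounts must be non-negative")
    receptors = min(timp2, mt1_mmp)     # MT1-MMP occupied by TIMP-2
    free_protease = mt1_mmp - receptors  # MT1-MMP still able to cleave
    return receptors * free_protease


if __name__ == "__main__":
    for timp2 in (0, 2, 5, 8, 10, 15):
        print(timp2, relative_mmp2_activation(timp2, mt1_mmp=10))
```

Under these assumptions activation is zero without TIMP-2, peaks when about half of the MT1-MMP is occupied, and returns to zero once TIMP-2 saturates MT1-MMP, which mirrors the qualitative behaviour described above.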

Impaired control of protease signaling leads to severe abnormalities, defects, and even lethality. The SPINK5 gene encodes the serine protease inhibitor LEKTI, lympho-epithelial Kazal-type related inhibitor (Magert et al. 1999). LEKTI is expressed in the differentiated viable layers of stratified epithelium. In the epidermis, it is mainly restricted to the granular layer (Bitoun et al. 2003). LEKTI is comprised of 15 potential Kazal-type serine proteinase inhibitory domains. The full-length recombinant protein was shown to inhibit serine proteases such as trypsin, plasmin, subtilisin A, cathepsin G, and elastase (Mitsudo et al. 2003); however, kallikrein proteases are thought to constitute the primary in vivo targets of LEKTI (Jayakumar et al. 2004).

Due to its versatile activity, LEKTI is involved in multiple biological pathways relevant to tissue homeostasis, inflammation and antimicrobial defense. LEKTI is involved in the regulation of proteolytic events crucial for barrier formation and maintenance. Loss of balance between LEKTI and its target proteases leads to severe skin barrier defects with recurrent inflammation and allergic symptoms. Impaired functioning of LEKTI due to mutations leads to Netherton syndrome (Chavanas et al. 2000). The group of Alain Hovnanian has shown that LEKTI-deficient mice faithfully replicate a skin phenotype reminiscent of Netherton syndrome in humans. To a large extent, the severe skin phenotype is a result of unregulated kallikrein 5 and kallikrein 7-like activity and subsequent loss of stratum corneum adhesion through degradation of desmoglein. In mice, further depletion of matriptase, an auto-activating transmembrane serine protease, rescues the skin phenotype stemming from LEKTI deficiency. Matriptase initiates kallikrein activation; hence, loss of matriptase counter-balances excessive kallikrein activity in LEKTI-deficient mice. Although this is a superb example of fine-tuned interaction between proteases and protease inhibitors, it should be noted that matriptase is not a natural target of LEKTI (Chavanas et al. 2000). Deficiency of matriptase alone, in the presence of LEKTI, impairs epidermal barrier function by weakening epidermal tight junctions (List et al. 2009). In LEKTI-deficient mice, matriptase initiates Netherton syndrome by premature activation of a pro-kallikrein-related cascade. Once converted to its active form, kallikrein-related peptidase 5 is capable of proteolytically activating pro-kallikrein-related peptidase 5 and pro-kallikrein-related peptidase 7, in line with the previously proposed role of the protease in the propagation of an epidermal pro-kallikrein-related peptidase cascade (Sales et al. 2010).

5.5 Interactions Between Proteolytic Processing and Post Translational Modification

Limited proteolysis and protein degradation are essential processes in a diversity of (patho-)physiological responses. Proteolysis is integrated in a complex network of multiple PTMs, such as phosphorylation, acetylation or ubiquitination. While the ubiquitin proteasome system (UPS) represents a prototypical case for synergistic and co-operative action of proteolysis with another type of PTM, proteome-wide studies are now beginning to shed light on further links, e.g. between proteolysis and phosphorylation. However, while the ubiquitin proteasome system is understood in great detail (see below), much less is known about how limited proteolysis is fine-tuned by PTMs.

Although the precise mechanisms of protein degradation processes are yet to be fully defined, our knowledge in understanding the cellular and molecular mechanisms of different proteolytic pathways has increased significantly over the years (Attaix et al. 2001). Some of the major degradation pathways utilize lysosome proteases (Agarraberes et al. 1997; Cuervo et al. 1997; Franch et al. 2001), calcium-dependent proteases (Sorimachi et al. 1997), metalloproteases (Yong et al. 1998), proteases that are involved in apoptosis (Tseng et al. 2008), as well as the ubiquitin-proteasome system (Coux et al. 1996; Goldberg et al. 1995; King et al. 1996). The ubiquitin-proteasome system is perhaps the most comprehensively described degradation pathway to date (Price et al. 1996; Rock and Goldberg 1999). This complex pathway is one of the major proteolytic processes that are responsible for the removal of abnormal or damaged proteins in all living cells (Ciechanover et al. 1980; Etlinger and Goldberg 1977). It involves a cascade of enzymatic reactions including activation and attachment of ubiquitin to proteins that are targeted for degradation (Fig. 5.13). This process comprises several ubiquitin ligases with varying substrate specificity and different roles in cellular physiology.

Fig. 5.13 Ubiquitin-proteasome system for proteolytic degradation of proteins into free amino acids

Ubiquitin chains are linked through Lys11/Lys48 for proteasomal degradation; however, other linkages (Lys11, Lys29 and Lys63) have also been described (Weissman 2001; Komander and Rape 2012). The structure and composition of the (poly)ubiquitin chain itself constitutes a cellular signal with different functional consequences and outcomes, e.g. lysosomal or proteasomal degradation. Details of the “ubiquitin code” have been reviewed previously (Komander and Rape 2012). It is well established that, following monoubiquitylation, the formation of a homogeneous chain of four or more ubiquitins yields a signal that is recognized by the 26S proteasome for degradation (Komander and Rape 2012). Within the proteasome complex, the polyubiquitin chain is cleaved off for recycling prior to cleavage of the targeted protein. Once the proteins are degraded into peptide fragments, these are released from the proteasome for further degradation into free amino acids by exopeptidases in the cytosol. The elements that constitute the ubiquitin-proteasome system have been extensively reviewed (Chitra et al. 2012; Debigare and Price 2003). Proteolysis by the ubiquitin-proteasome pathway underlies a multitude of cellular events such as cell growth and proliferation, antigen presentation and DNA repair. Further examples of cellular processes involved include the degradation of protein regulators and inhibitors. When regulators such as cyclins or some transcriptional regulators should act only transiently, or when a process is initiated by the degradation of inhibitors such as cyclin-dependent kinase inhibitors (Ckis) or a class of transcriptional inhibitor proteins (IκBs), proteolysis of these regulators and inhibitors is achieved through the ubiquitin-proteasome pathway (Pagano et al. 1995).

Small ubiquitin-like modifier (SUMO) proteins are similar to ubiquitin. However, unlike ubiquitinated proteins, sumoylated proteins are not targeted for destruction by the proteasome. Instead, sumoylation of proteins is associated with various cellular processes such as nuclear transport, transcriptional regulation, cell death, protein stability, response to cellular stress, and the cell cycle (Hay 2005). Studies have shown that N-terminal sumoylation of MDM2, the main ubiquitin E3 ligase for p53, reduces its self-ubiquitination and degradation by the proteasome (Buschmann et al. 2001; Lee et al. 2006; Miyauchi et al. 2002). Similarly, the tumor suppressor protein p53 can be modified by sumoylation, resulting in either up-regulation or down-regulation of its activity (Hock and Vousden 2010). Furthermore, a recent report has shown that the proto-oncogene SKI is able to regulate sumoylation of MDM2 and p53, which leads to an increase in MDM2 self-ubiquitination activity and enhanced degradation of p53 (Ding et al. 2012). Another known PTM that acts reversibly on lysine residues is acetylation. Interestingly, a recent proteome-wide study has shown substantial overlap of acetylation and ubiquitination sites in Saccharomyces cerevisiae (Henriksen et al. 2012). One such acetylation site is localized within the critical regulatory domain of the SAGA (Spt-Ada-Gcn5-Acetyltransferase) complex and is involved in the Ubp8-containing histone H2B deubiquitylase complex (Henriksen et al. 2012). It is likely that acetylation of this site reversibly shields the protein complex from UPS-based degradation.

Further studies have described protective effects of PTMs against proteolytic cleavage. Huntington’s disease (HD), an autosomal dominant neurodegenerative disorder, is characterized by impaired muscle movements, psychiatric problems and cognitive decline (Rubinsztein and Carmichael 2003). It is caused by the abnormal expansion of the polyglutamine (polyQ) tract in the huntingtin (htt) protein. It was proposed that proteolysis of the polyQ proteins by a range of proteases such as caspases (Goldberg et al. 1996; Wellington et al. 2002), calpains (Gafni and Ellerby 2002; Kim et al. 2001), and aspartic proteases (Lunkes et al. 2002) generates shorter, diffusible fragments that are responsible for aggregation. Studies have shown that cleavage of the htt protein carrying polyQ repeats yields short, toxic amino-terminal fragments (Kim et al. 2001; Lunkes et al. 2002; Wellington et al. 2000), which in turn lead to protein aggregation and cellular toxicity. A number of studies have shown that htt exhibits a range of PTMs, including ubiquitination (Kalchman et al. 1996), sumoylation (Steffan et al. 2004) and phosphorylation (Schilling et al. 2006a). It has been reported that htt undergoes calpain-mediated proteolytic cleavage at amino acid Ser536 (Gafni et al. 2004). An in vitro study showed that phosphorylation of this residue prevents cleavage and modulates cellular cytotoxicity in this polyQ disease. In addition, it has been reported that Cdk5-mediated phosphorylation of Ser434 reduces caspase-mediated cleavage of htt at residue Asp513 (Luo et al. 2005). In another polyQ-related study, on Kennedy’s disease (degeneration of motor neurons), it was also shown that phosphorylation of Ser514 of the androgen receptor blocks caspase-3 cleavage, thus preventing cell death (LaFevre-Bernt and Ellerby 2003).

The paradigm in which protein cleavage can reveal PTM site(s) and, conversely, PTM of a protein can promote proteolytic cleavage highlights the tight regulation and diverse interactions of proteolysis and post-translational processing. Apart from the crosstalk between PTM and proteolysis in degradation, protease-mediated cleavage and PTMs such as protein phosphorylation play essential roles in regulating a multitude of biological and pathological processes, including tissue development, cancer and apoptosis (Kurokawa and Kornbluth 2009; Lopez-Otin and Hunter 2010). It has been proposed that up to 5 % of the proteome is subjected to caspase-mediated proteolysis during apoptosis (Arntzen and Thiede 2012; Crawford and Wells 2011). The diverse roles of caspases include activation as well as inactivation of protein kinases that are involved in phosphorylation. For example, caspase 3 is able to cleave and inactivate focal adhesion kinase (FAK) to downregulate phosphorylation (Taylor et al. 2008). The complexity of caspase functions is highlighted in a study by Cravatt and colleagues, in which more than 700 cleaved proteins involved in the apoptotic pathway were identified in Jurkat T cells, together with over 5,000 phosphorylation sites (Dix et al. 2012). Approximately 500 apoptosis-specific phosphorylation events were enriched on cleaved proteins and clustered around sites of caspase proteolysis (Dix et al. 2012). The dynamic interactions between kinases and proteases have been extensively reviewed by Lopez-Otin and Hunter (Lopez-Otin and Hunter 2010). The association of proteolytic cleavage/degradation and PTM is also well characterized in Alzheimer’s disease. The aggregation of Aβ peptides is influenced by different PTMs (Kuo et al. 1998) [such as peptide truncations (Hartig et al. 2010; Kuo et al. 1997; Miravalle et al. 2005; Saido et al. 1996; Tekirian et al. 1998), racemization (Mori et al. 1994; Tomiyama et al. 1994), isomerization (Murakami et al. 2008; Shimizu et al. 2000), pyroglutamination (Kuo et al. 1997; Saido et al. 1995), metal-induced oxidation (Dong et al. 2003) as well as phosphorylation (Kumar et al. 2011)], which contribute to the insolubility, stability and resistance of the amyloid filaments to proteolytic degradation (Fig. 5.14). While the majority of cellular PTMs (e.g. phosphorylation and glycosylation) are actively induced by cells through dedicated enzymes (e.g. kinases), other PTMs such as oxidation and racemization rather represent accidental chemical events against which cells tend to possess protective machineries. It has been reported that some of these post-translationally modified Aβ peptides are detected in early stages of Alzheimer’s disease (Hartig et al. 2010; Kumar et al. 2011; Schilling et al. 2008; Sergeant et al. 2003), resulting in higher cellular cytotoxicity compared to their non-modified counterparts (Millucci et al. 2010). It has also been proposed that these modified peptides serve as seeding species that propagate the formation of amyloid plaque aggregates in vivo (Kumar et al. 2011; Schilling et al. 2006b, 2008).

Fig. 5.14

Schematic diagram showing fibril formation from Alzheimer’s amyloid β-peptide. Post-translational modifications of amyloid β-peptide reduce the lag phase, thereby promoting aggregation

Another significant PTM involved in proteolytic degradation is protein glycosylation. Glycosylation represents the most abundant extracellular PTM in eukaryotes. It has been suggested that glycosylation can provide a protective storage depot for some important glycoproteins, such as growth factors, shielding them from non-specific proteolysis in the extracellular matrix and thus prolonging their activity (Varki 1993). Glycosylation of extracellular proteins has also been implicated in enhancing structural stability and rigidity through the sugar moieties (Imperiali and O’Connor 1999; Wyss et al. 1995), and it assists in folding and transport processes by protecting glycoproteins from proteolytic cleavage (Opdenakker et al. 1993). Many examples have demonstrated that glycosylation protects against proteolytic cleavage. In Drosophila, glycosylation of acetylcholinesterase, a membrane-anchored protein, protects it from proteolytic cleavage by cellular processing enzyme(s) (Mutero and Fournier 1992), while in bacteria, glycosylated cellulases from Cellulomonas fimi were shown to be protected from a C. fimi protease (Langsford et al. 1987). A further bacterial example is a lipoprotein antigen from Mycobacterium tuberculosis, which glycosylation protects against cleavage by cellular proteases; up to six cleaved fragments of the protein were observed when it was not glycosylated (Herrmann et al. 1996). However, like phosphorylation, glycosylation can also play a role in activating and/or facilitating proteolysis. Studies have reported the involvement of glycosylation in the proteolytic processing of various proteins, including CREB-H (Chan et al. 2010), CD97 (Hsiao et al. 2009), the anti-inflammatory protein A20 (Shrikhande et al. 2010), the ATP-binding cassette transporter ABCG2 (Nakagawa et al. 2009), and immunoglobulin A (Taylor and Wall 1988).
Taken together, PTMs are among the major processing events in cellular homeostasis and are implicated in the pathogenesis of various diseases. More importantly, a PTM can either enhance or ablate a specific proteolytic activity.

5.6 Protein Turnover

Initially, proteases were not regarded as “precision tools” that alter the activities/characteristics of their respective substrates, and protease cleavage was not regarded as a posttranslational modification. The major role of proteases was thought to be protein degradation, which is indeed one of their vital functions. As cells have to respond to changing environmental conditions, the cellular proteome is constantly being turned over, with protein synthesis and degradation balancing each other under steady-state conditions (Ciechanover 2005). Mass spectrometry (MS)-based proteomics has been successfully employed to study protein turnover and its underlying molecular mechanisms [for review see Engelke et al. (2012)]. In particular, the development of quantitative MS methods has greatly expanded our knowledge of the protein dynamics of degradative processes.

Historically, protein degradation was thought to take place solely in lysosomes (De Duve et al. 1953). However, the observation that protein half-lives differ substantially raised the question of how bulk lysosomal degradation could account for this. The discovery of the ubiquitin-proteasome system (UPS) provided an unexpected answer: in addition to lysosomal degradation, proteins can be specifically targeted for proteasomal degradation by the covalent attachment of ubiquitin (Ciechanover 2005). To date, the UPS and the autophagosomal-lysosomal system are regarded as the two major cellular degradation systems. However, autophagy, the lysosomal degradation of intracellular proteins, has emerged as a tightly regulated protein degradation pathway that is far less “unspecific” than initially anticipated (Yang and Klionsky 2010). Several autophagy subtypes that allow the specific degradation of protein complexes and organelles have been described, such as mitophagy for mitochondrial degradation (Kim et al. 2007; Tolkovsky 2009), reticulophagy for ER degradation (Bernales et al. 2006), ribophagy for ribosomal degradation (Beau et al. 2008; Kraft et al. 2008), and pexophagy for peroxisomal degradation (Dunn et al. 2005). In addition, xenophagy, the selective removal of intracellular bacteria and viruses via the autophagic machinery, has attracted attention, particularly due to its important role in human health and disease (Romao and Munz 2011). The unspecific nature of classical macroautophagy, the stereotype of lysosomal bulk degradation, has also been debated lately. Indeed, it was shown that organelles are degraded in an ordered fashion during starvation-induced macroautophagy (Kristensen et al. 2008) and that autophagosomal proteome compositions reflect the inducing stimuli and differ over time (Dengjel et al. 2012).

In protein turnover studies, classic radiolabeling principles have been transferred to quantitative MS-based proteomics approaches using metabolic stable isotope labeling strategies, allowing global, unbiased investigations (Hinkson and Elias 2011). Protein degradation constants can be determined by pulse-labeling approaches that follow the decrease of stable isotope-labeled peptide signals over time (Pratt et al. 2002). Compared to studies using radiolabeled tracers, proteomics studies allow the use of more than one isotopic label, which enables the simultaneous recording of protein turnover, synthesis and degradation values (Boisvert et al. 2012). Absolute quantification approaches have also been successfully employed to study both protein synthesis and degradation (Schwanhausser et al. 2011). MS-based proteomics experiments have not only led to new insights into protein turnover but have also generated mechanistic insights into protein degradation.
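
As an illustration of how a degradation constant may be derived from the decay of a labeled peptide signal, the following minimal sketch assumes simple first-order kinetics, f(t) = exp(-k_deg * t), where f is the fraction of labeled peptide signal remaining at time t. It deliberately ignores label dilution by cell growth and label recycling, which real studies have to correct for. The function names and the example time course are hypothetical and are not taken from the cited studies.

```python
import math


def fit_degradation_constant(times_h, remaining_fraction):
    """Estimate a first-order degradation constant (per hour).

    Fits ln(fraction) = intercept - k_deg * t by ordinary least squares
    and returns k_deg. Assumes first-order decay (an illustrative
    simplification, not the method of any specific study).
    """
    if len(times_h) != len(remaining_fraction) or len(times_h) < 2:
        raise ValueError("need at least two matching time/fraction pairs")
    log_fraction = [math.log(f) for f in remaining_fraction]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_fraction) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_h, log_fraction))
    return -sxy / sxx  # the slope is -k_deg


def half_life(k_deg):
    """Half-life (hours) that corresponds to a first-order rate constant."""
    return math.log(2) / k_deg


if __name__ == "__main__":
    # Hypothetical time course: labeled/(labeled + unlabeled) peptide signal.
    times = [0, 4, 8, 16, 24]
    fractions = [1.00, 0.82, 0.67, 0.45, 0.30]
    k = fit_degradation_constant(times, fractions)
    print(f"k_deg = {k:.4f} per h, half-life = {half_life(k):.1f} h")
```

Fitting in log space keeps the sketch dependency-free; with noisy low-intensity signals a weighted or non-linear fit would usually be preferable.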

5.6.1 Ubiquitin-Proteasome System

The 26S proteasome has been investigated in detail, and MS-based proteomics approaches have highlighted the heterogeneity of this multi-protein complex. For instance, proteasomes containing two different activator subunits have been described, as well as proteasomes differing in their posttranslational modification status (Drews et al. 2007; Schmidt et al. 2005; Wang et al. 2007). Another field of active research is the role of proteasomal activity in the generation of MHC peptides and its implications for human immune responses (Mester et al. 2011).

In addition to proteasomal composition and activity, the role of ubiquitin and ubiquitin-like modifiers has been extensively studied by MS-based proteomics. The existence of branched ubiquitin chains, the use of all seven ubiquitin lysine residues for branching, and their involvement in protein degradation were outlined (Peng et al. 2003; Xu et al. 2009b). In general, ubiquitinated proteins have to be enriched prior to analysis, which can be done in several ways: immunoaffinity purifications using anti-ubiquitin antibodies or tagged ubiquitin versions, e.g. HA-, FLAG-, or His-tagged ubiquitin, and affinity purifications (AP) by ubiquitin-binding domains. The latter has been successfully employed, inter alia, to study ubiquitination dynamics during growth factor receptor signaling (Akimov et al. 2011). In global, bottom-up approaches, a di-glycine remnant that marks ubiquitination sites after tryptic digestion can be identified by MS. A purification strategy using a monoclonal antibody directed against this remnant has been developed (Xu et al. 2010). However, as ubiquitin-like modifiers such as NEDD8 and ISG15 leave the same mark, the data have to be interpreted cautiously (Kim et al. 2011).
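
The di-glycine remnant adds a fixed monoisotopic mass of 114.0429 Da (C4H6N2O2) to the modified lysine, which is how search engines recognize it. The following minimal sketch shows how the expected mass and m/z of a remnant-bearing tryptic peptide could be computed. The function names and the example peptide are hypothetical; the mass shift alone cannot distinguish ubiquitin from NEDD8 or ISG15, in line with the caveat above.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565      # added once per peptide (free N- and C-terminus)
PROTON = 1.007276      # charge carrier in positive-ion mode
DIGLY_SHIFT = 114.042927  # Gly-Gly remnant on the lysine side chain


def peptide_mass(sequence, diglycine_sites=()):
    """Monoisotopic mass of a peptide with optional K-GG sites.

    diglycine_sites holds 1-based positions, each of which must be a lysine.
    """
    mass = WATER + sum(RESIDUE_MASS[aa] for aa in sequence)
    for position in diglycine_sites:
        if sequence[position - 1] != "K":
            raise ValueError(f"position {position} is not a lysine")
        mass += DIGLY_SHIFT
    return mass


def mass_to_mz(mass, charge):
    """m/z of the protonated peptide at the given positive charge state."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "AGDKLTR"  # hypothetical tryptic peptide, modified at K4
    unmodified = peptide_mass(peptide)
    modified = peptide_mass(peptide, diglycine_sites=[4])
    print(f"unmodified [M+2H]2+: {mass_to_mz(unmodified, 2):.4f}")
    print(f"K-GG       [M+2H]2+: {mass_to_mz(modified, 2):.4f}")
```

Because trypsin generally does not cleave after a modified lysine, the remnant-bearing lysine is placed internally in the example peptide, which therefore carries one apparent missed cleavage.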

5.6.2 Autophagosomal-Lysosomal System

MS-based proteomics approaches have focused on (I) the protein-protein networks responsible for the regulation of autophagosomal-lysosomal protein degradation and (II) the underlying organellar proteome dynamics (Zimmermann et al. 2010). In an extensive protein interaction study, a complex autophagy-related protein network was outlined, with implications for vesicle trafficking and for protein and lipid phosphorylation (Behrends et al. 2010). Interestingly, a sequence motif was identified that binds to the autophagosomal marker protein LC3 (LC3 interacting region; LIR) and serves as a signal for specific autophagy degradation receptors (Johansen and Lamark 2011). It was shown that, in the case of the protein optineurin, the interaction is phosphorylation dependent and that phosphorylation promotes the selective removal of ubiquitin-coated cytosolic Salmonella bacteria by autophagy (Wild et al. 2011).

Both lysosomes and autophagosomes have been studied by proteomic means and were either enriched by AP or profiled over gradient centrifugations. As lysosomes are relatively labile, AP approaches have been used, e.g. magnetic separation of iron-containing lysosomes and immunoprecipitation of whole organelles by anti-vacuolar H+-ATPase antibodies, allowing insights into lysosomal trafficking (Cardoso et al. 2009; Nylandsted et al. 2011). Autophagosomes have been studied by both AP and profiling approaches. APs are commonly performed using anti-GFP antibodies in combination with cells stably expressing the autophagosomal marker protein LC3 fused to GFP (Dengjel et al. 2012; Gao et al. 2010). In combination with time-resolved data and different autophagy-inducing stimuli, this has generated interesting insights into organellar dynamics and identified a close crosstalk between the UPS and the autophagosomal-lysosomal system (Dengjel et al. 2012).

5.7 Outlook

Proteomics-based protein discovery has seen rapid technical advances and, through the application of various mass spectrometry-based techniques, has enhanced our understanding of diverse and complex systems. The area of degradomics has evolved in recent years and has started a new era, enabling the identification of proteolytic cleavage products of natural substrates in their biological context in cell-based as well as in tissue samples. A number of key publications describing promising techniques have appeared in the past 5 years and lay the foundation for protease substrate discovery. Now is the time to apply these techniques in sophisticated experimental settings. Degradomic techniques have already identified hundreds of potential substrate candidates, and many cleavage sites have been validated by biochemical or cell-based approaches, uncovering multiple roles of proteases in physiology and pathology. Even though not many in vivo degradomic studies have been published to date, recent presentations and discussions at scientific meetings suggest that a wealth of knowledge is forthcoming.

A key challenge will be the discrimination between direct and downstream effects resulting from alterations in protease activity in vivo. Substrate candidates revealed by degradomics have to be validated by biochemical approaches, such as immunoblotting, in vitro cleavage assays, imaging techniques to establish co-localization of protease and substrate, and cell-based assays to evaluate protease function in a more physiological context. Since MS instruments constantly improve in sensitivity and accuracy, it will be equally important to create computational platforms for data analysis that are accessible and easy to handle even for non-experts. It is desirable to integrate proteomics data with phenomic and functional genomic information to fully reconstruct proteolytic networks. Present degradomic studies have already provided impressive insights into the function of proteases and protease inhibitors and into their interactions with each other (Butler and Overall 2009; Overall and Dean 2006). The question remains whether the protease network is guided through direct compensatory mechanisms or rather represents “accidental” accumulations of downstream and secondary effects. Independently of this debate, present studies highlight that the proteomic phenotypes stemming from loss or gain of protease function often reveal strong connections between different proteolytic systems.

To fully understand the biological role of protease networks in cancer or other diseases it is important to not only focus on substrates alone but also include other proteases, protease inhibitors, cofactors and receptors to analyze their relationships. This has to be extended to the analysis of further PTMs, since different PTMs act in concert to control gene expression, control cell signaling processes, and affect protein stability and degradation.

The mass spectrometry-based analysis and identification of PTMs is still challenging, and methodological improvements are required. Due to the complexity of PTM regulation, typical approaches follow a discovery-based strategy, detecting a mixture of hundreds to thousands of proteins including their PTMs through mass spectrometric analysis. In degradomic approaches, targeted proteomics is playing an increasingly important role, as it provides a sensitive and specific way to selectively measure proteins and peptides. Techniques like multiple reaction monitoring (MRM) (Picotti et al. 2007, 2009) or multiplexed selected ion monitoring (Gallien et al. 2012) will be increasingly used to specifically monitor protein processing by measuring several proteins with high sensitivity and increasing throughput. With the present proteomic and degradomic techniques, we now possess valuable tools to dissect the protease web in both health and disease states, with the aim of better understanding protease biology.