Introduction

The ability to perform complex post-translational modifications is a major attribute that distinguishes mammalian cells from other platforms used to produce biopharmaceuticals. Indeed, only a few biopharmaceutical proteins, such as albumin (Recombumin) and insulin (Lispro), undergo modifications simple enough that they can be manufactured using yeast or bacteria [1]. The most prevalent modifications include variable glycosylation, misfolding and aggregation, oxidation of methionine, deamidation of asparagine and glutamine, and proteolysis. Detecting and preventing these modifications has become a major challenge for the biotechnology industry. As the market for therapeutic proteins produced in mammalian cells has expanded, so too has the requirement for improved efficiency and stability of production [2]. Expression systems that exploit multiple gene copies or use strong promoters to increase intracellular mRNA levels do not consistently increase the amount of secreted recombinant protein, in part due to bottlenecks in the secretory pathways [3].

Protein Misfolding and Aggregation

The folding and initial glycosylation of most secreted proteins take place in the endoplasmic reticulum (ER) lumen. However, misfolded proteins can accumulate as intracellular aggregates and induce dilation of the ER [4]. The rate-limiting step for recombinant protein secretion therefore becomes the exit of correctly folded polypeptide chains from the ER. Molecular chaperones such as heavy chain-binding protein (BiP) facilitate protein folding at high concentrations by binding unfolded protein chains, preventing aggregation, and/or supporting refolding. These and other molecules deliver misfolded and unfolded proteins to the unfolded protein pathway that degrades peptides in the endosome [5], as shown in Fig. 1. Additionally, chaperones such as calreticulin and calnexin provide another quality control mechanism by binding transiently to newly synthesized glycoprotein intermediates, ensuring that only correctly folded proteins are released from the ER [6]. However, in high-producing cell lines the capacity of these chaperones can be overloaded by recombinant protein, leading to accumulation of misfolded and aggregated proteins; one example is CHO cells producing antithrombin III (AT-III) [7]. In that study, some of the intracellular AT-III formed disulfide-bonded high molecular weight aggregates at expression levels that were rate-limiting to the secretion of this molecule (7.4 μg/10^6 cells/day). The data indicated that a portion of the AT-III was aggregated either as dimers or as higher molecular weight forms, and that most of the aggregates contained disulfide bonds because they were not detected under reducing conditions. More AT-III aggregation was also observed in cells that had reached the late exponential phase of culture than in cells in the earlier exponential growth phase; however, a metabolic trigger for this increase in aggregation could not be found.

Fig. 1
Protein folding in the endoplasmic reticulum (ER). Secreted proteins are co-translationally located in the ER lumen. Both co-translational and post-translational folding events are chaperoned by ER-resident folding chaperones such as heavy chain binding protein (BiP), protein disulfide isomerase (PDI), glucose regulated protein 94 (GRP94), calnexin (CNX), and calreticulin (CRT). Disulfide bond formation is catalyzed by PDI and its accessory proteins. The chaperones and folding enzymes release the proteins from the ER only when they have folded correctly and assembled into oligomers

Protein aggregates can arise by several mechanisms that include reversible or irreversible reactions, non-covalent interactions between hydrophobic domains, or the formation of disulfide bonds. Some are insoluble and others remain in solution. For non-vaccine biotherapeutics, all types of aggregates are considered undesirable, since small soluble aggregates may become immunogenic and larger particulates may cause problems at the site of administration [8]. Whilst there are US Pharmacopoeia guidelines for the number of particles of size ≥10 and ≥25 μm that are acceptable in a pharmaceutical preparation, the permissible number of soluble aggregates is not well defined.

Disulfide bonds result from the coupling of unpaired thiols on cysteine residues. Whilst this coupling is essential for the correct formation of some multimeric proteins such as IgG, it can also be a source of covalent aggregate formation and protein misfolding. One effective way of reducing the free thiol content and increasing disulfide bond formation in recombinant IgG produced in CHO cells is to add low amounts (up to 100 μM) of the oxidizing agent copper sulfate. This resulted in a 10-fold reduction in the percentage of free thiols, and significant (3-fold) reductions were observed with as little as 5 μM copper sulfate [9]. To assist the formation of disulfide bonds between thiol groups, the lumen of the ER provides an oxidizing environment, the substrate glutathione, and enzymes such as protein disulfide isomerase (PDI), which catalyze the formation of protein tertiary structures [10]. PDI catalyzes the formation and breakage of disulfide bonds between cysteine residues within proteins as they fold, allowing proteins to quickly find the correct arrangement of disulfide bonds.

Effects of Protein Aggregation

Evidence has been found that protein aggregates can continue to form in the supernatant after cells (producing recombinant IgG) have been harvested [8], and the aggregation problem is not confined to upstream bioprocessing. For example, the techniques used to inactivate viruses during downstream processing such as exposure to detergents or extremes of pH can inadvertently damage and aggregate the protein product [11]. Low pH conditions (pH 2–4) are also commonly used to elute antibodies from Protein A capture columns during purification. Multiple filtration steps are also used in protein purification for concentration, buffer exchange, and virus removal. Large protein aggregates can cause membrane fouling and the high pressures employed may also increase aggregation during these process steps. Excipients such as sugars [12] and arginine [13] are often used to suppress aggregate formation during protein purification and formulation.

In addition to aggregates causing manufacturing problems, the administration of proteins with low-level aggregate contamination can lead to an immune response in the patient, resulting in inhibitory antibodies to the therapeutic protein [14]. This can be a particular problem with multimeric proteins such as recombinant IgG and blood clotting proteins such as Factor VIII [15]. Another example is where the bio-therapeutic forms aggregates with its excipient: a formulation of interferon alpha-2a became oxidized at room temperature and formed aggregates with the excipient human serum albumin. This induced an immune response to IFN-alpha-2a, and changing to a liquid, albumin-free formulation stored at 4°C reduced the immune reaction [16].

Cell Line Engineering to Improve Protein Quality

Engineering of chaperone systems by overexpressing a single component of the ER-resident protein folding machinery has yielded mixed results overall. BiP overexpression has been shown to increase heterologous protein secretion for about half of the heterologous proteins studied [17], and in some cases it inhibits protein secretion [18]. It has been proposed that either insufficient ATP levels or the lack of co-chaperones such as Lhs1p may become rate-limiting to BiP functions. In addition, increased BiP activity may stall the GRP94, calnexin–calreticulin chaperone, and unfolded protein response machineries, because of the hierarchy of ER luminal chaperone systems.

As with BiP, PDI overexpression does not always lead to improved levels of protein secretion [19, 20]. Regeneration of oxidized PDI is essential for consistent disulfide bond formation, and other factors in this pathway such as glutathione availability and the activity of Ero1p (the enzyme that oxidizes PDI) may become rate-limiting in PDI over-expressing cells [21].

Clearly, more work is required to optimize the secretory and folding capacities of recombinant mammalian cells both to improve yields and increase product quality. Recent advances in CHO genomics [22] should facilitate these approaches.

Oxidation and Deamidation

Some of the methionine residues in a protein can be oxidized to methionine sulfoxide or even methionine sulfone. The exposure of susceptible residues on the protein surface can determine the extent of methionine oxidation, and cell culture media components and formulation excipients can be chosen to protect the protein from oxidative damage [23]. In cases such as α1-antitrypsin (used in the treatment of emphysema), oxidation of methionine residues leads to a loss of its critical anti-elastase activity [24].

Deamidation of asparagine residues to form aspartic acid and iso-aspartic acid is another cause of protein degradation, particularly during long-term storage (Fig. 2) [25]. Most of the deamidated asparagine forms iso-aspartate, which is not a natural amino acid and can potentially be immunogenic. This isomerization is accompanied by partial racemization of the newly created isoaspartyl and aspartyl residues. This degradative event can have serious implications for drug efficacy: the potency of one commercially available antibody was reduced to 70% post-deamidation [26], and deamidation has also been shown to alter the activity of stem cell factors [27]. The non-enzymic deamidation reaction is accelerated at alkaline pH, with surface asparagine residues being more susceptible. Glutamine residues can also be deamidated, but this reaction is one hundred times slower than asparagine deamidation and is rarely detected in recombinant proteins. Predicting which asparagine residues will be prone to deamidation in a new therapeutic protein can also be difficult.
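
The relative rates described above can be illustrated with a minimal sketch that assumes non-enzymic deamidation follows pseudo-first-order kinetics. The first-order model, the function names, and the 30-day asparagine half-life are illustrative assumptions rather than values from the cited studies; only the one-hundred-fold difference between asparagine and glutamine is taken from the text.

```python
import math


def rate_constant_from_half_life(half_life_days: float) -> float:
    """Return the first-order rate constant (per day) for a given half-life."""
    if half_life_days <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_days


def fraction_deamidated(rate_constant_per_day: float, days: float) -> float:
    """Fraction of a residue population deamidated after `days` (first-order model)."""
    if days < 0:
        raise ValueError("time must be non-negative")
    return 1.0 - math.exp(-rate_constant_per_day * days)


# Hypothetical half-life for a labile asparagine; not a value from the cited studies.
ASN_HALF_LIFE_DAYS = 30.0
k_asn = rate_constant_from_half_life(ASN_HALF_LIFE_DAYS)
# The text states that glutamine deamidation is about one hundred times slower.
k_gln = k_asn / 100.0

for days in (30, 180, 365):
    print(
        f"day {days:3d}: Asn {fraction_deamidated(k_asn, days):6.1%}, "
        f"Gln {fraction_deamidated(k_gln, days):6.1%}"
    )
```

Under these assumptions, a glutamine at an equivalent site would remain largely intact over a storage period in which a labile asparagine is extensively deamidated, which is consistent with glutamine deamidation being rarely detected.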

Fig. 2
Two types of degradative post-translational modifications: (a) asparagine deamidation and (b) methionine oxidation

Analytical Methods

Detecting protein aggregates can be a challenge, not least because aggregates vary in size from simple dimers to large particles several μm in diameter. Size-exclusion HPLC (SEC-HPLC) has become one of the standard techniques used to detect aggregates, but some artifacts may be introduced under the high pressures needed to run HPLC, and the large aggregates may not enter the column. A certain degree of incomplete product recovery is also evident with SEC due to protein adsorption to the gel matrix. Because of these limitations, the FDA usually requests orthogonal analytical techniques to confirm SEC-HPLC results. Analytical ultracentrifugation (AUC) has been proposed as an alternative detection system based on differences in sedimentation rates, but the AUC equipment can be quite expensive [28]. Because AUC is both time-consuming and expensive, it may not lend itself to routine product characterization but rather serves as a validation/calibration technique. Larger aggregate particles can be detected using laser light scattering [29].
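
As a minimal sketch of how SEC-HPLC data are commonly reduced, the example below expresses each integrated peak as a percentage of the total eluted area and estimates column recovery against a reference injection. The peak names, the area values, and the idea of using a column-bypass injection as the reference are assumptions made for illustration; they are not taken from the cited work.

```python
from typing import Dict


def percent_by_species(peak_areas: Dict[str, float]) -> Dict[str, float]:
    """Express each integrated peak area as a percentage of the total eluted area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


def column_recovery_percent(eluted_area: float, reference_area: float) -> float:
    """Eluted area as a percentage of a reference (e.g. column-bypass) injection."""
    if reference_area <= 0:
        raise ValueError("reference area must be positive")
    return 100.0 * eluted_area / reference_area


# Hypothetical integrated areas (arbitrary units) from one chromatogram.
areas = {"aggregate": 12.0, "monomer": 380.0, "fragment": 8.0}
percentages = percent_by_species(areas)
recovery = column_recovery_percent(sum(areas.values()), reference_area=420.0)

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
print(f"recovery: {recovery:.1f}%")
```

Reporting recovery alongside the percentage of aggregate makes losses visible: material that adsorbs to the matrix or fails to enter the column does not appear in the area percentages, which is one reason an orthogonal technique is requested.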

Protein aggregate-sensitive dyes may help by complementing existing detection methodologies used in bioprocessing. Among the most widely used dyes is Congo Red, a sulfonated azo dye that undergoes a spectral shift from 490 to 530 nm when bound to fibrils and other forms of protein aggregates [30]. In addition to the parent molecule, various Congo Red derivatives are available which may differ in their lipophilicity or ability to undergo fluorescence [31]. The other principal dye utilized in detecting aggregates is thioflavin T, a benzothiazole dye that exhibits enhanced fluorescence at 480 nm when bound to amyloid. Thioflavin T has also been adapted for fluorescent spectrophotometric assay [32].

Various other dyes such as 4-(dicyanovinyl)-julolidine (DCVJ) [31] have also been used in both static and kinetic assays to characterize aggregation events using both steady state and time-resolved fluorescence techniques. Introducing 96-well plate spectrophotometric assays based on these dyes could facilitate protein quality measurements during clone and media selection.

Total free thiol content can be rapidly assessed spectrophotometrically using Ellman’s reagent or a fluorescent thiol-sensitive maleimide dye, such as Fluorescein-5-Maleimide, or by hydrophobic interaction chromatography [9]. Intra- or inter-molecular disulfide bond formation can be detected using diagonal electrophoresis, whereby protein complexes are separated first under non-reducing and then under reducing conditions [33].
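
The spectrophotometric calculation behind an Ellman's reagent assay can be sketched as follows, assuming the Beer-Lambert law and a molar extinction coefficient of 14,150 M^-1 cm^-1 for the released TNB chromophore at 412 nm. That coefficient is a commonly used literature value and should be checked against the assay conditions in use; the function names and absorbance readings are hypothetical.

```python
# Assumed extinction coefficient of TNB at 412 nm (M^-1 cm^-1); verify for your buffer.
TNB_EXTINCTION_M_CM = 14150.0


def free_thiol_molar(
    a412: float,
    blank_a412: float = 0.0,
    path_cm: float = 1.0,
    dilution_factor: float = 1.0,
) -> float:
    """Free thiol concentration (mol/L) from blank-corrected absorbance at 412 nm."""
    net = a412 - blank_a412
    if net < 0:
        raise ValueError("sample absorbance is below the blank")
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return net / (TNB_EXTINCTION_M_CM * path_cm) * dilution_factor


def thiols_per_protein(thiol_molar: float, protein_molar: float) -> float:
    """Moles of free thiol per mole of protein."""
    if protein_molar <= 0:
        raise ValueError("protein concentration must be positive")
    return thiol_molar / protein_molar


# Hypothetical readings for a 50 uM IgG sample in a 1 cm cuvette.
thiol = free_thiol_molar(a412=0.1615, blank_a412=0.0200)
ratio = thiols_per_protein(thiol, protein_molar=50e-6)
print(f"free thiol: {thiol * 1e6:.1f} uM, {ratio:.2f} mol thiol per mol protein")
```

Expressing the result per mole of protein allows cultures with different titres to be compared directly, for example when assessing a media supplement intended to lower the free thiol content.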

Methionine oxidation can be detected by mass spectrometry or UV detection of peptide fragments [34]. Peptides containing oxidized methionines can also be separated from natural fragments on a weak ion-exchange column [35] or by hydrophobic interaction chromatography [36].

Asparagine residues prone to deamidation can be identified through peptide mapping of the recombinant protein. This involves reduction, alkylation, and tryptic digestion of the protein and subsequent HPLC separation of the tryptic peptides prior to mass spectrometric analysis; these multiple steps are difficult to adapt into high-throughput screens. Peptide mass fingerprinting has been employed to identify deamidation sites in human IgG [25]. The ISOQUANT assay has been developed by Promega for detecting deamidation. This method exploits the ability of the enzyme protein L-isoaspartyl methyltransferase (PIMT) to transfer the active methyl group of S-adenosyl-L-methionine (SAM) onto the free alpha-carboxyl of iso-aspartate to form an O-methyl ester. The isoaspartyl methyl ester rapidly breaks down at neutral pH to form the same cyclic imide that arises from deamidation or aspartate isomerization, with concomitant release of methanol. Incubating proteins with PIMT and a radioactive substrate ([3H]-SAM) allows isoaspartyl sites to be quantitatively methylated and then quantified by scintillation counting of the released [3H]-methanol. However, no detection method for oxidized methionines in proteins currently exists in a microtitre format that could be used for high-throughput screening.
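
The quantification step of the radiometric PIMT method described above can be sketched as a simple calculation, assuming one molecule of labelled methanol is released per isoaspartyl site methylated and that counts have already been converted to disintegrations per minute (dpm). The function names, the specific activity, and the count values are hypothetical and are not taken from the kit documentation.

```python
def isoaspartate_pmol(
    sample_dpm: float,
    blank_dpm: float,
    specific_activity_dpm_per_pmol: float,
) -> float:
    """Picomoles of isoaspartate from blank-corrected counts of released methanol."""
    if specific_activity_dpm_per_pmol <= 0:
        raise ValueError("specific activity must be positive")
    net_dpm = sample_dpm - blank_dpm
    if net_dpm < 0:
        raise ValueError("sample counts are below the blank")
    # Assumes 1:1 stoichiometry between methanol released and isoaspartyl sites.
    return net_dpm / specific_activity_dpm_per_pmol


def isoaspartate_per_protein(isoasp_pmol: float, protein_pmol: float) -> float:
    """Moles of isoaspartate per mole of protein loaded into the reaction."""
    if protein_pmol <= 0:
        raise ValueError("protein amount must be positive")
    return isoasp_pmol / protein_pmol


# Hypothetical counts for a reaction containing 500 pmol of protein.
isoasp = isoaspartate_pmol(
    sample_dpm=5200.0, blank_dpm=200.0, specific_activity_dpm_per_pmol=100.0
)
occupancy = isoaspartate_per_protein(isoasp, protein_pmol=500.0)
print(f"{isoasp:.1f} pmol isoAsp, {occupancy:.2f} mol isoAsp per mol protein")
```

A no-protein blank is subtracted because labelled SAM can contribute background counts; the specific activity must be that of the SAM after any dilution with unlabelled SAM.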

Conclusions

The number of recombinant proteins in clinical trials for new and existing therapeutic targets continues to increase each year, as does the total amount of product required globally. It is also predicted that generic protein therapeutics will become more prevalent as the patents associated with the original proteins expire [37]. Despite the success of biotherapeutics there remain significant challenges to be overcome in maintaining product stability and efficacy throughout the production cycle and during long-term storage. These include the pathways reviewed in this article, as well as variable glycosylation, proteolytic cleavage and RNA splice variants. Advances in protein analysis and a deeper knowledge of industrial process conditions should lead to preventative strategies that help counteract these degradative pathways and ensure product efficacy, safety and affordability.