General introduction

Most NMR work on proteins aims at understanding their biological functions, 3D structures, binding interactions and dynamics. Some of these macromolecules are potential target proteins for the treatment of diseases. Another class of interesting proteins is used as active pharmaceutical ingredients (APIs) in drug products (DP). These molecules are often named biologicals or biopharmaceuticals. An ever-growing number of such drugs are currently on the market or in development (Moorkens et al. 2017). Many but not all biopharmaceuticals are monoclonal antibodies or Fab fragments thereof. In contrast to the high medical and economic relevance of this class of molecules, only a small number of methods and applications can be found in the literature on the use of NMR to support the development of biologicals and marketed products. The main purpose of this contribution is not to provide a classical scientific paper with an in-depth methods description, but rather to show how standard NMR experiments and equipment readily available in many laboratories can be used to solve analytical problems in the fields mentioned. This is done through a selection of applications to which the authors have contributed over the years. The report is organized in three sections. The first part describes specificities of NMR methodologies for the analysis of therapeutic proteins. In the second section, applications of protein NMR to support already marketed products are described. Finally, the third part deals with examples from the development process of biologicals.

NMR methodology for the analysis of biopharmaceuticals

Technical considerations

Many NMR-based methods for the spectral assignment of proteins, and for the determination and characterization of their structural states in liquid (Cavanagh 2018) and solid state (Lange et al. 2005), are described in the literature. For larger proteins with molecular weights higher than approximately 10 kDa, most of these methods rely on the availability of 15N and/or 13C enriched proteins. These labels are introduced via biotechnological methods during recombinant protein expression. For therapeutic proteins such labels are available only in rare cases (Klaus et al. 1997; Zink et al. 1994). This is because proteins used in drug development have to be produced by processes that allow for scale-up and standardization. Therefore, methods that rely on the production of protein material dedicated to NMR-based structural studies are rarely applicable. The same holds true for excipients or other chemicals whose interaction with, or influence on, the therapeutic protein has to be investigated. NMR researchers in this field often have to use batches of defined quality and manufacturer, which are provided by the scientific partners of a study. This applies even more strictly if NMR has to be applied to biologicals already on the market (e.g. biosimilars or counterfeits of biologicals). Here the pharmaceutical product submitted for investigation dictates the composition of the samples.

That said, it is clear that most of the NMR-based methods in the literature can only be used in selected cases, and the investigator has to rely on methods for which isotope enrichment is not a prerequisite. Thus, NMR signal assignments of therapeutic proteins are often not available, and spectroscopic results have to be interpreted in a meaningful way without this knowledge.

These limitations are counterbalanced in part by the fact that the protein API under investigation is often available in large quantities. The protein concentration in typical pharmaceutical samples is often extremely high and can reach 100 mg/ml. For many applications this allows a very simple sample preparation: an adequate amount (typically 550 μl for a 5 mm NMR tube) of dissolved or reconstituted (for lyophilized biologicals) aqueous solution is transferred with a pipette to the NMR tube, where 5–10% (vol/vol) of D2O is added and gently mixed by shaking. If needed, this mixing step can be done outside of the NMR tube so that bubbles or foam can be removed by centrifugation or decanting. Normally no chemical shift or internal quantification standards are added. Chemical shifts of substances in formulation mixtures vary slightly depending on the composition. Therefore, comparison of chemical shifts with tabulated reference data is helpful, but the values never match perfectly. Quantification is done in our laboratory solely by use of sensitivity-calibrated instruments (external standard). Use of an internal chemical standard is difficult, because in formulated protein solutions it is sometimes nearly impossible to find a spectral region without overlap with excipient or protein-related NMR signals. Absolute quantification without recourse to reference substances or spiking of internal reference compounds has gained attention over recent years (Holzgrabe 2010a). Quantification of unexpected compounds in counterfeits, or of contaminants in development projects or production batches, is especially relevant. With this methodology NMR offers, compared to other quantification techniques like HPLC, high flexibility and versatility, because it circumvents any compound-specific method development.
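The principle of external-standard quantification can be illustrated with a minimal Python sketch. It is not the calibration procedure used in our laboratory: it assumes that sample and reference are measured under identical conditions (same probe, pulse calibration, receiver gain and temperature), so that the integral per proton and per scan is directly proportional to concentration. The function name and all numerical values are hypothetical.

```python
def concentration_from_external_standard(
    integral_sample, protons_sample, scans_sample,
    integral_ref, protons_ref, scans_ref,
    conc_ref_mM, dilution_factor=1.0,
):
    """Estimate the analyte concentration (mM) in the undiluted sample.

    Simplifying assumption: identical acquisition conditions for sample and
    external reference, so integral / (protons * scans) scales with
    concentration. Corrections for pulse length, receiver gain or
    temperature are deliberately left out of this sketch.
    """
    per_proton_sample = integral_sample / (protons_sample * scans_sample)
    per_proton_ref = integral_ref / (protons_ref * scans_ref)
    return conc_ref_mM * per_proton_sample / per_proton_ref * dilution_factor


# Hypothetical example: 500 ul of formulation diluted with 50 ul of D2O.
dilution = (500 + 50) / 500

conc = concentration_from_external_standard(
    integral_sample=4.0, protons_sample=2, scans_sample=64,  # analyte CH2 signal
    integral_ref=9.0, protons_ref=9, scans_ref=64,           # 9-proton reference signal
    conc_ref_mM=1.0,
    dilution_factor=dilution,
)
print(f"Estimated analyte concentration: {conc:.2f} mM")
```

Because no compound-specific response factor enters the calculation, the same function applies to any signal with a known number of contributing protons, which is what makes the approach attractive for unexpected compounds.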

Obviously such simple sample preparations require solvent suppression techniques like single- or multi-frequency presaturation (Hoult 1976). If no quantitative analysis is needed Watergate type sequences are in use alternatively (Liu et al. 1998). A related problem is the substantially higher molar concentration of excipients compared to that of the protein API, and their related NMR signals that hamper data interpretation. This difficulty can be overcome by the SIERRA/SMILE method (Arbogast et al. 2018) or by use of diffusion ordered spectroscopy (DOSY) (Wu et al. 1995) by which large signals related to low-molecular weight compounds can be suppressed substantially. Examples are shown below.
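Why diffusion filtering removes excipient signals while retaining the protein can be illustrated with the Stejskal-Tanner relation, I/I0 = exp(-D (γ g δ)² (Δ - δ/3)), which holds for rectangular gradient pulses. The following Python sketch is illustrative only: the diffusion coefficients and gradient parameters are assumed, order-of-magnitude values and not the settings used in our laboratory, and practical DOSY pulse sequences use modified forms of this expression.

```python
import math

GAMMA_1H = 2.675e8  # proton gyromagnetic ratio in rad s^-1 T^-1


def attenuation(diff_coeff, gradient, delta, big_delta):
    """Stejskal-Tanner attenuation I/I0 for rectangular gradient pulses.

    diff_coeff: diffusion coefficient in m^2/s
    gradient:   gradient strength in T/m
    delta:      gradient pulse duration in s
    big_delta:  diffusion delay in s
    """
    b = (GAMMA_1H * gradient * delta) ** 2 * (big_delta - delta / 3.0)
    return math.exp(-diff_coeff * b)


# Assumed, illustrative values (not taken from the measurements reported here)
D_EXCIPIENT = 5.0e-10  # small molecule in water, m^2/s
D_ANTIBODY = 4.0e-11   # large protein, m^2/s

params = dict(gradient=0.45, delta=4.0e-3, big_delta=0.1)

print(f"excipient signal retained: {attenuation(D_EXCIPIENT, **params):.1e}")
print(f"antibody signal retained:  {attenuation(D_ANTIBODY, **params):.2f}")
```

Under these assumptions the small-molecule signal is attenuated by several orders of magnitude, whereas a substantial fraction of the protein signal survives.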

Given the high concentration of protein API, it was recently proposed to use 1H–13C or 1H–15N correlated two-dimensional NMR spectroscopy to monitor the influence of formulation or other parameters on the higher-order structure (HOS) of proteins (Arbogast et al. 2015; Brinson et al. 2019). One example following this strategy is included in section three of this report. It should be mentioned that monitoring of the integrity of HOS has alternatively been pursued by one-dimensional 1H NMR spectroscopy (Poppe et al. 2015). Another approach used in our laboratory to obtain in-depth structural information on large unlabeled proteins is to digest the protein with suitable proteases and to isolate the peptides of interest in amounts sufficient for NMR. One example of this approach is presented in this report for a pegylated protein drug. Important applications of NMR in the development of biologicals also include the characterization and quantification of leachables (Mueller et al. 2009) or impurities introduced by packaging materials, like pre-filled syringes or vials (Malmstrom 2019), and the structural characterization of degradation products of biopharmaceutical excipients (Kishore et al. 2011). Such applications are not presented as examples here, and the reader is referred to the literature mentioned.

With respect to NMR hardware, no extra requirements emerge besides the usual need for the highest possible resolution and sensitivity. Depending on the stage of development, it may be necessary to run the NMR facility under good manufacturing practice (GMP) (Webster et al. 2010). Our NMR facility does not run under GMP, but regular monitoring of equipment performance is implemented as a system suitability test for spectral resolution and sensitivity.

In our laboratory a Bruker 600 MHz instrument equipped with a cryogenic 5 mm QCI{31P} and a 500 MHz equipped with a cryogenic 1.7 mm TCI probehead are in use. The latter is important for isolated peptide fragments which are often available at sub-milligram amounts only.

NMR methods in use and typical applications in our laboratory are summarized in Table 1. More related literature references are provided in this table and the examples sections.

Table 1 Summary of NMR methods used in our laboratory for structural characterization and quantitative analysis related to protein drug samples

Spectral database

For fast identification of excipients, and to rule out conclusively the presence of genuine Roche drugs in counterfeits, a spectroscopic database was built with 250 excipients widely in use in pharmaceutical formulations, aqueous solutions of all biological drugs and DMSO solutions of low-molecular weight drugs marketed by our company. For all sets of samples a panel comprising 1D 1H, 1H–13C HSQC, 1H–1H TOCSY as well as DOSY data was measured.

Protein NMR applied to marketed product support

Introduction

Counterfeiting of goods is a serious issue, especially when medicines are targeted. The consequences for patients are appalling since, even in the “best-case” scenario, counterfeits may contain impurities and are manufactured and stored under dreadful conditions (Riccardi 2014; Bate 2012a, c). In many cases they contain a wrong or inadequate dose of the required API. They may be formulated with different excipients, and might even contain toxic compounds or sometimes just “dirty water”. Counterfeiters have long targeted lifestyle medicines. Nowadays, however, lifesaving products like cancer or antimalarial medicines are also heavily affected, and no therapeutic area is immune. Contrary to what is commonly assumed, all regions of the world are affected (Pharmaceutical Security Institute 2018; WHO 2010; SafeMedicines 2018; Degardin et al. 2015; Bate 2012b; Deisingh 2005; del Castillo Rodriguez 2015; Medina et al. 2016). Complex manufacturing and delivery chains involving several countries complicate scientific and legal investigations. The production and trafficking of counterfeits have proved to be the result of organized crime (Union des Fabricants 2016; Degardin and Roggo 2016; Reynolds and McKee 2010; UNODC 2010). A fast and detailed analysis of the samples is therefore necessary in order to help the investigators speed up cases and arrest criminals. Both the packaging and the DP of a medicine can be counterfeited almost independently, resulting in products with genuine and counterfeit parts. Criminals have reused genuine packaging material and manipulated it to introduce a fake medicine. In order to reach the right conclusion and gain understanding of the “modus operandi” of the counterfeiters, it is therefore necessary to analyze the packaging as well as the DP.

Most reports on the chemical analysis of counterfeit medicines describe the identification of small molecules. Methods can be grouped into spectroscopic (Assi et al. 2011; Zontov et al. 2016; Dégardin 2016; Neuberger and Neususs 2015; Ortiz et al. 2013; Holzgrabe 2010b; Dégardin 2017b) and chromatographic tools (Custers et al. 2016; Schappler 2014; Deconinck et al. 2012; Marini et al. 2010). Biologicals represent an ideal target for criminals, who can easily achieve a high profit margin. In contrast to small-molecule APIs, the characterization of proteins is laborious and requires different analytical strategies. Spectroscopic tools, like infrared or Raman spectroscopy or microscopy, can be used as first screening tools to detect the presence or absence of a protein (Dégardin 2017a). Using spectroscopy, the analyst can get a general idea of the formulation of the DP and very quickly identify a “bad” counterfeit. In the case of a confirmed counterfeit, the interpretation of Raman or IR bands also helps to detect most of the compounds present in the solution. Simple methods like capillary electrophoresis have been used for the identification of the protein in the DP (Flurer and Wolnik 1994; Dégardin 2015; Marini et al. 2010). However, in the case of degraded or unknown proteins in counterfeits, more advanced methods, typically liquid chromatography coupled with mass spectrometry (LC–MS), are usually needed. The detection of the right API is, however, not sufficient, since counterfeiters sometimes refill old packaging like vials or syringes. In these cases traces of the right protein might be detected, but at a much lower concentration. Protein content methods like UV spectrophotometry are used to complete the identification and quantification of the DP. For this purpose NMR provides an attractive alternative.

A full characterization of the chemical composition of counterfeits is primarily necessary for drug safety, and detailed information may be supportive in the event of legal action. The more detailed the characterization of the composition is, the easier it is to detect links between different counterfeit cases (Degardin et al. 2015). Such evidence, used in a forensic intelligence context, can deliver additional information for law enforcement (Ribaux 2003).

The chemical composition of counterfeits can be very exotic, especially in the case of liquids. The methods mentioned previously cannot always help to unequivocally identify the unknown compounds, in particular, since corresponding reference data are not always available. In these cases, it is still important to identify compounds unequivocally and to provide their concentration for the evaluation of patient safety. In this context, NMR analyses can be of great support. It can identify unknown active ingredients or confirm their presence if a previous method was able to propose them. NMR can provide their concentration, help to identify the excipients present in the mixture, and finally propose the presence of contaminants that have been introduced by the counterfeiters.

A few papers have already been published about the use of NMR for the analysis of counterfeit medicines, however to the authors’ knowledge only for small molecules. Holzgrabe and Malet-Martino published for instance a review of NMR methods used in the context of counterfeit analysis (Holzgrabe and Malet-Martino 2011). More specific applications can be found in the literature (Kaur et al. 2015; McEwen et al. 2012; Remaud et al. 2013; Gaudiano et al. 2016; Trefi et al. 2008; Wilczynki et al. 2017; Nyadong et al. 2009). One novelty of the present publication resides in the fact that the analyzed samples were counterfeits of protein-based medicines.

Along the lines described above, the use of NMR for the structural characterization of biosimilars or competitor products (including non-comparable biologicals, NCBs) is also part of the examples given below. In this context a biosimilar is defined as a structurally highly similar molecular entity with a pharmaceutical effect identical to that of the original approved product. NMR-based analysis of biosimilars is important to prevent patent infringement of related genuine products, but also has implications if these molecules are used as counterfeits.

Examples

Example 1: structural characterization of pegylation chemistry

Pegylation of biopharmaceuticals, that is the covalent conjugation of high molecular weight, water-soluble and synthetic polyethylene glycol (PEG) to therapeutic proteins, is used to enhance the drug pharmacological activity by increased blood half-life (Veronese and Mero 2008; Milla et al. 2012). For PEG-conjugation, a broad variety of functional PEG linkers are available. These include PEG amine, maleimide PEG, N-hydroxysuccinimide ester PEG, bromide PEG and others (Veronese and Harris 2002). All PEG linkers consist of an activated head group, a short hydrophobic spacer, and the PEG moiety with a specified and defined chain length. The PEG density is additionally controlled by the use of branched PEG units or by attachment of multiple PEG chains to the spacer. Non-specific conjugations to amines and thiols result in a heterogeneous distribution of pegylation sites, while site-specific conjugations yield well defined conjugation products (Dozier and Distefano 2015; Roberts et al. 2002). Protocols for the purification and characterization of pegylated proteins are reported (Akbarzadehlaleh et al. 2016; Gonzalez-Gonzalez et al. 2012), however, to our knowledge the retrospective de-novo structural characterization of the conjugation and spacer chemistry has never been described before.

The elucidation of the linker chemistry between a synthetic polymer and a protein is a considerable analytical challenge. The reason is the difference in solvation properties of the protein and PEG moieties, which leads to difficulties in most matrix-based analytical methods. In this respect analysis by solution NMR would be advantageous, but the size limitation of protein NMR precludes the direct study of marketed pegylated proteins. To solve this problem, a simple biochemical procedure was established that comprises the proteolytic digestion of the protein followed by NMR analysis of the generated pegylated peptide fragments. In the case described here, the structural elucidation of the PEG linker chemistry served investigations of biosimilars and competitor products regarding patent violations. Beyond that, the presented methodology has been of use in other in-house studies related to covalent protein modifications induced during biotechnological protein production processes, and can also be applied to non-PEG based polymer conjugates (Singh et al. 2016).

Analysis of intact drug

Formulated biopharmaceuticals were analyzed by NMR prior to proteolytic digestion of the pegylated proteins they contain. For this purpose, 500 µl of the aqueous drug solution including all formulation excipients was transferred to a regular 5 mm NMR tube and gently mixed with 50 µl NMR-grade D2O. Even a simple 1D 1H NMR experiment provides a wide range of information, including the identification and concentration of constituents and contaminants, as well as the PEG-to-protein ratio obtained from the integral ratio of the ethoxy, the terminal O-methyl and the protein methyl signals. The spectral comparison of a genuine drug reference and a competitor’s product is shown in Fig. 1.

Fig. 1
Aliphatic regions of the 1D 1H spectra of intact pegylated genuine drug and a related competitor product are shown. Information on the PEG-to-protein ratio is obtained from the integral ratio of the ethoxy (PEG), the terminal O-methyl (PEG-CH3) and the protein methyl signals. The excipients acetate, polysorbate 80 (see asterisks) and benzyl alcohol (not shown) are present in both pharmaceuticals at slightly different concentrations. In addition, differences in spurious contaminations with methanol (MeOH), iso-propanol (IPA), tert-butanol, and ethanol (EtOH) were easily identified and quantified using NMR spectroscopy
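How integral ratios translate into a PEG chain length and a PEG-to-protein ratio can be sketched as follows. This is a minimal Python illustration under stated assumptions: each ethoxy repeat unit contributes four protons, each terminal O-methyl group three protons, and the number of protein methyl protons under the integrated envelope is known from the sequence. The function names, the integral values and the proton count of 300 are hypothetical and are not data from Fig. 1.

```python
def per_proton(integral, n_protons):
    """Normalize an integral to the signal of a single proton."""
    return integral / n_protons


def peg_repeat_units(i_ethoxy, i_omethyl, n_omethyl_groups=1):
    """Average number of -O-CH2-CH2- units per PEG molecule.

    n_omethyl_groups is the number of terminal O-methyl groups per PEG
    molecule (1 for a linear mPEG, more for branched PEG; an assumption
    the analyst has to supply).
    """
    peg_molecules = per_proton(i_omethyl, 3 * n_omethyl_groups)
    ethoxy_units = per_proton(i_ethoxy, 4)
    return ethoxy_units / peg_molecules


def peg_to_protein_ratio(i_omethyl, i_protein_methyl,
                         n_protein_methyl_protons, n_omethyl_groups=1):
    """Molar PEG-to-protein ratio from O-methyl and protein methyl integrals."""
    peg = per_proton(i_omethyl, 3 * n_omethyl_groups)
    protein = per_proton(i_protein_methyl, n_protein_methyl_protons)
    return peg / protein


# Hypothetical integrals on an arbitrary common scale
units = peg_repeat_units(i_ethoxy=4544.0, i_omethyl=3.0)
ratio = peg_to_protein_ratio(i_omethyl=3.0, i_protein_methyl=300.0,
                             n_protein_methyl_protons=300)

print(f"average repeat units per PEG: {units:.0f}")
print(f"approximate PEG mass: {units * 44.05 / 1000:.1f} kDa")
print(f"PEG-to-protein ratio: {ratio:.2f}")
```

In practice the protein methyl envelope is broad and may overlap with excipient signals, so the ratio obtained this way should be read as an estimate.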

Biochemical processing

The biochemical processing for the structure analysis of the linker chemistry requires at least 2 mg of total pegylated protein as starting material. After tenfold concentration using Amicon® Ultra 3000 MWCO (Millipore) filter devices, the protein moiety was denatured and reduced with 6 M guanidinium chloride, 5 mM EDTA and 10 mM dithiothreitol. Iodoacetamide (30 mM) was then added, and the mixture was incubated for 5 min in the dark for the irreversible alkylation of cysteines. The denaturation buffer was exchanged for protease digestion buffer using a NAP-25® desalting spin column (GE Healthcare). The eluted fraction was supplemented with the appropriate quantity of ProNase® (Roche Applied Science) and incubated overnight. After lowering the pH to 3.0, a second digestion with pepsin was run for another 12 h to complete proteolysis. The digested protein solution was concentrated to 200 µl using Amicon® Ultra 3000 MWCO, and high molecular weight PEG-peptide fragments were purified by reverse phase HPLC (2.1 mm Acquity C18). Pure fractions were pooled and lyophilized. A total yield of 1.2 mg of PEG-peptide was resuspended in 34 µl H2O (10% NMR-grade D2O) and transferred to a 1.7 mm sample tube. It must be noted that, due to the high molecular weight of the PEG polymer, only a few percent of the total mass belongs to the linker and peptide fragment domain. The major steps of the biochemical process are listed in the flow chart of Fig. 2a and are visualized in the illustration of Fig. 2b.

Fig. 2
a/b The flowchart and the visualization summarize the major steps of the established biochemical process for the analysis of a typical pegylated protein. An amount of 2 mg of PEG-protein drug was required to successfully accomplish the digestion and to purify 1.2 mg of PEG-peptide. Note that this corresponds to 15 µg of linker for which the structural analysis was performed
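The quantities stated above can be put into perspective with a short calculation. The Python sketch below only restates the arithmetic implied by the numbers given in the text and in the caption of Fig. 2 (1.2 mg of PEG-peptide, 34 µl sample volume, 15 µg of linker); the variable names are chosen for illustration.

```python
# Quantities quoted in the text and in the caption of Fig. 2
peg_peptide_mg = 1.2  # purified PEG-peptide
volume_ul = 34.0      # sample volume in the 1.7 mm tube
linker_ug = 15.0      # part of the mass belonging to the linker

# Total PEG-peptide concentration in the NMR sample (mg/ul -> mg/ml)
total_conc_mg_per_ml = peg_peptide_mg / volume_ul * 1000.0

# Fraction of the sample mass that carries the structural information
linker_fraction = linker_ug / (peg_peptide_mg * 1000.0)

# Concentration of the linker itself (ug/ul is numerically equal to mg/ml)
linker_conc_mg_per_ml = linker_ug / volume_ul

print(f"PEG-peptide concentration: {total_conc_mg_per_ml:.1f} mg/ml")
print(f"linker mass fraction:      {linker_fraction:.2%}")
print(f"linker concentration:      {linker_conc_mg_per_ml:.2f} mg/ml")
```

The resulting mass fraction of roughly one percent shows why a small-volume, high-sensitivity probehead and suppression of the dominant PEG signal are needed for this kind of analysis.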

NMR analysis

The high sensitivity of the 1.7 mm TCI probehead allowed the acquisition of the standard 2D HSQC, HMBC and TOCSY spectra required for the structure elucidation of the PEG linker chemistry. The superposition of the 2D HSQC and HMBC spectra is shown in Fig. 3a. Interestingly, the attachment of 50 kDa PEG did not significantly affect the line width, and sharp resonance signals were observed for the pegylation linker and the peptide fragment. For 2D experiments, solvent and PEG signal suppression using double presaturation was indispensable to obtain spectra of high quality. The sequential assignment of the peptide fragment attached to the PEG moiety matched the N-terminal residues of the biopharmaceutical protein (Cys-Asp-Leu-Pro-Gln). Only one set of signals was found in the spectra, and a homogeneous single-site pegylation was concluded. Most importantly, the assignment of the PEG linker identified an amino-propyl motif that connected the PEG moiety to the N-terminus of the protein. The chemical structure of the peptide fragment and pegylation linker is shown in Fig. 3b, and assigned proton and carbon chemical shifts are listed in Table 2. The elucidated linker chemistry differed significantly from that of the reference drug of our company, and the absence of patent infringement was proven.

Fig. 3
a The superposition of the non-multiplicity edited HSQC (black contours) and HMBC (red contours) spectra of the competitor’s product is shown. 1H and 13C chemical shifts that belong to the amino-propyl structural motif are connected with green dashed lines. Note that the total acquisition time of both spectra was one weekend on our 1.7 mm TCI probehead. b The chemical structure of the peptide fragment (Cys-Asp-Leu-Pro-Gln) including the characterized N-terminal pegylation chemistry is shown. The iodoacetamide-based alkylation of cysteine was spectroscopically confirmed. Assigned proton and carbon chemical shifts are summarized in Table 2

Table 2 The 1H and 13C chemical shift table of pegylated protein fragment

Example 2: analysis of biopharmaceutical counterfeits by NMR spectroscopy

The success of biopharmaceutical drugs is accompanied by the attempts to imitate products with the intent to deceive and mislead consumers. The determination of the exact composition of the counterfeits is essential in order to define the danger encountered by patients. While all processes and constituents of a genuine biopharmaceutical drug undergo highest quality controls, the composition of a counterfeit may come as a surprise. In this regard, characterization by NMR spectroscopy can provide fast and accurate information. As a great advantage for samples of unknown composition, NMR spectroscopy does not require method development in the classical sense. Sample preparation is fast, non-destructive, and there is no direct contact between sample and instrumentation. Since NMR acts as a universal detector for all soluble molecules that carry covalently linked protons, mixtures of small molecules, polymers and proteins are analyzed and identified in the same sample with the same instrument. Thereby, the intrinsic linear response of NMR detection allows quantification, which is of high importance in counterfeit characterization. All these advantages make NMR extremely suitable to analyze biopharmaceutical counterfeits.

Sample preparation

As described in more detail in the introduction section the NMR samples of aqueous solutions of suspect counterfeits are generated in our laboratory by simple addition of 10% of NMR-grade D2O. Based on the available volumes of submitted counterfeits, disposable 3 mm or 5 mm NMR tubes are used for the analysis.

Counterfeits without proteins

Most counterfeits of biopharmaceutical drugs are aqueous solutions that do not contain proteins. In these cases, the major constituents are excipients or other low-cost APIs. Their use can simulate the viscosity and brilliance of a protein solution, make the solution more stable, or simply give the patient a vague impression of medication. Figure 4 shows a collection of counterfeits that were infiltrated into the market as monoclonal antibody drugs. The signal contours of a typical antibody are clearly visible in the 1H spectrum of a reference drug in Fig. 4a. No protein was detected in the counterfeits shown in Fig. 4b and c. Instead, a variety of small molecules was easily identified by use of our in-house NMR spectroscopic database. The preservative benzyl alcohol, which efficiently inhibits the growth of microorganisms, is often found. Hydroxyethyl starch is used as a blood plasma expander in intravenous therapies and is clinically available in large volumes portioned in sterile plastic bags. Small-molecule APIs, which can be related or unrelated to the purpose of the counterfeit medication, are also often found. For example, the nucleoside analogue gemcitabine is used in chemotherapy, where it blocks DNA synthesis. Methylprednisolone is a corticosteroid used to suppress the immune system and to decrease inflammation. Liquid diclofenac preparations that consist of a mixture of diclofenac and propylene glycol were also found. Spectra of counterfeits in which the protein drug is replaced by a small-molecule API are shown in Fig. 4c. This short list shows impressively how creative, but also how useless and dangerous for patients, the composition of drug counterfeits can be.

Fig. 4
Comparison of 1D 1H spectra of suspected counterfeits and a typical antibody drug reference from the Roche internal spectral database (spectrum in a). b Shows spectra with presence of only excipient type molecules, c shows spectra with low-MW API molecules and panel d shows spectra where in addition to excipients wrong proteins were present in order to mimic the expensive protein content of a monoclonal antibody

Counterfeits with wrong proteins

Protein drugs can be counterfeited not only with preparations that contain small molecules or polymers, but also with protein-containing solutions. These protein solutions can have diverse origins, for example expired biopharmaceuticals, inexpensive solutions used in clinical treatments, or protein products of the life science, cosmetics or food industry. The chemical shift regions of protein methyl and backbone amide signals are usually free of excipient signals and are ideally suited for comparative studies against references. Since variations of the broad signal envelopes can discriminate between different proteins, 1D 1H spectra are often sufficient to confirm a suspected counterfeit. If excipient signals interfere, DOSY NMR at higher field gradients is used to clean the spectra. A quantitative analysis of the drug formulation is included in the analysis report and can provide important hints on the origin and nature of the protein solution. Unambiguous protein identification by NMR is, however, only feasible if an identically formulated reference protein is available. Otherwise, in-depth analysis by protein mass spectrometry is required to clarify protein identity. Figure 4d shows a small collection of counterfeits in which wrong proteins were found. It is no surprise that these proteins are available cheaply and in relatively large quantities.

Example 3: analysis of biosimilars by NMR spectroscopy

Similar to generic drugs that replace branded products, biosimilars are follow-on biologicals that substitute genuine biopharmaceuticals after their patents expire. In contrast to small molecules, the structural properties and modifications of these protein drugs are difficult to analyze. For these investigations, the standard focus of our NMR analysis lies on the quantitative comparison of the protein content, formulation components and spurious contaminations. Without isotopic labeling, NMR sensitivity is generally not adequate for studies of subtle modifications of proteins. An exception is the comparative study of a biopharmaceutical drug and its biosimilar presented in Fig. 5. In this case, NMR was able to detect a small variation of the monoclonal antibody.

Fig. 5
a The spectroscopic comparison of a genuine drug with its biosimilar product reveals several differences with regard to spurious contaminations and protein composition. Trehalose used for the formulation of the drug is of lower quality in the biosimilar and shows an increased level of glucose contamination. a/b An additional lysine-like motif with typical proton chemical shifts was found (see red arrows). b The 2D TOCSY confirmed the connectivity of the lysine βH, γH, δH and εH resonance signals. Note that such variability of C-terminal lysine is one of the most commonly detected heterogeneities, which can easily be assessed to control antibody production and processing

NMR samples were prepared by the addition of 10% of NMR-grade D2O, and data sets of standard 1D 1H, 2D DOSY (not shown), HSQC (not shown) and TOCSY were acquired. As shown in Fig. 5a, the overall chemical shift patterns of protein and excipients matched closely. All excipients and the monoclonal antibody were present at highly similar concentrations. Besides the contamination of trehalose with glucose, additional signals with lysine-like chemical shifts were found in the biosimilar. DOSY NMR confirmed identical diffusion rates for these lysine-like NMR signals and the protein (data not shown). The 2D TOCSY in Fig. 5b supported the suspicion by the observation of a highly mobile lysine with typical βH, γH, δH and εH resonance signals. Subsequently, a covalent C-terminal attachment of a lysine residue to the monoclonal antibody was confirmed by mass spectrometry performed on a proteolytic digest. C-terminal lysine residues are usually removed during cell culture by carboxypeptidases; however, depending on the mammalian cell lines used for production, variations of the C-terminal lysine content may occur (Dick et al. 2008). Results of NMR analysis of biosimilars can support investigations related to patent infringements.

Protein NMR applied to development of biologicals

Introduction

During the development of a biologic, NMR can contribute to the identification of a suitable formulation allowing long-term storage of the DP and to the definition of a high-quality biotechnological production method. In selected cases, NMR can provide valuable information on the influence of different parameters on the integrity of the primary structure of biologicals, by characterizing covalent modifications that are induced upon storage of the DP or that are present in side-components co-eluting during purification of the drug substance. More interestingly, the effect of covalent modifications or of details of the formulation on the HOS of the protein molecule can be monitored.

Examples

Example 1: covalent linkage of cysteine to a monoclonal antibody during storage

For stabilization of biologicals against oxidation, inclusion of antioxidants into their formulation is state of the art (Hada et al. 2016). Several antioxidants are available for this purpose, which include ascorbic acid, N-acetyl-cysteine, cysteine or methionine. It is key that the antioxidant in use does not alter the chemical structure of the biological API. The effect of cysteine used as antioxidant was investigated in an explorative study in a formulation of a monoclonal antibody. More specifically, it was proposed that cysteine in solution may attach covalently to solvent exposed cysteine residues via formation of disulfide bonds (Elias et al. 2005). Obviously, the study had to be performed in full formulation of the antibody with other excipients present (e.g. trehalose and histidine) at high concentrations. Compared to concentrations of free unconjugated cysteine and other excipients, the fraction of the cysteine bound to the antibody was predicted to be extremely low. Moreover, it can be assumed that free cysteines may conjugate heterogeneously to different cysteines of the antibody. Therefore, it was clear that a heterogeneously broadened NMR signal has to be identified on a full spectroscopic background of a complex spectrum given by the antibody itself, but also by high-intensity signals of the excipients.

To tackle this challenge, a formulation of the biological was prepared using fully 13C labeled cysteine. A panel of samples was then stored under controlled environmental conditions for several weeks. In a 13C edited experiment, NMR signals of all other components of the mixture are expected to be reduced by a factor of about 100 (the natural abundance of 13C is about 1.1%) when compared to those of antibody bound and unconjugated 13C-cysteine. A short calculation revealed that this suppression factor is not high enough to reliably detect the relevant signals on the huge 13C natural abundance background. Therefore, a one-dimensional version of an NMR experiment was used that only delivers a response for a doubly 13C labeled group of the fully 13C labeled marker cysteine. We decided to use an HCACO type experiment (Kay et al. 2011), which only delivers NMR signals for a 13C labeled H–Cα group connected via a single bond to a 13C labeled carbonyl. This setup allows a suppression of irrelevant signals (antibody, other excipients) by a factor of about 10,000. The result of this approach is shown in Fig. 6.
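The "short calculation" mentioned above can be sketched in a few lines of Python. This is only a rough estimate, not the authors' calculation: it assumes a 13C natural abundance of about 1.1%, complete labeling of the marker cysteine, and an equal number of contributing protons per molecule. The 1000-fold excipient excess is taken from the caption of Fig. 6; the function names are illustrative.

```python
# Rough estimate of how well 13C filtering suppresses unlabeled background.
# Assumption: natural abundance of 13C is about 1.1 %.
C13_ABUNDANCE = 0.011


def suppression_factor(n_labeled_sites: int) -> float:
    """Factor by which signals of unlabeled compounds are reduced when the
    experiment requires 13C at n_labeled_sites specific positions at once."""
    return 1.0 / (C13_ABUNDANCE ** n_labeled_sites)


def residual_background(excess: float, n_labeled_sites: int) -> float:
    """Residual background signal relative to the labeled marker signal for
    an unlabeled compound present in `excess`-fold molar excess."""
    return excess / suppression_factor(n_labeled_sites)


single = suppression_factor(1)   # 13C edited experiment (one 13C required)
double = suppression_factor(2)   # HCACO type filter (two adjacent 13C required)

excess = 1000.0                  # excipient excess as in the Fig. 6 caption
print(f"single 13C filter: factor {single:.0f}, "
      f"residual background {residual_background(excess, 1):.2f} x marker")
print(f"double 13C filter: factor {double:.0f}, "
      f"residual background {residual_background(excess, 2):.2f} x marker")
```

Under these assumptions a single 13C filter leaves an excipient background that is still larger than the marker signal, whereas requiring two adjacent 13C nuclei pushes the background well below it.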

Fig. 6

Proof of principle of the HCACO based method for high quality suppression (more than 99.9%) of huge NMR signals related to unlabeled excipients present at high concentration. For this purpose, fully 13C labeled cysteine was spiked into a formulation buffer at a concentration ca. 1000-fold lower than that of a histidine excipient. The red and blue 1D 1H NMR spectra were recorded without carbon decoupling before and after spiking of the 13C cysteine compound. The black spectrum shows the result of the carbon decoupled HCACO experiment on the spiked sample

With this method, the presence of cysteine linked covalently to the protein and the absence of non-conjugated cysteine were confirmed unequivocally after dialysis against an aqueous buffer without excipients. Finally, the resulting sample was reduced and the amount of released cysteine was quantified by NMR. Based on the results shown in Fig. 7, it was decided to use a different antioxidant in the formulation of the material used for clinical studies.

Fig. 7

Application of the HCACO based NMR method to prove covalent cysteine conjugation to an antibody. For the HCACO experiments 8 k datapoints were acquired with summation of 1500 scans within 1 h. The top trace shows a one-dimensional version of a DOSY experiment, recorded with high gradient strength, of the formulated protein solution stored in the presence of 13C-Cys prior to dialysis. Signals related to compounds of low molecular weight are suppressed. These data show that an unequivocal identification of signals related to covalently linked cysteine is impossible here. The middle trace shows 1D HCACO data obtained after dialysis of the protein solution against an excipient free buffer. The presence of a heterogeneous protein conjugate and the absence of non-protein conjugated 13C-Cys are obvious. The bottom trace shows HCACO type data obtained after additional reduction of the dialysed sample. Here the absence of conjugated and the presence of non-conjugated 13C-Cys are proven. This signal was finally used for a quantitative analysis of “cysteinylation”

Example 2: oxidation of methionine as a post-translational modification

It is known that oxidation of different types of amino acids has detrimental effects on the efficacy and safety of therapeutic proteins (Torosantucci et al. 2014). Oxidation of amino acids in proteins is often characterized and quantified by mass spectrometry methods applied to proteolytic protein digests (Witze et al. 2007). This approach fails, however, if the fragment obtained from the digest contains a conjugated polymeric moiety of high mass heterogeneity. In this case the complexity of the MS spectra makes unequivocal identification of oxidation in structural proximity of the conjugation site impossible. As an alternative, an NMR based method for the quantification of methionine oxidation was described in the literature (Wang et al. 2001). This method was developed for non-formulated proteins. The presence of huge NMR signals related to high excipient concentrations hampers application of the standard method. For the exploratory pegylated protein investigated in this study, a diffusion filter based on the DOSY technique was therefore introduced. We show that this filter suppresses the signals of all low-MW excipients not covalently connected to the protein by more than 99.9%. The result of this method is shown in Fig. 8 for an intact protein in full formulation, for a fresh and a stressed sample, in comparison with a regular 1D 1H spectrum. Note: oxidation of a methionine results in two diastereomers of the sulfoxide.

Fig. 8

The top trace shows a 1D 1H experiment of a freshly prepared exploratory protein which was pegylated for improvement of pharmacological properties. In the top trace, the identification of non-oxidized methionine of the protein is possible at 1.95 ppm. Unequivocal identification of the oxidized counterpart is impossible due to overlap with the 13C satellite signal of an excipient in the formulation buffer. The middle trace shows a DOSY type spectrum of the freshly prepared sample. Only a small signal related to oxidized species is present. Note: here the gradient strength was optimized to remove excipient related NMR signals while retaining as large a fraction as possible of the high-MW protein related signals. This type of spectroscopy allows unequivocal identification and quantification of the NMR signals of non-oxidized methionine and of both diastereomeric forms of oxidized methionine at 2.55 ppm. The bottom trace shows the same type of data after storage of this sample for 1 day. Obviously, the NMR signal of the non-oxidized methionine has disappeared
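The trade-off behind the gradient optimization noted in the caption can be illustrated with the standard Stejskal-Tanner relation, in which a signal is attenuated by exp(-b·D), where D is the diffusion coefficient and b collects gradient strength and timing. The following Python sketch is not the authors' procedure, and both diffusion coefficients are hypothetical placeholders chosen only to show the principle.

```python
import math


def protein_signal_retained(residual_excipient: float,
                            d_protein: float,
                            d_excipient: float) -> float:
    """Fraction of protein signal surviving a diffusion filter whose strength
    (b value) is set so that the excipient signal drops to residual_excipient.

    Uses the Stejskal-Tanner attenuation exp(-b * D) for both species.
    """
    # b value needed to reach the requested excipient suppression
    b = -math.log(residual_excipient) / d_excipient
    # the slowly diffusing protein is attenuated far less at the same b
    return math.exp(-b * d_protein)


# Hypothetical diffusion coefficients in m^2/s (assumptions, not measured):
D_EXCIPIENT = 5e-10   # small molecule in aqueous solution
D_PROTEIN = 5e-11     # large pegylated protein, assumed 10x slower

retained = protein_signal_retained(0.001, D_PROTEIN, D_EXCIPIENT)
print(f"protein signal retained at 99.9% excipient suppression: {retained:.2f}")
```

With a tenfold difference in diffusion coefficients, about half of the protein signal would survive a filter that suppresses the excipient by 99.9%; a smaller difference makes the compromise correspondingly harder.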

Results of this method obtained for formulated protein solution exposed to oxidative stress were used to derive the rate of methionine oxidation of the protein in comparison to that of methionine used as an antioxidant in the formulation (data for degradation of the methionine excipient are not shown). Results for a methionine-free solution are shown in Fig. 9. Finally, these results supported the choice of the amount of antioxidant needed to protect the protein for a defined storage time.

Fig. 9

Quantitative analysis of the time-series of the integrals of the disappearing NMR signal of non-oxidized methionine (1.95 ppm in Fig. 8) in comparison with that of the sum of the two appearing signals of the oxidized diastereomeric species (2.55 ppm in Fig. 8). The dots in the figure indicate the measured time-series. The dashed lines show a fit of the signals with a first order reaction kinetic model. The time-constant of the oxidation is provided as an inset
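A first order model of the kind named in the caption describes the decaying integral as I(t) = I0·exp(-k·t), with the time-constant given by 1/k. The Python sketch below shows one simple way to extract k, a linear regression on the logarithm of the integrals. This is a stand-in for whatever fitting procedure was actually used, and the data are synthetic and noise-free, generated from a hypothetical rate constant; they are not the measured values of Fig. 9.

```python
import numpy as np


def fit_first_order(t, integral):
    """Fit I(t) = I0 * exp(-k * t) by linear regression of ln(I) on t.

    Returns (k, I0). Requires strictly positive integrals.
    """
    t = np.asarray(t, dtype=float)
    log_i = np.log(np.asarray(integral, dtype=float))
    slope, intercept = np.polyfit(t, log_i, 1)
    return -slope, float(np.exp(intercept))


# Synthetic demonstration data (hypothetical, NOT measured values)
k_true = 0.2                                  # rate constant in 1/h
t = np.linspace(0.0, 10.0, 11)                # time points in h
non_oxidized = 100.0 * np.exp(-k_true * t)    # decaying integral at 1.95 ppm

k_fit, i0_fit = fit_first_order(t, non_oxidized)
time_constant = 1.0 / k_fit
print(f"k = {k_fit:.3f} 1/h, time constant = {time_constant:.2f} h")
```

The appearing sulfoxide signal would follow the complementary form I∞·(1 - exp(-k·t)). For noisy data, a nonlinear least-squares fit of the untransformed integrals is usually preferable, because the logarithm over-weights small, noisy values at late time points.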

Example 3: influence of oxidation on higher order structure of a monoclonal antibody

In an exploratory study, the use of 1H-13C HSQC chemical shift based profiling was tested for the investigation of the influence of protein oxidation on higher order structure. 20 mg each of a reference and an oxidized monoclonal antibody were dissolved in 550 µl of D2O and the pH was adjusted to 7.0. For both samples two replicates of HSQC spectra were acquired within 19 h each at 323 K without any isotopic enrichment. For the spectra of the non-oxidized reference protein an integration pattern, shown as little boxes in the top panel of Fig. 10, was defined as follows: using the NMR software Amix (Version 3.9.14, Bruker, Fällanden), automated peak-picking was performed on the sum of both replicate spectra. From the resulting peak list the integration pattern was generated fully automatically with the same program, whereby the size of the rectangular integration boxes was set to 0.024 and 0.06 ppm in the 13C and 1H dimension, respectively. This initial pattern contained 519 integral regions and was applied to the set of four spectra. The resulting table was curated for consistency as follows: integrals which were smaller than 25% of the average of all integrals in the spectrum or which varied by more than 18% between the pairs of replicates were removed. This resulted in 347 curated integrals per spectrum. In a second step of the analysis, integrals which varied by more than 20% between pairs of spectra obtained for non-oxidized and oxidized protein were defined as “significantly affected”. This was the case for 20% of the curated signal integrals. Note: changes of integrals are mostly caused by small changes of signal positions. Based on the primary sequence of the antibody it is expected that three methionine residues can be oxidized. The degree of oxidation of the three methionines was determined by mass spectrometry of a proteolytic digest as 26%, 61% and 76%, respectively. Based on all possible combinations of oxidized and non-oxidized methionines, the oxidized sample can be considered a mixture of 8 possible oxidation patterns. Nonetheless, this result clearly shows that oxidation of a few amino acids has a significant effect on details of the higher order structure of an antibody molecule.
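The count of 8 patterns follows from three sites with two states each (2³ = 8). As an illustration, the Python sketch below enumerates these patterns and estimates their populations from the degrees of oxidation reported above. The populations rest on the assumption that the three sites oxidize independently of each other, which the text does not state.

```python
from itertools import product

# Degrees of oxidation of the three methionines as reported in the text
site_oxidation = [0.26, 0.61, 0.76]


def pattern_populations(p_ox):
    """Population of every oxidized/non-oxidized pattern.

    ASSUMPTION: the sites oxidize independently, so the population of a
    pattern is the product of the per-site probabilities.
    """
    populations = {}
    for pattern in product((False, True), repeat=len(p_ox)):
        prob = 1.0
        for oxidized, p in zip(pattern, p_ox):
            prob *= p if oxidized else 1.0 - p
        populations[pattern] = prob
    return populations


populations = pattern_populations(site_oxidation)
for pattern, prob in sorted(populations.items(), key=lambda kv: -kv[1]):
    label = "".join("O" if ox else "-" for ox in pattern)  # O = oxidized site
    print(f"{label}: {100 * prob:5.1f} %")
```

Under the independence assumption, only about 7% of the molecules in the oxidized sample would carry no oxidized methionine at all, which illustrates how heterogeneous the sample compared against the reference is.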

Fig. 10

a Expanded view of non-multiplicity edited HSQC spectra acquired for 19 h with 340 t1 increments and an interscan delay of 1.5 s. Data are shown for the non-oxidized antibody (red) in comparison with the oxidized counterpart (blue). The green box indicates (partially) disappearing methyl signals of intact methionines. Note: the appearing signals of oxidized methionine are not part of this view. Small black boxes indicate integral regions used for further statistical analysis. b Correlation plot of the curated integrals selected based on reproducibility between replicate spectra (for details see text). c Integrals selected for a significant change induced upon oxidation (for details see text). Error bars were calculated from spectral replicates
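The curation and selection steps behind panels b and c can be sketched in Python as follows. The thresholds (25%, 18%, 20%) are those given in the text. The definition of "varied by", taken here as the absolute difference relative to the mean of the pair, is an assumption, as are the function names and the small toy table, which is not real data.

```python
import numpy as np


def relative_difference(a, b):
    """Absolute difference relative to the pair mean (assumed definition)."""
    return np.abs(a - b) / ((a + b) / 2.0)


def curate(ref1, ref2, ox1, ox2, min_fraction=0.25, max_replicate_dev=0.18):
    """Boolean mask of the integral regions kept after curation."""
    spectra = [np.asarray(s, dtype=float) for s in (ref1, ref2, ox1, ox2)]
    keep = np.ones(spectra[0].shape, dtype=bool)
    for s in spectra:
        # drop integrals below 25 % of the average integral of that spectrum
        keep &= s >= min_fraction * s.mean()
    # drop integrals that are not reproducible between replicates
    keep &= relative_difference(spectra[0], spectra[1]) <= max_replicate_dev
    keep &= relative_difference(spectra[2], spectra[3]) <= max_replicate_dev
    return keep


def significantly_affected(ref1, ref2, ox1, ox2, threshold=0.20):
    """Mask of integrals changing by more than 20 % upon oxidation,
    comparing replicate-averaged reference and oxidized spectra."""
    ref = (np.asarray(ref1, dtype=float) + np.asarray(ref2, dtype=float)) / 2.0
    ox = (np.asarray(ox1, dtype=float) + np.asarray(ox2, dtype=float)) / 2.0
    return relative_difference(ref, ox) > threshold


# Toy integral table with four regions (illustrative numbers only)
ref1, ref2 = [100, 100, 10, 100], [102, 130, 10, 98]
ox1, ox2 = [101, 100, 10, 60], [99, 100, 10, 62]

kept = curate(ref1, ref2, ox1, ox2)
affected = kept & significantly_affected(ref1, ref2, ox1, ox2)
print("kept:", kept, "affected:", affected)
```

In the toy table, the second region is rejected for poor reproducibility between the reference replicates and the third for low intensity; of the two remaining regions, only the last changes by more than 20% upon oxidation.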

Conclusion

The examples selected and discussed here have shown that NMR can be used for in-depth structural and quantitative analysis of proteins that are part of development projects in industry or are present as APIs in biopharmaceutical drugs. These analyses contribute to the high quality and safety of products used for the treatment of serious diseases, including several types of cancer and multiple sclerosis. Compared to other analytical methods, the main advantage of NMR is its versatility with respect to sample preparation and the chemical nature of the analytes. NMR can be regarded as an unbiased analytical method to characterize complex mixtures of formulated proteins. Because of the substantial overlap of protein related NMR signals and the low sensitivity of the method, however, minor impurities can be characterized only in selected cases. In most formulated protein solutions, NMR based analytics of sub-percent side products due to degradation, aggregation, dimerization or chemical modifications of the protein is not possible. In these cases, size-exclusion or other types of chromatography are clearly more adequate than NMR. An extension of the methodology not presented here is the application of NMR to identity testing of biologicals for quality control of production batches when chromatographic methods fail.