Introduction

The continuous advancement in recombinant DNA technologies has led to the escalating production of biotechnology-derived therapeutics intended for human use. Monoclonal antibodies (mAbs) represent a class of biotechnology-derived therapeutics used to treat a range of indications, including cancer and autoimmune, cardiovascular, and metabolic disorders. Monoclonal antibodies are immunoglobulin molecules engineered to bind to specific antigens, either soluble or cell-surface targets. Immunoglobulins (Ig) are high-molecular-weight proteins comprising two heavy and two light chains with variable domains that bind antigens and Fc constant domains that have effector functions (e.g., complement activation or binding to Fc receptors). There are five main classes of heavy chain constant domains: IgM, IgG, IgA, IgD, and IgE isotypes. All of the currently marketed mAbs in the USA are of the IgG isotype (Table 1). IgG has four subclasses, IgG1, IgG2, IgG3, and IgG4, each with its own biologic properties that can impact activity [1•]. The inherent therapeutic advantage of mAbs over traditional small molecule pharmaceuticals is their high specificity for a particular epitope, which gives mAbs a highly targeted, selective mechanism of action that enables innovative treatment concepts for a range of therapeutic uses [2–4].

Table 1 FDA-approved and currently marketed monoclonal antibody therapeutics (modified from [14])

The first generation of therapeutic mAbs was generated from mouse hybridomas. Because these mAbs were murine derived, immunogenicity was a major limitation in their effective use in patients [5–7]. Subsequently, a variety of advanced techniques have been developed to reduce the immunogenicity of mAbs in humans by replacing the murine regions of the antibody with human components. For example, antibody engineering was employed to create chimeric antibodies with approximately 30% murine and 70% human components by fusing the murine variable domains with the human constant domains [8, 9]. Today, most mAbs are either humanized, in which only the complementarity-determining regions (CDRs) are of murine origin (overall, 5–10% murine and 90–95% human components), or fully human, created either by phage display or in transgenic “humanized mice.” Humanized mice are genetically engineered to express human IgG but lack functional murine IgGs because the entire mouse IgG repertoire is replaced with a human repertoire [8]. The creation of chimeric, humanized, or fully human antibodies has led to the regulatory approval and marketing of numerous therapeutic mAbs for the treatment of various disorders. In addition, through the continued advancement in antibody engineering and manufacturing technology, other antibody-based therapeutics are being developed (such as antibody drug conjugates or bispecific antibodies) [10–12].

This review focuses on the nonclinical safety assessment of mAbs. At the time this review was written, 33 mAbs had been approved and were marketed in the USA as therapeutic medicines for various indications, including cancer and immune and metabolic disorders (Table 1). The first therapeutic antibody approved in the USA was muromonab-CD3 (Orthoclone OKT-3®; Ortho Biotech Products, L.P., Bridgewater, NJ), a murine IgG2a mAb that recognizes the cluster of differentiation-3 (CD3) receptor complex on human T lymphocytes. The US Food and Drug Administration (FDA) approved OKT-3 in 1986 for the prevention of allograft rejection in renal transplantation [13]. OKT-3 is no longer manufactured due to the availability of other therapeutic medicines that have similar efficacy and fewer side effects [14]. The next therapeutic mAb approved by the FDA, in 1994, was ReoPro® (manufactured by Janssen Biologics B.V., Leiden, The Netherlands and distributed by Eli Lilly and Company, Indianapolis, IN) for the treatment of blood clot complications in patients undergoing cardiac procedures [15]. The first mAb approved by the FDA for the treatment of cancer was Rituxan®, which was approved in 1997 and jointly marketed by Biogen Idec Inc., Cambridge, MA and Genentech Inc., South San Francisco, CA. Rituxan is a chimeric IgG1 anti-CD20 mAb that binds to human B lymphocytes and is used as a monotherapy or in combination with chemotherapy for treating B cell malignancies such as non-Hodgkin’s lymphoma [16]. Daclizumab (Zenapax®) and basiliximab (Simulect®) are mAbs specific for the IL-2 receptor and were approved in 1997 and 1998, respectively, for the prevention of kidney transplant rejection. Daclizumab is a humanized IgG1 mAb, and basiliximab is a chimeric IgG1 antibody [17, 18]. Several mAbs have been approved for the treatment of autoimmune disorders, such as rheumatoid arthritis and psoriasis.
For example, adalimumab (Humira®) is a human IgG1 mAb specific for tumor necrosis factor (TNF) that was approved by the FDA in 2002 for the treatment of rheumatoid arthritis. Over the years, adalimumab has been approved for several other autoimmune disorders, including juvenile idiopathic arthritis, psoriatic arthritis, plaque psoriasis, ankylosing spondylitis, adult and pediatric Crohn’s disease, and ulcerative colitis [19]. Monoclonal antibody therapeutics most recently approved by the FDA include ramucirumab (Cyramza®), a human IgG1 mAb specific to vascular endothelial growth factor receptor 2 (VEGFR2); siltuximab (Sylvant®), a chimeric IgG1 antibody specific to IL-6; and pembrolizumab (Keytruda®), a humanized IgG4 mAb that blocks the interaction between programmed cell death protein 1 (PD-1) and its ligands, PD-L1 and PD-L2. Ramucirumab was approved in 2014 for the treatment of advanced or metastatic gastric or gastroesophageal junction adenocarcinoma with disease progression on or after prior fluoropyrimidine- or platinum-containing chemotherapy [20]. Pembrolizumab was approved in 2014 for the treatment of unresectable or metastatic melanoma [21].

The complex nature of therapeutic mAbs, which are proteins derived from living cells, makes these molecules fundamentally different from traditional small-molecule drugs that are chemically synthesized. Because of this complexity, many factors must be considered in the nonclinical safety assessment studies for these molecules. Additionally, each mAb has its own distinct properties; therefore, each mAb should be considered individually, and a science-based approach should be applied in the toxicology studies performed for these molecules. The concepts that will be reviewed include 1) the regulatory procedures and guidelines that apply to mAbs, 2) the types of toxicology studies applicable to mAbs, and 3) scientific challenges, such as the selection of a relevant animal species and the development of anti-drug antibodies, that can arise due to the unique properties of mAbs.

Regulatory Overview of Monoclonal Antibodies

The main regulatory guidance for mAbs is the International Conference on Harmonization (ICH) Preclinical Safety Evaluation of Biotechnology-Derived Pharmaceuticals S6(R1) [22••], also referred to as the S6 Addendum in the USA. ICH S6(R1) details the studies, study design considerations, and scientific rationale for the nonclinical development of biologic products, including mAbs. Importantly, the guidance covers species selection, dose selection, the evaluation of immunogenicity, reproductive toxicity testing, the timing of reproductive toxicity studies when the non-human primate (NHP) is the only relevant species, and carcinogenicity assessment. These topics are further discussed in this review.

Other guidances that should be consulted for the development of biologic products include ICH Nonclinical Evaluation for Anticancer Pharmaceuticals S9 [23] for biologic oncology products and ICH Non-Clinical Safety Studies for the Conduct of Human Clinical Trials and Marketing Authorization for Pharmaceuticals M3(R2) [24] for the timing of nonclinical studies.

Species Selection

As with small-molecule products, the nonclinical safety of biopharmaceutical products should be evaluated in two pharmacologically active mammalian species, one rodent and one non-rodent. Specifically, ICH S6(R1) states “If there are two pharmacologically relevant species for the clinical candidate (one rodent and one non-rodent), then both species should be used for short-term (up to 1 month duration) general toxicology studies” [22••]. Because mAbs are highly specific for their human target, species cross-reactivity of the mAb may be limited to the human and NHP. Selecting a pharmacologically relevant species for the toxicology evaluation of mAbs is critical to ensure that the toxicology data will predict the potential adverse consequences of modulating the mAb’s protein target in humans. The evaluation of a pharmacologically relevant species typically includes a comparison of the sequence homology of the mAb protein target in each species with that in humans. In addition, the binding affinity of the mAb to its protein target across species (including human) can inform potential species differences and may be sufficient to determine a pharmacologically relevant species. Ideally, an in vitro and/or in vivo functional assay could be used to evaluate the pharmacological activity of the mAb across species. For example, a decrease in a specific cytokine, protein target, or enzyme could be evaluated in vitro and also be measured in vivo as a pharmacodynamic biomarker of functional activity. Finally, it is important to thoroughly understand the target expression and biology in the nonclinical species compared with the human. This information can come from the literature or from early research studies. The importance of fully understanding the biology of the mAb target came to light in the development of TGN1412, which was an anti-CD28 super-agonist mAb designed to treat B cell chronic lymphocytic leukemia and rheumatoid arthritis.
In 2006, the administration of TGN1412 to healthy young male volunteers resulted in cytokine storm and multiorgan failure, severe inflammatory reactions that were not predicted from the nonclinical toxicology studies. It was later discovered that CD28 is expressed on human CD4+ effector T-memory cells but not monkey CD4+ effector T-memory cells. The activation of human CD4+ effector T-memory cells was likely the cause of the cytokine storm in the volunteers. Because monkeys do not express CD28 on CD4+ effector T-memory cells, cytokine storm in humans was not predicted from nonclinical toxicology studies [25, 26••].

Nonclinical Studies to Support the Clinical Development of Monoclonal Antibodies

Once pharmacologically relevant species are identified, the nonclinical development studies can be designed. The pharmacology and toxicology data needed to support the development of a mAb are generally less extensive than those required for a small molecule because mAbs are more specific to their target, not as widely distributed due to their large size (~150 kDa), and catabolized to amino acids rather than metabolized. Additionally, because mAbs are specific for their human target, species cross-reactivity of the mAb can be limited to the human and NHP. Therefore, nonclinical testing in two species, as is done for small molecules, may not be feasible for a mAb with limited species cross-reactivity.

For mAbs, the general toxicology study requirements include 1) pharmacology data to support the proof of concept, 2) pharmacokinetics/toxicokinetics data to understand the kinetics of the mAb in vivo, 3) safety pharmacology of essential organ systems (i.e., cardiovascular, respiratory, and central nervous system), 4) characterization of repeat-dose toxicology, 5) reproductive toxicology, and, finally, 6) carcinogenicity. The safety pharmacology evaluation can be performed in the context of the general toxicology studies by including specific endpoints to assess cardiovascular toxicity (electrocardiograms, histopathology), respiratory toxicity (clinical observations and/or functional evaluations, histopathology), and central nervous system toxicity (clinical observations, functional observational battery, histopathology), rather than separate studies as is typically done for small molecules. For longer duration repeat-dose toxicology studies, if a biologic shows cross-reactivity to two species, ICH S6(R1) indicates that both species should be used for initial toxicology testing but that longer-term general toxicity studies in one species can be sufficient if the toxicology findings from the short-duration studies are similar, or the findings are anticipated/expected from the mechanism of action of the product [22••].

General Toxicology Study Design Considerations

The general toxicology studies for mAbs should be designed to support the clinical trial in terms of route, dose, dose frequency, and duration. Monoclonal antibodies are usually dosed parenterally, by intravenous or subcutaneous administration. The same route of administration should be used for the nonclinical and clinical studies. For the dose schedule or dose frequency, the nonclinical study should dose animals at least as frequently, if not more so, than the planned clinical trial. The duration of the nonclinical study should support the planned duration in the clinical trial. For example, to support dosing a mAb for a chronic indication (>6 months of dosing), generally 6-month chronic studies in a non-rodent and rodent species, if pharmacologically relevant, are adequate.

ICH S6(R1) provides guidance on how to select the high dose in the general toxicology studies [22••]. The toxicity of mAbs is generally driven by the exaggerated pharmacology of the mAb binding to its target at relatively high doses rather than unexpected off-target toxicity. An understanding of the pharmacokinetic/pharmacodynamic (PK/PD) relationship for the mAb and a marker for functional activity is ideal. If the PK/PD relationship is understood, ICH S6(R1) states that the high dose should be the higher of (1) a dose that provides the maximum intended pharmacological effect in the preclinical species and (2) a dose that provides an approximately 10-fold exposure multiple over the maximum exposure to be achieved in the clinic, unless a lower dose can be considered a maximum feasible dose (MFD) [22••]. Assuming three treatment groups, the low dose generally approximates the clinically efficacious dose, and the mid-dose is an even multiple between the low and high dose groups. The selected doses are important as they will be used to determine the No Observed Adverse Effect Level (NOAEL), which is used to set the starting dose for first-in-human (FIH) clinical trials as described in the FDA Guidance for Industry, Estimating the Maximum Safe Starting Dose in Initial Clinical Trials for Therapeutics in Adult Healthy Volunteers [27]. For mAbs that are highly potent immune-activating or agonist drugs, a minimal anticipated biological effect level (MABEL) approach, which takes into account the in vitro and in vivo pharmacology data, may be more appropriate than the NOAEL for setting the FIH dose. The MABEL approach came out of the TGN1412 incident described above [28, 29]. As clinical development progresses, the NOAEL from future nonclinical studies (e.g., chronic and reproductive toxicology studies) can be used to inform safe dose ranges for longer duration clinical trials in greater numbers of patients.
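The high-dose rule and the NOAEL-based starting dose described above can be illustrated with a minimal Python sketch. It is not taken from ICH S6(R1) or the FDA guidance: the function names, the default 10-fold safety factor, the body-surface-area conversion factors (Km), and the choice to scale a large intravenously administered protein on a mg/kg basis are assumptions made for illustration and should be checked against the guidances themselves.

```python
from typing import Optional

# Illustrative body-surface-area conversion factors (Km, kg/m^2).
# Assumed values for this sketch; verify against the FDA guidance [27].
KM = {"mouse": 3.0, "rat": 6.0, "rabbit": 12.0,
      "monkey": 12.0, "dog": 20.0, "human": 37.0}


def select_high_dose(dose_max_pd: float,
                     dose_10x_exposure: float,
                     max_feasible_dose: Optional[float] = None) -> float:
    """Return the higher of the two candidate doses (mg/kg), capped at the
    maximum feasible dose (MFD) when one is supplied."""
    high = max(dose_max_pd, dose_10x_exposure)
    if max_feasible_dose is not None:
        high = min(high, max_feasible_dose)
    return high


def starting_dose_from_noael(noael_mg_per_kg: float,
                             species: str,
                             safety_factor: float = 10.0,
                             scale_by_bsa: bool = False) -> float:
    """Estimate a maximum recommended starting dose (mg/kg) from a NOAEL.

    With scale_by_bsa=False the human equivalent dose (HED) equals the
    animal NOAEL in mg/kg (an assumption often applied to large proteins
    given intravenously). With scale_by_bsa=True the HED is the NOAEL
    multiplied by Km(animal) / Km(human).
    """
    if scale_by_bsa:
        hed = noael_mg_per_kg * KM[species] / KM["human"]
    else:
        hed = noael_mg_per_kg
    return hed / safety_factor


if __name__ == "__main__":
    # Hypothetical numbers: 30 mg/kg saturates the PD marker, 100 mg/kg
    # gives a ~10-fold exposure multiple, and no MFD limit applies.
    print(select_high_dose(30.0, 100.0))
    # Hypothetical monkey NOAEL of 50 mg/kg, mg/kg scaling, 10-fold margin.
    print(starting_dose_from_noael(50.0, "monkey"))
```

A MABEL-based starting dose would replace the NOAEL input with a dose derived from pharmacology data; that calculation depends on the specific PK/PD model and is not sketched here.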

Pharmacokinetics/Pharmacodynamics

The distribution and clearance of mAbs are generally highly predictable and can depend on whether the mAb target is soluble or membrane-bound [30]. The distribution of mAbs is initially limited to the vascular space, with slow distribution to tissues because of their large molecular weight (~150 kDa). The clearance of mAbs can be target-mediated and/or mediated by the Fc portion of a mAb. The long half-life of mAbs appears to be attributable to the interaction of the Fc portion of IgG with the neonatal Fc receptor (FcRn), which is expressed on various cell types, including endothelial cells: the mAb is internalized into a cell endosome without being degraded into amino acids and is then released back into the circulation [31, 32]. Modifications to mAbs to improve the binding of the Fc portion to the FcRn can result in even longer half-lives [33].

The correlation between the PK and PD of a mAb can inform the efficacious dose range and dose frequency for the clinical trial. A PD marker is a direct or indirect measure of pharmacological activity of the mAb. A PD marker can be a measure of receptor occupancy, ligand binding, a downstream protein target, or even a lymphocyte population that is targeted by the mAb. Correlating PK with PD provides a model to guide both nonclinical and clinical dose level selection in the evaluation of both toxicity in the nonclinical studies as well as efficacy and safety in the clinical studies [34].
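As an illustration of how a PK/PD relationship can guide dose selection, the sketch below uses a sigmoid Emax model, a common textbook choice that is assumed here and is not described in this review. The parameter names (`emax`, `ec50`, `hill`) and the example values are hypothetical.

```python
def emax_effect(conc: float, emax: float, ec50: float, hill: float = 1.0) -> float:
    """Sigmoid Emax model: PD effect at a given drug concentration.

    conc and ec50 share the same units (e.g., ug/mL); emax is the maximal
    effect (e.g., percent receptor occupancy).
    """
    return emax * conc ** hill / (ec50 ** hill + conc ** hill)


def conc_for_fraction(fraction: float, ec50: float, hill: float = 1.0) -> float:
    """Concentration that produces the given fraction (0 < f < 1) of Emax,
    obtained by inverting the Emax equation."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / hill)


if __name__ == "__main__":
    # Hypothetical mAb: EC50 of 10 ug/mL, maximal effect of 100 % occupancy.
    print(emax_effect(conc=10.0, emax=100.0, ec50=10.0))  # effect at the EC50
    print(conc_for_fraction(0.9, ec50=10.0))              # conc. for 90 % of Emax
```

In practice, the concentration needed for a target level of PD effect would be combined with the measured exposure in each species to choose dose levels and dosing intervals.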

Immunogenicity

Immunogenicity, or the generation of anti-drug antibodies (ADAs), is an anticipated response of a healthy animal’s immune system to a foreign protein (such as a mAb) to clear the foreign protein from the body. ADAs are evaluated in the nonclinical study(ies) to aid in the interpretation of the nonclinical exposure and toxicity data. The generation of ADAs can result in a decrease in exposure in the toxicity study. This is a concern if an insufficient number of animals were exposed to the mAb for the entire duration of the study, which could mean that the toxicity of the mAb was not fully characterized. ICH S6(R1) indicates that ADA testing is not mandatory; however, because the study results cannot be predicted, ICH S6(R1) recommends that samples be collected and archived for potential future analysis [22••]. If the data from the toxicology study(ies) indicate that immunogenicity occurred, such as a change in PD, exposure, or immune-mediated reactions, the ADA samples can then be analyzed. Additionally, if there is no PD marker for the in vivo toxicology study, further characterization of whether the ADA can neutralize the therapeutic activity of the mAb should be carried out. If ADA results in a significant decrease in exposure in short duration studies, longer duration or chronic toxicology studies may be challenging to conduct. To mitigate the impact of immunogenicity on exposure, higher doses of the mAb can be used to saturate the ADA response and maintain exposure throughout the study. Importantly, the nonclinical immunogenicity data are not used to predict immunogenicity in patients because animal studies may overpredict immunogenicity rates. However, the 2014 FDA Guidance for Industry Immunogenicity Assessment for Therapeutic Protein Products [35] clarifies that although nonclinical immunogenicity cannot be used to predict the incidence of human immunogenicity, the nonclinical data may be helpful in “describing the consequences” of immunogenicity.

Other General Toxicology Endpoints—Cytokine Release and Immunotoxicity

Additional endpoints can be included if there are specific concerns for the mAb based on the known biology and pharmacology of the target. Some examples of additional endpoints include cytokine release and immunotoxicity endpoints. For mAbs that are agonists or activate their target, cytokine release may be included in the toxicology studies both as a toxicology endpoint for potential immunomodulation and inflammation and as a potential PD marker. Individual cytokines can differ in their kinetics [36] and can have both pro-inflammatory and anti-inflammatory effects over time; therefore, it is important to plan cytokine collection time points around the specific cytokines that may be altered.

If immune modulation and potential immunotoxicity are anticipated with the mAb administration, hematology (total and absolute differential lymphocyte counts), clinical chemistry (globulin and albumin:globulin ratios), organ weights (thymus and spleen), and gross pathology and histopathology of the lymph nodes and lymphoid organs (thymus, spleen) can be evaluated in the general toxicology study, as described in ICH Immunotoxicity Studies for Human Pharmaceuticals S8 [37]. Specific immune cell populations can also be evaluated using flow cytometry (e.g., T-memory cells or T-regulatory cells). Based on the data from the shorter duration studies, additional immunotoxicology endpoints can be included in future general toxicology studies.

Reproductive Toxicity

Reproductive toxicity studies with mAbs are a regulatory requirement as outlined in ICH Detection of Toxicity to Reproduction for Medicinal Products & Toxicity to Male Fertility S5(R2) [38]. However, the study design and dosing schedule can be modified based on an understanding of species specificity (e.g., limited to NHP alone vs. rat and/or rabbit) while taking into consideration the mechanism of action, target biology, immunogenicity, and pharmacokinetics in the species selected for reproductive toxicity testing. Because mAbs contain the Fc portion of IgG, mAbs can bind FcRn expressed on the placenta, cross the placenta by receptor-mediated endocytosis, and result in fetal exposure. An industry survey recently showed that Fc-containing IgG1 molecules were transferred most efficiently in late gestation in rabbits and monkeys with a positive correlation between maternal and fetal exposures [39].

Reproductive toxicity testing should be conducted in pharmacologically relevant species. If the mAb crosses the placenta of both rodents and rabbits, both species can be used for embryo-fetal development studies, unless as ICH S6(R1) describes, “embryo-fetal lethality or teratogenicity has been identified in one species” [22••], in which case only one species needs to be used. When NHP is the only relevant species, developmental toxicity studies should only be conducted in the NHP, although the guidance does state that studies in alternative models can be scientifically justified.

Alternative models such as transgenic mice or use of a homologous or surrogate protein in a species expressing the ortholog of the human target can be considered when there are no relevant species, assuming adequate background knowledge of the model exists. Finally, if the weight of evidence (e.g., mechanism of action, data from knock-out mice, transgenics, and class effects) suggests an adverse effect on fertility or pregnancy, these data may provide adequate information to communicate risk, and additional nonclinical studies may not be needed. For example, interferon products are known to be abortifacient in monkeys, and product labeling communicates this potential risk [22••].

Fertility studies with mAbs are typically not done unless the rat or mouse is a pharmacologically relevant species. If the monkey is the only relevant species, it is recognized that mating studies to evaluate fertility are not practical and that evaluation of reproductive organs (organ weights and histopathology) in repeat-dose toxicity studies of at least 3 months in duration can be used to determine potential effects on the reproductive tract. If reproductive organ toxicity (or a signal of toxicity) is observed, more specialized assessments can be included in a future repeat-dose toxicity study (e.g., menstrual cycle effects, sperm counts, sperm morphology/motility, hormone levels). A homologous protein or surrogate, or a transgenic model, may be considered to evaluate potential effects on conception or implantation when the monkey is the only relevant species. If nonclinical studies are not feasible, the potential risk to patients should be mitigated through the clinical trial, informed consent, and product labeling.

For the evaluation of embryo-fetal development (EFD), when only the NHP is a pharmacologically relevant species, separate EFD and pre- and post-natal development (PPND) studies can be conducted. Alternatively, the enhanced PPND (ePPND) study design, which combines both the EFD and PPND into one study, can be considered [40]. The ePPND study allows for the evaluation of pregnancy outcome, viability, and external malformations at birth following a natural delivery. Animals are monitored by ultrasound for the progression of pregnancy. Skeletal effects are evaluated by X-ray and visceral abnormalities are evaluated at necropsy. Other study designs can also be considered if, based on the pharmacology of the mAb, there are concerns that pregnancy loss or embryo-fetal toxicity could occur. Other endpoints can then be evaluated in the offspring; the duration of follow-up and endpoints will depend on the anticipated pharmacological activity and/or in vivo effects. For example, immunomodulatory drugs may affect lymph node development, and offspring may need to be followed for a long duration to evaluate the impact on lymph node development. This was the case for rituximab, which targets CD20-positive B cells: neonate NHPs were followed up to post-natal day 180 after weaning to evaluate the potential effects of rituximab on the developing lymph nodes [41]. Regulatory authorities note in ICH S6(R1) that studies in NHPs are only useful for hazard identification because the number of animals per group is generally lower than for a rodent or rabbit study [22••]. Additionally, because the study is only for hazard identification, such a study could be done with only a control and one dose group. The scientific justification for the dose level used should ideally be based on PK (a 10-fold exposure multiple over therapeutic drug levels) and PD (saturation of target binding) if feasible.

ICH S6(R1) provides guidance on the timing of the reproductive toxicity studies for mAbs [22••]. If women of child-bearing potential are included in clinical trials prior to the conduct of EFD studies, highly effective methods of contraception can be included in clinical trials to manage the potential risk. For mAbs with pharmacological activity only in NHPs and with sufficient precautions in place to prevent pregnancy in clinical trials, the EFD or ePPND studies can be conducted during phase 3 and the report submitted with the Biologics License Application (BLA) or marketing application. When sufficient precautions to prevent pregnancy cannot be taken, a complete EFD or ePPND study report should be submitted prior to initiating phase 3 trials. For products pharmacologically active only in NHPs and where the pharmacology raises a concern for EFD, product labeling should reflect the potential risk or concern without conducting a developmental toxicity study in NHPs, and administration to women of child-bearing potential should be avoided/contraindicated.
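The timing considerations above, for a mAb that is pharmacologically active only in NHPs, can be summarized as a small decision sketch in Python. The enum, the function name, and the order in which the conditions are checked are assumptions made for illustration; the guidance itself, not this sketch, governs any real development plan.

```python
from enum import Enum


class EfdPlan(Enum):
    """Hypothetical labels for the three outcomes described in the text."""
    CONDUCT_DURING_PHASE_3 = "conduct EFD/ePPND during phase 3; submit report with the BLA"
    REPORT_BEFORE_PHASE_3 = "submit complete EFD/ePPND report before phase 3 starts"
    LABEL_WITHOUT_NHP_STUDY = "reflect risk in labeling; no NHP developmental study"


def nhp_only_efd_plan(pharmacology_raises_efd_concern: bool,
                      pregnancy_precautions_sufficient: bool) -> EfdPlan:
    """Pick a plan for a mAb that is pharmacologically active only in NHPs.

    Assumption: a pharmacology-based concern is checked first, because in
    that case the text describes labeling in place of an NHP study.
    """
    if pharmacology_raises_efd_concern:
        return EfdPlan.LABEL_WITHOUT_NHP_STUDY
    if pregnancy_precautions_sufficient:
        return EfdPlan.CONDUCT_DURING_PHASE_3
    return EfdPlan.REPORT_BEFORE_PHASE_3


if __name__ == "__main__":
    plan = nhp_only_efd_plan(pharmacology_raises_efd_concern=False,
                             pregnancy_precautions_sufficient=True)
    print(plan.value)
```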

Carcinogenicity

The need for a carcinogenicity assessment for a mAb therapeutic depends on a number of different factors that include 1) a duration of clinical use >6 months (including repeated intermittent use); 2) evidence of carcinogenic potential based on the product class, mechanism of action, pharmacology, or known biology; or 3) preneoplastic lesions in the toxicology studies. A weight of evidence approach can be used for mAbs that draws upon data from multiple sources such as the published literature (e.g., transgenics, knock-outs, animal disease models, human genetic disease), class effect information, target biology and mechanism of action, in vitro data, chronic toxicity data, and clinical data. The information from these various sources may be sufficient to inform clinical risk so that additional nonclinical studies are not needed. If the weight of evidence indicates a concern for carcinogenic potential, rodent bioassays would not be warranted, and the risk can be addressed in product labeling and clinical risk management. When the weight of evidence is unclear, a sponsor can consider additional nonclinical studies to address or mitigate the concern. For example, a weight of evidence approach was used in the carcinogenicity assessment for ustekinumab (Stelara®; a human IgG1κ mAb against the p40 subunit of the IL-12 and IL-23 cytokines). Data from the literature describing studies of xenograft mice treated with IL-12 and knock-out mouse models were used in the product labeling to communicate a potential positive carcinogenic signal [42].

Where there are insufficient data regarding carcinogenic potential, a more thorough evaluation may be needed, which could include studies to further understand the biology of the mAb target and/or additional endpoints in the toxicity studies. When there is no cause for concern or evidence for carcinogenic potential, additional nonclinical testing is not needed. Nonclinical studies can be done to mitigate a potential concern for carcinogenic potential. The carcinogenicity assessment is used to communicate risk for the product during clinical trials, labeling, and post-marketing. Alternative approaches can be considered [22••].

Summary

In summary, mAbs represent a highly targeted and specific class of biotherapeutics for a broad range of indications. The increasing historical experience with the clinical development of mAbs for both industry and regulatory authorities led to the development of an addendum to the original ICH S6 guidance for the nonclinical development of biologics, including mAbs, ICH S6(R1) [22••]. This review summarizes the key elements needed to evaluate the nonclinical safety of mAbs, which include the selection of a relevant species, key considerations for toxicology study design, addressing the challenges of immunogenicity, as well as considerations for reproductive toxicity and carcinogenicity studies. The implementation of the recommendations in the ICH S6(R1) guidance for the development of mAbs may result in a more focused nonclinical development plan with potentially fewer overall studies and animals used for the nonclinical testing of mAbs without negatively impacting the quality of the nonclinical safety evaluation.