1 Introduction

Twenty-five years ago, Genentech (now a part of Roche) received commercial approval for Activase®, a recombinant form of tissue plasminogen activator expressed in Chinese hamster ovary (CHO) cells. This accomplishment ushered in the era of utilizing mammalian expression systems to produce complex glycosylated therapeutics. Other established rodent cell lines such as NS0, Sp2/0, and BHK cells have been, and to a certain extent, continue to be used to develop biologics with Remicade®, Erbitux®, and Synagis® being the notable commercial successes. However, of the top-ten selling biologics which amassed over US $57 billion in sales in 2011, eight are produced from mammalian expression systems and of those eight, only one, Remicade®, is not produced from engineered CHO cells [33]. This dominance is further illustrated by a review of recently approved recombinant biologics derived from mammalian cell lines. In the last two years, five out of six utilize a CHO host for expression. Benlysta®, a therapeutic mAb to treat lupus patients, is produced from recombinant murine NS0 cells and represents the sole outlier in the group.

In contrast, the use of human cell lines to express recombinant therapeutics has been somewhat limited, despite the diversity of established human cell lines available. Over the years, regulatory agencies have approved recombinant biologics from two human cell lines, HEK 293 cells (Xigris®) and HT-1080 cells (Dynepo®, Elaprase®, Replagal®, and Vpriv®). Unfortunately, Xigris and Dynepo have been withdrawn from the market due to product-driven safety concerns or market challenges leaving only the enzyme replacement therapies marketed by Shire. With Biogen Idec’s recent BLA submissions for extended half-life Factor VIII and Factor IX produced in HEK 293 cells, there is the possibility that the portfolio of approved biologics produced from human cell lines is poised to expand. Although these established human cell lines have been successfully utilized for the production of biologics, efforts have also been applied to the de novo derivation of human cell lines specifically for the expression of recombinant therapeutics and vaccines. This has most notably been achieved through the immortalization of primary cells with the adenovirus oncogene E1A. The most mature of these efforts is the PER.C6 cell line originally developed by IntroGene from fetal retinal cells. This cell line has been used widely for the production of vaccines and more recently for the production of therapeutic proteins with several clinical programs underway [35]. The PER.C6 cell line gained attention when the DSM and Crucell joint venture Percivia announced in 2008 that they had achieved mAb titers exceeding 25 g/L using a modified perfusion system in which the mAb was retained and concentrated in the bioreactor. Newer to the scene is a human amniocyte-derived cell line called CAP, marketed by CEVEC, that was also immortalized with adenoviral genes [21]. However, at this time there is little publicly available information to assess the robustness and reliability of this cell line.

Beyond cultured cells, a discussion of mammalian hosts would not be complete without acknowledging the success, albeit limited, of expressing recombinant proteins in transgenically modified animals. Patients have access to Atryn®, a recombinant antithrombin medication expressed in the milk of transgenic goats first approved in 2006 by the EMA and then by the FDA in 2009. Although this pioneering work established the proof of principle that a human therapeutic produced from a transgenic animal could gain regulatory acceptance, the biopharma industry has been reluctant to adopt this platform. To date, the only other approved product on the market derived from a transgenic animal is Ruconest®, a recombinant C1 esterase inhibitor approved by the EMA in 2010. The failure of this novel expression system to gain more traction in the industry likely reflects the substantial productivity improvements made in the last five years using the traditional cell line based manufacturing that significantly deflated a major impetus to consider hitching commercial development of new therapeutics to an emerging technology.

The modest interest in developing therapeutic proteins in hosts other than CHO reflects the overall attractiveness of the entire CHO expression package. There is extensive media and process expertise at the industrial scale, making the use of CHO cells virtually “plug and play”. Moreover, the CHO host has a well-established safety profile and is a known entity with regulatory agencies. Arguments have been made that producing a therapeutic glycoprotein in a human cell line would be advantageous from the perspective that the therapeutic would have “human” carbohydrates and therefore be potentially more efficacious and/or less immunogenic. However, in the instances where there have been direct comparisons, analytical methods detected differences in the protein expressed from the recombinant human cell line relative to the endogenous human form of the protein. Furthermore, there was no evidence that the recombinant human form of the protein was safer or more efficacious than the recombinant CHO-derived counterpart. In the case of recombinant erythropoietin (EPO), comparisons of CHO and HT-1080 expressed product showed that there are detectable differences in the sialic acid content of the molecule produced from the different hosts [62]. However, the recombinant EPO produced in the human cell line was also distinguished by isoelectric focusing from endogenous erythropoietin isolated from plasma and urine [50]. In the same vein, there were also no compelling data to suggest an advantage for a human host cell line when CHO-derived Fabrazyme® was compared to HT-1080-derived Replagal®. Although the sialic acid and mannose-6-phosphate content differed between the recombinant alpha galactosidase produced from the two host cell lines, biodistribution in a mouse model and antigenicity studies found the two molecules to be comparable [45].

This is not to say that the CHO host options are without potential issues. It has been well established that CHO and other rodent cell lines are capable of generating glycan structures not seen in humans. In humans, a naturally occurring mutation in the CMAH gene prevents the formation of Neu5Gc, a hydroxylated form of sialic acid, yet this glycan moiety has been detected on several biologics produced from murine cell lines, as well as CHO [56]. There is no compelling evidence to date that suggests the presence of Neu5Gc adversely affects the safety or efficacy of therapeutics. Nonetheless, the presence of circulating antibodies in humans directed to this sugar raises potential concerns that there is an elevated risk of altered clearance and antidrug antibody responses to Neu5Gc-bearing therapeutics [26]. Similarly, the α-1,3-galactose linkage is also absent in humans but is known to be expressed in CHO and murine cell lines [5]. The presence of the xenoantigenic gal-α-gal linkage is of greater concern, as there is credible evidence that the α-gal linkage can have an adverse impact on the safety profile of a biologic therapeutic. For example, Sp2/0 (murine)-derived Erbitux® has been shown to trigger anaphylaxis in a subset of patients due to pre-existing IgE antibodies directed against the galactose-α-1,3-galactose sugar residue [13]. As a result of these findings, product quality screening of clones needs to be directed specifically to these glycan structures and will typically result in clones with acceptable productivity being discarded due to concerns around elevated levels of either one of these glycans. Given the well-known metabolic pathways responsible for generating these glycan moieties and the development of some of the new genome modifying technologies mentioned later in this chapter, this shortcoming of the CHO host can be readily addressed to create a modified host cell line that does not suffer from the potential limitation of producing protein compromised by detectable levels of Neu5Gc or α-1,3-galactose.

2 Which CHO is the “Right” CHO?

Historically, there have been three CHO hosts routinely used to develop biologics. Two of these, DUXB11 and DG44, were isolated in the Chasin laboratory at Columbia University, New York [69]. These cells had undergone extensive mutagenesis to generate lines that were deficient in dihydrofolate reductase (DHFR) activity and hence dependent upon an exogenous source of nucleotide precursors for growth. This represented a readily manipulated phenotype suitable to select for genome integration and stable expression of exogenous DNA. This is accomplished by transfecting the cells with expression cassettes for the gene of interest and a DHFR gene. Posttransfection, cells are placed in selection media lacking nucleotide precursors. Given the ease and effectiveness of this approach, these cell lines found widespread acceptance in the industry as the starting host to generate production cell lines. Their suitability for this role was further enhanced due to the ability to select for a high copy number of the introduced expression vector by adding methotrexate (MTX) to the cultures. As MTX is a competitive inhibitor of the DHFR enzyme, applying this additional selection pressure on top of the absence of nucleotide precursors enables the selection and isolation of the minor population of cells that have undergone a spontaneous amplification of the integrated expression vector containing the DHFR selectable marker and, in most cases, the gene of interest. The presence of multiple gene copies helps to ensure maximum productivity for any given molecule by driving an excess of recombinant mRNA for the therapeutic protein of interest.

The third CHO option that has been extensively used is the wild-type CHOK1 cell line, and its derivative CHOK1SV (developed by Lonza). These hosts are usually paired with the other prevalent selection system used in the industry. This method, known as glutamine synthetase (GS) selection, capitalizes on the fact that, absent an exogenous source of glutamine, cell survival is dependent on the GS enzyme to produce glutamine [2]. With host cell lines such as murine myeloma-derived NS0 cells, which have low endogenous GS enzymatic activity, this affords a simple selection scheme when using a GS selectable marker in the expression vector and glutamine-free selection media. On the other hand, CHO cells tend to have higher endogenous GS activity, making glutamine-free selection less efficient. However, similar to the DHFR/MTX system, the GS competitive inhibitor methionine sulphoximine (MSX) can be added to the media to apply additional pressure and select for CHO cells that are driving high levels of expression from the integrated vector.

The confluence of two unrelated factors has altered the landscape as it relates to the relative attractiveness of the GS and DHFR selection systems. Until recently, the technology associated with the GS, but not the DHFR, expression system has been encumbered by intellectual property. However, much of that protection has now expired, opening up the GS selection system for use without the burden of royalties on commercial sales. The second salient factor is the advances made in efficient genome engineering tools such as zinc finger endonucleases [49], meganucleases [8], TALENs [12] and CRISPR [7]. With these tools readily available, it is now relatively straightforward to create targeted mutations. This capability has been exploited by Eli Lilly and Lonza to create GS-deficient CHO cell lines, which enhances the stringency of selection, in turn resulting in a greater proportion of high-expressing clones [22]. With the GS-targeting zinc finger endonuclease utilized by Eli Lilly now commercially available from SAFC and the past IP issues around the GS system no longer an impediment, this selection system may be poised to become the dominant tool in the industry. It has the distinct advantage over the DHFR system of not requiring gene amplification to achieve suitable expression, which can shave weeks off the development timeline. In an industry facing ever-increasing pressures to get candidates to the clinic faster, a switch from a DHFR-based to a GS-based expression system represents low-hanging fruit to achieve this end.

The CHO GS knockout represents one highlight in over 20 years of engineering CHO cells to imbue them with new phenotypes that would not have been readily achievable through classical methods of media and process manipulations. Some of the early pioneering work in this area included improving expression from a heterologous CMV promoter through overexpression of the adenovirus E1A gene [15] and altering glycan structure of recombinantly expressed proteins by overexpressing the alpha 2,6 sialyltransferase gene [44]. More recently, fucosylation modulation has been a subject of intense interest. It has been established that an afucosyl glycan is desirable on a subset of mAbs (e.g., those intended for some oncology indications) due to the enhancement of ADCC activity [63]. Unfortunately, CHO cells invariably produce fucosylated glycans on recombinant mAbs, making it highly unlikely that, even with exhaustive screening, a recombinant CHO cell line producing a predominantly afucosyl mAb could be isolated. This obstacle was initially overcome through the laborious task of classical gene targeting via homologous recombination in CHO cells to create a host in which both alleles of the FUT8 gene (the transferase responsible for adding fucose) were knocked out [76]. Since that time, a variety of other strategies have been employed to establish engineered CHO hosts capable of producing hypo- or afucosylated glycans. These range from knocking out or knocking down the genes for other key enzymes in the fucosylation pathway [36, 68, 80], to overexpression of native or chimeric GnT-III glycosyltransferase to drive formation of glycan structures that are not suitable substrates for the FUT8 transferase [23], to overexpression of the prokaryotic enzyme GDP-6-deoxy-d-lyxo-4-hexulose reductase to divert a key intermediate in the de novo pathway for fucose biosynthesis [71].

The engineering strategies described above represent but the tip of the iceberg. There have been dozens of publications detailing other host cell engineering strategies in which antiapoptotic genes, chaperones, and components of the unfolded protein response or the secretory apparatus have been manipulated to achieve a desired phenotype (reviewed in [52]). Although many of these publications hint at potentially interesting avenues of intervention to develop superior hosts for expression of recombinant protein therapeutics, the vast majority of them fail to demonstrate utility in a cell culture system that is industrially relevant, instead relying on models that incorporate transient expression, serum-dependent cell lines and/or a scale no larger than a T-flask. If the minimum criteria to demonstrate industrial utility are considered to be stable cell lines grown in a benchtop bioreactor with serum-free media, only a small number of published studies cross this success threshold (Table 1). What can’t be ruled out is the possibility that some of these engineering targets have been successfully implemented in a commercial setting without being published.

Table 1 A compilation of published reports in which cellular engineering strategies were successfully applied in an industrially relevant setting to improve productivity or product quality

With the currently available tools for precision genome modifications, together with the advancing understanding of CHO metabolism and the long-awaited publication of the CHO genome [30, 75], the ability to engineer CHO cells is greater than ever. This should accelerate the trajectory of successful engineering outcomes. Even challenging metabolic pathways that are under multitiered levels of feedback regulation could become amenable to successful manipulation by exploiting miRNAs, a class of regulatory molecules that can simultaneously influence multiple cellular targets within a metabolic network [34]. There is currently a significant amount of interest in attempting to leverage these regulatory RNAs to improve CHO-based expression systems. Over the next few years, it will be interesting to see if the hope and promise of this application are realized.

3 Who Knows Best: The Cell or the Engineer?

When it comes to the basic engineering of the host cell to overexpress the protein of interest, the industry has traditionally relied upon random integration of transgenes into the host genome posttransfection. This is an inherently inefficient process whereby the majority of transfected cells yield unsatisfactory production levels. As such, finding the rare high-producing cell lines has been a considerable challenge for many years. Several groups have independently discovered different genetic elements capable of influencing the chromatin environment to promote a transcriptionally permissive state, and employed them as flanking DNA elements in vectors as a tool to achieve higher productivity [3, 43, 78]. Although somewhat counterintuitive, another strategy to manage the low frequency of high-expression challenge is to cripple the resistance gene present in the expression cassette. This increases the stringency of the selection, and enriches for cells that are able to overcome the defective resistance gene, either through the integration of many copies of the vector or by integration of the transgene into a transcriptional hot spot. To achieve this end, different strategies have been employed to compromise the efficiency of translating the selectable marker, such as engineering the DHFR open reading frame to employ predominately low-abundance codons [73], and attenuating the start codon for the zeocin resistance gene by replacing the native ATG start codon with an alternative start codon such as TTG [70].
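
The two marker-attenuation strategies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy open reading frame and the table of "rare" codons are hypothetical placeholders, not sequences or codon-usage data taken from the cited studies [70, 73], and the function names are invented for this example.

```python
# Illustrative sketch only. The ORF and the rare-codon choices below are
# hypothetical; real designs would use measured CHO codon-usage data.

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed low-abundance synonymous codon per amino acid (placeholder values).
RARE_CODON = {"L": "TTA", "A": "GCG", "S": "TCG", "R": "CGT"}


def translate(orf):
    """Translate an ORF codon by codon, ignoring any trailing partial codon."""
    return "".join(CODON_TO_AA[orf[i:i + 3]] for i in range(0, len(orf) - 2, 3))


def attenuate_start_codon(orf, alternative="TTG"):
    """Swap the native ATG start codon for a weaker alternative start codon."""
    if not orf.startswith("ATG"):
        raise ValueError("ORF does not begin with ATG")
    return alternative + orf[3:]


def deoptimize_codons(orf, rare=RARE_CODON):
    """Recode an ORF with rare synonymous codons; the start codon is kept."""
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    recoded = [codons[0]]
    for codon in codons[1:]:
        amino_acid = CODON_TO_AA[codon]
        recoded.append(rare.get(amino_acid, codon))
    return "".join(recoded)


toy_marker = "ATGCTGGCCAGCCGGTAA"  # hypothetical ORF encoding M-L-A-S-R-stop
print(attenuate_start_codon(toy_marker))
print(deoptimize_codons(toy_marker))
```

In both cases the encoded protein is unchanged; only the efficiency with which the marker is translated is reduced, which is what raises the stringency of selection.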

Another approach that is particularly useful for the expression of multigenic molecules such as monoclonal antibodies is splitting the DHFR coding sequence into two pieces, with the two DHFR gene fragments genetically linked to the heavy and light chain genes through an internal ribosome entry site (IRES). To reconstitute a functional DHFR enzyme, each fragment of the DHFR protein is fused to a leucine zipper dimerization motif [4]. This strategy ensures that only those cells effectively expressing both the heavy and light chains of the antibody survive selection. These tools have helped maintain an impressive trajectory of continuous improvement with regard to cell line productivity. Despite this success, there has been a parallel effort to try to revolutionize the gene integration process. This avenue uses a controlled gene integration process that seeks to minimize the randomness of gene insertion, and thereby predestine daughter clones for predictably high transgene expression. Establishing this type of system comes with its own set of challenges, most notably achieving productivity levels that can match or exceed those currently being obtained with traditional, random integration methods. However, the appeal of a cell line development process that affords more control and predictability than random integration is quite strong. There are two basic systems that have been described that can accomplish the goal of having greater control over the gene integration event. One is through the use of artificial chromosome expression (ACE) technology, which allows one to build the gene expression cassette outside the production cell line, yet within an autonomously replicating genomic structure [48].

ACE technology has been available for several years as a means for introducing exogenous genes into mammalian cells [48]. These large genetic elements are similar to bacterial plasmids in the sense that they serve as autonomous genetic elements capable of replication and faithful segregation within the cell. There is also the added advantage that the minigenome can be exquisitely tailored with specific elements, such as promoters, enhancers, insulators, and the like. Published work allowed us to compare the performance of cell lines generated using ACE versus cell lines generated using a standard random integration approach. A collaboration between the Canadian Institutes of Health Research, Centre for Drug Research and Development (Canada), and Pfizer showed that ACE technology was effective in generating CHO cell lines expressing a model monoclonal antibody [38]. The studies demonstrated that cell lines could be generated quickly and achieve respectable titers. Several cell lines were subsequently examined for their performance in fed-batch production, as well as assessed for gene expression stability [17]. The authors concluded that the ACE cell lines were similar in productivity and stability to the platform standard being used (random integration). This demonstrated that the approach is certainly a viable method for generating cell lines. However, the amount of work required to set up and utilize the ACE system in-house is considerable. Multiple vectors are required in order to build the expression chromosome, which is done in a stepwise manner. A flow cytometer and skilled operator are required to isolate the chromosome, which adds to the cost of supporting this system. As such, it would seem difficult to justify this additional complexity for an expression system that is comparable to the current industry standard. However, the ACE system is being offered to clients by at least one vendor as an available option.

The second method is to engineer the host cell line with an acceptor site within the host genome that is a site for gene integration using site-specific recombinases [54, 61]. The advantages of site-specific integration are primarily the predictability such a system might afford and the potential to engineer a preoptimized integration site. The obvious utility here is to create cell lines that have predictably high levels of gene expression from the very start, eliminating the need for brute force cell line screening. There are two common tools that utilize essentially the same mechanism: Cre/Lox, based upon the Cre-recombinase, and Flp/FRT, based upon the eponymous “flippase.” Both of these systems were adapted for use in mammalian systems not long after their discovery and initial characterization in microbial systems, and were subsequently adapted for use in bioprocess development [39, 53]. The basic approach is the same, regardless of the specific recombinase being used. The first step is to introduce, by random integration, a reporter gene preloaded into the acceptor site cassette. The resulting clones generated are screened for expression of the reporter (commonly a fluorescent protein), and the highest-producing clones are identified. Typically the desire is to have a single integration site, so the clonal cell lines are often screened for copy number. The end result is typically a small number of single-integrant cell lines that are theoretically capable of supporting high levels of transgene expression. The biotherapeutic protein of interest is then swapped into the acceptor site by the appropriate recombinase, and the reporter gene is excised. These systems have been explored numerous times through the years in an attempt to generate improved host cell lines [10, 32, 40, 54], and one such system is also commercially available from Life Technologies (Flp-In™).
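
The host-selection step in this workflow (rank clones by reporter signal, keep only single-copy integrants) amounts to a simple filter and sort. The following Python sketch shows the logic under assumed inputs; the `Clone` record, its field names, and the example values are hypothetical and do not come from any of the cited systems.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    name: str
    reporter_signal: float  # e.g. mean fluorescence, arbitrary units (assumed)
    copy_number: int        # integrated acceptor cassettes per genome


def select_landing_pad_hosts(clones, top_n=3):
    """Keep single-integrant clones and return the brightest top_n of them."""
    single_copy = [c for c in clones if c.copy_number == 1]
    ranked = sorted(single_copy, key=lambda c: c.reporter_signal, reverse=True)
    return ranked[:top_n]


# Hypothetical screening data for illustration only.
screen = [
    Clone("A1", 5200.0, 1),
    Clone("B7", 9100.0, 3),   # bright, but multiple integration sites
    Clone("C4", 7400.0, 1),
    Clone("D2", 300.0, 1),
]
for clone in select_landing_pad_hosts(screen, top_n=2):
    print(clone.name, clone.reporter_signal)
```

Note that the brightest clone in this toy data set is rejected because its signal may reflect copy number rather than a favorable integration site.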

A more direct approach has been enabled by recent advances in genomics and elegant new methods for gene manipulation. As in the approach described above, the starting point of this new method is to identify a hotspot for the landing pad integration site. Instead of relying on random integration events and clone screening for the reporter gene signal, the cells themselves provide information regarding the location of transcriptional hotspots through evaluation of the transcription profiles of CHO cells using gene expression microarrays [18, 77]. Even more comprehensive is the newer technique of simply sequencing every mRNA in the cell (RNA-Seq) as a means to characterize the transcriptome [72]. Regardless of the method, the outcome is that the most highly expressed transcripts are identified. This information, coupled with the recent release of the CHO genome [30] (www.CHOgenome.org), could be used to pinpoint chromosomal locations that are naturally occurring transcriptional hotspots. One can introduce a gene acceptor cassette into one of these regions, with minimal disruption to the genes naturally encoded by the host cell, and thus create an engineered host cell line that utilizes pathways the cell is already using to maximize gene expression. Targeting specific regions in the genome of mammalian cells has been relatively commonplace in stem cell research [55]. Moreover, the same molecular techniques have been used in CHO cells for the purposes of gene knockout or mutation for many years (reviewed in [42]). However, no one has yet demonstrated the convergence of these approaches with the specific application for bioprocess development.
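
As a rough sketch of how transcript abundance might be combined with genome coordinates to nominate candidate hotspots, the Python below ranks genes by expression and groups the top hits into fixed-size genomic windows. Everything specific here is an assumption made for illustration: the gene names, scaffold labels, abundance values, window size, and the idea that clustering of highly expressed genes is a useful proxy for a permissive region.

```python
from collections import defaultdict


def candidate_hotspots(expression, annotation, top_n=100, window=1_000_000):
    """Group the top_n most abundant transcripts into genomic windows.

    expression: gene name -> abundance (e.g. normalized read count)
    annotation: gene name -> (scaffold, start coordinate)
    Returns (scaffold, window index) bins, those with the most genes first.
    """
    top_genes = sorted(expression, key=expression.get, reverse=True)[:top_n]
    bins = defaultdict(list)
    for gene in top_genes:
        if gene not in annotation:
            continue  # unplaced transcripts cannot be mapped to a locus
        scaffold, start = annotation[gene]
        bins[(scaffold, start // window)].append(gene)
    return sorted(bins.items(), key=lambda item: len(item[1]), reverse=True)


# Hypothetical inputs for illustration only.
expression = {"GeneA": 900.0, "GeneB": 850.0, "GeneC": 40.0, "GeneD": 700.0}
annotation = {
    "GeneA": ("scaffold1", 1_200_000),
    "GeneB": ("scaffold1", 1_700_000),
    "GeneC": ("scaffold2", 50_000),
    "GeneD": ("scaffold3", 10_000),
}
for (scaffold, index), genes in candidate_hotspots(expression, annotation, top_n=3):
    print(scaffold, index, genes)
```

A ranking like this would only nominate regions for experimental follow-up; it says nothing about whether an inserted cassette would remain stably expressed there.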

4 There is Many a Slip Twixt the Cup and the Lip

The promise of site-specific integration was to achieve an optimized host cell line that would be predestined for high transgene expression. However, there are mechanisms of gene expression control beyond transcription that affect the ultimate production and secretion of the protein from the cell. Translational control is known to occur at all levels of protein synthesis: initiation, elongation, and termination. The posttranslational modification and secretion of proteins is also a controlled process that can influence the productivity and quality of proteins being produced. For example, changes in the translational machinery could alter the productivity of a cell line, whereas alterations in the secretory pathway could affect both the quantity and quality of the protein produced. Similarly, epigenetic regulation, that is, heritable changes in gene expression that are not caused by changes in DNA sequence, is another mechanism by which cell lines may control their gene expression. Such changes are most commonly understood to be caused by methylation of the genomic DNA [57]. The result of DNA methylation is a localized suppression of transcription, and therefore silencing of gene expression. This is a heritable, though dynamic, process and can be influenced positively and negatively over time. Finally, recent studies have pointed increasingly to the role of microRNAs in gene expression regulation [14, 25]. The implication here is that even if an “optimal site” were identified, there are posttranscriptional and epigenetic effects that can affect the expression of an exogenous gene from this site, and these effects can change over time. There are some tools that can be employed to counteract some of these effects. For example, so-called “insulating elements,” such as matrix attachment regions (MARs), that protect chromatin from being methylated can protect against some of the gene-silencing effects [27]. Indeed, Selexis has developed a method to exploit the mechanisms of MARs to maintain chromatin in an open state that appears to permit rapid successive transfections, and thereby gene integration, into the initial integration site [28].

In addition to these aforementioned challenges, there are several other reasons that likely contribute to the failure of targeted integration systems to outperform the standard approach of random integration. It could be that, despite all that we know about these systems, there is still much left to be discovered and it is simply not yet possible to design an optimal expression system from the ground up. It may be that the approaches taken thus far are somehow flawed or incompatible with cell culture platforms that have been optimized for random integration and the expression of specific proteins. For example, the reporter genes used may be the best way to identify hot spots for fluorescent protein expression, but a different site may be optimal for heavy-chain and light-chain expression. It may also be possible that site-specific integration host cell lines that are superior to the industry standard have been developed, but that have not been revealed to the public domain as of yet. Finally, it may be simply that the standard platform of random integration coupled with a sufficiently powerful screening program is, taken altogether, inherently better than site-specific integration for the expression of protein biotherapeutics.

Despite these limitations, there remain distinct advantages that site-specific integration can offer over the random integration platforms used today. Site-specific integration provides predictability of expression. For a well-characterized site-specific host cell line, one can assume that the productivity of the heterologous gene will fall within a comparatively narrow range. That is to say that the host cell line will have been predetermined to contain an integration site that is stable, and therefore not prone to transcriptional silencing. As such, by design there should not be nonexpressing or very low expressing cell lines. This has the potential advantage of greatly simplifying the cell line selection process. That is, given the assumption that all clonal cell lines derived from the transfection event are genetically identical at the site of integration, there is no need for an extensive cell line screening program because there is no “needle in the haystack” to find. This has the added benefit of saving the time that would normally be devoted to multiple rounds of clone screening. This lack of genetic diversity could have unintended consequences, however, as there are situations where a needle in a haystack is precisely what is needed (such as proteins that have significant product quality challenges). Finally, there is utility for the initial nonclonal pool itself following the initial transfection. If the selection system is set up such that only host cells that integrate the transgene grow up out of the population, this population, like the clonal cell lines, should have a predictable level of expression. In this scenario, relatively large amounts of the recombinant protein can be generated in a very short time, with a low risk of failure (a risk that would exist for a pool that does not express well, for example). This approach could be used for making material to supply development work, toxicology studies, or, potentially, material for Phase I clinical studies. Indeed, this approach has been used by Regeneron, utilizing their EESYR (Enhanced Expression and Stability Region) system [1, 60]. The time savings of this approach, compared to establishing clonal cell lines, are considerable. However, it is important to note that although Regeneron has embraced this approach to accelerate speed to clinic, they opt for developing cell lines via traditional methods to produce material for pivotal studies and ultimate commercial launch.

5 Finding the Needle

Until the day arrives when high-efficiency, high-productivity cell line development methods are widely adopted, effective productivity screening technologies will still be required to facilitate finding the needle in the haystack. One system that has gained widespread adoption throughout the industry due to the relatively modest up-front capital cost, ease of use, and effectiveness is the ClonePixFL instrument developed by Genetix [11]. This technology combines the growth of colonies in methylcellulose embedded with a fluorescently labeled antibody directed to the product being expressed. As the antibody/antigen complexes precipitate around the secreting colony, fluorescent halos are formed, with the size of the halo presumably representative of the productivity of the colony. To enhance the throughput of the screening, the instrument includes imaging capabilities, software, and robotics capable of screening tens of thousands of colonies and transferring the most promising colonies to 96-well plates. As with any technology platform, there is the need for some initial optimization; in this case, it involves optimizing media composition to enable existing media platforms typically focused on supporting high-density suspension growth to meet the new demand of enabling robust growth of colonies at low densities in semi-solid media. There is also the challenge of understanding the most effective way to utilize the data that are generated from the ClonePixFL platform. For example, the early protein expression data, as measured by the fluorescent halo around a colony, must be correlated with subsequent expression once the clonal cell lines have been adapted to suspension growth and scaled up into a more “manufacturing relevant” production platform, in order for this approach to be truly effective.
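
One simple way to check whether early halo measurements predict later performance is a rank correlation between the halo signal of each colony and the titer of the corresponding scaled-up clone. The sketch below implements Spearman's rank correlation in plain Python; the choice of this statistic and the example numbers are assumptions made for illustration, not a description of how any particular group analyzes ClonePixFL data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: halo intensity at picking vs. later fed-batch titer (g/L).
halo_intensity = [10.0, 20.0, 30.0, 40.0]
fed_batch_titer = [0.1, 0.4, 0.3, 0.9]
print(round(spearman(halo_intensity, fed_batch_titer), 3))
```

A coefficient near 1 would suggest that halo-based ranking carries through scale-up, whereas a value near 0 would indicate that early picks are not informative for the production platform.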

The breadth of clone screening can be enhanced severalfold relative to the ClonePixFL by capitalizing on the throughput of flow cytometry, or fluorescence activated cell sorting (FACS). Several groups have used this platform effectively for many years, either as an alternative to the ClonePixFL technology or as an enrichment tool prior to employing the ClonePixFL. The challenge for flow-cytometry-based methods has been the means of detecting the secreted product. One method used frequently in the development of cell lines relies on the transient association of the secreted product with the cell surface as a means to measure how much each cell is producing [6]. Although effective in many instances, this approach does have some limitations. Proteins that are intrinsically “sticky” limit the effectiveness of the screen as they have the potential to remain bound to cells after being secreted. Furthermore, similar to the ClonePixFL method, detection of the product requires an antibody to the recombinant protein being expressed. There are many commercially available options for detecting antibodies or Fc-fusion proteins; however, early-stage cell line development projects for other (non-Fc-containing) recombinant proteins could be hampered by the absence of available reagents that recognize the protein being expressed.

Other flow-cytometry-based methods have been developed that eliminate some of the drawbacks of the Brezinsky method. These rely on a surrogate reporter to serve as the readout for expression levels of the gene of interest. The reporter molecules are typically fluorescent proteins or cell surface proteins that can be readily detected with fluorescently labeled antibodies. In order for the reporter to be a meaningful barometer of therapeutic protein expression, its open reading frame typically needs to be genetically linked to the expression cassette used to express the protein of interest. The use of an internal ribosome entry site (IRES) is a common strategy to bridge therapeutic and reporter genes, ensuring that both are translated from the same mRNA [19]. More recently, a reporter system that places the small open reading frame (ORF) for the cell surface protein CD52 in the 5′ UTR of the genes encoding therapeutic proteins has been described [9]. As with the IRES system, both reporter and therapeutic genes are expressed from the same mRNA ensuring that reporter levels correlate with therapeutic expression. In this case though, rather than relying on a viral element to direct translation of the second ORF containing the reporter, the 5′ UTR embedded reporter ORF is the first to be encountered by the ribosome scanning the bicistronic mRNA. By engineering the reporter ORF to utilize an inefficiently translated alternate start codon, the system ensures that only a small percentage of ribosomes initiate translation of the reporter, with the majority of ribosomes continuing to scan until the optimal Kozak initiation sequence of the therapeutic is encountered.
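The logic of the 5′ UTR reporter can be made concrete with a toy calculation. The sketch below assumes a very simple leaky-scanning model in which a fixed fraction of scanning ribosomes initiates at the reporter start codon and the remainder continue on to the therapeutic ORF, with no reinitiation after the reporter. The function name and the 5% initiation fraction are hypothetical illustrations, not values from the cited work.

```python
def translation_split(mrna_level, reporter_init_fraction):
    """Split translation initiation events between reporter and therapeutic ORFs.

    Toy model (an assumption, not a published one): each scanning ribosome
    initiates at the upstream reporter start codon with probability
    `reporter_init_fraction`; otherwise it continues scanning and initiates
    at the therapeutic ORF. Reinitiation after the reporter is ignored.
    """
    if not 0.0 <= reporter_init_fraction < 1.0:
        raise ValueError("reporter_init_fraction must be in [0, 1)")
    reporter = mrna_level * reporter_init_fraction
    therapeutic = mrna_level * (1.0 - reporter_init_fraction)
    return reporter, therapeutic

# Two hypothetical clones whose transcript levels differ tenfold.
low_reporter, low_therapeutic = translation_split(10.0, 0.05)
high_reporter, high_therapeutic = translation_split(100.0, 0.05)

print(low_reporter / low_therapeutic, high_reporter / high_therapeutic)
```

Under these assumptions the reporter-to-therapeutic ratio is the same for every clone, because both ORFs draw on the same mRNA. That is why the reporter signal can serve as a proxy for therapeutic expression even though only a small share of ribosomes translates it.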

One factor to bear in mind with the antibodies used in both the ClonePixFL and FACS-based screening methods is the potential TSE and virus exposure risk these reagents pose. For the ClonePixFL, there is a fully recombinant monoclonal detection reagent produced in CHO using no animal-derived media components which mitigates this risk. For those who feel that the original polyclonal detection reagent produces more robust halos, this reagent at least goes through in vitro viral testing and is certified to be produced from sheep herds that are monitored for disease. At the other end of the risk spectrum are the commercially available antibodies typically used in FACS-based methods which tend to be polyclonal in nature and have been developed with research applications in mind, rather than development. As such, these lack the basic testing and precautions applied to the polyclonal ClonePix reagent. In addition, the purification of these reagents, typically by affinity purification, likely entails exposure to nonrecombinant human and, in some cases, bovine and equine proteins. The potential for commercially available antibodies to be formulated in storage buffers containing BSA should also not be overlooked, although some vendors may provide custom formulations that are free of animal-derived components when specifically requested to do so.

A simple solution to avoid the potential TSE exposure while still capitalizing on the throughput of flow cytometry is to use fluorescent protein reporters that abrogate the need for a detection antibody [51, 65]. When expressing other proteins in addition to mAbs, this also represents an effective alternative if antibody reagents have not yet been developed at the time cell line generation is initiated. An interesting twist on this approach has recently been published by scientists at Cellca GmbH. Although the method uses GFP as a reporter, it differs from other approaches in that the reporter is not incorporated in the expression vector. Instead, this novel clone screening methodology capitalizes on the ER stress induced by the metabolic burden associated with high-level expression of a recombinant mAb [41]. By engineering a host cell line to express a GFP reporter under the control of a truncated promoter for the ER stress inducible gene GRP78, they have shown good correlation between reporter expression and cell line productivity. The success of these platforms, which enable the screening of thousands, if not millions, of clonal cell lines, creates another problem. How does one exploit the seeming advantages that these technologies bring when it can mean maintaining and analyzing a very large number of cell lines?

6 Necessity is the Mother of Invention

One answer to this question is the development of automated platforms for managing cells, and arguably more important, for analyzing the products these lines produce. Automation in bioprocess development is a relatively small and defined niche in the larger world of automation technologies and platforms. Automation has been incorporated in many areas of the pharmaceutical industry, from the beginning to the end of the process. In drug discovery, for example, many companies have developed large automated platforms for compound library screening. These systems feature vast compound libraries that are integrated into robotics systems for sample handling and computer-controlled inventories. These in turn are coupled with high-throughput analytical platforms that house relevant screening assays. These systems have revolutionized chemical compound screening for drug discovery. The system developed at Bristol-Myers Squibb, for example, increased the numbers of compounds that could be screened by 24-fold, while at the same time streamlining the process in order to realize a fivefold reduction in cycle times [31]. Within the area of biotechnology, automation has long been part of the manufacturing setting. A key example is automated feedback control for the production process, such as pH and dissolved O2 control in bioreactors and fermenters. Beyond this, however, automation and automation platforms have been relatively slow to be incorporated into bioprocess development.

The major factors in successfully integrating an automated platform technology into bioprocess development include affordability, flexibility, utility, and adaptability. First, budgets for bioprocess development tend to be included in the much larger budgets of either research or manufacturing, and therefore may not be considered a top priority. Second, systems need to be flexible enough that they can be used for more than one narrow purpose. If the automated platform is overspecialized, it may stifle platform improvements and be vulnerable to quick obsolescence. Third, the automation must be fit for purpose. There are many examples of high-quality, well-engineered automation that, rather than fitting into a platform or process flow, would require that the platform be significantly altered simply to make use of the automation instrument. Lastly, an automated platform needs to be adaptable. This is captured in some of the above points, but it is worth calling out separately that an automated platform which can be adapted to a variety of uses by different fields stands a good chance of being widely utilized.

Many automated technologies and approaches have been tried in bioprocess development and met with very limited success, or failed outright. More commonly seen are those technologies that were developed for another target audience, but the developers saw a potential application in bioprocess development. An example of this is the suite of large-scale automated platforms for cell culture passaging and maintenance from TAP Biosystems (formerly The Automation Partnership, and recently acquired by Sartorius Stedim Biotech). These systems have been designed specifically for passaging cells in different types of cell culture vessels, such as T-flasks, shake flasks, or roller bottles. They have been successfully implemented in research organizations and some manufacturing settings, but they have not seen wide acceptance in bioprocess development platforms. These are well-engineered, but ultimately expensive and inflexible automated systems, thus falling short of the affordability and flexibility criteria. The more rare case is that of a technology that was specifically designed for bioprocess development, yet still failed to be successfully incorporated. The SimCell from Bioprocessors (now Seahorse) is an example of such an instrument that could be utilized for both clone screening and process development. The core technology of the SimCell device was a microbioreactor (0.6 mL) that was printed into cassettes of six bioreactors per card. Each “vessel” could be automatically controlled for dissolved O2 and pH, while also affording online feeding and sampling. A collaboration between Seahorse and Pfizer demonstrated the potential utility of the system in a very large-scale (180 microbioreactors) DOE for process optimization [47]. In this experiment, a subset of conditions was compared to similar conditions run on benchtop bioreactors, and the performance was very comparable between the SimCell and the benchtop systems. 
Despite the success of the technology, the SimCell was not able to penetrate the bioprocess development market sufficiently to make it a viable long-term technology, and it is no longer available. The shortcomings of the SimCell system were both its expense and the difficulty of integrating it into an established bioprocess development platform. Rather, the platform would need to be built around it.

One of the most successful applications of automation in bioprocess development is the liquid-handling system. These systems meet all the criteria for success mentioned above. First, they are relatively inexpensive. Second, they are flexible in that many of the platforms offer a range of functionality, from the variety of volumes they can handle, to vacuum attachments for filter work, to decks that can manage different temperatures and even shaking platforms for specialized incubations. Third, they are practical in that they can improve workplace efficiency through high-capacity sample processing and 24-h operation, as well as improve accuracy relative to manual operations. Finally, they are adaptable in that they can be used across all aspects of bioprocess development, from cell culture to assay setup to resin screening. As an example, scientists at Biogen Idec have developed a high-throughput screening platform that can feed a variety of assays to support process development. The core of the system is a robotic liquid-handling platform that performs the initial Protein A purification of recombinant mAb from the culture supernatant, sets up a variety of assays in a 96-well plate format, and performs the incubations for various steps. Some assays required adaptation to the platform, but many were readily transferable to that format. By using this platform for assay setup and execution, results for a large sample set for titer, sialic acid content, monomer/aggregate content by size exclusion chromatography, and glycan analysis were quickly generated [58]. Traditionally, most of these types of analyses (excluding titer) would not be considered for early clone screening campaigns because they are low throughput and time consuming. However, the assays were adapted to be accurate enough to be very comparable to the industry standards. 
For example, the glycan data that were generated in this high-throughput format correlate very well with data generated using MALDI-TOF mass spectrometry (Fig. 1). The high-throughput method does not offer the resolution of MALDI, but the correlation is more than adequate to identify large differences between clones, and the ability to screen this many cell lines simultaneously more than offsets the reduction in data resolution. Similarly, automated liquid-handling systems have added a dimension to purification development that has dramatically altered the scope and speed of establishing appropriate buffer conditions for purification steps. In a series of publications from the downstream purification group at Wyeth (now Pfizer), scientists described a revolution in conditional screening of resins that was completely enabled by scaled-down systems and automated liquid handling. As with the assay development work described above, the authors utilized a standard 96-well plate format coupled with a liquid-handling system for plate and sample setup and incubation [16]. Using this platform, they developed a matrix screen for a variety of resins that was able to probe column conditions under an array of buffer conditions, varying in pH, ion, and ionic strength. This represented a tremendous improvement in terms of efficiency as compared to the traditional (at the time) bench-scale model column development.
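To illustrate how such a matrix screen maps onto a plate, the following minimal sketch enumerates every combination of resin, pH, and salt concentration and assigns each to a well of a standard 96-well plate. The resin names, pH values, and salt concentrations are hypothetical placeholders chosen so that the full factorial fills exactly 96 wells; they are not the conditions used in the cited work, and the ion type is held fixed here for simplicity.

```python
from itertools import product
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]   # plate rows A-H
COLS = range(1, 13)          # plate columns 1-12

def plate_layout(resins, ph_values, salt_mM):
    """Map each (resin, pH, salt) combination to a well, filling row by row."""
    conditions = list(product(resins, ph_values, salt_mM))
    wells = [f"{row}{col}" for row in ROWS for col in COLS]
    if len(conditions) > len(wells):
        raise ValueError("more conditions than wells on a 96-well plate")
    return dict(zip(wells, conditions))

# Hypothetical screen: 3 resins x 4 pH values x 8 salt concentrations = 96 wells.
layout = plate_layout(
    resins=["resin_A", "resin_B", "resin_C"],
    ph_values=[5.0, 6.0, 7.0, 8.0],
    salt_mM=[0, 25, 50, 100, 150, 200, 300, 500],
)

print(len(layout), layout["A1"], layout["H12"])
```

A layout table of this kind is what a liquid-handling system would typically consume as a worklist, which is where the efficiency gain over running one bench-scale column per condition comes from.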

Fig. 1 Correlation of glycan data. Several different clones were used to generate material for the same IgG1 monoclonal antibody. Purified antibody was analyzed using a high-throughput method developed for the GXII from Caliper and compared to the same samples analyzed using MALDI. The predominant species (Man5, G0F, G1F, and G2F) are shown for each cell line. Data and figure courtesy of Biogen Idec

In an elegant convergence of small-scale production and automated liquid handling, TAP Biosystems has created an extremely successful automated platform for bioprocess development in the ambr™ system. The ambr system utilizes a minibioreactor (~15 mL) coupled with a robust liquid-handling platform for adding or removing material to or from the culture vessels. The system is compact enough to be operated in a biosafety cabinet and versatile enough that multiple experiments can be carried out simultaneously. One practical limitation is that the temperature is controlled for the reactors in blocks of 12, rather than for individual units. The software controls are simple and do not require an experienced engineer to program the system, making the system accessible to a broader range of cell culture scientists. As such, the system has found use in developing production conditions, as a clone screening tool for evaluating multiple cell lines under production conditions, and as a means to passage a limited number of cell lines in an automated fashion. The system can also be coupled with analytical platforms to provide real-time cell culture performance data, such as cell density measurements. Attempting to capitalize on the success of the first-generation ambr, TAP Biosystems has released a larger-scale version, the ambr250, which has many of the same features as the ambr system, but is designed to address some of the major limitations of the original system. One is that the configuration of the ambr250 is designed to mirror a stirred-tank bioreactor more closely. The other addresses the volume limitation encountered with the original unit, which prevented sampling throughout the run to generate temporal assessments of product quality. It is a larger and more expensive system, and it remains to be seen if it will be as successful as its predecessor.

The incorporation of automation into process development has truly followed the familiar adage of “necessity is the mother of invention.” Over the past decade process engineers have been faced with the demands of higher productivity and shorter timelines, while the industry as a whole has seen economic pressures that have forced it to find new efficiencies and streamline activities wherever possible. Therefore the industry has looked for solutions that are affordable, fill gaps and/or expand existing capability, can be broadly utilized, and have the potential to be modified as platforms continue to evolve. Many large, expensive niche items have not been broadly adopted, likely for these reasons. Rather, the adaptation and evolution of existing technologies, such as liquid-handling systems and simple robotics, have been the preferred strategy for incorporating new technologies into bioprocess development. A good example of this approach is the “Islands of Automation” developed by bioprocess engineers at Lonza (Fig. 2; [29]). The essence of the approach is to apply automation specifically and discretely where it can provide the most value, rather than trying to construct a single automated platform that does everything. The power of this approach is that it can be adapted to a variety of process development platforms with little difficulty and little disruption to an existing platform.

Fig. 2 Islands of Automation. A process flow of cell culture development from transfection through to bioreactor assessment that incorporates discrete automated platforms linked together to maximize efficiency. Figure courtesy of Lonza Biologics plc

7 The Benefits of a Steady Platform

It is really this platform itself, in all its varied definitions, that has had the greatest impact on bioprocess development over this past decade. Successful platforms work by streamlining and simplifying development to a single process flow that is executed in the same way from program to program and product to product. Many aspects of “platformization” have been universally applied, such as serum-free processes (driven primarily by safety concerns), as well as adapting host cell lines to media that are consistent or compatible with the ultimate manufacturing process [64]. Some have taken this further, such as the “bioreactor evolved” CHO DG44 host cell line developed specifically to preselect cell lines already conditioned to the bioreactor environment [59]. The integration of a platform host and media expression system, coupled with effective scale-down models and analytical methods empowered by automation, is the most effective approach to ensuring a satisfactory cell line development outcome. This type of integrated process can often result in the selection of lead clones that require little, if any, process development effort prior to the initiation of manufacturing runs to produce material for toxicology and clinical studies.

Bioprocess innovation, or platform improvements, has developed along two parallel paths. One path has been a consistent drive towards yield improvements, and has been followed virtually since the field of biotechnology began. This course of improvement has been essential to the success of biotechnology because it reduces the requirements for ever-increasing manufacturing capacity. Indeed, this so-called triumph of “biology over steel” helped to avert a capacity shortfall in the first decade of this century [20]. It does carry with it its own consequences, such as problems with an excess of capacity, but from a patient supply perspective, this is preferable to a market shortage. The leading technologies contributing to yield improvements these days can be found in the areas of cell line engineering and continued advancements in expression technology. These technologies focus not only on improving transgene expression, but also on expressing more of the desired version of the molecule, for example, through manipulation of posttranslational machinery. The other path of bioprocess development, which has emerged more recently, is the path of greater efficiency. Some of the cell line engineering approaches, such as single-site integration, fall into this category, as does the expansion of automated platforms in recent years into all aspects of bioprocess development from cell culture to purification to analytics. Incorporation of these technologies into the bioprocessing workflow is a result of the combined pressures of budget limitations and expanding pipelines. This is the very definition of efficiency: do more with less. However, true to the innovative scientific nature of the people doing the work, not only were efficiencies realized and implemented, but improvements to the quality of the work were embedded into the processes. 
Process scientists have not been content simply to make processes more efficient, but rather have also strived to make processes better.