Introduction

The development of a productive, stable and good quality Chinese Hamster Ovary (CHO) cell line is arguably the first step on the critical path for Investigational New Drug (IND) enabling studies. Recent advances aim at improving the precision of the screening activities through deep resolution analyses and predicting the cell line performances in scaled-up cell culture environments (Butler and Meneses-Acosta 2012; Estes and Melville 2014; Kim et al. 2012).

During the fed-batch screening of CHO cell lines, the establishment of the ideal feeding regimen is germane to the identification of culturing conditions favorable to the production of therapeutic molecules. The tailored, cell line-specific, feeding scheme is typically defined during the late stage Process Development (PD) activities, through the incorporation of multiparametric Design of Experiment (DOE) and late stage cell culture tasks. These activities intend to identify, characterize and validate the cell culture conditions at which production cell lines perform at the peak of their metabolic capacity (Lu et al. 2013; Zhang et al. 2013). Desirable process conditions are then transferred to the larger production scale, during GMP manufacturing (Abu-Absi et al. 2010; Tsang et al. 2013).

In recent years, several PD activities have been replicated and simplified at the early stage Cell Line Development (CLD). Scale down models, high-throughput platforms and streamlined workflows constitute a substantial part of the screening activities defining an efficient CLD (Huang et al. 2010; Mora et al. 2017; Scarcelli et al. 2017). The incorporation of statistical techniques further improved the prediction of cellular characteristics and the identification of trends defining outstanding cell lines at the earliest stage (Porter et al. 2010; Tsang et al. 2013). A typical production culture requires the supplementation of amino acids, sugars and other nutrients as cells reach critical densities and the metabolic demands increase (Hiller et al. 2017; Rouiller et al. 2012). Common strategies incorporate such nutrients in complex formulations of feed medium, specifically designed to replenish those components depleted in the basal medium. Besides the replenishment of essential nutrients, feed medium also promotes cellular activities, accelerating protein production and metabolic rate (Sellick et al. 2011; Yang et al. 2016). Excessive feeding can generate detrimental waste by-products in the culture medium or slow down the cell proliferation rate via a dilution effect or via an osmolality increase, which cells sense as a growth suppression signal (Khattak et al. 2010; Kumar et al. 2007; Saito and Posas 2012). The rational balancing of timing and amount of feed is an essential part of the PD scope, and is conventionally not explored during early stage CLD activities. In our experience, the PD lab had developed a simplified feeding scheme and transferred it to the CLD lab (Fig. 1, silver box). This “one-fits-all” scheme was an agnostic, basic regimen, whose performance was adequate for most final clones during late-stage projects but not fully tested in new CLD campaigns.
Furthermore, the workload management wasn’t amenable to essential cell culture features such as high-throughput screening and minimized handling.

Fig. 1

Flowchart diagram of a typical approach for the development of the legacy “one-fits-all” feeding scheme (silver box, left) versus the “DOE-feed approach” (clear box, right). Legacy feeding scheme wasn’t part of routine verifications (box 1) and no changes to the conventional design were ever made (box 2). Novel technologies introduced to the routine CLD affected its suitability (box 3)

Challenges existed during the incorporation of the “one-fits-all” scheme into CLD workstreams. The first challenge concerned the testing of the “one-fits-all” scheme in a screening setup completely different from the late stage process, and its suitability for CLD (Fig. 1, box 1). In fact, the heterogeneity of cellular phenotypes dictates differential nutrient requirements that cannot be adequately addressed by a universal feeding scheme (Dutton et al. 2006; Lloyd et al. 1999). Second, once the “one-fits-all” scheme was transferred to CLD, it was regarded as a legacy condition and CLD operators had little motivation or scientific background for proposing changes (Fig. 1, box 2). Lastly, CLD is the primary stage at which new elements are implemented, such as host cell lines and expression vectors, cell culture medium or novel screening strategies, which may dramatically affect the established cellular nutrient demands and challenge the efficacy of the “one-fits-all” regimen (Charaniya et al. 2010; Davies et al. 2011; O’Callaghan et al. 2010). The release of the “one-fits-all” feeding scheme from the PD lab was not concurrent with that of novel CLD technologies, and this misalignment exacerbated the issue of screening new-generation cell lines with outdated feeding-schemes (Fig. 1, box 3). To summarize these issues, sub-optimal feeding in CLD carried the risk of not selecting the most desirable cell lines amenable to late stage PD, by failing to identify those phenotypes that needed different supplementation in terms of timing and amount of feed medium provided. By this rationale, it is essential to occasionally investigate and update a feeding-scheme suited for CLD and improve its efficacy to make quick, accurate decisions before the PD activities begin.

Our solution to this issue was to develop a streamlined methodology for occasionally assessing new feeding schemes as the CLD technologies evolved and verify whether the efficacy of the legacy scheme was maintained. Through the integration of statistical modelling and cell culture data, we progressively teased out the feeding condition best suited for CLD, thus this framework was named “DOE-feed approach” (Fig. 1). The influence of feeds on daily productivity and cellular growth was investigated; statistical tools were leveraged to accurately identify good candidate cell lines and provide valuable information for preliminary PD activities. We capitalized on the statistical models defined by the DOE-feed methodology to extrapolate and validate superior feeding-schemes that facilitated and improved the screening in future cell line campaigns. Ultimately, by developing the DOE-feed as a methodology suited for CLD needs, we optimized the early stage workstream, and reduced the operational hands-on time and routine analytical assays, improving the overall efficiency.

Materials and methods

Cell Line Generation

The cell lines used for this study were generated after transfection of dhfr deficient CHO DXB11 cells with vectors encoding a functional dhfr and the therapeutic molecule genes, either antibody or antibody-like. Cell lines were developed in a proprietary chemically-defined, serum-free medium as either stable pools or single-cell clones of stable pools, and expressed molecule 1 (expressed by clone cell lines 1.42 and 1.88), 3 (by clone cell lines 3.03, 3.12 and 3.13), 7 (by pool cell lines 7.07, 7.15, 7.23), 8 (by clone cell lines 8.04, 8.37) and 9 (by pool cell lines 9.01, 9.05). Cell lines 1.42 and 1.88 were further treated with 100 nM methotrexate (MTX) as a means to increase production titers, and their development was on average 3 weeks longer. These representative cell lines were the final candidates developed during previous campaigns. Cell lines were screened for protein titers in 96-well plates, typically 2–3 weeks after transfection, and the top 20% producing cell lines were moved forward. Protein titers were determined by Octet QK384 equipped with a Protein A (ProA) biosensor for specific binding to the Fc region of the IgG (Pall ForteBio, Menlo Park, CA, USA). The entire cell line development was performed in humidified incubators (Forma, ThermoFisher, Waltham, MA, USA) set at 37 °C, 80% relative humidity and 5% CO2. The top 20% selected cell lines were scaled up to 4 mL final volume in 24-deep well plates (square-well, pyramid bottom, Axygen, Corning, NY, USA), sealed with a Duetz low-evaporation sandwich cover (EnzyScreen, Heemstede, Netherlands), and the cell cultures were shaken at 225 rpm in a 25 mm throw incubator shaker (InforsHT, Annapolis Junction, MD, USA). Throughout the scale-up procedure, cell lines were routinely passaged, and their production titers and cellular growth were assessed on day 4. The 24-deep well plate method development and scale-up to shake flasks or bench-top bioreactors were described previously (Mora et al. 2017).
The cell lines used for the DOE-feed method development were cryopreserved in suitable dimethyl sulfoxide (DMSO) medium. Before the fed-batch was performed, cells were revived for at least 4 passages after rapid thaw at 37 °C. All cell cultures activities were performed in chemically defined, serum free medium.

Design of experiment framework and statistical analyses

The DOE-feed methodology included 3 stages of development. The first stage involved the testing of previously selected cell lines under factorial DOE conditions and (a) the definition of output parameters (e.g. titer and cellular growth). Parallel fed-batch production runs were executed based on the DOE design (Table 1) and statistical models were built by (b) Prediction Profiler and (c) unsupervised Principal Component Analysis (PCA). The DOE conditions were designed with JMP statistical software (SAS Institute Inc., Cary, NC, USA) and set as follows. The 5 input factors were the feed days 3, 5, 7, 10 and 12 of the fed-batch (denoted as F%3, F%5, F%7, F%10 and F%12 respectively on the PCA plot), while the 3 levels were the feed volume percentages of 5, 7 and 9% (for days 3 and 5), 7, 9 and 11% (for days 7 and 10) and 3, 5 and 7% (for day 12). The total feed percent was calculated as the sum of the individual feed volume percentages, giving multiple levels ranging from 31 to 47%. The measured responses, titer and the integral of viable cell density (IVCD) at harvest, were organized in datasets specific for each cell line, suitable for subsequent analyses. The second stage involved the identification of superior feeding-schemes. The datasets were analyzed by PCA to study the overall variability explained by the resulting models, as previously proposed (Mora et al. 2017). Significant observations were further investigated by Prediction Profiler in order to assess the relative importance of each variable toward the final output. Profiling models were built with confidence intervals set at 95%. The analyses of these datasets and the extrapolation of main notions from the statistical models guided the design of improved feeding-schemes, as discussed later. The third stage of the DOE-feed was a verification of the hypothesized superior feeding-schemes versus the legacy one, and the implementation of superior feeds into a new CLD campaign.
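As an illustration of the factor space described above, the following minimal Python sketch enumerates the three feed-volume levels for each of the five feed days and computes the total feed percent as the sum of the daily percentages. The function names (`full_factorial`, `total_feed`) are hypothetical. The sketch lists the full factorial space only; the 24 conditions of Table 1 were selected by JMP and are not reproduced here, which is one possible reason the totals tested in the study (31 to 47%) span a narrower range than the full factorial.

```python
from itertools import product

# Feed-volume levels (% of initial culture volume) per feed day, as listed in the text.
LEVELS = {
    3: (5, 7, 9),
    5: (5, 7, 9),
    7: (7, 9, 11),
    10: (7, 9, 11),
    12: (3, 5, 7),
}


def full_factorial(levels):
    """Return every combination of levels as a dict {feed_day: percent}."""
    days = sorted(levels)
    return [dict(zip(days, combo)) for combo in product(*(levels[d] for d in days))]


def total_feed(condition):
    """Total feed percent: sum of the individual feed-day percentages."""
    return sum(condition.values())


conditions = full_factorial(LEVELS)
totals = [total_feed(c) for c in conditions]

# Condition with every feed day set to its middle level.
center = {day: day_levels[1] for day, day_levels in LEVELS.items()}

print(len(conditions))           # number of full-factorial combinations
print(min(totals), max(totals))  # span of total feed over the full factorial
print(total_feed(center))        # total feed at the middle level of every day
```

With these levels the full factorial contains 243 combinations, far more than the 24 runs tested, and the all-middle-level condition sums to 37%, the value quoted as the mid-point of the total feed factor in the Fig. 2 caption.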

Table 1 Design of experiment (DOE): 24 conditions (first column) identify multiple feed volumes supplemented on day 3, 5, 7, 10 and 12 (second column)

Fed-Batch production runs

Multiple production runs were executed through the first stage (288 runs), second stage (56 runs) and third stage (96 runs) of the DOE-feed method. Representative cell lines were previously identified as good producers by the day 4 titer assessment and by a conventional fed-batch screening experiment in 24-deep well plates. These cell lines represented the final top producers in previous CLD campaigns. Cultures were inoculated at 0.5 million cells/mL in 4 mL final volume in a 24-deep well plate and agitated at 35 °C. The cultures were fed on days 3, 5, 7, 10 and 12 in 24 different combinations of feed medium, ranging from 3 to 11% of the initial volume as previously described. Glucose was supplemented at 0.5% v/v on feed days. The viabilities and viable cell densities (VCD) were monitored by trypan blue exclusion cell counts (CellaVista, Synentec GmbH, Elmshorn, Germany), and the cultures were terminated on day 14 or when the viability was lower than 50% (day 12 for the cell lines expressing molecules 1 and 8). Throughout the fed-batch experiment, the integrated viable cell density (IVCD) was calculated from the VCD, as follows:

$$IVCD_{D1} = \left[ \left( \frac{VCD_{D1} + VCD_{D0}}{2} \right) \times \left( t_{1} - t_{0} \right) \right] + IVCD_{D0}$$

In the equation, $IVCD_{D1}$ and $IVCD_{D0}$ represent the IVCD at time points 1 and 0 ($t_{1}$ and $t_{0}$, expressed in days); $VCD_{D1}$ and $VCD_{D0}$ represent the VCD at time points 1 and 0. IVCD was measured in millions of viable cells accumulated throughout the fed-batch. The supernatant at harvest was clarified by centrifugation (3000 rpm, 15 min) followed by filtration (0.22 µm PES filter), and the production titers were measured on a PorosA column by HPLC/UV–Vis detection. The clarified harvest was purified and analyzed for size (SEC), charge (WCX) and glycan (LC–MS) of the expressed protein as previously described (Mora et al. 2017).
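The IVCD recursion above is a cumulative trapezoidal sum and can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the function name `ivcd_series` and the sample cell counts are hypothetical, and VCD is assumed to be expressed in million cells/mL with time in days.

```python
def ivcd_series(days, vcd, ivcd0=0.0):
    """Cumulative IVCD by the trapezoidal rule.

    days  : sampling times in days (increasing)
    vcd   : viable cell density at each sampling time (assumed million cells/mL)
    ivcd0 : IVCD at the first time point
    """
    if len(days) != len(vcd):
        raise ValueError("days and vcd must have the same length")
    series = [ivcd0]
    for i in range(1, len(days)):
        mean_vcd = (vcd[i] + vcd[i - 1]) / 2      # average VCD over the interval
        interval = days[i] - days[i - 1]          # t1 - t0
        series.append(series[-1] + mean_vcd * interval)
    return series


# Hypothetical counts, for illustration only.
days = [0, 3, 5, 7]
vcd = [0.5, 2.5, 6.5, 10.5]
ivcd = ivcd_series(days, vcd)
print(ivcd)
```

Each new value adds the mean VCD of the interval multiplied by the interval length to the previous IVCD, so unevenly spaced sampling days are handled without modification.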

Results

Production titers depend on the amount of feeds on specific significant days

The effect of the total feed volume was studied in relation to the productivity of the specific cell lines under the 24 fed-batch conditions (Fig. 2a). Regardless of the specific day of the feed, cell lines showed a titer improvement of up to 9% upon the increase of total feed from 35% (DOE mid-point) to 47% (DOE limit), as summarized in Table 2. This was further verified on a yield basis (Fig. 2b), where the final productivity was multiplied by the volume increase after feeding, specific to each condition of the DOE. As the productivity increased, the yield increased to a greater degree, up to a 16% improvement (Table 2), showing higher statistical significance, as determined by the reduced width of the confidence intervals (Fig. 2b). Cell line 3.13 and the old-generation cell lines 1.42 and 1.88, developed in the presence of MTX, showed a titer decrease and no significant changes in yield. Their confidence intervals never crossed the mid-point value, meaning they didn’t reach statistical significance during the DOE-feed at Stage 1, hence they were excluded from the characterization activities of Stage 2.
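The reason why yield improves more than titer can be made explicit with a short calculation. The sketch below assumes that the volume increase equals one plus the total feed fraction, ignoring glucose additions, sampling and evaporation; this simplification, the function names and the titer values are all hypothetical and are not taken from Table 2.

```python
def relative_yield(titer, total_feed_percent):
    """Titer scaled by the final-to-initial volume ratio.

    Assumes the volume ratio is 1 + total feed fraction (an illustrative
    simplification that ignores glucose additions, sampling and evaporation).
    """
    return titer * (1 + total_feed_percent / 100)


def percent_change(new, reference):
    """Percent change of `new` relative to `reference`."""
    return (new / reference - 1) * 100


# Hypothetical titers (g/L) at two total feed percentages.
titer_low_feed, titer_high_feed = 2.00, 2.10
yield_low_feed = relative_yield(titer_low_feed, 35)
yield_high_feed = relative_yield(titer_high_feed, 47)

titer_gain = percent_change(titer_high_feed, titer_low_feed)
yield_gain = percent_change(yield_high_feed, yield_low_feed)
print(round(titer_gain, 1), round(yield_gain, 1))
```

Because the higher-feed condition also ends with a larger culture volume, the same titer gain translates into a larger gain in total product, which is the direction of the effect reported for Fig. 2b.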

Fig. 2

Prediction Profiler based on cell line response titer (a left) and yield (b right) at progressive final feed volumes (feed total %, x-axis) for several cell lines (y-axis). Prediction profiles are based on the mean predicted response (red value on y-axis and solid black line on plot) as defined by their 95% confidence intervals (black value on y-axis and grey-highlighted traces on plot). Values reported at 37% (mid-point of the Feed total % factor value)

Table 2 Average titer (a) and yield (b) values measured at representative total feed percentages

The influence of the specific feed-days on the harvest titer was then studied by PCA modelling (Fig. 3), which explained 63.3% of the dataset variability (41.2% on component 1 and 22.1% on component 2). The feed-day 7 (F%7) had a positive correlation with most cell lines on the right-hand side of the graph, based on the proximity of the respective markers of titer (black circle) and feed-day (red triangle). Based on the PCA concept that markers in the same quadrant correlate, cell lines 8.04 and 8.37 showed a positive correlation with the feed-day 5 (F%5), whereas cell lines 7.07, 7.15, 7.23 and 3.03 correlated with the day 10 (F%10). The remaining cell lines 1.42, 1.88 and 3.13 negatively correlated with most of the feed-days, as explained by their relative position in the opposite part of the PCA graph. Based on the PCA concept that markers in the middle point of the plot don’t influence the model variability, the day 12 (F%12) of the feed didn’t drive any effect on titer.
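To show where figures such as "41.2% on component 1 and 22.1% on component 2" come from, the following sketch computes the fraction of variance explained by the first two principal components of a small, made-up dataset. It is not the JMP procedure used in the study: it assumes a PCA on the correlation matrix, extracts the components by power iteration with deflation using only the Python standard library, and all data values and function names are hypothetical.

```python
import math


def standardize(rows):
    """Center each column and scale it to unit variance (PCA on correlations)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)] for i in range(p)]


def dominant_eigenpair(m, iterations=1000):
    """Power iteration: largest eigenvalue of a symmetric matrix and its unit eigenvector."""
    p = len(m)
    v = [1 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * m[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v


def explained_variance(rows, n_components=2):
    """Fraction of total variance carried by the leading principal components."""
    m = correlation_matrix(standardize(rows))
    p = len(m)
    total = sum(m[i][i] for i in range(p))
    fractions = []
    for _ in range(n_components):
        eigenvalue, v = dominant_eigenpair(m)
        fractions.append(eigenvalue / total)
        # Deflation: remove the component just found before extracting the next one.
        m = [[m[i][j] - eigenvalue * v[i] * v[j] for j in range(p)] for i in range(p)]
    return fractions


# Hypothetical rows: (day-7 feed %, day-10 feed %, harvest titer in g/L).
rows = [
    (7, 7, 1.00),
    (7, 11, 1.10),
    (9, 9, 1.30),
    (11, 7, 1.60),
    (11, 11, 1.70),
    (9, 9, 1.35),
]
fractions = explained_variance(rows)
print([round(f, 3) for f in fractions], round(sum(fractions), 3))
```

The cumulative value printed last plays the role of the cumulative R2X quoted in the Fig. 3 caption. Power iteration is used here only to keep the example dependency-free; a numerical library's eigendecomposition would normally be preferred.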

Fig. 3

Titer response analyzed by principal component analysis (PCA) model of feed-days 3, 5, 7, 10 and 12 (F%3, F%5, F%7, F%10 and F%12, red triangles) and their correlation to titer responses of the analyzed cell lines (black circles). Titer component 1 (41.2%) and titer component 2 (22.1%) explain most of the variance with cumulative R2X = 63.3%. The vicinity of cell line titer responses to specific feeding days represents the degree of their correlation

These observations were verified by a Prediction Profiler analysis, in an effort to study how the model responded as independent factors changed (Fig. 4), and three clusters of titer performances were identified. Cluster A (cell line 8.04 and 8.37) and B (cell lines 3.12, 3.03 and 7.07, 7.15, 7.23) showed benefits after increasing the day-7 feed volume up to the DOE limit (from 7% to 11%). Cluster A were responders to the early days 3, 5 and 7 while no effect was observed on later days 10 and 12. Cluster B were responders to the late days 7 and 10, while no changes were observed on day 3 and 12. Cluster C grouped cell lines that responded poorly to volume increase (3.13, 1.42 and 1.88) all throughout the fed-batch, confirming the unsuitability of old-generation cell lines.

Fig. 4

Prediction Profiler based on cell line response Titer at progressive feed volumes on days 3, 5, 7, 10 and 12 (Feed [%], x-axis) for several cell lines (y-axis). Prediction profiles are based on the mean predicted response (red value on y-axis) as defined by their 95% confidence intervals (bracketed black values on y-axis and grey-highlighted traces on plot). Values reported at the mid-point of the several days Feed [%]. Cell lines showed differential responses to feed and were grouped as early responders to feed (cluster A), late responders to feed (cluster B) and no responders to feed (cluster C)

Influence of feed on cellular growth

The cell growth response to feed-days and volume was investigated by a combined PCA/Prediction Profiler approach (Figs. 5, 6). As shown in Fig. 5, the feed-days F%7, F%10 and F%12 localized in the center region of the PCA graph and therefore didn’t influence the IVCD (52.6% of the variability was explained). The F%3 localized in the opposite quadrant of the PCA plot, suggesting that day 3 was associated with a minor growth reduction in every cell line. The F%5 co-localized in the same quadrant as cell lines 7.07, 7.15, 7.23 and 3.03, denoting a growth improvement as the day-5 feed volume increased, specifically in this subset of cell lines. These observations were further studied in a Prediction Profiler analysis (Fig. 6) using the model IVCD ~ F%3 + F%5 + F%7 + F%10 + F%12. The consistent growth reduction at day 3 was deemed an effect of the volume dilution after feeding. Furthermore, a specific subset of cell lines was modestly influenced by the feed volume increment at day 5 (Fig. 6, box). Cell lines in this cluster were hypothesized to use the day 5 feed efficiently by increasing both titer (Fig. 4, cluster B) and proliferation rate, with no production bottlenecks within the DOE limits, making them ideal candidates for further process development. The remaining feed days drove no effect on the IVCD and none of the observations was statistically significant. The study of titer and growth on previously selected cell lines concluded the Stage 1 of the DOE-feed methodology.

Fig. 5

IVCD response analyzed by principal component analysis (PCA) model of feed-days 3, 5, 7, 10 and 12 (F%3, F%5, F%7, F%10 and F%12, red triangles) and their correlation to IVCD responses of the analyzed cell lines (black circles). IVCD component 1 (32.9%) and IVCD component 2 (19.7%) explain half of the variance with cumulative R2X = 52.6%. The vicinity of cell line IVCD responses to specific feeding days represents the degree of their correlation

Fig. 6

Prediction Profiler based on cell line response IVCD at progressive feed volumes on day 3, 5, 7, 10 and 12 (Feed [%], x-axis) for several cell lines (y-axis). Prediction profiles are based on the mean predicted response (red value on y-axis) as defined by their 95% confidence intervals (bracketed black values on y-axis and grey-highlighted traces on plot). Values reported at the mid-point of the several days Feed [%]. The analysis suggested that reduced growth was associated to feed volume increase on early days in all cell lines. The “late-responder to feed” subset of cell lines (black box) weakly responded to day-5 feed increase by improving their growth (not statistically significant). No changes to growth were observed as consequence of different feeding

Rational design of 3 superior feeding schemes

The PCA and Prediction Profiler statistical models were studied and three main notions (I, II and III) were extrapolated. These analyses triggered the Stage 2 of the DOE-feed, leading to the design of three feeding schemes B, C and D (Table 3). Notion-I: greater amount of feeds increased productivities, therefore volumes were slightly increased at day 3, 5 and 7 (scheme B); Notion-II: day 7 influenced most of the productivity improvement, therefore feed volumes on day 7 were increased, while day 3 and 5 were combined into only one feed supplementation on day 4 (scheme C) and Notion-III: day 12 titer, yield and growth responses were not significant, therefore day 12 feed volume was combined with day 7 (scheme D). The three schemes were compared to the one-fits-all legacy fed-batch (scheme A) and deployed to screen cell lines 3.03, 3.12, 8.04, 8.37, and cell lines 9.01 and 9.05, expressing molecule 9 (Fig. 7). The average titer improvements (Fig. 7, columns), were 4% (B), 6% (C) and 17% (D) with a peak of 34% improvement in 8.04 (D), when compared to the legacy scheme A titers. The increase in feed volume didn’t impact the cellular growth (Fig. 7, markers), with the exception of 8.04 (IVCD increase of 16%). The purified fed-batch harvest was analyzed for high molecular weight aggregates (HMW) and low molecular weight species (LMW), charge variants and glycosylation (data in Supplementary Material). In every case, we didn’t detect any deviation from the product quality profile of scheme A.
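The normalization behind the average and peak improvements can be sketched as follows. The titers below are hypothetical and only illustrate the arithmetic (each scheme is expressed relative to scheme A within the same cell line, then averaged across cell lines); they are not the values plotted in Fig. 7.

```python
def percent_vs_reference(titers, reference="A"):
    """Percent titer change of each scheme relative to the reference scheme."""
    ref = titers[reference]
    return {s: (t / ref - 1) * 100 for s, t in titers.items() if s != reference}


# Hypothetical harvest titers (g/L) per cell line and feeding scheme.
titers_by_line = {
    "line_1": {"A": 2.0, "B": 2.1, "C": 2.2, "D": 2.5},
    "line_2": {"A": 3.0, "B": 3.0, "C": 3.3, "D": 3.3},
}

improvements = {line: percent_vs_reference(t) for line, t in titers_by_line.items()}
schemes = ["B", "C", "D"]

# Average and peak improvement of each scheme across cell lines.
average = {s: sum(improvements[line][s] for line in improvements) / len(improvements) for s in schemes}
peak = {s: max(improvements[line][s] for line in improvements) for s in schemes}

for s in schemes:
    print(s, round(average[s], 1), round(peak[s], 1))
```

Normalizing within each cell line before averaging prevents high-titer cell lines from dominating the comparison between schemes.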

Table 3 Superior feeding schemes B, C, D compared to legacy scheme A by extrapolation of main notions from the statistical models
Fig. 7

Comparison of feed scheme A (dark grey), B (light grey), C (grey) and D (black) in multiple cell lines. Average titer (n = 2) is shown as a bar graph and normalized to the scheme A values (left y-axis). IVCD is represented by markers A (square), B (circle), C (cross) and D (hyphen) and plotted on the right y-axis. Scheme D showed a titer increase in all cell lines and no significant impact on IVCD

Verification of superior screening power of scheme D

Throughout the Stages 1 and 2 of the DOE-feed we were able to improve cellular performances of a pool of previously selected cell lines, by defining the superior scheme D. In the final Stage 3, we wanted to verify that scheme D was able to screen good candidates out of a heterogeneous pool of cell lines, with no pre-selection applied. Fed-batch screening by scheme D identified the same cell lines as scheme A (Fig. 8). The squared Pearson correlation value (R2 = 0.87) showed good correlation, while the Spearman’s rho value (ρ = 0.90) indicated good ranking of performances. We measured the linear regression and calculated an average titer improvement of 2% (y = 1.02x), with peaks up to 9% increase among the top 10 producing cell lines, consistent with our Stage 2 results. Notably, the scheme D further tested the feed frequency reduction from 5 to 3 days, reducing the hands-on workload by 40%, hence improving the streamlining of the CLD activities. The scheme D successfully met the verification criteria and it could be applied to new CLD programs for screening top-producing cell lines.
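The three agreement metrics used in this verification (squared Pearson correlation, Spearman's rho and the slope of the regression line) can be sketched with the standard library alone. The paired titers below are hypothetical, the function names are illustrative, and fitting the line through the origin is an assumption made here because the reported equation (y = 1.02x) has no intercept term.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


def slope_through_origin(x, y):
    """Least-squares slope of y = k * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical paired titers (g/L) of the same cell lines under schemes A and D.
scheme_a = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
scheme_d = [0.6, 0.9, 1.6, 2.1, 2.4, 3.2]

r_squared = pearson_r(scheme_a, scheme_d) ** 2
rho = spearman_rho(scheme_a, scheme_d)
slope = slope_through_origin(scheme_a, scheme_d)

# Hands-on reduction when feeding on 3 days instead of 5.
workload_reduction = (5 - 3) * 100 / 5

print(round(r_squared, 3), round(rho, 3), round(slope, 3), workload_reduction)
```

A slope above 1 indicates that scheme D titers are on average higher than scheme A titers, while rho close to 1 indicates that both schemes rank the cell lines in nearly the same order, which is the property that matters for screening.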

Fig. 8

Fed-batch titers in a new CLD campaign. Cell lines without pre-selection (poor and good producers) were screened by scheme A (x-axis) and scheme D (y-axis). Average titers showed high correlation (R2 = 0.87) and high rank agreement as defined by Spearman’s rho (ρ = 0.90). Average final titer values were improved by 2% in scheme-D screening

Discussion

In this study, we developed the DOE-feed methodology with the intent to improve the current cell line screening workflow and verify its suitability as new technologies are introduced in CLD. We presented the DOE-feed as a novel framework that combined late-stage techniques, like statistical modelling and analytical tools, with early stage cell culture input data. In our experience at AbbVie, the “one-fits-all” scheme hadn’t been regularly updated and became a legacy feeding-scheme in the CLD lab. The validation of an ideal fed-batch screening scheme and the characterization of statistical models defining cellular behavior fall outside the CLD scope and are conventionally not explored at this early stage of development, although they are triggered during late stage PD (Brühlmann et al. 2017; Nagashima et al. 2013; Ben Yahia et al. 2015). In our DOE-feed work, we deliberately anticipated the application of statistics during the early stage CLD and approached the method development as an early-stage, continuously updated process, by building iterative datasets and statistically interpreting their responses. The historic CLD feeding-scheme was previously transferred from the PD lab and regarded as an agnostic “one-fits-all” strategy (Fig. 1, silver box). Its suitability for recent CLD needs wasn’t part of routine verifications (Fig. 1, box 1) and bench scientists had little motivation for introducing changes to the conventional design (Fig. 1, box 2). Novel technologies were progressively introduced to the routine CLD (Fig. 1, box 3): the transition to an MTX-free development, the introduction of upgraded vectors, and the improvement in medium and screening workflows, all of which may affect the established cellular nutrient demands and challenge the efficacy of the legacy “one-fits-all” regimen (Davies et al. 2011). We suspected that the “one-fits-all” scheme was outdated and decided to reassess it to make quick, accurate decisions before the PD activities begin.
With this motivation in mind, we developed the DOE-feed method to study the significant feed-days and their relationship to productivity and growth, that may facilitate and optimize our CLD workstream.

We generated, organized and analyzed data of 440 fed-batch runs, through a three-stage approach, covering 6 CLD campaigns. We discovered that our current CHO system can tolerate feed volumes greater than the “one-fits-all” scheme currently applied and, notably, that the day-7 feed influenced the productivities of most cell lines. We reported that feeding on days 3 and 12 didn’t provide any benefit on productivity, and that their supplementation may negatively influence the cellular proliferation. Work by Zhang et al. highlighted the need for precise feedback control strategies to avoid under- or over-feeding production cell lines (Zhang et al. 2015). Others incorporated tools to monitor biomass and dynamically determined the feed rate through predictive methods, in order to estimate future nutrient demand (Lu et al. 2013). Feed medium influences CHO cell behavior in terms of growth, productivity, nutrient utilization and waste generation, and several tools can identify those nutrients limiting growth and antibody production (Khattak et al. 2010; Sellick et al. 2011). Through a combination of Prediction Profiler (Fig. 2) and PCA (Fig. 3), we isolated a cluster of cell lines (Fig. 4B) highly desirable for further PD, based on their response to feeding. Multiple CLD campaigns were independently executed and their iterative datasets accumulated sufficient input data to enable the design of superior feeding schemes, extrapolating three main notions from the statistical modelling. Particularly relevant, the scheme D provided an average 17% titer improvement, with a peak of 34%, when compared to the legacy “one-fits-all” scheme, with consistent quality of the product expressed. This novel scheme reduced the operational hands-on time and feeding frequency, from 5 to 3 days, corresponding to a 40% efficiency improvement. It was possible and beneficial to switch from the legacy 5-day, 35% feed-volume scheme to the novel 3-day, 40% feed-volume scheme without altering the screening power and cellular phenotypes.
During AbbVie’s CLD screening, the DOE-feed methodology became a tool for the occasional evaluation and improvement of the feeding-scheme efficiency.

As we deciphered more of the CHO biology, there still was an apparent need to optimize the current feeding strategies with the intent to improve cell line phenotypes and characteristics. For example, our 24-deep well plate scale-down model dramatically improved the throughput, but it’s only sufficiently predictive of the final titers at the bench-scale bioreactor (Mora et al. 2017).

Many biomanufacturing organizations engage in similar endeavors during the late stages of the Process Development and may transfer improved feeding-schemes to their CLD counterparts at various frequencies, hence mitigating the issue of screening with an outdated scheme. Nevertheless, we believe that it’s the combination of statistical modelling applied to early stage cell culture data that makes our DOE-feed method original, and amenable to CLD needs. The reduction of feeding days from 5 to 3, the improvement in screening efficiency and the greater titers during high-throughput scale-down fed-batch runs are issues peculiar to the early stage CLD and the DOE-feed method successfully addressed them. As the DOE-feed method became established in our facility, changes in the clonal selection workflow didn’t influence the timelines and the subsequent activities in process development labs and manufacturing suites, maintaining a successful tech transfer of the process throughout the various stages.

The cell lines used in this study were generated at different times, with slight changes of the selection strategies, and were expressing a variety of molecules. During the Stage 1 of the DOE-feed methodology, we leveraged the sample heterogeneity to draw hypotheses around the relationship between specific feed days and volumes, productivity improvement and cellular growth. We used statistical tools such as unsupervised PCA and Prediction Profiler to properly interpret the DOE-feed generated dataset, obtaining a sufficient degree of statistical significance, and a great degree of practical significance. The use of statistics was deemed beneficial for screening large datasets and assessing cellular behaviors since the early stages (Alves et al. 2015; Porter et al. 2010; Povey et al. 2014). In particular, DOE, PCA and Prediction Profiler were effective tools in identifying key process parameters or predicting cell culture markers, nutrient consumption or metabolic waste (Brühlmann et al. 2015; Kang et al. 2014; Nolan and Lee 2012). We found that day 7 dominated the productivity response, therefore an increased supplementation on this day directly translated into greater titers. The day 5 feed showed a lower improvement than the one driven by day 7, and not all cell lines responded to this specific day. Irrespective of their pool or clone development, cell lines showed a significant variability in their responses. Cell lines were either early responders to feed (Fig. 4A), or late responders (Fig. 4B), as opposed to the old-generation, MTX-selected cell lines (Fig. 4C) that didn’t respond. This phenomenon highlighted the different metabolic needs of new-generation cell lines and corroborated the necessity of an improved feeding scheme. Cell line 3.13, though new-generation, exhibited behaviors comparable to old generation cell lines.
This unique behavior of 3.13, the only one out of 10 pre-selected new-generation cell lines, demonstrated the difficulties in developing a universal screening workflow and substantiated the need for a rigorous platform approach. We analyzed these behaviors retrospectively and verified that an incremented total feed volume was beneficial from the productivity perspective and, most importantly, statistically significant from the overall yield perspective.

We were interested in understanding the relationship between the feed and the growth. All cell lines derived from the transfection of the same parental host CHO dhfr. The event of transfection and selection under dhfr deficient conditions can select specific phenotypes, therefore we anticipated a large heterogeneity in the growth response after feeding (Fig. 5). Feeds influence cellular growth, although their specific interplay is still unclear. Protein production is a linear function of the time cells spend in G1 phase of the cell cycle, suggesting the secretion happens during this time, hence strategies like cell cycle delay capitalize on this notion (Dutton et al. 2006; Kumar et al. 2007). Mathematical models describing CHO growth are constantly updated, specifically tackling the relationship between growth, protein production and product quality (Dietmair et al. 2012; Galleguillos et al. 2017; Gauthier and Pohl 2011; Shirsat et al. 2015; Ben Yahia et al. 2015). There are difficulties in generalizing any relationship because culture medium governs most of these complex interactions, thereby it is important for biomanufacturing organizations to properly characterize their own CHO cellular systems, and incorporate process control strategies specific for the platform they use (Berry et al. 2014; Lloyd et al. 1999). We observed that all cell lines underwent a minor dilution effect after feeding on day 3, although a subset resumed robust growth as feed volumes increase on day 5 (Fig. 6, box). The same cell lines showed higher productivity (Fig. 4B) upon feeding, hence they are more desirable candidates to choose. We hypothesized that they carry a greater potential to improve their productivity by two mechanisms: a. the feed medium is directly channeled through the peptide synthesis machinery and b. the feed medium improves the cellular growth thus enlarging the number of producing cells. 
We did not investigate which of the two phenomena predominates, and no metabolic profiling of the cell culture was performed. This observation was not statistically significant; therefore, we cannot generalize any cell line behavior specific to the growth response. Unsurprisingly, the behavior of the old-generation cell lines deviated from the rest. Those cell lines were the only ones to undergo MTX treatment as a means to increase their productivity, and we suspect that the increased dhfr gene copy number and expression may influence their growth profile (Chusainow et al. 2009). We anticipated unpredictable production and growth characteristics upon MTX selection, although the investigation of such behavior was not the goal of our research. Nonetheless, the use of the selecting agents MTX or MSX is currently decreasing, in favor of quicker and more efficient molecular strategies for improving productivity and reducing timelines (Ho and Yang 2014; Seth et al. 2007; Ye et al. 2010).

By incorporating statistical modelling, we could interpret cell culture data and extrapolate three notable notions, hence advancing to Stage 2 of the DOE-feed methodology: Notion-I, a higher feed volume is typically better; Notion-II, day 7 drives most of the titer improvement; Notion-III, days 3 and 12 do not significantly influence titer, while they reduce growth. The modeling work was instrumental in proposing superior feeding schemes, which we then compared to the legacy “one-fits-all” scheme A. We confirmed the positive effect of increased feed volume and the influence of day 7 (Table 3, box) on the overall response, with up to a 34% productivity increase. Upon scheme D treatment, we found that the cumulative IVCD was either unchanged or slightly higher (8.34), despite the extra volume at day 7. This indicates that even if a dilution effect takes place, some cells can rapidly counteract it. It was reported that delaying the cell cycle in the G1 phase is an effective strategy to extend the peak of antibody production in CHO cells (Dutton et al. 2006; Kumar et al. 2007). There is a fine balance between increased proliferation rate and growth reduction upon supplementation of feed, and the investigation of cell cycle markers could help in characterizing the cellular moments that matter most from a protein production perspective.
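For readers less familiar with the IVCD metric discussed above, the following is a minimal sketch of how a cumulative integral of viable cell density is commonly computed with the trapezoidal rule. It assumes that approach; the exact calculation used in this study is not stated here and may differ. The function name `cumulative_ivcd`, the sampling days and the density values are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def cumulative_ivcd(days: Sequence[float], vcd: Sequence[float]) -> float:
    """Integrate viable cell density (VCD) over time with the trapezoidal rule.

    days: sampling times in days, strictly increasing.
    vcd:  viable cell density at each sampling time (e.g. 1e6 cells/mL).
    Returns the cumulative IVCD in (VCD units x days).
    """
    if len(days) != len(vcd):
        raise ValueError("days and vcd must have the same length")
    if len(days) < 2:
        raise ValueError("at least two samples are needed")
    total = 0.0
    for i in range(1, len(days)):
        dt = days[i] - days[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Area of one trapezoid: mean density over the interval times its length.
        total += 0.5 * (vcd[i - 1] + vcd[i]) * dt
    return total


if __name__ == "__main__":
    # Hypothetical growth curves (1e6 cells/mL), NOT data from this study.
    sample_days = [0, 3, 5, 7, 10, 12, 14]
    vcd_scheme_a = [0.5, 3.0, 6.0, 9.0, 10.0, 9.0, 7.0]
    vcd_scheme_d = [0.5, 3.0, 6.2, 8.8, 10.4, 9.3, 7.2]
    print("Scheme A IVCD:", cumulative_ivcd(sample_days, vcd_scheme_a))
    print("Scheme D IVCD:", cumulative_ivcd(sample_days, vcd_scheme_d))
```

Comparing the two integrals for the same cell line gives a single number per scheme, which is how an "unchanged or slightly higher" cumulative IVCD can be judged even when a transient dilution lowers the density right after a feed.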

We further focused on the titer response from cell lines producing the same molecule. The molecule sequences and structures may constitute a bottleneck during the peptide assembly process, with cell lines 8.04, 8.37 and 9.01, 9.05 receiving the mixture of amino acids that best promotes the synthesis of molecules 8 and 9. Conversely, the same mixture is not perfectly suited to the assembly of molecule 3. From a proteomic perspective, peak antibody production is associated with increased oxidative metabolism and the transition from a glycolytic to a max-TCA status of the cell metabolism (Templeton et al. 2013); the lack of specific nutrients in the feed may therefore constitute a rate-limiting step. We speculate that, along with increased secretory activity, the feeds we adopted in this study may stimulate a metabolic status that favors certain molecules (Mcleod et al. 2011). Researchers have used high-concentration feeds, spent medium analysis or transcriptomic studies to understand whether production bottlenecks exist in mRNA translation, intracellular assembly or peptide folding (Kyriakopoulos and Kontoravdi 2014; Rouiller et al. 2013; Yang et al. 2016; Yu et al. 2011). Importantly, none of the feeding schemes influenced the product quality (Supplementary Material). This was an important process verification for deeming scheme D a viable superior alternative.

With regard to CLD streamlining, schemes C and D were noteworthy, as they reduce the hands-on time from 5 days to 4 and 3 days respectively, corresponding to 20% and 40% efficiency improvements. The simplification of the feeding design could additionally ease routine bench operations and minimize human error or miscalculations. These are desirable attributes of an efficient workstream, particularly when operators must handle large volumes of cell cultures or when automation is incorporated (Lindgren et al. 2009). Lastly, we wanted to verify that scheme D was equivalent to scheme A in selecting cell lines during a fed-batch screening campaign. The observations underlying the Stage 2 models were defined retrospectively and established on previously selected cell lines. It was necessary to validate those observations in a new CLD campaign, with a greater number of heterogeneous cell lines and no pre-selection applied. We verified that both feed schemes identified the same top 20% of cell lines, so no potentially good candidates were missed. Furthermore, the productivity improvement ranged from 3% to 9%, consistent with our previous observations.
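The two checks described above, the hands-on time reduction and the agreement of the top 20% of cell lines between schemes, can be reproduced with simple arithmetic. The sketch below is only an illustration under stated assumptions: the function names `hands_on_reduction` and `top_percent_clones`, the clone labels and the titer values are all hypothetical and are not taken from the study data.

```python
from typing import Dict, Set


def hands_on_reduction(baseline_days: int, new_days: int) -> float:
    """Fractional reduction in feeding days relative to the baseline scheme."""
    if baseline_days <= 0:
        raise ValueError("baseline_days must be positive")
    return (baseline_days - new_days) / baseline_days


def top_percent_clones(titers: Dict[str, float], percent: int = 20) -> Set[str]:
    """Return the clone names in the top `percent` of the panel, ranked by titer."""
    if not titers:
        raise ValueError("titers must not be empty")
    # Integer ceiling of percent * n / 100, keeping at least one clone.
    n_top = max(1, -(-percent * len(titers) // 100))
    ranked = sorted(titers, key=lambda name: titers[name], reverse=True)
    return set(ranked[:n_top])


if __name__ == "__main__":
    # Feeding days per scheme as stated in the text: A = 5, C = 4, D = 3.
    print("Scheme C reduction:", hands_on_reduction(5, 4))
    print("Scheme D reduction:", hands_on_reduction(5, 3))

    # Hypothetical titers (g/L) for ten clones, NOT data from this study.
    scheme_a = {"c01": 1.20, "c02": 0.85, "c03": 1.45, "c04": 0.60, "c05": 1.10,
                "c06": 0.95, "c07": 1.38, "c08": 0.70, "c09": 1.05, "c10": 0.90}
    scheme_d = {"c01": 1.26, "c02": 0.88, "c03": 1.55, "c04": 0.63, "c05": 1.17,
                "c06": 1.00, "c07": 1.49, "c08": 0.73, "c09": 1.12, "c10": 0.95}

    top_a = top_percent_clones(scheme_a)
    top_d = top_percent_clones(scheme_d)
    print("Top 20% under scheme A:", sorted(top_a))
    print("Top 20% under scheme D:", sorted(top_d))
    print("Same clones selected by both schemes:", top_a == top_d)
```

A set comparison is used here because the question is whether the same candidates are retained, not whether their rank order within the top group is identical.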

Conclusion

As novel technologies become part of CLD platform development, it is necessary to occasionally reassess the efficacy of the legacy “one-fits-all” feeding scheme. In this study, we described the development of the DOE-feed methodology, a 3-stage framework that allowed the identification of feeding schemes suited for cell line screening. We reported up to 34% improved productivity, more robust CHO cellular growth and 40% more efficient workload management in a panel of 12 cell lines expressing a variety of molecules. The DOE-feed methodology is a continuously updated process that builds iterative datasets, which in turn guide the construction of statistical models. The application of factorial DOE, unsupervised PCA and the Prediction Profiler allowed us to interpret early stage cell culture data accurately and in a novel fashion. Day 7 was significant from a nutrient supplementation perspective: reduced feeding on this day was a productivity-limiting step, whereas increased feeding resulted in titer improvements. The statistical models clustered early and late responders to feed, with the latter group able to counteract the day-3 growth reduction while maintaining a greater titer response to feed throughout the late days of the fed-batch. We extrapolated the main notions from the statistical models to design and test 3 superior feeding schemes, which achieved productivity improvements without impairing growth characteristics or product quality. We further validated our superior scheme D in a new CLD campaign by screening non-preselected cell lines. In summary, the DOE-feed methodology helped identify cell culture production bottlenecks, improved workload efficiency and supported the design of a superior feeding regimen suitable for future CLD screening activities.