Introduction

The pharmaceutical industry currently faces significant challenges. Patent expirations trigger fierce competition between innovator and generic companies. The conventionally long development cycle and the high cost of commercializing a brand-name drug product significantly limit the production of new medicines. Innovation is necessary to break down such limitations. The promise of “Big Data,” defined here as a collection of structured data sufficiently large to reveal patterns and trends or to permit analyses or modeling not previously practical, may provide one such innovation.

With the emergence of Big Data as a topic in the mainstream media, the pharmaceutical industry is beginning to leverage existing tools to analyze and mine historical data. The promise of Big Data permeates almost every aspect of drug product research and development, from defining disease states to identifying new indications, from drug discovery to drug development, and from clinical trial design to drug efficacy and safety evaluation. The subject of Big Data goes hand in hand with a renewed focus on mathematical modeling and simulation within the pharmaceutical industry. Powder compaction is one area that can benefit from Big Data-based modeling, as it is inherent to nearly all solid dosage form drug product development and manufacturing processes. Two common unit operations involving powder compaction are roller compaction and tablet compression. While tablet compression serves as the main approach to transform powder into a consolidated compact by mechanical means, roller compaction is commonly used in the pharmaceutical industry to improve powder flow into the tablet press and compositional uniformity. Although first-principles theories of powder mechanics exist for roller compaction and tablet compression [1, 2], the majority of drug product development activities involving powder compaction rely heavily on the generation of experimental data.

In a previous publication [3], a first-principles-based theory was developed to predict powder compaction behavior and stress evolution in roller compactors. While the model was derived and qualitatively verified using a limited set of published works [4–10, 18] and new experimental data, verification with a large number of experiments was not practical at the time. Since that time, the internally developed CRAVE database has been established at Eli Lilly and Company and has passively collected all aspects of drug product development batch record data for 8 years, including 136 different roller compaction-based formulations and/or products.

In this work, the previous powder compaction model was expanded to include powder tapped density in a unified solid fraction (SF) vs stress powder compaction model. The model was interrogated using data from the CRAVE database. Specifically, tablet compaction profile data and pre-blend tap density data were used to define an empirical relationship between applied stress and the reciprocal of porosity. The correlation was then used to predict the equivalent stress required to attain the reported (corresponding) ribbon solid fraction (also queried from the database). The comparison between the predicted and the observed roll stress and its relation to the equation reported previously is discussed. In all, >2000 batch records were queried from the CRAVE database. Of these, 136 were found to contain a full set of experimental data associated with roller compaction. These data included measurements of tapped powder SF, ribbon SF, roll stress, and a tablet compression profile (SF vs stress).

Materials and Experiments

Data Retrieval

Over the past 8 years, an effort has been maintained at Eli Lilly and Company to systematically and passively catalog all aspects of drug product development data into a single large database. The CRAVE database contains the entirety of development batch records, from raw material attributes to end-product testing data, as well as relevant process parameters and intermediate product attributes. Such parameters and attributes include lot numbers, dates, material amounts, and equipment sets; in short, everything that is recorded in a standard development batch record, organized into tables and defined fields within a relational database.

For this study, the CRAVE database was queried to obtain the tablet compaction profile data, the associated powder tap density data, and the associated ribbon solid fraction data, all organized by batch record. In total, more than 2000 batch records were interrogated, of which 136 were roller compaction-based and found to also possess complete data sets. Within the 136 batches, powder tap densities were measured using a Vankel tap density analyzer. Ribbons were roller-compacted on either a Fitzpatrick IR520 roller compactor or a Gerteis MiniPactor. Tablets were compressed on a wide variety of tablet presses, including a Carver press, a Kikusui Virgo 520, a Fette (models 1090, 1200, and 2090), and a Korsch (models XL100 and XL200).

Batch sizes ranged from 750 to 80 kg. Drug loadings ranged from 0.5 to 77.3% (based on salt form weights). The materials included 13 different active pharmaceutical compounds. Nearly 30 different excipients or excipient grades were used across the 136 formulations. Microcrystalline cellulose and mannitol (various grades of each) were the most common diluents. Magnesium stearate (vegetable-sourced) and sodium stearyl fumarate were the commonly employed lubricants. Most formulations used cross-linked carboxymethylcellulose as a disintegrant and colloidal silicon dioxide as a glidant.

In most cases, the unit formula varied slightly between the roller-compacted ribbons and the tablets due to the addition of extra-granular excipients. However, these differences were found in a separate analysis to have negligible impact on the consolidation behavior of the unit formulas.

The tablet compaction profiles included tablet solid fractions calculated at multiple compression stresses. The tablet SF values were determined from tablet weight and thickness measurements (typically using an average of 10 tablets) as follows:

$$ SF=\frac{m}{\rho \left[A\left(t-2{t}_{\mathrm{cup}}\right)+2{V}_{\mathrm{cup}}\right]} $$
(1)

where m, t, and ρ denote the tablet mass, thickness, and true density, respectively, and A, t_cup, and V_cup denote the die hole cross-sectional area, the cup depth, and the cup volume, respectively. The latter three values were generally read directly from the tablet compression tooling drawings. The corresponding tablet compression stress was determined for each data point as the compression force divided by A.
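As an illustration of Eq. (1) and the stress calculation, a minimal Python sketch follows. The function names, the unit choices (g, cm, kN), and the example numbers are assumptions made for this sketch and do not come from the batch records.

```python
def tablet_solid_fraction(mass_g, true_density_g_cm3, die_area_cm2,
                          thickness_cm, cup_depth_cm, cup_volume_cm3):
    """Eq. (1): SF = m / (rho * [A * (t - 2*t_cup) + 2*V_cup])."""
    # Volume of the cylindrical band between the two cups
    band_volume = die_area_cm2 * (thickness_cm - 2.0 * cup_depth_cm)
    # Total tablet volume: band plus the two cup volumes
    tablet_volume = band_volume + 2.0 * cup_volume_cm3
    return mass_g / (true_density_g_cm3 * tablet_volume)


def compression_stress_mpa(force_kn, die_area_cm2):
    """Compression stress = force / A, expressed in MPa (N/mm^2)."""
    force_n = force_kn * 1000.0      # kN -> N
    area_mm2 = die_area_cm2 * 100.0  # cm^2 -> mm^2
    return force_n / area_mm2


# Hypothetical flat-faced tablet (cup depth and cup volume are zero)
sf = tablet_solid_fraction(0.5, 1.5, 1.0, 0.4, 0.0, 0.0)
stress = compression_stress_mpa(10.0, 1.0)
```

For a flat-faced tablet the cup terms vanish and Eq. (1) reduces to mass divided by true density times the cylinder volume.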

The powder tap density data were recorded as a part of routine powder characterization following guidance from USP chapter <616>. The tapped density was divided by the blend true density to obtain the tapped solid fraction. Due to the negligible stresses applied during the tapped density measurement in comparison to tablet compression forces, a nominal value of 0 MPa was used as the corresponding stress for the tapped powder and for curve-fitting purposes as discussed below.

The ribbon solid fractions were recorded during the course of a roller compaction unit operation and were typically measured on ribbon samples collected from the roller compactor using a GeoPyc Envelope Density Analyzer. The measured ribbon density was divided by the true density to determine the ribbon solid fraction. The corresponding roll stress was calculated from the measured roll force, radius, and width based upon the previously reported Eq. (2) [4]:

$$ {\sigma}_{\mathrm{Equiv}}\approx 9.1\frac{F}{Rw} $$
(2)

where F, w, and R denote the measured roll force, width, and radius, respectively. The proportionality constant 9.1 was established in the previous publication [4] and provides an effective peak roll stress, accounting for both the non-linear pressure distribution across roll surfaces as a function of the rolling angle, as well as the differences between the plane-strain compression geometry provided by roller compactors vs the uniaxial compression geometry provided by tablet presses. The accuracy of this proportionality constant for the estimation of effective roll stress is further evaluated in this work.
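Eq. (2) can be evaluated as in the following Python sketch. The function name, the units (N and mm, which give MPa), and the example roll dimensions are assumptions chosen for illustration only.

```python
def effective_roll_stress_mpa(roll_force_n, roll_radius_mm, roll_width_mm,
                              constant=9.1):
    """Eq. (2): sigma_equiv ~ 9.1 * F / (R * w).

    With F in N and R, w in mm, the result is in N/mm^2 = MPa.
    """
    return constant * roll_force_n / (roll_radius_mm * roll_width_mm)


# Hypothetical example: 50 kN roll force, 125 mm roll radius, 50 mm roll width
sigma = effective_roll_stress_mpa(50_000.0, 125.0, 50.0)  # about 72.8 MPa
```

The proportionality constant is left as a keyword argument so that alternative values can be compared against the value of 9.1 evaluated in this work.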

Data Analysis

Empirical mathematical fitting was performed separately for each manufactured batch of tablets in order to accommodate the impact of raw material variability. Multiple fitting approaches were explored to empirically relate SF to the applied stress for the tablet data and the powder (tap density) data. The fitting approaches included quadratic fitting between solid fraction and stress, linear fitting between stress and the reciprocal of porosity, and quadratic fitting between stress and the reciprocal of porosity. Other data fitting approaches were also attempted, including power law and high-order polynomial fits. However, these were immediately eliminated from further consideration due to their visibly poor ability to fit the data across such a wide range of formulations and batch records. Only the tablet compression and pre-blend tapped SF data were used to construct the correlation for each batch. The distribution of the correlation coefficients (R²) across the 136 batches was used to assess the best mathematical fitting approach.
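The per-batch quadratic fit between stress and the reciprocal of porosity can be sketched in Python as follows. This is not the in-house Matlab routine. It is a minimal standard-library illustration that assumes an ordinary least-squares fit. The function names and the example data are hypothetical, and the stress axis is rescaled internally only to keep the normal equations well conditioned.

```python
def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_reciprocal_porosity(stress_mpa, solid_fraction):
    """Least-squares fit of 1/eps = alpha + beta*sigma + gamma*sigma^2.

    eps = 1 - SF. Returns (alpha, beta, gamma, r_squared).
    """
    y = [1.0 / (1.0 - sf) for sf in solid_fraction]
    scale = max(stress_mpa)                # rescale stress to [0, 1]
    u = [s / scale for s in stress_mpa]
    # Normal equations for the basis [1, u, u^2]
    power_sums = [sum(ui ** k for ui in u) for k in range(5)]
    rhs = [sum(yi * ui ** k for ui, yi in zip(u, y)) for k in range(3)]
    gram = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    a0, a1, a2 = solve_3x3(gram, rhs)
    alpha, beta, gamma = a0, a1 / scale, a2 / scale ** 2
    y_hat = [alpha + beta * s + gamma * s * s for s in stress_mpa]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return alpha, beta, gamma, 1.0 - ss_res / ss_tot


# Hypothetical batch: tapped powder at 0 MPa followed by four tablets
stress = [0.0, 50.0, 100.0, 150.0, 200.0]
sf = [0.45, 0.68, 0.78, 0.84, 0.88]
alpha, beta, gamma, r2 = fit_reciprocal_porosity(stress, sf)
```

The tapped powder enters the fit as an ordinary data point at zero stress, which is why the fitted intercept is expected to track the reciprocal tapped porosity.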

Once calibrated against tablet data and powder (tap density) data, the inverse correlation was used to predict roll stress, given the reported ribbon solid fraction. The predicted stress was compared to the experimentally reported roll stress (Eq. (2)) in an effort to assess the accuracy of the proportionality constant of 9.1. The reported roll stress was the experimental set-point documented in the batch record to generate the corresponding ribbon solid fraction.
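Inverting the quadratic correlation for stress amounts to solving a quadratic equation. The Python sketch below illustrates one way to do this. The function name is hypothetical, and the root-selection rule (the smallest non-negative root) is an assumption of this sketch, not a rule stated in the original analysis.

```python
import math


def predict_stress_mpa(alpha, beta, gamma, solid_fraction):
    """Invert 1/eps = alpha + beta*sigma + gamma*sigma^2 for sigma >= 0.

    eps = 1 - SF. Assumption: the smallest non-negative root is the
    physically relevant one, since the ribbon lies between the tapped
    powder (0 MPa) and the tablet data.
    """
    target = 1.0 / (1.0 - solid_fraction)
    c = alpha - target
    if gamma == 0.0:
        if beta == 0.0:
            raise ValueError("a constant fit cannot be inverted")
        roots = [-c / beta]
    else:
        disc = beta * beta - 4.0 * gamma * c
        if disc < 0.0:
            raise ValueError("solid fraction is outside the range of the fit")
        sq = math.sqrt(disc)
        roots = [(-beta - sq) / (2.0 * gamma), (-beta + sq) / (2.0 * gamma)]
    valid = [r for r in roots if r >= 0.0]
    if not valid:
        raise ValueError("no non-negative stress satisfies the fit")
    return min(valid)
```

The predicted value can then be compared directly with the roll stress computed from Eq. (2) for the same batch.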

Software and Data Processing

All queries were written in Toad for SQL Server and were executed against the SQL Server-based CRAVE database. All calculations were executed using MATLAB R2013a (The MathWorks, Natick, MA) with multiple MATLAB routines written in-house to support this work.

Results and Discussion

Analysis of the Empirical Mathematical Fits

The distribution of correlation coefficients for each curve fitting approach is shown in Fig. 1. As shown, the quadratic fitting between stress and the reciprocal of porosity led to the highest R² values. Data points outside the individual boxes were commonly found to correspond to batches where tablet SF did not increase substantially with increasing applied stress. Because it yielded the highest R² values, the quadratic fitting approach described as follows was used in further calculations:

$$ \frac{1}{\varepsilon }=\alpha +\beta \cdot \sigma +\gamma \cdot {\sigma}^2 $$
(3)

where σ and ε denote the applied compression stress and the porosity (1 − SF), respectively, of the powder, ribbon, or tablet.

Fig. 1. Box distribution plot of the correlation coefficients obtained by applying three different fitting approaches to 136 batches of development campaign data

Of the 136 sets of data, nine example curve fits are shown in Fig. 2. The figure shows compression profiles that were not fit particularly well (top row), were fit reasonably well (middle row), and were fit well (bottom row). The profiles include powder data (circles at zero stress), ribbon data (stars), and tablet data (circles at non-zero stress). Circular data points were used to establish the quadratic fit, while stars (ribbons) are shown as an indication of the predictability of the ribbon SF-stress behavior. The figure shows that in cases with poor correlation coefficients, appreciable scatter existed in the tablet data. Further probing revealed that these data usually related to tablets generated under varied conditions, such as at low speed and high speed, with different tooling, and in some cases on different tablet presses. While the high scatter resulted in a low correlation coefficient, the profiles did not suggest an inadequate or incorrect fitting approach overall. Thus, Eq. (3) was considered to reasonably represent the complete, unified compaction profiles across powders, ribbons, and tablets.

Fig. 2. Stress vs 1/porosity plots for nine example batches out of the entire 136 development campaigns, with their individual quadratic fits as per Eq. (3)

After the initial evaluation, Eq. (3) was further simplified by considering the following two observations. First, since 0 MPa was used as the corresponding stress for the tapped density test, α (the intercept) should correlate strongly with, and could potentially be replaced by, the reciprocal porosity of the tapped powder bed (1/ε_{σ=0}). As shown in Fig. 3, the intercept α did indeed correlate strongly with the reciprocal tapped porosity. Second, the first-order (β) and second-order (γ) coefficients were found to be linearly correlated, as shown in Fig. 4. Thus, the first-order coefficient (β) could be rewritten as a linear function of the second-order coefficient (γ). With these observations, Eq. (3) was further simplified as follows:

$$ \frac{1}{\varepsilon }-\frac{1}{\varepsilon_{\sigma =0}}=\beta \cdot \sigma +\gamma \cdot {\sigma}^2 $$
(4)
$$ \frac{1}{\varepsilon }-\frac{1}{\varepsilon_{\sigma =0}}=\left(-260\cdot \gamma +0.025\right)\cdot \sigma +\gamma \cdot {\sigma}^2 $$
(5)

Fig. 3. Scatter plot of the intercept (α) from the quadratic fit of Eq. (3) against the reciprocal porosity of the tapped powder bed (1/ε_{σ=0}). The data for the nine example batches shown in Fig. 2 are color-coded according to the correlation coefficients of their quadratic fits

Fig. 4. The relation between the second-order (γ) and first-order (β) coefficients of the quadratic Eq. (3). The data for the nine example batches shown in Fig. 2 are color-coded according to the correlation coefficients of their quadratic fits

Based upon this simplification, Eq. (5) has γ as its only free parameter, which can be determined from the reciprocal tapped porosity (1/ε_{σ=0}) and one pair of stress (σ) and porosity (ε) values for a single tablet. The equation for γ can be written as follows:

$$ \gamma =\frac{\frac{\frac{1}{\varepsilon }-\frac{1}{\varepsilon_{\sigma =0}}}{\sigma }-0.025}{\sigma -260} $$
(6)
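A minimal Python sketch of the single-tablet calibration in Eq. (6), together with the simplified model of Eq. (5), is given below. The function names are hypothetical, and stresses are assumed to be in MPa so that they are consistent with the constants 260 and 0.025.

```python
def gamma_from_single_tablet(sf_tablet, stress_mpa, sf_tapped):
    """Eq. (6): gamma from one tablet (SF, stress) pair and the tapped SF."""
    if stress_mpa == 0.0 or stress_mpa == 260.0:
        # Eq. (6) is undefined at these two stresses (division by zero)
        raise ValueError("stress must be non-zero and different from 260 MPa")
    delta = 1.0 / (1.0 - sf_tablet) - 1.0 / (1.0 - sf_tapped)
    return (delta / stress_mpa - 0.025) / (stress_mpa - 260.0)


def reciprocal_porosity_eq5(stress_mpa, gamma, sf_tapped):
    """Eq. (5): 1/eps = 1/eps_0 + (-260*gamma + 0.025)*sigma + gamma*sigma^2."""
    beta = -260.0 * gamma + 0.025
    return 1.0 / (1.0 - sf_tapped) + beta * stress_mpa + gamma * stress_mpa ** 2


# Hypothetical calibration point: tapped SF of 0.50, one tablet at 100 MPa
gamma = gamma_from_single_tablet(0.80, 100.0, 0.50)
sf_at_150 = 1.0 - 1.0 / reciprocal_porosity_eq5(150.0, gamma, 0.50)
```

Because the denominator of Eq. (6) vanishes at 260 MPa, a calibration tablet compressed close to that stress would make γ very sensitive to measurement error; this follows from the algebra of Eq. (6) and is not a finding reported in the study.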

The fit of the simplified Eq. (5) to the nine example batches of Fig. 2 is shown in Fig. 5. Comparison of Fig. 2 with Fig. 5 shows that the simplified Eq. (5) captures the SF vs stress response of these materials reasonably well, given that it requires only a single fitting parameter. It is worth noting, however, that for two of the nine examples the simplified fit deviated noticeably from the data compared with the original quadratic fit (Fig. 2f and h vs Fig. 5f and h). Investigation showed that this deviation was caused by scatter in the correlation between the first-order (β) and second-order (γ) coefficients, as highlighted by the arrows in Fig. 4b.

Fig. 5. Stress vs 1/porosity plots for nine example batches out of the entire 136 development campaigns, with their individual fits as per Eq. (5)

Predicted vs Observed Roll Stress

Both Eqs. (3) and (5) were used to interpolate the effective roll stress, given the ribbon SF. The correlation plot between predicted and measured roll stress is shown for each equation in Fig. 6. While significant scatter exists in the data, the slope of the best fit line is near 1 for both correlations, suggesting that the coefficient of 9.1 in Eq. (2) is reasonable. The simplification from Eq. (3)/Fig. 2 to Eq. (5)/Fig. 5 was not found to result in any significant negative impact on the prediction of the roll stress. This is likely because the prediction of roll stress involved interpolation (as opposed to extrapolation) between the powder tapped density data and the tablet data. This interpolated result was moderately insensitive to the quality of the fit in most cases.
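The slope comparison described above can be illustrated with an ordinary least-squares line fit, sketched here in Python. The function name and the example values are hypothetical, and the source does not state whether its best-fit line included an intercept, so the form used here (slope plus intercept) is an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical measured (Eq. 2) and predicted roll stresses in MPa
measured = [40.0, 55.0, 70.0, 85.0, 100.0]
predicted = [44.0, 50.0, 74.0, 80.0, 104.0]
slope, intercept = fit_line(measured, predicted)
```

A slope near 1 indicates that, on average, the stress inferred from the compaction profile agrees with the stress given by Eq. (2), which is the basis for judging the constant 9.1 to be reasonable.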

Fig. 6. Correlation plot between measured and predicted roll stress as per Eq. (3) (left) and as per Eq. (5) (right) across the entire 136 development campaigns. In both plots, the black line represents the fitted line, while the blue line represents the unity line

Understanding of the Quadratic Fitting

The quadratic relationships defined in Eqs. (3) and (5) indicate that the entire compaction profile can be reasonably modeled by measuring the tapped blend porosity along with the SF and compression stress for a few tablets. Thus, pharmaceutical compaction profiles of a new molecule or new formulation can be predicted for both ribbons and tablets using very few experimental data points generated at a small scale in a lab environment. Given the increasingly common use of instrumented tablet presses, obtaining these data points along the porosity-applied stress profile is expected to be very affordable. If the intended purpose is to determine the appropriate stress to produce a tablet with a given SF, it is recommended to use a few more tablets and Eq. (3) to define the compaction profile, whereas a single tablet along with Eq. (5) is expected to be sufficient to determine the roll stress for a given ribbon SF. This is expected to save significant time, labor, and material when assessing the impact of formulation changes on the powder consolidation behavior.

Conclusions and Prospective

The analysis provided in this paper illustrates the use of Big Data to provide two key results. First, data from 136 batch records encompassing a wide array of formulation compositions and process conditions were used to quantitatively confirm the previously published relation between effective roll stress and roll force, radius, and width (Eq. (2)). Second, an empirical compaction model (Eq. (3)) is provided that captures the SF-stress behavior of pharmaceutical powders across a broad range of compression stresses and across three loading geometries associated with tapped density testing, roller compaction, and tablet compression. Additionally, a simplified compaction model (Eq. (5)) is provided that requires only two data points for calibration. Both data points can be measured quickly and inexpensively at a small scale in a laboratory setting. Although the application was demonstrated here for the case of roller compaction, such a powder compaction profile is expected to be useful in other areas as well, such as in the prediction of the force used to compress tablets to a target solid fraction [11–19].