INTRODUCTION

In January 2011, the US FDA issued “Process Validation: General Principles and Practices” (the 2011 FDA Guidance). This guidance introduces the process validation lifecycle approach (1). One aspect stressed by the FDA is that the traditionally accepted three batches evaluated during the process performance qualification (PPQ) stage may no longer be sufficient to confirm that the manufacturing process, as designed, is capable of reproducible commercial manufacturing (2). The 2011 FDA Guidance does not explicitly indicate a regulatory expectation for the number of process qualification batches; manufacturers are expected to make a rational decision based on product knowledge and process understanding. Activities in Stage 2 PPQ should be based on well-grounded scientific justification, an appropriate level of product and process understanding, and adequate demonstration of process control (3). The developed approach and criteria should include a description of the statistical methods to be used in analyzing all collected data (2). The 2011 FDA Guidance clearly indicates that the goal of validating any manufacturing process is to establish scientific evidence that the process is reproducible and will consistently deliver quality products (1,2). The 2011 FDA Guidance states:

The number of samples should be adequate to provide sufficient statistical confidence of quality both within a batch and between batches.

This statement indicates a need to understand both within- and between-batch variability. Several recent articles have been published that discuss the challenge of justifying a statistical model for determining a sufficient number of batches (3–5). Bryder et al. provide an excellent overview of the issue and raise a call for discussion in their ISPE discussion paper (3). Wiles provides an example of a statistically sound method for determining when a valid number of batches has been acquired based on risk assessment and a calculation of process capability (5). That method determines the total number of samples required to attain a pre-determined confidence, which is subtly different from determining the number of batches. The approach described here differs in that the expected sources of variation are broken down: some of the variation comes from intra-batch sources and some from inter-batch sources. The analysis uses the manufacturer’s batch-to-batch variation in the estimation, whereas other approaches do not consider batch-to-batch variation.

This paper describes a statistical approach to determine and justify a minimum number of batches that should be evaluated for Stage 2 PPQ in compliance with the 2011 FDA Guidance. The described statistical tool projects the number of batches that should establish sufficient scientific evidence that the process is robust and will consistently deliver quality products. The approach uses previously collected product-specific information (e.g., data generated from Stage 1 batches produced for the purposes of clinical trials, submission/registration, stability, or process scale-up/demonstration) and historical batch-to-batch process information (e.g., typical variability observed for this product/process type based on active content) across multiple critical quality attributes (e.g., content uniformity, assay, and/or dissolution) to provide a science- and risk-based projection of the number of batches required for Stage 2 PPQ. Note that both the product-specific information and the historical batch-to-batch process information may vary significantly among different manufacturing facilities (due to personnel, operation, process, equipment, raw material, and other factors). Scientific knowledge and a sound understanding of the process and the product are critical in facilitating the information acquisition. The strategy for information collection, classification, and analysis should be developed appropriately to allow for science- and risk-based decision-making.

The basic strategy described in this article is to determine the minimum number of batches for which a projected confidence interval of the product’s critical quality attributes resides completely within the desired specifications. That is, based on “current” information, it is the number of batches that, upon evaluation, should provide sufficient data to reach a statistically confident conclusion about the product’s critical quality attributes. For a product quality attribute to comply with current specifications, its tested mean should be as close as possible to the center of the specification and its standard deviation should be as small as possible, under the assumption of a normal distribution. Based on this assumption, the approach described here constructs a confidence interval of the product quality attribute measurements that is a combination of the confidence interval of the process mean and the confidence interval of the process standard deviation. Because each specific quality attribute is framed differently, often with distinct requirements, the form of the equations used to determine confidence intervals is tailored per quality attribute. For example, USP <905> Uniformity of Dosage Units indicates computation of an acceptance value (AV) that must be not more than 15.0 to meet the stage 1 (L1) criterion. A confidence interval is then estimated for each number of potential PPQ batches based on previously collected product-specific data (i.e., the magnitude of the “within” or intra-batch statistics) and historical evidence of batch-to-batch variability for comparable products (i.e., “between” or inter-batch). Per this approach, the projected number of PPQ batches is determined as the point where the entire confidence interval resides within the specification limits. This is illustrated below for the dosage uniformity AV quality attribute.

figure a

Though the form of the equation is dependent on the specific quality attribute, comparable derivations can be accomplished for other attributes such as assay and dissolution or for other pharmaceutical products, such as liquid dose formulations. While the described approach provides statistical justification and projection for number of PPQ batches required, this assessment does not supplant the need to produce the PPQ batches or review the data generated from these batches.

Details of the Approach

The total or overall variability of a process can be represented as a summation of individual component variation. This may be mathematically denoted as

$$ s_{\mathrm{total}}^{2}=s_{\mathrm{batch\text{-}batch}}^{2}+s_{\mathrm{intra\text{-}batch}}^{2}+s_{\mathrm{sampling}}^{2}+s_{\mathrm{analytical}}^{2}+\dots $$
(1)

In this example, the total variation comprises variation derived from batch-to-batch, intra-batch, sampling, and analytical variability sources. Such sources of variation are typical for a process. In general, Stage 1 (process design) provides an assessment of most variation sources, with the notable exception of the batch-to-batch (between- or inter-batch) variability. Thus, data from Stage 1 provide a reasonable measure of product intra-batch performance. However, it is not possible to assess the batch-to-batch variability until several batches of product are produced and analyzed. To approximate this component, it is reasonable to assert that a similar process/product will exhibit similar batch-to-batch characteristics, and tabulated evidence from historical records can provide a good estimate (see the section on “BATCH-TO-BATCH VARIABILITY DETERMINATION” below).
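As a minimal illustration of Eq. 1, the following Python sketch (with invented numbers, not data from this study) combines an intra-batch estimate from hypothetical Stage 1 results with an assumed historical batch-to-batch standard deviation; the sampling and analytical terms are assumed to be negligible or folded into the intra-batch term.

```python
import numpy as np

# Hypothetical Stage 1 content-uniformity results (% label claim); illustrative only
stage1_results = np.array([99.2, 100.4, 98.7, 101.1, 99.8, 100.9,
                           99.5, 100.2, 98.9, 100.6])

s_intra = stage1_results.std(ddof=1)   # intra-batch estimate from Stage 1 data
s_batch_batch = 1.2                    # assumed historical batch-to-batch SD (%)
s_sampling = 0.0                       # assumed negligible / folded into s_intra
s_analytical = 0.0                     # assumed negligible / folded into s_intra

# Eq. 1: variances (not standard deviations) are additive
s_total = np.sqrt(s_batch_batch**2 + s_intra**2 + s_sampling**2 + s_analytical**2)
print(f"s_intra = {s_intra:.2f}%, s_total = {s_total:.2f}%")
```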

Inference about the true underlying population parameters from a limited amount of sampling is made statistically through the use of confidence intervals. The confidence interval for the mean is defined by the following equation:

$$ \mu =\overline{x}\pm {t}_{\left(\frac{\alpha }{2},n-1\right)}\frac{s}{\sqrt{n}} $$
(2)

and for the standard deviation,

$$ s\sqrt{\frac{n-1}{\chi_{\left(n-1,1-\frac{\alpha }{2}\right)}^2}}\le \sigma \le s\sqrt{\frac{n-1}{\chi_{\left(n-1,\frac{\alpha }{2}\right)}^2}} $$
(3)

wherein the Greek symbols μ and σ represent the true underlying population mean and standard deviation. The measured mean is represented by \( \overline{x} \) and the measured standard deviation by s. Note in Eqs. 2 and 3 that the chi-square and t distributions are functions of the number of samples (n).

Equation 3 describes how the confidence interval of the standard deviation narrows with an increasing number of acquired batches. The number of PPQ batches required is the number of batches at which the confidence interval of the product quality attribute measurements, which is a combination of the confidence interval of the process mean and the confidence interval of the process standard deviation, resides completely within the specification range.
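The confidence intervals of Eqs. 2 and 3 can be computed directly with standard statistical libraries. The following Python sketch uses scipy with illustrative data only:

```python
import numpy as np
from scipy import stats

def mean_ci(x, alpha=0.05):
    """Two-sided confidence interval for the mean (Eq. 2)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    half_width = stats.t.ppf(1 - alpha / 2, n - 1) * x.std(ddof=1) / np.sqrt(n)
    return x.mean() - half_width, x.mean() + half_width

def sd_ci(s, n, alpha=0.05):
    """Two-sided confidence interval for the standard deviation (Eq. 3)."""
    lower = s * np.sqrt((n - 1) / stats.chi2.ppf(1 - alpha / 2, n - 1))
    upper = s * np.sqrt((n - 1) / stats.chi2.ppf(alpha / 2, n - 1))
    return lower, upper

# Illustrative data only
data = [99.2, 100.4, 98.7, 101.1, 99.8, 100.9]
print(mean_ci(data))
print(sd_ci(np.std(data, ddof=1), len(data)))
```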

Content Uniformity

Section <905> “Uniformity of Dosage Units” in the United States Pharmacopeia provides guidance on assessing the consistency of dosage units (6). The guidance provides a two-stage acceptance criterion through computation of an acceptance value (AV). USP <905> describes multiple cases depending on the measured average of the sampled units. The general form of the AV equation is

$$ \mathrm{AV}=\left|M-\overline{x}\right|+ks $$

Note that while USP <905> provides more detail, this report describes only a simplified case in which the target dosage value is ≤101.5% and the measured mean lies between 98.5 and 101.5%, so that M = \( \overline{x} \) and the \( \left|M-\overline{x}\right| \) term vanishes. The simplified case is presented here for illustration and clarity only. For acceptance stage L1 (n = 10, k = 2.4), the equation then reduces to AV = 2.4s. Using Eqs. 1 and 3, the following equation can be derived to determine the AV upper confidence limit for a progressive number of batches (\( N_{\mathrm{B}} \)).

$$ s_{\left(N_{\mathrm{B}}\right)}^{\mathrm{hi}}=\sqrt{{\left\{ s_{\mathrm{B}\text{-}\mathrm{B}}\sqrt{\frac{N_{\mathrm{B}}-1}{\chi_{\left(N_{\mathrm{B}}-1,\frac{\alpha }{2}\right)}^{2}}}\right\}}^{2}+{\left\{ s_{0}\sqrt{\frac{N_{0}-1}{\chi_{\left(N_{0}-1,\frac{\alpha }{2}\right)}^{2}}}\right\}}^{2}} $$

Here, \( s_{\mathrm{B}\text{-}\mathrm{B}} \) represents the batch-to-batch standard deviation of comparable processes determined from historical information, \( s_0 \) represents the standard deviation from previously collected data on the specific process (e.g., Stage 1 efforts), and \( N_0 \) represents the number of data points used to determine \( s_0 \). α represents the significance level used for the determination (typically 0.05, corresponding to 95% confidence).
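A minimal sketch of this content uniformity calculation is shown below (Python, with arbitrary illustrative values for \( s_{\mathrm{B}\text{-}\mathrm{B}} \), \( s_0 \), and \( N_0 \) rather than real product data): the upper confidence limit on s is converted to an AV upper limit via AV = 2.4s, and the projected minimum number of PPQ batches is the smallest \( N_{\mathrm{B}} \) for which this limit is not more than 15.0.

```python
import numpy as np
from scipy import stats

def av_upper_limit(n_batches, s_bb, s0, n0, alpha=0.05, k=2.4):
    """Upper confidence limit on the L1 acceptance value AV = k*s for a
    given number of PPQ batches, per the combined-variance equation above."""
    chi2_b = stats.chi2.ppf(alpha / 2, n_batches - 1)
    chi2_0 = stats.chi2.ppf(alpha / 2, n0 - 1)
    s_hi = np.sqrt((s_bb * np.sqrt((n_batches - 1) / chi2_b)) ** 2
                   + (s0 * np.sqrt((n0 - 1) / chi2_0)) ** 2)
    return k * s_hi

# Assumed illustrative inputs (not the article's data):
s_bb, s0, n0 = 1.0, 2.0, 30   # historical B-B SD, Stage 1 SD, Stage 1 sample count
for nb in range(2, 11):
    print(nb, round(av_upper_limit(nb, s_bb, s0, n0), 2))
# The projected minimum number of PPQ batches is the smallest nb whose
# AV upper limit is <= 15.0.
```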

Assay

There is typically only one assay sample (i.e., a composite of at least ten dosage units) analyzed per batch. Thus, the impact of intra-batch variation on assay is considered less significant in assessing the overall variation. As with the content uniformity data, pre-existing batch data were used to determine the inter-batch variability (\( s_{\mathrm{B}\text{-}\mathrm{B}} \)). Referring to Eqs. 2 and 3, the form of the confidence limits for a progressive number of batches (\( N_{\mathrm{B}} \)) becomes

$$ {\overline{x}}_{N_{\mathrm{B}}}^{\mathrm{hi/lo}}={\overline{x}}_{0}\pm t_{\left(1-\frac{\alpha }{2},N_{\mathrm{B}}-1\right)}\frac{s_{\mathrm{B}\text{-}\mathrm{B}}\sqrt{\frac{N_{\mathrm{B}}-1}{\chi_{\left(N_{\mathrm{B}}-1,\frac{\alpha }{2}\right)}^{2}}}}{\sqrt{N_{\mathrm{B}}}} $$

Here, \( {\overline{x}}_0 \) represents the average of all assay values determined during Stage 1 efforts. Because assay has a two-sided specification (typically 95 to 105%), both the upper and lower confidence interval limits must reside within the specification requirement to project the number of batches that will be sufficient for evaluation. In the example illustrated below, three evaluation batches should be sufficient to assure a robust process.

figure b
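The assay projection can be sketched in the same way. The inputs below (Stage 1 mean, historical batch-to-batch standard deviation) are arbitrary illustrations, not the article's data, and the projected number of batches depends entirely on the actual product-specific and historical values.

```python
import numpy as np
from scipy import stats

def assay_mean_ci(n_batches, xbar0, s_bb, alpha=0.05):
    """Projected hi/lo confidence limits on the assay mean for a given
    number of PPQ batches, per the equation above."""
    s_hi = s_bb * np.sqrt((n_batches - 1)
                          / stats.chi2.ppf(alpha / 2, n_batches - 1))
    half_width = stats.t.ppf(1 - alpha / 2, n_batches - 1) * s_hi / np.sqrt(n_batches)
    return xbar0 - half_width, xbar0 + half_width

# Assumed illustrative inputs (not the article's data):
xbar0, s_bb = 100.2, 0.8          # Stage 1 assay mean (%), historical B-B SD (%)
spec_lo, spec_hi = 95.0, 105.0    # typical two-sided assay specification
for nb in range(2, 11):
    lo, hi = assay_mean_ci(nb, xbar0, s_bb)
    print(nb, round(lo, 2), round(hi, 2), spec_lo <= lo and hi <= spec_hi)
# The projected minimum number of PPQ batches is the smallest nb for which
# both limits reside within the specification.
```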

Dissolution

Dissolution is a considerably more complicated analysis than dosage uniformity or assay. Immediate-release dissolution follows a three-stage acceptance procedure (7). The chance that a particular product will meet the overall stage-wise criteria can be defined by the probability of acceptance (\( P_{\mathrm{a}} \)). The stage-wise rules essentially create a complicated equation that transforms the mean and standard deviation of the acquired dissolution data into an acceptance probability (8–10). Among other approaches, this equation can be solved using a Monte Carlo computation wherein the results are stored in a series of lookup tables (the dissolution Monte Carlo transformation).

Despite the complexity of the dissolution acceptance probability equation, the defined approach to determining the suggested number of batches based on dissolution data is similar to that described for content uniformity. The specific product dissolution statistics are combined with historical batch-to-batch variability from similar products to determine confidence limits for both the mean and standard deviation statistics using Eqs. 1, 2, and 3. The upper and lower limits are used appropriately along with the dissolution Monte Carlo transformation to determine a probability of acceptance. The computed confidence interval is compared to an acceptance criterion to assure, with confidence, that the produced PPQ batches will meet the required specification limits. An example plot wherein the dissolution criterion is set at a 3-sigma level (acceptance probability greater than 99.87%) is shown below. This example suggests that a minimum of three batches is required for Stage 2 PPQ.

figure c
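For readers who wish to reproduce the probability-of-acceptance piece, the following is a minimal Monte Carlo sketch of the USP <711> staged acceptance test for immediate-release products (S1: each of 6 units ≥ Q + 5; S2: mean of 12 units ≥ Q and no unit < Q − 15; S3: mean of 24 units ≥ Q, no more than 2 units < Q − 15 and none < Q − 25), assuming normally distributed unit results. The article's full approach additionally combines such \( P_{\mathrm{a}} \) values (stored in lookup tables) with the confidence limits from Eqs. 1, 2, and 3, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_acceptance(mean, sd, q=80.0, n_sim=50_000):
    """Monte Carlo estimate of the probability of passing the USP <711>
    staged dissolution test (S1/S2/S3), assuming normal unit results."""
    passes = 0
    for _ in range(n_sim):
        units = rng.normal(mean, sd, 24)           # draw all 24 potential units
        if units[:6].min() >= q + 5:               # S1: each of 6 units >= Q + 5
            passes += 1
            continue
        s2 = units[:12]
        if s2.mean() >= q and s2.min() >= q - 15:  # S2: mean >= Q, none < Q - 15
            passes += 1
            continue
        if (units.mean() >= q                      # S3: mean >= Q, at most 2 units
                and (units < q - 15).sum() <= 2    #     below Q - 15, none < Q - 25
                and units.min() >= q - 25):
            passes += 1
    return passes / n_sim

# Illustrative only: acceptance probability for an assumed mean/SD profile
print(prob_acceptance(mean=85.0, sd=3.0))
```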

BATCH-TO-BATCH VARIABILITY DETERMINATION

A critical factor in the overall determination of suggested number of PPQ batches is the batch-to-batch (or between batch) variability. Prior to the PPQ campaign, this factor for the particular product has yet to be determined. However, data of comparable campaigns provide a reasonable indication of the anticipated batch-to-batch variability. The magnitude of the batch-to-batch variability is potentially dependent on several different factors; one factor in particular is the API content or product label claim.

To gain an understanding of batch-to-batch variability, historical dosage uniformity and dissolution data from over 200 validation campaigns, encompassing over 700 individual batches and approximately 100 distinct molecules, was compiled. The batch-to-batch variability was extracted from each campaign by separating the intra-batch variability from the overall total campaign variability (Eq. 1). The distribution of the batch-to-batch standard deviation across campaigns is shown in the following two plots for content uniformity (CU) and dissolution (Disso). The data exhibit a profile comparable to a chi-square distribution (i.e., non-normal and skewed), which is the distribution profile typically expected for a collection of standard deviation data.

figure d
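One plausible way to perform the separation described above is a balanced one-way variance-components (method-of-moments) calculation; the article does not specify the exact method used, so the following Python sketch with invented data is only illustrative.

```python
import numpy as np

def batch_to_batch_sd(batches):
    """Estimate the between-batch SD for a campaign by separating the
    intra-batch component from the total variation (one-way random-effects
    variance components; assumes equal numbers of units per batch)."""
    batches = [np.asarray(b, dtype=float) for b in batches]
    n = len(batches[0])                                    # units per batch
    batch_means = np.array([b.mean() for b in batches])
    ms_within = np.mean([b.var(ddof=1) for b in batches])  # intra-batch mean square
    ms_between = n * batch_means.var(ddof=1)               # between-batch mean square
    var_bb = max(0.0, (ms_between - ms_within) / n)
    return np.sqrt(var_bb)

# Illustrative campaign: three batches of content-uniformity results (% label claim)
campaign = [
    [99.0, 100.5, 98.8, 101.0, 99.6, 100.2],
    [101.2, 102.0, 100.8, 101.6, 102.3, 101.1],
    [98.1, 99.0, 97.9, 98.6, 99.3, 98.4],
]
print(round(batch_to_batch_sd(campaign), 2))
```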

The data was segregated and analyzed to assess the relative influence of several factors (e.g., manufacturing process, strength, batch size). A particular factor that appeared correlated with batch-to-batch variability was the product active content or strength. The following plots show that the batch-to-batch variability for both content uniformity (CU) and dissolution increases significantly for low-strength products (<1 mg). Thus, one relevant model input factor for batch-to-batch variability is active content or strength.

figure e

Summary statistics from the above historical evidence are stored in reference tables and are used as a reasonable approximation of the batch-to-batch component of variation when justifying the number of batches that should be evaluated during PPQ to provide reasonable confidence that the evaluated process is robust. Once these PPQ batches are manufactured and tested, it is prudent to compare the estimated batch-to-batch variation with the batch-to-batch variation actually observed across the PPQ batches.
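As a crude self-contained illustration of that check (invented numbers, and ignoring the intra-batch and analytical contributions that remain in batch-level results), the reference-table estimate can be compared with the spread observed across the PPQ batch results:

```python
import numpy as np

s_bb_assumed = 1.0                                 # hypothetical reference-table value (%)
ppq_batch_means = np.array([99.8, 100.6, 99.3])    # hypothetical PPQ batch results (%)
s_bb_observed = ppq_batch_means.std(ddof=1)        # rough observed spread; still
                                                   # includes intra-batch/analytical noise
print(f"assumed {s_bb_assumed:.2f}% vs observed {s_bb_observed:.2f}%")
```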

It should be noted that both the product-specific information and the historical batch-to-batch process information may vary significantly among different manufacturing facilities (due to personnel, operation, process, equipment, raw material, and other factors). The sources of variation for other types of manufacturing technologies will also differ. In order to statistically justify how many validation batches should be produced, a company should gain an understanding of the variation observed across its various processes based on its own historical data.

CONCLUSION

The approach documented in this article combines data from earlier process validation lifecycle stages with the described statistical model to estimate the number of PPQ batches that should provide sufficient information to make a science- and risk-based decision on product robustness. The approach is based upon estimation of statistical confidence from current product knowledge (Stage 1), historical variability for similar products/processes (batch-to-batch), and product characteristics such as label claim strength. The analysis determines confidence intervals for the measured quality attributes and compares them with the specifications. The projected minimum number of PPQ batches required will vary depending on the product, process understanding, and attributes. This approach helps ensure that sufficient scientific data is generated to demonstrate process robustness as desired by the 2011 FDA Guidance.