Development of Statistical Computational Tools Through Pharmaceutical Drug Development and Manufacturing Life Cycle

Li, Fasheng; Wang, Ke

doi:10.1007/978-3-319-67386-8_8

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 218))

Included in the following conference series:

Midwest Biopharmaceutical Statistics Workshop

Abstract

Statisticians at Pfizer who support Chemistry, Manufacturing, and Controls (CMC) and Regulatory Affairs (Reg CMC) have developed many R-based statistical computational tools to enable high efficiency, consistency, and fast turnaround in their routine statistical support of drug product and manufacturing process development. Most tools have evolved into web-based applications for convenient access by statisticians and colleagues across the company. These tools cover a wide range of areas, such as product stability and shelf life or clinical use period estimation, process parameter criticality assessment, and design space exploration through experimental design and parametric bootstrapping. In this article, the general components of these R-programmed web-based computational tools are introduced, and their use is demonstrated through one application that estimates a drug product shelf life from stability data.

1 Introduction

Throughout the regulatory chemistry, manufacturing, and controls (Reg CMC) development life cycle of a drug product, a series of compendial requirements, quality standards, and performance criteria must be established and met. Data collection, analysis, and reporting for chemical process, formulation and manufacturing process, and analytical method development usually take years. Common practice in the pharmaceutical industry for “optimizing” product compositions, manufacturing processes, and analytical methods is to apply designed experiments (DOEs), statistical models, and statistical sampling techniques. The data generated, which can be voluminous, are usually analyzed or evaluated by statisticians or statistically trained professionals using commercial statistical software such as Design Expert, SAS, or SAS-JMP. With the demand for statistical applications increasing and the number of trained statisticians limited, it is desirable to develop computational tools that conduct routine statistical analyses more efficiently and consistently. Such tools promote consistency, efficiency, and reproducibility in routine statistical analysis, and version control, monitoring, and regular maintenance are an integral part of developing them. These features align well with the requirement of Title 21 Code of Federal Regulations (CFR) Part 11 that software systems be readily available for, and subject to, FDA inspection [1]. Working as statisticians at Pfizer supporting pharmaceutical development and Reg CMC, we have identified many opportunities and areas that benefit from statistical computational tools. Most tools are developed using a language such as R and have evolved into web-based applications for easy access by statisticians and colleagues at Pfizer. This article introduces the general requirements and structure of web-based statistical tools. The computational application is demonstrated through one tool that evaluates product stability and predicts shelf life or clinical use period.

2 Overview of Available Web-Based Statistical Tools

2.1 Introduction of Components of Web-Based Statistical Applications

Figure 1 illustrates the three standard components of a typical web-based application: the computer server, the GUI platform server, and the application user. In practice, applet authors use the application servers to construct the computation script and the graphical user interface (GUI) of the application, and ensure successful communication between the application servers (usually reached through a web browser) and the computer servers. General users only need to compile their data into the format required by the application. For a statistical computational application, additional software systems, such as R and SAS, need to be installed on the computer server for statistical analysis. The following section provides an overview of the web-based statistical applications developed by Pfizer pharmaceutical development and Reg CMC statisticians.

2.2 Overview of Web-Based CMC Development/Regulatory Statistics Applications

Most computational tools developed at Pfizer to support analytical method, product, and process development are written as scripts in R, SAS, MATLAB, JMP, or Minitab, or as MS Excel spreadsheet templates. One example is drug product shelf life prediction. Long-term stability data are collected under various storage conditions per ICH Q1A [3] and are evaluated per ICH Q1E [2]. The statistical analysis is coded in SAS and R to generate summary results and plots.

Commercial software packages remain important tools for statisticians carrying out data analysis. However, individual use of the software presents issues of limited portability, version control, and reproducibility. With support from the Pfizer Information Technology group, statisticians have been able to turn the individual pieces of code into web-based applications. Figure 2 illustrates various web-based statistical applications developed by the CMC statisticians at Pfizer and the areas they target throughout the life cycle of drug development and manufacturing. These applications are searchable and accessible to Pfizer colleagues globally.

3 An Example Web-Based Statistical Computation Tool

Below, details are provided on the development and usage of one of the web-based applications listed in Fig. 2, Stability & Shelf Life Prediction.

For this application, assume that the stability data are collected from a registration stability program that follows ICH Q1A guidelines or from a clinical stability program. Most registration stability programs have three batches per combination of strength, packaging configuration, and storage condition, whereas a clinical stability program usually has only one batch. The online application for analyzing stability data is programmed in R and follows ICH Q1E guidance for a specific combination of product, strength, package, and storage condition. The shelf life is determined by the decision criteria in the guidance. Clinical stability data are analyzed using a simple linear regression model, and the use period is determined according to an internal criterion. For example, the use period of a clinical material is the shorter of (i) the time at which the 95% confidence limit intersects the specification limit and (ii) the real stability time plus 12 months, or longer if statistically supported. Therefore, the shelf life or clinical use period can be determined by a two-step procedure: model selection, followed by projection of the shelf life/use period.

3.1 Statistical Model Selection

For the statistical analysis of typical registration stability data, the following model selection procedure is performed based on the poolability of the data from the three batches. Assume $Y_{b} = (y_{b1}, y_{b2}, \ldots, y_{bT})$ are the stability data for an attribute at time points t = 1, 2, …, T months for batch b = 1, 2, …, B, for a given combination of strength, package type, and storage condition.

(a) Fit a full model (the SSSI model: separate slopes and separate intercepts):

$$ y = \beta_{0} + \beta_{1} \cdot \text{time} + \beta_{21} \cdot \text{batch} + \beta_{12} \cdot \text{time} \cdot \text{batch} + \varepsilon \tag{1} $$

where the error ε is normally distributed with mean 0 and standard deviation σ. This model is referred to as the separate slopes and separate intercepts (SSSI) model, as it allows a different slope and a different intercept for each batch.

Decision: If the p-value of the interaction of time and batch (time*batch) is <0.25, STOP and use Eq. (1) for the shelf life projection; if the p-value of the interaction of time and batch (time*batch) is ≥0.25, GOTO step (b).

(b) Fit a reduced model (the CSSI model: common slope and separate intercepts):

$$ y = \beta_{0} + \beta_{1} \cdot \text{time} + \beta_{21} \cdot \text{batch} + \varepsilon \tag{2} $$

This model is referred to as the common slope and separate intercepts (CSSI) model, as it uses one slope estimate for all batches but allows a different intercept for each batch.

Decision: If the p-value of batch is <0.25, STOP and use Eq. (2) for the shelf life projection; if the p-value of batch is ≥0.25, GOTO step (c).

(c) Fit a reduced model (the CSCI model: common slope and common intercept):

$$ y = \beta_{0} + \beta_{1} \cdot \text{time} + \varepsilon \tag{3} $$

This model is referred to as a common slope and common intercept model (CSCI), since the same slope and intercept are used for all batches.

Decision: Eq. (3) is used for the shelf life projection.

The procedure described above for the statistical analysis of long-term registration stability data is summarized as a flow chart in Fig. 3. For typical one-batch clinical stability data, a simple linear regression model is used.
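The decision logic of steps (a) to (c) can be sketched in a few lines. The sketch below is illustrative only and is not the authors' R implementation: the function name `select_stability_model` and its arguments are hypothetical, and the two p-values are assumed to come from fits of Eq. (1) and Eq. (2) carried out elsewhere.

```python
from typing import Optional

POOLING_ALPHA = 0.25  # significance level used in steps (a) and (b)


def select_stability_model(p_time_by_batch: float,
                           p_batch: Optional[float] = None,
                           alpha: float = POOLING_ALPHA) -> str:
    """Return "SSSI", "CSSI", or "CSCI" following steps (a) to (c).

    p_time_by_batch: p-value of the time*batch term in the full model, Eq. (1).
    p_batch: p-value of the batch term in the reduced model, Eq. (2);
        only needed when the interaction is not significant.
    """
    # Step (a): significant interaction, keep separate slopes and intercepts.
    if p_time_by_batch < alpha:
        return "SSSI"
    # Step (b): Eq. (2) must have been fitted to obtain p_batch.
    if p_batch is None:
        raise ValueError("p_batch is required when p_time_by_batch >= alpha")
    if p_batch < alpha:
        return "CSSI"
    # Step (c): neither term is significant, use the fully pooled model.
    return "CSCI"


if __name__ == "__main__":
    print(select_stability_model(0.10))        # interaction significant
    print(select_stability_model(0.40, 0.05))  # only batch significant
    print(select_stability_model(0.40, 0.60))  # nothing significant
```

Keeping the threshold as a parameter makes the 0.25 level explicit and lets the same function be reused if a different pooling level is justified.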

3.2 Shelf Life or Use-Period Projection

Once the regression model is determined, the 95% confidence interval (CI) can be calculated for any stability time point. The predicted shelf life/use period is the earliest time point at which the confidence limit intersects the specification limit of the product. Note that the predictions and 95% CIs must be extrapolated in order to determine a shelf life/use period beyond the maximum storage time of the stability data. Per ICH Q1E, the proposed shelf life may extend to at most twice the maximum storage time (T_max) when T_max is < 12 months, or to 12 months beyond T_max when T_max is ≥ 12 months. Figure 4 illustrates how to establish the shelf life for an example data set. For this set of stability data, a separate slopes and separate intercepts model is selected, and the shelf life is determined by the limiting lot (i.e., Lot 3). Lot 3 is limiting because it has the fastest growth of impurity A (the largest slope), so its 95% confidence limit intersects the specification limit of 1% earliest, at 32.1 months. Therefore, 32.1 months (or 32 months) is the longest shelf life that can be proposed. Practically, a shelf life of either 24 months or 30 months can be proposed for this product based on this set of data.
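As an illustration of the projection step, the sketch below handles the simplest case only: a single regression line (one batch, or fully pooled data) for an attribute that increases over time against an upper specification limit. It is not the authors' R code. All function names are hypothetical, the critical value `t_crit` is supplied by the caller (for example from a t table, since the Python standard library has no t quantile function), and whether a one-sided or two-sided limit applies is left to the analyst.

```python
import math
from typing import Optional, Sequence


def max_supported_shelf_life(t_max: float) -> float:
    """Longest shelf life (months) allowed by the extrapolation rule:
    twice T_max when T_max < 12, otherwise T_max + 12."""
    return 2.0 * t_max if t_max < 12 else t_max + 12.0


def upper_confidence_limit(times: Sequence[float], values: Sequence[float],
                           t_crit: float, t_new: float) -> float:
    """Upper confidence limit of the mean response at t_new for a simple
    linear regression fitted by ordinary least squares."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in times)
    slope = sum((t - t_bar) * (y - y_bar)
                for t, y in zip(times, values)) / s_tt
    intercept = y_bar - slope * t_bar
    sse = sum((y - intercept - slope * t) ** 2
              for t, y in zip(times, values))
    sigma = math.sqrt(sse / (n - 2))  # residual standard deviation
    half_width = t_crit * sigma * math.sqrt(
        1.0 / n + (t_new - t_bar) ** 2 / s_tt)
    return intercept + slope * t_new + half_width


def projected_shelf_life(times: Sequence[float], values: Sequence[float],
                         spec_limit: float, t_crit: float,
                         step: float = 0.1) -> Optional[float]:
    """Last time on a grid of `step` months, starting at 0, for which the
    upper confidence limit stays at or below spec_limit, capped at the
    extrapolation limit. Returns None if the limit is exceeded at time 0."""
    cap = max_supported_shelf_life(max(times))
    last_ok = None
    for i in range(int(round(cap / step)) + 1):
        t = i * step
        if upper_confidence_limit(times, values, t_crit, t) > spec_limit:
            break
        last_ok = t
    return last_ok


if __name__ == "__main__":
    # Made-up impurity data (% w/w) at 0 to 12 months, for illustration only.
    months = [0, 3, 6, 9, 12]
    impurity = [0.10, 0.17, 0.21, 0.29, 0.33]
    # 2.353 is the one-sided 95% t quantile with n - 2 = 3 degrees of freedom.
    print(projected_shelf_life(months, impurity, spec_limit=0.5, t_crit=2.353))
```

A grid search is used instead of solving for the intersection in closed form because it is easy to check and, by stopping at the first exceedance, errs on the conservative side. For the SSSI and CSSI models the same search would be repeated per batch, with the pooled residual variance and degrees of freedom of the selected model, and the shortest result taken.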

3.3 The Internal Web-Based Online Application

Both long-term registration stability data and clinical stability data are collected routinely for all filed products. The repeated stability data analysis, including stability data plotting and drug product shelf life prediction, necessitated the development of a web-based application tool to standardize these statistical activities.

The web application for Registration and Clinical Stability Data Analysis and Shelflife/Use Period Prediction is programmed in R. A graphical user interface (GUI) is built to allow users to upload the relevant stability data to the program for analysis. The GUI of this application is displayed in Figs. 5 and 6 where the main interface contains links to various features, such as the user manual, example data sets in required formats, dialogues for uploading data, and choices of analyses.

Once the stability data are uploaded and the statistical analyses and parameters are chosen, the job is submitted and run in the background on the high-performance computing (HPC) cluster. As soon as the job is finished, users can view the results (including tables and plots) in a web browser (e.g., Internet Explorer, Chrome). The application also allows users to download tables and graphs and to consolidate the results into a PDF report. Figure 6a, b shows snapshots of the output in a web browser.

In summary, the web-based statistical application “registration and clinical stability data analysis and shelflife/use period prediction” offers the following benefits and features:

Aligns the statistical analyses of long-term stability data
Offers quick and convenient turnaround for analyzing stability data and generating shelf life plots, tables, and a summary report
Allows easy maintenance and feature updates through the version-controlled R program
Runs jobs in the background on an HPC cluster or cloud computers

4 Conclusions

The benefits and features of web-based statistical applications have been demonstrated through one selected program, “registration and clinical stability data analysis and shelflife/use period prediction”. By deploying web-based statistical applications, statisticians and scientists supporting drug development and Reg CMC areas can carry out their routine statistical activities with increased consistency, improved efficiency, better alignment of statistical analyses, and easily retrievable results. These web-based statistical applications can standardize statistical approaches, centralize, validate, and verify software pieces, and utilize high-performance and cloud computing resources.

References

1. US Food and Drug Administration, Department of Health and Human Services. 21 CFR 11: Electronic records; electronic signatures (2017)
2. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. Q1E: Evaluation for Stability Data (2004)
3. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. Q1A (R2): Stability Testing of New Drug Substances and Products (2003)

Acknowledgements

The authors acknowledge Kimberly Vukovinsky, Robert J. Timpano, and colleagues in the Pharmaceutical Science and Manufacturing Statistics group at Pfizer for their generous support in evaluating and commenting on the web-based applications.

Author information

Authors and Affiliations

Pharmaceutical Science and Manufacturing Statistics, Pfizer Inc., MS 8220-2356, Eastern Point Road, Groton, CT, 06340, USA
Fasheng Li & Ke Wang

Corresponding author

Correspondence to Fasheng Li.

Editor information

Editors and Affiliations

Statistical Innovation and Consultation group, Takeda Pharmaceuticals, Cambridge, MA, USA
Ray Liu
Division of Biometrics VI, CDER, U.S. Food and Drug Administration, Silver Spring, MD, USA
Yi Tsong

Cite this paper

Li, F., Wang, K. (2019). Development of Statistical Computational Tools Through Pharmaceutical Drug Development and Manufacturing Life Cycle. In: Liu, R., Tsong, Y. (eds) Pharmaceutical Statistics. MBSW 2016. Springer Proceedings in Mathematics & Statistics, vol 218. Springer, Cham. https://doi.org/10.1007/978-3-319-67386-8_8

DOI: https://doi.org/10.1007/978-3-319-67386-8_8
Published: 13 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67385-1
Online ISBN: 978-3-319-67386-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
