Introduction

Accurate prediction of human pharmacokinetics (PK) and dose is a major goal in drug discovery, because in vivo exposure of a compound is critical for the development of PK/PD (pharmacodynamics) relationships, establishing therapeutic index (TI) and prediction of drug-drug interactions (1). Clearance (2), volume of distribution (3, 4), rate and extent of oral absorption (5) and distribution in target tissues (e.g., unbound distribution coefficient (Kpuu) of brain) (6,7,8) are the key parameters determining PK and tissue exposure of a compound. Among these parameters, clearance is the most sensitive property of drug candidates, since it not only influences dose, but also impacts half-life which determines dosing frequency (2, 9, 10). Many approaches have been developed to predict human clearance based on the elimination mechanisms involved (1, 11). For hepatic metabolism, human liver microsomes (HLM) and human hepatocytes (HHEP) are the most commonly used systems to predict human hepatic clearance (12,13,14). For other clearance mechanisms, such as renal clearance (1, 15, 16), biliary clearance (17) and transporter-mediated clearance (18,19,20), different approaches are applied to estimate human clearance. Some of the examples include allometric scaling of animal unbound renal clearance to predict human renal clearance (1, 15) or PBPK (physiologically based pharmacokinetic) modeling (16), using sandwich-cultured human hepatocyte data or animal data to predict human biliary clearance (17), and applying uptake rate, PBPK modeling, Kpuu or cynomolgus monkey single species scaling to predict transporter-mediated clearance (18,19,20,21,22,23). To accurately predict human clearance, it is essential to understand the clearance mechanisms involved (i.e., hepatic metabolism, hepatic uptake, biliary secretion, or renal clearance), so that appropriate methods can be applied to the predictions. 
Besides in vitro assays with human reagents (e.g., HLM, HHEP, and transporter assays), in vivo PK data in preclinical species are highly informative in characterizing human drug disposition, provided that species differences in drug metabolizing enzymes and transporters are considered. Development of in vitro-in vivo extrapolation (IVIVE) of clearance is essential not only for humans, to predict human clearance from in vitro human reagents, but also for preclinical species. This is because animal intravenous (IV) PK data are readily available early in drug discovery and can be used to verify clearance IVIVE in preclinical species before human PK prediction. If there is a clearance disconnect between in vitro and in vivo data in animals, the clearance mechanisms should be carefully investigated to understand the disconnect and its implications for human PK prediction.

PK studies of preclinical species are conducted regularly in drug discovery to understand clearance, absorption, and disposition of drug candidates. Animal PK data are commonly used to develop PK/PD relationships in disease models and to define TI based on toxicity studies. In order to develop reliable IVIVE, it is essential to know how well the in vitro reagents (i.e., liver microsomes (LMs) and hepatocytes (HEPs)) of various species predict in vivo clearance and whether empirical scaling factors (SFs) are needed to establish reliable IVIVE. Once IVIVE and SFs are developed using compounds with different extended clearance classification system (ECCS) (24) classes, deviations from the established IVIVE would suggest potential involvement of other clearance mechanisms that are not covered by these in vitro reagents. The knowledge gained from preclinical IVIVE can then be incorporated into human clearance and PK prediction. In this study, clearance IVIVE is examined using both LMs and HEPs in four preclinical species (i.e., mouse, rat, dog and non-human primate (NHP)) with large sets of structurally diverse drug discovery compounds and marketed drugs through mining of the Pfizer internal database. Empirical SFs of clearance are developed for each ECCS class of the four species. Although ECCS is primarily established to identify clearance mechanisms based on human data (24), here we assumed similar classification criteria can be applied to preclinical species. The preclinical IVIVE analysis would help to better understand clearance mechanisms, establish more reliable human IVIVE, and enhance our confidence in human clearance and PK prediction.

Materials and Methods

Material

Cryopreserved male HEPs and male LMs of CD-1 mouse, Sprague Dawley and Wistar Han rat, beagle dog and cynomolgus monkey were purchased from BioIVT (Westbury, NY), Lonza (Walkersville, MD), XenoTech (Lenexa, KS), and Corning (BD Biosciences, Woburn, MA). Frozen plasma and fresh blood were obtained from BioIVT and Pfizer labs (Groton, CT). All chemicals were obtained from Pfizer Global Material Management (Groton, CT) or purchased from Sigma-Aldrich (St. Louis, MO) unless specified otherwise. The 96-well equilibrium dialysis (HTD96) device and cellulose membranes with molecular weight cut-off of 12–14 K were obtained from HTDialysis, LLC (Gales Ferry, CT).

Microsomal Stability Assay

The detailed protocol of the microsomal stability assay has been discussed previously (13). Briefly, each incubation contained test compound (1 μM), LMs (0.25 μM CYP protein), NADPH (1.3 mM), MgCl2 (3.3 mM) and potassium phosphate buffer (100 mM at pH 7.4). No other cofactors were added to the incubation. The incubations were conducted at 37°C. At various time points, an aliquot was sampled and added to acetonitrile with an internal standard (IS, CP-628374) to quench the reaction. Samples were analyzed using LC-MS/MS and the detailed LC-MS/MS conditions have been reported previously (25). The intrinsic clearance (CLint) values in LMs were calculated using Eqs. 1–3 and the physiological parameters in Table I, where t½ is the in vitro half-life, CLint,app is the apparent intrinsic clearance, PRlm is the liver microsomal protein concentration in the incubation, PRpLW is the microsomal protein per liver weight, LWpBW is liver weight per body weight, and fu,lm is the fraction unbound under the microsome incubation conditions (see the section “Determination of Fraction Unbound of Liver Microsomes, Hepatocytes and Plasma”). Replicate measures of CLint,app were geometrically averaged. Measures below or above the limits of quantification (BLQ and ALQ) were geometrically averaged with quantified values utilizing Beal’s method M3 if at least half the replicates were quantifiable (27).

Table I Physiological Parameters and Experimental Conditions
$${t}_{\mathit{\frac{1}{2}}}=\frac{\ln\;2}{-\left(\mathrm{slope}\ \;\mathrm{of}\; \mathrm{the}\ \ln\;\%\;\mathrm{remaining}\ \mathrm{of}\ \mathrm{drug}\ \mathrm{vs}.\;\mathrm{time}\ \mathrm{plot}\right)}=\min$$
(1)
$$\mathrm{Microsome}\ {CL}_{\mathit{\operatorname{int}}, app}=\frac{\ln 2}{t_{\frac{1}{2}}\ \left(\min \right)}\bullet \frac{1}{PR_{lm}}\bullet \frac{1000\ \upmu \mathrm{L}}{\mathrm{mL}}=\upmu \mathrm{L}/\min /\mathrm{mg}$$
(2)
$$\mathrm{Microsome}\ {CL}_{int}=\frac{CL_{\mathit{\operatorname{int}}, app}\bullet PRpLW\bullet LWpBW}{f_{u, lm}}=\mathrm{mL}/\min /\mathrm{kg}$$
(3)
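
As an illustration of Eqs. 1–3, the following minimal sketch fits the slope of ln(% remaining) versus time by ordinary least squares and scales the result to mL/min/kg. The function names and every numeric input (time course, protein concentration, PRpLW, LWpBW, fu,lm) are hypothetical placeholders, not values from Table I. Eq. 3 does not show a unit conversion; the sketch assumes PRpLW in mg protein/g liver and LWpBW in g liver/kg body weight, and therefore divides by 1000 to go from μL to mL.

```python
import math

def half_life_min(times_min, pct_remaining):
    """Eq. 1: t1/2 = ln 2 / -(slope of ln % remaining vs. time)."""
    y = [math.log(p) for p in pct_remaining]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(y) / n
    # Ordinary least-squares slope of ln(% remaining) against time.
    sxy = sum((t - mean_t) * (v - mean_y) for t, v in zip(times_min, y))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return math.log(2) / -(sxy / sxx)

def microsome_clint_app(t_half_min, pr_lm_mg_per_ml):
    """Eq. 2: apparent intrinsic clearance in uL/min/mg protein."""
    return math.log(2) / t_half_min / pr_lm_mg_per_ml * 1000.0

def microsome_clint(clint_app, pr_p_lw, lw_p_bw, fu_lm):
    """Eq. 3: scaled unbound intrinsic clearance in mL/min/kg.

    Assumed units (hypothetical; Table I is authoritative):
    pr_p_lw in mg protein/g liver, lw_p_bw in g liver/kg body weight.
    The division by 1000 converts uL to mL under that assumption.
    """
    return clint_app * pr_p_lw * lw_p_bw / fu_lm / 1000.0

# Hypothetical time course constructed to have an exact 30-min half-life.
times = [0.0, 5.0, 10.0, 20.0, 30.0]
pct = [100.0 * math.exp(-math.log(2) / 30.0 * t) for t in times]

t_half = half_life_min(times, pct)
cl_app = microsome_clint_app(t_half, pr_lm_mg_per_ml=0.5)
cl_int = microsome_clint(cl_app, pr_p_lw=45.0, lw_p_bw=40.0, fu_lm=0.8)
print(round(t_half, 2), round(cl_app, 2), round(cl_int, 1))
```

Dividing by fu,lm in the last step is what makes the scaled value an unbound intrinsic clearance, so a compound that binds strongly to microsomes (low fu,lm) yields a proportionally higher CLint for the same observed half-life.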

Hepatocyte Stability Assay

The detailed protocol of the hepatocyte stability assay has been discussed previously (13). Briefly, cryopreserved HEPs were thawed, and re-suspended in Williams E medium (WEM, custom formula number 91-5233EC; Invitrogen, Grand Island, NY) supplemented with HEPES and Na2CO3. The cells were counted using the Trypan Blue exclusion method. Test compounds (1 μM) were incubated with hepatocytes at 0.5 million cells/mL at 37°C in an incubator (relative humidity ≥90%, 5% CO2/air). At various time points, an aliquot was transferred into a plate containing acetonitrile with IS to quench the reaction. Samples were analyzed by LC-MS/MS using conditions reported previously (25). The CLint values in hepatocytes were calculated using Eqs. 1, 4 and 5 and the physiological parameters in Table I, where CDhep is the hepatocyte cellularity per volume in the incubation, HEPpLW is the hepatocyte cellularity per liver weight, and fu,hep is the fraction unbound under the hepatocyte incubation conditions (see the section “Determination of Fraction Unbound of Liver Microsomes, Hepatocytes and Plasma”). As in the microsomal stability assay, BLQ and ALQ measures were geometrically averaged with quantified values utilizing Beal’s method M3 if at least half the replicates were quantifiable (27). For low clearance compounds that have t½ greater than the detection limit of a four-hour incubation (i.e., t½ > 8 h), the hepatocyte relay assay (14, 28) was used to measure CLint. The details of the hepatocyte relay assay have been discussed previously (14, 28). The Excel calculation template of the hepatocyte relay assay is included in the supplemental material. Both 0.5 and 2 million cells/mL cell densities were used in the hepatocyte relay assay depending on CLint,app values.

$$\mathrm{Hepatocyte}\ {CL}_{\mathit{\operatorname{int}}, app}=\frac{\ln 2}{t_{\frac{1}{2}}\ \left(\min \right)}\bullet \frac{1}{CD_{hep}}\bullet \frac{1000\ \upmu \mathrm{L}}{\mathrm{mL}}=\upmu \mathrm{L}/\min /\mathrm{MC}$$
(4)
$$\mathrm{Hepatocyte}\ {CL}_{int}=\frac{CL_{\mathit{\operatorname{int}}, app}\bullet HEPpLW\bullet LWpBW}{f_{u, hep}}=\mathrm{mL}/\min /\mathrm{kg}$$
(5)
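
To make the t½ > 8 h cut-off concrete, the short sketch below evaluates Eq. 4 at the 8-h limit for the 0.5 million cells/mL incubation, giving the lowest CLint,app the standard assay can resolve before the relay assay is needed. The function names are illustrative only; Eq. 5 then scales the result in the same way Eq. 3 scales the microsomal value.

```python
import math

def hepatocyte_clint_app(t_half_min, cd_hep_million_per_ml):
    """Eq. 4: apparent intrinsic clearance in uL/min/million cells (MC)."""
    return math.log(2) / t_half_min / cd_hep_million_per_ml * 1000.0

def needs_relay_assay(t_half_min, limit_min=8 * 60):
    """True when t1/2 exceeds the 8-h limit quoted for a 4-h incubation."""
    return t_half_min > limit_min

# Lowest CLint,app measurable at 0.5 million cells/mL (t1/2 = 8 h = 480 min).
floor_clint_app = hepatocyte_clint_app(8 * 60, 0.5)
print(round(floor_clint_app, 2))       # roughly 2.9 uL/min/MC
print(needs_relay_assay(600.0))        # a 10-h half-life goes to the relay assay
```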

Determination of Fraction Unbound of Liver Microsomes, Hepatocytes and Plasma

The detailed protocol of the microsomal binding assay has been reported previously (13). Fraction unbound (fu,lm) under the human microsomal stability incubation condition was measured using the HTD96 equilibrium dialysis device. Human liver microsomes (0.806 mg/mL) spiked with test compounds (2 μM) were dialyzed against phosphate buffered saline in a humidified CO2 incubator (5% CO2/air, 75% relative humidity) on an orbital shaker (200 rpm, VWR, Radnor, PA) for 6 h at 37°C. At the end of the incubation, samples were matrix matched, quenched with cold acetonitrile containing IS, and centrifuged (Eppendorf, Hauppauge, NY) at 3000 rpm for 10 min. The supernatant was transferred to a clean plate and sealed prior to LC-MS/MS analysis. The detailed LC-MS/MS method has been reported previously (29). The calculations for fu,lm, stability and recovery are shown in Eqs. (6)–(8). In silico fu,lm values from the Pfizer internal global quantitative structure-activity relationship (QSAR) model were sparingly used when fu,p was ≥0.5 and fu,lm was the only missing experimental parameter. Hepatocyte fraction unbound (fu,hep) can be predicted from fu,lm with a protein per cell density conversion factor (PRpC) of 0.468 established in house (i.e., 1 million cells/mL of hepatocytes is equivalent to 0.468 mg/mL human liver microsomes (Eq. 9), Table 1S (Excel File) and Fig. 1S in Supplemental Material). The Bias, AAFE, DI90% and P2fold (for definitions of these terms, see the Data Analysis section) for the fu,hep prediction method are 0.97, 1.3, 1.9, and 92%, respectively. Binding to microsomes of different species was calculated using Eq. 10 with different dilution factors (D) depending on the microsomal protein concentrations used for preclinical species compared to human liver microsomes (Table I). Plasma protein binding of the different species was measured using a protocol similar to that for microsomal binding, and the method has been reported previously (30).

$${f}_{u, lm}=\frac{\mathrm{Receiver}\ \mathrm{Area}\ \mathrm{Ratio}}{\mathrm{Donor}\ \mathrm{Area}\ \mathrm{Ratio}}$$
(6)
$$\%\;\mathrm{Recovery}=\frac{\ \mathrm{Donor}\ \mathrm{Area}\ \mathrm{Ratio}+\mathrm{Receiver}\ \mathrm{Area}\ \mathrm{Ratio}}{\mathrm{Donor}\ \mathrm{Area}\ \mathrm{Ratio}\ \mathrm{at}\ \mathrm{Time}\ \mathrm{Zero}}\ \mathrm{x}\ 100\%$$
(7)
$$\mathrm{Stability}\ \mathrm{as}\%\mathrm{Remaining}=\frac{\ \mathrm{Area}\ \mathrm{Ratio}\ \mathrm{at}\ \mathrm{Time}\ \mathrm{Last}}{\mathrm{Area}\ \mathrm{Ratio}\ \mathrm{at}\ \mathrm{Time}\ \mathrm{Zero}}\kern0.5em \mathrm{x}\ 100\%$$
(8)
$${f}_{u, hep}=\frac{1}{\left(1+\left(\frac{1}{{\mathrm{f}}_{\mathrm{u},\mathrm{lm}}}-1\right)\times \mathrm{cell}\ \mathrm{density}/\mathrm{microsome}\ \mathrm{protein}\ \mathrm{concentration}\times \mathrm{PRpC}\ \mathrm{conversion}\ \mathrm{factor}\right)}$$
(9)
$$\mathrm{Undiluted}\ {f}_{u, lm}=\frac{1/D}{\left(\left(1/{f}_{u,d}\right)-1\right)+1/D}$$
(10)
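
Eqs. 9 and 10 can be sketched as two small functions. The cell density (0.5 million cells/mL), microsomal protein concentration (0.806 mg/mL) and PRpC (0.468) are taken from the text; the fu values and the dilution factor in the example calls are hypothetical. The sketch assumes D is the factor by which the measured matrix is diluted relative to the matrix of interest, so that D = 1 returns the measured value unchanged.

```python
def fu_hep_from_fu_lm(fu_lm, cell_density, microsome_conc, pr_p_c=0.468):
    """Eq. 9: predict hepatocyte fraction unbound from microsomal fu.

    cell_density in million cells/mL, microsome_conc in mg/mL,
    pr_p_c in mg microsomal protein per million cells.
    """
    ratio = cell_density / microsome_conc * pr_p_c
    return 1.0 / (1.0 + (1.0 / fu_lm - 1.0) * ratio)

def undiluted_fu(fu_d, d):
    """Eq. 10: correct a fraction unbound measured in a D-fold diluted matrix."""
    return (1.0 / d) / ((1.0 / fu_d - 1.0) + 1.0 / d)

# Hypothetical compound with fu,lm = 0.5 at 0.806 mg/mL microsomal protein.
fu_hep = fu_hep_from_fu_lm(0.5, cell_density=0.5, microsome_conc=0.806)
print(round(fu_hep, 3))

# Hypothetical 2-fold dilution: binding is stronger in the undiluted matrix.
print(round(undiluted_fu(0.5, 2.0), 3))
```

Because 0.5 million cells/mL corresponds to less binding material than 0.806 mg/mL microsomes under the PRpC conversion, the predicted fu,hep is higher than the input fu,lm.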

Blood-to-Plasma Ratio

Blood-to-plasma ratio (Rbp) in multiple preclinical species was measured using a method that has been discussed previously (31). Briefly, test compounds (1 μM) were added to fresh blood and incubated at 37°C for both 1 and 3 h in an incubator (90% humidity, 5% CO2/air) on a shaker (450 rpm). At the end of the incubations, blood and plasma samples were matrix-matched and quenched with cold acetonitrile containing IS. Samples were centrifuged and supernatant was analyzed with LC-MS/MS. Calculated human Rbp values from the Pfizer internal global QSAR model were sparingly used, if Rbp was the only missing experimental parameter.

RRCK Permeability

RRCK (Ralph Russ canine kidney, i.e., low efflux MDCKII (25)) cells were used for permeability measurement using cell monolayer in 96-transwell® format. The detailed assay conditions have been reported previously (25). Briefly, RRCK cells were seeded on a transwell filter membrane and grown for 5–7 days. For permeability experiments, test compounds (2 μM) were added to the donor wells and the Transwell® plate was incubated at 37°C for 1.5 h in an incubator (95% humidity, 5% CO2/air). At the end of incubation, samples were analyzed using LC-MS/MS. Apparent permeability (Papp) is calculated using Eq. 11, where Area is the surface area of the cell monolayer, CD(0) is the concentration in the donor at time 0, t is time in seconds, Mr is the mass of compound appearing in the receiver as a function of time, and dMr/dt is flux of the compound across the cell monolayer. Permeability values were used to classify compounds into ECCS classes. When experimental values were not available, calculated Papp values from Pfizer internal global QSAR model were used.

$${P}_{app}=\frac{1}{Area\times {C}_D(0)}\times \frac{d{M}_r}{dt}$$
(11)
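
A minimal numerical sketch of Eq. 11 follows. The monolayer area, receiver mass and the single end-point estimate of flux (mass in the receiver divided by elapsed time, assuming linear appearance from zero) are hypothetical simplifications for illustration; only the 2 μM donor concentration and the 1.5 h incubation come from the text.

```python
def apparent_permeability(flux_pmol_per_s, area_cm2, c_donor0_pmol_per_cm3):
    """Eq. 11: Papp = (1 / (Area * C_D(0))) * dMr/dt, returned in cm/s."""
    return flux_pmol_per_s / (area_cm2 * c_donor0_pmol_per_cm3)

# 2 uM donor = 2 nmol/mL = 2000 pmol/cm^3.
c_donor0 = 2000.0
# Hypothetical: 10.8 pmol found in the receiver after 1.5 h (5400 s).
flux = 10.8 / 5400.0
# Hypothetical monolayer area of 0.1 cm^2.
papp = apparent_permeability(flux, area_cm2=0.1, c_donor0_pmol_per_cm3=c_donor0)
print(papp)  # on the order of 1e-5 cm/s
```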

LogD and pKa

PFLogD (32) was calculated using an in-house LogD model that was developed to predict LogD for any compounds based on underlying experimental data from the SFLogD (33) and ELogD (34) assays. It takes into account the known limitations of the SFLogD (inaccurate for compounds with actual LogD above 4.0) and ELogD (inadequate for acidic or zwitterionic compounds) assays by performing a logical combination of the results from the two assays based on the chemical space of the compound being predicted. The model uses experimental results when available and global statistical model predictions (35) when experimental data is unavailable. Experimental acid and base pKa (36) were used when available from Pfizer in-house databases or else calculated using MoKa software (37, 38) (Molecular Discovery Limited, London, UK).

In Vivo Animal IV PK

Test compounds were obtained from Pfizer Global Material Management (Groton, CT). IV bolus PK studies were conducted at BioDuro contract laboratories (Shanghai, China) or at Pfizer labs (Groton, CT). All procedures performed on these animals were in accordance with regulations and established guidelines and were reviewed and approved by an Institutional Animal Care and Use Committee or through an ethical review process. The animal species/strains included in the study were mouse (CD-1, CF-1, Swiss Webster, and C57BL6), rat (Sprague Dawley and Wistar Han), beagle dog and cynomolgus monkey. Animals (fed, n = 2 in most cases) were generally dosed IV at 0.5 or 1 mg/kg in various vehicles depending on compound properties. At various time points (e.g., 0, 0.083, 0.25, 0.5, 1, 2, 4, 7 and 24 h), blood samples were taken, from which plasma samples were prepared. Plasma samples were extracted using a protein precipitation method containing IS. Samples were centrifuged, and supernatant was quantified using LC-MS/MS. PK data were analyzed using noncompartmental analysis (NCA) in Watson LIMS™ (Thermo Scientific, Philadelphia, PA). Total plasma clearance (CLp) was derived using dose/AUC0-∞. If the extrapolated AUC beyond the last quantifiable time point (AUCextra) accounted for 25% or more of AUC0-∞, the IV PK data were not included in the analysis. As renal clearance (CLr) can be a significant proportion of total clearance of ECCS class 3A, 3B, and 4A compounds, hepatic clearance data were not included in the analysis for these classes if renal clearance values were not available. Female rat PK data were excluded from the analysis due to potential sex-dependent clearance rates (39). Both the well-stirred and the parallel-tube models were used to convert between hepatic clearance and intrinsic clearance.
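
The text names the two liver models without stating their equations. The sketch below uses the standard textbook forms of the well-stirred and parallel-tube models, which is an assumption about the exact expressions applied here. It further assumes that clearance is expressed on a blood basis, that fu,b (fraction unbound in blood) is obtained as fu,p/Rbp, and that CLint is the unbound intrinsic clearance. The hepatic blood flow and fu,b values are hypothetical.

```python
import math

def well_stirred_clh(clint_u, fu_b, q_h):
    """Well-stirred model: CLh = Qh * fu,b * CLint / (Qh + fu,b * CLint)."""
    return q_h * fu_b * clint_u / (q_h + fu_b * clint_u)

def parallel_tube_clh(clint_u, fu_b, q_h):
    """Parallel-tube model: CLh = Qh * (1 - exp(-fu,b * CLint / Qh))."""
    return q_h * (1.0 - math.exp(-fu_b * clint_u / q_h))

def well_stirred_clint(cl_h, fu_b, q_h):
    """Inverse of the well-stirred model (requires cl_h < q_h)."""
    return q_h * cl_h / (fu_b * (q_h - cl_h))

def parallel_tube_clint(cl_h, fu_b, q_h):
    """Inverse of the parallel-tube model (requires cl_h < q_h)."""
    return -q_h / fu_b * math.log(1.0 - cl_h / q_h)

Q_H = 70.0   # hypothetical hepatic blood flow, mL/min/kg
FU_B = 0.1   # hypothetical fraction unbound in blood

for frac in (0.1, 0.5, 0.9):
    cl_h = frac * Q_H
    ws = well_stirred_clint(cl_h, FU_B, Q_H)
    pt = parallel_tube_clint(cl_h, FU_B, Q_H)
    print(f"CLh = {frac:.0%} of Qh: well-stirred/parallel-tube CLint ratio = {ws / pt:.2f}")
```

The two back-calculated CLint values nearly coincide at low extraction and diverge as CLh approaches Qh, which is why the choice of liver model matters most for high-clearance compounds.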

Data Analysis

Average fold error (AFE; aka Bias) was used to assess model prediction accuracy, i.e., the extent of any systematic underprediction bias. AFE and its corresponding 90% confidence interval (AFECI90%) and log-transformed standard deviation (σlnAFE) were calculated with Eqs. 12–14, where Obsi are the known observations for each compound, Predi are the model predictions for each compound, N is the number of compounds, ni is the number of predictions per compound, and Φ−1(0.95) is the normal inverse cumulative distribution for 95% probability.

$$AFE={e}^{\left\langle \frac{\sum \left\{\ln \left(\frac{Obs_i}{Pred_i}\right)\bullet \frac{1}{n_i}\right\}}{N}\right\rangle }$$
(12)
$${AFE}_{CI_{90\%}}={e}^{\left\langle \ln (AFE)\pm {\Phi}_{(0.95)}^{-1}\frac{\sigma_{\ln AFE}}{\sqrt{N}}\right\rangle }$$
(13)
$${\sigma}_{\ln AFE}=\sqrt{\frac{\sum \left\{{\left(\ln \left(\frac{Obs_i}{Pred_i}\right)-\ln (AFE)\right)}^2\bullet \frac{1}{n_i}\right\}}{N}}$$
(14)

Two methods were used to assess model prediction precision and accuracy: absolute average fold error (AAFE) and root mean square fold error (RMSFE; aka ε). AAFE provides the average absolute spread of model prediction error from unity. AAFE and its corresponding 90% confidence interval (AAFECI90%) and log-transformed standard deviation (σlnAAFE) were calculated (40) with Eqs. 15–17.

$$AAFE={e}^{\left\langle \frac{\sum \left\{\left|\ln \left(\frac{Obs_i}{Pred_i}\right)\right|\bullet \frac{1}{n_i}\right\}}{N}\right\rangle }$$
(15)
$${AAFE}_{CI_{90\%}}={e}^{\left\langle \ln (AAFE)\pm {\Phi}_{(0.95)}^{-1}\frac{\sigma_{\ln AAFE}}{\sqrt{N}}\right\rangle }$$
(16)
$${\sigma}_{\ln AAFE}=\sqrt{\frac{\sum \left\{{\left(\left|\ln \left(\frac{Obs_i}{Pred_i}\right)\right|-\ln (AAFE)\right)}^2\bullet \frac{1}{n_i}\right\}}{N}}$$
(17)
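
Eqs. 12–17 share one structure, so they can be sketched with a single function in which AAFE differs from AFE only by taking the absolute value of each log ratio. The data layout (a dictionary mapping each compound to its list of observed/predicted pairs, so that ni is the list length) and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)  # inverse normal CDF at 0.95, about 1.645

def fold_error_stats(data, absolute=False):
    """Return (AFE or AAFE, (lower, upper) 90% CI) following Eqs. 12-17.

    data: dict mapping compound -> list of (obs, pred) pairs.
    Each prediction is weighted by 1/n_i so every compound counts once.
    """
    n_cmpd = len(data)

    def term(obs, pred):
        value = math.log(obs / pred)
        return abs(value) if absolute else value

    weighted = [(term(o, p), 1.0 / len(pairs))
                for pairs in data.values() for o, p in pairs]
    log_mean = sum(v * w for v, w in weighted) / n_cmpd                      # Eq. 12 / 15
    sigma = math.sqrt(sum((v - log_mean) ** 2 * w for v, w in weighted) / n_cmpd)  # Eq. 14 / 17
    half_width = Z95 * sigma / math.sqrt(n_cmpd)                              # Eq. 13 / 16
    return math.exp(log_mean), (math.exp(log_mean - half_width),
                                math.exp(log_mean + half_width))

# Hypothetical data: one 2-fold underprediction, one 2-fold overprediction,
# and one compound predicted exactly in two replicate predictions.
example = {
    "cmpd_A": [(2.0, 1.0)],
    "cmpd_B": [(1.0, 2.0)],
    "cmpd_C": [(4.0, 4.0), (4.0, 4.0)],
}
afe, afe_ci = fold_error_stats(example)
aafe, aafe_ci = fold_error_stats(example, absolute=True)
print(round(afe, 3), round(aafe, 3))
```

In this example the opposing errors cancel in AFE while AAFE still registers them, which illustrates why bias and spread are reported separately.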

RMSFE provides the variance of the model prediction errors from unity. RMSFE and its corresponding 90% confidence interval (RMSFECI90%) and log-transformed standard deviation (σlnε2) were calculated (40) with Eqs. 18–20.

$$RMSFE=\varepsilon =\sqrt{\frac{\sum \left\{{\left(\ln \left(\frac{Obs_i}{Pred_i}\right)\right)}^2\bullet \frac{1}{n_i}\right\}}{N}\ }$$
(18)
$${RMSFE}_{CI_{90\%}}=\sqrt{e^{\left\langle \ln \left({\varepsilon}^2\right)\pm {\Phi}_{(0.95)}^{-1}{\sigma}_{\ln {\varepsilon}^2}\right\rangle }}$$
(19)
$${\sigma}_{\ln {\varepsilon}^2}=\sqrt{\frac{2\bullet \left(1-{\left(\frac{\sum \left\{\ln \left(\frac{Obs_i}{Pred_i}\right)\bullet \frac{1}{n_i}\right\}}{\varepsilon \bullet N}\right)}^4\right)}{N-2}}$$
(20)

To provide more intuitive meaning to RMSFE values, they were transformed into the total deviation index (DI90%) and the probability within a 2-fold error (P2fold). DI90% is a measure of the fold range that captures 90% of prediction errors. It is basically the 90% confidence interval of the model predictions. DI90% and its corresponding 90% confidence interval (DI90%,CI90%) are calculated (40) with Eqs. 21 and 22.

$${DI}_{90\%}={e}^{\left\langle {\Phi}_{(0.95)}^{-1}\bullet \varepsilon \right\rangle }$$
(21)
$${DI}_{90\%,{CI}_{90\%}}={e}^{\left\langle {\Phi}_{(0.95)}^{-1}\bullet \sqrt{e^{\left\langle \ln \left({\varepsilon}^2\right)\pm {\Phi}_{(0.95)}^{-1}\bullet {\sigma}_{\ln {\varepsilon}^2}\right\rangle }}\right\rangle }$$
(22)

RMSFE can alternatively be transformed into the more intuitive probability of predictions being within 2-fold of observed (P2fold). P2fold and its corresponding 90% confidence interval (P2fold,CI90%) are calculated (40) with Eqs. 23 and 24, where Φ(x) is the standard normal cumulative distribution function evaluated at x.

$${P}_{2 fold}=1-\left(1-\Phi \left(\frac{\ln 2}{\varepsilon}\right)\right)\bullet 2$$
(23)
$${P}_{2 fold,{CI}_{90\%}}=1-\left(1-\Phi \left(\frac{\ln 2}{\sqrt{e^{\left\langle \ln \left({\varepsilon}^2\right)\pm {\Phi}_{(0.95)}^{-1}\bullet {\sigma}_{\ln {\varepsilon}^2}\right\rangle }}}\right)\right)\bullet 2$$
(24)
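
The point estimates in Eqs. 18, 21 and 23 can be sketched as follows; the confidence intervals of Eqs. 19, 20, 22 and 24 are omitted for brevity. The sketch evaluates the Φ term of Eq. 23 with the standard normal cumulative distribution function, an interpretation that reproduces the DI90%/P2fold pair reported in the text for the fu,hep prediction method (1.9 and 92%). The data layout and function names are hypothetical.

```python
import math
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution

def rmsfe(data):
    """Eq. 18: root mean square fold error (epsilon) on the ln scale.

    data: dict mapping compound -> list of (obs, pred) pairs.
    """
    n_cmpd = len(data)
    total = sum(math.log(o / p) ** 2 / len(pairs)
                for pairs in data.values() for o, p in pairs)
    return math.sqrt(total / n_cmpd)

def di90(eps):
    """Eq. 21: fold range that captures 90% of prediction errors."""
    return math.exp(_NORM.inv_cdf(0.95) * eps)

def p2fold(eps):
    """Eq. 23: probability that a prediction is within 2-fold of observed."""
    return 1.0 - (1.0 - _NORM.cdf(math.log(2) / eps)) * 2.0

# Consistency check against the values quoted for the fu,hep method:
# back out epsilon from DI90% = 1.9, then compute P2fold.
eps_fu_hep = math.log(1.9) / _NORM.inv_cdf(0.95)
print(round(p2fold(eps_fu_hep), 2))  # close to the reported 92%

# Hypothetical two-compound data set with 2-fold errors in each direction.
eps_demo = rmsfe({"cmpd_A": [(2.0, 1.0)], "cmpd_B": [(1.0, 2.0)]})
print(round(di90(eps_demo), 2), round(p2fold(eps_demo), 2))
```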

LM and HEP empirical SFs were fit in Microsoft Excel by minimizing the −2 log likelihood function (−2LL) per species, ECCS class, and clearance model (well-stirred and parallel-tube) with lognormal residual error (RMSFE) which negates bias.
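
The fit itself was done in Excel and its exact objective is not given, so the following is only a sketch of the idea under simplifying assumptions: residual error is lognormal on CLint, no censored (BLQ/ALQ) values are present, and the residual variance is profiled out at its maximum-likelihood value. A simple ternary search stands in for the Excel solver. Under these assumptions the minimizing SF equals the geometric mean of observed/predicted, which is why the fitted SF removes bias. All data are hypothetical.

```python
import math

def neg2ll(sf, obs, pred):
    """-2 log likelihood of lognormal residuals after scaling pred by sf.

    The residual variance is replaced by its ML estimate (profiled),
    an assumption made to keep the sketch one-dimensional.
    """
    n = len(obs)
    resid = [math.log(o) - math.log(sf * p) for o, p in zip(obs, pred)]
    sigma2 = sum(r * r for r in resid) / n
    return n * math.log(2.0 * math.pi * sigma2) + n + 2.0 * sum(math.log(o) for o in obs)

def fit_sf(obs, pred, lo=0.01, hi=100.0, iters=200):
    """Minimize -2LL over ln(SF) by ternary search (objective is unimodal)."""
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        m1 = a + (b - a) / 3.0
        m2 = b - (b - a) / 3.0
        if neg2ll(math.exp(m1), obs, pred) < neg2ll(math.exp(m2), obs, pred):
            b = m2
        else:
            a = m1
    return math.exp((a + b) / 2.0)

# Hypothetical observed and predicted CLint values (mL/min/kg).
obs = [10.0, 40.0, 90.0, 8.0]
pred = [5.0, 10.0, 30.0, 8.0]

sf_fit = fit_sf(obs, pred)
sf_closed = math.exp(sum(math.log(o / p) for o, p in zip(obs, pred)) / len(obs))
print(round(sf_fit, 3), round(sf_closed, 3))
```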

Data Simulations for Parallel-Tube and Well-Stirred Clearance Models

To understand the impact of clearance model misspecification, rat HEP CLint,app was calculated from the observed in vivo CLh(b) using either the parallel-tube or well-stirred clearance models for the rat HEP ECCS 2/4 compounds. Then the observed CLh(b), Rbp, fu,p, and fu,lm and calculated CLint,app values were randomly given lognormal variability (20% CVs) ten times each to approximate measurement errors. The resulting datasets were reanalyzed by both the parallel-tube and well-stirred clearance models with application of each assay’s limits of detection.
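
The perturbation step described above can be sketched as below. The 20% CV and ten replicates come from the text; the conversion from CV to the lognormal shape parameter, the choice to keep the original value as the median of the noise distribution, the seed, and the input values are assumptions made for illustration.

```python
import math
import random

def lognormal_sigma(cv):
    """Shape parameter of a lognormal distribution with the given CV."""
    return math.sqrt(math.log(1.0 + cv ** 2))

def perturb(value, cv=0.20, n_rep=10, rng=None):
    """Return n_rep copies of value with multiplicative lognormal noise.

    mu = 0 keeps the original value as the median (an assumption).
    """
    rng = rng or random.Random()
    sigma = lognormal_sigma(cv)
    return [value * rng.lognormvariate(0.0, sigma) for _ in range(n_rep)]

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
# Hypothetical inputs for one compound.
observed = {"CLh_b": 35.0, "Rbp": 0.9, "fu_p": 0.05, "fu_lm": 0.6, "CLint_app": 12.0}
simulated = {name: perturb(value, rng=rng) for name, value in observed.items()}
print({name: round(values[0], 3) for name, values in simulated.items()})
```

Each of the ten perturbed parameter sets would then be passed through both liver models, with the assay limits of detection applied, to compare how each model recovers the known CLint.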

Results

Clearance IVIVE was analyzed using four preclinical species (mouse, rat, dog, and NHP) with large datasets from Pfizer internal drug discovery compounds and commercial drugs with diverse structures. CLint SFs were developed for each ECCS class with both LMs and HEPs using the classification criteria developed previously with human data (24). In vitro ADME data and in vivo animal IV PK data were obtained from Pfizer internal database. All the data are available in the supplemental material (Excel File). Compounds with both in vitro quantifiable experimental clearance values from LMs or HEPs and in vivo clearance values were included in the IVIVE analysis. For the other in vitro ADME properties (Papp, pKa, logD, Rbp, fu,p, and fu,lm), if experimental data were not available (very few cases), in silico values from internal global QSAR models were used for the analysis. The total numbers of compounds included in the analysis are summarized in Table II for each species, ECCS class, and LM and HEP reagents. Rat had considerably more data (~350 compounds) than the other three species (i.e., ~40–100 compounds for mouse, dog and NHP). To the best of our knowledge, these are the largest datasets reported in the literature so far on clearance IVIVE development for preclinical species. Clearance empirical SFs were developed based on CLint for all species in each ECCS class (24), when there are sufficient compounds in the class (excluding class 3, as active uptake and biliary clearance can be the major clearance mechanism for this class). Both parallel-tube and well-stirred clearance models were used to convert between CLh and CLint. The results from the parallel-tube and well-stirred clearance model are summarized in Tables III and IV and Figs. 1 and 2 including SF, AAFE (absolute average fold error), DI90% (fold range that captures 90% of prediction errors) and P2fold (probability of compounds within 2-fold of the observed values). 
The statistical parameters (AAFE, DI90% and P2fold) were calculated after the application of the SFs. SFs of class 2/4 mainly account for metabolic clearance, since renal clearance has already been subtracted from CLp. For class 3, SFs were not developed, as hepatobiliary clearance, which was not determined in vitro or in vivo, may contribute significantly to clearance. For class 1A and 1B, due to insufficient data, SFs were not determined for mouse and dog. As transporters may play a significant role in clearance of class 1A and 1B, SFs were only developed using HEPs (but not LMs), assuming HEPs maintain some transporter functions that are relevant in vivo. ECCS class 1A and 1B SFs were determined independently as their hepatic uptake is generally mediated by different transporters (OAT2 and OATPs, respectively) (24). The ECCS classes were assigned based on Papp values, pKa, MW and LogD7.4 using the same classification criteria as human (24). ECCS class 4 was further subdivided between compounds predominantly eliminated by renal clearance (LogD7.4 ≤ 2, class 4A) and those mainly cleared through metabolism (LogD7.4 > 2, class 4B). In general, the parallel-tube model gave smaller SFs and less variable prediction of CLint than the well-stirred model using both LMs and HEPs (Tables III and IV). The IVIVE of CLint and hepatic blood clearance (CLh(b)) for both LM and HEP of each ECCS class for the different species using the parallel-tube liver model are shown in Figs. 3, 4, 5 and 6. The same IVIVE plots using the well-stirred liver model are summarized in the Supplemental Material (Figs. 2S-5S). Since the large datasets were pulled directly from the internal database, the actual clearance mechanisms involved for the individual compounds were not investigated in detail with mechanistic information such as mass balance data and metabolite characteristics. 
Some compounds in the analysis could have high extrahepatic metabolism (e.g., blood hydrolysis, involvement of extrahepatic enzymes), enterohepatic recirculation (due to parent or metabolite back conversion), and/or other non-hepatic clearance mechanisms. In addition, ECCS classification has not been verified at the individual compound level. As such, uncertainties and variabilities are expected in the IVIVE and SFs. In the IVIVE plots, points above the line indicate over-prediction of in vivo clearance and points below the line indicate under-prediction.

Table II Number of Compounds Included in the Intrinsic Clearance (CLint) IVIVE Analysis
Table III Summary of Empirical Scaling Factors (SF), AAFE, DI90% and P2fold of Intrinsic Clearance (CLint) for Both Liver Microsomes (LM) and Hepatocytes (HEP) for All ECCS Classes Using the Parallel-Tube Model
Table IV Summary of Empirical Scaling Factors (SF), AAFE, DI90% and P2fold of Intrinsic Clearance (CLint) for Both Liver Microsomes (LM) and Hepatocytes (HEP) for All ECCS Classes Using the Well-Stirred Model
Fig. 1

Summary of Empirical Scaling Factors (SF), AAFE, DI90% and P2fold of Intrinsic Clearance (CLint) for Both Liver Microsomes (LM) and Hepatocytes (HEP) for All ECCS Classes Using the Parallel-Tube Model

Fig. 2

Summary of Empirical Scaling Factors (SF), AAFE, DI90% and P2fold of Intrinsic Clearance (CLint) for Both Liver Microsomes (LM) and Hepatocytes (HEP) for All ECCS Classes Using the Well-Stirred Model

Fig. 3

Mouse IVIVE of Intrinsic (CLint) and Hepatic Blood Clearance (CLh(b)) for Both Liver Microsomes and Hepatocytes for ECCS Class 2/4 Compounds Using the Parallel-tube Model with unity (dashed black) and 2-fold (dotted blue) lines

Fig. 4

Rat IVIVE of Intrinsic (CLint) and Hepatic Blood Clearance (CLh(b)) for Both Liver Microsomes and Hepatocytes for ECCS Class 2/4, 1A and 1B Compounds Using the Parallel-tube Model with unity (dashed black) and 2-fold (dotted blue) lines

Fig. 5

Dog IVIVE of Intrinsic (CLint) and Hepatic Blood Clearance (CLh(b)) for Both Liver Microsomes and Hepatocytes for ECCS Class 2/4 Compounds Using the Parallel-tube Model with unity (dashed black) and 2-fold (dotted blue) lines

Fig. 6

NHP IVIVE of Intrinsic (CLint) and Hepatic Blood Clearance (CLh(b)) for Both Liver Microsomes and Hepatocytes for ECCS Class 2/4, 1A and 1B Compounds Using the Parallel-tube Model with unity (dashed black) and 2-fold (dotted blue) lines

For class 2/4 compounds using the parallel-tube liver model, the SFs of intrinsic clearance are ≤2.6 for all species with the exception of mouse and rat (SF 2.4–4.6), using both LMs and HEPs. This suggests that in vitro reagents predict in vivo clearance reasonably well, although a general trend of underprediction was observed for the preclinical species. The AAFEs are generally ~1.9–2.5, DI90%s are ~4.9–8.0 and P2folds are ~42–53%. SFs of LMs trend higher than those of HEPs in rat, dog and NHP, potentially due to non-CYP mediated mechanisms (only NADPH cofactor was added in the screening LM assays), although the differences are not very large. SFs for rodents are generally higher than for other species. Rodents are usually the first species selected for PK studies to evaluate the in vivo properties of drug candidates. PK studies in higher species are typically conducted for more promising compounds to further evaluate their ADME characteristics and select species for PD and toxicity studies. In addition, rodents tend to have faster metabolism than higher species. As such, more compounds have higher clearance in rodents than in the other species. Inaccuracy of physiological parameters, lower activities of the in vitro reagents, and/or higher propensity for extrahepatic clearance may also contribute to the higher SFs in rodents. The rat observed/predicted CLint ratio versus observed CLh(b) relationships for ECCS class 2/4 compounds are shown in Fig. 7 using both parallel-tube and well-stirred models to visualize any systematic bias with increasing clearance. The parallel-tube model with both LM and HEP demonstrated no increase in the observed/predicted CLint ratio even as CLh(b) neared Qh, indicating no change in prediction bias with increasing CLh(b) (Fig. 7). Using the well-stirred model, however, the observed/predicted rat CLint ratio increased as CLh(b) approached Qh for both LM and HEP. 
This indicates the well-stirred model tends to underpredict in vivo CLint when clearance is high, especially near Qh, while the parallel-tube model appears to appropriately describe both low and high CLint. To understand if these observations are due to clearance model misspecification, simulations were conducted on the rat HEP ECCS 2/4 compounds (see “Data Simulations for Parallel-Tube and Well-Stirred Clearance Models” in the Materials and Methods section for details), and the results are summarized in Supplemental Material Fig. 6S. The simulations show that the well-stirred model tends to underpredict CLint for high clearance compounds if the liver behaves like a well-stirred organ (i.e., can be accurately described by the well-stirred clearance equation) and underpredicts more dramatically if it behaves like the parallel-tube model (i.e., can be accurately described by the parallel-tube clearance equation). In contrast, the parallel-tube model predicts observed CLint well across the entire CLh(b) range when the liver behaves like the parallel-tube model and progressively overpredicts when it behaves like the well-stirred model.

Fig. 7

Ratio of Observed to Predicted Rat Intrinsic Clearance (CLint) as a Function of In Vivo Hepatic Blood Clearance (CLh(b)) for Liver Microsomes and Hepatocytes and for ECCS Class 2/4 Compounds Using the Parallel-tube or Well-stirred Models with Unity (solid black) Line

For ECCS classes 1A and 1B, the SFs are generally higher than those of class 2/4 (see Table 1S in Supplemental Material for statistical analysis). As class 1A and 1B compounds are acids or zwitterions, transporter-mediated hepatic uptake clearance may play a role and SFs are expected to be higher than for the metabolism-dominated class 2/4. Additionally, transporters may not be functioning at the physiological level under the assay conditions. SFs for class 1B are generally higher than for class 1A, suggesting that OATP substrates, which make up most class 1B compounds, require higher SFs. Rat has high SFs for both class 1A (SF 6.64) and 1B (SF 7.73) (Fig. 1 and Table III). NHP has a much higher SF for class 1B (SF 9.50) than for 1A (SF 3.23) (Fig. 1 and Table III). Class 1A and 1B SFs for mouse and dog were not developed due to insufficient compounds in these classes to generate reliable SFs. The results indicate that in vivo clearance can be reasonably predicted from metabolic clearance in HEP with scaling factors, without the need to measure uptake clearance. For class 1A and 1B, the AAFE values are ~2.2–2.9, DI90%s are ~4.9–9.0 and P2folds are ~40–53%. DI90%s are generally higher for class 1A/1B than for class 2/4, suggesting higher uncertainties in clearance prediction for class 1A/1B.

Prediction accuracies of in vivo CLint and CLh(b) were evaluated for rat ECCS class 2/4 compounds whose LM CLint was greater than 2-fold that of HEP after applying the SFs. Other species were not evaluated because too few compounds showed disconnects between LMs and HEPs. The results are presented in Fig. 8 using the parallel-tube model and in Supplemental Material (Fig. 7S) using the well-stirred model. Rat LM appears to predict in vivo CLint more accurately than HEP for these compounds, with 1.1-fold overprediction using LM versus 3.5-fold underprediction with HEP (Fig. 8). Similar results were observed using the well-stirred model (Supplemental Material, Fig. 7S), with rat LM being more accurate than HEP in predicting CLint for compounds with higher CLint in LM than in HEP.

Fig. 8

Rat IVIVE for ECCS 2/4 Compounds (N = 56) that Have Liver Microsome Intrinsic Clearance (CLint) Greater than 2-fold that of Hepatocytes after Applying Scaling Factors Using the Parallel-tube Model with Unity (dashed black), Bias (solid red), and 2-fold (dotted blue) Lines

Discussion

Empirical SFs are commonly used in drug discovery to estimate PK parameters, such as clearance and volume of distribution, in order to enhance prediction accuracy (3, 41,42,43,44,45,46). In this study, a large number of drug discovery compounds and commercial drugs were used to evaluate clearance IVIVE and develop SFs per ECCS class using both LM and HEP for four preclinical species. This is the largest study of clearance IVIVEs and SFs reported in the literature. In vitro LMs and HEPs generally predict clearance well, and the derived SFs are relatively small (SFs ≤ 2.6) for ECCS class 2/4 compounds in most species, with the exception of rodents (SFs ~2.4–4.6), using the parallel-tube model. The larger SFs needed in rodents compared to higher species may be due to less active in vitro reagents, inaccuracy of physiological parameters, and a higher likelihood of extrahepatic clearance. The SFs reported here are specific to the lots of LM and HEP used. When purchasing new reagents, it is important to select or prepare new lots with activities similar to those of the historical lots in order to generate consistent data. If the activities of the new lots differ from those of the lots used to develop the SFs, new SFs will need to be established for the new reagents. Both parallel-tube and well-stirred liver models were used for data analysis. The parallel-tube model produced smaller SFs, higher accuracy, and lower variability in the clearance prediction than the well-stirred model, especially for high clearance compounds. These data suggest that the parallel-tube model better represents in vivo liver physiology than the well-stirred model.
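The role of an empirical SF in removing systematic bias can be sketched as follows. This is an illustration only: it assumes the SF is the geometric mean of the observed-to-predicted CLint ratio, which may differ from the derivation described in the Methods, and the function names and CLint values are hypothetical.

```python
import math

def geometric_mean_ratio(observed, predicted):
    """Geometric mean of observed/predicted, i.e. the average fold bias."""
    logs = [math.log10(o / p) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))

def apply_scaling_factor(predicted, sf):
    """Multiply each in vitro-derived CLint prediction by the empirical SF."""
    return [sf * p for p in predicted]

# Hypothetical CLint values (same units), for illustration only.
observed_clint = [20.0, 40.0, 80.0]
predicted_clint = [10.0, 10.0, 10.0]

sf = geometric_mean_ratio(observed_clint, predicted_clint)
scaled = apply_scaling_factor(predicted_clint, sf)
residual_bias = geometric_mean_ratio(observed_clint, scaled)
print(f"SF = {sf:.2f}, residual bias after scaling = {residual_bias:.2f}")
```

With this definition the residual bias after scaling is 1 by construction. Because the SF is fitted to data from specific reagent lots, it would need to be re-estimated if lot activity changes, as noted above.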

The DI90% values of predictions varied from 4.8–9.0, indicating that 90% of the predictions are within 4.8–9.0 fold of the observed clearance. The P2fold values are generally ~40–53%, suggesting that the probability of a prediction being within 2-fold of the observed CLint value is ~40–50%. These prediction accuracies are generally lower than those achieved in actual prediction of candidates for drug discovery projects. This is because the data were pulled directly from the database without any curation or detailed understanding of the clearance mechanisms involved at the individual compound level. Extrahepatic metabolism and non-metabolic clearance may impact the prediction accuracy. In addition, variability in both in vitro and in vivo studies and limited assay replication may also contribute to the uncertainties in prediction.
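The three accuracy metrics discussed here can be sketched in a few lines. This is a hedged illustration rather than the analysis code of this study: AAFE and P2fold follow their usual definitions, while DI90% is assumed here to be the empirical 90th percentile (nearest-rank) of the absolute fold error, which may differ from the definition in the Methods. The function names and data are hypothetical.

```python
import math

def fold_errors(predicted, observed):
    """Absolute fold error per compound (always >= 1)."""
    return [max(p / o, o / p) for p, o in zip(predicted, observed)]

def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(pred/obs)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))

def p2fold(predicted, observed):
    """Percentage of predictions within 2-fold of the observed value."""
    errors = fold_errors(predicted, observed)
    return 100.0 * sum(1 for e in errors if e <= 2.0) / len(errors)

def di90(predicted, observed):
    """Assumed definition: fold error that contains 90% of predictions."""
    errors = sorted(fold_errors(predicted, observed))
    rank = math.ceil(0.9 * len(errors))  # nearest-rank percentile
    return errors[rank - 1]

# Hypothetical CLint values, for illustration only.
predicted_clint = [10.0, 15.0, 5.0, 100.0]
observed_clint = [10.0, 10.0, 20.0, 100.0]

print(f"AAFE   = {aafe(predicted_clint, observed_clint):.2f}")
print(f"P2fold = {p2fold(predicted_clint, observed_clint):.0f}%")
print(f"DI90%  = {di90(predicted_clint, observed_clint):.1f}")
```

AAFE summarizes the typical fold error, whereas DI90% and P2fold describe the spread of errors, which is why a data set can have a modest AAFE and still a wide DI90%.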

For some compounds, CLint values differ between LM and HEP. As screening LM assays contain only NADPH and no other cofactors, a higher CLint in HEP than in LM suggests that non-CYP enzymes may be involved in the metabolism (e.g., UGTs, SULTs, AO, ADH/ALDH, reductases) (12). On the other hand, when LM CLint is higher than HEP CLint, cell membrane permeability may be limiting the apparent metabolic rate (13). For ECCS 2/4 compounds that demonstrated potential passive permeability-limited clearance in vitro (i.e., LM CLint > 2-fold HEP CLint) in rats, LM (1.1-fold overprediction) predicted rat in vivo clearance more accurately and precisely than rat HEP (3.5-fold underprediction). These observations suggest that hepatocyte membrane permeability differs between in vitro and in vivo. In vitro HEP permeability appears to be lower than in vivo permeability in rats. In vivo, permeability seems to no longer limit the rate of metabolism. Further research is needed to understand the underlying mechanisms of the differences between in vitro and in vivo permeability.
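The LM versus HEP comparison described above amounts to a simple triage rule, sketched below. The 2-fold threshold for LM > HEP comes from the text; applying the same 2-fold threshold in the HEP > LM direction is an assumption made here for symmetry, and the function name and return labels are hypothetical. Both CLint values are assumed to be scaled to the same units (e.g., after applying SFs).

```python
def classify_lm_hep_disconnect(lm_clint, hep_clint, fold=2.0):
    """Flag a possible mechanistic reason for an LM/HEP CLint disconnect."""
    if lm_clint > fold * hep_clint:
        # LM faster than HEP: uptake into the cell may be rate-limiting in vitro.
        return "LM > HEP: possible permeability-limited metabolism in HEP"
    if hep_clint > fold * lm_clint:
        # HEP faster than LM: pathways absent from NADPH-only LM incubations.
        return "HEP > LM: possible non-CYP metabolism"
    return "LM and HEP consistent"

# Hypothetical scaled CLint values, for illustration only.
for lm, hep in [(90.0, 30.0), (10.0, 45.0), (25.0, 30.0)]:
    print(lm, hep, "->", classify_lm_hep_disconnect(lm, hep))
```

A flag from this rule is only a prompt for follow-up experiments, not a mechanistic assignment.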

Conclusions

Development of clearance IVIVE and SFs is important not only for human, but also for preclinical species. SFs eliminate systematic biases to improve clearance prediction accuracy and minimize the impact of potential limitations of in vitro reagents due to the quality of the liver tissues (e.g., disease state, enzyme inhibition and induction due to medications and alcohol/drug abuse, and lag time to tissue collection) and preparation procedures. In this study, we evaluated clearance IVIVE and estimated SFs using both LM and HEP for mouse, rat, dog, and NHP with large sets of structurally diverse drug discovery compounds and commercial drugs. The results indicated that SFs were generally not large with the exception of rodents, suggesting that the enzyme activities of the in vitro reagents (i.e., LMs and HEPs) are reasonably preserved. The higher SFs for rodents may be due in part to a loss of metabolic enzyme activities during reagent preparation, inaccuracy of physiological scaling parameters, and/or higher propensity for extrahepatic clearance. IVIVE analysis using screening data without in-depth understanding of clearance mechanisms for individual compounds is inherently more variable and has higher uncertainties than analysis of well-curated data with well-understood clearance pathways and more replicate runs. The parallel-tube model gives smaller SFs, higher accuracy, and lower variability in predictions over a wider hepatic clearance range compared to the well-stirred model. Disconnects of CLint between LM and HEP provide additional insights into clearance mechanisms. When LM CLint is more than 2-fold greater than HEP CLint, passive permeability may limit the rate of metabolism in HEP. In these cases, rat LM appears to predict in vivo clearance better than HEP, and permeability seems to no longer limit metabolism in vivo. However, further studies are needed to understand the underlying mechanisms of this observation.
Examination of clearance IVIVE in preclinical species provides valuable information for human IVIVE translation. The SFs help to reduce systematic biases and increase confidence in preclinical IVIVE. When there is an IVIVE disconnect of clearance in preclinical species, careful examination is needed to understand the contributing factors, such as extrahepatic metabolism (e.g., AO), transporter-mediated clearance (e.g., OATP1B1/1B3), and missing clearance mechanisms (e.g., blood metabolism, renal and biliary clearance). The learnings from preclinical species can be applied to human translation of IVIVE, while considering potential species differences in clearance mechanisms.

Acknowledgements and Disclosures

The authors greatly appreciate the contributions from Pharmacokinetics, Dynamics, and Metabolism (PDM) colleagues. The authors have no conflicts of interest for this work.