Introduction

Mixed-effects modelling is nowadays established as a gold-standard approach for the analysis of longitudinal pharmacokinetics (PK) and pharmacodynamics (PD) data. These models are widely used for their ability to describe different levels of variability, and in particular inter-individual variability. Usually, a mixed-effect model is composed by two main components: the model for the observations including the structural model and the residual error model; and the model for the individual parameters, including their relationships with the individual covariates as well as the correlation structure of the random effects (1,2).

Model diagnosing represents a key activity aimed at building confidence around the developed models before using them for any purpose, such as prediction or simulation. Several diagnostic tools already exist for evaluating the structural model and the residual error model; among them the individual fits, the residual-based diagnostic plots and prediction versus observation plots (35). Visual predictive checks (VPC) and posterior predictive checks (PPC) are also powerful tools based on the posterior predictive distribution for evaluating simultaneously all the features of the model (6,7).

Herein, we will focus on diagnosing the model for the individual parameters where diagnostics are often performed to check their marginal distributions, to detect some possible relationships between individual parameters and covariates, or some possiblecorrelations between random effects. Corresponding diagnostic plots are usually based on the empirical Bayes estimates (EBEs) of the individual parameters and EBEs of the random effects.

It is known that the use of EBEs for diagnostic plots and statistical tests is efficient with rich data, when a significant amount of information is available in the data for recovering accurately all the individual parameters. On the contrary, in case of sparse data, tests and plots can be misleading when the estimates of the individual parameters shrink towards the same population values. Diagnostic tools relying on EBEs are therefore not recommended for high shrinkage (8,9).

The objective of a diagnostic tool is two fold: first is to check if the assumptions made on individual parameters are valid; then, if some assumptions are rejected, diagnosis tools should give some guidance on how to improve the model. Model diagnostics is therefore used to eliminate model candidates that do not seem capable of reproducing the observed data (2,4,5). In such process of model building, by definition, none of the features of the “final model” should be rejected. As is the usual case in statistics, it is not because this “final” model has not been rejected that it is necessarily the “true” one. All that we can say is that the experimental data does not allow us to reject it. It is merely one of perhaps many models that cannot be rejected.

The objective of this paper is to propose a new approach for diagnosing models using individual conditional distribution and formally compare this method to the EBE-based classical approach through numerical experiments based on simulated data.

There exists few useful methods for statistical testing in mixed-effects models. Several existing test procedures only concern linear mixed-effects models (1012) or generalized mixed-effects models (1315). Furthermore, the aim of most of these procedures is to detect possible misspecifications of the random-effects structure. Other specific features of the model are considered by several authors, such as the normality of the random effects (16,17), or the error distribution (18).

Bootstrap is a popular method for the global validation of a nonlinear mixed-effects model (19). Even if bootstrapping is an appealing approach, it requires an important computing effort for validating a single model which needs to be fitted many times. Then, it cannot be used for model building, but only for validating the final model. Another method for a global test and which relies on the use of a random projection technique is described in (20).

We propose a general approach for testing separately several features of a mixed-effects model. The method consists in generating individual parameters and individual random effects using their conditional distributions. Then, the sampled parameters, the sampled random effects and the original observations can be used together for producing diagnostic plots and building statistical tests.

Herein, we use a one compartment PK model for oral administration to illustrate the practical properties of the proposed method. The design is such that a limited information about the individual absorption rate constant ka i , for individual i,can be obtained from the data. We compare then diagnostic plots and statistical tests when parameters are given by the EBEs or by a random sample of the conditional distributions.

Methods

Empirical Bayes Estimates Versus Random Sampling from the Conditional Distribution

Calculating the EBE of an individual parameter consists in estimating ψ i by maximizing the conditional distribution p(ψ i |y i ) where y i  = (y ij , 1 ≤ j ≤ n i ) is a sequence of observations. This conditional mode, also known as the maximum a posteriori (MAP) estimate of ψ i , is the most likely value of the individual parameter ψ i , given the observations and a given population distribution p(ψ i ).

However, when the data are sparse, individual estimates of a parameter can “shrink” towards the same population value, which is the mode of the population distribution of this parameter. For a parameter ψ i which is a function of a random effect η i , we can quantify this phenomena by defining the so-called η -shrinkage (9) as:

$$ \eta -\mathrm{shrinkage}=1-\frac{var\left({\hat{\eta}}_i\right)}{\omega^2}, $$

where \( \mathrm{v}\mathrm{a}\mathrm{r}\left({\hat{\eta}}_i\right) \) is the empirical variance of the \( {\hat{\eta}}_i \)’s and \( {\hat{\eta}}_i \) the empirical Bayes estimate of η i that maximizes p(η i |y i ).

Saying that the observations y i provide little information about η i means that \( {\hat{\eta}}_i \) is close to 0. This results as a high level of shrinkage (close to 1) whenever \( \mathrm{v}\mathrm{a}\mathrm{r}\left({\hat{\eta}}_i\right)\ll {\omega}^2 \). Estimates of the ψ i are therefore biased because they do not correctly reflect the marginal distribution p(ψ i ). In particular, their empirical variance is much reduced.

Alternatively, individual parameters ψ i can be drawn from the conditional distribution p(ψ i |y i ) rather than taking the mode. The resulting estimator is unbiased in the following sense:

$$ p\left({\psi}_i\right)=\mathbb{E}\left(p\left({\psi}_i\Big|{y}_i\right)\right). $$
(1)

This relationship is a fundamental one when considering mixed-effects models. It means that, if we randomly draw a vector y i of observations for an individual in a population and then generate a vector ψ i using the conditional distribution p(ψ i |y i ), the distribution of ψ i is the population distribution p(ψ i ). In other words, even if each ψ i is randomly generated using its own conditional distribution, the fact of pooling them allows us to look at them as if they were a sample from p(ψ i ).

A consequence of this important property is that any diagnostic plot based on such simulated individual parameters can be used confidently.

To conceptually illustrate the difference between EBEs and sampled parameters, we will consider 10 individual parameters ψ 1, …, ψ 10 and 10 observations y 1, …, y 10 so that each individual only has one unique observation. The 10 conditional distributions and their modes (i.e. the 10 EBEs) are shown Fig. 1a while 10 parameters randomly sampled from these distributions are displayed Fig. 1b.

Fig. 1
figure 1

Illustration of how sampling from conditional distributions can describe population distribution compared to EBEs in case of shrinkage. (ab) Conditional distributions of ψ 1, …, ψ 10 and (a) the EBEs maximizing these 10 conditional distributions (circles), (b) individual parameters sampled from these 10 conditional distributions (stars); (cd) Population distribution of ψ and (c) the EBEs, (d) the sampled parameters. The model used to generate this illustration is: y i  = ψ i  + ε i where \( {\psi}_i\sim \mathcal{N}\left({\psi}_{\mathrm{pop}},{\omega}^2\right) \) and \( {\varepsilon}_i\sim \mathcal{N}\left(0,{\sigma}^2\right) \). In that case, the conditional distribution of ψ i given y i is a normal distribution with mean μ i  = V(y i /σ 2 + ψ pop/ω 2) and variance V = σ 2 ω 2/(σ 2 + ω 2). We used ψ pop = 10 and ω 2 = σ 2 = 1 for this numerical example.

Because of the η -shrinkage (there is only one observation per individual), we can see Fig. 1c that the empirical distribution of the EBEs is concentrated around the mean ψ pop of the population distribution. On the other hand, Fig. 1d shows that the empirical distribution of the sampled parameters correctly represents this population distribution.

Pharmacokinetic Model

Throughout the manuscript, we will use a simple and classical PK model to illustrate the proposed approach for model diagnostic and hypothesis testing. The model is a one compartment PK model for single oral administration, with first order absorptionand linear elimination:

$$ C\left(t,\psi \right)=\frac{D\ ka}{V\ ka-Cl}\left({e}^{-\left(Cl/V\right)\ t}-{e}^{- ka\ t}\right). $$

where D is the amount of drug administered at time 0. Here, the PK parameters are ψ = (ka, V, Cl).

We then assume an exponential error model for the observed concentration:

$$ {y}_{ij}=C\left({t}_{ij},{\psi}_i\right){e}^{\varepsilon_{ij}}, $$

where \( {\varepsilon}_{ij}\sim \mathcal{N}\left(0,{a}^2\right) \). Here, y ij is the concentration measured for patient i at time t ij and ψ i  = (ka i , V i , Cl i ).

We assume that the individual PK parameters a i , V i and Cl i are log-normally distributed. Furthermore, a linear relationship between log-weight and each log-parameter is assumed:

$$ \log\ \left(k{a}_i\right)= \log\ \left(k{a}_{\mathrm{pop}}\right)+{\beta}_{ka} \log\ \left({w}_i/{w}_{\mathrm{pop}}\right)+{\eta}_{ka,i}, \log\ \left({V}_i\right)= \log\ \left({V}_{\mathrm{pop}}\right)+{\beta}_V \log\ \left({w}_i/{w}_{\mathrm{pop}}\right)+{\eta}_{V,i}, \log\ \left(C{l}_i\right)= \log\ \left(C{l}_{\mathrm{pop}}\right)+{\beta}_{Cl} \log\ \left({w}_i/{w}_{\mathrm{pop}}\right)+{\eta}_{Cl,i}, $$

where w i is the weight of patient i and w pop the typical weight in the population.

The random effects are normally distributed: \( {\eta}_i=\left({\eta}_{ka,i},{\eta}_{V,i},{\eta}_{Cl,i}\right)\sim \mathcal{N}\left(0,\Omega \right) \). Variances of the random effects, i.e. the diagonal elements of Ω, are (ω 2 ka , ω 2 V , ω 2 Cl ) and the correlations between random effects are (r ka,V , r ka,Cl , r V,Cl ).

PK data for N = 150 patients were simulated with this model using D = 100 mg and the following values of the population parameters: ka pop = 1, V pop = 10, Cl pop = 1, ω ka  = 0.3, ω V  = 0.2, ω Cl  = 0.2 and a = 0.15. The volume is function of weight in this example: β V  = 1 while β ka  = β Cl  = 0. We furthermore assume that log-volume and log-clearance are positively correlated: r V,Cl  = 0.6 while r ka,V  = r ka,Cl  = 0.

Individual weights were sampled from a normal distribution with mean w pop = 70 kg and standard deviation 7 kg.

Simulated Designs and Tested Model

We used a design with 3 sampling times per patient: (2 h, 4 h, 8 h, 12 h) for 1 ≤ i ≤ 50; (4 h, 12 h, 24 h, 48 h) for 51 ≤ i ≤ 100 and (1 h, 8 h, 12 h, 24 h) for 101 ≤ i ≤ 150. The simulated concentrations of the 150 individuals are displayed Fig. 2.

Fig. 2
figure 2

Simulated PK data for 150 patients with three different designs (50 patients per design). Simulated PK data of individuals subjected to design 1 (2 h, 4 h, 8 h, 12 h) are depicted in red. Design 2 (1 h, 8 h, 12 h, 24 h) in green, and design 3 (4 h, 12 h, 24 h, 48 h) in blue.

While the data were simulated with a model where β V  = 1 and r V,Cl  = 0.6, we first fitted this data with a “wrong” model, where β V  = 0 and r V,Cl  = 0.

Diagnosing Tools

The process of model building is an iterative process where, at each iteration, we make some hypotheses about the joint distribution of the individual PK parameters.

We then fit this model to the PK data and produce some diagnostic plots. The objective of these plots is to evaluate graphically which of the hypotheses can be considered as valid and which one should be rejected. Then, rejecting some of the hypotheses leads to proposing a new model which in turn needs to be evaluated. Ideally, this process of model building should lead to a final model for which none of the diagnostic plots detect any misspecification.

Here, we will make the following hypotheses:

  • The PK parameters are log-normally distributed.

    1. 1.

      There is no relationship between the covariate (the weight) and the PK parameters:

β ka  = β V  = β Cl  = 0.

  1. 2.

    There is no correlation between random effects:

r ka,V  = r ka,Cl  = r V,Cl  = 0.

We will fit this model to the simulated data, calculate individual parameters by using EBE or by sampling from the conditional distributions and produce the following plots:

  • Comparison of the empirical distribution of the (ψ i ) with their theoretical distribution given by the model. We can for instance compare histograms and probability density functions (pdf).

  • Visualization of the possible relationships between covariates and parameters, or between covariates and random effects, through scatter plots.

  • Visualization of the possible relationships between random effects through scatter plots.

Using diagnostic plots for model building remains quite empirical. Indeed, there is no well defined decision rule to decide which hypotheses made on the model are incorrect and should be rejected. Some quantitative criteria associated to these plots might be helpful for the modeller to take such decision. In other word, we would like to derive formal statistical tests from diagnostic plots.

Testing our hypotheses about the distribution of the individual PK parameters can be carried out with some standard statistical tests, such as:

  • Kolmogorov-Smirnov test for testing the fit of distributions,

  • Pearson’s test for testing linear correlations between - possibly transformed - covariates and random effects: the test statistic is based on Pearson’s correlation coefficient and follows a t-distribution with N-2 degrees of freedom if the samples follow independent normal distributions.

  • Pearson’s test for testing linear correlations between random effects.

The fundamental property (1) ensures that any statistical test based on such sampled individual parameters is unbiased: the effective level of the implemented test is precisely the desired level α. We can then expect that each of the proposed statistical test will wrongly reject the null hypothesis (i.e. reject the model being evaluated when it is correct) with probability α.

Controlling the level of each of these tests is important of course, but their role is mainly to detect misspecification in the model. It is therefore essential to also evaluate the power of these tests in order to know which kind of misspecification can be identified with a reasonable probability.

Sampling Individual Parameters from Conditional Distributions

In practice, sampling ψ i from the conditional distribution p(ψ i |y i ) can be done by Markov Chain Monte Carlo (MCMC) (21).

For the numerical experiments presented below, we used the Metropolis-Hastings (MH) algorithm described in (2) and that combines several proposal distributions. in order to get samples of the individual conditional distributions, we run 200 iterations of this algorithm and used 10 independent Markov chains per individual. We then kept the individual parameters obtained from all the chains at the last iteration.

This MH algorithm was initially implemented in MonolixFootnote 1 together with the SAEM algorithm used for the estimation of the population parameters (2). Monolix then returns individual parameters and random effects sampled from the conditional distributions and use them for the diagnostic plots.

On the other hand, diagnostic plots derived from NONMEMFootnote 2 are only based on EBEs (individual parameters and random effects). Nevertheless, since the MCMC algorithm implemented in Monolix is also implemented in NONMEM, it should be possible to also return the sampled individual parameters together with the EBEs.

Results

Diagnostic Plots

Estimated parameters under this model are: \( {\hat{ka}}_{\mathrm{pop}}=0.99 \), \( {\hat{V}}_{\mathrm{pop}}=10.2 \), \( {\hat{Cl}}_{\mathrm{pop}}=1.02 \), \( {\hat{\omega}}_{ka}=0.14 \), \( {\hat{\omega}}_V=0.25 \), \( {\hat{\omega}}_{Cl}=0.19 \), \( \hat{a}=0.156 \).

Even if the number of subjects is quite large (N = 150), the data can be considered sparse (4 sampling points per individual), providing especially a limited information on the absorption process. The η − shrinkage for ka, V and Cl are respectively 88%, 20% and 20% when it is computed using the EBEs. Then, even if the histograms of the EBEs displayed Fig. 3 (top row) look quite different from the log-normal distributions obtained with theestimated population parameters (in solid red lines), we cannot conclude that the population distributions of the individual PK parameters are misspecified.

Fig. 3
figure 3

Diagnostic plots with EBEs. Top row: Empirical distributions of the individual parameters maximizing the conditional distributions. The estimated population pdf’s are displayed in solid line. Middle row: Relationships between log-weight and random effects maximizing the conditional distributions. Bottom row: Relationships between random effects maximizing the conditional distributions.

Fig. 3 (middle row) shows that identification of relationships between covariates and individual parameters is much less sensitive to shrinkage: EBEs correctly identify the linear relationship existing between log-weight and log-volume, while the other PK parameters ka and Cl do not clearly seem to be function of weight. This good behavior can be explained by the fact that such relationship is related to the central tendency of the distributions of the PK parameters, which ispretty well approximated by the modes.

The η − shrinkage also strongly impacts the joint distribution of the random effects. We can see Fig. 3 (bottom row) that the joint distribution of the estimates of the random effects does not reflect correctly the true distribution. Artificial correlations wrongly appear between all the random effects. This diagnostic plot does not allow to detect that only the correlation between η V and η Cl is relevant.

On the other hand, creating diagnostic plots based on sampled parameters and sampled random effects allows us to use all these diagnostic plots for decision making. Figure 4 (top row) shows a very nice fit between the empirical distributions of the individual PK parameters and the theoretical pdf’s. Based on this plot, we can conclude that there is no reason for rejecting the hypothesis that the three PK parameters are log-normally distributed.

Fig. 4
figure 4

Diagnostic plots with sampled parameters. Top row: Empirical distributions of the individual parameters sampled from the conditional distributions. The estimated population pdf’s are displayed in solid line. Middle row: Relationships between log-weight and random effects sampled from the conditional distributions. Bottom row: Relationships between random effects sampled from the conditional distributions. For visual purpose, the conditional distributions were sampled five times for each individual resulting in generating 5 times more points than in the previous figure.

Figure 4 (middle row) shows a correlation between log (w) and log (V): based on this plot, we can then reject the hypothesis that β V  = 0 while, in the contrary, there is no reason for rejecting the hypothesis that β ka  = β Cl  = 0.

Only a correlation between η V and η Cl clearly appears Fig. 4 (bottom row): we can then reject the hypothesis that r V,Cl  = 0, as it was assumed in the model. On the other hand, there is no reason for rejecting the hypothesis that r ka,V  = r ka,Cl  = 0.

Statistical Tests

Type I Error

In the following, we will test formally each of the hypothesis of the model being evaluated and check if the “true” model can be identified.

First, we can test separately if each PK parameter follows a log-normal distribution defined by the estimated parameters. Table I gives the p-values of the Kolmogorov-Smirnov tests when either the EBEs or the sampled individual PK parameters are used. Results confirm what could be seen in Fig. 3: because of the strong shrinkage for ka, EBEs of ka do not follow the estimated population distribution. On the other hand, the tests based on sampled PK parameters are not affected by the η -shrinkage. Then, both the diagnostic plot displayed Fig. 4 and the p-values of these three tests can be used with confidence to decide not to reject the hypothesis that the PK parameters are log-normally distributed.

Table I Kolmogorov-Smirnov tests for the probability distributions of the individual PK parameters, the relationships between weight and parameters, and the joint distribution of the random effects

Then, we can test if there exists some linear correlation between the log-weight and the log-parameters. Results presented Table I show that EBEs and sampled parameters give similar results for these tests. Both lead to the conclusion that a significant correlation exists between log-weight and log-volume.

Lastly, we can test if there exists some linear correlation between the random effects.

Results presented Table I confirm what could be seen in Figs. 3 and 4: empirical Bayes estimates of the η’s creates some artificial correlations while sampled random effects correctly reproduce the correlation structure of the randomeffects. Based on this test, we can conclude with high confidence that r(η V , η Cl ) ≠ 0.

Controlling the significance level α of a statistical test means that the effective rate of type I error, i.e. the rate of falsely rejected null hypotheses, is expected to be α. Here, the probability to reject the null hypothesis cannot be computed in a closed form, but it can be estimated by Monte-Carlo. We have simulated 500 replicates of the data under the null hypothesis and performed each of the proposed statistical tests at levels α = 0.05 and α = 0.10, for each replicate, and using either the EBEs or the sampled parameters and random effects.

Table II provides the rates of falsely rejected null hypotheses for each of these tests. We can see that the significance levels α = 0.05 and α = 0.10 are very well controlled when sampled PK parameters and sampled random effects are used for any of these tests.

Table II Rates of falsely rejected null hypotheses for statistical tests with significance level α = 0.05 using either the EBEs or the sampled individual PK parameters

On the other hand, a strong bias is observed when the EBEs are used for testing the distribution of the parameters or the correlation between random effects. The rate of type I error of the tests concerning the relationship between weight and PK parameters is more correctly controlled.

Power of the Tests

We finally explore how the statistical tests behave under several alternative hypotheses. We have simulated 200 replicates of the same experiment using the previous design and under various parameter scenari. For each of these scenari, the estimated power of the test is the proportion of rejected null hypotheses among the 200 replicates. Only tests of level α = 0.05 have been performed for this power analysis (similar conclusions were obtained with α = 0.10).

In the first experiment, the values of population PK parameters used for drawing the individual PK parameters are different from the values defining the null hypothesis, i.e. ka pop = 1, V pop = 10 and Cl pop = 1. Figure 5 top row confirms that tests based on EBEs are biased and should not be used for testing the marginal distribution of the parameters. Indeed, even if they look quite powerful, the type I error is significantly overestimated. The threetests based on sampled parameters are unbiased, even if a misspecification in the distribution of ka is difficult to detect with this design.

Fig. 5
figure 5

Power of the statistical tests. Top row: testing the probability distributions of the parameters. Middle row: testing a linear correlation between log-weight and log-parameters. Bottom row: testing linear correlation between parameters. Blue and yellow curves show power when using EBEs and sampled parameters respectively.

We see Fig. 5 bottom row that an effect of moderate size of the covariate (weight) on V and Cl is correctly detected using either the EBEs or the sampled PK parameters. On the other hand, the design only allows to detect an important effect on ka. We also see that, contrary to the previous examples, tests based on EBEs can be used for detecting a relationship between the covariate and an individual parameter. Indeed, these tests seem to be unbiased and slightly more powerful than the tests based on sampled parameters.

We then investigate if linear correlations between PK parameters can be detected. Figure 5 shows that a clear bias is introduced when EBEs are used, while correlation between V and Cl is correctly detected with sampled parameters. Because of the η -shrinkage on ka, only strong correlations (positiveor negative) between ka and V or ka and Cl can be detected with this method.

Application to the Warfarin PK Data

We will now use the pharmacokinetics of warfarin (22) to illustrate the proposed method. Thirty two healthy volunteers received a 1.5 mg/kg single oral dose of warfarin, an anticoagulant used in the prevention of thrombosis. Supplemental Figure S1 shows the warfarin plasmatic concentration for these patients measured at different times.

We will consider a one compartment PK model, assuming a first-order absorption process after a lag-time and a linear elimination:

$$ C\left(t,\psi \right)=\left\{\begin{array}{ll}\frac{D\ ka}{V\ ka-Cl}\left({e}^{-Cl/V\ \left(t- Tlag\right)}-{e}^{- ka\ \left(t- Tlag\right)}\right)\hfill & \mathrm{if}\kern0.5em t\ge Tlag\kern0.5em \begin{array}{c}\hfill \kern0.5em \hfill \\ {}\hfill 0, otherwise.\hfill \end{array}\hfill \end{array}\right. $$

Here, ψ = (Tlag, ka, V, Cl).

We assume log-normal distributions for these 4 PK parameters and a diagonal variance-covariance matrix Ω for the random effects. The residual error model for the observations is a combined error model of the form y ij  = C(y ij , i i ) + (a + bC(y ij , ψ i ))ε ij .

We used Monolix 2016R1 for fitting this model to the warfarin PK data. The empirical distribution of the EBEs of the random effects is displayed Supplemental Figure S2 (top row) and shows a strong shrinkage for the absorption parameters Tlag and ka. Indeed, more than half of the patients have no measurements during the first 24 hours. Then, we merely use the population parameters to predict Tlag and ka for these patients. This large shrinkage does not mean that the model is misspecified, but that the data does not allow us to correctly estimate these individual parameters. As a consequence, EBEs cannot be used for diagnosing the model.

On the other hand, individual PK parameters sampled from the conditional distributions can be used with confidence. The distribution of the random effects displayed Supplemental Figure S2 (bottom row) shows that there is no reason for rejecting the hypothesis of log-normal distributions. A relationship between weight and volume is clearly visible Supplemental Figure S3 as well as a possible relationship between weight and clearance. Lastly, Supplemental Figure S4 identifies a correlation between η V and η Cl . The hypothesis of independent random effects should also be rejected.

In summary, based on these diagnostic plots and statistical tests, the model used for fitting the warfarin PK data should be rejected. A new model to be tested should integrate a correlation between η V and η Cl , a relationship between weight and volume, and possibly between weight and clearance.

Conclusions

In this manuscript, we propose a new method for deriving individual parameters used in a diagnostic perspectives. Instead of using the classical approach of maximizing each conditional distribution, we show that randomly sampling these distribution leads to reliable results and can complement the EBE-based approach widely used. In particular, we show that each proposed test is unbiased, the type I error rate is the desired significance level of the test and the probability to detect a misspecification in the model increases with the magnitude of this misspecification. This method can therefore be used efficiently, possibly in combination with other diagnostic tools, to drive model building in population PKPD analyses.

Our numerical experiments confirmed that EBEs for assessing the distribution of the individual parameters and/or the correlation structure of the random effects may introduce strong biases when η -shrinkage is important. However, interestingly, in our example, we show that the effect of a continuous covariate on a PK parameter is correctly detected using either the EBEs or the sampled parameters. The numerical tests also revealed that the sampling of conditional distribution can also suffer and results in lack of power in presence of η -shrinkage (see Fig. 5).

Thus, even if using EBEs can be helpful for the search of misspecifications, it appears not to be a reliable methods for validation of the final model and sampled parameters should always be used for this aim.

Herein, we only addressed the problem of diagnosing the model for the individual parameters, but the same approach could be developed for other diagnostic plots and for testing other components of the model including residual error model, structural model, and handling of BLQ data.