1 Introduction

Our research is motivated by an experiment on the emergence of house flies for studying biological controls of disease-transmitting fly species (Itepan 1995; Zocchi and Atkinson 1999). In the original experiment (Itepan 1995), \(n=3500\) pupae were grouped evenly into seven subsets and exposed in a radiation device tuned to seven different gamma radiation levels \(x_i \in \{80, 100, 120, 140, 160, 180, 200\}\) in units of Gy, respectively. After a certain period of time, each pupa had one of three possible outcomes: unopened, opened but died (before completing emergence), or completed emergence. The total experimental cost in time and expense was closely related to the number of distinct settings of the radiation device. By searching the grid-1 settings in [80, 200] using their lift-one algorithm, Bu et al. (2020) proposed a design on five settings \(\{80, 122, 123, 157, 158\}\) and improved the relative efficiency of the original design by \(20.8\%\) in terms of the D-criterion. More recently, Ai et al. (2023) obtained a design on four settings \(\{0, 101.1, 147.8, 149.3\}\) by employing an algorithm that combines the Fedorov-Wynn (Fedorov and Leonov 2014) and lift-one (Bu et al. 2020) algorithms to search the continuous region [0, 200]. Having noticed that both Bu et al. (2020)’s and Ai et al. (2023)’s designs contain pairs of settings that are close to each other, we propose a new algorithm, called the ForLion algorithm (see Sect. 2), which incorporates a merging step to combine close experimental settings while maintaining high relative efficiency. For this case, our proposed algorithm identifies a D-optimal design on \(\{0, 103.56, 149.26\}\), which may lead to a \(40\%\) or \(25\%\) reduction of the experimental cost compared to Bu et al. (2020)’s or Ai et al. (2023)’s design, respectively.

In this paper, we consider experimental plans under fairly general statistical models with mixed factors. The pre-determined design region \(\mathcal{X} \subset {\mathbb {R}}^d\) with \(d\ge 1\) factors is compact, that is, bounded and closed, for typical applications (see Sect. 2.4 in Fedorov and Leonov (2014)). For many applications, \({{\mathcal {X}}} = \prod _{j=1}^d I_j\), where \(I_j\) is either a finite set of levels for a qualitative or discrete factor, or an interval \([a_j, b_j]\) for a quantitative or continuous factor. To simplify the notation, we assume that the first k factors are continuous, where \(0\le k\le d\), and the last \(d-k\) factors are discrete. Suppose \(m\ge 2\) distinct experimental settings \({{\textbf{x}}}_1, \ldots , {{\textbf{x}}}_m \in \mathcal{X}\), known as the design points, are chosen, and \(n > 0\) experimental units are available for the experiment with \(n_i \ge 0\) subjects allocated to \({\textbf{x}}_i\), such that \(n=\sum _{i=1}^m n_i\). We assume that the responses, which could be vectors, are independent and follow a parametric model \(M({{\textbf{x}}}_i; \varvec{\theta })\) with some unknown parameter(s) \(\varvec{\theta }\in {\mathbb {R}}^p\), \(p\ge 1\). In design theory, \({{\textbf{w}}} = (w_1, \ldots , w_m)^T = (n_1/n, \ldots , n_m/n)^T\), known as the approximate allocation, is often considered instead of the exact allocation \({\textbf{n}} = (n_1, \ldots , n_m)^T\) (see, for example, Kiefer (1974), Sect. 1.27 in Pukelsheim (2006), and Sect. 9.1 in Atkinson et al. (2007)). Under regularity conditions, the corresponding Fisher information matrix is \({{\textbf{F}}} = \sum _{i=1}^m w_i{\textbf{F}}_{{{\textbf{x}}}_i} \in {\mathbb {R}}^{p\times p}\) up to a constant n, where \({{\textbf{F}}}_{{{\textbf{x}}}_i}\) is the Fisher information at \({{\textbf{x}}}_i\). In this paper, the design under consideration takes the form of \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\}\), where m is a flexible positive integer, \({\textbf{x}}_1, \ldots , {{\textbf{x}}}_m\) are distinct design points from \(\mathcal{X}\), \(0\le w_i \le 1\), and \(\sum _{i=1}^m w_i = 1\). We also let \(\varvec{\Xi }= \{\{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\} \mid m\ge 1; {{\textbf{x}}}_i \in \mathcal{X}, 0\le w_i\le 1, i=1, \ldots , m; \sum _{i=1}^m w_i = 1\}\) denote the collection of all feasible designs.

Under different criteria, such as the D-, A-, or E-criterion (see, for example, Chapter 10 in Atkinson et al. (2007)), many numerical algorithms have been proposed for finding optimal designs. If all factors are discrete, the design region \({{\mathcal {X}}}\) typically contains a finite number of design points, still denoted by m. The design problem is then to optimize the approximate allocation \({{\textbf{w}}} = (w_1, \ldots , w_m)^T\). Commonly used design algorithms include the Fedorov-Wynn (Fedorov 1972; Fedorov and Hackl 1997), multiplicative (Titterington 1976, 1978; Silvey et al. 1978), cocktail (Yu 2011), and lift-one (Yang and Mandal 2015; Yang et al. 2017; Bu et al. 2020) algorithms. In addition, classical optimization techniques such as Nelder-Mead (Nelder and Mead 1965), quasi-Newton (Broyden 1965; Dennis and Moré 1977), conjugate gradient (Hestenes and Stiefel 1952; Fletcher and Reeves 1964), and simulated annealing (Kirkpatrick et al. 1983) may also be used for the same purpose (Nocedal and Wright 2006). A comprehensive numerical study by Yang et al. (2016) (Table 2) showed that the lift-one algorithm outperforms commonly used optimization algorithms in identifying optimal designs, resulting in designs with fewer points.

Furthermore, many deterministic optimization methods may also be used for finding optimal designs under similar circumstances. Among them, polynomial-time (P-time) methods, including linear programming (Harman and Jurík 2008), second-order cone programming (Sagnol 2011), semidefinite programming (Duarte et al. 2018; Duarte and Wong 2015; Venables and Ripley 2002; Wong and Zhou 2023; Ye and Zhou 2013), mixed integer linear programming (Vo-Thanh et al. 2018), mixed integer quadratic programming (Harman and Filová 2014), mixed integer second-order cone programming (Sagnol and Harman 2015), and mixed integer semidefinite programming (Duarte 2023), are advantageous for discrete grids due to their polynomial time complexity and capability of managing millions of constraints efficiently. Notably, methods without polynomial-time guarantees, such as nonlinear programming (Duarte et al. 2022), semi-infinite programming (Duarte and Wong 2014), and mixed integer nonlinear programming (Duarte et al. 2020), have been utilized as well.

When the factors are continuous, the Fedorov-Wynn algorithm can still be used by adding, at each iteration, a new design point that maximizes a sensitivity function on \({{\mathcal {X}}}\) (Fedorov and Leonov 2014). To improve the efficiency, Ai et al. (2023) proposed a new algorithm for D-optimal designs under a continuation-ratio link model with continuous factors, which essentially combines the Fedorov-Wynn algorithm (for adding new design points) with the lift-one algorithm (for optimizing the approximate allocation). Nevertheless, the Fedorov-Wynn step tends to add unnecessary, closely spaced design points (see Sect. 3.2), which may increase the experimental cost. An alternative approach is to discretize the continuous factors and consider only the grid points (Yang et al. 2013), which may be computationally expensive, especially when the number of factors is moderate or large.

Little has been done to construct efficient designs with mixed factors. Lukemire et al. (2019) proposed the d-QPSO algorithm, a modified quantum-behaved particle swarm optimization (PSO) algorithm, for D-optimal designs under generalized linear models with binary responses. Later, Lukemire et al. (2022) extended the PSO algorithm for locally D-optimal designs under the cumulative logit model with ordinal responses. However, like other stochastic optimization algorithms, the PSO-type algorithms cannot guarantee that an optimal solution will ever be found (Kennedy and Eberhart 1995; Poli et al. 2007).

Following Ai et al. (2023) and Lukemire et al. (2019, 2022), we choose the D-criterion, which maximizes the objective function \(f({\varvec{\xi }}) = \left| {\textbf{F}}(\varvec{\xi }) \right| = \left| \sum _{i=1}^m w_i {{\textbf{F}}}_{{{\textbf{x}}}_i}\right| \), \(\varvec{\xi }\in \varvec{\Xi }\). Throughout this paper, we assume \(f({\varvec{\xi }}) > 0\) for some \(\varvec{\xi }\in \varvec{\Xi }\) to avoid trivial optimization problems. Unlike Bu et al. (2020) and Ai et al. (2023), the proposed ForLion algorithm does not need to assume \(\textrm{rank}({{\textbf{F}}}_{{\textbf{x}}}) < p\) for all \({{\textbf{x}}} \in \mathcal{X}\) (see Remark 1 and Example 1). Compared with the PSO-type algorithms for similar purposes (Lukemire et al. 2019, 2022), our ForLion algorithm can improve the relative efficiency of the designs significantly (see Example 3 for an electrostatic discharge experiment discussed by Lukemire et al. (2019) and Sect. S.3 of the Supplementary Material for a surface defects experiment (Phadke 1989; Wu 2008; Lukemire et al. 2022)). Our strategies may be extended to other optimality criteria, such as A-optimality, which minimizes the trace of the inverse of the Fisher information matrix, and E-optimality, which maximizes the smallest eigenvalue of the Fisher information matrix (see, for example, Atkinson et al. (2007)).
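As a concrete illustration, the D-criterion can be evaluated directly once the per-point Fisher information matrices are available; below is a minimal R sketch (R is the language we use for computations later, e.g., optim in Sect. 3.1; the function name and inputs are ours):

```r
# Minimal sketch (names are ours): evaluate f(xi) = |sum_i w_i F_{x_i}|
# given a list of p x p Fisher information matrices and a weight vector w.
d_criterion <- function(F_list, w) {
  Fmat <- Reduce(`+`, Map(`*`, w, F_list))  # F(xi) = sum_i w_i * F_{x_i}
  det(Fmat)                                 # f(xi) = |F(xi)|
}
```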

The remaining parts of this paper are organized as follows. In Sect. 2, we present the ForLion algorithm for general parametric statistical models. In Sect. 3, we derive the theoretical results for multinomial logistic models (MLM) and revisit the motivating example to demonstrate our algorithm’s performance with mixed factors under general parametric models. In Sect. 4, we specialize our algorithm for generalized linear models (GLM) to enhance computational efficiency by using model-specific formulae and iterations. We use simulation studies to show the advantages of our algorithm. We conclude in Sect. 5.

2 ForLion for D-optimal designs with mixed factors

In this section, we propose a new algorithm, called the ForLion (First-order Lift-one) algorithm, for constructing locally D-optimal approximate designs under a general parametric model \(M({{\textbf{x}}}; \varvec{\theta })\) with \({{\textbf{x}}} \in {{\mathcal {X}}} \subset {\mathbb {R}}^d\), \(d\ge 1\) and \(\varvec{\theta }\in {\mathbb {R}}^p\), \(p\ge 1\). As mentioned earlier, the design region \({{\mathcal {X}}} = \prod _{j=1}^d I_j\), where \(I_j = [a_j, b_j]\) for \(1\le j\le k\), \(-\infty< a_j< b_j < \infty \), and \(I_j\) is a finite set of at least two distinct numerical levels for \(j>k\). To simplify the notation, we still denote \(a_j = \min I_j\) and \(b_j = \max I_j\) even if \(I_j\) is a finite set.

In this paper, we assume \(1\le k\le d\). That is, there is at least one continuous factor. For cases with \(k=0\), that is, all factors are discrete, one may use the lift-one algorithm for general parametric models (see Remark 1). The goal in this study is to find a design \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\} \in \varvec{\Xi }\) maximizing \(f({\varvec{\xi }}) = \left| {{\textbf{F}}}(\varvec{\xi })\right| \), the determinant of \({{\textbf{F}}}(\varvec{\xi })\), where \({\textbf{F}}(\varvec{\xi }) = \sum _{i=1}^m w_i {{\textbf{F}}}_{{{\textbf{x}}}_i} \in {\mathbb {R}}^{p\times p}\). Here \(m\ge 1\) is flexible.

Algorithm 1 ForLion
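In outline, Algorithm 1 proceeds as follows (a sketch reconstructed from the steps referenced in Remarks 1–4 and Sects. 3–4; in particular, Step \(4^\circ \) is our reading of the deleting step):

\(0^\circ \) Choose a merging threshold \(\delta > 0\) and a convergence tolerance \(\epsilon > 0\).

\(1^\circ \) Construct an initial design \({\varvec{\xi }}_0\) whose design points are pairwise at least \(\delta \) apart.

\(2^\circ \) Merging step: combine any two design points within distance \(\delta \) of each other.

\(3^\circ \) Lift-one step: optimize the weights of the current design points.

\(4^\circ \) Deleting step: remove design points whose weights are zero.

\(5^\circ \) Find \({{\textbf{x}}}^*\) maximizing the sensitivity function \(d({{\textbf{x}}}, {\varvec{\xi }}_t)\) over \({{\mathcal {X}}}\).

\(6^\circ \) If \(d({{\textbf{x}}}^*, {\varvec{\xi }}_t) \le p\), go to Step \(7^\circ \); otherwise add \({{\textbf{x}}}^*\) to the design, replace t by \(t+1\), and go back to Step \(2^\circ \).

\(7^\circ \) Report \({\varvec{\xi }}_t\) as the D-optimal design.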

Given a design \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\}\) reported by the ForLion algorithm (see Algorithm 1), the general equivalence theorem (Kiefer 1974; Pukelsheim 1993; Stufken and Yang 2012; Fedorov and Leonov 2014) guarantees its D-optimality on \({{\mathcal {X}}}\). As a direct conclusion of Theorem 2.2 in Fedorov and Leonov (2014), we have the following theorem under the regularity conditions (see Sect. S.8 in the Supplementary Material, as well as Assumptions (A1), (A2) and (B1)–(B4) in Sect. 2.4 of Fedorov and Leonov (2014)).

Theorem 1

Under regularity conditions, there exists a D-optimal design that contains no more than \(p(p+1)/2\) design points. Furthermore, if \(\varvec{\xi }\) is obtained by Algorithm 1, it must be D-optimal.

We relegate the proof of Theorem 1 and others to Sect. S.9 of the Supplementary Material.

Remark 1

Lift-one step for general parametric models:   For commonly used parametric models, \(\textrm{rank}({{\textbf{F}}}_{\textbf{x}}) < p\) for each \({{\textbf{x}}} \in {{\mathcal {X}}}\). For example, all GLMs satisfy \(\textrm{rank}({{\textbf{F}}}_{{\textbf{x}}}) = 1\) (see Sect. 4). However, there exist special cases in which \(\textrm{rank}({{\textbf{F}}}_{{\textbf{x}}}) = p\) for almost all \({\textbf{x}} \in {{\mathcal {X}}}\) (see Example 1 in Sect. 3.1).

The original lift-one algorithm (see Algorithm 3 in the Supplementary Material of Huang et al. (2023)) requires \(0\le w_i < 1\) for all \(i=1, \ldots , m\), given the current allocation \({{\textbf{w}}} = (w_1, \ldots , w_m)^T\). If \(\textrm{rank}({{\textbf{F}}}_{{{\textbf{x}}}_i}) < p\) for all i, then \(f(\varvec{\xi }) > 0\) implies \(0\le w_i <1\) for all i. In that case, as in the original lift-one algorithm, we define the allocation function as

$$\begin{aligned} {{\textbf{w}}}_i(z) = \left( \frac{1-z}{1-w_i} w_1, \ldots , \frac{1-z}{1-w_i} w_{i-1},\ z,\ \frac{1-z}{1-w_i} w_{i+1}, \ldots , \frac{1-z}{1-w_i} w_m\right) ^T = \frac{1-z}{1-w_i}\, {{\textbf{w}}} + \frac{z-w_i}{1-w_i}\, {{\textbf{e}}}_i \end{aligned}$$

where \({{\textbf{e}}}_i = (0, \ldots , 0, 1, 0, \ldots , 0)^T \in {\mathbb {R}}^m\), whose ith coordinate is 1, and z is a real number in [0, 1], such that \({\textbf{w}}_i(z) = {\textbf{w}}\) at \(z=w_i\) and \({\textbf{w}}_i(z)={\textbf{e}}_i\) at \(z=1\). However, if \(\textrm{rank}({{\textbf{F}}}_{{{\textbf{x}}}_i}) = p\) and \(w_i=1\) for some i, we still have \(f(\varvec{\xi }) > 0\), but the above \({{\textbf{w}}}_i(z)\) is not well defined. In that case, we define the allocation function in the ForLion algorithm as

$$\begin{aligned} {{\textbf{w}}}_i(z) = \left( \frac{1-z}{m-1}, \ldots , \frac{1-z}{m-1},\ z,\ \frac{1-z}{m-1}, \ldots , \frac{1-z}{m-1}\right) ^T = \frac{m(1-z)}{m-1}\, {{\textbf{w}}}_u + \frac{mz-1}{m-1}\, {{\textbf{e}}}_i \end{aligned}$$

where \({{\textbf{w}}}_u = (1/m, \ldots , 1/m)^T \in {\mathbb {R}}^m\) is the uniform allocation. For \(j\ne i\), we define \({{\textbf{w}}}_j(z) = (1-z) {{\textbf{e}}}_i + z {{\textbf{e}}}_j\). The remaining steps are the same as in the original lift-one algorithm. \(\Box \)
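For illustration, both allocation functions above can be coded in a few lines; below is a minimal R sketch (the function name lift_one_alloc is ours), covering the usual case \(w_i < 1\) and the degenerate case \(w_i = 1\):

```r
# Sketch (names are ours): the allocation function w_i(z) of Remark 1.
# w: current allocation; i: lifted index; z: new weight for point i in [0, 1].
lift_one_alloc <- function(w, i, z) {
  m <- length(w)
  if (w[i] < 1) {
    wz <- (1 - z) / (1 - w[i]) * w   # rescale the other weights
  } else {
    wz <- rep((1 - z) / (m - 1), m)  # degenerate case w_i = 1, rank(F_x) = p
  }
  wz[i] <- z
  wz                                 # sums to 1 by construction
}
```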

Remark 2

Convergence in finite iterations:   In practice, we may relax the stopping rule \(d({{\textbf{x}}}^*, {\varvec{\xi }}_t) \le p\) in Step \(6^\circ \) of Algorithm 1 to \(d({{\textbf{x}}}^*, {\varvec{\xi }}_t) \le p + \epsilon \), where \(\epsilon \) could be the same as in Step \(0^\circ \). By Sect. 2.5 in Fedorov and Leonov (2014), \(f((1-\alpha ){\varvec{\xi }}_t + \alpha {{\textbf{x}}}^*) - f({\varvec{\xi }}_t) \approx \alpha (d({\textbf{x}}^*, {\varvec{\xi }}_t) - p)\) for small enough \(\alpha > 0\), where \((1-\alpha ){\varvec{\xi }}_t + \alpha {{\textbf{x}}}^*\) denotes the design \(\{({{\textbf{x}}}_i^{(t)}, (1-\alpha ) w_i^{(t)}), i=1, \ldots , m_t\} \bigcup \{({{\textbf{x}}}^*, \alpha )\}\). Thus, if we find an \({\textbf{x}}^*\) such that \(d({{\textbf{x}}}^*, {\varvec{\xi }}_t) > p + \epsilon \), then there exists an \(\alpha _0 \in (0,1)\) such that \(f((1-\alpha _0){\varvec{\xi }}_t + \alpha _0 {{\textbf{x}}}^*) - f({\varvec{\xi }}_t)> \alpha _0(d({{\textbf{x}}}^*, {\varvec{\xi }}_t) - p)/2 > \alpha _0 \epsilon /2\). For a small enough merging threshold \(\delta \) (see Steps \(0^\circ \) and \(2^\circ \)), we can still guarantee that \(f({\varvec{\xi }}_{t+1}) - f({\varvec{\xi }}_t) > \alpha _0\epsilon /4\) after Step \(2^\circ \). Under regularity conditions, \({{\mathcal {X}}}\) is compact, and \(f(\varvec{\xi })\) is continuous and bounded. Our algorithm is therefore guaranteed to stop in finitely many steps. In practice, due to the lift-one step (Step \(3^\circ \)), \(f({\varvec{\xi }}_t)\) improves rapidly, especially in the first few iterations. For all the examples explored in this paper, our algorithm works efficiently. \(\Box \)

Remark 3

Distance among design points:   In Step \(1^\circ \) of Algorithm 1, an initial design is selected such that \(\Vert {{\textbf{x}}}_i^{(0)} - {{\textbf{x}}}_j^{(0)}\Vert \ge \delta \), and in Step \(2^\circ \), two design points are merged if \(\Vert {{\textbf{x}}}_i^{(t)} - {{\textbf{x}}}_j^{(t)}\Vert < \delta \). The algorithm uses the Euclidean distance as the default metric. Nevertheless, to take the effects of different ranges or units across factors into consideration, one may choose a different distance, for example, a normalized distance such that \(\Vert {{\textbf{x}}}_i - {{\textbf{x}}}_j\Vert ^2 = \sum _{l=1}^d \left( \frac{x_{il}-x_{jl}}{b_l-a_l}\right) ^2\), where \({{\textbf{x}}}_i = (x_{i1}, \ldots , x_{id})^T\) and \({{\textbf{x}}}_j = (x_{j1}, \ldots , x_{jd})^T\). Another useful choice is to define \(\Vert {{\textbf{x}}}_i - {{\textbf{x}}}_j\Vert = \infty \) whenever their discrete factor levels differ, that is, \(x_{il}\ne x_{jl}\) for some \(l>k\). Such a distance never merges two design points with distinct discrete factor levels, which makes a difference when \(\delta \) is chosen to be larger than the smallest difference between discrete factor levels. Note that the choice of distance and \(\delta \) (see Sect. S.7 in the Supplementary Material) will not affect the continuous search for a new design point in Step \(5^\circ \). This differs from the adaptive grid strategies used in the literature (Duarte et al. 2018; Harman et al. 2020; Harman and Rosa 2020) for optimal designs, where the grid can be increasingly reduced in size to locate the support points more accurately. \(\Box \)
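For instance, a merging distance combining the normalization and the no-merge rule for discrete factors might be sketched in R as follows (names are ours; a and b are the factor-wise lower and upper bounds, and k is the number of continuous factors):

```r
# Sketch (names are ours): a merging distance for Step 2 of Algorithm 1.
merge_dist <- function(xi, xj, a, b, k) {
  d <- length(xi)
  if (k < d && any(xi[(k + 1):d] != xj[(k + 1):d]))
    return(Inf)                       # never merge across discrete levels
  sqrt(sum(((xi - xj) / (b - a))^2))  # range-normalized Euclidean distance
}
```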

Remark 4

Global maxima:   According to Theorem 1, when Algorithm 1 converges, that is, \(\max _{{{\textbf{x}}} \in {{\mathcal {X}}}} d({{\textbf{x}}}, {\varvec{\xi }}_t) \le p\), the design \({\varvec{\xi }}_t\) must be D-optimal. Nevertheless, from a practical point of view, two issues may occur. First, the algorithm may fail to find the global maximum in Step \(5^\circ \), which may happen even with the best optimization software (Givens and Hoeting 2013). As a common practice (see, for example, Sect. 3.2 in Givens and Hoeting (2013)), one may randomly generate multiple (such as 3 or 5) starting points when finding \({{\textbf{x}}}_{(1)}^*\) in Step \(5^\circ \) and utilize the best one among them. Second, the global maximum, or the D-optimal design, may not be unique (see, for example, Remark 2 in Yang et al. (2016)). In that case, one may keep a collection of the D-optimal designs that the algorithm finds. Due to the log-concavity of the D-criterion (see, for example, Fedorov (1972)), any convex combination of D-optimal designs is still D-optimal. \(\Box \)

3 D-optimal designs for MLMs

In this section, we consider experiments with categorical responses. Following Bu et al. (2020), given \(n_i > 0\) experimental units assigned to a design setting \({{\textbf{x}}}_i \in {{\mathcal {X}}}\), the summarized responses \({{\textbf{Y}}}_i = (Y_{i1}, \ldots , Y_{iJ})^T\) follow \(\textrm{Multinomial}(n_i; \pi _{i1}, \ldots , \pi _{iJ})\) with categorical probabilities \(\pi _{i1}, \ldots , \pi _{iJ}\), where \(J\ge 2\) is the number of categories, and \(Y_{ij}\) is the number of experimental units with responses in the jth category. Multinomial logistic models (MLM) have been commonly used for modeling categorical responses (Glonek and McCullagh 1995; Zocchi and Atkinson 1999; Bu et al. 2020). A general MLM can be written as

$$\begin{aligned} {{\textbf{C}}}^T\log ({\textbf{L}}{\varvec{\pi }}_i)={\varvec{\eta }}_i={\textbf{X}}_i{\varvec{\theta }}, \qquad i=1,\cdots ,m \end{aligned}$$
(1)

where \({\varvec{\pi }}_i = (\pi _{i1}, \ldots , \pi _{iJ})^T\) with \(\sum _{j=1}^J \pi _{ij}=1\), \({\varvec{\eta }}_i = (\eta _{i1}, \ldots , \eta _{iJ})^T\), \({\textbf{C}}\) is a \((2J-1) \times J\) constant matrix, \({\textbf{X}}_i\) is the \(J \times p\) model matrix at \({{\textbf{x}}}_i\), \(\varvec{\theta }\in {\mathbb {R}}^p\) are the model parameters, and \({\textbf{L}}\) is a \((2J-1) \times J\) constant matrix taking different forms for four special classes of MLMs, namely, baseline-category, cumulative, adjacent-categories, and continuation-ratio logit models (see Bu et al. (2020) for more details). When \(J=2\), all four logit models reduce to logistic regression models for binary responses, which belong to generalized linear models (see Sect. 4).

3.1 Fisher information \({{\textbf{F}}}_{{{\textbf{x}}}}\) and sensitivity function \(d({{\textbf{x}}}, \varvec{\xi })\)

The \(p\times p\) matrix \({{\textbf{F}}}_{{{\textbf{x}}}}\), known as the Fisher information at \({{\textbf{x}}} \in \mathcal{X}\), plays a key role in the ForLion algorithm. We provide its formula in detail in Theorem 2.

Theorem 2

For MLM (1), the Fisher information \({\textbf{F}}_{{\textbf{x}}}\) at \({{\textbf{x}}} \in \mathcal{X}\) (or \(\mathcal{X}_{\varvec{\theta }}\) for cumulative logit models, see Bu et al. (2020)) can be written as a block matrix \(({\textbf{F}}^{{\textbf{x}}}_{st})_{J\times J} \in {\mathbb {R}}^{p\times p}\), where \({{\textbf{F}}}^{{\textbf{x}}}_{st}\), a sub-matrix of \({\textbf{F}}_{{\textbf{x}}}\) with block row index s and column index t in \(\{1, \ldots , J\}\), is given by

$$\begin{aligned} {{\textbf{F}}}^{{\textbf{x}}}_{st} = \begin{cases} u_{st}^{{\textbf{x}}} \cdot {{\textbf{h}}}_s({{\textbf{x}}}) {{\textbf{h}}}_t({{\textbf{x}}})^T & \text{for } 1\le s, t\le J-1,\\ \sum _{j=1}^{J-1} u_{jt}^{{\textbf{x}}} \cdot {{\textbf{h}}}_c({{\textbf{x}}}) {{\textbf{h}}}_t({{\textbf{x}}})^T & \text{for } s=J,\ 1\le t\le J-1,\\ \sum _{j=1}^{J-1} u_{sj}^{{\textbf{x}}} \cdot {{\textbf{h}}}_s({{\textbf{x}}}) {{\textbf{h}}}_c({{\textbf{x}}})^T & \text{for } 1\le s\le J-1,\ t=J,\\ \sum _{i=1}^{J-1} \sum _{j=1}^{J-1} u_{ij}^{{\textbf{x}}} \cdot {{\textbf{h}}}_c({{\textbf{x}}}) {{\textbf{h}}}_c({{\textbf{x}}})^T & \text{for } s=J,\ t=J, \end{cases} \end{aligned}$$

where \({{\textbf{h}}}_j({{\textbf{x}}})\) and \({{\textbf{h}}}_c({{\textbf{x}}})\) are predictors at \({{\textbf{x}}}\), and \(u_{st}^{{\textbf{x}}}\)’s are known functions of \({{\textbf{x}}}\) and \(\varvec{\theta }\) (more details can be found in Appendix A).

In response to Remark 1 in Sect. 2, we provide below a surprising example in which the Fisher information \({{\textbf{F}}}_{{\textbf{x}}}\) at a single point \({{\textbf{x}}}\) is positive definite for almost all \({{\textbf{x}}} \in \mathcal{X}\).

Example 1

Positive definite \({{\textbf{F}}}_{{\textbf{x}}}\):   We consider a special MLM (1) with non-proportional odds (npo) (see Sect. S.8 in the Supplementary Material of Bu et al. (2020) for more details). Suppose \(d=1\), a feasible design point \({{\textbf{x}}} = x \in [a, b] = \mathcal{X}\), \(J\ge 3\), and \({{\textbf{h}}}_1({{\textbf{x}}}) = \cdots = {\textbf{h}}_{J-1}({{\textbf{x}}}) \equiv x\). The model matrix at \({{\textbf{x}}} = x\) is

$$\begin{aligned} {{\textbf{X}}}_x = \left( \begin{array}{cccc} x & 0 & \cdots & 0\\ 0 & x & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x\\ 0 & 0 & \cdots & 0 \end{array}\right) _{J\times (J-1)} \end{aligned}$$

with \(p=J-1\). That is, the model equation (1) in this example is \(\varvec{\eta }_x = {{\textbf{X}}}_x \varvec{\theta } = (\beta _1x, \ldots , \beta _{J-1}x, 0)^T\), where \(\varvec{\theta }= (\beta _1, \ldots , \beta _{J-1})^T \in {\mathbb {R}}^{J-1}\) are the model parameters. Then \({{\textbf{U}}}_x = (u_{st}^x)_{s,t=1,\ldots , J}\) can be calculated from Theorem A.2 in Bu et al. (2020). Note that \(u_{sJ}^x = u_{Js}^x = 0\) for \(s=1, \ldots , J-1\) and \(u_{JJ}^x = 1\). The Fisher information matrix at \({{\textbf{x}}}=x\) is \({{\textbf{F}}}_x = {{\textbf{X}}}_x^T {{\textbf{U}}}_x {{\textbf{X}}}_x = x^2 {{\textbf{V}}}_x\), where \({{\textbf{V}}}_x = (u_{st}^x)_{s,t=1,\ldots , J-1}\). Then \(|{{\textbf{F}}}_x| = x^{2(J-1)} |{{\textbf{V}}}_x|\). According to Equation (S.1) and Lemma S.9 in the Supplementary Material of Bu et al. (2020),

$$\begin{aligned} |{{\textbf{V}}}_x| = \begin{cases} \prod _{j=1}^J \pi _j^x & \text{for baseline-category, adjacent-categories,}\\ & \text{and continuation-ratio logit models,}\\ \dfrac{\left[ \prod _{j=1}^{J-1} \gamma _j^x (1-\gamma _j^x)\right] ^2}{\prod _{j=1}^J \pi _j^x} & \text{for cumulative logit models,} \end{cases} \end{aligned}$$

which is always positive, where \(\gamma _j^x = \sum _{l=1}^j \pi _l^x \in (0,1)\), \(j=1, \ldots , J-1\). In other words, \(\textrm{rank}({\textbf{F}}_x)=p\) in this case, as long as \(x\ne 0\). \(\Box \)

There also exists an example of a special MLM such that \({\textbf{F}}_{{{\textbf{x}}}} = {{\textbf{F}}}_{{{\textbf{x}}}'}\) but \({{\textbf{x}}} \ne {{\textbf{x}}}'\) (see Appendix B).

To search for a new design point \({{\textbf{x}}}^*\) in Step \(5^\circ \) of Algorithm 1, we utilize the R function optim with the option "L-BFGS-B", which allows box constraints. L-BFGS-B is a limited-memory version of the BFGS algorithm, one of several quasi-Newton methods (Byrd et al. 1995). It works fairly well in finding solutions even at the boundaries of the box constraints. We give explicit formulae for computing \(d({\textbf{x}}, \varvec{\xi })\) below and provide the first-order derivative of the sensitivity function for MLM (1) in Appendix C.
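A minimal R sketch of this Step \(5^\circ \) search with random multistarts (see Remark 4) could look as follows; d_fun, a user-supplied sensitivity function, and the other names are ours:

```r
# Sketch (names are ours): maximize d(x, xi) over a box via L-BFGS-B,
# restarting from several random points to guard against local maxima.
find_new_point <- function(d_fun, xi, lower, upper, n_starts = 5) {
  best <- NULL
  for (s in seq_len(n_starts)) {
    x0 <- runif(length(lower), lower, upper)     # random starting point
    fit <- optim(x0, function(x) -d_fun(x, xi),  # optim minimizes, so negate
                 method = "L-BFGS-B", lower = lower, upper = upper)
    if (is.null(best) || fit$value < best$value) best <- fit
  }
  list(x_star = best$par, d_star = -best$value)
}
```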

Theorem 3

Consider MLM (1) with a compact \({\mathcal X}\). A design \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\}\) with \(f({\varvec{\xi }}) > 0\) is D-optimal if and only if \(\max _{{{\textbf{x}}} \in {{\mathcal {X}}}} d({{\textbf{x}}}, {\varvec{\xi }}) \le p\), where

$$\begin{aligned} d({{\textbf{x}}}, \varvec{\xi }) &= \sum _{j=1}^{J-1} u_{jj}^{{\textbf{x}}} ({{\textbf{h}}}_j^{{\textbf{x}}})^T {{\textbf{C}}}_{jj} {{\textbf{h}}}_j^{{\textbf{x}}} + \sum _{i=1}^{J-1} \sum _{j=1}^{J-1} u_{ij}^{{\textbf{x}}} \cdot ({{\textbf{h}}}_c^{{\textbf{x}}})^T {{\textbf{C}}}_{JJ} {{\textbf{h}}}_c^{{\textbf{x}}} \\ &\quad + 2\sum _{i=1}^{J-2} \sum _{j=i+1}^{J-1} u_{ij}^{{\textbf{x}}} ({{\textbf{h}}}_j^{{\textbf{x}}})^T {{\textbf{C}}}_{ij} {{\textbf{h}}}_i^{{\textbf{x}}} + 2\sum _{i=1}^{J-1} \sum _{j=1}^{J-1} u_{ij}^{{\textbf{x}}} ({{\textbf{h}}}_c^{{\textbf{x}}})^T {{\textbf{C}}}_{iJ} {{\textbf{h}}}_i^{{\textbf{x}}} \end{aligned}$$
(2)

where \({{\textbf{C}}}_{ij} \in {\mathbb {R}}^{p_i\times p_j}\) is a submatrix of the \(p\times p\) matrix

$$\begin{aligned} {{\textbf{F}}}(\varvec{\xi })^{-1} = \left[ \begin{array}{ccc} {{\textbf{C}}}_{11} & \cdots & {{\textbf{C}}}_{1J}\\ \vdots & \ddots & \vdots \\ {{\textbf{C}}}_{J1} & \cdots & {{\textbf{C}}}_{JJ} \end{array}\right] \end{aligned}$$

with \(i,j=1, \ldots , J\), \(p=\sum _{j=1}^J p_j\), and \(p_J=p_c\). \(\Box \)

3.2 Example: emergence of house flies

In this section, we revisit the motivating example at the beginning of Sect. 1. The original design (Itepan 1995) assigned \(n_i=500\) pupae to each of \(m=7\) doses, \(x_i = 80, 100, 120, 140, 160, 180, 200\), respectively, which corresponds to the uniform design \(\varvec{\xi }_u\) in Table 1. Under a continuation-ratio non-proportional odds (npo) model adopted by Zocchi and Atkinson (1999)

$$\begin{aligned} \log \left( \frac{\pi _{i1}}{\pi _{i2} + \pi _{i3}}\right) &= \beta _{11} + \beta _{12} x_i + \beta _{13} x_i^2,\\ \log \left( \frac{\pi _{i2}}{\pi _{i3}}\right) &= \beta _{21} + \beta _{22} x_i, \end{aligned}$$

with fitted parameters \(\varvec{\theta }= (\beta _{11}, \beta _{12}, \beta _{13}, \beta _{21}, \beta _{22})^T = (-1.935, -0.02642, 0.0003174, -9.159, 0.06386)^T\), Bu et al. (2020) obtained D-optimal designs under different grid sizes using their lift-one algorithm proposed for discrete factors. With a grid size of 20, that is, using the design space \(\{80, 100, \ldots , 200\}\) evenly spaced by 20 units, they obtained a design \(\varvec{\xi }_{20}\) containing four design points. With finer grid points on the same interval [80, 200], both their grid-5 design \(\varvec{\xi }_5\) (with design space \(\{80, 85, \ldots , 200\}\)) and grid-1 design \(\varvec{\xi }_1\) (with design space \(\{80, 81, \ldots , 200\}\)) contain five design points (see Table 1). By incorporating the Fedorov-Wynn and lift-one algorithms and continuously searching the extended region [0, 200], Ai et al. (2023) obtained a four-point design \({\varvec{\xi }}_{a}\) (see Example S1 in their Supplementary Material).
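For intuition, the fitted model can be inverted to recover the category probabilities \((\pi _{i1}, \pi _{i2}, \pi _{i3})\) at any dose; a small R sketch (the function name is ours):

```r
# Sketch (names are ours): category probabilities at dose x under the
# fitted continuation-ratio npo model above.
theta <- c(-1.935, -0.02642, 0.0003174, -9.159, 0.06386)
flies_pi <- function(x, th = theta) {
  eta1 <- th[1] + th[2] * x + th[3] * x^2  # log(pi1 / (pi2 + pi3))
  eta2 <- th[4] + th[5] * x                # log(pi2 / pi3)
  pi1 <- exp(eta1) / (1 + exp(eta1))
  pi2 <- (1 - pi1) * exp(eta2) / (1 + exp(eta2))
  c(pi1 = pi1, pi2 = pi2, pi3 = 1 - pi1 - pi2)
}
flies_pi(120)  # probabilities at dose 120 Gy
```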

For \(x \in {{\mathcal {X}}} = [80, 200]\) as in Bu et al. (2020), our ForLion algorithm reports \(\varvec{\xi }_*\) with only three design points (see Table 1, as well as the Supplementary Material, Sect. S.2 for details). Compared with \(\varvec{\xi }_*\), the relative efficiencies of \(\varvec{\xi }_u\), \(\varvec{\xi }_{20}\), \(\varvec{\xi }_5\), and \(\varvec{\xi }_1\), defined as \(\left[ f(\varvec{\xi })/f(\varvec{\xi }_*)\right] ^{1/5}\), are \(82.79\%\), \(99.68\%\), \(99.91\%\), and \(99.997\%\), respectively. The increasing pattern of relative efficiencies, from \(\varvec{\xi }_{20}\) to \(\varvec{\xi }_{5}\) and \(\varvec{\xi }_{1}\), indicates that with finer grid points, the lift-one algorithm can search the design space more thoroughly and find better designs. For \(x \in [0,200]\) as in Ai et al. (2023), our algorithm yields \({\varvec{\xi }}_*'\) with only three points (see Table 1), and the relative efficiency of Ai et al. (2023)’s \({\varvec{\xi }}_a\) with respect to \({\varvec{\xi }}_*'\) is \(99.81\%\). Note that both \(\varvec{\xi }_*\) and \(\varvec{\xi }_*'\) from our algorithm contain only three experimental settings, achieving the minimum m justified by Bu et al. (2020). Given that using the radiation device is expensive and each run of the experiment takes at least hours, our designs can save \(40\%\) or \(25\%\) of the cost and time compared with Bu et al. (2020)’s and Ai et al. (2023)’s designs, respectively.

Table 1 Designs for the emergence of house flies experiment

To further check whether the improvements achieved by our designs are merely due to randomness, we conduct a simulation study by generating 100 bootstrapped replicates from the original data. For each bootstrapped dataset, we obtain the fitted parameters, treat them as the true values, and obtain D-optimal designs by Ai et al. (2023)’s algorithm, Bu et al. (2020)’s grid-1 algorithm, and our ForLion algorithm with \(\delta = 0.1\) and \(\epsilon =10^{-10}\), respectively. As Fig. 1 shows, our ForLion algorithm achieves the most efficient designs with the fewest design points. Indeed, the median number of design points is 5 for Ai’s, 4 for Bu’s, and 3 for ForLion’s. Compared with ForLion’s, the mean relative efficiency is \(99.82\%\) for Ai’s and \(99.99\%\) for Bu’s. As for computational time, the median time cost on a Windows 11 desktop with 32GB RAM and an AMD Ryzen 7 5700G processor is 3.59s for Ai’s, 161.81s for Bu’s, and 44.88s for ForLion’s.

Fig. 1 Boxplots of Ai’s, Bu’s, and ForLion’s designs for 100 bootstrapped data

4 D-optimal designs for GLMs

In this section, we consider experiments with a univariate response Y, which follows a distribution \(f(y;\theta ) = \exp \{y b(\theta ) + c(\theta ) + d(y)\}\) in the exponential family with a single parameter \(\theta \). Examples include binary response \(Y\sim \) Bernoulli\((\theta )\), count response \(Y\sim \) Poisson\((\theta )\), positive response \(Y\sim \) Gamma\((\kappa , \theta )\) with known \(\kappa >0\), and continuous response \(Y\sim N(\theta , \sigma ^2)\) with known \(\sigma ^2 > 0\) (McCullagh and Nelder 1989). Suppose independent responses \(Y_1, \ldots , Y_n\) are collected with corresponding factor level combinations \({{\textbf{x}}}_1, \ldots , {{\textbf{x}}}_n \in {{\mathcal {X}}} \subset {\mathbb {R}}^{d}\), where \({{\textbf{x}}}_i = (x_{i1}, \ldots , x_{id})^T\). Under a generalized linear model (GLM), there exists a link function g, parameters of interest \(\varvec{\beta } = (\beta _1, \beta _2, \ldots , \beta _p)^T\), and the corresponding vector of p known and deterministic predictor functions \({{\textbf{h}}} = (h_1, \ldots , h_p)^T\), such that

$$\begin{aligned} E(Y_i) = \mu _i \text{ and } \eta _i = g(\mu _i)= {\textbf{X}}_i^T\varvec{\beta } \end{aligned}$$
(3)

where \({{\textbf{X}}}_i = {{\textbf{h}}}({{\textbf{x}}}_i) = (h_1({\textbf{x}}_i), \ldots , h_p({{\textbf{x}}}_i))^T\), \(i=1, \ldots , n\). For many applications, \(h_1({{\textbf{x}}}_i)\equiv 1\) represents the intercept of the model.

4.1 ForLion algorithm specialized for GLM

Due to the specific form of GLM’s Fisher information (see Sect. S.4 and (S4.2) in the Supplementary Material), the lift-one algorithm can be extremely efficient by utilizing analytic solutions for each iteration (Yang and Mandal 2015). In this section, we specialize the ForLion algorithm for GLM with explicit formulae in Steps \(3^\circ \), \(5^\circ \), and \(6^\circ \).

For GLM (3), our goal is to find a design \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\}\) maximizing \(f({\varvec{\xi }}) = |{{\textbf{X}}}_{\varvec{\xi }}^T{\textbf{W}}_{\varvec{\xi }}{{\textbf{X}}}_{\varvec{\xi }}|\), where \({\textbf{X}}_{\varvec{\xi }} = ({{\textbf{h}}}({{\textbf{x}}}_1), \ldots , {\textbf{h}}({{\textbf{x}}}_m))^T \in {\mathbb {R}}^{m\times p}\) with known predictor functions \(h_1, \ldots , h_p\), and \({\textbf{W}}_{\varvec{\xi }} = \textrm{diag}\{w_1 \nu ({\varvec{\beta }}^T {{\textbf{h}}}({{\textbf{x}}}_1)), \ldots , w_m \nu ({\varvec{\beta }}^T {{\textbf{h}}}({{\textbf{x}}}_m))\}\) with known parameters \(\varvec{\beta }= (\beta _1, \ldots , \beta _p)^T\) and a positive differentiable function \(\nu \), where \(\nu (\eta _i) = (\partial \mu _i/\partial \eta _i)^2/\textrm{Var}(Y_{i})\) for \(i=1, \ldots , m\) (see Sects. S.1 and S.4 in the Supplementary Material for examples and more technical details). The sensitivity function \(d({{\textbf{x}}}, {\varvec{\xi }}) = \textrm{tr}({{\textbf{F}}}({\varvec{\xi }})^{-1} {{\textbf{F}}}_{{\textbf{x}}})\) in Step \(5^\circ \) of Algorithm 1 can be written as \(\nu ({\varvec{\beta }}^T {{\textbf{h}}}({{\textbf{x}}})) \cdot {{\textbf{h}}}({{\textbf{x}}})^T ({{\textbf{X}}}_{\varvec{\xi }}^T{\textbf{W}}_{\varvec{\xi }}{{\textbf{X}}}_{\varvec{\xi }})^{-1} {\textbf{h}}({{\textbf{x}}})\). As a direct conclusion of the general equivalence theorem (see Theorem 2.2 in Fedorov and Leonov (2014)), we have the following theorem for GLMs.

Theorem 4

Consider GLM (3) with a compact design region \({\mathcal X}\). A design \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\}\) with \(f({\varvec{\xi }}) = |{\textbf{X}}_{\varvec{\xi }}^T{{\textbf{W}}}_{\varvec{\xi }}{\textbf{X}}_{\varvec{\xi }}| > 0\) is D-optimal if and only if

$$\begin{aligned} \max _{{{\textbf{x}}} \in {{\mathcal {X}}}} \nu ({\varvec{\beta }}^T {{\textbf{h}}}({{\textbf{x}}})) \cdot {{\textbf{h}}}({{\textbf{x}}})^T ({\textbf{X}}_{\varvec{\xi }}^T{{\textbf{W}}}_{\varvec{\xi }}{\textbf{X}}_{\varvec{\xi }})^{-1} {{\textbf{h}}}({{\textbf{x}}}) \le p \end{aligned}$$
(4)
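For computation, the left-hand side of (4) is just the GLM sensitivity function \(d({{\textbf{x}}}, \varvec{\xi })\); a minimal R sketch (names are ours; h returns the p predictor values \({{\textbf{h}}}({{\textbf{x}}})\) and nu is the GLM-specific \(\nu \) function):

```r
# Sketch (names are ours): d(x, xi) = nu(beta' h(x)) h(x)' (X' W X)^{-1} h(x).
glm_sensitivity <- function(x, X, w, beta, h, nu) {
  eta <- drop(X %*% beta)                 # linear predictors at the design points
  W <- diag(w * nu(eta), nrow = nrow(X))  # W_xi = diag{w_i nu(beta' h(x_i))}
  Finv <- solve(t(X) %*% W %*% X)         # inverse Fisher information, up to n
  hx <- h(x)
  nu(sum(beta * hx)) * drop(t(hx) %*% Finv %*% hx)
}
```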

Given the design \({\varvec{\xi }}_t = \{({{\textbf{x}}}_i^{(t)}, w_i^{(t)}), i=1, \ldots , m_t\}\) at the tth iteration, suppose in Step \(5^\circ \) we find the design point \({{\textbf{x}}}^* \in {{\mathcal {X}}}\) maximizing \(d({{\textbf{x}}}, {\varvec{\xi }}_t) = \nu ({\varvec{\beta }}^T {{\textbf{h}}}({{\textbf{x}}})) \cdot {\textbf{h}}({{\textbf{x}}})^T ({{\textbf{X}}}_{{\varvec{\xi }}_t}^T{\textbf{W}}_{{\varvec{\xi }}_t}{{\textbf{X}}}_{{\varvec{\xi }}_t})^{-1} {{\textbf{h}}}({{\textbf{x}}})\) according to Theorem 4. Recall that \(d({{\textbf{x}}}^*, {\varvec{\xi }}_t) \le p\) in this step implies the optimality of \({\varvec{\xi }}_t\) and the end of the iterations. If \(d({\textbf{x}}^*, {\varvec{\xi }}_t) > p\), \({{\textbf{x}}}^*\) will be added to form the updated design \({\varvec{\xi }}_{t+1}\). For GLMs, instead of letting \({\varvec{\xi }}_{t+1} = {\varvec{\xi }}_t \bigcup \{({{\textbf{x}}}^*, 0)\}\), we recommend \(\{({\textbf{x}}_i^{(t)}, (1-\alpha _t)w_i^{(t)}), i=1, \ldots , m_t\} \bigcup \{({{\textbf{x}}}^*, \alpha _t)\}\), denoted by \((1-\alpha _t) {\varvec{\xi }}_t \bigcup \{({{\textbf{x}}}^*, \alpha _t)\}\) for simplicity, where \(\alpha _t \in [0,1]\) is an initial allocation for the new design point \({\textbf{x}}^*\), determined by Theorem 5.

Theorem 5

Given \({\varvec{\xi }}_t = \{({{\textbf{x}}}_i^{(t)}, w_i^{(t)}), i=1, \ldots , m_t\}\) and \({{\textbf{x}}}^* \in {{\mathcal {X}}}\), if we consider \({\varvec{\xi }}_{t+1}\) in the form of \((1-\alpha ) {\varvec{\xi }}_t \bigcup \{({{\textbf{x}}}^*, \alpha )\}\) with \(\alpha \in [0,1]\), then

$$\begin{aligned} \alpha _t = \begin{cases} \frac{2^p\cdot d_t - (p+1) b_t}{p(2^p\cdot d_t - 2 b_t)} & \text{ if } 2^p\cdot d_t > (p+1) b_t,\\ 0 & \text{ otherwise,} \end{cases} \end{aligned}$$

maximizes \(f({\varvec{\xi }}_{t+1})\) with \(d_t = f(\{({{\textbf{x}}}^{(t)}_1, w^{(t)}_1/2), \ldots , ({{\textbf{x}}}^{(t)}_{m_t}, w^{(t)}_{m_t}/2), ({{\textbf{x}}}^*, 1/2)\})\) and \(b_t = f({\varvec{\xi }}_t)\).

Based on \(\alpha _t\) in Theorem 5, which is obtained essentially via one iteration of the lift-one algorithm (Yang et al. 2016; Yang and Mandal 2015) with \({{\textbf{x}}}^*\) added, we update Step \(6^\circ \) of Algorithm 1 with Step \(6'\) for GLMs, which speeds up the ForLion algorithm significantly.

  • \(6'\) If \(d({{\textbf{x}}}^*, {\varvec{\xi }}_t) \le p\), go to Step \(7^\circ \). Otherwise, we let \({\varvec{\xi }}_{t+1} = (1-\alpha _t){\varvec{\xi }}_t \bigcup \{({{\textbf{x}}}^*, \alpha _t)\}\), replace t by \(t+1\), and go back to Step \(2^\circ \), where \(\alpha _t\) is given by Theorem 5.
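For illustration, the initial allocation \(\alpha _t\) of Theorem 5 requires only one extra criterion evaluation beyond the current value \(b_t\); a small R sketch, assuming a user-supplied evaluator f_det(points, weights) returning \(|{{\textbf{X}}}_{\varvec{\xi }}^T{\textbf{W}}_{\varvec{\xi }}{{\textbf{X}}}_{\varvec{\xi }}|\) (all names are ours):

```r
# Sketch (names are ours): initial weight for a new point x_star (Theorem 5).
# pts: list of current design points; w: their weights; p: number of parameters.
alpha_new_point <- function(f_det, pts, w, x_star, p) {
  b_t <- f_det(pts, w)                                 # f(xi_t)
  d_t <- f_det(c(pts, list(x_star)), c(w / 2, 1 / 2))  # halve weights, add x* at 1/2
  if (2^p * d_t > (p + 1) * b_t) {
    (2^p * d_t - (p + 1) * b_t) / (p * (2^p * d_t - 2 * b_t))
  } else 0
}
```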

The advantages of the lift-one algorithm over commonly used numerical algorithms include simplified computation and exact zero weight for negligible design points. For GLMs, Step \(3^\circ \) of Algorithm 1 should be specialized with analytic iterations as in Yang and Mandal (2015). We provide the explicit formula for the sensitivity function’s first-order derivative in Sect. S.5 of the Supplementary Material for Step \(5^\circ \).

By utilizing the analytical solutions for GLM in Steps \(3^\circ \), \(5^\circ \) and \(6'\), the computation is much faster than the general procedure of the ForLion algorithm (see Example 2 in Sect. 4.3).

4.2 Minimally supported design and initial design

A minimally supported design \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\}\) achieves the smallest possible m such that \(f({\varvec{\xi }}) > 0\), or equivalently, such that the Fisher information matrix is of full rank. Due to the existence of Example 1, m could be as small as 1 for an MLM. Nevertheless, m must be at least p for GLMs due to the following theorem.

Theorem 6

Consider a design \(\varvec{\xi }= \{({{\textbf{x}}}_i, w_i), i=1, \ldots , m\}\) with m support points, that is, \(w_i > 0\), \(i=1, \ldots , m\), for GLM (3). Then \(f({\varvec{\xi }}) > 0\) only if \({{\textbf{X}}}_{\varvec{\xi }}\) is of full column rank p, that is, \(\textrm{rank}({{\textbf{X}}}_{\varvec{\xi }}) = p\). Therefore, a minimally supported design contains at least p support points. Furthermore, if \(\nu ({\varvec{\beta }}^T {{\textbf{h}}}({{\textbf{x}}})) > 0\) for all \({{\textbf{x}}}\) in the design region \({{\mathcal {X}}}\), then \(f({\varvec{\xi }}) > 0\) if and only if \(\textrm{rank}({\textbf{X}}_{\varvec{\xi }}) = p\).

Theorem 7 shows that a minimally supported D-optimal design under a GLM must be a uniform design.

Theorem 7

Consider a minimally supported design \(\varvec{\xi }= \{({\textbf{x}}_i, w_i), i=1, \ldots , p\}\) for GLM (3) that satisfies \(f({\varvec{\xi }}) > 0\). It is D-optimal only if \(w_i=p^{-1}\), \(i=1, \ldots , p\). That is, it is a uniform allocation of its support points.

Based on Theorem 7, we recommend a minimally supported design as the initial design for the ForLion algorithm under GLMs. The advantage is that once p design points \({\textbf{x}}_1, \ldots , {{\textbf{x}}}_p\) are chosen from \({{\mathcal {X}}}\), such that the model matrix \({{\textbf{X}}}_{\varvec{\xi }} = ({\textbf{h}}({{\textbf{x}}}_1), \ldots , {{\textbf{h}}}({{\textbf{x}}}_p))^T\) is of full rank p, then the design \(\varvec{\xi }= \{({{\textbf{x}}}_i, 1/p), i=1, \ldots , p\}\) is D-optimal given those p design points.

Recall that a typical design space can take the form \({\mathcal X} = \prod _{j=1}^d I_j\), where \(I_j\) is either a finite set of distinct numerical levels or an interval \([a_j, b_j]\), with \(a_j = \min I_j\) and \(b_j = \max I_j\) even if \(I_j\) is a finite set. As one option in Step \(1^\circ \) of Algorithm 1, we suggest choosing p initial design points from \(\prod _{j=1}^d \{a_j, b_j\}\). For typical applications, we may assume that there exist p distinct points in \(\prod _{j=1}^d \{a_j, b_j\}\) such that the corresponding model matrix \({{\textbf{X}}}_{\varvec{\xi }}\) is of full rank, or equivalently, that the \(2^d\times p\) matrix consisting of rows \({{\textbf{h}}}({{\textbf{x}}})^T, {{\textbf{x}}} \in \prod _{j=1}^d \{a_j, b_j\}\) is of full rank p. Herein, we specialize Step \(1^\circ \) of Algorithm 1 for GLMs as follows:

\(1'\):

Construct an initial design \({\varvec{\xi }}_0 = \{({{\textbf{x}}}_i^{(0)}, p^{-1}), i=1, \ldots , p\}\) such that \({{\textbf{x}}}_1^{(0)}, \ldots , {{\textbf{x}}}_p^{(0)} \in \prod _{j=1}^d \{a_j, b_j\}\) and \({{\textbf{X}}}_{\varvec{\xi }_0} = ({{\textbf{h}}}({{\textbf{x}}}_1^{(0)}), \ldots , {{\textbf{h}}}({{\textbf{x}}}_p^{(0)}))^T\) is of full rank p.
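A minimal R sketch of Step \(1'\), greedily selecting vertices of \(\prod _{j=1}^d \{a_j, b_j\}\) until the model matrix reaches full column rank (all names are ours):

```r
# Sketch (names are ours): a minimally supported uniform initial design.
# h: predictor function returning a p-vector; a, b: factor-wise bounds.
initial_design_glm <- function(h, a, b, p) {
  V <- as.matrix(expand.grid(Map(c, a, b)))      # all 2^d vertices of the box
  idx <- integer(0)
  for (i in seq_len(nrow(V))) {
    cand <- c(idx, i)
    M <- t(sapply(cand, function(r) h(V[r, ])))
    if (qr(M)$rank == length(cand)) idx <- cand  # keep vertex if rank grows
    if (length(idx) == p) break
  }
  stopifnot(length(idx) == p)                    # full-rank assumption (Sect. 4.2)
  list(points = V[idx, , drop = FALSE], weights = rep(1 / p, p))
}
```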

4.3 Examples under GLMs

In this section, we use two examples to show the performance of our ForLion algorithm under GLMs, both involving continuous factors only through main effects, which simplifies the notation (see Sect. S.6 in the Supplementary Material).

Example 2

In Example 4.7, Stufken and Yang (2012) considered a logistic model with three continuous factors, \(\textrm{logit}(\mu _i) = \beta _0 + \beta _1 x_{i1} + \beta _2 x_{i2} + \beta _3 x_{i3}\), with \(x_{i1} \in [-2, 2]\), \(x_{i2} \in [-1,1]\), and \(x_{i3} \in (-\infty , \infty )\). Assuming \((\beta _0, \beta _1, \beta _2, \beta _3) = (1, -0.5, 0.5, 1)\), they obtained an 8-point D-optimal design \({\varvec{\xi }}_o\) theoretically. Using a MacOS-based laptop with a 2 GHz Quad-Core CPU and 16GB 3733 MHz memory, our ForLion algorithm specialized for GLMs takes 17 s and our general ForLion algorithm takes 124 s to get the same design \({\varvec{\xi }}_{*}\) (see Table 2). The relative efficiency of \({\varvec{\xi }}_{*}\) compared with \({\varvec{\xi }}_o\) is simply \(100\%\). Note that \({\varvec{\xi }}_o\) was obtained from an analytic approach requiring an unbounded \(x_{i3}\). For bounded \(x_{i3}\), such as \(x_{i3}\in [-1, 1]\), \([-2, 2]\), or \([-3, 3]\), we can still use the ForLion algorithm to obtain the corresponding D-optimal designs, whose relative efficiencies compared with \({\varvec{\xi }}_o\) or \({\varvec{\xi }}_{*}\) are \(85.55\%\), \(99.13\%\), and \(99.99993\%\), respectively. \(\Box \)

Example 3

Lukemire et al. (2019) reconsidered the electrostatic discharge (ESD) experiment described by Whitman et al. (2006) with a binary response and five mixed factors. The first four factors LotA, LotB, ESD, Pulse take values in \(\{-1, 1\}\), and the fifth factor Voltage \(\in [25, 45]\) is continuous. Using their d-QPSO algorithm, Lukemire et al. (2019) obtained a 13-point design \({\varvec{\xi }}_o\) for the model \(\textrm{logit} (\mu ) = \beta _0 + \beta _1 \texttt {Lot A} + \beta _2 \texttt {Lot B} + \beta _3 \texttt {ESD} + \beta _4 \texttt {Pulse} + \beta _5 \texttt {Voltage} + \beta _{34} (\texttt{ESD}\times \texttt {Pulse})\) with assumed parameter values \(\varvec{\beta }= (-7.5, 1.50, -0.2, -0.15, 0.25, 0.35, 0.4)^T\). It takes 88 seconds using the same laptop as in Example 2 for our (GLM) ForLion algorithm to find a slightly better design \({\varvec{\xi }}_{*}\) consisting of 14 points (see Table S2 in the Supplementary Material), with relative efficiency slightly above \(100\%\).

Table 2 Designs obtained for Example 2

To make a thorough comparison, we randomly generate 100 sets of parameters \(\varvec{\beta }\) from independent uniform distributions: U(1.0, 2.0) for LotA, \(U(-0.3, -0.1)\) for LotB, \(U(-0.3, 0.0)\) for ESD, U(0.1, 0.4) for Pulse, U(0.25, 0.45) for Voltage, U(0.35, 0.45) for ESD\(\times \)Pulse, and \(U(-8.0, -7.0)\) for the intercept. For each simulated \(\varvec{\beta }\), we treat it as the true parameter values and obtain D-optimal designs using the d-QPSO, ForLion, and Fedorov-Wynn-liftone (that is, the ForLion algorithm without the merging step, similar in spirit to Ai et al. (2023)’s) algorithms. For the ForLion algorithm, we use \(\delta =0.03\) and \(\epsilon =10^{-8}\) (see Sect. S.7 in the Supplementary Material for more discussion on choosing \(\delta \)). For the d-QPSO algorithm, following Lukemire et al. (2019), we use 5 swarms with 30 particles each, and the algorithm searches for designs with up to 18 support points with a maximum of 4,000 iterations. Figure 2 shows the numbers of design points of the three algorithms and the relative efficiencies. The median number of design points is 13 for d-QPSO’s, 39 for Fedorov-Wynn-liftone’s, and 13 for ForLion’s. The mean relative efficiencies compared with our ForLion D-optimal designs, defined as \([f(\cdot )/f({\varvec{\xi }}_{\textrm{ForLion}})]^{1/7}\), are \(86.95\%\) for d-QPSO’s and \(100\%\) for Fedorov-Wynn-liftone’s. The median running time on the same desktop as in Sect. 3.2 is 10.94s for d-QPSO’s, 129.74s for Fedorov-Wynn-liftone’s, and 71.21s for ForLion’s. \(\Box \)

Fig. 2 Boxplots of 100 simulations for d-QPSO, Fedorov-Wynn-liftone, and ForLion algorithms for the electrostatic discharge experiment (Example 3)

5 Conclusion and discussion

In this paper, we develop the ForLion algorithm to find locally D-optimal approximate designs under fairly general parametric models with mixed factors.

Compared with Bu et al. (2020)’s and Ai et al. (2023)’s algorithms, our ForLion algorithm can reduce the number of distinct experimental settings by \(25\%\) or more on average while keeping the highest possible efficiency (see Sect. 3.2). In general, for experiments such as the emergence of house flies (see Sect. 3.2), the total experimental cost relies not only on the number of experimental units but also on the number of distinct experimental settings (or runs). For such experiments, an experimental plan with fewer distinct settings may allow the experimenter to support more experimental units under the same budget limit and complete the experiment with less time cost. Under these circumstances, our ForLion algorithm may be extended to a modified design problem that maximizes \(n^p|{{\textbf{F}}}(\varvec{\xi })|\) under a budget constraint \(mC_r + nC_u \le C_0\), which incorporates the experimental run cost \(C_r\) and the experimental unit cost \(C_u\), and perhaps a time constraint \(m\le M_0\) as well.

Compared with PSO-type algorithms, the ForLion algorithm can improve the relative efficiency by \(17.5\%\) or more on average while achieving a low number of distinct experimental settings (see Example 3). Our ForLion algorithm may be extended to other optimality criteria by adjusting the corresponding objective function \(f(\varvec{\xi })\) and sensitivity function \(d({\textbf{x}}, \varvec{\xi })\), along with a lift-one algorithm modified accordingly to align with those criteria.

In Step \(5^\circ \) of Algorithm 1, the search for a new design point \({{\textbf{x}}}^*\) involves solving for the continuous design variables at each level combination of the discrete variables. When the number of discrete variables increases, the number of scenarios grows exponentially, which may cause computational inefficiency. In this case, one possible solution is to treat some of the discrete variables as continuous variables first, run the ForLion algorithm to obtain a design with continuous levels of those discrete factors, and then modify the design points by rounding the continuous levels of the discrete factors to their nearest feasible levels. For a binary factor, one may simply use [0, 1] or \([-1,1]\) as a continuous region of possible levels. Note that there are also experiments with discrete factors involving three or more categorical levels. For example, the factor Cleaning Method in an experiment on a polysilicon deposition process for manufacturing very large scale integrated circuits has three levels, namely, None, CM\(_2\), and CM\(_3\) (Phadke 1989). For those scenarios, one may first transform such a discrete factor, say, the Cleaning Method, to two dummy variables (that is, the indicator variables for CM\(_2\) and CM\(_3\), respectively) taking values in \(\{0,1\}\), then run the ForLion algorithm by treating them as two continuous variables taking values in [0, 1], as sketched below.
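As a small illustration of this dummy-variable relaxation (the level codings and function names are ours):

```r
# Sketch (names are ours): relax a 3-level factor into two dummies on [0, 1],
# then round a relaxed point back to the nearest feasible level.
levels_cm <- rbind(None = c(0, 0), CM2 = c(1, 0), CM3 = c(0, 1))
round_back <- function(z) {  # z = relaxed (dummy CM2, dummy CM3) in [0, 1]^2
  rownames(levels_cm)[which.min(rowSums((levels_cm - rep(z, each = 3))^2))]
}
round_back(c(0.9, 0.2))  # -> "CM2"
```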

6 Supplementary information

We provide explicit formulae of \(u_{st}^{{\textbf{x}}}\) in the Fisher information \({\textbf{F}}_{{\textbf{x}}}\) for MLM (1) in Appendix A, a special example of an MLM such that \({{\textbf{F}}}_{{{\textbf{x}}}} = {\textbf{F}}_{{{\textbf{x}}}'}\) with \({{\textbf{x}}} \ne {{\textbf{x}}}'\) in Appendix B, and the first-order derivative of \(d({\textbf{x}}, \varvec{\xi })\) for MLM (1) in Appendix C.

6.1 Supplementary material

Contents of the Supplementary Material are listed below:

S.1 Commonly used GLMs: a list of commonly used GLMs, corresponding link functions, \(\nu \) functions, and their first-order derivatives;

S.2 Technical details of the house flies example: technical details of applying the ForLion algorithm to the emergence of house flies example;

S.3 Example: minimizing surface defects: an example with a cumulative logit po model that shows the advantages of the ForLion algorithm;

S.4 Fisher information matrix for GLMs: formulae for computing the Fisher information matrix for GLMs;

S.5 First-order derivative of the sensitivity function for GLMs: formulae of \(\partial d({{\textbf{x}}}, \varvec{\xi })/\partial x_i\) for GLMs;

S.6 GLMs with main-effects continuous factors: details of GLMs with main-effects continuous factors;

S.7 Electrostatic discharge example supplementary: the optimal design table for the electrostatic discharge example and a simulation study on the effects of the merging threshold \(\delta \);

S.8 Assumptions needed for Theorem 1;

S.9 Proofs: proofs for theorems in this paper.