A sequential designing-modeling technique when the input factors are not equally important

Elsawah, A. M.; Wang, Yi-An; Chen, Zhihan; Tank, Fatih

doi:10.1007/s40314-023-02519-z

Published: 07 December 2023

Volume 43, article number 9, (2024)

Computational and Applied Mathematics

A. M. Elsawah (ORCID: orcid.org/0000-0001-6116-4779)¹,²,³,
Yi-An Wang²,
Zhihan Chen² &
…
Fatih Tank⁴

Abstract

The first thing that springs to mind for understanding, forecasting, and improving the behavior of a complex system is a data-based model. This paper presents a sequential designing-modeling technique for the case in which the input factors do not have the same influence. The power of combining the design of experiments approach with the modeling approach is investigated. The proposed technique adds the input factors to the process and designs and models them one after the other. At each step, one input factor is added based on its significance (impact), while each remaining input factor is set at its highest-influence point (value). Ranking the factors in terms of significance and determining the point that has the highest effect for each factor are investigated. A comparison study between the new proposed sequential-stages technique (SeqST) and the classical single-stage technique (SinST) is given. The main results show that: (i) the SeqST performs better than the SinST under different experimental conditions and scenarios, (ii) the difference in performance between the SeqST and the SinST is larger when an experiment has a small number of training points than when it has a large number, (iii) the difference in performance between the SeqST and the SinST is larger when there are huge gaps between the importance of the factors than when the gaps are small, (iv) the SeqST performs much better when the correct order of the importance of the factors is used, and (v) the SeqST performs much better when the numbers of training points in the follow-up stages are in descending order. In conclusion, for experiments with few trials and/or big gaps between the factors’ importance, it is highly recommended to use the SeqST with the ascending order of the factors’ importance and a decreasing order of the numbers of training points in the follow-up stages. This study gives a benchmark that guides experimenters in effectively designing and modeling their experiments.


1 Introduction

The first thing that springs to mind for understanding, forecasting, and improving the behavior of complex experiments for real-life phenomena, industrial applications, and scientific investigations is a data-based model. Designing and modeling a studied experiment are the two key stages for this purpose. The main purpose of the first stage, designing the experiment, is the selection of a representative dataset that provides precise information and a correct understanding of the most significant features and behavior of the phenomenon under experimentation (cf. Elsawah 2021a). Modeling the collected representative dataset, i.e., screening the relationship between input factors and their responses, is the second stage. It can be used to estimate unknown parameters and predict the behavior of the studied phenomenon, and thus guide the investigators to improve the inputs or experimental conditions for optimizing the corresponding outputs (cf. Elsawah 2021b). This logical idea is a classical methodology that is widely used in computer and physical experiments (cf. Elsawah 2023a, b). For example, it is used in industry for designing the process, reducing the process time, improving the quality of the products by reducing variability and increasing reliability, and reducing the overall costs (cf. Elsawah 2022a).

Efficient designing and modeling methods are able to capture maximum valuable (accurate) information about the behavior of a given experiment, and thus, an efficient model can be established based on the optimal representative dataset to screen the relationship between the inputs and their corresponding responses that can be used to estimate significant unknown parameters without bias and with minimum variance and forecast the future behavior of the studied phenomenon. Whereas non-efficient designing or/and modeling methods cannot produce useful and correct information nor provide accurate estimation or prediction (Elsawah 2022b). The practice demonstrated that effectively designing and modeling experiments are significant hard problems experimenters may face in many real-life applications. Despite the fact that many approaches have been offered, the challenge faced by the experimenters is still daunting.

The significant problem in improving the designing and modeling methods is that researchers are improving the methods of each stage independently. On the one hand, the idea of the design of experiment approach (Fisher 1935) and the corresponding approaches and developments are used to improve the first stage, designing the experiments, and many efficient methods are given to optimally select representative datasets. On the other hand, the power of the modeling approach and its corresponding methods such as machine learning (Samuel 1959) are used to improve the second stage, modeling the experiments. However, these two approaches are complementary rather than alternative, and their power can be merged so that they support each other. The combination of design of experiment and modeling has recently attracted the attention of researchers (cf. Lujan-Moreno et al. 2018; Salmaso et al. 2022).

Even though there is obvious link between the design of experiment and modeling, there are surprisingly few papers on addressing the potential usefulness of a combination of the two concepts. For instance, Staelin (2003) used the principles of design of experiment to identify optimal or nearly optimal initial parameter settings in an example of support vector machines; Packianather et al. (2000) applied the Taguchi design approach to optimize the design parameters in an example of neural networks; Sukthomya and Tannock (2005), Ortiz-Rodriguez et al. (2006), and Balestrassi et al. (2009) all reached the conclusion that the design of experiment approach allows for gaining a profound understanding of the effects of parameters on the network performance and hence enables better parameter adjustments. The existing work compares or combines the two concepts in specific areas of interest or for specific problem investigations (cf. for example Mohamed et al. 2023; Prasath et al. 2021, 2022), but a paper producing a generalizable assessment of how the two methodologies can be applied jointly to develop a new efficient designing-modeling approach has not been put forward so far and the work in this topic is limited. Readers who are interested in learning more new approaches for designing or modeling experiments may refer to Sikirica et al. (2023), Iordanis et al. (2022), Zhang et al. (2022) and Elsawah (2017a, 2017b).

Consider an explicit function for an experiment with p input factors $ X_1,X_2,\ldots ,X_p$ and only one output factor Y, where the experimenter wants to estimate the true model $Y=F(X_1,X_2,\ldots ,X_p)$ that gives the relationship between the p input factors and their corresponding responses. The classical modeling technique estimates the model $Y=F(X_1,X_2,\ldots ,X_p)$ in one step based on a selected representative dataset, an $n\times p$ data matrix obtained by selecting n different values from the range of each input factor ${X_i},~i=1,\ldots ,p$. However, the accuracy of the approximate model ${\widehat{Y}}={\widehat{F}}(X_1,X_2,\ldots ,X_p)$ is in many cases not good, especially when there is little or no prior information about the true model. Therefore, the logical idea is that the weight of importance of each input factor needs to be taken into consideration, and the sub-models between the most important input factors and their corresponding responses need a closer look. This paper presents a sequential designing-modeling technique (SeqST) that takes the weight of the importance of each input factor into consideration. The power of combining the sequential design of experiment approach with the sequential modeling approach is investigated. The input factors are added to the proposed technique and modeled sequentially, one input factor at each stage, according to their importance (i.e., expected influence on the output), while each remaining input factor is kept fixed at a given point (value) that has the highest influence based on prior knowledge or an initial experiment (cf. Sect. 3 for more details). Based on this simple introduction of the new proposed SeqST, the following logical questions may arise: How to rank the importance of the input factors in order? How to find the point of the highest influence for each factor? What is the effect of the total number of training points on the performance of the SeqST? What is the effect of the number of training points in each stage on the performance of the SeqST? What is the effect of the order of the importance of the input factors on the performance of the SeqST? What is the effect of the gap between the importance of the input factors on the performance of the SeqST? This paper tries to answer these questions in order to investigate the performance of the proposed SeqST for different scenarios, which gives benchmarks to guide experimenters in effectively designing and modeling their experiments. The power of the new proposed SeqST is measured by comparing its performance with that of the classical modeling technique, the single-stage technique (SinST).

The rest of this paper is organized as follows. Section 2 gives the new proposed SeqST. Measuring the importance of each factor and finding the point with the highest influence for each factor are discussed in Sect. 3. Section 4 gives an illustrative example based on the discussions in Sects. 2 and 3. The performance of the new proposed SeqST is compared with the performance of the SinST using linear and non-linear models in Sect. 5. Section 6 gives further investigations of the performance of the proposed SeqST using different scenarios for the number of training points and the order of the importance of the input factors. We close with the conclusion and future work in Sect. 7.

2 The new proposed sequential stages designing-modeling technique

Consider an experiment with p input factors $ X_1,X_2,\ldots ,X_p$ and only one output factor Y and the experimenter wants to find the meta-model ${\widehat{Y}}={\widehat{F}}(X_1,X_2,\ldots ,X_p)$ that gives the relationship between the p input factors and their corresponding responses. This paper presents a step-by-step technique for incorporating design of experiment approach into modeling approach and adapting it to address some drawbacks of the existing techniques. Due to the limitation of the space and for a clear explanation, the new proposed SeqST uses the regression model from the modeling approach, which is the most basic strategy in the modeling approach and its success is more conducive to the proliferation of other advanced models. However, many different models can be used to extend this study. The new proposed SeqST is given by the following steps:

$\underline{{\varvec{Preparation stage:}}}$ Rank the p inputs $X_1,X_2,\ldots ,X_p$ according to their importance, i.e., influence on the output. Let $X_{1:p}\ggg X_{2:p}\ggg \cdots \ggg X_{p:p}$ be the corresponding importance order of the p input factors, where $X_{1:p}$ is the input with the highest importance and $X_{p:p}$ is the input with the lowest importance. Determine the most important level (value) of each input factor, i.e., the value of each factor that has the highest influence. Let $x^*_{1:p},x^*_{2:p},\ldots ,$ and $x^*_{p:p}$ be the p highest-influence levels of the p input factors $X_{1:p},~X_{2:p},\ldots ,$ and $X_{p:p},$ respectively. It is worth mentioning that the importance (or influence) of the input factors and their levels with the highest influence can be given based on expert knowledge or on prior information obtained from an initial small experiment. If there is no prior information, Sect. 3 investigates a theoretical method to estimate the importance of each factor and the point with the highest influence for each factor.
$\underline{{\varvec{First designing-modeling stage:}}}$ Generate the first-stage dataset (design) that is an $n_1\times p$ data matrix ${\textbf{U}}_1=\left[ \textbf{D}_{1},X^*_{2:p},\ldots ,X^*_{p:p}\right] ,$ where $\textbf{D}_{1}=\left( x^{(1)}_{1:p},\ldots ,x^{(n_1)}_{1:p}\right) ^T$ is an optimal design from the experimental design viewpoint over the domain of the highest importance input factor $X_{1:p}$ and $X^*_{k:p}=(x^*_{k:p},\ldots ,x^*_{k:p})^T$ is a vector that all of its $n_1$ values are fixed to the highest importance level value $x^*_{k:p}$ of the kth input factor $X_{k:p}$ for $k=2,\ldots ,p.$ Calculate the first-stage observed output vector via a physical experiment or exact output vector via a computer experiment, say $Y_1=F({\textbf{U}}_{1}).$ Find the first-stage meta-model $\widehat{F_1}$ that is the approximate model for the relationship between the $\textbf{D}_{1}$ in the first-stage design ${\textbf{U}}_{1}$ and the corresponding first-stage observed output factor $Y_1=F({\textbf{U}}_{1}).$
$\underline{{\varvec{Second designing-modeling stage:}}}$ Generate the second-stage design that is an $n_2\times p$ data matrix ${\textbf{U}}_2=\left[ \textbf{D}_{2},X^*_{3:p},\ldots ,X^*_{p:p}\right] ,$ where $\textbf{D}_{2}=\left( \begin{array}{cccc} x^{(1)}_{1:p}&{}\ldots &{}x^{(n_2)}_{1:p} \\ x^{(1)}_{2:p}&{}\ldots &{}x^{(n_2)}_{2:p} \\ \end{array}\right) ^T$ is an optimal design from the experimental design viewpoint over the domain of the first two highest importance input factors $X_{1:p}$ and $X_{2:p},$ and $X^*_{k:p}=(x^*_{k:p},\ldots ,x^*_{k:p})^T$ is a vector that all of its $n_2$ values are fixed to the highest importance level value $x^*_{k:p}$ of the kth input factor $X_{k:p}$ for $k=3,\ldots ,p.$ Calculate the second-stage observed output vector via a physical experiment or exact output factor via a computer experiment, say $Y_2=F({\textbf{U}}_{2}).$ Find the second-stage meta-model $\widehat{F_2}$ that is the approximate model for the relationship between the $\textbf{D}_{2}$ in the second-stage design ${\textbf{U}}_{2}$ and the corresponding second-stage observed output factor $Y_2=F({\textbf{U}}_{2}).$
${\underline{{\varvec{Third designing-modeling stage:}}}}$ Generate the third-stage design that is an $n_3\times p$ data matrix ${\textbf{U}}_3=\left[ \textbf{D}_{3},X^*_{4:p},\ldots ,X^*_{p:p}\right] ,$ where $\textbf{D}_{3}=\left( \begin{array}{cccc} x^{(1)}_{1:p}&{}\ldots &{}x^{(n_3)}_{1:p} \\ x^{(1)}_{2:p}&{}\ldots &{}x^{(n_3)}_{2:p} \\ x^{(1)}_{3:p}&{}\ldots &{}x^{(n_3)}_{3:p} \\ \end{array}\right) ^T$ is an optimal design from the experimental design viewpoint over the domain of the first three highest influence inputs $X_{1:p},X_{2:p}$ and $X_{3:p},$ and $X^*_{k:p}=(x^*_{k:p},\ldots ,x^*_{k:p})^T$ is a vector that all of its $n_3$ values are fixed to the highest influence level value $x^*_{k:p}$ of the kth input for $k=4,\ldots ,p.$ Calculate the third-stage observed output vector via a physical experiment or exact output vector via a computer experiment, say $Y_3=F({\textbf{U}}_{3}).$ Find the third-stage meta-model $\widehat{F_3}$ that is the approximate model for the relationship between the $\textbf{D}_{3}$ in the third-stage design ${\textbf{U}}_{3}$ and the corresponding third-stage observed output vector $Y_3=F({\textbf{U}}_{3}).$
$\underline{{\varvec{P-th designing-modeling stage}}}$ Repeat the above systematic strategy up to the last stage as follows. Generate the pth-stage design that is an $n_p\times p$ data matrix ${\textbf{U}}_p=\left[ \textbf{D}_{p}\right] ,$ where $\textbf{D}_{p}=\left( \begin{array}{cccc} x^{(1)}_{1:p}&{}\ldots &{}x^{(n_p)}_{1:p} \\ \vdots &{}\vdots &{}\vdots \\ x^{(1)}_{p:p}&{}\ldots &{}x^{(n_p)}_{p:p} \\ \end{array}\right) ^T$ is an optimal design from the experimental design viewpoint over the domain of all the p inputs $X_{1:p},X_{2:p},\ldots ,X_{p:p}.$ Calculate the pth-stage observed output vector via a physical experiment or exact output vector via a computer experiment, say $Y_p=F({\textbf{U}}_{p}).$ Find the pth-stage meta-model $\widehat{F_p}$ that is the approximate model for the relationship between the $\textbf{D}_{p}$ in the pth-stage design ${\textbf{U}}_{p}$ and the corresponding pth-stage observed output vector $Y_p=F({\textbf{U}}_{p}).$
$\underline{{\varvec{Final Meta-Model:}}}$ To define the overall meta-model, we use the idea of the weighted average for the coefficients of the factors in the meta-models $\widehat{F_1},\ldots ,\widehat{F_p}.$ For instance as given in Fig. 1, for an experiment with three factors without interactions and the three meta-models are polynomial models as follows: $\widehat{F_1}=\beta _1+a_{11}X_1+a_{12}X^2_1,$ $\widehat{F_2}=\beta _2+a_{21}X_1+a_{22}X^2_1+b_{21}X_2+b_{22}X^2_2$ and $\widehat{F_3}=\beta _3+a_{31}X_1+a_{32}X^2_1+b_{31}X_2+b_{32}X^2_2+c_{31}X_3+c_{32}X^2_3.$ Therefore, the overall meta-model is the weighted average that is given as follows:
$$\begin{aligned} {\widehat{F}}= & {} \frac{1}{3} \sum _{k= 1}^{3} {\beta }_k+\left( \sum _{k = 1}^{3}\frac{a_{k1}}{3}\right) {X_1}+\left( \sum _{k = 1}^{3}\frac{a_{k2}}{3}\right) {X^2_1} +\left( \sum _{k = 2}^{3}\frac{b_{k1}}{2}\right) {X_2} \\{} & {} +\left( \sum _{k = 2}^{3}\frac{b_{k2}}{2}\right) {X^2_2} +c_{31}{X_3}+c_{32}{X^2_3}. \end{aligned}$$
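
For a general number of factors p, one natural reading of this averaging rule is that each coefficient is averaged over the stages in which its factor takes part. This is an assumption extrapolated from the three-factor case, and the superscripted symbols $a^{(j)}_{k1}$ and $a^{(j)}_{k2}$ (the linear and quadratic coefficients of $X_{j:p}$ in the kth-stage meta-model) are introduced here only for illustration. Writing the kth-stage meta-model without interactions as

$$\begin{aligned} \widehat{F_k}=\beta _k+\sum _{j=1}^{k}\left( a^{(j)}_{k1}X_{j:p}+a^{(j)}_{k2}X^2_{j:p}\right) ,\quad k=1,\ldots ,p, \end{aligned}$$

the overall meta-model would then read

$$\begin{aligned} {\widehat{F}}=\frac{1}{p}\sum _{k=1}^{p}\beta _k+\sum _{j=1}^{p}\frac{1}{p-j+1}\sum _{k=j}^{p}\left( a^{(j)}_{k1}X_{j:p}+a^{(j)}_{k2}X^2_{j:p}\right) . \end{aligned}$$

For $p=3$ the weights are $1/3,$ $1/2$, and 1 for the first, second, and third factors, respectively, which matches the three-factor formula.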

The following logical question now comes to mind: How to select the optimal design (dataset) from the experimental design viewpoint for each stage over the domain of the input factors in that stage? An efficient way of selecting optimal representative training datasets for the new proposed SeqST is to make use of the techniques of the experimental design approach. The optimal selection of experimental points (a design) from an experimental region that provides valuable information about a given experiment is the hardest problem investigators may face, especially when there is no prior information about the model structure between the inputs and the corresponding outputs. An intuitive idea to overcome this problem is to scatter the representative training points in an intelligent manner so that they cover the experimental region well, which is called a space-filling design (cf. Elsawah 2022c). Among the strategies coined for computer experiments, Latin hypercube designs (LHDs) (McKay et al. 1979; Iman and Conover 1980) have become very popular. Other strategies include orthogonal arrays (Owen 1992) and Hammersley designs (Diwekar and Kalagnanam 1997; Hammersley 1960). To illustrate their popularity, Fig. 1a in Viana (2013) (cf. Fig. 2) shows an approximate number of publications that referred to at least one of these three techniques. An LHD spreads its representative training points everywhere in the region with as few gaps or holes as possible (cf. Fig. 3), and thus, it gives a good representation of the experimental region even with few points. LHDs play an important role in computer simulation (cf. Husslage et al. 2011; Fang et al. 2006; Elsawah and Gong 2023). Therefore, LHDs are used in this study. It is pertinent to point out that the new proposed SeqST can also be carried out using uniform designs, which are a class of optimal space-filling designs that are currently extensively used in a variety of practical applications (cf. Elsawah and Vishwakarma 2022).
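
The stages of this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the factors are assumed to be already sorted by decreasing importance, each stage fits a quadratic polynomial without interactions by ordinary least squares, the LHDs are drawn with SciPy's `scipy.stats.qmc.LatinHypercube`, and the names `quad_features`, `seqst_fit`, and `toy` are hypothetical. The noise-free function `toy` is an invented test case and does not come from the paper.

```python
import numpy as np
from scipy.stats import qmc


def quad_features(D):
    """Columns [1, X_1, X_1^2, ..., X_k, X_k^2] for a design D with k columns."""
    cols = [np.ones(len(D))]
    for j in range(D.shape[1]):
        cols += [D[:, j], D[:, j] ** 2]
    return np.column_stack(cols)


def seqst_fit(F, lower, upper, x_star, stage_sizes, seed=0):
    """Sequential-stage fit; factors are assumed sorted by decreasing importance.

    Returns the averaged coefficients [beta, a_1, a_2, b_1, b_2, ...], where
    each coefficient is averaged over the stages in which it was estimated.
    """
    p = len(lower)
    coef_sum = np.zeros(2 * p + 1)
    coef_cnt = np.zeros(2 * p + 1)
    for k, n_k in enumerate(stage_sizes, start=1):
        # LHD over the k most important factors, rescaled to their ranges.
        unit = qmc.LatinHypercube(d=k, seed=seed + k).random(n_k)
        D_k = qmc.scale(unit, lower[:k], upper[:k])
        # Remaining factors are held fixed at their highest-influence points.
        U_k = np.tile(np.asarray(x_star, dtype=float), (n_k, 1))
        U_k[:, :k] = D_k
        y_k = np.apply_along_axis(F, 1, U_k)
        beta_k, *_ = np.linalg.lstsq(quad_features(D_k), y_k, rcond=None)
        coef_sum[: 2 * k + 1] += beta_k
        coef_cnt[: 2 * k + 1] += 1
    return coef_sum / coef_cnt


def toy(x):
    """Hypothetical additive test function (not from the paper)."""
    x1, x2, x3 = x
    return 1.0 + 3.0 * x1 + 2.0 * x1**2 + x2 - 0.5 * x2**2 + 0.1 * x3


beta = seqst_fit(toy, [0, 0, 0], [1, 1, 1], [0.5, 0.5, 0.5], [11, 16, 20])
print(np.round(beta, 4))
```

Because `toy` is additive, quadratic, and noise-free, every stage recovers the slope and curvature coefficients exactly, so their averages equal the true values. The averaged intercept also carries the contributions of the factors that were held fixed in the earlier stages.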

3 On the importance of the input factors and their points

The following logical question comes to mind after reading the preparation stage of the new proposed SeqST: If there is no prior information, how can the order of the importance of the factors and the point with the highest influence for each factor be determined? This section tries to answer this significant question for computer experiments. Consider a computer experiment with p independent input factors $X_{i}\in [LB_i,UB_i],~1\le i\le p$, and let $x^*_{k}$ be the point with the highest influence for the kth factor $X_{k},~1\le k\le p.$ From a physics point of view, the points $x^*_{k}$ with the highest influence can be defined as the Mass Centers (MCs). The MC is a point that causes a rigid body to maintain its equilibrium state. Within a solid Q with volume V, if the mass distribution is continuous with density $\rho $, the center of mass R is the point about which the integral of the density-weighted position coordinates vanishes:

$$\begin{aligned} \iiint _Q\rho (r)(r-R) \,\textrm{d}V= 0, \end{aligned}\tag{1}$$

where r is the vector representing the position of a point with respect to a fixed origin and the solution of coordinate R is given as follows:

$$\begin{aligned} R = \frac{1}{M}\iiint _Q\rho (r)r\,\textrm{d}V, \end{aligned}\tag{2}$$

where M is the total mass of the solid. For further details, the reader may refer to Mark (2009). If the body is formed by a function from mathematics viewpoint, its volume has a uniform density distributed state with a constant $\rho (r).$ Therefore, for a function with p factors $F(X_1,X_2,\ldots ,X_p)$, (2) can be rewritten as follows:

$$\begin{aligned} R = \frac{1}{M}\mathop {\int \cdots \int \cdots \int }\limits _{\text {p integrals}} F(X_1,X_2,\ldots ,X_p)\,\textrm{d}V. \end{aligned}\tag{3}$$

The point $x^*_{k}$ with the highest influence for the factor $X_k$ is defined as the point that divides the function into two parts with the same mass, $M_L=M_R$ (cf. Fig. 4). Therefore, from (3), we get

$$\begin{aligned} \frac{1}{R}\mathop {\int \cdots \int \cdots \int }\limits _{\text {p integrals}} F(X_1,X_2,\ldots ,X_p)\,\textrm{d}V_L = \frac{1}{R}\mathop {\int \cdots \int \cdots \int }\limits _{\text {p integrals}} F(X_1,X_2,\ldots ,X_p)\,\textrm{d}V_R. \end{aligned}\tag{4}$$

From (4), the point $x^*_{k}$ with the highest influence for the factor $X_k$ is the solution of the following equation:

$$\begin{aligned} \begin{aligned}&\int _{LB_p}^{UB_p}\ldots \int _{LB_k}^{x_k^*} \ldots \int _{LB_1}^{UB_1} F(X_1,\ldots ,X_k,\ldots ,X_p) \ \,\textrm{d}X_1\ldots \,\textrm{d}X_k\ldots \,\textrm{d}X_p \\&\quad = \int _{LB_p}^{UB_p}\ldots \int _{x_k^*}^{UB_k} \ldots \int _{LB_1}^{UB_1} F(X_1,\ldots ,X_k,\ldots ,X_p) \ \,\textrm{d}X_1\ldots \,\textrm{d}X_k\ldots \,\textrm{d}X_p. \end{aligned} \end{aligned}\tag{5}$$
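
Equation (5) can be read as a one-dimensional condition. As a short worked step (the symbol $G_k$ is introduced here only for illustration), integrate an integrable F over all the factors except $X_k$:

$$\begin{aligned} G_k(t)=\int _{LB_p}^{UB_p}\ldots \int _{LB_1}^{UB_1} F(X_1,\ldots ,X_{k-1},t,X_{k+1},\ldots ,X_p) \prod _{i\ne k}\,\textrm{d}X_i. \end{aligned}$$

Since all the integration limits are constants, the order of integration can be exchanged, and (5) becomes

$$\begin{aligned} \int _{LB_k}^{x_k^*}G_k(t)\,\textrm{d}t=\int _{x_k^*}^{UB_k}G_k(t)\,\textrm{d}t=\frac{1}{2}\int _{LB_k}^{UB_k}G_k(t)\,\textrm{d}t. \end{aligned}$$

That is, $x^*_k$ splits the total integral of F into two equal halves along the kth axis. If $G_k$ does not change sign and is non-zero on $[LB_k,UB_k]$, the left-hand side is strictly monotone in $x^*_k$ and the solution is unique.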

Using the calculated points $x^*_{k},~1\le k\le p$ with the highest impacts, the importance of the factor $X_k$ can be measured by its corresponding area as follows:

$$\begin{aligned} A(X_k) = \left| \int _{LB_k}^{UB_k} F(x_1^*,x_2^*,\ldots ,X_k,\ldots ,x_p^*) \,\textrm{d}X_k\right| . \end{aligned}\tag{6}$$

The p areas $A(X_k),~1\le k\le p$ need to be calculated and sorted in decreasing order as follows:

$$\begin{aligned} A(X_{1:p})> A(X_{2:p})> \cdots > A(X_{p:p}), \end{aligned}$$

where $X_{1:p}$ is the input with the highest importance and $X_{p:p}$ is the input with the lowest importance.
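
The two quantities of this section can be computed numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it solves (5) by tensor Gauss-Legendre quadrature and a bracketing root finder, and it applies the result to the test function of Sect. 4. One assumption needs to be flagged loudly: the area is computed on the part of F that varies with $X_k$, i.e., on $F(\ldots ,X_k,\ldots )-F(\ldots ,LB_k,\ldots )$ with the other factors at their points $x^*$, because this is the reading that reproduces the areas printed in Sect. 4. The names `box_integral`, `balance_point`, and `importance_area` are hypothetical.

```python
from itertools import product

import numpy as np
from scipy.optimize import brentq


def box_integral(F, lower, upper, order=8):
    """Tensor Gauss-Legendre quadrature of F over the box [lower, upper]."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    half, mid = (upper - lower) / 2.0, (upper + lower) / 2.0
    total = 0.0
    for idx in product(range(order), repeat=len(lower)):
        idx = list(idx)
        total += np.prod(weights[idx]) * F(mid + half * nodes[idx])
    return total * np.prod(half)


def balance_point(F, lower, upper, k):
    """Solve Eq. (5): the point on axis k that halves the integral of F."""
    total = box_integral(F, lower, upper)

    def imbalance(x):
        cut = np.array(upper, dtype=float)
        cut[k] = x  # integrate axis k only from LB_k up to x
        return box_integral(F, lower, cut) - total / 2.0

    return brentq(imbalance, lower[k], upper[k])


def importance_area(F, lower, upper, k, x_star, order=8):
    """Area of the X_k-dependent part of F (assumed reading of Eq. (6))."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    half = (upper[k] - lower[k]) / 2.0
    mid = (upper[k] + lower[k]) / 2.0
    base = np.array(x_star, dtype=float)
    base[k] = lower[k]
    f_base = F(base)  # offset contributed by the constant and the other factors
    total = 0.0
    for t, w in zip(nodes, weights):
        x = base.copy()
        x[k] = mid + half * t
        total += w * (F(x) - f_base)
    return abs(total * half)


def F_example(x):
    """Test function of Sect. 4."""
    x1, x2, x3 = x
    return 200 + 5 * x1**2 + 100 * x1 + x2**2 / 25 + 50 * x2 + x3**2 / 175 + x3


lb, ub = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
x_star = [balance_point(F_example, lb, ub, k) for k in range(3)]
areas = [importance_area(F_example, lb, ub, k, x_star) for k in range(3)]
print(np.round(x_star, 4), np.round(areas, 4))
```

An eight-point Gauss-Legendre rule is exact for this polynomial test function; for a general F the quadrature order would need to be checked.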

4 An illustrative example

The above-mentioned steps and discussions in Sects. 2 and 3 are used and explained using LHDs, polynomial regression models, and the following computer experiment:

$$\begin{aligned} Y=F(X_1,X_2,X_3)=200+5X_1^{2}+100X_1+\frac{1}{25} X_2^{2}+50X_2+\frac{1}{175} X_3^{2}+X_3,~0\le X_i\le 1,~1\le i\le 3. \end{aligned}$$

Based on (5), the points with the highest impacts for the factors $X_1,$ $X_2$, and $X_3$ are calculated as follows $x^*_{1}=0.5470,$ $x^*_{2}=0.5225$, and $x^*_{3}=0.5005,$ respectively. Based on (6), the corresponding areas for the factors $X_1,$ $X_2$, and $X_3$ are calculated as follows $A(X_1)=51.67,$ $A(X_2) =25.01$, and $A(X_3)=0.5019,$ respectively. Therefore, the order of the importance of the input factors is given as follows $X_{1}\ggg X_{2}\ggg X_{3}.$ Table 1 gives LHDs with 11, 16, and 20 points for the first, second, and third stages, respectively, and their corresponding outputs. From Table 1 and the proposed SeqST in Sect. 2, we get

The first meta-model $\widehat{F_1}$ gives the following relationship between the LHD $\textbf{D}_{1}=[X_{1}]$ and the corresponding output ${Y}_1=F({\textbf{U}}_{1})$:
$$\begin{aligned} \widehat{F_1}=278.3730+30.4786X_1. \end{aligned}$$
The second meta-model $\widehat{F_2}$ gives the following relationship between the LHD $\textbf{D}_{2}=[X_{1}~X_{2}]$ and the corresponding output ${Y}_2=F({\textbf{U}}_{2})$:
$$\begin{aligned} \widehat{F_2}=278.1910+31.5193X_1+1.6511X^2_1+15.1802X_2+0.0123X^2_2. \end{aligned}$$
The third meta-model $\widehat{F_3}$ gives the following relationship between the LHD $\textbf{D}_{3}=[X_{1}~X_{2}~X_{3}]$ and the corresponding output ${Y}_3=F({\textbf{U}}_{3})$:
$$\begin{aligned} \widehat{F_3}= & {} 275.0492+29.9669X_1+1.5248X^2_1+15.8355X_2\\{} & {} +0.0124X^2_2+0.2970X_3+0.0018X^2_3. \end{aligned}$$

Therefore, the overall meta-model is the weighted average that is given as follows:

$$\begin{aligned} {\widehat{F}}_{SeqST}= & {} 277.2204+30.6549X_1+15.5078X_2+0.2970X_3+1.5880X_1^2\\{} & {} +0.0123X_2^2+0.0018X_3^2. \end{aligned}$$
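
The averaging step can be checked with a few lines of Python. The sketch below averages each coefficient over the stage meta-models in which that term appears, using the stage coefficients printed above as input; the helper name `average_coefficients` and the dictionary layout are hypothetical choices made for illustration.

```python
from collections import defaultdict


def average_coefficients(stage_models):
    """Average every coefficient over the stages in which its term appears."""
    sums, counts = defaultdict(float), defaultdict(int)
    for model in stage_models:
        for term, value in model.items():
            sums[term] += value
            counts[term] += 1
    return {term: sums[term] / counts[term] for term in sums}


# Stage meta-models of the illustrative example, keyed by term ("1" = intercept).
stage_models = [
    {"1": 278.3730, "X1": 30.4786},
    {"1": 278.1910, "X1": 31.5193, "X1^2": 1.6511,
     "X2": 15.1802, "X2^2": 0.0123},
    {"1": 275.0492, "X1": 29.9669, "X1^2": 1.5248,
     "X2": 15.8355, "X2^2": 0.0124, "X3": 0.2970, "X3^2": 0.0018},
]

overall = average_coefficients(stage_models)
for term, value in overall.items():
    print(f"{term:5s} {value:.4f}")
```

With these inputs, the coefficients of $X_1,$ $X_2,$ $X_3$ and of their squares agree with the overall meta-model to the printed precision. The plain mean of the three printed intercepts evaluates to 277.2044, which differs slightly from the printed intercept.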

To test the performance of this meta-model, the SinST is used to find another meta-model using an LHD ${\textbf{U}}$ with the same number of points in the three stages of the SeqST, i.e., $n=n_1+n_2+n_3=11+16+20=47$. Table 2 gives an LHD ${\textbf{U}}=[X_{1}~X_{2}~X_{3}]$ and the corresponding output ${Y}=F({\textbf{U}})$. From Table 2, the meta-model that describes the relationship between ${\textbf{U}}$ and ${Y}=F({\textbf{U}})$ is given as follows:

$$\begin{aligned} {\widehat{F}}_{SinST}= & {} 277.2204+28.9878X_1+14.4962X_2+0.2900X_3\\{} & {} +1.4974X_1^2+0.0120X_2^2+0.0017X_3^2. \end{aligned}$$

Figure 5 gives all the 47 values of ${F}({\textbf{U}}_{\textrm{test}}),$ ${\widehat{F}}_{SeqST}({\textbf{U}}_{\textrm{test}})$ and ${\widehat{F}}_{SinST}({\textbf{U}}_{\textrm{test}})$ and the absolute differences between each pair of them, using an LHD ${\textbf{U}}_{\textrm{test}}$ with 47 points as a testing dataset. The results show that the values of ${\widehat{F}}_{SeqST}({\textbf{U}}_{\textrm{test}})$ are closer to ${F}({\textbf{U}}_{\textrm{test}})$ than the values of ${\widehat{F}}_{SinST}({\textbf{U}}_{\textrm{test}}).$ Moreover, the mean squared errors (MSEs), $MSE =\frac{1}{n}\sum _{i = 1}^{n} (F_i - \widehat{F_i})^2,$ of these two meta-models are given as follows:

$$\begin{aligned} MSE_{SeqST}=1.0875\times 10^3<MSE_{SinST}=1.6838\times 10^3. \end{aligned}$$

Therefore, the SeqST is much better than the SinST.

Table 1 The three-stage designs and their corresponding observed outputs for the SeqST for the illustrative example


Table 2 The single-stage design and its corresponding observed outputs for the SinST for the illustrative example


5 The performance assessment of the new proposed SeqST

To evaluate the performance of our proposed methodology, we consider the following four examples: two linear models and two non-linear models. The first linear model is the so-called pullulan production model. Although pullulan has been produced commercially since 1978, the production mechanism at the genetic level is still far from being fully understood. As a result, only empirical models can be built to optimize pullulan production. One of these models was derived by Goksungur et al. (2005) as follows:

$$\begin{aligned} Y_1= & {} -29.851+1.189X_1+0.057X_2+5.086X_3-0.011X^2_1 -0.0000607X^2_2-1.3633X^2_3 \\{} & {} -0.000296X_1X_2+0.0263X_1X_3. \end{aligned}$$

This model predicts the final concentration of pullulan (g/L) as a function of the initial substrate concentration ($X_1$), the speed of agitation ($X_2$), and the airflow rate ($X_3$). The ranges of variation of the independent variables are $X_1\in [30\,\,70]$ g/L, $X_2\in [200\,\,600]$ rpm, and $X_3\in [1\,\,3]$ vvm. The range of variation of the dependent variable $Y_1$ is $[4.96\,\,17]$. The second linear model is the so-called Goldprice model, which has been studied by Andre et al. (2000) and Ranjan et al. (2008). The Goldprice function is given by

$$\begin{aligned} Y_2= & {} \left[ 1+(X_1+X_2+1)^2\left( 19-14X_1+3X^2_1-14X_2+6X_1X_2+3X^2_2\right) \right] \\{} & {} \times \left[ 30+(2X_1-2X_2)^2\left( 18-32X_1+12X^2_1+48X_2-36X_1X_2+27X^2_2\right) \right] , \end{aligned}$$

where the two input factors $X_1$ and $X_2$ are defined on the domain $[-2\,\,2]\times [-2\,\, 2].$

The first non-linear model is an equation selected for its very different topology and non-linearity compared to the first two models. It is given as follows:

$$\begin{aligned} Y_3=\frac{\ln (X_1)(\sin X_2+4)}{\exp (X_3)}+\ln (X_1){\exp (X_3)}, \end{aligned}$$

where the ranges of variation of the independent variables are $X_1\in [ 0.1\,\, 10],~X_2\in [-\pi /2\,\,\pi /2],$ and $X_3\in [0\,\,1],$ leading to a variation of the dependent variable $Y_3$ in the range of $[-13.82\,\,13.82].$ The second non-linear model is given as follows:

$$\begin{aligned} Y_4=\exp (X_1)+\sin (X_2)+X_3^7, \end{aligned}$$

where the range of variation of the independent variables is $[0\,\,1].$
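
As a quick sanity check on the two non-linear models, the sketch below codes them with NumPy and evaluates $Y_3$ on a grid over its stated domain to confirm the quoted range. The function names `y3` and `y4` and the grid resolution are arbitrary choices made for illustration.

```python
import numpy as np


def y3(x1, x2, x3):
    """First non-linear model Y_3."""
    return np.log(x1) * (np.sin(x2) + 4.0) / np.exp(x3) + np.log(x1) * np.exp(x3)


def y4(x1, x2, x3):
    """Second non-linear model Y_4."""
    return np.exp(x1) + np.sin(x2) + x3**7


# Grid over the stated domain of Y_3; the grid contains the domain corners.
g1, g2, g3 = np.meshgrid(
    np.linspace(0.1, 10.0, 100),
    np.linspace(-np.pi / 2, np.pi / 2, 51),
    np.linspace(0.0, 1.0, 51),
    indexing="ij",
)
values = y3(g1, g2, g3)
y3_min, y3_max = values.min(), values.max()
print(round(float(y3_min), 2), round(float(y3_max), 2))
```

Both extremes are attained at $X_2=\pi /2$ and $X_3=0$, where $Y_3=6\ln (X_1)$, so the range is $\pm 6\ln 10\approx \pm 13.82$.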

A comparison study between the mean squared error (MSE), $MSE =\frac{1}{n}\sum _{i = 1}^{n} (F_i - \widehat{F_i})^2,$ and mean absolute error (MAE), $MAE =\frac{1}{n}\sum _{i = 1}^{n} |F_i - \widehat{F_i}|,$ of the meta-models using the new proposed SeqST and the classical SinST is given based on the above-mentioned four models using the LHDs as training and testing datasets, the polynomial models as the fitting models, and the medians of the ranges of the input factors as the points with the highest impacts. To have a fair comparison study between the SeqST and SinST, the number of representative training points for SinST is selected to be equal to the total number of representative training points in all the p stages of the SeqST, i.e., $n=n_1+n_2+\cdots +n_p.$ Since the representative training datasets and the representative testing datasets (LHDs) are not deterministic for a given n, the minimum, mean, median, and $95\%$ confidence interval ($95\%$CI) of the MSEs and MAEs of the approximate meta-models of the above-mentioned four models using the SeqST and SinST based on about 5000 different randomly generated representative training datasets and representative testing datasets are given in Table 3 to investigate the behavior of the SeqST for any randomly generated representative datasets. From Table 3, we get the following:

The newly proposed SeqST is better than the classical SinST for all four models: the MSE and MAE values via the SeqST are smaller than those via the SinST. This holds across the about 5000 different training and testing datasets, where the minimum, mean, and median of the MSE and MAE values via the SeqST are less than those via the SinST in all cases.
The gaps among the impacts of the input factors are largest for $Y_3$, followed by those for $Y_4$, then $Y_1$, and then $Y_2$. The performance of the SeqST follows the same order: its performance for $Y_3$ $\succeq $ (i.e., is better than) its performance for $Y_4$ $\succeq $ its performance for $Y_1\succeq $ its performance for $Y_2$, where the percentage differences between the minimum, mean, and median of the MSEs (and MAEs) for the SeqST and SinST are largest for $Y_3$, followed by $Y_4$, $Y_1$, and $Y_2.$ That is, when there are significant gaps among the impacts of the input factors, the accuracy of the SeqST increases.
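The ingredients of this comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`mse`, `mae`, `latin_hypercube`), the stage sizes, and the constant placeholder predictor that stands in for a fitted meta-model are all hypothetical. It shows the two error measures, a basic random Latin hypercube sample on $[0\,\,1]^3$, and the matched budget $n=n_1+n_2+n_3$ for the model $Y_4$.

```python
import math
import random


def mse(actual, predicted):
    """Mean squared error: (1/n) * sum of (F_i - F_hat_i)^2."""
    return sum((f - g) ** 2 for f, g in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error: (1/n) * sum of |F_i - F_hat_i|."""
    return sum(abs(f - g) for f, g in zip(actual, predicted)) / len(actual)


def latin_hypercube(n, d, rng):
    """Return n points in [0, 1)^d with exactly one point per stratum
    [k/n, (k+1)/n) in every dimension (a basic random LHD)."""
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]


def y4(x1, x2, x3):
    """The model Y_4 = exp(X_1) + sin(X_2) + X_3^7 on [0, 1]^3."""
    return math.exp(x1) + math.sin(x2) + x3 ** 7


rng = random.Random(0)
stage_sizes = (50, 40, 30)      # hypothetical n_1, n_2, n_3 for the SeqST
n_single = sum(stage_sizes)     # matched SinST budget n = n_1 + n_2 + n_3

train = latin_hypercube(n_single, 3, rng)
test = latin_hypercube(n_single, 3, rng)

# Placeholder predictor (assumption): the mean training response is used
# only to exercise the error measures; it is not a polynomial meta-model.
baseline = sum(y4(*x) for x in train) / len(train)
actual = [y4(*x) for x in test]
predicted = [baseline] * len(test)
print(n_single, mse(actual, predicted), mae(actual, predicted))
```

Repeating the last block over many seeds and summarizing the resulting MSE and MAE values by their minimum, mean, and median would mimic the layout of Table 3, once the placeholder predictor is replaced by the fitted meta-models.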

Table 3 The simulation results for the performance of the SeqST and SinST using the above-mentioned models $Y_1,Y_2,Y_3$ and $Y_4$ via 5000 repetitions


6 Further interesting investigation for the performance of the SeqST

The above-mentioned results raise the following logical questions: What is the effect of the order of the importance of the input factors on the accuracy of the newly proposed SeqST? What is the effect of the gaps among the importance of the input factors on its accuracy? What is the effect of the total number of points on its accuracy? What is the effect of the number of points in each stage on its accuracy? The answers to these questions provide benchmarks for the optimal use of the newly proposed SeqST. This section tries to answer these questions and other interesting questions based on computer experiments.

Consider the following non-linear model:

$$\begin{aligned} Y_5= -e^{-\left( X_{1}+0.5\right) ^{2}}-2 e^{-\left( {X}_{2}-0.5\right) ^{2}}-4 e^{-\left( {X}_{3}+3\right) ^{2}},~0\le X_i\le 1,~1\le i\le 3. \end{aligned}$$

Figure 6 investigates the importance of the three input factors for the model $Y_5$. From Fig. 6 and based on the area under each curve, the order of the importance of the input factors of $Y_5$ is $X_2 \ggg X_1 \ggg X_3.$ To check the effect of the number of points in each stage and of the order of the importance of the input factors on the accuracy of the SeqST, different numbers of points are used in each stage as follows: $10\le n_i\le 100,~1\le i\le 3$ and $n_1+n_2+n_3=120$. Figures 7, 8, and 9 give the MSE values of the SeqST for the model $Y_5$ using different numbers of points in each stage based on the following three different orders of importance: $X_2 \ggg X_1 \ggg X_3$ (correct order), $X_1 \ggg X_2\ggg X_3$ (wrong order), and $X_3 \ggg X_2 \ggg X_1$ (wrong order), respectively. Figure 10 compares the SeqST and SinST for different numbers of points in each of the three stages for $Y_5,$ where the number of points n in the SinST is equal to the total number of points in the three stages of the SeqST, i.e., $n=n_1+n_2+n_3.$ From Figs. 7, 8, 9 and 10, we conclude that:

The MSE values using the correct order of the importance are less than the MSE values using a wrong order of the importance for any number of points in each stage, where the ranges of the MSE values are about $(0.06\,\,0.073),~(0.24\,\,0.65)$, and $(0.43\,\,0.63)$ for $X_2 \ggg X_1 \ggg X_3$ (correct order), $X_1 \ggg X_2\ggg X_3$ (wrong order), and $X_3 \ggg X_2 \ggg X_1$ (wrong order), respectively. Therefore, it is recommended to carefully check the order of the importance before using the newly proposed SeqST.
The newly proposed SeqST is better than the classical SinST for any number of points, where the MSE values for the SeqST are less than the MSE values for the SinST. Moreover, the advantage of the SeqST over the SinST is greatest for a small number of points (cf. Fig. 10). Therefore, it is recommended to use the newly proposed SeqST for a small number of points (experiments with a few trials).
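The importance order read from Fig. 6 can be approximated numerically. The sketch below is only an illustration under an assumption: it takes the area under the absolute value of each additive component of $Y_5$ over $[0\,\,1]$ as a proxy for the importance of the corresponding factor. The paper's own importance measure is the one defined in its Sect. 3 and plotted in Fig. 6, and the helper `area_abs` is hypothetical.

```python
import math


def area_abs(term, lo=0.0, hi=1.0, steps=10_000):
    """Midpoint-rule approximation of the integral of |term(x)| on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(abs(term(lo + (k + 0.5) * h)) for k in range(steps))


# Additive components of Y_5, one per input factor.
y5_terms = {
    "X1": lambda x: -math.exp(-(x + 0.5) ** 2),
    "X2": lambda x: -2.0 * math.exp(-(x - 0.5) ** 2),
    "X3": lambda x: -4.0 * math.exp(-(x + 3.0) ** 2),
}

# Assumed importance proxy: area under |component| over the factor's range.
areas = {name: area_abs(term) for name, term in y5_terms.items()}
ranking = sorted(areas, key=areas.get, reverse=True)
print(ranking)
```

Under this proxy the ranking agrees with the order $X_2 \ggg X_1 \ggg X_3$ stated above; the component of $X_3$ is nearly zero on $[0\,\,1]$ because $e^{-(X_3+3)^2}\le e^{-9}$ there.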

Moreover, from the discussions about the models $Y_1,~Y_2,~Y_3$, and $Y_4$ in Sect. 5, it is observed that when there are significant gaps among the impacts of the input factors, the accuracy of the newly proposed SeqST increases. The following discussion investigates this observation further using two different types of gaps among the impacts of the input factors. The first type is the power gap, which is investigated using the following model:

$$\begin{aligned} Y_6=X_1^{\alpha _{1}}+X_2^{\alpha _{2}}+X_3^{\alpha _{3}},~0\le X_i\le 1,~1\le \alpha _i\le 8,~1\le i\le 3,~3\le \alpha _1+\alpha _2+\alpha _3\le 10. \end{aligned}$$

The second type is the coefficient gap, which is investigated using the following model:

$$\begin{aligned} Y_7=\beta _{1} X_1+\beta _{2}X_2+\beta _{3} X_3,~0\le X_i\le 1,~1\le \beta _i\le 18,~1\le i\le 3,~3\le \beta _1+\beta _2+\beta _3\le 20. \end{aligned}$$

Figures 11 and 12 give the differences between the medians of the MSE values using the newly proposed SeqST and the medians of the MSE values using the classical SinST, based on about 5000 different randomly generated representative training datasets and representative testing datasets, for different powers and coefficients of the models $Y_6$ and $Y_7,$ respectively. The order is taken here as $X_1 \ggg X_2 \ggg X_3.$ That is, the powers that are consistent with this order satisfy $\alpha _1<\alpha _2<\alpha _3,$ whereas the coefficients that are consistent with this order satisfy $\beta _1>\beta _2>\beta _3.$ From Figs. 11 and 12, we get the following:

When there are big gaps among the impacts of the input factors, the performance of the newly proposed SeqST is much better than the performance of the classical SinST. Keep in mind that $0\le X_i\le 1,$ i.e., when there are big gaps among the powers, $\alpha _1,\alpha _2,$ and $ \alpha _3,$ we have small gaps among the impacts of the input factors, $X_1,~X_2,$ and $X_3,$ and vice versa. However, when there are big gaps among the coefficients, $\beta _1,~\beta _2,$ and $\beta _3$, we have big gaps among the impacts of the input factors, $X_1,~X_2,$ and $X_3,$ and vice versa. Therefore, it is recommended to use the newly proposed SeqST for experiments with large gaps among the impacts of their input factors.
For small powers $\alpha _1$ and $\alpha _2$ and large power $\alpha _3$ (i.e., correct order of the importance), the performance of the newly proposed SeqST is much better than its performance for small power $\alpha _3$ (i.e., wrong order of the importance). For large coefficients $\beta _1$ and $\beta _2$ and small coefficient $\beta _3$ (i.e., correct order of the importance), the performance of the SeqST is much better than its performance for large coefficient $\beta _3$ (i.e., wrong order of the importance). Therefore, we get the same conclusion as above: it is recommended to carefully check the importance order before using the newly proposed SeqST.
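The opposite directions of the two orderings can be seen from a short worked calculation. As an assumption made only for illustration, take the area under each additive component over $0\le X_i\le 1$ as a rough proxy for the impact of the corresponding factor (the paper's own measure is the one defined in its Sect. 3):

$$\begin{aligned} \int _0^1 x^{\alpha }\,dx=\frac{1}{\alpha +1},\qquad \int _0^1 \beta x\,dx=\frac{\beta }{2},\qquad \alpha \ge 1,~\beta \ge 1. \end{aligned}$$

Under this proxy, the contribution of a factor in $Y_6$ decreases as its power grows (e.g., $\alpha =1,2,8$ give $1/2,~1/3,$ and $1/9$), while the contribution of a factor in $Y_7$ grows with its coefficient. This is in line with the order $X_1 \ggg X_2 \ggg X_3$ corresponding to $\alpha _1<\alpha _2<\alpha _3$ for the powers and to $\beta _1>\beta _2>\beta _3$ for the coefficients.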

To further investigate the effect of the number of points in each stage on the accuracy of the newly proposed SeqST, consider the following model:

$$\begin{aligned} Y_8=X_1^{4}+\frac{1}{2} X_2^{4}+\frac{1}{3} X_3^{4},~0\le X_i\le 1,~1\le i\le 3. \end{aligned}$$

Figure 13 investigates the importance of the three input factors for the model $Y_8$. From Fig. 13 and based on the area under each curve, the order of the importance of the input factors of $Y_8$ is $X_1 \ggg X_2 \ggg X_3.$ Table 4 gives the MSE values and MAE values for $Y_8$ based on the correct order of the importance and different numbers of training points in each stage. Moreover, Table 4 gives the MSE values and MAE values for the above-mentioned $Y_4$ and $Y_5.$ From Table 4, we conclude that $n_1>n_2>n_3$ is the best selection of the numbers of the training points in the three stages. Therefore, it is recommended to use the newly proposed SeqST with a descending order of the numbers of training points in its stages.
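To make the notion of a stage allocation concrete, the following sketch enumerates candidate allocations $(n_1,n_2,n_3)$ and picks out those in descending order. It reuses the constraints stated earlier for $Y_5$, namely $10\le n_i\le 100$ and $n_1+n_2+n_3=120$; the grid step of 10 points is an assumption made here for illustration, since the allocations actually listed in Table 4 are not reproduced in this text.

```python
from itertools import product

TOTAL = 120                    # total budget n_1 + n_2 + n_3
LEVELS = range(10, 101, 10)    # assumed grid for each n_i (step of 10)

# All allocations on the grid that respect 10 <= n_i <= 100 and the total.
allocations = [
    (n1, n2, TOTAL - n1 - n2)
    for n1, n2 in product(LEVELS, repeat=2)
    if 10 <= TOTAL - n1 - n2 <= 100
]

# Allocations with a strictly descending number of points, n_1 > n_2 > n_3.
descending = [a for a in allocations if a[0] > a[1] > a[2]]

print(len(allocations), len(descending))
print(descending)
```

On this grid only a small fraction of the feasible allocations are strictly descending, so the recommendation $n_1>n_2>n_3$ narrows the search considerably when the stage sizes are tuned.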

Table 4 The MSE and MAE for different numbers of points


7 Conclusion and future work

This paper gives a new sequential stage technique (SeqST) for designing and modeling experiments when the input factors are not equally important. In the newly proposed SeqST, the input factors are added to the process and modeled sequentially according to their importance: one input factor is added at each stage, while each remaining input factor is kept fixed at the point that has the highest influence. The newly proposed SeqST is compared with the classical single-stage technique (SinST). The effects of the order of the importance of the input factors, the number of the training points in each stage, the total number of the training points, and the gaps among the influences of the input factors on the performance of the newly proposed SeqST are investigated. This study gives a benchmark that guides experimenters in effectively designing and modeling their experiments. The main results show that:

The performance of the newly proposed SeqST is better than the performance of the classical SinST under different experimental conditions and scenarios.
The difference between the performance of the newly proposed SeqST and the classical SinST is larger for a small number of training points than for a large number of training points.
The difference between the performance of the newly proposed SeqST and the SinST is larger for experiments with large gaps among the impacts of their factors than for experiments with small gaps among the impacts of their factors.
The newly proposed SeqST has a good performance using the correct order of the importance of the input factors.
The newly proposed SeqST has a good performance using a descending order of the numbers of the training points in its stages.

Therefore, we conclude that the newly proposed SeqST is highly recommended with the correct order of the importance of the input factors and a descending order of the numbers of training points in its stages, for experiments with a few trials and/or large gaps between the importance of their factors.

During this work, the following interesting new ideas for future work have arisen. The first author is working on them, and some theoretical and simulation results have been obtained. However, more time and effort are needed to develop them into high-quality research papers with significant results.

This paper is a first step toward more future work in this regard. For instance, there is a significant need to theoretically study the behavior of the newly proposed SeqST more deeply. In this study, the LHDs are used as training and testing datasets and the polynomial model is used as the fitting model. The natural questions are: What is the effect of the type of training and testing datasets on the performance of the newly proposed SeqST? What is the effect of the type of fitting model on its performance? Is the newly proposed SeqST still applicable to implicit functional relationships in engineering without prior information? In future work, the performance of the newly proposed SeqST under various types of optimal experimental designs, such as uniform designs, orthogonal arrays, and D-optimal designs, and under various types of machine learning modeling techniques, will be investigated.
Elsawah (2022d, cf. its Sect. 5) presented a mixture factor-weight WD (MFWWD) as a new criterion for constructing new uniform mixture factor-weight experimental designs (training and testing datasets) when the input factors are not equally important. A comparison between the classical SinST using the new uniform mixture factor-weight experimental designs and the newly proposed SeqST using classical uniform designs in all of its stages will be investigated in future work.

Data availability

All data generated or analyzed during this study are included in this article.

References

Andre J, Siarry P, Dognon T (2000) An improvement of the standard genetic algorithm fighting premature convergence. Adv Eng Softw 32(1):49–60
Balestrassi PP, Popova E, Paiva AD, Lima JM (2009) Design of experiments on neural network’s training for nonlinear time series forecasting. Neurocomputing 72(4–6):1160–1178
Diwekar UM, Kalagnanam JR (1997) Efficient sampling technique for optimization under uncertainty. AIChE J 43(2):440–447
Elsawah AM (2017a) A closer look at de-aliasing effects using an efficient foldover technique. Statistics 51(3):532–557
Elsawah AM (2017b) A powerful and efficient algorithm for breaking the links between aliased effects in asymmetric designs. Aust N Z J Stat 59(1):17–41
Elsawah AM (2021a) Multiple doubling: a simple effective construction technique for optimal two-level experimental designs. Stat Pap 62:2923–2967
Elsawah AM (2021b) An appealing technique for designing optimal large experiments with three-level factors. J Comput Appl Math 384:113164
Elsawah AM (2022a) A novel non-heuristic search technique for constructing uniform designs with a mixture of two- and four-level factors: a simple industrial applicable approach. J Korean Stat Soc 51:716–757
Elsawah AM (2022b) Designing optimal large four-level experiments: a new technique without recourse to optimization softwares. Commun Math Stat 10:623–652
Elsawah AM (2022c) Improving the space-filling behavior of multiple triple designs. Comput Appl Math 41:180
Elsawah AM (2022d) Novel techniques for performing follow-up experiments based on prior information from initial-stage experiments. Statistics 56(5):1133–1165
Elsawah AM (2023a) A novel hybrid algorithm for designing mixed three- and nine-level experiments without modeling assumptions. Commun Stat Simul Comput. https://doi.org/10.1080/03610918.2023.2269323
Elsawah AM (2023b) A novel doubling-tripling-threshold accepting hybrid algorithm for constructing asymmetric space-filling designs. J Korean Stat Soc. https://doi.org/10.1007/s42952-023-00232-5
Elsawah AM, Gong Y (2023) A new non-iterative deterministic algorithm for constructing asymptotically orthogonal maximin distance Latin hypercube designs. J Korean Stat Soc 52:621–646
Elsawah AM, Vishwakarma GK (2022) A systematic construction approach for nonregular fractional factorial four-level designs via quaternary linear codes. Comput Appl Math 41:323
Fang KT, Li RZ, Sudjianto A (2006) Design and modeling for computer experiments. Chapman and Hall/CRC, New York
Fisher RA (1935) The design of experiments. Oliver and Boyd, Edinburgh
Goksungur YS, Dagbagli AU, Guvenc U (2005) Optimisation of pullulan production from synthetic medium by Aureobasidium pullulans in a stirred tank reactor by response surface methodology. J Chem Technol Biotechnol 80(7):819–827
Hammersley JM (1960) Monte Carlo methods for solving multivariate problems. Ann N Y Acad Sci 86(3):844–874
Husslage BGM, Rennen G, van Dam ER, Hertog D (2011) Space-filling Latin hypercube designs for computer experiments. Optim Eng 12(4):611–630
Iman RL, Conover WJ (1980) Small sample sensitivity analysis techniques for computer models, with an application to risk assessment. Commun Stat Theory Methods 17:1749–1842
Iordanis I, Koukouvinos C, Silou I (2022) Classification accuracy improvement using conditioned Latin hypercube sampling in supervised machine learning. In: 12th international conference on dependable systems, services and technologies (DESSERT), Athens, Greece
Lujan-Moreno GA, Howard PR, Rojas OG, Montgomery DC (2018) Design of experiments and response surface methodology to tune machine learning hyperparameters, with a random forest case-study. Expert Syst Appl 109:195–205
Mark L (2009) The mathematical mechanic: using physical reasoning to solve problems. Princeton University Press, Princeton
Mckay MD, Beckman RJ, Conover WJ (1979) A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2):239–245
Mohamed HS, Elsawah AM, Shao YB, Wu CS, Bakri M (2023) Analysis on the shear failure of HSS S690-CWGs via mathematical modelling. Eng Fail Anal 143:106881
Ortiz-Rodriguez JM, Martinez-Blanco MR, Vega-Carrillo HR (2006) Robust design of artificial neural networks applying the Taguchi methodology and DoE. In: Electronics, robotics and automotive mechanics conference (CERMA’06), pp 131–136. https://doi.org/10.1109/CERMA.2006.83
Owen AB (1992) Orthogonal arrays for computer experiments, integration and visualization. Stat Sin 2:439–452
Packianather MS, Drake PR, Rowlands H (2000) Optimizing the parameters of multilayered feedforward neural networks through Taguchi design of experiments. Qual Reliab Eng Int 16(6):461–473
Prasath BB, Elsawah AM, Liyuan Z, Poon K (2021) Modeling and optimization of the effect of abiotic stressors on the productivity of the biomass, chlorophyll and lutein in microalgae Chlorella pyrenoidosa. J Agric Food Res 5:100163
Prasath BB, Zahir M, Elsawah AM, Raza M, Lecong C, Chutian S, Poon K (2022) Statistical approaches in modeling of the interaction between bacteria and diatom under a dual-species co-cultivation system. J King Saud Univ Sci 34(1):101743
Ranjan R, Bingham D, Michailidis G (2008) Sequential experiment design for contour estimation from complex computer codes. Technometrics 50:527–541
Salmaso L, Pegoraro L, Giancristofaro RA, Ceccato R, Bianchi A, Restello S, Scarabottolo D (2022) Design of experiments and machine learning to improve robustness of predictive maintenance with application to a real case study. Commun Stat Simul Comput 51(2):570–852
Samuel AL (1959) Some studies in machine learning using the game of checkers. IBM J Res Dev 3(3):210–229
Sikirica A, Grbcic L, Kranjcevic L (2023) Machine learning based surrogate models for microchannel heat sink optimization. Appl Therm Eng 222:119917
Staelin C (2003) Parameter selection for support vector machines. Hewlett-Packard Company, London
Sukthomya W, Tannock J (2005) The optimisation of neural network parameters using Taguchi’s design of experiments approach: an application in manufacturing process modelling. Neural Comput Appl 14(4):337–344
Viana FAC (2013) Things you wanted to know about the Latin hypercube design and were afraid to ask. In: 10th world congress on structural and multidisciplinary optimization
Zhang S, Feng G, Yuan F, Guo S (2022) Twin support vector regression model based on heteroscedastic Gaussian noise and its application. IEEE Access 10:111738–111748


Acknowledgements

The authors greatly appreciate valuable comments and suggestions of the Editor and referees that significantly improved the paper. Elsawah greatly appreciates the kind support of Prof. Kai-Tai Fang.

Funding

Elsawah’s work was supported by the UIC Research Grants with number of (R201810, R201912 and R202010); the Curriculum Development and Teaching Enhancement with number of (UICR0400046-21CTL); the Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College with number of (2022B1212010006); and the Guangdong Higher Education Upgrading Plan (2021-2025) with number of (UIC R0400001-22).

Author information

Authors and Affiliations

Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, 519087, China
A. M. Elsawah
Department of Statistics and Data Science, Faculty of Science and Technology, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, 519087, China
A. M. Elsawah, Yi-An Wang & Zhihan Chen
Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, 44519, Egypt
A. M. Elsawah
Department of Actuarial Sciences, Faculty of Applied Sciences, Ankara University, Cankaya, Ankara, Turkey
Fatih Tank

Corresponding author

Correspondence to A. M. Elsawah.

Ethics declarations

Conflict of interest

There is no conflict of interest.

Additional information

Communicated by Graçaliz Pereira Dimuro.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Elsawah, A.M., Wang, YA., Chen, Z. et al. A sequential designing-modeling technique when the input factors are not equally important. Comp. Appl. Math. 43, 9 (2024). https://doi.org/10.1007/s40314-023-02519-z


Received: 11 February 2023
Revised: 08 September 2023
Accepted: 03 November 2023
Published: 07 December 2023
DOI: https://doi.org/10.1007/s40314-023-02519-z
