1 Introduction

In the past three decades, computer simulated experiments (CSE) have replaced classical experiments for investigating complex physical phenomena, especially when classical (physical) experiments are not feasible; examples include the use of reservoir simulators to predict ultimate oil recovery and the use of finite element codes to predict the behavior of metal structures under stress [1]. Computer simulated experiments are deterministic in nature [2]; identical settings of the input variables always produce an identical set of output responses. Consequently, space-filling designs, which aim to spread the design points over the region of interest, are required. The most popular class of space-filling design for computer simulated experiments is the Latin hypercube design (LHD), originally proposed by McKay and co-workers [3] in 1979. Optimal LHDs can be constructed through combinatorial (non-search) methods [4] or through search algorithms [5, 6]. The former generate designs with good properties but are restricted in design size; for example, the methods proposed by Butler [4] are limited to designs whose size is a prime number. The latter are based largely on improving a design by exchanges between pairs of design points. Exchange algorithms can be time consuming, but the generated designs are flexible and the approach is straightforward to implement. CSEs are usually complex and involve many input variables [7], so a large number of design runs is required to estimate the parameters corresponding to the factors of interest in the model. For a problem with d input variables and n runs, the total number of possible LHDs is \( (n!)^{d} \). Obviously this number explodes exponentially as n and d increase, so the full space of LHDs cannot be explored.
In this case we need search algorithms to lead us to a good design with respect to an optimality criterion. The key idea of all existing search algorithms is to use some kind of exchange procedure to move towards better designs.

The search-based approach to selecting a design combines a search algorithm with an optimality criterion [8]. For example, Morris and Mitchell [5] adopted a version of the simulated annealing (SA) algorithm to search for optimal LHDs with respect to the ϕp criterion. Li and Wu [8] proposed a columnwise-pairwise (CP) algorithm with respect to the D-efficiency criterion; CP was reported to be very simple and easy to implement. Ye and co-workers [6] adapted the CP algorithm to search for symmetric LHDs under various optimality criteria such as the entropy and ϕp criteria. Park [9] proposed a row-wise element-exchange algorithm along with the IMSE and entropy criteria. Leary et al. [10] adapted the CP and SA algorithms to construct optimal designs within the orthogonal-array-based Latin hypercube class using the ϕp criterion. Jin et al. [11] developed an enhanced stochastic evolutionary (ESE) algorithm to search for the best design under various optimality criteria, such as the maximin distance, ϕp, and entropy criteria; ESE has received wide attention from researchers due to its performance in constructing optimal LHDs. Liefvendahl and Stocki [12] applied a version of the genetic algorithm (GA) to search for optimal LHDs under the ϕp and maximin distance criteria; similar work can be found in [13], where the authors applied a GA to construct maximin designs. Grosso et al. [14] used the iterated local search algorithm and SA to construct optimal LHDs under the maximin distance and ϕp criteria. Vianna et al. [15] proposed an algorithm for fast optimal LHDs using the idea of a seed design under the maximin distance and ϕp criteria. Husslage and Rennen [16] proposed a method to construct optimal LHDs using different starting points. Given the popularity of SA and ESE together with the ϕp criterion, this paper presents enhancement methods to improve the capability of SA and ESE under the ϕp criterion.
In the following sections, we describe the designs and optimality criteria, followed by the search algorithms and their modifications, the results, and the conclusions, respectively.

2 Latin Hypercube Design and Optimality Criteria

2.1 Latin Hypercube Design

An LHD is an n × d matrix X, where n is the number of runs and d is the number of input variables. It can be constructed based on the idea of stratified sampling, which ensures that all subregions of the design space are sampled with equal probability. A Latin hypercube sample has

$$ X_{ij} = \frac{{\pi_{ij} - U_{ij} }}{n}, $$
(1)

where \( \pi_{i1} ,\pi_{i2} , \ldots ,\pi_{id} \) are independent random permutations of (1, 2, …, n) and \( U_{ij} \) are n × d values of i.i.d. uniform U[0,1] random variables independent of the \( \pi_{ij} \). In practice, an LHD can be generated by randomly permuting each column containing the levels (1, 2, …, n) and then combining the d columns to form the design matrix X. An LHD guarantees uniform coverage of each input variable in every one-dimensional projection, which is a key benefit for deterministic computer experiments.
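As a concrete illustration, Eq. (1) translates directly into code. The sketch below is in Python (the simulations later in this paper were run in R); the function name and the seeding argument are our own choices:

```python
import numpy as np

def latin_hypercube(n, d, rng=None):
    """Draw one random n-by-d Latin hypercube sample on (0, 1] via Eq. (1):
    X_ij = (pi_ij - U_ij) / n, where each column of pi is an independent
    random permutation of 1..n and the U_ij are i.i.d. U[0, 1) variables."""
    rng = np.random.default_rng(rng)
    pi = np.column_stack([rng.permutation(np.arange(1, n + 1)) for _ in range(d)])
    u = rng.uniform(size=(n, d))
    return (pi - u) / n
```

Each column then contains exactly one point in each of the n equal-probability strata of [0, 1], which is the defining property of a Latin hypercube sample.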

2.2 The ϕp Optimality Criterion

Morris and Mitchell [5] proposed a modification of the maximin distance criterion to search for the optimal design. For a given design X, the intersite distance between any two design points (Euclidean when t = 2) can be calculated from

$$ d(x_{i} ,x_{j} ) = \left[ {\sum\limits_{k = 1}^{d} {\left| {x_{ik} - x_{jk} } \right|^{t} } } \right]^{1/t} $$
(2)

Using (2), the intersite distances for every pair of design points are calculated and can be arranged in a symmetric matrix. Let the distance list \( d_{1} ,d_{2} , \ldots ,d_{m} \) contain the distinct distance values sorted from smallest to largest, and define the index list \( (J_{1} ,J_{2} , \ldots ,J_{m} ) \), where \( J_{j} \) is the number of pairs of sites in the design separated by distance \( d_{j} \). Then X is a maximin design if, among the available designs, it sequentially maximizes \( d_{j} \) and minimizes \( J_{j} \). The scalar criterion can be expressed as

$$ \phi_{p} = \left[ {\sum\limits_{j = 1}^{m} {J_{j} d_{j}^{ - p} } } \right]^{1/p} , $$
(3)

where p is a positive integer and \( J_{j} \) and \( d_{j} \) are specified from X. In this study, we use the adapted form of ϕp [11], which is simpler to implement than (3):

$$ \phi_{p} = \left[ {\sum\limits_{i = 1}^{n - 1} {\sum\limits_{j = i + 1}^{n} {\frac{1}{{d_{ij}^{p} }}} } } \right]^{{\frac{1}{p}}} $$
(4)

After the ϕp value has been calculated, the design that minimizes ϕp is taken as the optimal design.
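For reference, Eq. (4) can be evaluated in a few lines. The sketch below is in Python; the defaults p = 15 and t = 2 are illustrative choices, not values prescribed by the text:

```python
import numpy as np

def phi_p(X, p=15, t=2):
    """phi_p criterion of Eq. (4): [sum over pairs d_ij^{-p}]^{1/p}.
    Intersite distances follow Eq. (2) with exponent t (t = 2: Euclidean)."""
    n = X.shape[0]
    i, j = np.triu_indices(n, k=1)                     # all pairs i < j
    d = (np.abs(X[i] - X[j]) ** t).sum(axis=1) ** (1.0 / t)
    return np.sum(d ** -p) ** (1.0 / p)
```

Smaller ϕp values indicate better space filling; a large p makes the criterion concentrate on the smallest intersite distance.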

3 Search Algorithms

There are many search algorithms for constructing optimal LHDs. This section first discusses two search algorithms, simulated annealing (SA) and enhanced stochastic evolutionary (ESE), and then explains how each algorithm is modified to improve its efficiency.

3.1 Simulated Annealing (SA) Algorithm

SA is based on the analogy between the simulation of the annealing of solids and the solving of large combinatorial optimization problems [9]. Morris and Mitchell [5] adapted SA to construct optimal LHDs under the ϕp optimality criterion. The LHD minimizing the ϕp value is retained as the best LHD found in the search space. The entire process is described in Algorithm 1.

SA searches by generating a new LHD (X_try) via a random element exchange and updating X with a better LHD, or with a worse LHD with a suitable acceptance probability; X_best is then updated whenever a better LHD is found. The probability of accepting a worse design is controlled by the temperature t; the chance of accepting a worse LHD decreases according to the cooling schedule (t = t * C_t). SA has several parameters, such as t_0, I_max, and C_t. A discussion of setting the SA parameters for LHD construction, and of how well SA can move away from a local optimum of ϕp, can be found in [5]. In this study, we use a heuristic method to find the best set of SA parameters, as presented in [17].
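The SA search just described can be sketched compactly as follows (Python; the ϕp function of Eq. (4) is included for self-containedness, and the defaults t0, ct, imax, and the number of cooling stages are placeholders, not the tuned values of [17]):

```python
import numpy as np

def phi_p(X, p=15, t=2):
    """phi_p of Eq. (4) with intersite distances per Eq. (2)."""
    i, j = np.triu_indices(X.shape[0], k=1)
    d = (np.abs(X[i] - X[j]) ** t).sum(axis=1) ** (1.0 / t)
    return np.sum(d ** -p) ** (1.0 / p)

def sa_lhd(n, d, t0=1.0, ct=0.95, imax=50, n_temps=50, p=15, rng=None):
    """SA sketch: random element exchange within a column, Metropolis
    acceptance at temperature t, cooling t = t * ct after each stage."""
    rng = np.random.default_rng(rng)
    X = np.column_stack([rng.permutation(n) + 1 for _ in range(d)])
    f = phi_p(X, p)
    X_best, f_best = X.copy(), f
    t = t0
    for _ in range(n_temps):
        for _ in range(imax):
            X_try = X.copy()
            k = rng.integers(d)                          # random column
            i1, i2 = rng.choice(n, size=2, replace=False)
            X_try[[i1, i2], k] = X_try[[i2, i1], k]      # element exchange
            f_try = phi_p(X_try, p)
            # accept a better design, or a worse one with probability
            # exp((f - f_try) / t)
            if f_try < f or rng.random() < np.exp((f - f_try) / t):
                X, f = X_try, f_try
                if f < f_best:
                    X_best, f_best = X.copy(), f
        t *= ct                                          # cooling schedule
    return X_best, f_best
```

At each temperature stage, imax exchanges are proposed; as t shrinks, worse designs are accepted less and less often.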

This paper focuses on improving the efficiency of SA for constructing optimal LHDs. From Eq. (4), the time for the ϕp calculation depends mainly on the number of runs (n): it is dominated by the O(n²) calculation of the intersite distance matrix, and the search process is time consuming mainly because of repeated ϕp calculations. Jin et al. [11] proposed an optimized way to calculate ϕp when an element exchange is applied to an LHD. An element exchange swaps two selected elements; when exchanging the entries of rows i1 and i2 within column k \( (x_{{i_{1} k}} \leftrightarrow x_{{i_{2} k}} ) \), only rows i1 and i2 and columns i1 and i2 of the distance matrix change. The ϕp of the LHD after the element exchange can therefore be calculated in linear time as follows.

For any 1 ≤ j ≤ n with j ≠ i1, i2, let

$$ s(i_{1} ,i_{2} ,k,j) = \left| {x_{{i_{2} k}} - x_{jk} } \right|^{t} - \left| {x_{{i_{1} k}} - x_{jk} } \right|^{t} $$
(5)

then

$$ d^{\prime}_{{i_{1} j}} = d^{\prime}_{{ji_{1} }} = \left[ {d_{{i_{1} j}}^{t} + s(i_{1} ,i_{2} ,k,j)} \right]^{1/t} $$
(6)

and

$$ d^{\prime}_{{i_{2} j}} = d^{\prime}_{{ji_{2} }} = \left[ {d_{{i_{2} j}}^{t} - s(i_{1} ,i_{2} ,k,j)} \right]^{1/t} $$
(7)

Thus new ϕ p is computed by

$$ \phi_{p}^{\prime } = \left[ \begin{aligned} & \phi_{p}^{p} + \sum\limits_{{1 \le j \le n,j \ne i_{1} ,i_{2} }} {\left[ {(d_{i1j}^{\prime } )^{ - p} - (d_{i1j} )^{ - p} } \right]} + \\ & \sum\limits_{{1 \le j \le n,j \ne i_{1} ,i_{2} }} {\left[ {(d_{i2j}^{\prime } )^{ - p} - (d_{i2j} )^{ - p} } \right]} \\ \end{aligned} \right]^{1/p} $$
(8)

As shown in (5)–(8), an element exchange updates only some rows and columns, so there is no need to reconstruct the entire distance matrix to calculate ϕp; the complexity of the ϕp update reduces to linear time. With this modification, the modified SA (MSA) performs effectively by reducing the time complexity of the ϕp calculation.
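To make Eqs. (5)–(8) concrete, the sketch below (Python; the helper name is our own, D is assumed to hold the current intersite distance matrix and phi the current ϕp value) computes the updated ϕp in O(n) after one element exchange. Note that the s term of Eq. (5) is added for the i1 row and subtracted for the i2 row, which keeps the update consistent with a full recomputation:

```python
import numpy as np

def phi_p_after_swap(X, D, phi, i1, i2, k, p=15, t=2):
    """Linear-time phi_p update of Eqs. (5)-(8) after swapping
    X[i1, k] <-> X[i2, k], given the current distance matrix D
    and current criterion value phi."""
    n = X.shape[0]
    j = np.array([m for m in range(n) if m not in (i1, i2)])
    s = np.abs(X[i2, k] - X[j, k]) ** t - np.abs(X[i1, k] - X[j, k]) ** t  # Eq. (5)
    d1 = (D[i1, j] ** t + s) ** (1.0 / t)    # Eq. (6): updated d(i1, j)
    d2 = (D[i2, j] ** t - s) ** (1.0 / t)    # Eq. (7): updated d(i2, j)
    return (phi ** p
            + np.sum(d1 ** -p - D[i1, j] ** -p)
            + np.sum(d2 ** -p - D[i2, j] ** -p)) ** (1.0 / p)  # Eq. (8)
```

In an SA or ESE implementation, D and ϕp would then be updated in place whenever the exchange is accepted, avoiding the O(n²) rebuild of the distance matrix.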

3.2 Enhanced Stochastic Evolutionary (ESE) Algorithm

Jin et al. [11] proposed the enhanced stochastic evolutionary (ESE) algorithm to construct optimal designs for CSE. The algorithm searches in two nested steps: a local search, called the inner loop, and an outer loop that updates the global best and tunes the probability of accepting a worse design. The inner loop performs a local search by constructing a set of J LHDs, formed by element exchanges in column i mod d of X, and selecting the best LHD from the set. If the selected design is better, or not better but accepted with sufficient probability, the local optimal design (X) is updated.

In Algorithm 2, the inner loop corresponds to steps 4–7 and repeats up to a maximum of M iterations; X_best and X are updated according to the acceptance criteria. The outer loop controls the process by updating the temperature Th. Unlike in SA, the schedule for updating Th is not fixed but is controlled by the performance of the inner-loop search, through the number of improvements (n_imp) and the number of acceptances (n_acpt). There are two regimes for updating Th, called the improving process and the exploration process. As described in step 9, when X_best is improved in the inner loop and is better than the previous best design (X_old_best), the improving process is active; otherwise the exploration process is active.

In the improving process (flag_imp = 1), Th is adjusted based on the performance of the local search, using the acceptance ratio (n_acpt/M) and the improvement ratio (n_imp/M); β1 is a cut-off value for updating Th, where 0 < β1 < 1, and Jin et al. [11] recommended β1 = 0.1. If the improvement ratio is greater than β1, Th is increased (Th = Th/α1); if the improvement ratio is lower than the acceptance ratio, Th is decreased (Th = Th * α1); otherwise Th is unchanged, where 0 < α1 < 1, with α1 = 0.8 as suggested by Jin et al. [11].

In the exploration process (flag_imp = 0), activated when the local search fails to improve the best design, Th is adjusted to help the search move away from a local optimum, using only the acceptance ratio. If the acceptance ratio is between β1 and β2, where 0 < β1 < β2 < 1, Th is increased (Th = Th/α2); if the acceptance ratio is greater than β2, Th is decreased (Th = Th * α3), where 0 < α3 < α2 < 1, with α2 = 0.9 and α3 = 0.7. Th is thus first increased rapidly, so that more worse designs can be accepted, and then decreased slowly to search for better designs after moving away from the local optimum. Jin et al. [11] recommended β1 = 0.1 and β2 = 0.8.
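The temperature-control rules described above can be collected into a small helper. This is a sketch of the prose description only; the authoritative update rules are those of Jin et al. [11], and the function name and branch ordering here are our own:

```python
def update_th(th, n_acpt, n_imp, M, improving,
              beta1=0.1, beta2=0.8, alpha1=0.8, alpha2=0.9, alpha3=0.7):
    """Outer-loop temperature update (sketch of the description above)."""
    acpt, imp = n_acpt / M, n_imp / M
    if improving:                      # improving process (flag_imp = 1)
        if imp > beta1:
            th = th / alpha1           # raise Th
        elif imp < acpt:
            th = th * alpha1           # lower Th
        # otherwise Th is unchanged
    else:                              # exploration process (flag_imp = 0)
        if beta1 < acpt < beta2:
            th = th / alpha2           # raise Th rapidly
        elif acpt > beta2:
            th = th * alpha3           # lower Th slowly
    return th
```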

3.3 Modified Enhanced Stochastic Evolutionary (MESE) Algorithm

In this section we present an enhancement of ESE, called MESE. We combine the advantage of SA (its local search process) with the advantage of ESE (its global search process) to improve the search. MESE still contains two nested loops. The outer loop is similar to that of ESE except for the stopping rule and a new variable X_gbest that records the best-so-far LHD, while the inner loop is modified as described in Algorithm 3. The maximum number of cycles is replaced by the following condition: if X_best does not improve on the global best design (X_gbest) for \( \delta \) consecutive times, the search terminates. In this study, we set δ = 40 [6].

Algorithm 3 details how a part of SA is combined into the modified ESE (MESE).

In Algorithm 3, the major enhancement is in the inner loop; the modification is in steps 4–6 and 7. Instead of generating J distinct LHDs at one time and selecting the best one, we construct one LHD (X_tmp) at a time and keep the best one (X_try) over J iterations. This change reduces the computational cost, since the best LHD is tracked while the candidates are generated instead of materializing all J distinct LHDs before choosing the most optimal one. In this study, the parameter J is set to nC2/5 but not larger than 50, and the parameter M is set in the range 2 * nC2 * d/J ≤ M ≤ 100. The tolerance level (tl) is set to 0.0001; our empirical study showed that smaller values do not improve the search. We use C_max = d * 10 + 10 and Max = 40. All simulation studies presented in this paper were performed using the R program.
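The modified inner-loop step can be sketched as follows (Python; the function name is our own, and phi can be any criterion to minimize, e.g. the ϕp of Eq. (4)):

```python
import numpy as np

def inner_step(X, col, J, phi, rng=None):
    """One MESE inner-loop step (sketch): try J random element exchanges
    in column `col`, one candidate at a time, keeping only the running
    best (X_try) instead of materializing all J designs first."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    X_try, f_try = None, np.inf
    for _ in range(J):
        X_tmp = X.copy()                         # one candidate at a time
        i1, i2 = rng.choice(n, size=2, replace=False)
        X_tmp[[i1, i2], col] = X_tmp[[i2, i1], col]
        f_tmp = phi(X_tmp)
        if f_tmp < f_try:                        # keep the best of J
            X_try, f_try = X_tmp, f_tmp
    return X_try, f_try
```

Only the current candidate and the running best are held in memory, rather than all J candidate designs.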

4 Result and Discussion

The values of ϕp at the termination of MSA, ESE, and MESE for each problem dimension are presented in Table 1. Each case was repeated 10 times to account for the effect of different starting points, and descriptive statistics (max, min, mean, and sd) of the ϕp values obtained by each search technique are reported. The results indicate that MSA, ESE, and MESE perform similarly in minimizing ϕp for small-dimensional problems (i.e., d = 2 and 3). The standard deviations over the 10 replications are slightly larger for ESE and MESE than for MSA, indicating a more consistent search process for MSA when different starting points are considered. For medium-dimensional problems (i.e., d = 7, 8, and 10), the ϕp values from ESE and MESE are slightly lower than those from MSA, and their standard deviations are smaller, indicating that the search processes of ESE and MESE are more consistent when the search space is larger. For large-dimensional problems (i.e., d = 10 and 15), both ESE and MESE perform better than MSA, while MESE is slightly better than ESE in minimizing ϕp. Hence, if the goal is a good space-filling property, either ESE or MESE can be used to construct the optimal LHD.

Table 1 Performance of MSA, ESE and MESE

The efficiency of the MSA, ESE, and MESE algorithms is also reported in Table 1, in terms of the time and the number of exchanges each algorithm needed to reach its optimal ϕp value. As mentioned above, the search algorithms were repeated 10 times for each problem dimension, so all values are reported as averages. MESE clearly converges much faster than ESE and MSA, with less elapsed time, and it also requires fewer exchanges than the other two algorithms. Similar results are observed for medium- and large-dimensional problems, where MESE converges much faster than MSA and performs slightly better than ESE. This indicates that, when time constraints are taken into account, MESE is the better choice for constructing optimal LHDs.

5 Conclusion

This paper presents methods to enhance the SA and ESE algorithms for the construction of optimal LHDs. The main enhancements are in the calculation of the ϕp criterion and the tolerance-level setting in SA; for MESE, the enhancement combines SA and ESE, especially in the inner loop, as shown in Algorithm 3. As presented in the results section, MESE outperforms ESE and MSA in terms of both the design property achieved and efficiency; hence, MESE is recommended for constructing optimal LHDs for CSE. To extend this work, other classes of designs could be combined with MESE to search for the best design within each class. Other search algorithms, such as particle swarm optimization (PSO) or other metaheuristics, could also be developed for constructing optimal LHDs. Finally, the accuracy of approximation models built from the obtained optimal LHDs could be investigated to explore the relation between the space-filling property and the prediction accuracy of the model.