Calibration of Genetic Algorithm Parameters for Mining-Related Optimization Problems

Villalba Matamoros, Martha E.; Kumral, Mustafa

doi:10.1007/s11053-018-9395-2


Original Paper
Published: 30 July 2018

Volume 28, pages 443–456, (2019)

Natural Resources Research


Martha E. Villalba Matamoros¹ &
Mustafa Kumral¹


Abstract

Genetic algorithms (GA) are widely used to solve engineering optimization problems. The quality and performance of the solution generated strongly depend on the selection of the GA parameter values (crossover and mutation rates and population size). We propose an approach based on full factorial and response surface methodology experimental designs to calibrate GA parameters such that the objective function is maximized/minimized and the relative importance of the parameters is quantified. The approach was tested by applying it to stope optimization of underground mines, where profit can vary ± 7% based solely on GA parameters. Results showed that: (1) a larger population size did not always increase solution time; (2) solution time was positively related to crossover and mutation rates; and (3) simultaneous analysis of solution time and profit illustrated the trade-off between acceptable computing time and profit desirability through GA parameter selection. This approach can be used to calibrate parameters of other metaheuristics.


Introduction

Metaheuristic approaches such as genetic algorithms, simulated annealing and ant colony optimization have been widely used to deal with various mining optimization problems (Kumral 2004; Kumral and Dowd 2005; Leite and Dimitrakopoulos 2007; Manchuk and Deutsch 2008; Lamghari and Dimitrakopoulos 2012; Shishvan and Sattarvand 2015; Goodfellow and Dimitrakopoulos 2016; Ruiseco et al. 2016; Sauvageau and Kumral 2016; Villalba and Kumral 2018a, b). Metaheuristic techniques are popular because they are easy to formulate, are highly adaptable to the problem structure, require less rigorous mathematical background on the part of the user, and require shorter computational times compared to exact approaches (Rayward-Smith 1996). However, it is almost impossible to know how near a solution is to optimal. As an intelligent and iterative process, metaheuristic generation adapts global (exploration) and local (exploitation) search concepts to find near-optimal solutions based on learning strategies (Osman and Laporte 1996). Given that mines are planned and designed under many uncertainties, near-optimal solutions are often acceptable. “Nearness” depends on the trade-off between exploration and exploitation, which is governed by the selection of parameter values. Thus, parameter calibration is a key issue when implementing a metaheuristic approach. However, it is usually overlooked and parameters are selected arbitrarily instead.

Evolutionary algorithms like genetic algorithms (GA) have been used in finance and mine planning optimization with positive results, for example in: increasing the profit by controlling dilution, accounting for orebody uncertainty, and balancing mining directions and orebody orientation in stope definition for underground mining (Villalba and Kumral 2018a, b); determining ore and waste dig limits of daily production for open-pit mining (Ruiseco et al. 2016; Ruiseco and Kumral 2017); finding the optimal location for a mine facility; and optimizing parameters of the Schwartz–Smith two-factor model to analyze the information from a transaction in future markets to reduce financial risk in future contracts on commodity prices (Sauvageau and Kumral 2016). However, these GA-based methodologies did not cover in detail the importance of defining their GA control parameters.

A generation in GA starts with an initial solution set or population of chromosomes. Using crossover and mutation operators, new offspring are formed. Solutions with high fitness values are retained to generate offspring. In other words, the higher the fitness, the higher the chance that solution will pass its genotype to the next generation. This process is repeated over many iterations until convergence. The performance of a GA depends mainly on the parameters (Mitchell 1999; Reeves 2003; Pandey et al. 2014), whose values should be determined carefully; otherwise, the search may end with a solution that is trapped in a local optimum. In practice, parameter values are usually calibrated one at a time using a simple sensitivity analysis. Although this method can potentially find sensitive parameters, it is time-consuming and inefficient in mine planning optimization where there are many parameters.

In the past, various approaches have been proposed to calibrate GA parameters. Grefenstette (1986) calibrated GA parameters through an adaptive search and tuning strategy using two performance metrics: (1) online performance, the average performance of all structures examined during the search (t = 0, 1,…,T), where each structure is evaluated at each time t; and (2) offline performance, the average of the best performances in the time intervals [0,t]. Global robustness requires performance measures for the entire response surface. For example, six control parameters needed 1000 evaluations using metalevel GAs, and each GA was tested against five functions. Since the first experiment selected samples from a performance distribution, the best 20 samples of this distribution required additional testing. Even though the tests succeeded in finding control parameters that optimized GA performance, this approach may require intractable CPU time to select control parameters in relevant mining problems. Eiben et al. (1999) proposed a taxonomy to eliminate vagueness in terminology and reviewed previous research on control parameters in evolutionary algorithms. In that paper, three drawbacks regarding calibration were emphasized: (i) trying all combinations becomes nearly impossible in some problems; (ii) even if interactions are ignored, the evaluations can be time-consuming; and (iii) despite running many configurations, there is still a chance that the selected parameters are not optimal. The epistatic interaction between control parameters, which includes the mutual impact of parameters on each other and the complexity of their joint influence on GA behavior, challenges any optimization of GA settings. A self-adaptive method could let a GA set its parameters itself; otherwise, a skeptical approach could use heuristic tools to adjust parameters in addition to adaptive parameter control.

Nannen and Eiben (2007) presented an approach called Relevance Estimation and Value Calibration of GA parameters (REVAC), which systematically explores the range of possible parameter setting combinations. In this approach, a distribution for each parameter (marginal density function) over the parameter’s range assigns high probability to values that lead to excellent GA performance. Distributions with a narrow peak or a broad plateau correspond to highly or moderately relevant parameters, respectively. Thus, values with high probability in such distributions define the GA setting. With this approach, however, the use of 1000 evaluations demands unmanageable computing time; therefore, it is hard to implement in mining problems.

This paper proposes an approach to calibrate the population size and crossover and mutation rates of a GA using (1) full factorial design (FFD) to estimate main effects and parameter interactions and (2) response surface methodology (RSM) to find optimum values of design parameters. These two methods share the following characteristics (Bezerra et al. 2008): (a) the experimental design includes different levels and combinations of factors; (b) the factors are independent variables; (c) “levels” refer to values that these factors can take and are used during coding to replace design factors with an indicator set (e.g., in a three-level experiment design, the low, middle and high values are replaced by – 1, 0, + 1, respectively); (d) responses are dependent variables; and (e) residual error helps to measure how well a model fits the experimental data where low residual error is desired.

Full Factorial Design

Ronald A. Fisher introduced FFD in the 1920s through the early 1930s in collaboration with researchers from many fields and proposed the three principles of experimental design: randomization, replication and blocking (Montgomery 1997). Randomization prevents unknown bias, which can modify the result of an experiment. Replication of an experiment under the same conditions is performed to estimate experimental error and increase the precision of the experiment. Blocking helps to increase precision by eliminating the effect of nuisance factors on experimental error (Clifton Young 1996; Telford 2007).

An FFD consists of two or more factors, each with two or more levels. It differs from other designs because experimental units take all combinations of the factor levels (Fig. 1). An FFD is also called a “fully crossed design” because it permits analysis of the effect of all factors on the response variable, all levels of the factors and interactions between factors. For three factors, the design is geometrically a cube whose vertices, edge midpoints, axial (face-center) points and center supply the experimental points (Fig. 1, left); in general, n factors imply an n-dimensional shape. With three levels per factor, an FFD requires 3ⁿ runs (n = number of factors) to cover all experimental points. Because the number of combinations grows exponentially with the number of factors, a reduced version called fractional factorial design runs only a fraction of the full set of points.
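The enumeration of a three-level FFD can be sketched as follows. This is a minimal illustration, not code from the paper; the function name `full_factorial` and the coded levels −1, 0, +1 are assumptions chosen to match the coding convention described in the Introduction.

```python
from itertools import product

def full_factorial(n_factors, levels=(-1, 0, 1)):
    """Return every combination of coded levels: len(levels) ** n_factors runs."""
    return list(product(levels, repeat=n_factors))

runs = full_factorial(3)
print(len(runs))  # 27 runs for three factors at three levels
print(runs[0], runs[-1])
```

Adding a fourth three-level factor would raise the count to 81 runs, which is the exponential growth that motivates fractional designs.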

Response Surface Methodology

The RSM was introduced by George E. P. Box and Kenneth B. Wilson in the 1950s for use in chemical industries (Box and Wilson 1951). The ability to refine models and optimize a response that depends on several variables are the main strengths of RSM (Montgomery 1997). The methodology consists of: (1) choosing the independent variables; (2) delimiting the experimental region; (3) determining the experimental design; (4) fitting the experimental data; (5) evaluating the model fitness; and (6) evaluating the displacement in direction to the optimal region, which may lead to finding the optimal values of the experimental variables (Bezerra et al. 2008). Polynomial functions describe the relationships between response and independent variables. Since a single polynomial model may not represent the functional relationship over the entire domain, the domain can be divided to yield a reasonable approximation per portion. Interactions between independent variables can be modeled by a low-order (e.g., first order or second order) polynomial model to describe the system and explore experimental conditions leading to its optimization (Bezerra et al. 2008).

The Box–Behnken (known as three-level) and Box–Wilson (known as central composite) designs are two common designs in RSM. The Box–Behnken design performs well with few factors. It is suitable for fitting quadratic models that require three levels of each factor; the treatment combinations are at the edge midpoints and center of the process space (Ferreira et al. 2007). For instance, the geometry of a Box–Behnken design for three factors can be pictured as a sphere partially within a cube, touching the cube at its 12 edge midpoints (Fig. 1, middle). The number of experiments is 2n (n − 1) + c_p = 13, where c_p is a central point and n is the number of factors. The Box–Wilson design is considered a fractional design that adds points to estimate curvature. For three factors, the circumscribed central composite design has points that describe a sphere around the factorial cube. The design may consider 2ⁿ + 2n + c_p = 16 experiments, which include fractional factorial points, axial points at distance alpha from c_p, and center points (Fig. 1, right). However, if the data lack curvature, FFD or an experimental design for a first-order model can be explored.
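The two RSM geometries can be made concrete with a small sketch in coded units. The function names, the `n_center` argument and the rotatable choice of alpha are assumptions for illustration only; the number of center runs is left as a parameter because the total run count depends on it.

```python
from itertools import combinations, product

def box_behnken(n_factors, n_center=1):
    """Edge-midpoint runs (+/-1 on each pair of factors, 0 elsewhere) plus center runs."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * n_factors
            point[i], point[j] = a, b
            runs.append(tuple(point))
    return runs + [(0,) * n_factors] * n_center

def central_composite(n_factors, alpha, n_center=1):
    """Two-level factorial corners, axial runs at distance alpha, and center runs."""
    corners = list(product((-1.0, 1.0), repeat=n_factors))
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(tuple(point))
    return corners + axial + [(0.0,) * n_factors] * n_center

# alpha = (2**n) ** 0.25 is a common rotatable choice (an assumption, not from the paper).
bb = box_behnken(3)
ccd = central_composite(3, alpha=8 ** 0.25)
print(len(bb))   # 12 edge midpoints + 1 center run
print(len(ccd))  # 8 corners + 6 axial runs + 1 center run
```

Each extra center run adds one experiment to either design, so the totals shift with the number of replicated center points.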

Model Formulation

During development of a GA-based stope optimization method, Villalba and Kumral (2018a, 2018b) noted that the GA parameter values significantly affected the quality of the solution, with profits varying by ± 11% and ± 7%, respectively. Furthermore, the quality of the solution could not be assessed because the parameter selection was based on sensitivity analysis, which varies one parameter at a time and cannot quantify the interaction effects of multiple parameters on profit or how well the solution space was searched. In the current research, multiple runs of the stope optimizer for each parameter combination were executed with the main goals of (1) finding the parameter configuration that maximizes profit in the stope optimizer using RSM and (2) understanding the relative importance of each parameter and quantifying parameter interactions using FFD.

Each experiment provides a response (y), and the change in response produced by a change in the level of a factor defines the effect of that factor. An interaction between factors is present when the difference in response between the levels of one factor is not the same at all levels of the other factors (Montgomery 1997). The interaction between factors is described by a regression model (Eq. 1), which describes the set of responses (Eq. 2) given by metaheuristic optimization.

$$ \hat{y}(\text{z} ) = \beta_{0} + \beta_{1} z_{1} + \beta_{2} z_{2} + \beta_{3} z_{3} + \beta_{4} z_{4} + \cdots + \beta_{p} z_{p} + \varepsilon $$

(1)

Factors or p predictor variables z₁, z₂,…,z_p in Eq. 1 are associated with the metaheuristic optimization parameters to be calibrated. For instance, if the model requires three parameters to be calibrated, predictor variables z₁, z₂ and z₃ will be linked with these parameters, and their interaction terms are represented by predictor variables z₄ to z_p. The regression model also contains a variable ε, which captures the measurement error and the effect of other variables not explicitly considered in the model. The linear regression theory of a single $ \hat{y}(\text{z} ) $ response defines a mean that depends continuously on z₁, z₂,…,z_p and a random error ε (Johnson and Wichern 2007). The unknown parameters β₀, β₁,…,β_p in Eq. 1 are estimated from the solutions of n experiments, or y responses. The model is then represented by n independent observations on y and their linked values of z, thus:

$$ \left[ {\begin{array}{*{20}c} {y_{1} } \\ {y_{2} } \\ \vdots \\ {y_{n} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 1 & {z_{11} } & {z_{12} } & \cdots & {z_{1p} } \\ 1 & {z_{21} } & {z_{22} } & \cdots & {z_{2p} } \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & {z_{n1} } & {z_{n2} } & \cdots & {z_{np} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\beta_{0} } \\ {\beta_{1} } \\ \vdots \\ {\beta_{p} } \\ \end{array} } \right] + \left[ {\begin{array}{*{20}c} {\varepsilon_{1} } \\ {\varepsilon_{2} } \\ \vdots \\ {\varepsilon_{n} } \\ \end{array} } \right] $$

(2)

The polynomial model permits examination of the set of predictor variables to determine how well they predict a response and which variables are significant predictors of a response. The observed y₁, y₂,…,y_n and the corresponding known values z₁₁,…,z₁p; z₂₁,…,z₂p;…; z_n1,…,z_np are used to calculate the regression coefficients β₀, β₁,…,β_p by least squares estimation. However, since variables have different units, it is challenging to compare coefficients directly because lower coefficients do not necessarily represent less important predictors. A standardized coefficient (std-beta) is needed to compare the coefficients and find the predictors that have more (or less) impact on the response. The coefficients of predictors with negligible influence on the response, or of predictors that are linearly related to other predictors, can be dropped from the selected model (Helland 2000). The z values that represent the interaction of the control parameters generate a high-dimensional surface called the response surface. This response surface can be portrayed as a contour plot. In the absence of interactions, the lines of this contour plot are parallel and straight and correspond to a flat surface. The response surface plot facilitates finding the best setting for the control parameters graphically.
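The least squares fit of Eq. 2 and the standardized coefficients can be sketched as below. This is an illustration under stated assumptions: the data are synthetic and noise-free, the helper names `fit_ols` and `standardized_coefficients` are hypothetical, and the std-beta definition used (slope times the ratio of predictor to response standard deviation) is the common textbook one, which may differ in detail from the statistical software used in the paper.

```python
import numpy as np

def fit_ols(Z, y):
    """Least squares estimate of [b0, b1, ..., bp] for y = b0 + Z @ b + error."""
    X = np.column_stack([np.ones(len(Z)), Z])  # design matrix of Eq. 2
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def standardized_coefficients(Z, y, beta):
    """Scale each slope by sd(z_j) / sd(y) so predictors in different units compare."""
    return beta[1:] * Z.std(axis=0, ddof=1) / y.std(ddof=1)

# Hypothetical data: two factors on very different scales plus their interaction.
rng = np.random.default_rng(0)
z1 = rng.uniform(10, 40, size=30)
z2 = rng.uniform(0.1, 0.4, size=30)
Z = np.column_stack([z1, z2, z1 * z2])
y = 5.0 + 2.0 * z1 + 100.0 * z2 + 3.0 * z1 * z2

beta = fit_ols(Z, y)
print(np.round(beta, 6))                          # recovers the generating coefficients
print(standardized_coefficients(Z, y, beta))      # unit-free importance measure
```

The raw slope for the second factor is fifty times that of the first only because its range is a hundred times narrower; the standardized values remove that scale effect.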

The regression model fits the set of observations with values derived from metaheuristic optimization. Once the best-fit regression model $ \hat{y}_{r} \left( z \right) $ is found for response r, the values that provide the best response in the response surface are found by maximizing a desirability function $ d_{r} \left( {\hat{y}_{r} (z)} \right) $ (Eqs. 3 and 4) using a gradient descent algorithm. In addition, the procedure gives the option to have R responses r = 1,…,R given the same independent variables. Maximizing the desirability function provides the values of the independent variables that generate the maximum or minimum response given the regression model. This function requires three parameters to calculate desirability values: the upper limit U_r and the lower limit L_r per response r, and the exponent s (weight), which defines how important it is to reach U_r in response maximization (Eq. 3) and to reach L_r in response minimization (Eq. 4).

$$ d_{r} \left( {\hat{y}_{r} (z)} \right) = \left\{ {\begin{array}{ll} 0 & {\text{if }}\hat{y}_{r} (z) < L_{r} \\ \left( {\frac{{\hat{y}_{r} (z) - L_{r} }}{{U_{r} - L_{r} }}} \right)^{s} & {\text{if }}L_{r} \le \hat{y}_{r} (z) \le U_{r} \\ 1 & {\text{if }}\hat{y}_{r} (z) > U_{r} \\ \end{array} } \right.\quad \forall r = 1, \ldots ,R $$

(3)

$$ d_{r} \left( {\hat{y}_{r} (z)} \right) = \left\{ {\begin{array}{ll} 1 & {\text{if }}\hat{y}_{r} (z) < L_{r} \\ \left( {\frac{{\hat{y}_{r} (z) - U_{r} }}{{L_{r} - U_{r} }}} \right)^{s} & {\text{if }}L_{r} \le \hat{y}_{r} (z) \le U_{r} \\ 0 & {\text{if }}\hat{y}_{r} (z) > U_{r} \\ \end{array} } \right.\quad \forall r = 1, \ldots ,R $$

(4)

The weight is s = 1 when the desirability function is linear, s > 1 when the points close to U_r or L_r need to be treated with high importance, and s < 1 when they need to be treated with lower importance (Bezerra et al. 2008). U_r must denote the larger desired value in maximization (Eq. 3) and L_r the lower desired value in minimization (Eq. 4). The desirability function, $ d_{r} \left( {\hat{y}_{r} (z)} \right) $, is used to maximize (and/or minimize) R responses simultaneously (Derringer and Suich 1980) through the gradient ascent (or descent) iterative optimization algorithm, which finds their maximum (and/or minimum) values. The search starts at any location in the function; at each iteration, a small step proportional to the positive (or negative) gradient vector is taken to approach a local maximum (or minimum) of the function. To move along (or against) the gradient, this step is added to (or subtracted from) the current location, which is updated at every iteration. The goal is to find a sequence of updates that converges to a local maximum (or minimum), which is also global if the function is concave (or convex) (Snyman 2005). An overall desirability D combines the individual desirabilities $ d_{r} \left( {\hat{y}_{r} (z)} \right) $ ∀r = 1,…,R using a geometric mean (Eq. 5). Desirability ranges from 0 to 1, where 0 corresponds to undesirable responses and 1 corresponds to the most desirable response, which reaches U_r and/or L_r.

$$ D = \left( {d_{1} \left( {\hat{y}_{1} (z)} \right) \times d_{2} \left( {\hat{y}_{2} (z)} \right) \ldots \times d_{R} \left( {\hat{y}_{R} (z)} \right)} \right)^{{\frac{1}{R}}} $$

(5)
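Equations 3 to 5 translate directly into a few lines of code. The sketch below is only an illustration: the function names are hypothetical and the numeric limits in the usage lines are made-up values, not data from the case study.

```python
import math

def desirability_max(y_hat, lower, upper, s=1.0):
    """Eq. 3: desirability when a larger response is better."""
    if y_hat < lower:
        return 0.0
    if y_hat > upper:
        return 1.0
    return ((y_hat - lower) / (upper - lower)) ** s

def desirability_min(y_hat, lower, upper, s=1.0):
    """Eq. 4: desirability when a smaller response is better."""
    if y_hat < lower:
        return 1.0
    if y_hat > upper:
        return 0.0
    return ((y_hat - upper) / (lower - upper)) ** s

def overall_desirability(values):
    """Eq. 5: geometric mean of the R individual desirabilities."""
    return math.prod(values) ** (1.0 / len(values))

# Hypothetical two-response example: maximize one response, minimize another.
d1 = desirability_max(15.0, lower=10.0, upper=20.0)  # halfway to the upper limit
d2 = desirability_min(30.0, lower=10.0, upper=50.0)  # halfway to the lower limit
print(d1, d2, overall_desirability([d1, d2]))
```

Because D is a geometric mean, a single response with zero desirability drives the overall desirability to zero, regardless of how well the other responses perform.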

The methodology described here is used to calibrate GA parameters, but can be used with any metaheuristic algorithm. Evolutionary algorithms like GA have been used in mine planning; however, these GA-based methodologies did not cover in detail the importance of defining their GA control parameters, such as population size, crossover rate, and mutation rate (Kumral 2004; Ruiseco et al. 2016; Sauvageau and Kumral 2016; Verhoeff 2017; Villalba and Kumral 2018a).

The population size (P) affects GA performance: a small P will have limited access to the search space, resulting in poor performance. A large P with more access to the search space prevents premature convergence to suboptimal solutions; however, a low convergence rate results since a large P requires more evaluations per generation.

The crossover rate (C) influences the number of times crossover is used, where C × P solutions or (C × solution space) locations in the solutions undergo crossover. Crossover exploits the current solutions, which leads the population to converge on a good solution that may be the global optimum. The higher the C, the more quickly new solutions enter the population, but too high a C carries the risk of discarding high-performance solutions before the selection process produces any improvement (Grefenstette 1986). Crossover rates that are too low will not permit enough new solutions; the population will barely change, resulting in a lower exploration rate.

The mutation rate (M) assists in increasing variability of a population as a secondary search operator. After the crossover operation, the block position of new population solutions undergoes random changes, which increase the probability of exploring the search space. Mutation rates that are too low lead to premature convergence to local optima instead of the global optimum. High mutation rates lead to random searches, which diminishes the GA search ability and prevents converging on the optimum solution (Reeves 2003).

The control parameters in GA can vary among mining deposits: the optimal control parameters at deposit A will differ from deposit B because mining deposits are complex and their grade spatial variability influences their mine planning optimization setting parameters. The control parameters P, C and M can lead to profit variation of ± 11% (Villalba and Kumral 2018a) and affect the performance and efficiency of GA. Thus, mine planning optimization requires different control parameters values. The methodology herein balances exploitation and exploration ability by finding the optimal control parameters. The case study demonstrates the calibration of these parameters by maximizing the desirability function of approaching the desired profit by using a gradient descent algorithm. Three regression models obtained from three experiments assist in defining their respective desirability function.

Case Study

To illustrate the merits of the proposed approach, the control parameters of GA-based stope optimization are calibrated. The optimization determines the portion of the orebody to be mined so as to maximize the profit of an underground mining operation (Villalba and Kumral 2017). Input data are from a sector of a narrow vein gold deposit occupying a volume of 140 m in the east, 188 m in the north and 150 m in the vertical direction. A total of 109 composites of gold data and a three-dimensional geological solid were used to calculate vein grades, with gold values ranging from 0.017 to 28.26 g/t. These are the case study data and the stope layout algorithm of Villalba and Kumral (2018b). Multiple realizations, or equally probable orebody scenarios, simulated using sequential Gaussian simulation (Deutsch and Journel 1998; Shi et al. 2000), were scaled into the blocks to provide input to the stochastic stope layout optimizer. To account for orebody uncertainty and find the most profitable mining direction of stopes following the orebody directions, the GA-based stope layout was performed in three stages: (1) measure stope layout uncertainty based on equally probable orebody realizations; (2) create an average design whereby feasibility evaluation breeds an initial population; and (3) use GA operators to improve this initial population over generations. The proposed methodology mainly affects the third stage, where GA is used. The control parameters of the GA were calibrated using experimental design tools in JMP statistical software (v. 13.0.0), in which desirability is maximized using a gradient descent algorithm and prediction profiler tools.

The FFD, Box–Behnken, and Box–Wilson designs define the stope layout profit as a single response variable. Of the available desirability goals (maximization, minimization and target), maximization of the profit (Eq. 3) is used, with profit expressed in thousands of US dollars (K). L₁ = US$18,500 K is the lower limit, U₁ = US$22,000 K is the upper limit and the importance is s = 1. Desirability is 0 when a design provides profits < US$18,500 K, and is 1 when a design has profit > US$22,000 K. Each design evaluation (Table 1) considers P from 10 to 40, C from 0.1 to 0.4 and M from 0.05 to 0.15. Ranges of these control parameters were taken from Villalba and Kumral (2018b). The sensitivity analysis showed that values outside these ranges did not generate higher profit. The lower and upper values of these ranges correspond to the minimum and maximum values, respectively, in the definition of factors in response surface methodology and two-level FFD. In addition, the FFD uses a middle level (P = 25, C = 0.25, M = 0.1) to make a fair comparison with the other two designs, which also include central points. The first, second and third designs require 27, 13 and 15 experiments, respectively. These experiments use three levels, which are the values that P, C, and M can take and are used during coding (the low, middle and high values are replaced by –, 0, +, respectively, in JMP software); e.g., experiment 000 in Table 1 means that P, C, and M take the middle values 25, 0.25, and 0.1, respectively. The computing time was stored only for the Box–Wilson experiment because a further analysis considers both profit and time as responses (Figure 8).
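The coding of runs can be illustrated with a small decoder that maps a three-character pattern to the actual parameter values quoted above. The names `LEVELS`, `SYMBOL_INDEX` and `decode_pattern` are hypothetical, and the ASCII hyphen stands in for the low-level symbol.

```python
# Low, middle and high values of each control parameter, as stated in the text.
LEVELS = {
    "P": (10, 25, 40),
    "C": (0.10, 0.25, 0.40),
    "M": (0.05, 0.10, 0.15),
}
SYMBOL_INDEX = {"-": 0, "0": 1, "+": 2}

def decode_pattern(pattern):
    """Map a coded run such as '+0-' to actual (P, C, M) values."""
    return tuple(
        LEVELS[name][SYMBOL_INDEX[symbol]]
        for name, symbol in zip("PCM", pattern)
    )

print(decode_pattern("000"))  # (25, 0.25, 0.1), the center run
print(decode_pattern("+-0"))  # (40, 0.1, 0.1)
```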

Table 1 Population size (P), crossover rate (C), mutation rate (M), profit and solution time for full factorial design (FFD), Box–Behnken and Box–Wilson experimental designs


Sensitivity analysis considers the effect of one parameter at each iteration and ignores the interdependencies among the parameters. Therefore, computational time increases due to a large number of tests, unrealistic scenarios may be unnecessarily examined, and less information is gathered. Experimental design, on the other hand, requires fewer experiments to describe the relationship between the factors and the response by a polynomial model, which assists in finding optimal control parameters while maximizing profit. To calibrate parameters in this case study, sensitivity analysis required 60 experiments, FFD required 45% of the sensitivity analysis experiments, and Box–Wilson and Box–Behnken required 44–52% of the FFD experiments (Table 1). Thus, the proposed approach based on experimental design simplifies calibration and performs better than sensitivity analysis.

The initial attempt at model selection considers only the first-order model, which gives the basic model representation and helps to understand the initial contribution on the profit of P, C, and M independently before accounting for their interactions. As expected, the fit of this preliminary RSM without cross product factors is low: R² = 0.35, 0.42 and 0.17 for FFD, Box–Behnken and Box–Wilson designs, respectively. However, if standardized coefficients (std-beta) are compared, P has the strongest influence in the FFD, Box–Behnken and Box–Wilson designs (std-beta = 0.35, 0.46 and 0.39, respectively) followed by C (std-beta = 0.25, 0.43 and 0.12, respectively) and M (std-beta = 0.09, 0.16 and − 0.07, respectively). The variance inflation factor (VIF) across the three designs varies between 1 and 1.3, where VIF < 10 means there is no multicollinearity between the factors or the predictor variables. Ideally, the predictor variables of regression models are weakly correlated with each other but highly correlated with the dependent variable.
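The VIF check can be sketched as follows, using the textbook definition VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the remaining predictors. The function name is hypothetical, and the design used here is a generic coded three-level full factorial rather than the case study data.

```python
import numpy as np
from itertools import product

def variance_inflation_factors(Z):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the other columns."""
    Z = np.asarray(Z, dtype=float)
    vifs = []
    for j in range(Z.shape[1]):
        others = np.delete(Z, j, axis=1)
        X = np.column_stack([np.ones(len(Z)), others])
        coef, *_ = np.linalg.lstsq(X, Z[:, j], rcond=None)
        resid = Z[:, j] - X @ coef
        r2 = 1.0 - resid.var() / Z[:, j].var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Main-effect columns of a coded 3^3 full factorial are mutually orthogonal.
design = np.array(list(product((-1, 0, 1), repeat=3)))
print(variance_inflation_factors(design))  # approximately [1. 1. 1.]
```

A balanced factorial design yields VIF values of 1 by construction; values slightly above 1, as reported for the three designs, are typical once runs or model terms depart from perfect orthogonality.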

The lack of fit of the preliminary model suggests that more terms are needed. Terms may be added in the following order: (1) first-order, (2) cross product (interaction), (3) second-order (quadratic) and (4) higher-order terms. Some scenarios will require transforming either the responses or the factors (Melvin 2000). The three preliminary first-order models were expanded to cross products in FFD (R² = 0.56) and to second order in RSM for both the Box–Behnken (R² = 0.88) and Box–Wilson (R² = 0.74) designs. Adjusted R² values increased more modestly (from 0.23 to 0.32 in FFD, from 0.23 to 0.37 in Box–Behnken and from – 0.03 to 0.34 in Box–Wilson) because the adjusted R² accounts for the number of predictors in the model. R² is an intuitive measure for model selection that indicates how well a model fits a set of observations; however, residual plots, knowledge about the nature of the problem, and other model statistics also assist in evaluating the fit. Since a low R² value can still indicate a valid relationship between predictors and response, the low R² values obtained are tolerated in the calibration of GA control parameters because the proposed approach focuses on modeling the relationship between factors rather than on profit prediction itself. Once the GA control parameters are defined, the profit is maximized again using metaheuristic stope layout optimization.
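For reference, the usual textbook definition of the adjusted R² is given below. The paper does not print this formula, so it is stated here as an assumption about what the statistical software computes, with n the number of experiments and p the number of predictor terms excluding the intercept:

$$ R_{\text{adj}}^{2} = 1 - \left( {1 - R^{2} } \right)\frac{n - 1}{n - p - 1} $$

Under this definition, adding a term raises the adjusted R² only if the gain in R² outweighs the loss of one residual degree of freedom, and the adjusted value can be negative when R² is small relative to p.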

In the three designs, the least squares algorithm was used to fit the multiple regression models, whose intercept terms ranged from US$18,941 K to US$19,643 K. The interactions between independent variables given by FFD, during a screening stage, were modeled by a low-order polynomial model (Eq. 6) to explain the system and explore experimental conditions, whereas RSM required a second-order polynomial to define the nature of the response surface in the optimal region, which assists in finding the optimal setting while maximizing the response (Eqs. 7 and 8).

As observed in the preliminary model and Table 2, P had the strongest effect on the profit (std-beta = 0.35–0.46) and M had the least effect (std-beta = 0.08–0.16) among the designs.

Table 2 Standardized coefficients in regression models for FFD, Box–Behnken and Box–Wilson experimental designs


Results of Full Factorial Design

The regression model for the three-level FFD is shown in Eq. 6. Profit depends on seven terms: the intercept, the three control parameters and their three interaction terms.

$$ \begin{aligned} {\text{Profit}} & = 18941.3 - 10.9P + 644.81C + 680M + 70(P - 25)(C - 0.25) - 55.9(P - 25)(M - 0.1) \\ & \quad + 18022.2(C - 0.25)(M - 0.1) \\ \end{aligned} $$

(6)

As expected, the FFD surface did not perform as well as the RSM designs to fit the case study data (black dots in Fig. 2), but suggests graphically a critical point to be explored that is congruent with the RSM design results. This corresponds to P, C and M of 40, > 0.40 and > 0.10, respectively.

Exploration of the response surface model (two variables at a time) must focus on the vicinity of these critical points, which are graphical approximations and may represent a local or global maximum. However, maximization of the desirability function, that is, obtaining values close to the specified U₁ = US$22,000 K, better facilitates defining the optimum setting for all contributing variables (the GA control parameters). The prediction profiler in Figure 3 shows how the prediction model of the stope profits changes when the variable settings are modified.

The maximum combined desirability obtained by FFD is approximately 0.43 because the profits obtained by the prediction model are primarily < U₁. This desirability is obtained when P = 40, C = 0.40 and M = 0.15; however, the prediction profiler graphs and the response surface suggest exploring regions with higher C and M values.

Results of Box–Behnken Design

Profit depends on 10 terms, which are combinations of the three control parameters (as factors) up to second order. The regression model term related to C in the Box–Behnken design (Eq. 7) yielded a std-beta of 0.43, close to the maximum std-beta (0.46) obtained by P. The interaction of P and C also yielded a high std-beta (0.43); however, the quadratic term of P maintained a high std-beta (− 0.39, i.e., 0.39 in absolute value), whereas the quadratic term of C yielded a very low std-beta (0.09).

$$ {\text{Profit}} = 19,602 + 173.6P + 160.8C + 60.6M + 228P \times C + 107.8P \times M + 37C \times M - 237.4P^{2} - 57.6C^{2} + 75.6M^{2} $$

(7)
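A common way to examine a fitted second-order model is to write it as b0 + bᵀx + xᵀBx and inspect the stationary point and the eigenvalues of B. The sketch below does this for Eq. 7. It assumes that P, C and M in Eq. 7 are expressed in coded units (-1 to +1), which this section does not state explicitly; `profit_bb`, `b` and `B` are illustrative names.

```python
import numpy as np

b0 = 19602.0
b = np.array([173.6, 160.8, 60.6])  # linear terms for P, C, M

# Symmetric matrix: squared terms on the diagonal, half of each
# interaction coefficient off the diagonal.
B = np.array(
    [
        [-237.4, 228.0 / 2, 107.8 / 2],
        [228.0 / 2, -57.6, 37.0 / 2],
        [107.8 / 2, 37.0 / 2, 75.6],
    ]
)

def profit_bb(x):
    """Eq. 7 in matrix form; x = (P, C, M) in assumed coded units."""
    x = np.asarray(x, dtype=float)
    return b0 + b @ x + x @ B @ x

# Stationary point: gradient b + 2 B x = 0.
x_s = -0.5 * np.linalg.solve(B, b)

# Signs of the eigenvalues classify the stationary point.
eigvals = np.linalg.eigvalsh(B)

print("stationary point (coded):", np.round(x_s, 3))
print("eigenvalues of B:", np.round(eigvals, 1))
print("profit at (1, 1, 1):", round(profit_bb([1, 1, 1]), 1))
```

Because the diagonal of `B` holds both negative entries (P², C²) and a positive one (M²), its eigenvalues have mixed signs, so the stationary point is a saddle rather than a maximum. Under the coded-unit assumption, the best setting therefore has to be sought on the boundary of the experimental region.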

Additional second-order terms in the model generated surfaces that fit the data and determined the best possible combinations of variables (Fig. 4).

The desirability value using the Box–Behnken design was higher (0.47) than using FFD because the prediction model generates profits closer to U₁ = US$20,000 K (Fig. 5).

The optimal setting provided by this desirability maximization is P = 40, C = 0.40 and M = 0.15. At a P of approximately 40, profit steadily increases as C increases. However, the profile of the mutation rate does not show convergence at the optimal setting.
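The location of this optimum can be explored with a projected gradient ascent on Eq. 7, sketched below. Several things here are assumptions rather than statements from the paper: the factors in Eq. 7 are taken to be in coded units (-1 to +1), the centres (25, 0.25, 0.10) are read from Eq. 6, the half-ranges (15, 0.15, 0.05) are hypothetical, and the optimizer is a generic one, not necessarily the routine used by the authors' software. While the predicted profit stays below the target, desirability increases with profit, so maximizing profit yields the same setting.

```python
import numpy as np

b = np.array([173.6, 160.8, 60.6])
B = np.array(
    [
        [-237.4, 114.0, 53.9],
        [114.0, -57.6, 18.5],
        [53.9, 18.5, 75.6],
    ]
)

def profit_bb(x):
    """Eq. 7 with x = (P, C, M) in assumed coded units."""
    return 19602.0 + b @ x + x @ B @ x

def gradient(x):
    """Analytic gradient of Eq. 7."""
    return b + 2.0 * B @ x

def projected_gradient_ascent(x0, step=1e-3, iters=2000):
    """Gradient ascent with clipping to the coded cube [-1, 1]^3."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = np.clip(x + step * gradient(x), -1.0, 1.0)
    return x

x_opt = projected_gradient_ascent([0.0, 0.0, 0.0])

# Decode to natural units (half-ranges are hypothetical).
centers = np.array([25.0, 0.25, 0.10])
half_ranges = np.array([15.0, 0.15, 0.05])
natural = centers + half_ranges * x_opt

print("coded optimum:", x_opt)
print("natural units (P, C, M):", natural)
print("predicted profit (US$ K):", round(profit_bb(x_opt), 1))
```

Started from the centre of the design, the iteration ends at the upper corner of the coded cube, which decodes to P = 40, C = 0.40 and M = 0.15 under the assumed ranges.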

Results of Box–Wilson Design

The regression model (Eq. 8) for the Box–Wilson design and the influence of the parameters on the response follow the findings for the previous designs, with a high std-beta (0.40) obtained for P; however, interactions of C and M obtained higher std-beta values (0.53 and, in absolute value, 0.32, respectively). Thus, C and M have a synergistic effect on the response.

$$ {\text{Profit}} = 19,656 + 155.6P + 45.5C - 26.4M + 122.1P \times C - 67.9P \times M + 233.1C \times M - 59P^{2} - 60.5C^{2} + 215M^{2} $$

(8)

Similar to the Box–Behnken design, the Box–Wilson design (Fig. 6) suggested a critical point associated with potential optimum GA control parameters that reach the maximum profit in the response surface (P, C, and M of approximately 40, > 0.4 and > 0.10, respectively). This location is also related to the critical point found by FFD.

The prediction profiler of the mutation rate is improved by the Box–Wilson design (Fig. 7), but the maximized desirability value (0.40) is lower than in the two previous designs. The prediction profiler found optimal settings for two parameters (P and C) similar to those of the two previous designs; P, C and M reached their best profit at 40, 0.40 and 0.116, respectively.

In summary, the surface profiler and the contours of pairs of factors against the profit response are shown for each design. Their directions of movement relative to the fitted surface lead toward the optimum response, and the directions with the steepest slopes move most rapidly toward maximization (Melvin 2000). The population size (P = 40) and the crossover rate (C = 0.40) are defined and ratified by the three designs, but the mutation rate needs to be confirmed within the reduced interval identified by the prediction profiler and desirability maximization in the Box–Wilson design. Thus, the mutation rate was tested at 0.10, 0.116 and 0.12 using GA optimization, and the profits obtained were US$19,927 K, US$19,887 K and US$18,810 K, respectively. The optimal setting from the first two designs, with M = 0.15 and a better desirability value (i.e., a higher profit predicted by the regression model), was also tested using GA optimization. This last test provided a profit of US$19,675 K, lower than the best profit obtained with M from the reduced interval. Thus, the parameters were calibrated as follows: P = 40, C = 0.40 and M = 0.10. This decision was assisted by maximizing the desirability value and verifying the prediction profiler. Since the response values of the experiments correspond to metaheuristic solutions, a critical point or optimal setting here may represent a value close to the global optimum in stope layout optimization.
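The confirmation step amounts to comparing the GA profits obtained at the candidate mutation rates. A minimal sketch of that comparison, using only the four profits reported above (with P = 40 and C = 0.40 held fixed), is given below; the variable names are illustrative.

```python
# GA profits (US$ K) reported for each tested mutation rate,
# with P = 40 and C = 0.40 held fixed.
confirmation_runs = {0.10: 19927, 0.116: 19887, 0.12: 18810, 0.15: 19675}

# Mutation rate with the highest confirmed profit.
best_m = max(confirmation_runs, key=confirmation_runs.get)
best_profit = confirmation_runs[best_m]

# Relative shortfall of every other candidate, in percent.
shortfall_pct = {
    m: 100.0 * (best_profit - profit) / best_profit
    for m, profit in confirmation_runs.items()
}

print(f"best M = {best_m} with profit {best_profit} K")
for m, pct in shortfall_pct.items():
    print(f"M = {m}: {pct:.2f}% below the best run")
```

The comparison selects M = 0.10. The run at M = 0.116 falls about 0.2% short, the run at M = 0.15 about 1.3% and the run at M = 0.12 about 5.6%.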

GA Solution Time as a Second Response

Moreover, the case study considered the GA solution time for a specified number of generations as a second response variable in the analysis. This post-optimization value can assist in calibrating the parameters and balancing exploration and exploitation in GA because reasonable convergence time is relevant in decision-making. The goal is to find the minimum solution time while simultaneously maximizing the profit. The desirability of this second response (r = 2) is set as minimization of the solution time where L₂ = 1 h as the lower limit, U₂ = 30 h as the upper limit and importance s = 1. Desirability is 0 when solution time > U₂ and is 1 when solution time is < L₂.
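The two-response setup can be sketched with one-sided desirability functions in the style of Derringer and Suich (1980). The code below assumes the usual linear ramp between the limits (exponent s) and a geometric mean for the combined value; the profit lower limit of US$18,000 K and the 10 h solution time are hypothetical inputs used only to exercise the functions.

```python
import math

def desirability_min(y, lower, upper, s=1.0):
    """Smaller-is-better: 1 at or below `lower`, 0 at or above `upper`."""
    if y <= lower:
        return 1.0
    if y >= upper:
        return 0.0
    return ((upper - y) / (upper - lower)) ** s

def desirability_max(y, lower, upper, s=1.0):
    """Larger-is-better: 0 at or below `lower`, 1 at or above `upper`."""
    if y <= lower:
        return 0.0
    if y >= upper:
        return 1.0
    return ((y - lower) / (upper - lower)) ** s

def combined_desirability(values):
    """Geometric mean of the individual desirabilities."""
    if any(v == 0.0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Solution time: L2 = 1 h, U2 = 30 h, s = 1 (from the text); 10 h is hypothetical.
d_time = desirability_min(10.0, lower=1.0, upper=30.0)

# Profit: the 18,000 K lower limit is hypothetical; 22,000 K is the stated U1.
d_profit = desirability_max(19927.0, lower=18000.0, upper=22000.0)

D = combined_desirability([d_profit, d_time])
print(f"d_profit = {d_profit:.3f}, d_time = {d_time:.3f}, combined = {D:.3f}")
```

The geometric mean drops to zero as soon as one response is unacceptable, which is why a setting with excellent profit but a solution time above U₂ would be rejected.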

The maximized combined desirability was 0.57, and the prediction profiler shows a maximum profit aligned with a minimum solution time (Fig. 8). As expected, the profit is lower than with a single response where only the profit is maximized; however, evaluating profit and solution time simultaneously confirmed that P = 40 is the optimal setting for this case study. Further, a larger population size does not always imply a longer solution time; rather, crossover and mutation rates steadily increase the solution time, though the effect is less evident for the mutation rate. These parameters are case-study-dependent because the variability within each solution may affect their selection. In other words, the stope layout pattern, which depends on orebody spatial variability, may affect the choice of GA parameters. This analysis of profit and solution time using experimental design may also assist in further decisions on exploring and exploiting the solution space with higher or lower parameter settings.

Conclusions

Genetic algorithms (GA) provide a powerful stochastic search strategy that can be used in various optimization problems, including mine planning. Control parameter selection is either overlooked or based on sensitivity analysis. As a result, the impact of each parameter on the optimization, the relative importance of each parameter and the interactions among parameters remain unknown. In the case study, these parameters could cause profit variation of ± 7% and affect the performance and efficiency of GAs. It should be noted that these parameters are not universal and differ among optimization trials. Therefore, the proposed approach should be repeated for different instances.

In seeking a balance between GA exploitation and the ability to explore the solution space via parameter calibration, a multiple regression model assisted in maximizing a desirability function to obtain profit close to a target value through a gradient ascent/descent algorithm. This defines the optimum setting for all GA control parameters.

The case study was implemented for three classical experimental design techniques. The 3-level full factorial design determined the impact of each parameter on the response variable, as well as the relative importance of parameters and their interactions, whereas the Box–Behnken and Box–Wilson designs, as part of the response surface methodology (RSM), found the parameter configuration that maximizes profit during stope optimization. Population size had the strongest effect on profit (std-beta 0.35–0.61), crossover rate had a moderate effect (std-beta 0.12–0.43) and mutation rate had the weakest effect (std-beta 0.08–0.16) across the designs. These three designs produced $ \hat{y}_{r} \left( z \right) $ regression models, and their surface and prediction profilers illustrated the location of critical points or potential optimal parameters; however, the maximization of the $ d_{r} \left( {\hat{y}_{r} (z)} \right) $ desirability function defines the optimum setting for all the parameters simultaneously. Thus, the desirability function to obtain profit close to the desired profit defined the population size (P = 40) and crossover rate (C = 0.4), but the mutation rate (M = 0.10) required an additional evaluation within a reduced interval (0.10–0.12) identified by the prediction profiler in RSM, narrowed from the original interval (0.05–0.15). These parameters were tested in the stope layout metaheuristic optimization and ratified as optimal because they provided the highest profit. Since the response values correspond to metaheuristic solutions, the optimal setting of parameters may be linked to the global optimum in the stope layout optimization.

Furthermore, minimization of solution time was considered as a second response in the analysis. As expected, the profit was lower than with a single response where only the profit is maximized; however, evaluating profit and solution time simultaneously confirmed that (1) population size (P = 40) is still the optimal setting for this case study, (2) a larger population size does not always imply a longer solution time, (3) the solution time increases steadily with crossover and mutation rates, (4) simultaneous profit and solution time analysis may assist in further exploration and exploitation decisions with higher or lower parameter settings, and (5) the trade-off between acceptable computing time and profit desirability is illustrated. As the case study showed, experimental design can be used to determine the values of GA operators.

References

Bezerra, M. A., Santelli, R. E., Oliveira, E. P., Villar, L. S., & Escaleira, L. A. (2008). Response surface methodology (RSM) as a tool for optimization in analytical chemistry. Talanta, 76(5), 965–977. https://doi.org/10.1016/j.talanta.2008.05.019.
Box, G. E., & Wilson, K. B. (1951). On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society (Series B), 13, 1–45.
Clifton Young, J. (1996). Blocking, replication, and randomization—The key to effective experimentation: A case study. Quality Engineering, 9(2), 269–277.
Derringer, G., & Suich, R. (1980). Simultaneous optimization of several response variables. Journal of Quality Technology, 12(4), 214–219.
Deutsch, C. V., & Journel, A. G. (1998). Geostatistical software library and user’s guide. New York: Oxford University Press.
Eiben, Á. E., Hinterding, R., & Michalewicz, Z. (1999). Parameter control in evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 3(2), 124–141. https://doi.org/10.1109/4235.771166.
Ferreira, S. L., Bruns, R. E., Ferreira, H. S., Matos, G. D., David, J. M., Brandao, G. C., et al. (2007). Box–Behnken design: An alternative for the optimization of analytical methods. Analytica Chimica Acta, 597(2), 179–186. https://doi.org/10.1016/j.aca.2007.07.011.
Goodfellow, R. C., & Dimitrakopoulos, R. (2016). Global optimization of open pit mining complexes with uncertainty. Applied Soft Computing, 40, 292–304.
Grefenstette, J. J. (1986). Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems Man and Cybernetics, 16(1), 122–128. https://doi.org/10.1109/Tsmc.1986.289288.
Helland, I. S. (2000). Model reduction for prediction in regression models. Scandinavian Journal of Statistics, 27(1), 1–20.
Johnson, R. A., & Wichern, D. W. (2007). Applied multivariate statistical analysis (6th ed.). New Jersey: Prentice-Hall.
Kumral, M. (2004). Optimal location of a mine facility by genetic algorithms. IMM Transactions, Mining Technology, 113(2), A83–A88. https://doi.org/10.1179/037178404225004940.
Kumral, M., & Dowd, P. (2005). A simulated annealing approach to mine production scheduling. Journal of the Operational Research Society, 56(8), 922–930.
Lamghari, A., & Dimitrakopoulos, R. (2012). A diversified Tabu search approach for the open-pit mine production scheduling problem with metal uncertainty. European Journal of Operational Research, 222(3), 642–652.
Leite, A., & Dimitrakopoulos, R. (2007). Stochastic optimisation model for open pit mine planning: Application and risk analysis at copper deposit. Mining Technology, 116(3), 109–118.
Manchuk, J., & Deutsch, C. V. (2008). Optimizing stope designs and sequences in underground mines. SME Transactions, 324, 67–75.
Melvin, T. (2000). Response surface optimization using JMP Software. Baltimore: Qualistics.
Mitchell, M. (1999). An introduction to genetic algorithms. Cambridge: Massachusetts Institute of Technology.
Montgomery, D. C. (1997). Design and analysis of experiments. New York: Wiley.
Nannen, V., & Eiben, A. E. (2007). Relevance estimation and value calibration of evolutionary algorithm parameters. Paper presented at the 20th international joint conference on artificial intelligence, Hyderabad, India.
Osman, I. H., & Laporte, G. (1996). Metaheuristics: A bibliography. New York: Springer.
Pandey, H. M., Chaudhary, A., & Mehrotra, D. (2014). A comparative review of approaches to prevent premature convergence in GA. Applied Soft Computing, 24, 1047–1077. https://doi.org/10.1016/j.asoc.2014.08.025.
Rayward-Smith, V. J. (1996). Modern heuristic techniques. In V. J. Rayward-Smith, I. H. Osman, C. R. Reeves, & G. D. Smith (Eds.), Modern heuristic search methods (pp. 1–25). New York: Wiley.
Reeves, C. (2003). Genetic algorithms. Handbook of metaheuristics (pp. 55–82). New York: Kluwer Academic.
Ruiseco, J. R., & Kumral, M. (2017). A practical approach to mine equipment sizing in relation to dig-limit optimization in complex orebodies: Multi-rock type, multi-process, and multi-metal case. Natural Resources Research, 26(1), 23–35.
Ruiseco, J. R., Williams, J., & Kumral, M. (2016). Optimizing ore–waste dig-limits as part of operational mine planning through genetic algorithms. Natural Resources Research, 25(4), 473–485.
Sauvageau, M., & Kumral, M. (2016). Genetic algorithms for the optimisation of the Schwartz-Smith two-factor model: A case study on a copper deposit. International Journal of Mining, Reclamation and Environment, 32, 1–19.
Shi, B., Bloom, L., & Mueller, U. (2000). Applications of conditional simulation to a positively skewed platinum mineralization. Natural Resources Research, 9(1), 53–64.
Shishvan, M. S., & Sattarvand, J. (2015). Long term production planning of open pit mines by ant colony optimization. European Journal of Operational Research, 240(3), 825–836.
Snyman, J. (2005). Practical mathematical optimization: An introduction to basic optimization theory and classical and new gradient-based algorithms (Vol. 97). New York: Springer.
Telford, J. K. (2007). A brief introduction to design of experiments. Johns Hopkins APL Technical Digest, 27(3), 224–232.
Verhoeff, R. L. A. (2017). Using genetic algorithms for underground stope design optimization in mining. A stochastic analysis. M.Sc. thesis, Delft University of Technology.
Villalba, M. E., & Kumral, M. (2017). Heuristic stope layout optimization accounting for variable stope dimensions and dilution management. International Journal of Mining and Mineral Engineering, 8(1), 1–18. https://doi.org/10.1504/IJMME.2017.082680.
Villalba, M. E., & Kumral, M. (2018a). Underground mine planning: Stope layout optimization under uncertainty using genetic algorithms. International Journal of Mining, Reclamation and Environment (in press). https://doi.org/10.1080/17480930.2018.1486692.
Villalba, M. E., & Kumral, M. (2018b). A value adding approach to hard-rock underground mining operations: Balancing orebody orientation and mining direction (under submission).

Acknowledgments

This research was conducted with financial support from the Natural Sciences and Engineering Research Council of Canada (NSERC Fund # 242984), and we thank NSERC for this support.

Author information

Authors and Affiliations

Department of Mining and Materials Engineering, McGill University, 3450 University Street, Montreal, QC, H3A 0E8, Canada
Martha E. Villalba Matamoros & Mustafa Kumral

Corresponding author

Correspondence to Mustafa Kumral.

About this article

Cite this article

Villalba Matamoros, M.E., Kumral, M. Calibration of Genetic Algorithm Parameters for Mining-Related Optimization Problems. Nat Resour Res 28, 443–456 (2019). https://doi.org/10.1007/s11053-018-9395-2

Received: 10 May 2018
Accepted: 23 July 2018
Published: 30 July 2018
Issue Date: 01 April 2019
DOI: https://doi.org/10.1007/s11053-018-9395-2
