
1 Introduction

In the real world, there exist many dynamic multi-objective optimization problems (DMOPs), which usually involve several conflicting objective functions and may be subject to a number of constraints. Furthermore, the objective functions, constraints, and/or related parameters of DMOPs may change over time [1]. In this work, we consider continuous DMOPs of the following form:

$$\begin{aligned} {\left\{ \begin{array}{ll} \min \limits _{\varvec{v} \in \varOmega }F(\varvec{v},t)=\big (f_1(\varvec{v},t),f_2(\varvec{v},t), ..., f_m(\varvec{v},t) \big ) ^ \mathrm {T}\\ s.t. \quad h_i(\varvec{v},t)\le 0, i=1,2, ..., p; \quad l_j(\varvec{v},t)=0, j=1,2, ..., r \end{array}\right. } \end{aligned}$$
(1)

where \(\varvec{v}\) is the decision vector defined in the decision space \(\varOmega \), \(F({{\varvec{v}}},t)\) is the objective vector that assigns a numerical value to each objective of the solution \(\varvec{v}\) at time t, and m is the number of objectives. The functions \(h_i\) and \(l_j\) are the inequality and equality constraints, respectively, both of which may change over t. As a result, the Pareto optimal solution set (POS) in the decision space and/or the Pareto optimal front (POF) in the objective space of a DMOP may also change over time, which poses a significant challenge to evolutionary multi-objective optimization (EMO) research.

Over the past decade or so, there has been increasing interest in designing dynamic evolutionary multi-objective optimization (DEMO) algorithms to handle DMOPs. The task of a DEMO algorithm is to track the movement of the POS and/or POF at reasonable computational cost. How to respond to a new change is the key issue in DEMO, since an effective change reaction method can help the algorithm adapt to the new environment quickly. Many change reaction methods have been proposed, including introducing diversity after a change occurs [2, 3] or maintaining diversity throughout the run [4], multi-population approaches [5], memory schemes [6, 7], and prediction strategies [8,9,10,11,12,13,14].

In recent years, prediction strategies have gained much attention. These methods predict the POS of the new environment in advance by exploiting various machine learning techniques. According to the amount of historical information used for prediction, prediction strategies can be roughly categorized into two types: local information-based prediction (LIP) and time series-based prediction (TSP). LIP approaches usually use only information obtained from a few recent environments to make predictions, so the historical data is not fully exploited. Some representative LIP methods are the simple linear model [9], the Kalman filter [11], the differential model [12], and transfer learning [13]. In contrast, TSP approaches make use of much more historical environmental information than LIP approaches. TSP approaches first collect approximate optimal solutions from past environments as training data, and then feed them to a machine learning model to predict the location of the new optimal solutions. Therefore, TSP approaches generally have higher training costs than LIP approaches. The autoregressive model [8, 10] and support vector regression [14] are well-known models used in TSP approaches.

The Motivation of This Work. Dealing with a dynamic and uncertain environment is not a challenge unique to evolutionary optimization. There is also active research within the online machine learning community that tackles similar challenges arising from changing data streams [15]. Online learning updates the prediction model for future data at each time step, as opposed to batch learning techniques, which generate the best predictor by learning on the entire training data set at once [16]. Online learning is also used in situations where the algorithm must dynamically adapt to new patterns in the data [17]. In addition, online learning algorithms are typically easy to implement and generally have low computational cost [18], which is particularly suitable for dynamic optimization. Inspired by these advantages, we introduce online learning into DEMO in order to help dynamic optimization algorithms better adapt to changing environments.

The Contribution of This Work. We first introduce Passive-Aggressive Regression (PAR, a common online learning technique) into DEMO, and then propose a PAR-based prediction strategy (PARS) to react to new changes in DMOPs. Furthermore, we incorporate this prediction strategy into MOEA/D-DE [19] to deal with DMOPs. Finally, in order to fairly evaluate the performance of the proposed prediction strategy, it is compared with three state-of-the-art prediction methods under the same dynamic MOEA/D-DE framework.

The rest of this paper is organized as follows. Section 2 provides a brief survey on the prediction strategies for DEMO, and introduces the basic knowledge of PAR. In Sect. 3, PARS is proposed, and then it is incorporated into MOEA/D-DE to handle DMOPs. In the following section, some experimental results are reported to show the effectiveness of our proposed strategy. The final section concludes this paper.

2 Related Works

2.1 Prediction Strategies for DEMO Algorithm

Local Information-Based Prediction Strategies: Zhou et al. [9] proposed a prediction strategy (PRE) in 2007. In PRE, a simple linear prediction model with Gaussian noise is used to reinitialize the population in the new environment, utilizing only the optimal individuals of the last two environments. The idea of this simple linear prediction model was adopted by other algorithms that focus the predicted optimal individuals on special points [7, 20] or along multiple directions [21]. Instead of making predictions based on data from the previous two environments, Cao et al. [12] proposed a differential model that predicts the movement of the centroid from its locations in the three previous environments. Muruganantham et al. [11] proposed a Kalman filter (KF) prediction model to guide the search toward the new POS; this KF model was essentially implemented as a first- or second-order linear prediction model [14]. Recently, Jiang et al. [13] used only the approximate optimal solutions of two sequential environments to construct a prediction model by adopting transfer learning.
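As an illustration, the simple linear model of PRE can be sketched in a few lines: each individual is moved along its step between the last two environments and perturbed with Gaussian noise. The function name and the noise level below are our own illustrative choices, not those of [9].

```python
import numpy as np

def pre_predict(x_t, x_prev, noise_std=0.1, rng=None):
    """Sketch of a PRE-style linear predictor: shift each individual by its
    movement over the last two environments, plus Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    step = x_t - x_prev  # movement between the previous two environments
    return x_t + step + rng.normal(0.0, noise_std, size=np.shape(x_t))
```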

Time Series-Based Prediction Strategies: Zhou et al. [10] proposed a population-based prediction strategy (PPS). The movement of the population centers is learned by a univariate autoregressive model; the predicted center point and an estimated manifold are then used to generate a new population for the next environment. Cao et al. [14] proposed a support vector regression (SVR) predictor to better solve DMOPs with nonlinear correlations. Whenever a new change takes place, the historical approximate optimal solutions in the time series are first used to train a group of SVR models, and the POS of the next environment is then predicted by the trained SVR models.
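The autoregressive idea behind PPS can be sketched as follows for a single coordinate of the center point. This is a minimal illustration, assuming a least-squares fit of the AR coefficients; the helper name, the model order, and the fitting method are our own choices and need not match the implementation in [10].

```python
import numpy as np

def ar_predict(series, order=3):
    """Fit a univariate AR(order) model by least squares and predict the
    next value of the series (a sketch of TSP-style center prediction)."""
    s = np.asarray(series, dtype=float)
    # Design matrix: each row holds `order` consecutive past values.
    X = np.array([s[i:i + order] for i in range(len(s) - order)])
    y = s[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(s[-order:] @ coef)
```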

2.2 Online Passive-Aggressive Regression

In 2006, Crammer et al. [18] proposed Passive-Aggressive Regression (PAR), which has become a common online machine learning method. To deal with noisy samples, two PAR variants (i.e., PAR-I and PAR-II) were also proposed in [18]. In this paper, we focus on PAR-II for simplicity. On each time step, the PAR-II algorithm receives an instance \({{\varvec{x}}}^{t} \in R^{n}\) and predicts a target value \({\hat{y}^t} \in R\) using its regression function, that is, \({\hat{y}^t} = {{{\varvec{w}}}^t}\cdot {{{\varvec{x}}}^t}\), where \({{\varvec{w}}}^{t}\) is the incrementally learned weight vector. After prediction, the algorithm obtains the true target value \(y^{t}\) and suffers an instantaneous loss. PAR-II uses the \(\varepsilon \)-insensitive hinge loss function:

$$\begin{aligned} \ell _\varepsilon (\varvec{w}^{t};(\varvec{x}^{t},y^{t}))={\left\{ \begin{array}{ll} 0, &{} |\varvec{w}^{t} \cdot \varvec{x}^{t}-y^{t}|\le \varepsilon \\ |\varvec{w}^{t} \cdot \varvec{x}^{t} -y^{t}|-\varepsilon , &{} otherwise \end{array}\right. } \end{aligned}$$
(2)

where \(\varepsilon \) is a positive parameter that controls the sensitivity to prediction errors. At the end of each time step, the algorithm uses \({{\varvec{w}}}^{t}\) and the example \(({{\varvec{x}}}^{t}, y^{t})\) to generate a new weight vector \({{\varvec{w}}}^{t+1}\), which is then used to make the prediction on the next time step. The weight vector \({{\varvec{w}}}^{1}\) is initialized to \((0, \ldots , 0)\). On each time step, the PAR-II algorithm computes the new weight vector as

$$\begin{aligned} \varvec{w}^{t+1}=\arg \min \limits _{\varvec{w} \in R^n } \frac{1}{2} ||\varvec{w}-\varvec{w}^t||^2+C{\xi }^{2} \quad s.t. \quad \ell _\varepsilon \big (\varvec{w};(\varvec{x}^t,y^t)\big )\le \xi \end{aligned}$$
(3)

where \(\xi \) is a non-negative slack variable and C is a positive parameter which controls the influence of the slack term on the objective function. Specifically, larger values of C imply a more aggressive update step and therefore C is referred to as the aggressiveness parameter of the algorithm. Using the shorthand \({\ell ^t} = {\ell _\varepsilon }\left( {{{\varvec{w}}}^{t};({{{\varvec{x}}}^t},{y^t})} \right) \), the update given in Eq. (3) has a closed form solution as follows,

$$\begin{aligned} \varvec{w}^{t+1}=\varvec{w}^t+sign(y^t-\hat{y}^t)\tau ^t {{\varvec{x}}}^t \quad where \quad \tau ^t=\frac{\ell ^t}{||{{\varvec{x}}}^t||^2+\frac{1}{2C}} \end{aligned}$$
(4)

This update keeps a balance between the amount of progress made on each time step and the amount of information retained from previous time steps [18]. On the one hand, the update requires \({{\varvec{w}}}^{t+1}\) to predict the current example with sufficiently high accuracy, so progress is made. On the other hand, \({{\varvec{w}}}^{t+1}\) must stay as close to \({{\varvec{w}}}^{t}\) as possible in order to retain the information learned from previous time steps. Given the loss function in Eq. (2) and the update rule in Eq. (4), the pseudo-code of PAR-II is presented in Algorithm 1.

(Algorithm 1: pseudo-code of PAR-II)
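The loss and update rules above can be sketched directly in code. The following is a minimal PAR-II implementation following Eqs. (2)-(4); the class and method names are our own choices.

```python
import numpy as np

class PARII:
    """Minimal online PAR-II regressor: eps-insensitive hinge loss (Eq. (2))
    and the closed-form passive-aggressive update (Eq. (4))."""

    def __init__(self, dim, C=1.0, eps=0.05):
        self.w = np.zeros(dim)  # w^1 = (0, ..., 0)
        self.C, self.eps = C, eps

    def predict(self, x):
        return float(self.w @ x)  # y_hat = w . x

    def update(self, x, y):
        y_hat = self.predict(x)
        loss = max(0.0, abs(y_hat - y) - self.eps)  # Eq. (2)
        if loss > 0.0:  # "aggressive" step only when the loss is positive
            tau = loss / (x @ x + 1.0 / (2.0 * self.C))  # Eq. (4)
            self.w += np.sign(y - y_hat) * tau * x
        return y_hat
```

Note how a larger C shrinks the term \(1/(2C)\) in the denominator of \(\tau^t\) and therefore yields a more aggressive step, as stated above.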
Fig. 1. MOEA/D-PARS algorithm framework

3 Proposed PAR-Based Prediction Strategy

Whenever a new change takes place, a DEMO algorithm needs to take some action in response so as to adapt to the new environment as quickly as possible. In the following, we propose a new prediction strategy that adapts to new changes by using PAR-II. Algorithm 2 presents our PAR-based strategy (PARS) for predicting the POS of the next environment. We assume that there are N individuals in the population and that each individual has d decision variables. Because regression prediction is performed on each decision variable separately, the algorithm maintains a total of \(N*d\) PAR-II regression models. In addition, we assume that the dimension of the weight vector \({{\varvec{w}}}\) of PAR-II is q. First, line 3 generates a time series s from the stored populations. Then, M samples are obtained from s in line 4. To reduce the training time, the size of the training set is limited to at most 20, so M is set to \(min(20, t-q)\). Following Algorithm 1, lines 5 to 15 implement a PAR-II model. Line 17 predicts one component of a Pareto optimal solution at time step t+1 and checks whether it lies outside the boundary of the decision space. Since Algorithm 2 has three nested loops, its overall time complexity is \(O(N*d*M)\).

(Algorithm 2: pseudo-code of PARS)
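The per-variable prediction loop of PARS can be sketched as follows, under the assumptions stated above: window size q, training-set size \(M = min(20, t-q)\), and boundary handling by clipping. The function name, the array layout of the stored populations, and the clipping-based repair are our own illustrative choices.

```python
import numpy as np

def pars_predict(history, lower, upper, q=4, C=1.0, eps=0.05, max_samples=20):
    """Sketch of PARS: one PAR-II model per individual and per decision
    variable. `history` has shape (t, N, d): t stored populations of N
    individuals with d decision variables each."""
    hist = np.asarray(history, dtype=float)
    t, N, d = hist.shape
    M = min(max_samples, t - q)  # training-set size, as in Algorithm 2
    pred = np.empty((N, d))
    for n in range(N):
        for k in range(d):
            s = hist[:, n, k]            # time series for one variable
            w = np.zeros(q)              # w^1 = (0, ..., 0)
            for i in range(t - q - M, t - q):  # the M most recent samples
                x, y = s[i:i + q], s[i + q]
                loss = max(0.0, abs(w @ x - y) - eps)   # Eq. (2)
                if loss > 0.0:
                    tau = loss / (x @ x + 1.0 / (2.0 * C))  # Eq. (4)
                    w += np.sign(y - w @ x) * tau * x
            # Predict the value at time step t+1 and repair boundary violations.
            pred[n, k] = np.clip(w @ s[-q:], lower[k], upper[k])
    return pred
```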

In Fig. 1, we further integrate PARS into MOEA/D-DE to obtain a PARS-assisted dynamic MOEA/D-DE algorithm (termed MOEA/D-PARS), which is mainly composed of an environmental change detection operator, the PARS prediction strategy, and the MOEA/D-DE optimizer [19]. The change detection operator suggested in [2] is adopted to detect whether a new change has occurred: 10% of the individuals in the population are selected randomly as detectors and re-evaluated in every generation. If any detector's objective values differ from its previous ones, we assume a change has taken place. If a new environmental change is detected, PARS is called to respond to it. The population is then evolved for one generation using MOEA/D-DE. Finally, if the stopping condition is not met, the algorithm continues tracking the next dynamic change. Regarding the time complexity of MOEA/D-PARS, we assume the environmental change frequency is \(\tau _{T}\), that is, there are \(\tau _{T}\) generations of evolutionary multi-objective optimization per environmental change. The time complexities of the change detection operator, PARS, and MOEA/D-DE are O(mN), \(O(N*d*M)\), and O(mNT) [19], respectively, where T is the number of weight vectors and m is the number of objective functions. Therefore, the time complexity of MOEA/D-PARS during one environmental change is \(O(\tau _{T}mNT)\).
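The change detection operator described above can be sketched as follows, assuming a user-supplied evaluation function; the function name and the tolerance-based comparison of objective vectors are our own choices.

```python
import numpy as np

def change_detected(population, old_objs, evaluate, ratio=0.1, rng=None):
    """Sketch of the detection operator [2]: re-evaluate a random 10% of the
    population and flag a change if any detector's objectives differ.
    `evaluate` maps a decision vector to its objective vector."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(population)
    detectors = rng.choice(n, size=max(1, int(ratio * n)), replace=False)
    return any(not np.allclose(evaluate(population[i]), old_objs[i])
               for i in detectors)
```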

4 Computational Experiments

4.1 Experimental Settings

Benchmark Problems. Fourteen benchmark problems from the CEC2018 competition on dynamic multi-objective optimization [22] are chosen here to compare the performance of the DEMO algorithms.

Compared Prediction Strategies and DEMO Algorithms. Three well-known prediction strategies, i.e., PRE [9], PPS [10], and KF [11], are chosen here to validate the performance of the proposed PARS. For a fair comparison, we choose MOEA/D-DE as the basic multi-objective optimizer for all of them. In addition to the proposed MOEA/D-PARS algorithm, we integrate the aforementioned PRE, PPS, and KF into MOEA/D-DE, obtaining three other dynamic MOEA/D-DE algorithms, termed MOEA/D-PRE, MOEA/D-PPS, and MOEA/D-KF, respectively. All the algorithms are implemented in Python.

Performance Metrics. Mean Inverted Generational Distance (MIGD) [11, 14] is selected as the performance metric. It provides a quantitative measure of both the proximity and diversity goals of DEMO algorithms; a lower MIGD value implies better optimization performance. In addition, IGD(t) [23] is used to illustrate the evolutionary tracking curves of the compared DEMO algorithms.
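For reference, IGD and its time-averaged variant MIGD can be computed as follows; this is the standard formulation, with helper names of our own choosing.

```python
import numpy as np

def igd(ref_front, approx_front):
    """Inverted generational distance: mean distance from each point of the
    true POF sample to its nearest point in the obtained approximation."""
    ref = np.asarray(ref_front, dtype=float)
    app = np.asarray(approx_front, dtype=float)
    dists = np.linalg.norm(ref[:, None, :] - app[None, :, :], axis=-1)
    return float(dists.min(axis=1).mean())

def migd(ref_fronts, approx_fronts):
    """MIGD: the mean of the IGD values taken over all time steps [11, 14]."""
    return float(np.mean([igd(r, a) for r, a in zip(ref_fronts, approx_fronts)]))
```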

Parameter Setting. The parameter settings of all the DEMO algorithms tested in this paper were almost the same as those of their original papers. Some key parameters in these algorithms were set as follows:

  1) Population size (N) and number of decision variables (d): set to 100 and 10, respectively, for all benchmark problems.

  2) Common control parameters in MOEA/D-DE: CR = 0.5 and F = 0.5 in the differential evolution operator; distribution index \(\eta _{m}\) = 20 and mutation probability \(p_{m}=1/d\) in the polynomial mutation operator. In addition, neighborhood size T = 20 and neighborhood selection probability \(\delta \) = 0.8.

  3) Parameters of the PARS strategy: q = 4, C = 1, \(\varepsilon \) = 0.05.

  4) Number of runs and stopping condition: each tested algorithm was executed for 20 runs on each test instance, and the experimental results were recorded to obtain statistical information. Each run stopped after a pre-specified 100 environmental changes, and during each environmental change every algorithm used the same number of function evaluations.

Table 1. Mean and standard deviation values of MIGD obtained by four algorithms on the DF1–DF7 problems
Table 2. Mean and standard deviation values of MIGD obtained by four algorithms on the DF8–DF14 problems

4.2 Experimental Results

The experiments were conducted with different combinations of change severity levels (\(n_{T}\)) and frequencies (\(\tau _{T}\)) in order to study the impact of dynamics in changing environments. They were set to 5 and 10, respectively. Comparison results of the four DEMO algorithms (i.e., MOEA/D-PRE, MOEA/D-PPS, MOEA/D-KF, and MOEA/D-PARS) are reported to study the influence of the different prediction strategies. Table 1 and Table 2 detail the MIGD values of these algorithms on the DF benchmark problems. These two tables show that MOEA/D-PARS obtains competitive results on the majority of the DF instances, which implies that MOEA/D-PARS performs well in terms of both convergence and distribution. Specifically, MOEA/D-PARS obtains the best results on the DF2, DF10, and DF11 instances, whereas MOEA/D-KF performs best on the DF5, DF7, and DF14 instances. MOEA/D-PARS also achieves the best results on DF1, DF3, DF4, DF6, and DF9 under certain combinations of severity level and frequency. Besides, MOEA/D-PRE achieves the best results in a few cases. In most cases, the results obtained by MOEA/D-PPS are relatively poor.

Apart from the data in the above tables, we also provide the average tracking performance of these algorithms over 20 runs in Fig. 2, where the average IGD values are plotted against time. Owing to the page limit, only the results of four problems with \(\tau _{T}=10\) and \(n_{T}=5\) are presented here. It is clear that the tracking performance of MOEA/D-PARS is better than that of the other three algorithms on the DF2 and DF10 problems. On the DF4 problem, the tracking performance of the four algorithms is very similar. On the DF6 problem, although MOEA/D-PARS fluctuates in IGD values during the first 50 time steps like the other three algorithms, it clearly outperforms its competitors for the remaining time.

Fig. 2. Evolutionary average IGD(t) curves obtained by four algorithms over 20 runs on the DF2, DF4, DF6, and DF10 problems with \(\tau _T=10\) and \(n_T=5\)

5 Conclusions

Both the evolutionary dynamic optimization and online machine learning communities face the challenge of dynamic and uncertain environments. Hence, there may be an excellent opportunity for cross-fertilization between them [15]. This paper first introduces PAR, a common online learning technique, into DEMO, and then proposes a PAR-based prediction strategy (PARS) to react to new environmental changes in DMOPs. Furthermore, the proposed prediction strategy is integrated into MOEA/D-DE to deal with DMOPs. Finally, the proposed PARS is compared with three state-of-the-art prediction strategies under the same dynamic MOEA/D-DE framework. The experimental results show that MOEA/D-PARS is promising for handling DMOPs, which demonstrates that PARS is a competitive prediction strategy for DEMO.

The work presented here is preliminary, and there are several possible directions for future work. We would like to investigate the influence of the parameter settings of PAR-II, and of other PAR variants, on algorithm performance. In addition, we will try to design other prediction strategies based on other kinds of online machine learning techniques to better deal with DMOPs.