
To model and solve a bi-level or multi-level optimization problem, we first have to understand basic single-level optimization models and related solution methods. This chapter introduces the concepts, models and solution methods of basic single-level optimization, including linear programming, non-linear programming, multi-objective programming, goal programming, Stackelberg game theory, and particle swarm optimization. This knowledge will be used in the rest of the book.

This chapter is organized as follows. Section 2.1 introduces basic single-level optimization concepts and models. Section 2.2 presents the solution method of linear programming. Section 2.3 addresses non-linear programming by its definition, classification, theories, and solution methods. Section 2.4 gives the models and solution methods of multi-objective programming. Section 2.5 introduces goal programming and its solution process. In Sect. 2.6, we present the principles, theorems and applications of Stackelberg game theory. Particle swarm optimization, which will be used as a solution method for non-linear optimization problem, is then introduced in Sect. 2.7. Section 2.8 presents a summary.

1 Concepts

The core of the decision process is to formulate an identified decision problem and then find an optimal solution. Many decision models have been developed, and different types of decision models require different kinds of decision-making methods to obtain solutions. Popular decision models include (1) Analytic Hierarchy Process (AHP), which allows consideration of both the qualitative and quantitative aspects of a decision problem and reduces a complex decision to a series of pairwise comparisons; (2) Grid Analysis, also known as decision matrix analysis or multi-attribute utility theory, which uses a decision matrix to evaluate multiple alternatives against multiple criteria in the decision process; (3) Decision Tree, a graph of decisions and their possible consequences that is used to create a plan to reach a goal; and (4) Optimization model, which is a more sophisticated approach to solving decision problems and is the main focus of this book.

Optimization, also called mathematical programming, refers to the study of decision problems in which one seeks to minimize (min) or maximize (max) a function by systematically choosing the values of variables within their allowed sets. Many real-world decision problems can be modeled by an optimization framework. To model a decision problem as an optimization model, we need, in principle, three sets of basic variables: decision variables, result variables and uncontrollable variables (or parameters).

Decision Variables describe alternative courses of action and are determined by related decision makers. For example, for a product planning problem, the number of products to be produced is a decision variable.

Result Variables are outputs and are often described by objective functions, such as profit (max) and cost (min). The outputs are determined by the decisions made, the factors that cannot be controlled by decision makers, and the relationships among the variables.

Uncontrollable Variables (or Parameters) are the factors that affect the result variables but are not under the control of decision makers. These factors can be fixed, in which case they are called parameters, or they can vary. They are uncontrollable because they are determined by elements of the system environment. Some of these variables limit decision makers and therefore form what are called the constraints of the problem. For example, in a product planning problem, each product's cost of production should be less than the total profit, and each product should meet marketing requirements.

There are many types of optimization models such as linear programming, non-linear programming, multi-objective programming, and bi-level programming.

Linear Programming (LP) is an important type of optimization in which the objective function and constraints are all linear. Specialized algorithms exist for solving linear programming problems, and algorithms for other types of optimization problems often work by solving LP problems as sub-problems. Linear programming is heavily used in various management activities, either to maximize profit or to minimize cost, and it is also a key technique underlying other optimization problems.

Now, we re-consider Example 1.1 discussed in Chap. 1 to explain how to build an LP model for a practical decision problem.

Example 2.1

A company produces two kinds of products: A and B. We know that the profit of one unit of A and B is $40 and $70, respectively. However, the company has limitations in its labor (a total of 501 labor hours available per time slot; each A needs 4 h and each B 3 h), machine (a total of 401 machine hours available; each A needs 2 h and each B 5 h), and marketing requirements (at least 10 units of A and at least 20 units of B must be produced). The decision problem is how many A and B should be produced to obtain the maximum profit. Using these settings and requirements, we can establish a linear programming model:

  • Decision variables:

    $$ \begin{aligned} x_{1} & = {\text{units}}\,{\text{of}}\,A\,{\text{to}}\,{\text{be}}\,{\text{produced}}; \\ x_{2} & = {\text{units}}\,{\text{of}}\,B\,{\text{to}}\,{\text{be}}\,{\text{produced}}. \\ \end{aligned} $$
  • Result variable (objective function): maximize total profit \( 40x_{1} + 70x_{2} , \)

  • Labor constraint (hours): \( 4x_{1} + 3x_{2} \le 501, \)

  • Machine constraint (hours): \( 2x_{1} + 5x_{2} \le 401, \)

  • Marketing requirement for \( x_{1} \) (units): \( x_{1} \ge 10, \)

  • Marketing requirement for \( x_{2} \) (units): \( x_{2} \ge 20. \)

This is a linear programming problem and can be solved by linear programming methods (see Sect. 2.2).
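
As an illustration, a problem of this size can be solved with any off-the-shelf LP solver. The sketch below is a minimal example assuming SciPy is available (an assumption; the chapter itself does not prescribe a tool): it maximizes the profit by minimizing its negation, since linprog minimizes by convention.

```python
# A minimal sketch of Example 2.1 in SciPy; linprog minimizes, so the
# profit coefficients are negated.
from scipy.optimize import linprog

c = [-40, -70]                      # negated profit per unit of A and B
A_ub = [[4, 3],                     # labor hours per unit
        [2, 5]]                     # machine hours per unit
b_ub = [501, 401]
bounds = [(10, None), (20, None)]   # marketing requirements

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x, -res.fun)              # optimal plan and maximum profit
```

Running this sketch should recover the production plan \( x_{1} = 93, x_{2} = 43 \) with a maximum profit of $6,730, at which both the labor and machine constraints are binding.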

Non-linear Programming (NLP) is the process of solving a programming problem subject to certain constraints, over a set of unknown real variables, along with an objective function to be maximized or minimized, as with linear programming, but where some of the constraints or the objective function are non-linear. For example,

$$ \begin{aligned} & \mathop {\hbox{min} }\limits_{{x_{1} ,x_{2} }} \quad 40x_{1}^{2} + 70x_{2}^{3} \\ & {\text{s}} . {\text{t}}.\quad x_{1}^{2} + 20x_{2} \le 100, \\ & \quad \quad 2x_{1} + 3\sqrt {x_{2} } \le 140, \\ & \quad \quad x_{1} \ge 10,x_{2} \ge 20. \\ \end{aligned} $$

Multi-objective Programming (MOP) is the process of simultaneously optimizing two or more conflicting objectives subject to certain constraints. MOP problems can be found in a variety of fields, such as product and process design, aircraft design, automobile design, or wherever optimal decisions need to be made in the presence of trade-offs between two or more conflicting objectives. Maximizing profit and minimizing the cost of a product; maximizing performance and minimizing the fuel consumption of a vehicle; and minimizing weight while maximizing the strength of a particular component are all examples of multi-objective optimization problems.

In general, a multi-objective programming problem should not have a single solution that simultaneously minimizes or maximizes each objective to its fullest. In each case an objective must have reached a point such that, when attempting to optimize the objective further, other objectives suffer as a result. Finding such a solution, and quantifying how much better this solution is compared to other solutions, is the goal when setting up and solving a multi-objective optimization problem. For example,

$$ \begin{aligned} & \mathop {\hbox{min} }\limits_{{x_{1} ,x_{2} }} \quad \left( {\begin{array}{*{20}c} {40x_{1} + 70x_{2} } \\ {50x_{1} + 60x_{2} } \\ \end{array} } \right) \\ & {\text{s}} . {\text{t}} .\quad 10x_{1} + 20x_{2} \le 109, \\ & \quad \quad 20x_{1} + 30x_{2} \le 419. \\ \end{aligned} $$

Bi-level programming (BLP) and multi-level programming (MLP) are complex optimization situations where one optimization problem is embedded in another one. A bi-level programming problem is a multi-level programming problem having two levels. Below is an example of bi-level programming. More detail will be presented in Chap. 3.

$$ \begin{aligned} & \mathop {\hbox{min} }\limits_{{x_{1} }} \; 40x_{1} + 70x_{2} \\ & \;{\text{s}} . {\text{t}} .\; 10x_{1} + 20x_{2} \le 119, \\ & \quad \; 20x_{1} + 30x_{2} \le 409, \\ & \quad \mathop {\hbox{min} }\limits_{{x_{2} }} \; 50x_{1} + 60x_{2} \\ & \quad \; {\text{s}} . {\text{t}} .\; 10x_{1} + 8x_{2} \le 109, \\ & \qquad \;\;\;x_{1} \ge 10,x_{2} \ge 2. \\ \end{aligned} $$

We can see that optimization is an ideal model for decision making. The single limitation is that it works only if the problem is structured and, for the most part, deterministic. An optimization model defines the required input data, the desired output, and the mathematical relationships in a precise manner.

2 Linear Programming

Linear programming is a mathematical approach to determining a means to achieve the best outcome (such as maximum profit or minimum cost) in a given mathematical model. This model is defined by an objective function and one or more constraints which have linear formats. A typical example would be taking the limitations of materials and labor described by linear inequalities, and then determining the “best” production levels for the maximal profit defined by a linear formula, under those limitations.

LP problem can be written as:

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{x} \quad f(x) = cx \\ & {\text{s}} . {\text{t}} .\quad Ax \le b, \\ \end{aligned} $$
(2.1)

where x represents the vector of decision variables, c and b are vectors of known coefficients, and A is a known matrix of coefficients. The expression f(x) to be maximized (in other cases, minimized) is called the objective function. The inequalities Ax ≤ b are the constraints, which specify a convex polytope over which the objective function is to be optimized. Both the objective function f(x) and the constraints are linear.

Linear programming has a tremendous number of application fields. It has been used extensively in business and engineering, in the areas of transportation, energy, telecommunications, and manufacturing. It has been proved to be useful in modeling diverse types of problems in planning, routing, scheduling, assignment, and design.

The method most frequently used to solve LP problems is the simplex method (Charnes and Cooper 1957). This method provides a systematic way of examining the vertices of the feasible region to determine the optimal value of the objective function, and it has proven remarkably efficient in practice.

3 Non-linear Programming

Non-linear programming is the process of solving a problem of equalities and inequalities, collectively termed constraints, over a set of unknown real variables, along with an objective function to be maximized or minimized, where some of the constraints or the objective function are non-linear. Formally, an NLP problem can be written as:

$$ \mathop {\hbox{min} }\limits_{x} \quad f(x) $$
(2.2a)
$$ {\text{s}} . {\text{t}} .\quad h(x) = 0, $$
(2.2b)
$$ g(x) \ge 0, $$
(2.2c)

where \( x \in R^{n} ,\,f:R^{n} \to R,\,h:R^{n} \to R^{m} ,\,g:R^{n} \to R^{q} \). A point x that satisfies the constraints given by (2.2b) and (2.2c) is called a feasible solution to problem (2.2a)–(2.2c). The collection of all such feasible solutions forms the feasible region. NLP is then used to find a feasible solution \( \bar{x} \) such that \( f(\bar{x}) \le f(x) \) for any feasible solution x; \( \bar{x} \) is called an optimal solution to problem (2.2a)–(2.2c). In the special case when the objective function (2.2a) and the constraints (2.2b) and (2.2c) all have linear forms, problem (2.2a)–(2.2c) reduces to the linear programming problem (2.1).

3.1 Varieties of Non-linear Programming

Based on the mathematical characteristics of the objective function (2.2a) and the constraints (2.2b) and (2.2c), NLP can be in many different formats. For an objective function or a constraint, the format can be linear, sum of squares of linear functions, quadratic functions, sum of squares of non-linear functions, sparse non-linear functions, or non-smooth non-linear functions.

Based on combinations of the above formats of the objective and constraints, an NLP problem can be a specific type (such as linear objective function, but the constraint is a quadratic function) and thus have particular properties.

3.2 Theories and Optimality Conditions of Non-linear Programming

In this section, we introduce the most important and widely used theories and optimality conditions of NLP. We first denote the feasible region of problem (2.2a)–(2.2c) by S. The following definitions and results can be found in Bazaraa et al. (2013).

Definition 2.1

A point \( x^{*} \in S \) is called a relative or local minimum of f(x) over S if there is an \( \varepsilon > 0 \) such that \( f(x) \ge f(x^{*} ) \) for all \( x \in S \) within a distance ε of \( x^{*} \). If \( f(x) > f(x^{*} ) \) for all \( x \in S \), \( x \ne x^{*} \), within a distance ε of \( x^{*} \), then \( x^{*} \) is called a strict relative minimum of f(x) over S.

Definition 2.2

A point \( x^{*} \in S \) is called a global minimum of f(x) over S if \( f(x) \ge f(x^{*} ) \) for all \( x \in S \). If \( f(x) > f(x^{*} ) \) for all \( x \in S \), \( x \ne x^{*} \), then \( x^{*} \) is called a strict global minimum of f(x) over S.

For situations where constraints are absent, the following two theorems hold.

Theorem 2.1

Let \( f :\,R^{n} \to R \) be twice continuously differentiable throughout a neighborhood of \( x^{*} \) . If f has a relative minimum at \( x^{*} \) , then it necessarily follows that

  1. The gradient vector \( \nabla f(x^{*} ) = 0 \).

  2. \( F(x^{*} ) \) is positive semi-definite, where \( F(x^{*} ) \) is the Hessian matrix of f(x) at \( x^{*} \).

Theorem 2.2

Let \( f :\,R^{n} \to R \) be twice continuously differentiable throughout a neighborhood of \( x^{*} \) . Then a sufficient condition for f(x) to have a strict relative minimum at \( x^{*} \) , where \( \nabla f(x^{*} ) = 0 \) holds, is that \( F(x^{*} ) \) is positive definite.
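
As a small numerical illustration of Theorems 2.1 and 2.2, consider the unconstrained function \( f(x_{1} ,x_{2} ) = x_{1}^{2} + x_{2}^{2} \), whose strict minimum is at the origin. The sketch below (assuming NumPy) checks the two conditions directly:

```python
# Checking the optimality conditions of Theorems 2.1-2.2 numerically
# for f(x1, x2) = x1**2 + x2**2, whose strict minimum is at the origin.
import numpy as np

def grad(x):
    return np.array([2 * x[0], 2 * x[1]])       # gradient of f

hessian = np.array([[2.0, 0.0],
                    [0.0, 2.0]])                # constant Hessian of f

x_star = np.zeros(2)
print(np.allclose(grad(x_star), 0))             # first-order condition holds
print(np.all(np.linalg.eigvalsh(hessian) > 0))  # Hessian positive definite
```

Both checks print True, confirming that the origin satisfies the necessary condition of Theorem 2.1 and the sufficient condition of Theorem 2.2.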

For NLP problems involving only equality constraints, the following definition and theories hold.

Definition 2.3

A point \( x^{*} \) satisfying the constraints \( h(x^{*} ) = 0 \) is called a regular point of the constraints if the gradient vectors \( \nabla h_{1} (x^{*} ),\, \ldots \,,\nabla h_{m} (x^{*} ) \) are linearly independent.

Theorem 2.3

At a regular point \( x^{*} \) of the surface \( S = \{ x|h(x) = 0\} \) , the tangent plane is \( T = \{ y|\nabla h(x^{*} )y = 0\} \).

Theorem 2.4

Suppose that \( x^{*} \) is a local minimum of f(x) subject to \( h(x) = 0 \) as well as a regular point of these constraints. There then exists a vector \( \lambda \in R^{m} \) such that \( \nabla f(x^{*} ) - \lambda \nabla h(x^{*} ) = 0 \).

The following definitions and theories are used for the general NLP problem (2.2a)–(2.2c).

Definition 2.4

Let \( x^{*} \) be a point satisfying the constraints \( h\left( {x^{ * } } \right) = 0 \) and \( g\left( {x^{ * } } \right) \ge 0 \), and let \( J \) be the set of indices \( j \) such that \( g_{j} (x^{ * } ) = 0 \). Then \( x^{ * } \) is called a regular point of these constraints if the gradient vectors \( \nabla h_{i} \left( {x^{ * } } \right)\left( {1 \le i \le m} \right) \), \( \nabla g_{j} (x^{ * } )(j \in J) \) are linearly independent.

Theorem 2.5

(Kuhn-Tucker Conditions) Let \( x^{ * } \) be a relative minimum for the problem (2.2a)–(2.2c) and suppose that \( x^{ * } \) is a regular point for the constraints. Then there exists a vector \( \lambda \in R^{m} \) and a vector \( \mu \in R^{q} \) such that

$$ \nabla f\left( {x^{ * } } \right) - \lambda \nabla h\left( {x^{ * } } \right) - \mu \nabla g\left( {x^{ * } } \right) = 0, $$
(2.3a)
$$ \mu g\left( {x^{ * } } \right) = 0, $$
(2.3b)
$$ \mu \ge 0, $$
(2.3c)
$$ h\left( {x^{ * } } \right) = 0, g\left( {x^{ * } } \right) \ge 0. $$
(2.3d)
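
To make these conditions concrete, consider a one-dimensional illustration (ours, not from the references above):

$$ \mathop {\hbox{min} }\limits_{x} \quad x^{2} \quad {\text{s}}.{\text{t}}.\quad g(x) = x - 1 \ge 0. $$

The minimum is at \( x^{*} = 1 \), a regular point since \( \nabla g(x^{*} ) = 1 \ne 0 \). Condition (2.3a) reads \( 2x^{*} - \mu = 0 \), giving \( \mu = 2 \ge 0 \) as required by (2.3c); because the constraint is active, \( g(x^{*} ) = 0 \) and the complementarity condition (2.3b) \( \mu g(x^{*} ) = 0 \) holds, so all of (2.3a)–(2.3d) are satisfied.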

3.3 Methods for Solving Non-linear Programming Problems

For an NLP problem in which the objective function and constraints have linear forms, the problem becomes an LP problem, which can be solved using the well-known simplex algorithm.

If the objective function of an NLP problem is convex (for the minimization problem), or concave (for the maximization problem), and the constraint set is convex, then the programming problem is called a convex programming problem and general methods from convex optimization can be used.

Several methods are available for solving non-convex problems. One approach is to use special formulations of LP problems. Another involves the branch-and-bound technique, in which the problem is divided into subdivisions to be solved with convex (for minimization problems) or linear approximations that form a lower bound on the overall cost within each subdivision. With subsequent divisions, an actual solution will eventually be obtained whose cost is equal to the best lower bound obtained for any of the approximate solutions. This solution is optimal, although possibly not unique. The method may also be terminated early with the assurance that the best feasible solution is within a tolerance of the best point found; such points are called \( \varepsilon \)-optimal solutions. Terminating at an \( \varepsilon \)-optimal solution is typically necessary to ensure finite termination. This is especially useful for large, difficult problems, and for problems with uncertain costs or values where the uncertainty can be estimated with appropriate reliability.

Under differentiability and constraint qualifications, the Kuhn–Tucker conditions provide the necessary conditions for a solution to be optimal. Under convexity, these conditions are also sufficient.

The most popular methods for NLP problems include Zoutendijk’s feasible direction method, the gradient projection method, the penalty method, and the Lagrangian method (Bazaraa et al. 2013).
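
As a concrete illustration, a small convex NLP can be solved numerically with an off-the-shelf solver. The sketch below assumes SciPy's SLSQP method (our choice for illustration; it is not one of the classical methods named above) and uses a hypothetical problem:

```python
# A minimal sketch of solving a small convex NLP with SciPy's SLSQP solver;
# the objective and constraint below are illustrative only.
from scipy.optimize import minimize

objective = lambda x: 40 * x[0] ** 2 + 70 * x[1] ** 2

# SLSQP expects inequality constraints in the form fun(x) >= 0.
constraints = [{"type": "ineq", "fun": lambda x: x[0] + 2 * x[1] - 10}]
bounds = [(0, None), (0, None)]

res = minimize(objective, x0=[1.0, 1.0], method="SLSQP",
               bounds=bounds, constraints=constraints)
print(res.x, res.fun)   # roughly x = (3.04, 3.48)
```

At the reported solution the constraint \( x_{1} + 2x_{2} \ge 10 \) is active, in agreement with the Kuhn–Tucker conditions of Theorem 2.5.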

The above-mentioned methods depend on certain mathematical properties of the NLP problems to be solved. Sometimes these properties are difficult to satisfy, in which case these methods become invalid. Heuristics-based methods, such as genetic algorithms (Tang et al. 2011) and particle swarm optimization (Nezhad et al. 2013), do not have this limitation and thus offer another direction for solving NLP problems.

4 Multi-objective Programming

The main characteristics of Multi-objective Programming (MOP) are that decision makers need to achieve multiple objectives simultaneously while these multiple objectives are non-commensurable and conflict with each other.

4.1 Multi-objective Programming Model

An MOP model considers a vector of variables, objective functions, and constraints. It attempts to maximize (or minimize) the objective functions. Since this problem rarely has a unique solution, we expect to choose a solution from among the set of feasible solutions, which will be explained later in this section. Generally, a MOP problem can be formulated as follows:

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{x} \quad f(x) \\ & {\text{s}}.{\text{t}}. \quad x \in X = \left\{ {x|g(x) \le 0} \right\} \\ \end{aligned} $$
(2.4)

where \( f\left( x \right) \) represents \( k \) conflicting objective functions, \( g\left( x \right) \le 0 \) represents \( m \) constraints, and \( x \in R^{n} \) is an \( n \)-dimensional vector of decision variables.

Multi-objective linear programming (MOLP) is one of the most important forms of MOP problems, which are specified by linear objective functions subject to a set of linear constraints. The standard form of a MOLP problem can be written as follows:

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{x} \quad Cx \\ & {\text{s}}.{\text{t}}.\quad x \in X = \left\{ {x|Ax \le b} \right\} \\ \end{aligned} $$
(2.5)

where \( C \) is a \( k \times n \) objective function matrix, \( A \) is an \( m \times n \) constraint matrix, \( b \) is an \( m \)-dimensional vector, and \( x \) is an \( n \)-dimensional vector of decision variables.

We have the following notion for a complete optimal solution.

Definition 2.5

(Sakawa 1993) \( x^{*} \in X \) is said to be a complete optimal solution if and only if \( f_{i} \left( {x^{*} } \right) \ge f_{i} \left( x \right)\,(i = 1,\, \ldots ,\,k) \) for all \( x \in X . \)

Also, ideal solution, superior solution, or utopia point are equivalent terms indicating a complete optimal solution (Lu et al. 2007).

In general, a complete optimal solution that simultaneously maximizes (or minimizes) all objective functions does not always exist when the objective functions conflict with each other. Thus, the concept of a Pareto optimal solution is introduced into MOLP.

Definition 2.6

(Sakawa 1993) \( x^{*} \) is said to be a Pareto optimal solution, if and only if there does not exist another \( x \in X \) such that \( f_{i} \left( x \right) \ge f_{i} \left( {x^{*} } \right) \) for all i and \( f_{i} \left( x \right) \ne f_{i} \left( {x^{*} } \right) \) for at least one i.

The Pareto optimal solution is also called a non-dominated solution, a non-inferior solution, or an efficient solution.

In addition to the Pareto optimal solution, the following weak Pareto optimal solution is defined as a slightly weaker solution concept than the Pareto optimal solution.

Definition 2.7

(Sakawa 1993) \( x^{*} \) is said to be a weak Pareto optimal solution, if and only if there does not exist another \( x \in X \) such that \( f_{i} \left( x \right) > f_{i} \left( {x^{*} } \right), i = 1, \ldots ,k \).

Here, let \( X^{CO} \), \( X^{P} \) and \( X^{WP} \) denote the complete optimal, Pareto optimal, and weak Pareto optimal solution sets, respectively. From the above definitions, we can easily obtain the following relations:

$$ X^{CO} \subseteq X^{P} \subseteq X^{WP} . $$
(2.6)

A satisfactory solution belongs to a reduced subset of the feasible set in which all the aspiration levels of the objectives are met. A set of satisfactory solutions is composed of acceptable alternatives. Satisfactory solutions need not be non-dominated; a preferred solution is a non-dominated solution selected as the final choice through decision makers' involvement in the information processing stage.

The rest of this chapter focuses mainly on MOLP, the linear form of MOP.

4.2 Multi-objective Linear Programming Methods

The methods for solving MOLP problems have been well developed and classified into four classes by Hwang and Masud (1979) and Lai and Hwang (1994). We list them in Table 2.1.

Table 2.1 A classification of MOLP methods

As shown in Table 2.1, the first class of MOLP methods basically requires no further information or interaction with decision makers once the objective functions and constraints have been defined. The solution to a MOLP problem is presented on the basis of assumptions made about decision makers' preferences.

The second class of MOLP methods assumes that decision makers have a set of goals to achieve and that these goals will be established before formulation of a mathematical programming model. Multi-objective goal programming (MOGP) assumes that decision makers can specify goals for the objective functions. The key idea behind goal programming is to minimize the deviation from the goals or aspiration levels set by decision makers. In most cases, therefore, MOGP yields a satisfactory solution rather than an optimal one. More details about MOGP will be discussed later.

The third class of MOLP methods, interactive methods, requires more involvement and interaction with decision makers in the solving process. The interaction takes place through the decision makers' computer interface at each iteration. Trade-off or preference information from decision makers at each iteration is used to determine a new solution, so decision makers actually gain insight into the problem. Interactive programming was first initiated by Geoffrion et al. (1972) and further developed by many researchers. The STEP method (Benayoun et al. 1971) in particular is known as one of the first interactive MOLP techniques, to which there have been a number of modifications and extensions. An interactive MOGP method (Dyer 1972) has also been proposed, which attempts to provide a link between MOGP and interactive methods.

Lastly, the purpose of the fourth class is to determine a subset of the complete set of non-dominated solutions to a MOLP problem. It deals strictly with constraints and does not consider the decision makers' preferences. The desired outcome is to narrow the possible courses of action so that the preferred course of action can be selected more easily.

Interaction is one of the most important features for solving MOLP problems. There are three types of interaction in the MOLP problem solving process: pre-interaction (before the solution process), pro-interaction (during the solution process), and post-interaction (after the solution process). The seven MOLP methods selected from Table 2.1, ESGP, IMOLP, ISGP, MOGP, STEM, STEUER, and ZW, differ in their interaction processes with decision makers. The MOGP, IMOLP and ISGP methods involve pre-interaction with users prior to the solution process through the collection of weights, goals, and priorities of objectives from users. The STEM method engages in pro-interaction during the solution process. Its principle is to ask decision makers to nominate the amounts by which satisfactory objectives can be sacrificed, until all objectives become satisfactory. It first displays a solution and the ideal value of each objective. It then asks decision makers to accept or reject this solution. If it is accepted, the solution is taken as the final satisfactory solution; however, decision makers often make further searches so that more alternative solutions can be generated. If the current solution is rejected, a relaxation process starts. Decision makers will accept a certain level of relaxation of a satisfactory objective to allow the improvement of unsatisfactory objectives. When the relaxation fails, the system enables decision makers to re-enter a set of relaxation values and a new solution is then found. If decision makers accept this solution, it becomes the final satisfactory solution; otherwise the system repeats the above process. Post-interaction is used in all seven methods: after a set of candidate solutions has been generated, decision makers are required to choose the most satisfactory solution.

Now, we give details of the weighting method for solving MOLP problems.

The key idea of the weighting method is to transform the multiple objective functions in the MOLP problem (2.5) into a weighted single objective function, which is described as follows:

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{x} \quad wCx \\ & {\text{s}}.{\text{t}}. \quad x \in X = \left\{ {x|Ax \le b} \right\} \\ \end{aligned} $$
(2.7)

where \( w = \left( {w_{1} ,w_{2} , \ldots ,w_{k} } \right) \ge 0 \) is a vector of weighting coefficients assigned to the objective functions.

Example 2.2

Let us consider the following example of a MOLP problem.

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{{x_{1} ,x_{2} }} \; f\left( x \right) = \left( {\begin{array}{*{20}c} {f_{1} \left( x \right)} \\ {f_{2} \left( x \right)} \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {2x_{1} + x_{2} } \\ { - x_{1} + 2x_{2} } \\ \end{array} } \right) \\ & \;{\text{s}}.{\text{t}}.\; - x_{1} + 3x_{2} \le 21, \\ & \quad \; x_{1} + 3x_{2} \le 27, \\ & \quad \; 4x_{1} + 3x_{2} \le 45, \\ & \quad \; 3x_{1} + x_{2} \le 30, \\ & \quad \; x_{1} ,x_{2} \ge 0. \\ \end{aligned} $$
(2.8)

Let \( X \) denote the feasible region of problem (2.8). When \( w_{1} = 0.5, w_{2} = 0.5 \), the weighting problem is formulated as

$$ \begin{aligned} & \hbox{max} \quad wf\left( x \right) = 0.5x_{1} + 1.5x_{2} \\ & {\text{s}}.{\text{t}}.\quad (x_{1} ,x_{2} ) \in X. \\ \end{aligned} $$

The optimal solution is \( (x_{1}^{*} , x_{2}^{*} ) = (3, 8) \), and the optimal objective function value is \( f^{*} \left( x \right) = (f_{1}^{*} \left( x \right), f_{2}^{*} (x)) = (14,13) \).

When \( w_{1} = 1, w_{2} = 0 \), the optimal solution is \( (x_{1}^{*} , x_{2}^{*} ) = (9,3) \), and the optimal objective function value is \( f^{*} \left( x \right) = (f_{1}^{*} \left( x \right), f_{2}^{*} (x)) = (21, - 3) \).

When \( w_{1} = 0, w_{2} = 1 \), the optimal solution is \( (x_{1}^{*} , x_{2}^{*} ) = (0, 7) \), and the optimal objective function value is \( f^{*} \left( x \right) = (f_{1}^{*} \left( x \right), f_{2}^{*} (x)) = (7, 14) \).
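
The weighting problems above can be reproduced with an LP solver. The sketch below (assuming SciPy) solves the \( w_{1} = w_{2} = 0.5 \) case; note that the weighted objective \( 0.5x_{1} + 1.5x_{2} \) is constant along the edge \( x_{1} + 3x_{2} = 27 \), so a solver may return (3, 8) or another point on that edge, such as (6, 7), with the same weighted value 13.5.

```python
# A sketch reproducing the weighted problem of Example 2.2 with w = (0.5, 0.5).
from scipy.optimize import linprog

c = [-0.5, -1.5]                       # negated weighted objective w*C
A_ub = [[-1, 3], [1, 3], [4, 3], [3, 1]]
b_ub = [21, 27, 45, 30]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(res.x, -res.fun)                 # an optimal point; weighted value 13.5
```

Changing c to [-2, -1] (the case \( w_{1} = 1, w_{2} = 0 \)) or [1, -2] (the case \( w_{1} = 0, w_{2} = 1 \)) recovers the other two solutions reported above.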

4.3 A Case-Based Example

Example 2.3

A manufacturing company has six types of machine (milling machine, lathe, grinder, jig saw, drill press, and band saw), whose capacities are to be devoted to producing three products \( x_{1} \), \( x_{2} \), and \( x_{3} \). Decision makers have three objectives: maximizing profit, quality, and worker satisfaction. It is assumed that the parameters and objectives of the MOLP problem are defined precisely in this example. For instance, producing one unit of \( x_{1} \) requires 12 h of milling machine time, as listed in Table 2.2 (Lai 1995).

Table 2.2 Production planning data

This problem can be described by a MOLP model as follows:

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{{x_{1} ,x_{2} , x_{3} }} \; f\left( x \right) = \left( {\begin{array}{*{20}c} {50x_{1} + 100x_{2} + 17.5x_{3} } \\ {92x_{1} + 75x_{2} + 50x_{3} } \\ {25x_{1} + 100x_{2} + 75x_{3} } \\ \end{array} } \right) \\ & \;{\text{s}}.{\text{t}}.\; 12x_{1} + 17x_{2} \le 1400, \\ & \;\;\;\;\;\; \; 3x_{1} + 9x_{2} + 8x_{3} \le 1000, \\ & \;\;\;\;\;\; \; 10x_{1} + 13x_{2} + 15x_{3} \le 1750, \\ & \;\;\;\;\;\; \; 6x_{1} + 16x_{3} \le 1325, \\ & \;\;\;\;\;\; \; 12x_{2} + 7x_{3} \le 900, \\ & \;\;\;\;\;\; \; 9.5x_{1} + 9.5x_{2} + 4x_{3} \le 1075, \\ & \;\;\;\;\;\; \; x_{1} ,x_{2} , x_{3} \ge 0. \\ \end{aligned} $$
(2.9)

We can see that this is a typical multi-objective programming problem.

5 Goal Programming

Goal programming (GP), originally proposed by Charnes and Cooper (1957), is a widely used strategy for dealing with multi-objective optimization problems by setting multiple goals, as mentioned before. In some decision situations, a decision maker may have more than one objective, with improvement on one objective achievable only at the expense of others. For example, a coordinator of a multi-division firm considers three objectives in making an aggregate production plan: to maximize the net profit, to maximize the quality of products, and to maximize worker satisfaction (Example 2.3). The three objectives could conflict with each other, but must be considered simultaneously.

Goal programming takes a ‘satisfactory solution’ strategy. It requests a decision maker to set a goal or a target for the objective (a set of goals for a MOLP) that the person wishes to attain. A preferred solution is then defined to minimize the deviation from the goal. Therefore, goal programming would appear to yield a satisfactory solution rather than an optimal one. Now we give a formal description of the method adopted by goal programming.

Suppose that a MOLP problem is defined as follows:

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{x} \quad f\left( x \right) = \left( {\alpha_{1} x,\alpha_{2} x, \ldots ,\alpha_{k} x} \right) \\ & {\text{s}}.{\text{t}}.\quad Ax \le b. \\ \end{aligned} $$
(2.10)

For problem (2.10), there are a total of \( k \) objectives \( \left( {\alpha_{1} x,\alpha_{2} x, \ldots ,\alpha_{k} x} \right) \) to achieve. We set a goal \( g_{i} \,(i = 1,2, \ldots ,k) \) for the ith objective. Our effort is now focused on making each objective \( \alpha_{i} x \) as close to its goal \( g_{i} \) as possible. Problem (2.10) is then transformed as follows:

$$ \begin{aligned} & \mathop {\hbox{min} }\limits_{{x,v_{1}^{ - } ,v_{1}^{ + } , \ldots , v_{k}^{ - } ,v_{k}^{ + } }} \quad v_{1}^{ - } + v_{1}^{ + } + \cdots + v_{k}^{ - } + v_{k}^{ + } \\ &\quad \quad {\text{s}}.{\text{t}}.\quad\quad\quad\; \alpha_{1} x + v_{1}^{ - } - v_{1}^{ + } = g_{1} , \\ & \quad\quad\quad\quad \quad\quad\;\; \alpha_{2} x + v_{2}^{ - } - v_{2}^{ + } = g_{2} , \\ &\quad \quad\quad\quad\quad\quad \quad \quad \quad \;\;\vdots \\ &\quad \quad\quad\quad \quad\quad\;\; \alpha_{k} x + v_{k}^{ - } - v_{k}^{ + } = g_{k}, \\ &\quad\quad \quad\quad\quad \quad\;\; v_{1}^{ - } ,v_{1}^{ + } , \ldots , v_{k}^{ - } ,v_{k}^{ + } \ge 0, \\ &\quad\quad\quad \quad\quad\quad\;\; Ax \le b. \\ \end{aligned} $$
(2.11)

To clarify the idea adopted, \( v_{i}^{ - } \) and \( v_{i}^{ + } \,(i = 1, \ldots ,k) \) can be defined as follows:

$$ \begin{aligned} v_{1}^{ + } & = \tfrac{1}{2}\left( {\left| {\alpha_{1} x - g_{1} } \right| + \left( {\alpha_{1} x - g_{1} } \right)} \right), \\ v_{1}^{ - } & = \tfrac{1}{2}\left( {\left| {\alpha_{1} x - g_{1} } \right| - \left( {\alpha_{1} x - g_{1} } \right)} \right), \\ & \quad \quad \quad \quad \vdots \\ v_{k}^{ + } & = \tfrac{1}{2}\left( {\left| {\alpha_{k} x - g_{k} } \right| + \left( {\alpha_{k} x - g_{k} } \right)} \right), \\ v_{k}^{ - } & = \tfrac{1}{2}\left( {\left| {\alpha_{k} x - g_{k} } \right| - \left( {\alpha_{k} x - g_{k} } \right)} \right). \\ \end{aligned} $$

In the above formula, \( v_{i}^{ - } \) and \( v_{i}^{ + } \) are deviation variables representing the under-achievement and over-achievement of the \( i \)th goal \( g_{i} \) for the \( i \)th objective \( \alpha_{i} x\,(i = 1, \ldots ,k) \), respectively.

The problem defined by (2.11) is a standard linear programming problem which can be solved by the simplex method.
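
A minimal sketch of this transformation, for a hypothetical two-goal problem (the objectives, goals, and constraint below are illustrative, not taken from the text), is:

```python
# Goal programming via the transformation (2.11), solved as an LP.
from scipy.optimize import linprog

# Decision vector z = (x1, x2, v1-, v1+, v2-, v2+).
c = [0, 0, 1, 1, 1, 1]                  # minimize the total deviation

# alpha_i x + v_i^- - v_i^+ = g_i
A_eq = [[1, 1, 1, -1, 0, 0],            #  x1 +  x2 with goal 10
        [2, -1, 0, 0, 1, -1]]           # 2x1 -  x2 with goal 10
b_eq = [10, 10]

A_ub = [[1, 2, 0, 0, 0, 0]]             # original constraint x1 + 2x2 <= 12
b_ub = [12]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 6)
print(res.x[:2], res.fun)               # decision variables, total deviation
```

A positive optimal value of the total deviation indicates that the goals cannot all be met exactly within the feasible region.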

There are some variants of goal programming. The initial goal programming formulations order the deviations between objectives and goals into a number of priority levels. Minimizing a deviation at a higher priority level is more important than minimizing deviations at lower priority levels. This is called lexicographic or pre-emptive goal programming (Amador and Romero 1989). When a clear priority ordering of the goals to be achieved exists, lexicographic goal programming can be used.

Weighted or non-pre-emptive goal programming can be used if a decision maker is more interested in making direct comparisons of the objectives. In this situation, all the deviations between objectives and goals are multiplied by weights that reflect the relative importance of the objectives, and these weighted deviations are added together as a single sum to form the objective function. This process is defined by the following formula:

$$ \begin{aligned} & \mathop {\hbox{min} }\limits_{{x,v_{1}^{ - } ,v_{1,}^{ + } , \ldots ,v_{k}^{ - } ,v_{k}^{ + } }} \quad v = w_{1}^{ - } v_{1}^{ - } + w_{1}^{ + } v_{1}^{ + } + \cdots + w_{k}^{ - } v_{k}^{ - } + w_{k}^{ + } v_{k}^{ + } \\ & {\text{s}} . {\text{t}} .\quad \alpha_{1} x + v_{1}^{ - } - v_{1}^{ + } = g_{1} , \\ & \quad \quad \alpha_{2} x + v_{2}^{ - } - v_{2}^{ + } = g_{2} , \\ & \quad \quad \quad \quad \quad \vdots \\ & \quad \quad \alpha_{k} x + v_{k}^{ - } - v_{k}^{ + } = g_{k} , \\ & \quad \quad v_{1}^{ - } ,v_{1}^{ + } , \ldots , v_{k}^{ - } ,v_{k}^{ + } \ge 0, \\ & \quad \quad Ax \le b, \\ \end{aligned} $$

where \( w_{i}^{ - } \) and \( w_{i}^{ + } (i = 1,2, \ldots ,k) \) are non-negative constants representing the relative importance to be assigned to the positive and negative deviations for each of the relevant goals.

Based on goal programming as previously introduced and the MOLP model, MOGP requires that goals are set for each objective, following which a preferred solution is defined as one which minimizes the deviations from those goals.

We assume that the goals \( g = (g_{1} , \ldots ,g_{k} ) \) are specified for objective functions \( f(x) = (f_{1} (x), \ldots ,f_{k} (x)) \) by decision makers, and a decision variable \( x^{*} \in X \) in the MOLP problem is sought so that the objective functions \( f^{*} \left( x \right) = (f_{1}^{*} \left( x \right), \ldots ,f_{k}^{*} \left( x \right)) \) are as close as possible to the goals \( g = (g_{1} , \ldots ,g_{k} ) \).

The deviation between \( f^{*} \left( x \right) = (f_{1}^{*} \left( x \right), \ldots ,f_{k}^{*} \left( x \right)) \) and \( g = (g_{1} , \ldots ,g_{k} ) \) is usually defined as a deviation function \( D(f\left( x \right),g) \). The MOGP can then be defined as an optimization problem:

$$ \begin{aligned} & \mathop { \hbox{min} }\limits_{x \in X} \quad D\left( {f\left( x \right), g} \right) \\ & {\text{s}}.{\text{t}}.\quad x \in X = \left\{ {x \in R^{n} |Ax \le b} \right\}, \\ \end{aligned} $$
(2.12)

that is, find an \( x^{*} \in X \), which minimizes \( D(f\left( x \right),g) \) or

$$ x^{*} = \arg \mathop {\hbox{min} }\limits_{x \in X} D\left( {f\left( x \right), g} \right) . $$
(2.13)

Normally, the deviation function \( D(f\left( x \right),g) \) is the maximum of the deviations for the individual goals,

$$ D\left( {f\left( x \right),g} \right) = \hbox{max} \left\{ {D_{1} \left( {f_{1} \left( x \right),g_{1} } \right), \ldots ,D_{k} \left( {f_{k} \left( x \right),g_{k} } \right)} \right\}. $$
(2.14)

From (2.12) and (2.14), the min–max approach is applied to the GP problem:

$$ \mathop {\hbox{min} }\limits_{x \in X} \hbox{max} \left\{ {D_{1} \left( {f_{1} \left( x \right), g_{1} } \right), \ldots , D_{k} \left( {f_{k} \left( x \right),g_{k} } \right)} \right\}. $$
(2.15)

By introducing an auxiliary variable \( \gamma \), (2.15) can then be transformed into the following linear programming problem:

$$ \begin{aligned} & \mathop { \hbox{min} }\limits_{x} \quad \gamma \\ & {\text{s}}.{\text{t}}.\quad D_{1} \left( {f_{1} \left( x \right),g_{1} } \right) \le \gamma , \\ & \quad \quad D_{2} \left( {f_{2} \left( x \right), g_{2} } \right) \le \gamma , \\ & \quad \quad \quad \quad \vdots \\ & \quad \quad D_{k} \left( {f_{k} \left( x \right), g_{k} } \right) \le \gamma , \\ & \quad \quad Ax \le b. \\ \end{aligned} $$
(2.16)

Example 2.4

Let us consider the following example of a MOLP problem:

$$ \begin{aligned} & \mathop {\hbox{max} }\limits_{{x_{1} ,x_{2} }} \quad f\left( x \right) = \left( {\begin{array}{*{20}c} {2x_{1} + x_{2} } \\ { - x_{1} + 2x_{2} } \\ \end{array} } \right) \\ & {\text{s}}.{\text{t}}.\quad {-} x_{1} + 3x_{2} \le 21, \\ & \quad \quad x_{1} + 3x_{2} \le 27, \\ & \quad \quad 4x_{1} + 3x_{2} \le 45, \\ & \quad \quad 3x_{1} + x_{2} \le 30, \\ & \quad \quad x_{1} ,x_{2} \ge 0. \\ \end{aligned} $$

Suppose the goals are specified as \( g = (10, 10) \) for the two objective functions. The original MOLP problem can be converted into the following LP problem with the auxiliary variable \( \gamma \):

$$ \begin{aligned} & \mathop { \hbox{min} }\limits_{{x_{1} ,x_{2} }} \quad \gamma \\ & {\text{s}}.{\text{t}}.\quad 2x_{1} + x_{2} - 10 \le \gamma , \\ & \quad \quad {-} x_{1} + 2x_{2} - 10 \le \gamma , \\ & \quad \quad {-} x_{1} + 3x_{2} \le 21, \\ & \quad \quad x_{1} + 3x_{2} \le 27, \\ & \quad \quad 4x_{1} + 3x_{2} \le 45, \\ & \quad \quad 3x_{1} + x_{2} \le 30, \\ & \quad \quad x_{1} ,x_{2} \ge 0. \\ \end{aligned} $$

The optimal solution then is \( (x_{1}^{*} , x_{2}^{*} ) = (2, 6) \), and the optimal objective function values are \( f^{*} \left( x \right) = (f_{1}^{*} \left( x \right), f_{2}^{*} (x)) = (10, 10) \).

When the goals are specified as \( g = (15, 15) \), the optimal solution is \( (x_{1}^{*} , x_{2}^{*} ) = (1.865, 7.622) \), and the optimal objective function values are \( f^{*} \left( x \right) = (f_{1}^{*} \left( x \right), f_{2}^{*} (x)) = (11.351, 13.378) \). We learn from the optimal objective function values that the goals are not achieved. The reason is that the goals specified are beyond the feasible constraint area. The point of \( \left( {x_{1}^{*} , x_{2}^{*} } \right) = (1.865, 7.622) \) is on the boundary of the feasible constraint area.

Goal programming has the advantages of being simple and easy to use. It can handle relatively large numbers of variables, constraints and objectives, which accounts for the large number of goal programming applications in many diverse fields, such as business management, transportation planning, and resource optimization. A limitation of goal programming is that setting the goals for some of the objectives may not be straightforward. In-depth field knowledge might be required to solve a decision problem, and experiments sometimes need to be carried out to set suitable goals.

6 Stackelberg Game Model

The Stackelberg game model, which is also called a leader-follower game, was first proposed by Heinrich von Stackelberg in 1952 (Stackelberg 1952). It is based on economic monopolization phenomena. In a Stackelberg game, one player acts as a leader and the rest as followers. The problem is then to find an optimal strategy for the leader, assuming that the followers react in a rational way which will optimize their objective functions, given the leader’s actions.

Stackelberg used a hierarchical model to describe a market situation in which decision makers try to optimize their decisions based on individually different objectives but are affected by a certain hierarchy.

6.1 Stackelberg Game and Bi-level Programming

The Stackelberg leadership model considers the case of a single leader and a single follower. Let \( X \) and \( Y \) be the strategy sets for the leader and follower, respectively, and denote their objective functions by \( F(x,y) \) and \( f(x,y) \), respectively. Knowing the selection \( x \) of the leader, the follower can select his best strategy \( y(x) \) such that his objective function \( f(x,y) \) is maximized, i.e.,

$$ y\left( x \right) \in \varPhi \left( x \right) = { \arg }\mathop { \hbox{max} }\limits_{y \in Y} f\left( {x,y} \right). $$
(2.17)

The leader then obtains the best strategy \( x \in X \) as

$$ x \in { \arg }\mathop { \hbox{max} }\limits_{x \in X} \{ F\left( {x,y} \right)|y \in \varPhi \left( x \right)\} . $$
(2.18)

Formulae (2.17) and (2.18) can be combined to express the Stackelberg game as follows:

$$ \begin{aligned} & \mathop { \hbox{max} }\limits_{x} \quad F\left( {x,y} \right) \\ & {\text{s}}.{\text{t}}.\quad x \in X, \\ & \quad \quad y \in { \arg }\mathop { \hbox{max} }\limits_{y \in Y} f\left( {x,y} \right). \\ \end{aligned} $$
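
A toy numerical sketch of this formulation (with hypothetical quadratic payoffs chosen so the follower's best response has a closed form) substitutes \( y(x) \) into the leader's problem and searches over \( X \):

```python
# A toy Stackelberg game: the follower's best response is substituted into
# the leader's objective, which is then maximized by a simple grid search.
import numpy as np

def follower_response(x):
    # The follower maximizes f(x, y) = -(y - x)**2, so y(x) = x.
    return x

def leader_payoff(x):
    y = follower_response(x)
    return -(x - 1) ** 2 - (y - 2) ** 2    # leader's objective F(x, y)

xs = np.linspace(0, 3, 3001)                # search over X = [0, 3]
best = xs[np.argmax([leader_payoff(x) for x in xs])]
print(best, leader_payoff(best))            # about x = 1.5, F = -0.5
```

Because the follower's reaction is anticipated exactly, the leader commits to \( x = 1.5 \) rather than its stand-alone optimum \( x = 1 \), which would yield the lower payoff \( F = -1 \) once the follower responds.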

Bi-level programming (see Chap. 3) is more general than the Stackelberg game in the sense that the strategy sets (also called the admissible sets) depend on both \( x \) and \( y \). This leads to the general bi-level programming problem (Candler and Norton 1977) as follows:

$$ \begin{aligned} & \mathop { \hbox{max} }\limits_{x} \quad F\left( {x,y} \right) \\ & {\text{s}}.{\text{t}}.\quad G(x,y) \le 0, \\ & \quad \quad y \in {\text{argmax}} \{ f\left( {x,y} \right)|g(x,y) \le 0\} . \\ \end{aligned} $$
(2.19)

The bi-level programming problem (2.19) is a generalization of several well-known optimization problems (Dempe 2002). For example, if \( F\left( {x,y} \right) = - f\left( {x,y} \right) \), then it is a classical min–max problem; if \( F\left( {x,y} \right) = f\left( {x,y} \right) \), we have a realization of the decomposition approach to optimization problems; if the dependence of both the leader's and the follower's problems on \( y \) is dropped, the problem is reduced to a bi-criteria optimization problem.

6.2 Stackelberg Game and Nash Game

The Stackelberg game can be considered as an extension of the well-known Nash game (Nash 1951). In the Nash game, we assume that there are \( k \) players, and the ith player has a strategy set \( X_{i} \), and his objective function is \( f_{i} (x) \) for \( i = 1,2, \ldots ,k \), where \( x = (x_{1} ,x_{2} , \ldots ,x_{k} ) \). Each player chooses a strategy based on the choices of the other players and there is no hierarchy. The unstructured problem is modeled as follows: for \( i = 1,2, \ldots ,k \), we have \( \mathop { \hbox{max} }\limits_{{x_{i} \in X_{i} }} f_{i} (x) \).

This is a Nash game in which all players aim to maximize their corresponding objective functions.

In contrast, there is a hierarchy between the leader and followers in the Stackelberg game. The leader is aware of the choices of the followers, thus the leader, being in a superior position with regard to everyone else, can achieve the best objective while forcing the followers to respond to this choice of strategy by solving the Stackelberg game. Without loss of generality, we now assume that the first player is the leader, and the rest of the players are followers. Let \( X_{1 - } = X_{2} \times X_{3} \times \cdots \times X_{k} ,f_{1 - } \left( x \right) = \left( {f_{2} \left( x \right), \ldots ,f_{k} \left( x \right)} \right) \), and \( x_{1 - } = \left( {x_{2} , \ldots ,x_{k} } \right) \in X_{1 - } . \) The above Nash game is accordingly transformed into a Stackelberg game, which is given as follows:

$$ \begin{aligned} & \mathop { \hbox{max} }\limits_{{x_{1} \in X_{1} }} \quad f_{1} (x) \\ & {\text{s}}.{\text{t}}.\quad x_{1 - } \in {\text{argmax}} \{ f_{1 - } \left( x \right)|x_{1 - } \in X_{1 - } \} . \\ \end{aligned} $$

This is a Stackelberg game or leader-follower game.

6.3 Applications of Stackelberg Games

The investigation of Stackelberg games is strongly motivated by real world applications, and Stackelberg games techniques have been applied with remarkable success in many domains, such as transportation network design, production planning and logistics.

Stackelberg games have been applied to the network design problem (Ben-Ayed 1988) arising in transportation systems. In the accompanying formulation, a central planner controls investment costs at the system level, while operational costs depend on traffic flow, which is determined by the individual user's route selection. Because users are assumed to make decisions that maximize their own utility functions, their choices do not necessarily coincide with the choices that are optimal for the system. Nevertheless, the central planner can influence users' choices by improving certain links, making some relatively more attractive than others. In deciding on these improvements, the central planner tries to influence users' preferences in such a way that total costs are minimized. The partition of the control variables between the upper and lower levels naturally leads to a bi-level formulation.

Moreover, a fuzzy Stackelberg game model was set up to control traffic flow in a disaster area after an earthquake (Feng and Wen 2005). When a severe earthquake occurs, roadway systems usually suffer various degrees of damage, reducing their capacity and causing traffic congestion. Maintaining viable traffic functions to facilitate the saving of more lives is a crucial task following an earthquake. The aim of the commander of the government Emergency-Response Centre at county and city level (the upper level) is to allow traffic to pass through disaster areas to the extent that the roadway's capacity permits, while road users (at the lower level) always choose the shortest route to effect emergency rescues. To solve this decision problem, the bi-level technique has been used post-earthquake to provide an efficient traffic control strategy for recovery from chaos.

A Stackelberg game has also been formulated for a newsboy problem. The decision makers are the manufacturer and the retailers: the former acts as a leader who controls the product price, and the retailers act as followers who decide the quantity of newspapers to order. The relationship between the manufacturer and retailers is a sequential non-cooperative game. The manufacturer first decides the product price, and the retailers then decide the quantity. The manufacturer tries to set the product price to maximize its profit after considering the retailers' behavior, while each retailer's decision is to optimize the order quantity so as to maximize its profit at the given product price. Clearly, this newsboy problem can be modeled as a Stackelberg game.

In addition, Stackelberg games are frequently utilized in many other real-world cases, such as resource allocation, network investigation, and engineering. These applications have provided stimulating environments for the development of Stackelberg games.

7 Particle Swarm Optimization

In the computational intelligence area, particle swarm optimization (PSO) is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. PSO is a heuristic algorithm proposed by Kennedy and Eberhart (1995) and further developed by Shi and Eberhart (1998).

Inspired by the social behavior of animals, such as fish schooling and bird flocking, PSO is a kind of population-based algorithm. The population of PSO is called a swarm, and each individual in the swarm is called a particle. The similarity between PSO and other evolutionary algorithms lies in the fact that an individual in the community is moved to a good area according to its fitness for the environment. Unlike other evolutionary computation methods, however, each particle in PSO has an adaptable velocity (position change), according to which it moves in the search space (Parsopoulos and Vrahatis 2002). Moreover, each particle has a memory, remembering the best position it has ever visited in the search space (Kennedy and Eberhart 1995). Thus, its movement is an aggregated acceleration towards its best previously visited position and towards the best particle of a topological neighborhood.

Suppose the current search space for PSO is \( n \)-dimensional, then the \( i \)th particle of the swarm can be represented by an \( n \)-dimensional vector, \( x_{i} = (x_{i1} , \ldots ,x_{in} ) \). The velocity (position change) of this particle can thus be represented by another \( n \)-dimensional vector \( v_{i} = (v_{i1} , \ldots ,v_{in} ). \) The best previously visited position of the \( i \)th particle is denoted as \( p_{i} = (p_{i1} , \ldots ,p_{in} ) \). Defining \( g \) as the index of the best particle in the swarm (i.e., the \( g \)th particle is the best), and letting the superscripts denote the iteration number, the swarm is manipulated according to the following two equations (Eberhart et al. 1996):

$$ \begin{aligned} v_{id}^{k + 1} & = wv_{id}^{k} + cr_{1}^{k} \left( {p_{id} - x_{id}^{k} } \right) + cr_{2}^{k} \left( {p_{gd}^{k} - x_{id}^{k} } \right), \\ x_{id}^{k + 1} & = x_{id}^{k} + v_{id}^{k + 1} , \\ \end{aligned} $$

where \( d = 1, \ldots ,n \) indexes the dimensions, \( i = 1,2, \ldots ,N \) denotes the \( i \)th particle, \( N \) is the size of the swarm, \( w \) is the inertia weight, \( c \) is a positive constant called the acceleration constant, \( r_{1} ,r_{2} \) are random numbers uniformly distributed in [0,1], and \( k \) denotes the iteration number.
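
The update equations above translate directly into code. The sketch below is a compact PSO implementation (swarm size, iteration count, and coefficient values are illustrative choices, not prescribed by the text), applied to minimizing the sphere function; maximization problems can be handled by negating the objective.

```python
# A compact PSO implementing the velocity and position updates above,
# minimizing the sphere function; parameter values are illustrative.
import numpy as np

def pso(objective, dim=2, n_particles=20, iters=100, w=0.7, c=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))           # positions
    v = np.zeros((n_particles, dim))                     # velocities
    p = x.copy()                                         # personal bests
    p_val = np.apply_along_axis(objective, 1, x)
    g = p[np.argmin(p_val)].copy()                       # global best

    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c * r1 * (p - x) + c * r2 * (g - x)  # velocity update
        x = x + v                                        # position update
        vals = np.apply_along_axis(objective, 1, x)
        improved = vals < p_val
        p[improved], p_val[improved] = x[improved], vals[improved]
        g = p[np.argmin(p_val)].copy()
    return g, objective(g)

best_x, best_f = pso(lambda z: np.sum(z ** 2))           # minimize sum of squares
print(best_x, best_f)                                    # near the origin
```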

Like many other global optimization methods, whether deterministic or evolutionary, PSO suffers from the problem of local optima. The existence of many local optimal solutions makes it difficult for PSO to detect the global optimal solution. In some cases, sub-optimal solutions are acceptable although not desirable, while in others, a global optimal solution is indispensable. The development of robust and efficient methods for avoiding local solutions is the subject of current PSO research.

The stretching technique (Parsopoulos and Vrahatis 2002) has been shown through simulation experiments to provide an effective way for the PSO method to escape local optimal solutions.

The idea behind stretching is to perform a two-stage transformation of the original objective function \( F\left( x \right) \). This transformation can be applied immediately after a local optimal solution \( \bar{x} \) of the function \( F(x) \) has been detected. It was proposed by Parsopoulos and Vrahatis (2002) and is defined as follows:

$$ G(x) = F(x) + \gamma_{1} \left| {x - \bar{x}} \right|({\text{sign}}(F(x) - F(\bar{x})) + 1), $$
(2.20)
$$ H(x) = G(x) + \gamma_{2} \frac{{{\text{sign}}\left( {F(x) - F(\bar{x})} \right) + 1}}{{{ \tanh }\left( {\mu (G(x) - G(\bar{x}))} \right)}} , $$
(2.21)

where \( \gamma_{1} \), \( \gamma_{2} \) and \( \mu \) are arbitrarily chosen positive constants, and \( {\text{sign}}\left( x \right) \) is defined by:

$$ {\text{sign}}\left( x \right) = \left\{ {\begin{array}{*{20}l} { - 1,} \hfill & {{\text{if}}\,x < 0;} \hfill \\ {0,} \hfill & {{\text{if}}\,x = 0;} \hfill \\ {1,} \hfill & {{\text{if}}\,x > 0.} \hfill \\ \end{array} } \right. $$

The first transformation stage, defined in (2.20), elevates the function \( F\left( x \right) \) and eliminates all the local optima that are worse than \( F(\bar{x}) \). The second stage, defined by (2.21), stretches the neighborhood of \( \bar{x} \) upwards, since it assigns higher function values to those points. Neither stage changes the local optima that are better than \( \bar{x} \); thus, the location of the global solution is left unchanged.
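
A direct transcription of the two stages (our sketch; the constants \( \gamma_{1} \), \( \gamma_{2} \) and \( \mu \) take illustrative values here) is:

```python
# The stretching transformations (2.20)-(2.21) as higher-order functions;
# the returned H(x) can replace F(x) in subsequent PSO runs.
import numpy as np

def stretched(F, x_bar, gamma1=1e4, gamma2=1.0, mu=1e-10):
    x_bar = np.asarray(x_bar, dtype=float)
    F_bar = F(x_bar)

    def G(x):                                    # first stage, Eq. (2.20)
        step = np.sign(F(x) - F_bar) + 1
        return F(x) + gamma1 * np.linalg.norm(x - x_bar) * step

    G_bar = G(x_bar)                             # equals F_bar

    def H(x):                                    # second stage, Eq. (2.21)
        step = np.sign(F(x) - F_bar) + 1
        denom = np.tanh(mu * (G(x) - G_bar))
        if denom == 0.0:                         # H is singular at x_bar itself;
            return G(x)                          # this guard is our addition
        return G(x) + gamma2 * step / denom

    return H
```

For any point better than \( \bar{x} \) the factor \( {\text{sign}}(F(x) - F(\bar{x})) + 1 \) vanishes, so such points, including the global minimum, are left untouched.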

Because PSO requires only primitive mathematical operators and is computationally inexpensive in terms of both memory requirements and speed (Parsopoulos and Vrahatis 2002), it has good convergence performance and has been successfully applied in many fields such as neural network training (Zhang et al. 2007), integer programming (Kitayama and Yasuda 2006), multi-objective optimization (Ho et al. 2006), and decision making (Nenortaitė 2007).

8 Summary

This chapter has addressed the basic concepts and models of optimization: linear programming, non-linear programming, multi-objective programming, goal programming, Stackelberg games, and particle swarm optimization were introduced. These concepts, models and solution techniques will be used in the rest of this book.