1 Particle Swarm Optimization

Particle swarm optimization is attributed to Kennedy, Eberhart and Shi [1]. It is an iterative, gradient-free search algorithm inspired by biological swarming such as bird flocking, fish schooling, herding of land animals or the collective behavior of insects. The search starts with a random set (here called a swarm) of solutions (here called particles). The particles travel through a search space and are rated according to a user-defined objective function. Their movements are a function of individual experience and of information acquired from other particles. However, the velocity vectors are not deterministic: the instantaneous strength of social and individual behavior varies randomly for each particle in each iteration. In a basic version of the algorithm the only piece of information shared among the particles is the global best solution found so far. Each particle also stores its own best solution found so far. The velocity and position update rules are as follows

$$ \begin{aligned} \varvec{v}_{j} \left( {i + 1} \right) =&\; c_{1} \varvec{v}_{j} \left( i \right) + c_{2} \,rand\left( {} \right)\left( {\varvec{p}_{j}^{pbest} - \varvec{p}_{j} \left( i \right)} \right) \\ & + c_{3} rand\left( {} \right)\left( {\varvec{p}_{j}^{gbest} - \varvec{p}_{j} \left( i \right)} \right) \\ \end{aligned} $$
(1)
$$ \varvec{p}_{j} \left( {i + 1} \right) = \varvec{p}_{j} \left( i \right) + \varvec{v}_{j} \left( {i + 1} \right), $$
(2)

where \( j \) is the particle identification number, \( i \) denotes the iteration number, \( \varvec{v}_{j} \) and \( \varvec{p}_{j} \) are the speed and position of the \( j \)-th particle, \( \varvec{p}_{j}^{pbest} \) stores the best solution proposed so far by the \( j \)-th particle (pbest), \( \varvec{p}^{gbest} \) denotes the best solution found so far by the swarm (gbest; written \( \varvec{p}_{j}^{gbest} \) in (1), although it is common to all particles in the topology used here), and \( c_{1} \), \( c_{2} \) and \( c_{3} \) are the explorative factor (inertia weight), the individuality factor and the social factor, respectively. Note that the speed \( \varvec{v}_{j} \left( {i + 1} \right) \) in rule (2) should be multiplied by a time increment if it is to represent a physical velocity. However, it is common practice to set the time increment to \( 1 \) and thereafter neglect it in (2). Introducing a different time increment does not influence the behavior of the swarm because the coefficients in (1) then have to be divided by this increment. The search path is not deterministic because the last two terms in (1) are multiplied by the random numbers rand() generated for each particle in each iteration. The random numbers are uniformly distributed in the unit interval. In all experiments described in this chapter, the factors \( c_{1} \), \( c_{2} \) and \( c_{3} \) have been calculated using the constricted PSO formula [1] and are \( 0.73\,,\,0.73 \cdot 2.05 \) and \( 0.73 \cdot 2.05 \), respectively. If the basic velocity update rule does not manifest satisfactory search abilities, numerous refinements are available. A fairly representative survey can be found in [2, 3]. Taking into account the no free lunch theorem for optimization [4], one can conclude that there is no ultimate version of the PSO. The set of modifications should be selected individually, usually by trial and error, to suit a given optimization problem. These modifications range from simple velocity clamping to substantial adjustments of particle communication principles as, e.g., in the fully-informed PSO [5].
Additionally, there are certain modifications dedicated to online optimization in non-stationary environments [6]. The rules of swarm movements can be rearranged to handle multi-objective [7] and/or multi-modal [8] problems. Various neighborhood topologies have been proposed to serve these purposes [9]. Our main goal is to keep the swarm as simple as possible and still effective in the controller tuning tasks encountered in power electronic or drive systems. This chapter deals with offline optimization in such systems, hence the environment is assumed to be stationary even if measurement noise is included in the model. Moreover, all optimizations are performed on numerical models of the plants, therefore the search space does not need to be bounded to a specific territory resulting from safety requirements for the given plant. The numerical model should include all nonlinearities, noises and other parasitic effects crucial from the control point of view. No gradient information is used by the PSO. This gives a lot of freedom to the designer at the problem formulation stage: the mathematical model of the system being optimized can include, e.g., discontinuities. For the same reason, the objective function chosen by the designer can be of any type, and one can focus on formulating this function to be aligned with the desired behavior of the system without worrying about its mathematical properties in terms of complexity and differentiability.
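
The update rules (1) and (2) can be sketched in a few lines of code. The following is only a minimal illustration, assuming NumPy arrays for the swarm state; the function names `pso_step` and `update_bests` and the array layout are assumptions made for this sketch and do not come from the chapter.

```python
import numpy as np

# Constricted-PSO coefficients quoted in the text: c1 = 0.73, c2 = c3 = 0.73 * 2.05.
C1, C2, C3 = 0.73, 0.73 * 2.05, 0.73 * 2.05


def pso_step(p, v, p_pbest, p_gbest, rng):
    """Apply update rules (1) and (2) to the whole swarm at once.

    p, v, p_pbest: arrays of shape (N_p, N_d); p_gbest: array of shape (N_d,).
    """
    n_particles = p.shape[0]
    # One uniform random number per particle and per term (assumption:
    # per-dimension draws are another common variant).
    r2 = rng.random((n_particles, 1))
    r3 = rng.random((n_particles, 1))
    v_new = C1 * v + C2 * r2 * (p_pbest - p) + C3 * r3 * (p_gbest - p)
    p_new = p + v_new  # time increment set to 1, as in (2)
    return p_new, v_new


def update_bests(p, cost, p_pbest, cost_pbest):
    """Refresh pbest of every particle and return the new gbest (minimization)."""
    improved = cost < cost_pbest
    p_pbest = np.where(improved[:, None], p, p_pbest)
    cost_pbest = np.where(improved, cost, cost_pbest)
    return p_pbest, cost_pbest, p_pbest[np.argmin(cost_pbest)]
```

In such a sketch one swarm iteration would consist of a call to `pso_step`, one model run per particle to obtain `cost`, and a call to `update_bests`.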

For all the problems discussed, no special measures have been taken regarding swarm movements. The very basic formula (1) tends to produce implementable solutions. All problems have been handled as single-modal and no explicit multi-objectivity has been introduced. It is common among control practitioners to reduce problems with contradictory objectives to single-objective ones with the help of weighted terms in the objective function. This methodology has also been employed here. The simplest communication topology has been used, i.e. the gbest attracts all particles and no neighborhood operator is present. However, it has been decided to introduce absorbing walls [10] if clear physical or theoretical constraints are identifiable. This means that the speed of a particle is reset to zero if known boundaries are crossed. No other modifications to the standard PSO have been identified as necessary to effectively tune the controllers discussed below.
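
The absorbing-wall rule can be illustrated as follows. This is a sketch under stated assumptions: the velocity is zeroed only in the dimension whose bound was crossed, and unbounded directions are marked with infinite limits. The function name `absorb` and this per-dimension treatment are choices made for the sketch; the text itself only states that the speed is reset to zero.

```python
import numpy as np


def absorb(p, v, lower, upper):
    """Zero the velocity components of particles that crossed a known bound.

    p, v: arrays of shape (N_p, N_d); lower, upper: arrays of shape (N_d,),
    with -np.inf / np.inf marking unbounded search directions.
    """
    outside = (p < lower) | (p > upper)
    # Assumption: only the offending dimension is reset; the pbest and gbest
    # terms in (1) then pull the particle back in the next iteration.
    return np.where(outside, 0.0, v)
```

How a particle that sits outside the bounds is rated in the meantime is not specified by this sketch.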

2 Objective Function

The objective function (also called cost function, energy function, performance index or fitness function) determines the behavior of the optimal controller. Some commonly used performance measures are the integral of squared error (ISE)

$$ J_{\text{ISE}} = \int\limits_{0}^{\infty } {e^{2} \left( t \right){\kern 1pt} {\rm{d}}t,} $$
(3)

where \( e\left( t \right) \) denotes control error, or the generalized ISE

$$ J_{\text{IGSE}} = \int\limits_{0}^{\infty } {\left( {e^{2} \left( t \right) + \alpha \dot{e}^{2} \left( t \right)} \right){\kern 1pt} {\rm{d}}t,} $$
(4)

where \( \alpha \) is a subjective weighting factor, or the integral of squared error and derivative of control effort

$$ J_{\text{ISEDCE}} = \int\limits_{0}^{\infty } {\left( {e^{2} \left( t \right) + \beta \dot{u}^{2} \left( t \right)} \right){\kern 1pt} {\rm{d}}t,} $$
(5)

where \( u\left( t \right) \) denotes control effort (control signal) and \( \beta \) is again a subjective weighting factor. In particular, (3) owes its popularity to producing problems that are easy to approach analytically for some classes of control systems ([11] might serve as an example). However, in a gradient-free optimization the performance index can be chosen freely to reflect the desired behavior of the system in the best possible way. It has been decided that for all studied cases the performance indices will be positive definite and the optimization problem will be of the minimization type. The philosophy that each signal contributes to the performance index through its mean squared value has been kept in all discussed tuning procedures. Nevertheless, to promote sometimes contradictory behaviors such as fast transients and no chattering at steady state, additional functions are introduced into the performance index definition, allowing for selective contribution to the overall value. For example, the dynamics of the control effort is not penalized during a specific time interval after a change in the reference signal, whereas this penalty is non-zero at steady state. These time windows have to be carefully chosen for each term in the performance index according to the physical limitations of the plant. Also, it is common practice to add terms that take into account overshooting or crossing acceptable levels of the control signal. The nature of the PSO enables the designer to work with any form of performance index. However, our main goal is to keep this stage simple without sacrificing the performance of the resulting system. All performance measures proposed here have the form

$$ \begin{aligned} J_{c} = \frac{{T_{s} }}{{t_{\text{stop}} }}\sum\limits_{k = 1}^{{\frac{{t_{\text{stop}} }}{{T_{s} }}}} & (f_{1} \left( k \right)\varvec{e}_{y}^{T} \left( k \right)\varvec{e}_{y} \left( k \right) + f_{2} \left( k \right)\varDelta \varvec{u}^{T} \left( k \right)\varDelta \varvec{u}\left( k \right) \\ & + f_{3} \left( k \right)\varvec{u}_{\text{aux}}^{T} \left( k \right)\varvec{u}_{\text{aux}} \left( k \right) + f_{4} \left( k \right)\varvec{y}_{\text{aux}}^{T} \left( k \right)\varvec{y}_{\text{aux}} \left( k \right)), \\ \end{aligned} $$
(6)

where \( \varvec{e}_{\text{y}} \), \( \varvec{u} \), \( \varvec{u}_{\text{aux}} \) and \( \varvec{y}_{\text{aux}} \) are vectors (in the MIMO case, e.g. as in three-phase converters) containing control errors, control signals, auxiliary signals from the controller and auxiliary signals from the plant, respectively (as depicted in Fig. 1). The discrete representation of the performance measure has been chosen to correspond to the assumed digital implementation of a control system. All signals are sampled with the period \( T_{s} \). The \( t_{\text{stop}} \) denotes the assessment test time for the particle (equal for all particles). Note that the infinite integration limit commonly used in the analytical approach has to be changed to a finite test time if the performance evaluation is done using signal samples recorded during a numerical simulation or a physical experiment. The functions \( f_{1} \), \( f_{2} \), \( f_{3} \) and \( f_{4} \) are bivalent functions with zero value for intervals with no penalty for a given behavior and a positive value for intervals with a penalty for this behavior; the positive value is usually different for each function, with one of them set to 1. These intervals are correlated with the reference test signal(s) and the test disturbance(s). In some designs they depend on states of the system when, e.g., an additional penalty for overshooting is needed. It should be stressed that the bivalency is assumed here to make the design process more intuitive. From now on the bivalent functions will be referred to as switching functions. The resulting system is optimal for a given shape of the reference and disturbance signals. That is why it is crucial to design a test scenario that includes a representative set of anticipated system states.
The scaling by the reciprocal of the number of samples present in (6) does not influence the optimization process and is introduced solely to make the value of the performance index easier to interpret as the mean value of the sum of squares. The test reference signal(s) should take into account the physical limits of a plant, e.g. the available acceleration. Otherwise, a dominant contribution to the cost function value coming from demanded behavior outside the physical limits makes determining upper values for the switching functions significantly more difficult. It is common practice to implement ramps and s-ramps as reference models for speed or position in electric drive systems. This limits the first and second derivative of the reference signal, respectively. It is also practical to use first and second order lag elements if these derivatives are expected to be limited. An example is shown in Figs. 2 and 3. The step reference signal should be avoided in such performance index based assessment tests because it does not reflect most real-life applications. For example, the s-ramp speed reference model is frequently used in drivetrain systems to limit the jerk, which is important for the lifetime of the mechanical part of the system and for the comfort of its users, e.g. passengers of a vehicle.
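
A possible way to evaluate the performance index (6) from recorded samples is sketched below. The function name `performance_index`, the array layout and the treatment of the first control increment are assumptions made for illustration only.

```python
import numpy as np


def performance_index(e_y, u, u_aux, y_aux, f1, f2, f3, f4):
    """Evaluate J_c from (6) for signals recorded over one test run.

    Signal arrays have shape (N, n_signals) with N = t_stop / T_s samples;
    f1..f4 are switching-function arrays of shape (N,).
    """
    # Delta u(k) = u(k) - u(k-1); the first increment is taken as zero (assumption).
    du = np.diff(u, axis=0, prepend=u[:1])

    def sq(x):
        # x^T(k) x(k) for every sample k
        return np.sum(x * x, axis=1)

    terms = f1 * sq(e_y) + f2 * sq(du) + f3 * sq(u_aux) + f4 * sq(y_aux)
    return terms.mean()  # T_s / t_stop equals 1 / N
```

Because the switching functions enter only as per-sample weights, changing a penalty window amounts to editing the corresponding array and leaves the evaluation code untouched.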

Fig. 1 Selected signals for performance calculation

Fig. 2 Reference signal shaped by using rate limiter (a) and first order lag element \( G\left( s \right) = \frac{1}{s\tau + 1} \) with \( \tau = 1 \) s fed by the step signal (b)

Fig. 3 Reference signal shaped directly as a quadratic spline (a) and generated using second order lag element \( G\left( s \right) = \frac{1}{{\left( {s\tau_{1} + 1} \right)\left( {s\tau_{2} + 1} \right)}} \) with \( \tau_{1} = \tau_{2} = 1 \) s fed by the step signal (b)
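
The two reference shapers named in the caption of Fig. 2 can be sketched as follows. This is an illustrative discretization only: the function names, the zero initial condition and the choice of an exact zero-order-hold discretization for the lag element are assumptions.

```python
import numpy as np


def rate_limited(step_level, max_rate, t_s, n):
    """Ramp reference: a step passed through a rate limiter."""
    ref = np.empty(n)
    y = 0.0
    for k in range(n):
        # Move towards the step level by at most max_rate * t_s per sample.
        y += np.clip(step_level - y, -max_rate * t_s, max_rate * t_s)
        ref[k] = y
    return ref


def first_order_lag(step_level, tau, t_s, n):
    """Step filtered by G(s) = 1 / (s*tau + 1), zero-order-hold discretization."""
    a = np.exp(-t_s / tau)
    ref = np.empty(n)
    y = 0.0
    for k in range(n):
        y = a * y + (1.0 - a) * step_level
        ref[k] = y  # value at time (k + 1) * t_s
    return ref
```

Cascading two such lag elements would give a second order lag of the kind named in the caption of Fig. 3.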

It is common that in the early stage of the search many particles cannot be rated using (6) because the simulation stops before reaching the assumed \( t_{\text{stop}} \) due to numerical problems. The simulation is also stopped intentionally before \( t_{\text{stop}} \) if states of the plant reach levels that are unacceptable from the physical implementation point of view. For some search problems the particles do not directly carry values of parameters of the model; instead, those values are calculated from the values stored in the particle (see Sect. 6). It is then possible that the identification of poor solutions may take place even before running the numerical model of the system. Tests have shown that rating all such particles as equally poor may impair the swarm’s capability to keep a good balance between exploration and exploitation. Any functions can be used to rate particles in the event that (6) is not applicable, as long as they preserve logical order and their codomains do not overlap. The idea is illustrated in Fig. 4.
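
One hypothetical way to keep failed runs ordered is sketched below. It is not the rating scheme of Fig. 4, which is not reproduced in this text; the bound `j_max`, the function name `rate_particle` and the use of the reached simulation time as the ordering criterion are all assumptions made for illustration.

```python
def rate_particle(j_c, t_reached, t_stop, j_max=1e6):
    """Return a cost that stays ordered even when (6) cannot be evaluated.

    j_c: value of (6), or None if the run stopped early;
    t_reached: simulated time reached before the stop;
    j_max: hypothetical bound assumed to exceed every attainable J_c.
    """
    if j_c is not None and t_reached >= t_stop:
        # Completed run: cost stays within [0, j_max].
        return min(j_c, j_max)
    # Early stop (t_reached < t_stop): cost in (j_max, 2 * j_max],
    # decreasing the longer the simulation survived.
    return j_max * (2.0 - t_reached / t_stop)
```

With such a rule, any particle that completes the test is rated better than any particle that does not, while failed particles are still distinguishable from one another.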

Fig. 4 The methodology of distinguishing particles’ performance outside the area covered by the definition of \( J_{c} \)

The PSO itself puts a minor computational burden on the optimization procedure. The time needed to complete the optimization is dictated by the wall-clock time required to simulate the numerical model of the system. Not surprisingly, the number of needed simulations depends highly on the form of the objective function. Some guidelines on choosing a good number of particles and iterations are given in [12]. It has been verified that \( 50 \) particles and \( 100 \) swarm iterations suffice in most of the problems discussed here. However, this implies up to 5,000 runs of the model. This could be problematic if the performance of, e.g., a drive with a pulse width modulated (PWM) converter is to be assessed. Usually, a simulation step size two orders of magnitude lower than the controller sampling time is required to obtain trustworthy numerical results. In order to tackle such problems in a reasonable wall-clock time, one will need processing capacity extending far beyond that offered by today’s personal computers. On the other hand, many controller tuning tasks encountered in power electronics and drives deal with plant natural frequencies significantly lower than the PWM frequencies used. In such cases it is reasonable to neglect discontinuities introduced by the modulator and simplify the converter to a linear amplifier with a delay. The resulting model usually produces reliable numerical results for a simulation step size equal to the controller sampling time. This has helped to reduce the computational complexity of the experiments presented below to levels resulting in several-hour-long tuning procedures. The number of swarm iterations is always a subjective choice. Even though various indices have been elaborated for assessing search progress, none of them is free from a subjective choice of thresholds. A fairly representative set of stopping criteria has been described in [13]. The thresholds are usually determined using the guess and check method.
For all following optimization problems the number of swarm iterations has been set arbitrarily to meet assumed wall-clock time constraints. However, it is advisable to monitor swarm diversity variations, which can be helpful in the detection of ill-posed problems. A well-established measure of diversity incorporates the Euclidean distance of particles to the mean and is defined, see e.g. [14], as follows

$$ D_{\text{dist}} = \frac{1}{{N_{p} \sqrt {N_{d} } }}\sum\limits_{j = 1}^{{N_{p} }} \sqrt {\sum\limits_{n = 1}^{{N_{d} }} \left( {p_{jn} - \overline{p}_{n} } \right)^{2} } , $$
(7)

where \( N_{p} \) is the swarm size, \( N_{d} \) is the dimensionality of the problem and \( \overline{p} \) is the average point. Originally this diversity measure uses the length of the longest diagonal in the search space instead of \( \sqrt {N_{d} } \). However, the original definition cannot be applied to swarms with at least one unbounded search direction, which is the case in all systems discussed here. If more insight into separate dimensions is needed, a slightly different diversity measure can be used. It contains standard deviations of proposed solutions per dimension, which can also be merged into a single formula by calculating their mean value

$$ D_{\text{std}} = \frac{1}{{N_{d} }}\sum\limits_{n = 1}^{{N_{d} }} \sqrt {\frac{1}{{N_{p} }}\sum\limits_{j = 1}^{{N_{p} }} \left( {p_{jn} - \overline{p}_{n} } \right)^{2} } . $$
(8)

By monitoring the evolution of inner sums in (8) one can detect dimensions with poor convergence. If the swarm does not calm down in selected dimensions, it could suggest that the problem is ill-posed for these dimensions. Such a lack of convergence in selected dimensions can also be intensified by measurement and system noises. Possible solutions are rethinking parameters to be optimized and/or redefining the fitness function.
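
Both diversity measures are straightforward to compute from the matrix of particle positions. The sketch below assumes a NumPy array of shape \( N_{p} \times N_{d} \); the function names are illustrative and not taken from the chapter.

```python
import numpy as np


def diversity_dist(p):
    """D_dist from (7): mean distance to the swarm centre, scaled by sqrt(N_d)."""
    n_p, n_d = p.shape
    dist = np.linalg.norm(p - p.mean(axis=0), axis=1)
    return dist.sum() / (n_p * np.sqrt(n_d))


def std_per_dimension(p):
    """Inner terms of (8): population standard deviation in each dimension."""
    return np.sqrt(np.mean((p - p.mean(axis=0)) ** 2, axis=0))


def diversity_std(p):
    """D_std from (8): mean of the per-dimension standard deviations."""
    return std_per_dimension(p).mean()
```

Logging the output of `std_per_dimension` at every iteration gives one curve per tuned parameter, which is the kind of per-dimension view the text recommends for spotting poorly converging dimensions.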

3 Simultaneous Tuning of Cascaded PI Position and Speed Controllers

There exist numerous analytical and experimental methods for effective PID tuning in a cascaded position, speed and torque control system frequently used in servo drives. To mention just some of them: modulus and symmetrical optimum methods (Kessler’s criteria) [15], the Ziegler-Nichols method [16] with different tuning charts, e.g. the Pessen recipe, the Seborg et al. recipes (some-overshoot rule, no-overshoot rule) and the Tyreus-Luyben tuning chart. In most cases the resulting control quality is sufficient and there is no need for more elaborate tuning procedures. The example of PSO for a BLDC servo drive serves here only for illustrative purposes. However, some easily identifiable advantages of the evolutionary gradientless optimization are present in contrast to the abovementioned methods: the optimizer can work with any user-defined performance index, and simultaneous tuning of more than one of the cascaded controllers is possible. These principles have already been used in [17] for optimizing a cascaded PI control structure with respect to the \( H_{\infty } \) norm. Moreover, the process and the control system can be modeled in any drag-and-drop environment and can include all crucial nonlinearities, i.e. controller saturation and anti-windup, and parasitic effects such as measurement noise.

The dynamics of a three-phase brushless DC machine can be numerically modeled using the following mathematical description

$$ \left\{ {\begin{array}{*{20}c} {u_{a} (t) = L\frac{{{\text{d}}i_{a} (t)}}{{{\text{d}}t}} + Ri_{a} (t) + e_{a} \left( {\alpha_{e} (t),\omega_{m} (t)} \right)} \\ {u_{b} (t) = L\frac{{{\text{d}}i_{b} (t)}}{{{\text{d}}t}} + Ri_{b} (t) + e_{b} \left( {\alpha_{e} (t),\omega_{m} (t)} \right)} \\ {u_{c} (t) = L\frac{{{\text{d}}i_{c} (t)}}{{{\text{d}}t}} + Ri_{c} (t) + e_{c} \left( {\alpha_{e} (t),\omega_{m} (t)} \right),} \\ \end{array} } \right. $$
(9)

where

$$ \left. {e_{x} \left( {\alpha_{e} (t),\omega_{m} (t)} \right) = K_{\text{BLDC}} k_{ex} \left( {\alpha_{e} (t)} \right)\omega_{m} (t)} \right|_{x = a,b,c} . $$
(10)

The electromagnetic torque produced in the machine can be calculated using the formula

$$ T_{e} \left( t \right) = \frac{{\sum\limits_{x = a,b,c} e_{x} \left( {\alpha_{e} \left( t \right),\omega_{m} \left( t \right)} \right)i_{x} \left( t \right)}}{{\omega_{m} \left( t \right)}}, $$
(11)

where \( e_{x} \left( {x = a,b,c} \right) \) is the phase back-EMF voltage, \( K_{\text{BLDC}} \) is the back-EMF constant, \( k_{ex} \left( {x = a,b,c} \right) \) is the ideal trapezoidal shape function, \( L \) and \( R \) denote the stator inductance and resistance, \( \alpha_{e} = p\alpha_{m} \) is the electrical rotor angle equal to the mechanical rotor angle \( \alpha_{m} \) multiplied by the number of pole pairs \( p \), and \( \omega_{m} \) is the rotor angular speed. If the motor is electronically commutated in such a way that the current flows only through two phases, its dynamics can be modeled using equations similar to the ones describing a brushed DC machine

$$ u_{\text{BLDC}} (t) = 2L\frac{{{\text{d}}i_{\text{BLDC}} \left( t \right)}}{{{\text{d}}t}} + 2Ri_{\text{BLDC}} \left( t \right) + 2K_{\text{BLDC}} \omega_{m} \left( t \right) $$
(12)
$$ T_{e} \left( t \right) = 2K_{\text{BLDC}} i_{\text{BLDC}} \left( t \right) $$
(13)

accompanied by Newton’s law for rotation

$$ T_{e} \left( t \right) = J\frac{{{\text{d}}\omega_{m} \left( t \right)}}{{{\text{d}}t}} + T_{\text{load}} \left( t \right) + F_{v} \omega_{m} \left( t \right), $$
(14)

where \( T_{\text{load}} \) is the load torque and \( F_{v} \) is the viscous friction coefficient. The BLDC servo drive considered here consists of a hypothetical converter-fed motor equipped with a torque/current PI control loop, tuned with the help of the modulus optimum method, and cascaded speed and position PI controllers. Parameters of the drive are given in Table 1. All controllers include the standard anti-windup algorithm depicted in Fig. 5. Input signals of the speed and torque controllers have per-unit values with the motor nominal values as the base ones. The input of the position controller is left unscaled. The model of the servo drive is then connected to the PSO (Fig. 6) and a performance index is defined as follows

Table 1 Parameters of the torque controlled BLDC drive

Fig. 5 Discrete implementation of a PI controller with saturation and anti-windup (conditional integration algorithm): \( k_{P} \) is the proportional path gain, \( k_{I} \) the integral path gain, \( T_{s} \) the controller sampling time, and \( u_{ \hbox{min} } \) and \( u_{ \hbox{max} } \) are the minimal and maximal control signal levels
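
A discrete PI controller with saturation and conditional integration could look like the sketch below. Since the block diagram of Fig. 5 is not reproduced in this text, the particular condition used to freeze the integrator is an assumption (one common variant of conditional integration), as are the class and attribute names.

```python
class DiscretePI:
    """Discrete PI controller with output saturation and conditional integration."""

    def __init__(self, k_p, k_i, t_s, u_min, u_max):
        self.k_p, self.k_i, self.t_s = k_p, k_i, t_s
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0

    def step(self, error):
        # Tentatively integrate, then check whether the output saturates.
        candidate = self.integral + self.k_i * self.t_s * error
        u = self.k_p * error + candidate
        u_sat = min(max(u, self.u_min), self.u_max)
        # Accept the new integrator state only if the output is not saturated
        # or the error drives the output back towards the linear range.
        if u == u_sat or (u - u_sat) * error < 0.0:
            self.integral = candidate
        return u_sat
```

In a cascaded structure, the saturated output of the outer controller would serve as the reference of the next inner loop.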

Fig. 6 The BLDC servo drive connected to the PSO system

$$ \begin{aligned} J_{c}^{\text{BLDC}} = \frac{{T_{s} }}{{t_{\text{stop}} }}\sum\limits_{k = 1}^{{\frac{{t_{\text{stop}} }}{{T_{s} }}}} (f_{1}^{\text{BLDC}} \left( k \right)e_{\alpha }^{2} \left( k \right) & + f_{2}^{\text{BLDC}} \left( k \right)\left( {\varDelta T_{e}^{\text{ref}} \left( k \right)} \right)^{2} \\ & + f_{3}^{\text{BLDC}} \left( k \right)e_{\alpha }^{2} \left( k \right)), \\ \end{aligned} $$
(15)

where the function \( f_{1}^{\text{BLDC}} \) introduces a penalty for position control error after a time equal to the theoretical time needed to travel a test angle under the assumption that the only physical limits are those related to the maximal absolute values of electromagnetic torque and angular speed, the function \( f_{2}^{\text{BLDC}} \) switches off the penalty for reference torque variations during transients forced by the test reference speed and test disturbance torque, and the function \( f_{3}^{\text{BLDC}} \) penalizes overshoot.

For the PI controller tuning task discussed here, each particle is a vector of candidate settings for both controllers. These settings can be explicitly stored as the vector components. However, tests have shown that in some tuning problems it is easier and sometimes also more effective to perform the search on a logarithmic scale. For the purpose of this search, the particle is a vector

$$ \varvec{p}^{\text{BLDC}} = \left[ {{ \log }_{10} k_{{{\text{P}}\omega }} ,{ \log }_{10} k_{{{\text{I}}\omega }} ,{ \log }_{10} k_{{{\text{P}}\alpha }} ,{ \log }_{10} k_{{{\text{I}}\alpha }} } \right], $$
(16)

where \( k_{{{\text{P}}\omega }} \), \( k_{{{\text{I}}\omega }} \), \( k_{{{\text{P}}\alpha }} \) and \( k_{{{\text{I}}\alpha }} \) are the controllers’ gains. The swarm consists of 30 particles and is stopped arbitrarily after 100 iterations. The diversity (7) and the standard deviations present in (8) are inspected visually. In the case of any stochastic search algorithm each search attempt is unique due to the presence of random variables in the speed update rule (1). It is recommended to repeat the search several times to be able to assess how well conditioned the problem is. If the search process is repeatable in terms of the final position of the swarm, the optimization task is well-posed. For illustrative purposes selected iterations are shown in Figs. 7, 8, 9. The evolution of the performance index \( J_{c}^{\text{BLDC}} \) for the gbest solution is depicted in Fig. 10. However, the most informative are the standard deviations shown in Fig. 11. They clearly indicate that the solutions proposed by the swarm for the quadruple of controller gains are convergent except for the gain \( k_{{{\text{I}}\alpha }} \) in the integral path of the position controller. The reason for this is that in the test scenario used for assessing a particle the dominant term comes from a steady state error for a constant reference. The physical integration present in the plant, since the angular position controlled in the outer loop is the integral of the speed controlled in the inner loop, is then sufficient to accomplish the objective. It should be noted that the proposed gbest for \( k_{{{\text{I}}\alpha }} \) is near zero (see Fig. 9). The dynamics of the system for the gbest after 100 search iterations is illustrated in Fig. 12. The swarm has identified that the assumed controller topology is excessive for this task. The behavior of the swarm illustrates its ability to suggest to the designer potential further refinements of the assumed topology.
This can be especially helpful when a controller for a gray-boxed or a black-boxed plant has to be designed. Obviously, there is a possibility to extend such a random search into simultaneous topology selection and gain tuning. Two typical situations could be addressed: optimization of a semi-fixed-structure controller or selecting the best fixed-structure controller from a set of predefined structures. The former refers, e.g., to determining the number of neurons in a neurocontroller. The control structure is fixed in the sense that the type of artificial neural network is assumed to be fixed, and the only decision variable related to the structure is the number of hidden neurons. On the other hand, the latter refers to a situation where a set of structurally different controllers is tested by the particles, i.e. one entry of the particle vector is a pointer to a set of predefined structures. This is especially useful for black-boxed plants when little or no information about the dynamics of the process is available. The designer can then define a set of potentially applicable control structures: cascaded PIs, augmented-state feedback controllers, various neurocontrollers, iterative learning controllers, repetitive controllers, etc. The PSO will then find the most suitable one for a given black-boxed process.
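
Decoding a particle of the form (16) into controller gains before each model run is a one-line operation. The sketch below is illustrative; the function name and the dictionary keys are assumptions.

```python
import numpy as np


def decode_particle(p_bldc):
    """Map a particle (16), which stores log10 of the gains, to the gains themselves."""
    k_p_omega, k_i_omega, k_p_alpha, k_i_alpha = 10.0 ** np.asarray(p_bldc, dtype=float)
    return {
        "k_P_omega": k_p_omega,  # speed controller, proportional path
        "k_I_omega": k_i_omega,  # speed controller, integral path
        "k_P_alpha": k_p_alpha,  # position controller, proportional path
        "k_I_alpha": k_i_alpha,  # position controller, integral path
    }
```

With this encoding every decoded gain is strictly positive, so a gain that the swarm drives towards zero shows up as a large negative particle entry rather than as zero itself.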

Fig. 7 Initial particle positions (the larger black dot denotes gbest)

Fig. 8 Position of particles after five iterations

Fig. 9 Position of particles after 100 iterations (swarm near its equilibrium)

Fig. 10 Evolution of the performance index \( J_{c}^{\text{BLDC}} \) for gbest solution during the search

Fig. 11 Evolution of standard deviations during the BLDC servo drive tuning

Fig. 12 Response of the BLDC servo drive tuned by the PSO

4 Adaptive Online Trained Speed Neurocontroller

Nonlinear and adaptive speed controllers are often used to cope with inertia variations present in many applications. A robotic arm or an electric vehicle carrying various loads can serve as examples. A direct method assumes the introduction of an inertia estimator into the system (see e.g. [19]). In indirect methods no inertia is estimated explicitly. These methods usually take advantage of introducing nonlinearity into the controller with the intention of reducing sensitivity to variations of plant parameters. Fuzzy logic (FL), artificial neural networks (ANN) and their combinations are commonly used for the implementation of nonlinear controllers. This nonlinearity can be static, i.e. determined in an offline optimization procedure, or can be tuned continuously during the regular operation of a drive. Fairly representative examples of online and offline trained neurocontrollers can be found in [20–28]. The online trained neurocontrollers offer a natural capability of adaptation. A learning algorithm is kept active during regular operation of the drive [29, 30]. It has already been identified that the resilient backpropagation (Rprop) algorithm possesses properties that are especially useful if real-time training is considered. The algorithm is less sensitive to noise in comparison to the original backpropagation algorithm because it takes into account only the sign of the gradient. In many tests the Rprop outperforms other first-order training algorithms in terms of convergence [31]. An Rprop modification known as the Rprop with weight-backtracking [31] has been selected in this study. Its pseudocode is as follows:

where \( E \) is the cost function (usually MSE), \( w_{ij} \) is the weight of a neural connection, \( \delta_{ \hbox{min} } \) and \( \delta_{ \hbox{max} } \) are allowable minimal and maximal absolute values of \( \varDelta w_{ij} \), \( \delta_{ij} \) is the current weight change, \( \eta^{ - } \) and \( \eta^{ + } \) are decrease and increase factors for \( \delta_{ij} \).
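
The chapter's own pseudocode listing is not reproduced in this text. As a stand-in, the sketch below follows the commonly published form of Rprop with weight-backtracking, using the symbols defined above. The function name, the `state` dictionary and the vectorized NumPy formulation are assumptions made for illustration.

```python
import numpy as np


def rprop_plus_step(w, grad, state, eta_minus, eta_plus, delta_min, delta_max):
    """One Rprop step with weight-backtracking on a weight array w.

    state holds 'grad' (previous gradient), 'delta' (step sizes) and
    'dw' (previous weight changes), all with the shape of w.
    """
    sign_change = state["grad"] * grad
    grow, shrink = sign_change > 0, sign_change < 0

    # Adapt the step sizes: enlarge on a stable sign, reduce on a sign flip.
    delta = state["delta"].copy()
    delta[grow] = np.minimum(delta[grow] * eta_plus, delta_max)
    delta[shrink] = np.maximum(delta[shrink] * eta_minus, delta_min)

    # Regular move against the gradient sign; where the sign flipped,
    # revert the previous weight change instead (backtracking).
    dw = np.where(shrink, -state["dw"], -np.sign(grad) * delta)
    # Zeroing the stored gradient prevents a second adaptation in the next step.
    grad_stored = np.where(shrink, 0.0, grad)

    state.update(grad=grad_stored, delta=delta, dw=dw)
    return w + dw
```

Only the sign of `grad` influences the size of the move, which is the property the text credits for the reduced sensitivity to noise.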

If a speed control task is considered and no repetitiveness of this process is assumed (see [32] for more details), the cost function is as follows

$$ E_{\text{ANN}}^{\text{SPEED}} \left( k \right) = \frac{1}{2}\left( {\omega_{m}^{\text{ref}} \left( k \right) - \omega_{m} \left( k \right)} \right)^{2} . $$
(17)

Tests in several different systems have shown that the most crucial settings are \( \delta_{ \hbox{max} } \), \( \eta^{ - } \) and \( \eta^{ + } \) when the training process is equivalent to the control task. Other parameters, such as the number of neurons, the length of the tapped delay line (TDL) or \( \delta_{ \hbox{min} } \), are easy to tune using the guess and check method. The latter can usually be set to zero or to a very small positive value. Some recommendations on potentially working settings of \( \delta_{ \hbox{max} } ,\,\eta^{ - } \) and \( \eta^{ + } \) are available in the literature (e.g. [31]). They are reported as suitable for selected offline benchmarks. It was verified that these recommendations cannot be easily extended to online tasks. This only shows that any optimization task is always problem specific (see Sect. 2) and the parameters of the Rprop have to be adjusted for a given drive system. Some level of automation can be easily achieved by employing swarm-based optimization as reported in [33].

For illustrative purposes the controller shown in Fig. 13 has been implemented in a hypothetical drive (Fig. 14) suitable for a passenger city car (with assumed total mass variations of \( 1500 \pm 300\,{\text{kg}} \)) and then has been optimized using a user-defined performance index. Each particle is a vector

Fig. 13 Topology of the speed neurocontroller for normalized signals with \( \omega_{\text{mN}} \) and \( T_{\text{eN}} \) denoting nominal speed and torque values

Fig. 14 Adaptive neurocontroller as a part of a vehicle’s control system (FFNN stands for the feed-forward neural network)

$$ \varvec{p}^{\text{SPEED}} = \left[ {\eta^{ - } ,\eta^{ + } ,\delta_{ \hbox{max} } } \right] $$
(18)

of candidate settings for the Rprop adaptation rule. There exist clear search boundaries resulting from the Rprop itself. Absorbing walls have been introduced to limit the search to acceptable regions and they are as follows

$$ \left\{ {\begin{array}{*{20}c} {0 < \eta^{ - } < 1} \hfill \\ {\eta^{ + } > 1} \hfill \\ {\delta_{ \hbox{max} } > 0} \hfill \\ \end{array} } \right. $$
(19)
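One common reading of an absorbing wall is that a particle leaving the feasible region is placed back on the boundary and the offending velocity component is set to zero. The sketch below applies this to the bounds (19). The helper name `apply_absorbing_walls` and the small margin `EPS`, which turns the strict inequalities into closed bounds, are assumptions made for illustration; no upper bounds are assumed for \( \eta^{ + } \) and \( \delta_{ \hbox{max} } \) because (19) does not state any.

```python
EPS = 1e-6            # assumed margin replacing the strict inequalities in (19)
INF = float("inf")

# Coordinate order follows (18): [eta_minus, eta_plus, delta_max]
LOWER = [EPS, 1.0 + EPS, EPS]
UPPER = [1.0 - EPS, INF, INF]


def apply_absorbing_walls(position, velocity, lower, upper):
    """Put out-of-range coordinates on the wall and zero their velocity."""
    new_position, new_velocity = [], []
    for p, v, lo, hi in zip(position, velocity, lower, upper):
        if p < lo:
            p, v = lo, 0.0
        elif p > hi:
            p, v = hi, 0.0
        new_position.append(p)
        new_velocity.append(v)
    return new_position, new_velocity
```

The function would be called after the position update (2) in every iteration, so that the particle stays on the wall until the pbest and gbest attraction terms pull it back into the feasible region.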

The swarm consists of 27 particles rated according to the following performance index

$$ \begin{aligned} J_{c}^{\text{SPEED}} = \frac{{T_{s} }}{{t_{\text{stop}} }}\sum\limits_{k = 1}^{{\frac{{t_{\text{stop}} }}{{T_{s} }}}} \left( {f_{1}^{\text{SPEED}} \left( k \right)e_{\omega }^{2} \left( k \right)} \right. & + f_{2}^{\text{SPEED}} \left( k \right)\left( {\varDelta T_{e}^{\text{ref}} \left( k \right)} \right)^{2} \\ & + \left. {f_{3}^{\text{SPEED}} \left( k \right)e_{\omega }^{2} \left( k \right)} \right), \\ \end{aligned} $$
(20)

where \( f_{3}^{\text{SPEED}} \) detects overshooting. The performance index has been changed in comparison to the one proposed in [33] to test whether a performance index similar to the one used in Sect. 4 can produce satisfactory results. In the previous work, expert knowledge about the controller was incorporated into the performance index definition. Here this knowledge is neglected, i.e. no special measures related to the random initial weights of the ANN are taken during the rating. The system is treated as a black box. Variations of the vehicle inertia have been included in the test scenario to optimize the controller for anticipated operating conditions. The evolution of the swarm is illustrated in Fig. 15. The performance of the drive along with the switching functions used in (20) is shown in Fig. 16. The speed overshoot visible in Fig. 16 during the first acceleration results from the random initial weights of the neurocontroller. The training procedure needs some speed transients to be able to identify the dynamics of the plant. After the first braking of the vehicle, no further significant overshoots occur.
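As a rough illustration of how a particle could be rated, the sketch below evaluates the performance index (20) from logged signals. The function name and the representation of the switching functions as plain per-sample sequences are assumptions; how \( f_{1}^{\text{SPEED}} \), \( f_{2}^{\text{SPEED}} \) and \( f_{3}^{\text{SPEED}} \) are actually generated is not reproduced here.

```python
def speed_performance_index(e_omega, delta_t_ref, f1, f2, f3):
    """Evaluate (20) for equally long sequences of samples.

    e_omega      speed control error e_omega(k)
    delta_t_ref  increment of the reference torque, Delta T_e^ref(k)
    f1, f2, f3   values of the switching functions at each sample
    """
    n = len(e_omega)  # n = t_stop / T_s, so T_s / t_stop = 1 / n
    total = 0.0
    for k in range(n):
        e2 = e_omega[k] ** 2
        total += f1[k] * e2 + f2[k] * delta_t_ref[k] ** 2 + f3[k] * e2
    return total / n
```

Because the sum is divided by the number of samples, simulations of different length produce comparable values of the index.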

Fig. 15 Swarm position after the 1st, 5th and 50th iteration; the larger black dot denotes gbest

Fig. 16 Response of the drive with the speed neurocontroller tuned by the PSO: the moment of inertia drops by 20 % at \( 60 \) s and rises by 20 % at \( 100 \) s in comparison to the value set for the first \( 60 \) s (assumed total mass variations \( 1{,}500 \pm 300\;{\text{kg}} \))

5 Augmented Full-State Feedback Controller for a Three-Phase Inverter

Fig. 17 Three-phase four-leg inverter with LC filter and augmented full state feedback controller

Oscillatory controllers have proven to be one of the highest-performance alternatives for AC voltage control in many applications, including grid converters, active power filters and sine wave inverters. Examples of such solutions are described in [34–38]. The solution considered here relies solely on the LQR design method and has been proposed in [39]. A full-state feedback controller with a state vector augmented to include integral plus multiple oscillatory actions has been implemented in the \( dq0 \) rotating reference frame with all its gains calculated in one pass using the LQR approach. The LQR design method is known to deliver good performance in practical systems. The resulting system is inherently stable and the controller is relatively simple to code. The procedure comes down to preparing a state-space description of an augmented system (plant plus auxiliary controller states), setting the weighting matrices in the quadratic cost function and calling a function that solves the resulting optimization problem by means of the discrete-time algebraic Riccati equation (DARE). A three-phase four-leg inverter with an output LC filter, depicted in Fig. 17, is considered as the plant to be controlled. Its mathematical model in the \( dq0 \) rotating reference frame is as follows

$$ \frac{\text{d}}{{{\text{d}}t}}\varvec{x}_{f} \left( t \right) = \varvec{A}_{f}^{\text{cont}} \varvec{x}_{f} \left( t \right) + \varvec{B}_{f}^{\text{cont}} \varvec{u}\left( t \right) + \varvec{E}_{f}^{\text{cont}} z\left( t \right), $$
(21)

where

$$ \varvec{A}_{f}^{\text{cont}} = \left[ {\begin{array}{*{20}c} { - \frac{{R_{f} }}{{L_{f} }}} & {\omega_{1} } & 0 & { - \frac{{k_{i} }}{{k_{u} L_{f} }}} & 0 & 0 \\ { - \omega_{1} } & { - \frac{{R_{f} }}{{L_{f} }}} & 0 & 0 & { - \frac{{k_{i} }}{{k_{u} L_{f} }}} & 0 \\ 0 & 0 & { - \frac{{R_{f} + 3R_{n} }}{{L_{f} + 3L_{n} }}} & 0 & 0 & { - \frac{{k_{i} }}{{k_{u} (L_{f} + 3L_{n} )}}} \\ {\frac{{k_{u} }}{{k_{i} C_{f} }}} & 0 & 0 & 0 & {\omega_{1} } & 0 \\ 0 & {\frac{{k_{u} }}{{k_{i} C_{f} }}} & 0 & { - \omega_{1} } & 0 & 0 \\ 0 & 0 & {\frac{{k_{u} }}{{k_{i} C_{f} }}} & 0 & 0 & 0 \\ \end{array} } \right] $$
(22)
$$ \varvec{B}_{f}^{\text{cont}} = \left[ {\begin{array}{*{20}c} {\frac{{k_{dc} k_{i} }}{{L_{f} }}} & 0 & 0 \\ 0 & {\frac{{k_{dc} k_{i} }}{{L_{f} }}} & 0 \\ 0 & 0 & {\frac{{k_{dc} k_{i} }}{{L_{f} + 3L_{n} }}} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ \end{array} } \right] $$
(23)
$$ \varvec{E}_{f}^{\text{cont}} = \left[ {\begin{array}{*{20}c} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ { - \frac{{k_{u} }}{{C_{f} }}} & 0 & 0 \\ 0 & { - \frac{{k_{u} }}{{C_{f} }}} & 0 \\ 0 & 0 & { - \frac{{k_{u} }}{{C_{f} }}} \\ \end{array} } \right] $$
(24)

and

$$ \varvec{x}_{f} (t) = \left[ {i_{Ld}^{\text{msrd}} \left( t \right),i_{Lq}^{\text{msrd}} \left( t \right),i_{L0}^{\text{msrd}} \left( t \right),u_{Cd}^{\text{msrd}} \left( t \right),u_{Cq}^{\text{msrd}} \left( t \right),u_{C0}^{\text{msrd}} \left( t \right)} \right]^{T} $$
(25)
$$ \varvec{u}\left( t \right) = \left[ {u_{d} \left( t \right),u_{q} \left( t \right),u_{0} \left( t \right)} \right]^{T} $$
(26)
$$ z\left( t \right) = \left[ {i_{od} \left( t \right),i_{oq} \left( t \right),i_{o0} \left( t \right)} \right]^{T} , $$
(27)

where \( L_{f} ,R_{f} ,L_{n} ,R_{n} ,C_{f} \,{\text{and}}\,\omega_{1} \) denote, respectively, the inductance and resistance of the phase filter leg, the inductance and resistance of the neutral filter leg, the capacitance of the filter, and the angular speed of the \( dq0 \) reference frame, equal to the fundamental angular frequency of the desired output voltage. The load current vector \( z\left( t \right) \) represents the unmeasured disturbance. This description already accommodates the gains \( k_{u} \), \( k_{i} \) and \( k_{dc} \) that model the voltage transducers, the current transducers and the voltage source inverter, respectively. The superscript \( \bullet^{\text{msrd}} \) denotes the output signal of a measurement transducer. Next, auxiliary states are introduced to achieve the control objectives, i.e. zero steady-state error at the reference frequency and good disturbance rejection for the anticipated dominant harmonics of the load current. Thus, the auxiliary state variables are as follows

$$ \frac{\text{d}}{{{\text{d}}t}}{\varvec{x}}_{0} \left( t \right) = {\varvec{e}}\left(t\right) $$
(28)

with

$$ {\varvec{e}}\left( t \right) = [\underbrace {{u_{Cd}^{\text{msrd}} \left( t \right) - u_{d}^{\text{ref}} \left( t \right)}}_{{e_{d} (t)}},\,\underbrace {{u_{Cq}^{\text{msrd}} \left( t \right) - u_{q}^{\text{ref}} \left( t \right)}}_{{e_{q} (t)}},\,\underbrace {{u_{C0}^{\text{msrd}} \left( t \right) - u_{0}^{\text{ref}} \left( t \right)}}_{{e_{0} (t)}}]^{T} $$
(29)

and

$$ \left\{ \begin{gathered} \frac{\text{d}}{{{\text{d}}t}}x_{1} \left( t \right) = x_{2} \left( t \right) \hfill \\ \frac{\text{d}}{{{\text{d}}t}}x_{2} \left( t \right) = - \omega^{2} {\kern 1pt} x_{1} \left( t \right) + e\left( t \right), \hfill \\ \end{gathered} \right. $$
(30)

where \( \omega \) is the desired resonant angular frequency and \( e\left( t \right) \) denotes a selected control error component from (29). The number and the value of resonant angular frequencies \( \omega \) are selected individually for each voltage component and have to reflect anticipated load current harmonics seen in the \( dq0 \) reference frame. The integral term can be regarded as the special case of the oscillatory term with zero resonant frequency. As a result, the auxiliary subsystem can be categorized as a sole multi-oscillatory (MOSC) subsystem

$$ \frac{\text{d}}{{{\text{d}}t}}\varvec{x}_{\omega } \left( t \right) = \varvec{A}_{\omega }^{\text{cont}} \varvec{x}_{\omega } \left( t \right) + \varvec{B}_{\omega }^{\text{cont}} \varvec{e}\left( t \right), $$
(31)

where the auxiliary state vector \( \varvec{x}_{\omega } \) refers to all three voltage components and accommodates any desired frequency within the available bandwidth. The auxiliary states are merged with the plant states, and the augmented state matrix and input matrix are composed as follows

$$ \varvec{A}^{\text{cont}} = \left[ {\begin{array}{*{20}c} {\varvec{A}_{f}^{\text{cont}} } & 0 \\ {\left[ {\begin{array}{*{20}c} {0} & {\varvec{B}_{\omega }^{\text{cont}} } \\ \end{array} } \right]} & {\varvec{A}_{\omega }^{\text{cont}} } \\ \end{array} } \right] $$
(32)
$$ \varvec{B}^{\text{cont}} = \left[ {\varvec{B}_{f}^{\text{cont}} ,{0}} \right]^{T} . $$
(33)

This description is then transformed into the discrete-time domain using the zero-order hold (ZOH) method, and the weighting matrices \( \varvec{Q} \) and \( \varvec{R} \) have to be determined in the quadratic cost function

$$ J_{\text{LQ}} = \sum\limits_{k = 1}^{\infty } \left( {\varvec{x}^{T} \left( k \right)\varvec{Qx}\left( k \right) + \varvec{u}^{T} \left( k \right)\varvec{Ru}\left( k \right)} \right) $$
(34)

which is part of the LQR definition. The dynamics of the non-disturbed augmented full-state feedback system

$$ \varvec{x}\left( {k + 1} \right) = \left( {\varvec{A} - \varvec{BK}} \right)\varvec{x}\left( k \right) $$
(35)

for the zero reference signals is then shaped by \( \varvec{K} \) designed using the LQR method. Selecting \( \varvec{Q} \) and \( \varvec{R} \) is the crucial step. In the most common guess-and-check approach, this step involves expert knowledge combined with, usually, numerous trials. Moreover, the resulting controller, though optimal according to (34), is not optimal according to commonly used control performance indices such as (3), (4) or (5). On the other hand, with the help of a population-based optimizer, a full-state feedback controller can be tuned according to a user-defined cost function [40–42]. It has been verified that in the case of the discussed controller the performance index

$$ J_{c}^{\text{LQR}} = \frac{{T_{s} }}{{t_{\text{stop}} }}\sum\limits_{k = 1}^{{\frac{{t_{\text{stop}} }}{{T_{s} }}}} \left( {\varvec{e}^{T} \left( k \right)\varvec{e}\left( k \right) + \beta \varDelta \varvec{u}^{T} \left( k \right)\varDelta \varvec{u}\left( k \right)} \right), $$
(36)

where \( \beta \) is the weighting factor that directly influences the dynamics of the control signal by penalizing it, is able to produce practical controller gains applicable in the real system in the presence of measurement noise. In the discussed system the LQR design approach has been kept. However, it should be noted that a swarm could also perform a direct search for the entries of \( \varvec{K} \) [40] or for the closed-loop poles [41]. If no presumptions concerning the poles or the gains are available, the LQR approach seems to be the most effective when the convergence rate of the stochastic search is taken into account. The same conclusion holds if the augmented state feedback controller is tuned by a human using a trial and error method.
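The chain "discretize with ZOH, compute the LQR gain, rate the gain with (36)" can be sketched on a toy example. The code below is a deliberately simplified illustration for a hypothetical scalar plant \( \dot{x} = ax + bu \), not for the augmented inverter model. It replaces a library DARE solver with a plain fixed-point iteration of the scalar Riccati recursion, takes \( e\left( k \right) = x\left( k \right) \) for a zero reference, and assumes a zero control signal before the first sample; all function names are made up for this sketch.

```python
import math


def zoh_scalar(a, b, ts):
    """Exact zero-order-hold discretization of dx/dt = a*x + b*u."""
    ad = math.exp(a * ts)
    bd = b * ts if a == 0.0 else b * (ad - 1.0) / a
    return ad, bd


def dlqr_scalar(ad, bd, q, r, iterations=10000, tol=1e-12):
    """Scalar discrete-time LQR gain via fixed-point Riccati iteration."""
    p = q
    for _ in range(iterations):
        p_next = q + ad * ad * p - (ad * bd * p) ** 2 / (r + bd * bd * p)
        converged = abs(p_next - p) <= tol * max(1.0, abs(p))
        p = p_next
        if converged:
            break
    return ad * bd * p / (r + bd * bd * p)


def rate_gain(ad, bd, k_gain, beta, steps=200, x0=1.0):
    """Index in the spirit of (36): mean of e^2 + beta * (delta u)^2."""
    x, u_prev, total = x0, 0.0, 0.0
    for _ in range(steps):
        u = -k_gain * x                      # state feedback as in (35)
        total += x * x + beta * (u - u_prev) ** 2
        u_prev = u
        x = ad * x + bd * u
    return total / steps
```

In a swarm-based tuning loop, each particle would supply the weights `q` and `r`, the gain would be recomputed, and the value returned by `rate_gain` would serve as the particle's rating, so that the designer only has to choose `beta`.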

Table 2 Selected parameters of the laboratory setup

The optimization takes place offline on a numerical model of the physical converter with parameters as in Table 2. The method has been verified for the case of the auxiliary states covering the 2nd, 3rd, 4th, 6th, 8th, 9th, 10th and 12th harmonics in the \( dq \) paths and the 1st, 3rd, 5th, 6th, 7th, 9th, 11th and 12th harmonics for the 0-component. The auxiliary states related to integral actions for all error components are also included. This produces a 59-dimensional optimization problem: 3 entries of R and 57 entries of Q, with one entry set arbitrarily to 1 because scaling of (34) does not influence the result. The entries in Q are related to 6 measured state variables, 3 integral actions and \( 2 \cdot 3 \cdot 8 \) oscillatory state variables. It has been decided to merge selected search dimensions to obtain a less challenging problem from the wall-clock time perspective. First of all, Bryson’s rule [43] has been applied to normalize the weighting entries of Q and R. Next, the penalties are not varied for a given harmonic (regardless of its occurrence in different axes). Moreover, it has been tested that a search performed on an exponential scale is more effective than a search on a linear scale. This gives entries with the decision variables as exponents, e.g. for the 3rd harmonic of the form \( 3^{2} \omega_{1}^{2} 10^{{q_{3} }} \) and \( 10^{{q_{3} }} \) for the two auxiliary states introduced by the oscillatory term. The optimization can then be run in a 15D space. The particle is a vector
$$ \varvec{p}^{\text{LQR}} = \left[ {q_{L} ,q_{C} ,q_{0} ,q_{1} ,\, \ldots ,\,q_{12} } \right] $$
(37)

of candidate exponents determining the weighting coefficients for the LQR cost function (34). All search directions could have been left unconstrained because there are no clear physical boundaries: the entries of \( {\mathbf{Q}} \) are positive for any real-valued particle (37). However, absorbing walls have been introduced at −15 and 15 to turn the particles back to practical search regions. The walls have been set with a considerable margin, based on the experience gathered with the previously used trial and error tuning method. Particles that cannot be rated using (36) are handled as in Fig. 18. The swarm of 50 particles is connected to the numerical model of the system as in Fig. 19. Envelopes of the reference voltages seen in the natural reference frame are shaped using a first-order lag element (see Fig. 2) with a time constant of 0.05 s to avoid an excessive contribution to the performance index due to zero initial conditions of the LC filter. Alternatively, a switching function could be introduced in (36) as discussed in Sect. 1.
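The exponential parametrization can be illustrated with a short sketch that unpacks a particle (37) and maps the exponent of a given harmonic to the two weights quoted in the text, \( h^{2} \omega_{1}^{2} 10^{{q_{h} }} \) and \( 10^{{q_{h} }} \). The function names and the returned data layout are assumptions. The exponents \( q_{L} \), \( q_{C} \) and \( q_{0} \) are returned unchanged because their exact scaling, including the normalization by Bryson's rule, is not reproduced here.

```python
HARMONICS = tuple(range(1, 13))  # q_1 ... q_12 in (37)


def decode_particle(particle):
    """Split a 15-element particle (37) into q_L, q_C, q_0 and {h: q_h}."""
    if len(particle) != 15:
        raise ValueError("expected 15 exponents as in (37)")
    q_l, q_c, q_0 = particle[0], particle[1], particle[2]
    q_h = {h: particle[2 + h] for h in HARMONICS}
    return q_l, q_c, q_0, q_h


def oscillatory_weights(harmonic, exponent, omega1):
    """Weights of the two auxiliary states of one oscillatory term.

    Returns (h^2 * omega1^2 * 10^q, 10^q), in the order given in the text.
    """
    scale = 10.0 ** exponent
    return (harmonic * omega1) ** 2 * scale, scale
```

Since \( 10^{q} > 0 \) for every real \( q \), any particle decodes into positive weights, which is why the walls at −15 and 15 only restrict the search to practical magnitudes and are not needed for feasibility.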

Fig. 18 The order of performance regions for the LQR

Fig. 19 PSO connected to the numerical model of the three-phase four-leg inverter with the output LC filter

Position and speed graphs have been broken into 3D plots. An illustrative selection of such graphs is shown in Figs. 20, 21, 22 and 23. The evolution of the swarm in one selected dimension is shown in Fig. 24. The performance of the resulting system under the load current from Fig. 25 is shown in Fig. 26. The obtained matrix K is transferred to the physical controller without any further alterations. Selected parameters of the laboratory setup are given in Table 2. The tuning procedure from the designer side is only one-dimensional and finding a good β for (36) that produces desired behavior of the physical system usually takes less than five trials. The performance of the physical system under nonlinear loads is illustrated in Figs. 27 and 28. The transient state caused by the step resistive load in one phase (and no-load operation of other phases) is shown in Fig. 29. The harmonic contents are compared in Table 3.

Fig. 20 Initial position of the swarm; the larger black dot denotes gbest

Fig. 21 Position of the particles at the 15th iteration, i.e. at the early stage of the search

Fig. 22 Velocity vectors of the particles at the 15th iteration

Fig. 23 Position of the swarm near its equilibrium after \( 75 \) iterations

Fig. 24 The evolution of gbest (curved line) and the swarm position (dots) over optimization iterations for \( q_{C} \)

Fig. 25 The test load current (phase \( a \))

Fig. 26 The performance of the system optimized according to the performance index \( J_{c}^{\text{LQR}} \): instantaneous control error (on the left) and MSE per period of the reference signal (on the right)

Fig. 27 Output voltage \( \left( {u_{\text{Ca}} ,\,u_{\text{Cb}} \,,\,\,u_{\text{Cc}} } \right) \) and output current \( \left( {i_{\text{oa}} } \right) \) waveforms under the three-pulse diode rectifier load: \( U_{\text{Ca}} = 230 \) Vrms, \( I_{\text{oa}} = 14.14 \) Arms, THD\( _{{U_{\text{Ca}} }} = 1.58\;\% \), THD\( _{{I_{\text{oa}} }} = 179\;\% \)

Fig. 28 Output voltage \( \left( {u_{\text{Ca}} ,\,u_{\text{Cb}} \,,\,\,u_{\text{Cc}} } \right) \) and output current \( \left( {i_{\text{oa}} } \right) \) waveforms under the six-pulse diode rectifier load: \( U_{\text{Ca}} = 230 \) Vrms, \( I_{\text{oa}} = 15.11 \) Arms, THD\( _{{U_{\text{Ca}} }} = 1.68\;\% \), THD\( _{{I_{\text{oa}} }} = 105\;\% \)

Fig. 29 Transient state of the capacitor voltages and output current for the step load \( \left( {R_{\text{step}} = 8\,\Omega } \right) \) in phase \( a \) (phases \( b \), \( c \) left unloaded)

Table 3 Harmonic spectra of the output voltage \( \left( {u_{\text{Ca}} } \right) \) and the load current \( \left( {i_{\text{oa}} } \right) \)

6 Conclusions

Nowadays, control systems are becoming more and more elaborate, and the related tuning procedures often require theoretical insight into the system. Even then, the analytical tuning procedures available for these systems usually force a designer to guess some of the parameters needed as input arguments. Often these arguments do not directly shape the dynamics of the closed-loop system, and consequently achieving the desired behavior of the system is not a straightforward task. That is why cascaded PI controllers are still dominant in industrial practice. They owe their popularity to relatively simple tuning methods and low computational complexity. Theoretically, any controller tuning task can be redefined as a performance index optimization problem in which the relevant controller settings become decision variables. Such an optimization problem can be solved with little or no insight into the system by using gradient-free population-based optimizers. This approach can be applied to problems that do not yet have an analytical solution as well as to problems that do have one. In the latter case, it can result in a much more straightforward procedure from the designer's point of view than the original analytical solution. It has been illustrated that a swarm of particles can support control engineers in the simultaneous tuning of cascaded controllers, in identifying potential excessiveness of a control structure (as in Sect. 4), and in reducing the dimensionality of the problem in terms of the number of parameters that have to be passed to the function by the user (as in Sect. 6). Moreover, swarms are extremely useful when a user-defined performance index should be addressed in the system and an analytical solution has not yet been developed. It has been shown that swarms can help to tune adaptive neurocontrollers (as in Sect. 5) that otherwise would have to be tuned by a time-consuming trial and error method supported by expert knowledge.
It should be noted that the PSO is itself a trial-and-error-like method. However, the points to be visited in the solution space are determined by the swarm itself in an iterative manner. The visual inspection of performance often used during trials made by a human is replaced by rating solutions according to a user-defined real-valued performance index. In most practical engineering problems no analytical solution is expected as long as a stochastic search is able to deliver a good suboptimal solution, and this turns out to be achievable if some basic expert knowledge is incorporated into the performance index definition.