
1 Introduction

The computational intensity of fractional order differintegration operators was long an obstacle between theory and useful applications. Rapid growth in fast computing platforms has made versatile design and simulation tools possible, from which the field of control engineering has benefited remarkably.

In [1–3], fundamental issues regarding fractional calculus, fractional differential equations, and a systems-and-control viewpoint are elaborated, and several exemplar cases are considered. One application area focuses on PID control with derivative and integral actions of fractional orders, i.e., PIλDμ control. Several applications of PIλDμ controllers have been reported in the literature. The early notion of the scheme is reported in [3, 4]. In [5] and [6], tuning of the controller parameters is considered when the plant under control is itself of fractional order. Ziegler–Nichols type tuning rules are derived in [7], and rules for industrial applications are designed in [8]. The application of fractional order PID controllers in chemical reaction systems is reported in [9], and frequency-domain issues are considered in [10]. Tuning based on genetic algorithms is considered in [11], where the candidate parameter configuration is coded appropriately and a search algorithm is executed to find a parameter set that meets the performance specifications. A similar approach exploiting particle swarm optimization to find a good set of gains and differintegration orders appears in [12]. This volume of work demonstrates that interest in PID control is growing also in the direction of fractional order versions, unsurprisingly so given the widespread use of PID variants and the confidence that engineers in industry place in them.

The idea of approximating the fractional order operators has been considered in [13], where a fractional order integrator is generalized by a neural network observing some history of the input and the output. The fundamental advancement introduced here is to generalize a PID controller using a similar neural structure.

This chapter is organized as follows: Sect. 2.2 briefly gives the definitions of widely used fractional differintegration formulas and basics of fractional calculus; Sect. 2.3 describes the Levenberg–Marquardt training scheme and neural network structure; Sect. 2.4 presents a set of simulation studies, and the concluding remarks are given in Sect. 2.5 at the end of the chapter.

2 Fundamental Issues in Fractional Order Systems and Control

Let D^β denote the differintegration operator of order β, where β ∈ ℝ. For positive values of β the operator is a differentiator, whereas negative values of β correspond to integrators. This representation lets D^β act as a differintegrator whose functionality depends on the numerical value of β. With n an integer and n − 1 ≤ β < n, the Riemann–Liouville definition of the β-fold fractional differintegration is given by (2.1), while Caputo's definition is given by (2.2).

$$\begin{array}{rcl}{ \mathbf{D}}^{\beta }f(t) = \frac{1} {\Gamma (n - \beta )}{\left ( \frac{\mathrm{d}} {\mathrm{d}t}\right )}^{n}{ \int \nolimits }_{0}^{t} \frac{f(\tau )} {{(t - \tau )}^{\beta -n+1}}\mathrm{d}\tau & &\end{array}$$
(2.1)
$$\begin{array}{rcl}{ \mathbf{D}}^{\beta }f(t) = \frac{1} {\Gamma (n - \beta )}{\int \nolimits }_{0}^{t} \frac{{f}^{(n)}(\tau )} {{(t - \tau )}^{\beta -n+1}}\mathrm{d}\tau & &\end{array}$$
(2.2)

where \(\Gamma (\beta ) ={ \int }_{0}^{\infty }{e}^{-t}{t}^{\beta -1}\mathrm{d}t\) is the well-known Gamma function. In both definitions, the lower terminal is taken as zero, i.e., the integrals start from zero. Considering a_k, b_k ∈ ℝ and α_k, β_k ∈ ℝ₊, one can define the following differential equation:

$$\begin{array}{rlrlrl} ({a}_{n}{\mathbf{D}}^{{\alpha }_{n} } + {a}_{n-1}{\mathbf{D}}^{{\alpha }_{n-1} } + \cdots + {a}_{0}{\mathbf{D}}^{{\alpha }_{0} })y(t) = ({b}_{m}{\mathbf{D}}^{{\beta }_{m} } + {b}_{m-1}{\mathbf{D}}^{{\beta }_{m-1} } + \cdots + {b}_{0}{\mathbf{D}}^{{\beta }_{0} })u(t)&& \end{array}$$
(2.3)

and with the assumption that all initial conditions are zero, obtain the transfer function given by (2.4).

$$\frac{Y (s)} {U(s)} = \frac{{b}_{m}{s}^{{\beta }_{m}} + {b}_{m-1}{s}^{{\beta }_{m-1}} + \cdots + {b}_{0}{s}^{{\beta }_{0}}} {{a}_{n}{s}^{{\alpha }_{n}} + {a}_{n-1}{s}^{{\alpha }_{n-1}} + \cdots + {a}_{0}{s}^{{\alpha }_{0}}}$$
(2.4)
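The differintegration operators in (2.1) and (2.2) must ultimately be evaluated numerically. One common route, which agrees with the Riemann–Liouville form for sufficiently smooth functions with lower terminal zero, is the Grünwald–Letnikov sum. The Python sketch below (the function name and step size are illustrative choices, not from the text) evaluates D^β f(t) in this way:

```python
import math

def gl_differintegral(f, t, beta, h=1e-3):
    """Grunwald-Letnikov approximation of D^beta f(t) with lower
    terminal 0; agrees with the Riemann-Liouville definition for
    sufficiently smooth f. The step size h controls the accuracy."""
    n = int(round(t / h))
    acc, w = 0.0, 1.0          # w_0 = (-1)^0 * binom(beta, 0) = 1
    for k in range(n + 1):
        acc += w * f(t - k * h)
        w *= 1.0 - (beta + 1.0) / (k + 1)   # next weight via recurrence
    return acc / h ** beta

# beta = 1 reduces to a backward difference: d/dt t^2 at t = 1 is 2
print(gl_differintegral(lambda x: x * x, 1.0, 1.0))   # ~2.0
# half-derivative of f(t) = t is 2*sqrt(t/pi) ~ 1.128 at t = 1
print(gl_differintegral(lambda x: x, 1.0, 0.5))       # ~1.128
```

For β < 0 the same sum acts as a fractional integrator, which is exactly the sense in which D^β is a differintegrator.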

Denoting frequency by ω and substituting s = jω in (2.4), one can exploit frequency domain techniques. A significant difference appears in the Bode magnitude plot: the asymptotes can have any slope, not only integer multiples of 20 dB per decade, and this is a substantially important flexibility for modeling and identification research. When state space models are taken into consideration, we have

$$\begin{array}{rlrlrl} {\mathbf{D}}^{\beta }\mathbf{x} & = \mathbf{Ax} + \mathbf{B}u & & \\ y & = \mathbf{Cx} + \mathit{Du} &\end{array}$$
(2.5)

and we obtain the transfer function via taking the Laplace transform in the usual sense, i.e.,

$$\begin{array}{rlrlrl} H(s) = \mathbf{C}{\left ({s}^{\beta }\mathbf{I} -\mathbf{A}\right )}^{-1}\mathbf{B} + D & &\end{array}$$
(2.6)

For the state space representation in (2.5), if λ i is an eigenvalue of the matrix A, the condition

$$\begin{array}{rlrlrl} \vert \arg ({\lambda }_{i})\vert > \beta \frac{\pi } {2} & &\end{array}$$
(2.7)

is required for stability. The same condition can be applied to the transfer function representation in (2.4), where the λ_i then denote the roots of the denominator.
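Condition (2.7) is straightforward to check numerically once the eigenvalues of A are known. A minimal sketch for a commensurate order β (the function name is illustrative):

```python
import cmath, math

def fo_stable(eigenvalues, beta):
    """Stability test for the fractional order system (2.5): every
    eigenvalue lambda_i of A must satisfy |arg(lambda_i)| > beta*pi/2,
    as in condition (2.7)."""
    return all(abs(cmath.phase(lam)) > beta * math.pi / 2.0
               for lam in eigenvalues)

# eigenvalues -1 +/- j sit at |arg| = 3*pi/4, so the condition
# holds for beta < 1.5 and fails beyond it
print(fo_stable([-1 + 1j, -1 - 1j], 0.8))   # True
print(fo_stable([-1 + 1j, -1 - 1j], 1.6))   # False
```

Note that for β < 1 the condition admits some eigenvalues with positive real part, a well-known feature of fractional order systems.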

The implementation issues are closely related to the numerical realization of the operators defined in (2.1) and (2.2). There are several approaches in the literature, and Crone is the most frequently used scheme for approximating fractional order differintegration operators [1]. More explicitly, the algorithm determines a number of poles and zeros and approximates the magnitude plot over a predefined range of the frequency spectrum. The expression used in the Crone approximation is given in (2.8), and the approximation accuracy is depicted for N = 3 and N = 9 in Fig. 2.1. The approximations show that accuracy improves as N gets larger, yet the price paid is complexity, and the technique presented next is a remedy for the difficulties stemming from such implementation issues.

$$\begin{array}{rlrlrl} {s}^{\beta } \approx K\frac{{\prod \nolimits }_{k=1}^{N}\left (1 + s/{w}_{\mathit{zk}}\right )} {{\prod \nolimits }_{k=1}^{N}\left (1 + s/{w}_{\mathit{pk}}\right )} & &\end{array}$$
(2.8)
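A sketch of (2.8) in Python is given below. The recursive placement of the break frequencies follows the commonly used Oustaloup-style distribution over [ω_min, ω_max]; this particular placement and the gain normalization at the band center are assumptions of the sketch rather than prescriptions from the text.

```python
import math

def crone_zeros_poles(beta, w_lo, w_hi, N):
    """Zero/pole break frequencies for the approximation (2.8) of
    s^beta over [w_lo, w_hi], recursively distributed so that the
    average magnitude slope is 20*beta dB per decade."""
    r = (w_hi / w_lo) ** (1.0 / N)
    zeros = [w_lo * r ** (k + 0.5 * (1.0 - beta)) for k in range(N)]
    poles = [w_lo * r ** (k + 0.5 * (1.0 + beta)) for k in range(N)]
    return zeros, poles

def crone_eval(s, beta, w_lo=1e-3, w_hi=1e3, N=9):
    """Evaluate the rational approximation at a complex point s, with
    the gain K fixed so the magnitude is exact at the geometric
    center of the frequency band."""
    zeros, poles = crone_zeros_poles(beta, w_lo, w_hi, N)
    h = 1.0 + 0j
    for wz, wp in zip(zeros, poles):
        h *= (1.0 + s / wz) / (1.0 + s / wp)
    wc = math.sqrt(w_lo * w_hi)
    href = 1.0 + 0j
    for wz, wp in zip(zeros, poles):
        href *= (1.0 + 1j * wc / wz) / (1.0 + 1j * wc / wp)
    return h * (wc ** beta / abs(href))

# the magnitude of (jw)^0.5 is sqrt(w); compare inside the band
for w in (0.1, 1.0, 10.0):
    print(abs(crone_eval(1j * w, 0.5)), w ** 0.5)
```

As in Fig. 2.1, increasing N tightens the fit at the cost of a higher order filter.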

The PIλDμ controller with the operator described above has the transfer function given by (2.9), where E(s) is the error entering the controller and U(s) stands for the output.

$$\begin{array}{rlrlrl} \frac{U(s)} {E(s)} = {K}_{p} + \frac{{K}_{i}} {{s}^{\lambda }} + {K}_{d}{s}^{\mu } & &\end{array}$$
(2.9)
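The controller (2.9) can also be realized directly in the time domain by applying fractional differintegration to the sampled error signal. The sketch below uses Grünwald–Letnikov sums over the stored error history; the names and the fixed-history scheme are illustrative simplifications, not the Crone realization used later in the chapter.

```python
def gl_weights(beta, n):
    """First n+1 Grunwald-Letnikov weights (-1)^k * binom(beta, k),
    generated by the standard recurrence."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (beta + 1.0) / k))
    return w

def frac_pid(e_hist, h, Kp, Ki, Kd, lam, mu):
    """Latest value of u in (2.9): proportional term plus a fractional
    integral of order lam and a fractional derivative of order mu of
    the sampled error history e_hist (oldest sample first)."""
    n = len(e_hist) - 1
    wi = gl_weights(-lam, n)                 # order -lam: integration
    wd = gl_weights(mu, n)                   # order +mu: differentiation
    integ = sum(w * e for w, e in zip(wi, reversed(e_hist))) * h ** lam
    deriv = sum(w * e for w, e in zip(wd, reversed(e_hist))) / h ** mu
    return Kp * e_hist[-1] + Ki * integ + Kd * deriv

# sanity check: lam = mu = 1 recovers an ordinary PID sample; for
# e(t) = t on [0, 1] the integral is ~0.5 and the derivative is 1
h = 0.01
e_hist = [k * h for k in range(101)]
print(frac_pid(e_hist, h, Kp=1.0, Ki=1.0, Kd=1.0, lam=1.0, mu=1.0))  # ~2.5
```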

In Fig. 2.2, it is illustrated that the classical PID controller variants correspond to a subset of the λ–μ coordinate system, and there are infinitely many parameter configurations that may lead to different performance indications.

Fig. 2.1
figure 1_2

Crone approximation to the operator s^0.5 with ω_min = 10^{-3} rad/s, ω_max = 10^{3} rad/s. Left column: N = 3; right column: N = 9

Fig. 2.2
figure 2_2

Continuous values of the differintegration orders λ and μ enable infinitely many configurations of the PIλDμ controller; the variants of the classical PID controller correspond to a subset of the domain

3 Neural Network-Based Modeling and Levenberg–Marquardt Training Scheme

In this work, we consider the feedforward neural network structure shown in Fig. 2.3, with m inputs, R neurons in the first hidden layer, and Q neurons in the second hidden layer. Since the neural structure is intended to imitate a PIλDμ controller, the model has a single output. The hidden layer neurons have hyperbolic tangent activation functions, while the output neuron is linear.

Fig. 2.3
figure 3_2

Feedforward neural network structure with R neurons in the first, Q neurons in the second hidden layer

The powerful mapping capabilities of neural networks have made them useful modeling tools, especially when the system to be modeled is available only in the form of raw data. This is mainly because real systems have many variables, the variables involved in the modeling process are typically noisy, and the underlying physical phenomenon is sometimes nonlinear. Due to the inextricably intertwined nature of the describing differential (or difference) equations, which are not known precisely, it becomes a tedious task to see the relationship between the variables involved. In such cases, black box approaches such as neural networks, fuzzy logic, or methods adapted from artificial intelligence come into the picture as tools representing the input/output behavior accurately. In what follows, we briefly describe the Levenberg–Marquardt training scheme for adjusting the parameters of a neural structure [14]. Since the algorithm is a soft transition between Newton's method and standard gradient descent, it quickly locates a minimum (the global one, if achievable) of the cost hypersurface, denoted by J in (2.10):

$$J = \frac{1} {2}{\sum }_{p=1}^{P}{\left ({d}_{ p} - {y}_{p}(e,\phi )\right )}^{2}$$
(2.10)

where y_p denotes the response of the single-output neural network and d_p stands for the corresponding target output. In (2.10), ϕ is the set of all adjustable parameters of the neural structure (weights and biases), and e is the vector of network inputs. The parameters are updated according to the following rule:

$$\phi (t + 1) = \phi (t) -{\left (\mu I + \Phi {(t)}^{\mathrm{T}}\Phi (t)\right )}^{-1}\Phi {(t)}^{\mathrm{T}}F(t)$$
(2.11)

where μ is the regularization parameter, F(t) = [f_1 f_2 … f_P]^T is the vector of errors defined as \({f}_{i} = {d}_{i} - {y}_{i}(e,\phi ),\ i = 1,2,\ldots,P\), where P is the number of training pairs, and Φ is the Jacobian given explicitly by (2.12),

$$\Phi = \left [\begin{array}{llll} \dfrac{\partial {f}_{1}} {\partial {\phi }_{1}} & \dfrac{\partial {f}_{1}} {\partial {\phi }_{2}} & \cdots & \dfrac{\partial {f}_{1}} {\partial {\phi }_{H}} \\ \dfrac{\partial {f}_{2}} {\partial {\phi }_{1}} & \dfrac{\partial {f}_{2}} {\partial {\phi }_{2}} & \cdots & \dfrac{\partial {f}_{2}} {\partial {\phi }_{H}}\\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial {f}_{P}} {\partial {\phi }_{1}} & \dfrac{\partial {f}_{P}} {\partial {\phi }_{2}} & \cdots & \dfrac{\partial {f}_{P}} {\partial {\phi }_{H}}\\ \end{array} \right ]$$
(2.12)

where there are H adjustable parameters within the vector ϕ. In applying the tuning law (2.11), if μ is large, the algorithm behaves more like gradient descent; conversely, if μ is small, the prescribed updates are closer to Gauss–Newton updates. The regularization term removes the problem of rank deficiency in (2.11) and improves significantly on the performance of plain gradient descent.
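A minimal implementation of the update (2.11) is sketched below on a small curve-fitting problem; the exponential model and the μ-adaptation heuristic are illustrative, while the chapter applies the same rule to the network weights. Here μ is the regularization parameter of (2.11), not a differintegration order.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, phi0, mu=1e-2, iters=50):
    """Apply the update (2.11): phi <- phi - (mu*I + J^T J)^{-1} J^T f.
    A common heuristic adapts mu: shrink it after a successful step
    (toward Gauss-Newton), grow it otherwise (toward gradient descent)."""
    phi = phi0.copy()
    cost = 0.5 * np.sum(residual(phi) ** 2)
    for _ in range(iters):
        f, J = residual(phi), jacobian(phi)
        step = np.linalg.solve(mu * np.eye(len(phi)) + J.T @ J, J.T @ f)
        cand = phi - step
        cand_cost = 0.5 * np.sum(residual(cand) ** 2)
        if cand_cost < cost:
            phi, cost, mu = cand, cand_cost, mu * 0.5   # accept the step
        else:
            mu *= 2.0                                   # reject, regularize more
    return phi

# fit y = a*exp(b*x) to data generated with a = 2, b = -1
x = np.linspace(0.0, 2.0, 20)
d = 2.0 * np.exp(-x)
res = lambda p: p[0] * np.exp(p[1] * x) - d
jac = lambda p: np.stack([np.exp(p[1] * x), p[0] * x * np.exp(p[1] * x)], axis=1)
print(levenberg_marquardt(res, jac, np.array([1.0, 0.0])))   # ~[2, -1]
```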

4 Simulation Studies

The first stage of emulating the response of a PIλDμ controller is to select a representative set of inputs, apply them to the PIλDμ controller, and collect the responses. We set N = 9 and follow the procedure below.

For n = 1 to #experiments
    Set a random K_p ∈ (0, 2)
    Set a random K_d ∈ (0, 1)
    Set a random K_i ∈ (0, 1)
    Set a random μ ∈ (0, 1)
    Set a random λ ∈ (0, 1)
    Apply u(t) and obtain y(t) for t ∈ [0, 3]
    Store u(t), y(t), K_p, K_d, K_i, μ, λ
End
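One pass of the loop above can be sketched as follows; the function name and dictionary keys are illustrative, while the ranges are those given in the procedure.

```python
import random

def sample_experiment():
    """Draw one random parameter set for the PI^lambda D^mu
    controller, with the ranges used in the data-collection loop."""
    return {
        "Kp": random.uniform(0.0, 2.0),
        "Kd": random.uniform(0.0, 1.0),
        "Ki": random.uniform(0.0, 1.0),
        "mu": random.uniform(0.0, 1.0),
        "lam": random.uniform(0.0, 1.0),
    }
```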

A total of 200 experiments with step size 1 ms were carried out to obtain the training data. Once all responses are collected, a matrix is formed, a generic row of which has the following structure:

$$[y(k),y(k - 1),\cdots \,,y(k - d),{K}_{p}(k),{K}_{d}(k),{K}_{i}(k),\lambda (k),\mu (k)]$$
(2.13)

where k is the time index, y(k) = y(kT) with T = 1 ms; each row has d + 6 columns, and the delay depth d is a user-defined parameter. Denote the matrix whose generic row is shown above by Ω. To obtain the training data set, we downsample Ω by selecting the first row of every block of 100 consecutive rows. This significantly reduces the computational load of the training scheme. According to this procedure, 60,000 training pairs are generated, and a neural network with m = 16 inputs is constructed. In Fig. 2.4, the evolution of the training error is shown together with that of the checking error, the latter obtained by running 15 further experiments through the same downsampling procedure.
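The construction of Ω and its downsampling can be sketched as follows; the function names are illustrative, and the controller parameters are constant within one experiment, as in the procedure above.

```python
import numpy as np

def build_rows(y, params, d):
    """Rows of Omega as in (2.13): the d+1 most recent outputs
    y(k), y(k-1), ..., y(k-d) followed by the 5 controller
    parameters, giving d + 6 columns per row."""
    rows = []
    for k in range(d, len(y)):
        rows.append(list(y[k - d:k + 1][::-1]) + list(params))
    return np.array(rows)

def downsample(omega, block=100):
    """Keep the first row of every `block` consecutive rows."""
    return omega[::block]

# toy output sequence: 10 samples, delay depth d = 2
omega = build_rows(list(range(10)), [1, 2, 3, 4, 5], d=2)
print(omega.shape)                        # (8, 8), i.e., d + 6 = 8 columns
print(downsample(omega, block=3).shape)   # (3, 8)
```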

Fig. 2.4
figure 4_2

Evolution of the errors for the training and checking data

At the 128th epoch, the best set of network parameters is obtained; after this point the checking error of the neural model starts increasing, and the training scheme stops the parameter tuning at J = 0.01778. In what follows, we discuss the performance of the neural model as a PIλDμ controller.

As an illustrative example, we consider the following control problem, deliberately simple since our goal is to compare the responses of two controllers, namely the PIλDμ controller and its neural network-based approximation. The plant dynamics are given by

$$\begin{array}{rlrlrl} \frac{Y (s)} {U(s)} = \frac{1} {s(s + 1)} & &\end{array}$$
(2.14)

where Y is the plant output and U is the control input. We choose K_p = 2.5, K_d = 0.9, K_i = 0.1, μ = 0.02, λ = 0.7 and apply a step command that rises at t = 1 s. The command signal, the response obtained with the PIλDμ controller using the above parameters, and the result obtained with the trained neural network emulator are shown in the top row of Fig. 2.5, where the response of the PIλDμ controller is obtained using the toolbox described in [15]. For a better comparison, the bottom row depicts the difference between the plant responses obtained with the two controllers. Clearly, the results suggest that the neural network-based controller is able to imitate the PIλDμ controller to a very good extent, as the two responses are very close to each other.

Fig. 2.5
figure 5_2

For the first example, the system response and the difference between the two responses obtained with the PIλDμ controller and its neural network-based substitute

A better comparison is to consider the control signals produced by the PIλDμ controller (u_FracPID) and the neural network controller (u_NNPID). The results are shown in Fig. 2.6, where the two control signals appear together in the top subplot and the difference between them in the bottom subplot. Clearly, the two control signals are very close to each other; furthermore, the signal generated by the neural network is smoother than its alternative around t = 1 s. This particular example demonstrates that the neural network-based realization is a good candidate for replacing the PIλDμ controller.

Fig. 2.6
figure 6_2

The control signals generated by the PIλDμ controller and its neural network-based substitute. The bottom row shows the difference between the two signals

Define the relative error as given in (2.15), where T denotes the final time. For the results seen above, we obtain e_rel = 0.1091, an acceptably small value indicating the similarity of the two control signals seen in Fig. 2.6.

$${e}_{\mathrm{rel}} := \frac{ \frac{1} {T}{ \int \nolimits }_{0}^{T}\left \vert {u}_{\mathrm{FracPID}} - {u}_{\mathrm{NNPID}}\right \vert \mathrm{d}t} { \frac{1} {T}{ \int \nolimits }_{0}^{T}\left \vert {u}_{\mathrm{FracPID}}\right \vert \mathrm{d}t}$$
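On sampled signals, (2.15) reduces to a ratio of mean absolute values, since the 1/T factors cancel. A one-line sketch (the function name is illustrative):

```python
import numpy as np

def relative_error(u_frac, u_nn):
    """Discrete form of (2.15): mean absolute difference between the
    two control signals over the mean absolute fractional PID signal."""
    return np.mean(np.abs(u_frac - u_nn)) / np.mean(np.abs(u_frac))

# a signal and a copy scaled by 0.9 differ by e_rel = 0.1
print(relative_error(np.ones(101), 0.9 * np.ones(101)))   # ~0.1
```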
(2.15)

In Table 2.1, we summarize a number of test cases with the corresponding relative error values. The data in the table indicate that the proposed controller performs well for a wide range of controller gains and for small values of λ and μ. For other control problems, however, the proposed scheme may perform better with larger differintegration orders. To see this, as a second example we consider the following plant dynamics:

$$\begin{array}{rcl}{ x}_{1}^{(0.1)}& =& {x}_{ 2} \\ {x}_{2}^{(0.4)}& =& {x}_{ 3} \\ {x}_{3}^{(0.8)}& =& f({x}_{ 1},{x}_{2},{x}_{3}) + \Delta ({x}_{1},{x}_{2},{x}_{3},t) + g(t){x}_{4} + \xi (t) \\ {x}_{4}^{(0.5)}& =& u \end{array}$$
(2.16)

where Δ(x_1, x_2, x_3, t) and ξ(t) are uncertainty and disturbance terms that are not available to the designer. In the above equations, we have

$$\begin{array}{rcl} f({x}_{1},{x}_{2},{x}_{3}) = -0.5{x}_{1} - 0.5{x}_{2}^{3} - 0.5{x}_{ 3}\vert {x}_{3}\vert & &\end{array}$$
(2.17)
$$\begin{array}{rcl} g(t) = 1 + 0.1\sin \left (\frac{\pi t} {3} \right )& &\end{array}$$
(2.18)
$$\begin{array}{rlrlrl} \Delta ({x}_{1},{x}_{2},{x}_{3},t) & = \left (-0.05 + 0.25\sin \left (5\pi t\right )\right ){x}_{1} + \left (-0.03 + 0.3\cos (5\pi t)\right ){x}_{2}^{3} & & \\ &\quad + \left (-0.05 + 0.25\sin (7\pi t)\right ){x}_{3}\vert {x}_{3}\vert &\end{array}$$
(2.19)
$$\xi (t) = 0.2\sin (4\pi t)$$
(2.20)

The plant considered is nonlinear, with four states, disturbance terms, and uncertainties. The time-varying gain multiplying the state x_4 in (2.16) makes the problem more complicated still, and we compare the neural network substitute with the PIλDμ controller given by

$$\frac{U(s)} {E(s)} = 2 + \frac{0.7} {{s}^{0.9}} + 0.6{s}^{0.75}$$
(2.21)

The results are illustrated in Figs. 2.7 and 2.8. The responses of the system under both controllers are depicted in Fig. 2.7, and the two responses are very close to each other; the similarity of the fluctuations around the setpoint is another result worth emphasizing. The controller outputs are analyzed in Fig. 2.8: the PIλDμ controller generates a very large spike when the step change in the command signal occurs, whereas the neural network-based substitute produces a smoother control signal, and this is reflected as a slight difference between the plant responses of the two controllers. The two controllers produce similar signals when the plant output is forced to lie around unity, as seen in the middle subplot of Fig. 2.8, and the difference between the two control signals is bounded by 0.05 during this period. The value of e_rel for this case is 20.3283, which seems large; however, considering the peak in the top subplot of Fig. 2.8, it can be viewed as tolerable, since the PIλDμ controller requests high-magnitude control signals when a step change occurs in the command.

Table 2.1 Performance of the proposed controller for a number of different parameter configurations
Fig. 2.7
figure 7_2

For the second example, the system response and the difference between the two responses obtained with the PIλDμ controller and its neural network-based substitute

Fig. 2.8
figure 8_2

The control signals generated by the PIλDμ controller and its neural network-based substitute. The top row illustrates the two signals when the step change occurs. The middle row depicts the closeness of the two signals for t > 5 s, and the bottom row shows the difference between the two signals

A last issue to consider is the possibility of improving the performance obtained with the chosen neural network structure, which is 16-25-10-1. One could argue that the network could be realized with a single hidden layer, or with two hidden layers containing fewer neurons each. In obtaining the neural model whose results are discussed here, many trials were performed, and it was observed that the approximation performance improves with more neurons in the hidden layers. Similarly, a better map could be constructed if earlier values of the incoming error signal were taken into consideration, although this enlarges the network and makes training more computationally intensive. Whatever the problem at hand, the goal of this chapter is to demonstrate that a fractional order PIλDμ controller can be replicated to a good extent using neural network models, and the findings of the chapter support this claim.

5 Conclusions

This chapter discusses the use of standard neural network models for imitating the behavior of a PIλDμ controller whose parameters are provided explicitly as inputs to the neural network. The motivation for this focus has been the difficulty of realizing fractional order controllers, which require high orders of approximation for accuracy. The method followed here is to collect a set of data and to optimize the network parameters to obtain an emulator of the PIλDμ controller. Aside from the parameters of the PIλDμ controller, the neural model observes some history of the input and outputs a value approximating the response of the PIλDμ controller. Several exemplar cases are presented, and it is seen that neural network models are a practical alternative for realizing PIλDμ controllers. Furthermore, the developed neural model allows modifying the controller parameters online, as those parameters are supplied as external inputs to the network.