1 Introduction

A robot manipulator is a multi-input multi-output (MIMO), highly nonlinear, and coupled system designed to automatically perform tasks that emulate or reproduce human actions in a specific area. Typical applications include welding, painting, and assembly. Industrial robots must inspect products quickly and accurately, which requires improving their performance through efficient controllers. Designing an efficient controller for this system is therefore a challenging task [1]. Despite the success of modern control theory, robot manipulator controllers still commonly employ the classical proportional–derivative (PD) or proportional–integral–derivative (PID) algorithm [2,3,4,5]. To improve their performance, most of these controllers have been designed using linear or linearized models. In [6, 7], nonlinear PID control is applied either by adding nonlinear proportional and derivative terms to the PID controller or by gravity compensation of the robot manipulator. To eliminate oscillation of the end-effector of a single-link flexible-joint robot manipulator, position trajectory tracking is achieved with a state feedback control method [8]. A generalized PID-type control scheme with simple tuning was also presented for the global regulation of robots with constrained inputs [9]. For stability analysis and other control approaches such as predictive control, state feedback control, and decentralized control, the reader is referred to [10,11,12,13,14,15,16,17]. In the adaptive context, the authors of [18] presented an adaptive PID controller under bounded disturbances, while those of [19] designed the controller to account for the nonlinear uncertainties of the system. The adaptation law is motivated by sliding mode control and derived to tune the PID gains so as to minimize the sliding condition. Other applications based on adaptive neural networks were later introduced to control robot manipulators [20].
Recently, some researchers have made an effort to develop new control strategies based on artificial intelligence. One approach, based on partitioning the space into segments, applies learning feedback control to a robot manipulator whose dynamic behavior is treated as a decoupled linear system controlled by conventional PID controllers. The segments are represented as fuzzy sets, which ensures continuity of the control variables [21]. Fuzzy logic self-tuning PID controllers are introduced in [22,23,24], where the controller parameters are varied and computed online. In the hybrid context, learning automata are used to adjust the parameters of a fuzzy-PID controller for optimal tracking of the robot [25]. To minimize the steady-state error under uncertainties in robot control, PID control needs a large integral gain, whereas a neural compensator can be added to classical PD control with a large derivative gain [26].

An elegant way of enhancing the performance of PID controllers is to find appropriate values of the parameters (\(K_{p}\), \(K_{i}\), and \(K_{d}\)) that achieve optimal performance of the robot. In recent years, several works have addressed this problem with optimization approaches. Some have combined a genetic algorithm (GA) with fuzzy logic to tune the PID gains of robots with unknown internal behavior: in some applications, the GA is the main gain estimator and fuzzy logic provides the ranking basis for the GA [27]; in others, fuzzy logic is the main gain estimator, associated with a signal analyzer that extracts performance indexes such as overshoot, rise time, and steady-state error from the controller error signal [28]. Another application of the GA is to find optimal input and output membership functions of a fuzzy self-tuning PID controller [29]. To control a two-degree-of-freedom (2-DOF) robot manipulator, Elkhateeb and Badr [30] applied the bee colony algorithm to tune the PID controller. A modified fruit fly optimization algorithm (MFOA) has also been used to find the optimal PID parameters, with the nonlinear robot dynamics linearized and decoupled by a nonlinear feedback linearization technique [31]. In the same context, a modified biogeography-based optimization (BBO) algorithm was developed to tune the PID parameters of a five-bar robot [32]. Other interesting works rely on the particle swarm optimization (PSO) algorithm [33,34,35,36,37,38,39], such as the application of PSO in the control chain of a SCARA robot. In addition, increasing attention has been paid to the grey wolf optimizer (GWO), for example in the optimization of robot path planning with a multi-objective GWO approach [40].

In this work, a novel optimization technique called the whale optimization algorithm (WOA), proposed by Mirjalili in 2016 [41], is applied to tune a PID controller for the trajectory tracking of a 2-DOF robot manipulator. To test the effectiveness of this technique, the WOA-PID controller is compared with those obtained from the PSO and GWO algorithms. The rest of this paper is organized in seven sections. Section 2 gives a brief outline of PID control, followed by a review of the evolutionary algorithms used in this work (Section 3). Section 4 is devoted to the dynamics of the 2-DOF robot. Section 5 discusses simulation results for all developed controllers. The control of the robot manipulator and the robustness test appear in Sections 6 and 7, respectively. The last section concludes the paper and suggests some directions for future work.

2 Design of PID controller

A PID controller is a generic closed-loop feedback mechanism that monitors the error between the measured system output and the desired set point. From this error, a control signal is computed to adjust the process performance. The time-domain equation of the PID controller is:

$$u(t)=K_{p} e(t)+K_{i} \int _{0}^{t} e(\tau ) \mathrm {d}\tau + K_{d} \frac{\mathrm {d}e(t)}{\mathrm {d}t}$$
(1)

where \(K_{p}, K_{i}\) and \(K_{d}\) are the proportional, integral, and derivative gains, respectively. The superposition of these three actions constitutes the mechanism for adjusting the process performance, as shown in Fig. 1. The continuous transfer function of the PID controller is obtained through the Laplace transform as:

$$C_{{\mathrm{PID}}}(s)=K_{p} + \frac{K_{i}}{s}+K_{d} s$$
(2)
Fig. 1 PID control mechanism
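
For a digital implementation, the control law (1) is usually discretized. The following Python sketch is an illustration only and is not taken from the study, which relies on MATLAB/Simulink. The class name `PID`, the sampling period `dt`, the rectangular integration rule, and the backward-difference derivative are all assumptions made here.

```python
class PID:
    """Discrete-time approximation of the PID law (1) (illustrative sketch)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # running approximation of the integral of e
        self.prev_error = None   # previous error sample, used for de/dt

    def update(self, error):
        # Rectangular accumulation of the integral term.
        self.integral += error * self.dt
        # Backward difference for the derivative; zero on the first sample.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

One controller instance would be created per joint, and `update` called once per sampling period with the current tracking error.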

3 Evolutionary algorithms

Evolutionary algorithms are probabilistic search tools inspired by biological models. They can solve problems in a variety of domains because, ideally, they make no assumption about the underlying fitness landscape. This section presents three such optimization techniques: the whale optimization algorithm, the grey wolf optimizer, and particle swarm optimization.

3.1 Whale optimizer algorithm (WOA)

The WOA is a meta-heuristic optimization algorithm that mimics the hunting behavior of humpback whales [41]. Its particularity lies in how the hunting behavior is simulated: the prey is chased using either a randomly chosen search agent or the best search agent, and a spiral simulates the bubble-net attacking mechanism of humpback whales. The hunting strategy can be described in three steps:

  1. Encircling prey. Once the humpback whales recognize the location of the prey, they encircle it. Since the position of the optimum in the search space is not known a priori, the WOA assumes that the current best candidate solution is the target prey or is close to the optimum; this solution is taken as the best search agent. The other agents then try to update their positions toward this agent [41]. This behavior, defined by the distance \(\overrightarrow{ D}\) and the updated position \(\overrightarrow{ X}\), is represented by the following equations:

    $$\overrightarrow{D}= |\overrightarrow{C}\overrightarrow{X^{*}}(t)-\overrightarrow{X}(t)|$$
    (3)
    $$\overrightarrow{X}(t+1)= \overrightarrow{X^{*}}(t)- \overrightarrow{A}\overrightarrow{D}$$
    (4)

    where t indicates the current iteration; \(\overrightarrow{X^{*}}\) is the position vector of the best solution obtained so far; \(\overrightarrow{X}\) is the position vector of the search agent; and \(|\cdot |\) denotes the absolute value. Note that \(\overrightarrow{X^{*}}\) should be updated at each iteration if a better solution is found. \(\overrightarrow{A}\) and \(\overrightarrow{C}\) are coefficient vectors calculated as follows:

    $$\overrightarrow{A}= 2 \overrightarrow{a}\overrightarrow{r_{1}}-\overrightarrow{a}$$
    (5)
    $$\overrightarrow{C}= 2\overrightarrow{r_{2}}$$
    (6)

    where \(\overrightarrow{a}\) is linearly decreased from 2 to 0 over the course of the iterations (in both the exploration and exploitation phases), and \(\overrightarrow{r_{1}}\) and \(\overrightarrow{r_{2}}\) are random vectors in [0, 1].

  2. Bubble-net attacking method. This step represents the exploitation phase and combines two approaches:

    • Shrinking encircling mechanism: this behavior is achieved by decreasing the value of \(\overrightarrow{a}\) in (5) from 2 to 0. Consequently, \(\overrightarrow{A}\) takes random values in the interval \([-a , a]\).

    • Spiral updating position: the humpback whale moves along a helix-shaped path, which can be described by a spiral equation between the position of the whale and that of the prey:

      $$\overrightarrow{X}(t+1)=\overrightarrow{D'}\, e^{bl}\cos (2\pi l)+ \overrightarrow{X^{*}}(t)$$
      (7)

      where \(\overrightarrow{D'}=|\overrightarrow{X^{*}}(t ) -\overrightarrow{X}(t)|\) is the distance of the ith whale to the prey (the best solution obtained so far), b is a constant defining the shape of the logarithmic spiral, and l is a random number in \([-1, 1]\). Humpback whales swim around the prey within a shrinking circle and along a spiral-shaped path simultaneously. To model this, each of the two mechanisms is chosen with a probability of 50% when updating the position:

      $$\overrightarrow{X}(t+1)= \left\{ \begin{array}{ll} \overrightarrow{X^{*}}(t)- \overrightarrow{A}\, \overrightarrow{D} & \quad \text{if } p < 0.5 \\ \overrightarrow{D'}\, e^{bl}\cos (2\pi l)+ \overrightarrow{X^{*}}(t) & \quad \text{if } p \ge 0.5 \end{array}\right.$$
      (8)

      where p is a random number in [0, 1].

  3. Exploration phase. The last step models the search for prey. This approach also depends on the variation of the vector \(\overrightarrow{A}\). The whales search randomly according to the positions of each other: the position of a search agent is updated with respect to a randomly chosen agent rather than the best one. For random values with \(|\overrightarrow{A}| >1\), the search agent moves far away from the reference whale, which allows the algorithm to perform a global search. This is modeled as follows:

    $$\overrightarrow{D}= |\overrightarrow{C} \overrightarrow{X_{{\mathrm{rand}}}}(t)-\overrightarrow{X}(t)|$$
    (9)
    $$\overrightarrow{X}(t+1)= \overrightarrow{X_ {{\mathrm{rand}}}}(t)- \overrightarrow{A}\overrightarrow{D}$$
    (10)

    Exploration and exploitation phases are common features of the WOA and the GWO, which is presented next.
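
The three steps above can be combined into a compact loop. The Python sketch below is a minimal illustration under stated assumptions, not the implementation used in this study: the function name `woa`, its arguments, the clamping of agents to the search box, the use of one scalar \(A\) and \(C\) per agent, and the rule that exploration (9)–(10) replaces encircling (3)–(4) when \(|A| \ge 1\) are choices made here for illustration.

```python
import math
import random


def woa(objective, bounds, n_agents=10, n_iter=100, b=1.0, seed=0):
    """Minimise `objective` over the box `bounds` with WOA-style updates."""
    rng = random.Random(seed)
    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_cost = objective(best)

    for it in range(n_iter):
        a = 2.0 - 2.0 * it / n_iter            # decreases linearly from 2 towards 0
        for i, x in enumerate(agents):
            A = 2.0 * a * rng.random() - a     # Eq. (5), one scalar per agent
            C = 2.0 * rng.random()             # Eq. (6)
            p = rng.random()
            ell = rng.uniform(-1.0, 1.0)       # spiral parameter l
            if p < 0.5:
                # |A| < 1: encircle the best agent, Eqs. (3)-(4);
                # otherwise explore around a random agent, Eqs. (9)-(10).
                ref = best if abs(A) < 1.0 else agents[rng.randrange(n_agents)]
                new = [r - A * abs(C * r - xj) for r, xj in zip(ref, x)]
            else:
                # Spiral update, Eq. (7), with a component-wise absolute distance.
                spiral = math.exp(b * ell) * math.cos(2.0 * math.pi * ell)
                new = [abs(bj - xj) * spiral + bj for bj, xj in zip(best, x)]
            # Keep the agent inside the search box (an implementation choice).
            agents[i] = [min(max(v, lo), hi) for v, (lo, hi) in zip(new, bounds)]

        # Update the best solution found so far.
        for x in agents:
            cost = objective(x)
            if cost < best_cost:
                best, best_cost = x[:], cost
    return best, best_cost
```

For PID tuning, `bounds` would hold the admissible ranges of the gains of both joints and `objective` would run a closed-loop simulation and return the cost to be minimized.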

3.2 Grey wolf optimizer (GWO)

GWO is a recent meta-heuristic optimizer inspired by grey wolves and proposed in [42]. It mimics the leadership hierarchy and the hunting mechanism of grey wolves in nature. As described in the literature, the GWO algorithm includes two mathematical models: encircling prey and hunting prey. The encircling behavior is similar to that of the WOA and is modeled by:

$$\overrightarrow{D}= | \overrightarrow{C} \overrightarrow{X_{p}}(t)-\overrightarrow{X}(t)|$$
(11)
$$\overrightarrow{X}(t+1)= \overrightarrow{X_ {p}}(t)- \overrightarrow{A}\overrightarrow{D}$$
(12)

where \(\overrightarrow{X_ {p}}\) is the position vector of prey, \(\overrightarrow{X}(t)\) indicates the position of grey wolf. \(\overrightarrow{A}\) and \(\overrightarrow{C}\) have the same expression as (5) and (6), respectively. t indicates the current iteration.

In the hunting model, four types of grey wolves participate in chasing the prey: alpha, beta, delta, and omega. They simulate the leadership hierarchy, with alpha, beta, and delta taken as the fittest, second-best, and third-best solutions, and the omega wolves as the remaining candidate solutions. The optimization is guided by \(\alpha , \beta\) and \(\delta\), the three best solutions obtained so far; the other search agents follow them and update their positions accordingly. The update equations are:

$$\begin{array}{l} \overrightarrow{D_{\alpha }}=|\overrightarrow{C_{1}}\, \overrightarrow{X_{\alpha }}(t)-\overrightarrow{X}(t)| \\ \overrightarrow{D_{\beta }}=|\overrightarrow{C_{2}}\, \overrightarrow{X_{\beta }}(t)-\overrightarrow{X}(t)| \\ \overrightarrow{D_{\delta }}=| \overrightarrow{C_{3}}\, \overrightarrow{X_{\delta }}(t)-\overrightarrow{X}(t)|\\ \\ \overrightarrow{X_{1}}(t+1)=\overrightarrow{X_{\alpha }}(t)- \overrightarrow{A_{1}}\, \overrightarrow{D_{\alpha }}\\ \overrightarrow{X_{2}}(t+1)=\overrightarrow{X_{\beta }}(t)- \overrightarrow{A_{2}}\, \overrightarrow{D_{\beta }}\\ \overrightarrow{X_{3}}(t+1)=\overrightarrow{X_{\delta }}(t)- \overrightarrow{A_{3}}\, \overrightarrow{D_{\delta }}\\ \\ \overrightarrow{X}(t+1)=\frac{\overrightarrow{X_{1}}(t+1) + \overrightarrow{X_{2}}(t+1)+\overrightarrow{X_{3}}(t+1)}{3} \end{array}$$
(13)
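
One iteration of the position update (13) can be sketched as follows. This is an illustrative Python fragment, not the code used in the study; the function name `gwo_step`, the list-of-lists representation of the pack, and the drawing of fresh random coefficients for every dimension and every leader are assumptions.

```python
import random


def gwo_step(wolves, objective, a, rng):
    """Return the new positions of all wolves after one update of Eq. (13)."""
    ranked = sorted(wolves, key=objective)
    leaders = ranked[:3]                       # the three best solutions so far
    new_wolves = []
    for x in wolves:
        new_x = []
        for j, xj in enumerate(x):
            guided = []
            for leader in leaders:
                A = 2.0 * a * rng.random() - a     # same form as Eq. (5)
                C = 2.0 * rng.random()             # same form as Eq. (6)
                D = abs(C * leader[j] - xj)        # distance to this leader
                guided.append(leader[j] - A * D)
            new_x.append(sum(guided) / 3.0)        # average of the three moves
        new_wolves.append(new_x)
    return new_wolves
```

Calling `gwo_step` repeatedly while decreasing `a` from 2 to 0 reproduces the transition from exploration to exploitation described for the WOA.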

3.3 Particle swarm optimization (PSO)

Particle swarm optimization is a stochastic algorithm inspired by the natural behavior of swarms. Based on a population of candidate solutions, it optimizes a problem by iteratively trying to improve each candidate solution in the search space with regard to a given objective function. Each particle searches for better positions by updating its velocity and position according to simple mathematical formulae [43]. The swarm is thereby expected to move toward the best solutions, using the individual best position (pbest) and the global best position (gbest):

$$\begin{aligned} V_{i}(t)&=\lambda \left( w\cdot V_{i}(t-1)+c_{1}\,{\mathrm{rand}}( )\cdot ({\mathrm{pbest}}_{i}-X_{i}(t-1))+c_{2}\,{\mathrm{rand}}( )\cdot ({\mathrm{gbest}}_{i}-X_{i}(t-1))\right) \\ X_{i}(t)&=X_{i}(t-1)+V_{i}(t) \end{aligned}$$
(14)

where \(c_{1}\) and \(c_{2}\) are two positive constants, called the cognitive and social learning rates, respectively, w is the inertia factor, and \(\lambda\) is the constriction factor.
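
The update (14) for a single particle can be written in a few lines. The Python sketch below is illustrative only; the function name `pso_step` and its argument list are assumptions, and independent random numbers are drawn for each dimension.

```python
import random


def pso_step(x, v, pbest, gbest, w, c1, c2, lam, rng):
    """Apply Eq. (14) once to one particle; returns (new_x, new_v)."""
    new_v = []
    for xj, vj, pj, gj in zip(x, v, pbest, gbest):
        cognitive = c1 * rng.random() * (pj - xj)   # pull toward own best position
        social = c2 * rng.random() * (gj - xj)      # pull toward the swarm's best
        new_v.append(lam * (w * vj + cognitive + social))
    new_x = [xj + vj for xj, vj in zip(x, new_v)]
    return new_x, new_v
```

After each step, `pbest` and `gbest` are refreshed by re-evaluating the objective function at the new positions.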

4 Dynamic model of 2-DOF robot manipulator

In this paper, a 2-DOF manipulator, shown in Fig. 2, is taken as a case study of coupled nonlinear systems. The manipulator dynamic equation is as follows:

$$\tau =M(q)\ddot{q} +H(q, \dot{q})\dot{q}+G(q)+F(\dot{q})$$
(15)

where \(q_{i}, \dot{q}_{i}\), and \(\ddot{q}_{i}\) denote the link position, velocity, and acceleration, respectively; M(q) is the inertia matrix; \(H(q,\dot{q})\) is the Coriolis and centripetal force matrix; G(q) is the gravity vector; \(F(\dot{q})\) is the friction force vector; and \(\tau\) is the vector of control torques.

Fig. 2 Model of 2-DOF robot manipulator

$$\begin{aligned} M(q)&= \left[\begin{array}{cc} m_{1}l_{1}^{2}+ m_{2}\left(l_{1}^{2}+2l_{1}l_{2}\cos (q_{2})+ l_{2}^{2}\right) & m_{2}l_{2}\left(l_{2}+l_{1}\cos (q_{2})\right) \\ m_{2}l_{2}\left(l_{2}+l_{1}\cos (q_{2})\right) & m_{2}l_{2}^{2} \end{array}\right] \\ H(q,\dot{q})&= \left[\begin{array}{cc} -2m_{2}l_{1}l_{2}\sin (q_{2})\,\dot{q}_{2} & -m_{2}l_{1}l_{2}\sin (q_{2})\,\dot{q}_{2} \\ m_{2}l_{1}l_{2}\sin (q_{2})\,\dot{q}_{1} & 0 \end{array}\right] \\ G(q)&= \left[\begin{array}{c} (m_{1}+m_{2})\,g\,l_{1}\cos (q_{1})+ m_{2}\,g\,l_{2}\cos (q_{1}+q_{2})\\ m_{2}\,g\,l_{2}\cos (q_{1}+q_{2}) \end{array}\right] \\ F(\dot{q})&= \left[\begin{array}{c} 2\dot{q}_{1}+0.8\,\mathrm{sign}(\dot{q}_{1})\\ 4\dot{q}_{2}+0.1\,\mathrm{sign}(\dot{q}_{2}) \end{array}\right] \end{aligned}$$

where \(m_{i}, l_{i}\) and g are the link mass, the link length and the gravitational acceleration, respectively. In the application, \(m_{1}=10\) kg, \(m_{2}=5\) kg, \(l_{1}=1\) m and \(l_{2}=0.5\) m.
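
To simulate the manipulator, (15) is solved for the joint accelerations at each time step. The following Python sketch evaluates the matrices above and inverts the 2×2 inertia matrix in closed form. It is an illustration, not the Simulink model used in this study; the function names are hypothetical, and the value \(g = 9.81\) m/s² is an assumption, since the numerical value of g is not stated here.

```python
import math

# Link parameters from the text; G0 = 9.81 m/s^2 is an assumed value of g.
M1, M2, L1, L2, G0 = 10.0, 5.0, 1.0, 0.5, 9.81


def sign(x):
    return (x > 0) - (x < 0)


def inertia(q):
    c2 = math.cos(q[1])
    m12 = M2 * L2 * (L2 + L1 * c2)
    return [[M1 * L1**2 + M2 * (L1**2 + 2 * L1 * L2 * c2 + L2**2), m12],
            [m12, M2 * L2**2]]


def coriolis(q, dq):
    h = M2 * L1 * L2 * math.sin(q[1])
    return [[-2.0 * h * dq[1], -h * dq[1]],
            [h * dq[0], 0.0]]


def gravity(q):
    g2 = M2 * G0 * L2 * math.cos(q[0] + q[1])
    return [(M1 + M2) * G0 * L1 * math.cos(q[0]) + g2, g2]


def friction(dq):
    return [2.0 * dq[0] + 0.8 * sign(dq[0]), 4.0 * dq[1] + 0.1 * sign(dq[1])]


def forward_dynamics(q, dq, tau):
    """Solve Eq. (15) for the joint accelerations given the torques tau."""
    M, H, G, F = inertia(q), coriolis(q, dq), gravity(q), friction(dq)
    rhs = [tau[i] - H[i][0] * dq[0] - H[i][1] * dq[1] - G[i] - F[i]
           for i in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    # Closed-form inverse of the 2x2 inertia matrix applied to rhs.
    return [(M[1][1] * rhs[0] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det]
```

Integrating the accelerations returned by `forward_dynamics` with any standard ODE scheme yields the joint trajectories; angles are expected in radians.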

5 Simulation results

In this section, a PID controller is designed for the trajectory tracking control of the robot manipulator using the optimization techniques described previously. The idea is to determine the optimal parameters (\(K_{p}, K_{i}\) and \(K_{d}\)) that ensure the best performance indexes (rise time, overshoot, settling time, and peak) of the system. The main goal is to apply the WOA to tune the PID parameters and to demonstrate its efficiency compared with two well-known algorithms (GWO and PSO). Starting from randomly initialized parameters, the optimization algorithm minimizes the integral time absolute error (ITAE) criterion, given in Equation (16), at each iteration until an optimal set of \(K_{p}, K_{i}\) and \(K_{d}\) parameters is obtained.

First, we tune the PID controller with the three algorithms for different numbers of agents (population sizes) of 10, 20, 30, 50, 100, and 200; we then choose the best population size for each algorithm.

$$J={\text {ITAE}}=\int _{0}^{t_{f}} t\,|e(t)|\, \mathrm {d}t$$
(16)

Here, \(e=q_{{\mathrm{ref}}}-q\) is the error between the reference position and the actual position of the robot. In this study, simulations are performed in the MATLAB/Simulink environment. Figure 3 illustrates the Simulink diagram of the control scheme for the two-joint robot manipulator. The simulation consists of the robot manipulator system, two controllers, the trajectory to be followed, and an optimization block that selects the controller parameters by minimizing the tracking error. It has two outputs: the first joint displacement (position 1) and the second joint displacement (position 2). The desired and actual positions of the first and second joints of the controlled robot manipulator are depicted in Figs. 4, 5 and 6 for the initial conditions \(q_{1}(0)=0.5\) deg, \(q_{2}(0)=1\) deg, \(\dot{q}_{1}(0)=0\) and \(\dot{q} _{2}(0)=0\). In addition, all controllers were tested under the same conditions, with a simulation time of 10 s and a reference signal magnitude of 10 degrees. Tables 1, 2, and 3 give the time-response characteristics of the two joints for different numbers of agents.
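
In a sampled simulation, the criterion (16) has to be approximated numerically. The sketch below uses the trapezoidal rule on recorded time and error samples and takes the absolute value of the error, as the name of the criterion indicates. The function name `itae` and the choice of the trapezoidal rule are assumptions made for illustration.

```python
def itae(times, errors):
    """Trapezoidal approximation of the integral of t*|e(t)| over the samples."""
    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        f_prev = times[k - 1] * abs(errors[k - 1])   # integrand at the previous sample
        f_curr = times[k] * abs(errors[k])           # integrand at the current sample
        total += 0.5 * (f_prev + f_curr) * dt
    return total
```

For a two-joint robot, one possible objective is the sum of `itae` over the error signals of both joints, evaluated for each candidate set of gains.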

Fig. 3 PID control mechanism

Table 1 Comparison results of the first and second joint response characteristics controlled by PSO-PID
Table 2 Comparison results of the first and second joint response characteristics controlled by GWO-PID
Table 3 Comparison results of the first and second joint response characteristics controlled by WOA-PID

From Table 1, we see that the lowest objective function value (7.94) for the PSO optimization is obtained with 200 agents; it corresponds to the smallest rise time for both joints, and the other characteristics are acceptable on average. Table 2 shows that the objective function for the GWO optimization is low, with a value of 6.464 for only 20 agents. Apart from the overshoot of the first joint, GWO improves on PSO in terms of the settling times of the first joint (0.453/1.296) and the second joint (0.585/2.8678), and reduces the cost function from 3.834 to 3.157 for the first joint and from 4.106 to 3.307 for the second joint.

In Table 3, the WOA has the smallest cost function, 5.24, with only 10 agents. However, because of the large overshoot of the first joint in that case, the results obtained with 30 agents appear to be the best, with a small cost function of 6.97 compared with the other values. In general, the WOA gives the best performance for the controlled system. The controller parameters obtained from the optimization process with objective function (16) are listed in Table 4. To assess the efficiency of the WOA in terms of computing time, we tested the three algorithms with both different and identical numbers of agents. The numerical results (Table 5) show that the WOA takes less time than GWO and PSO to compute the set of optimal PID parameters, i.e., it converges faster to the optimum. This constitutes another criterion in favor of the WOA over the other algorithms.

Table 4 PSO-PID, GWO-PID, and WOA-PID parameters
Table 5 Convergence time of three algorithms

From Figs. 4, 5 and 6, we can see that the response of the first joint is oscillatory for the three optimization techniques and for the different population sizes, with a notable relative overshoot for GWO-PID and WOA-PID in the ranges [2.18–15.11%] and [0.43–20.78%], respectively, whereas the maximum overshoot does not exceed 6.4% for the PID controller optimized by PSO. For the second joint, the response is almost aperiodic, with a small maximum overshoot of 3.87%, 4.53% and 2.774% for the PSO-PID, GWO-PID, and WOA-PID controllers, respectively. The response of the first joint also converges more rapidly toward the final value for the PID controllers optimized by GWO and WOA, with response times in the ranges [0.38–0.48 s] and [0.27–0.51 s], respectively, and in the ranges [0.58–4.15 s] and [0.57–3.44 s] for the second joint; the WOA gives good values in this respect. In contrast, the robot controlled by PSO-PID has longer response times, in the range [0.44–1.6 s] for the first joint and [2.86–4.33 s] for the second joint.
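
Characteristics such as those quoted above can be extracted from a sampled step response. The Python sketch below is only an illustration of one common convention and not the procedure used to produce the tables: the function name `step_metrics`, the 2% tolerance band, and the assumption of a positive step toward `y_final` are all choices made here.

```python
def step_metrics(times, y, y_final, band=0.02):
    """Return (overshoot in percent, settling time) of a sampled step response."""
    peak = max(y)
    overshoot = max(0.0, (peak - y_final) / abs(y_final) * 100.0)
    settling_time = times[0]
    # Record the last instant at which the response lies outside the band.
    for t, value in zip(times, y):
        if abs(value - y_final) > band * abs(y_final):
            settling_time = t
    return overshoot, settling_time
```

With a fine sampling period, the returned settling time is close to the instant at which the response enters the tolerance band for good.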

Fig. 4 Actual position for the robot controlled by PSO-PID controller

Fig. 5 Actual position for the robot controlled by GWO-PID controller

Fig. 6 Actual position for the robot controlled by WOA-PID controller

6 Control of robot manipulator

To better show the contribution of the WOA to fast and precise trajectory tracking of the robot manipulator, we applied the PID controllers tuned by the three algorithms, using the best-performing controllers obtained with 200 agents for PSO, 20 agents for GWO, and 30 agents for WOA. Figure 7 shows the superiority of the WOA-PID and GWO-PID controllers over the PSO-PID controller. The second joint has almost the same behavior with WOA-PID and GWO-PID, while the performance of the first joint is improved. Compared with GWO tuning, the WOA shortens the settling time of the first joint (0.2728 s versus 0.453 s) and reduces the overshoot from 7.8466% to 1.3703%. The system performance is thus satisfactory in terms of settling time, and the system responses are faster with the WOA-PID controller. Figure 8a shows the convergence of the error obtained with the WOA, which confirms the competitiveness of this method. As can be seen from Fig. 8b, the WOA converges to the optimum in a small number of iterations.

Fig. 7 Actual position for joints controlled by three algorithm controllers

Fig. 8 a Error signal for the first and second joints for a step set point, b convergence curve of the WOA algorithm

7 Robustness test

To examine the robustness of the optimization algorithm, the obtained controller was tested first by changing the input to a sinusoidal signal and then by introducing a torque disturbance on both joints.

  (A) Tracking a sinusoidal input. The optimization was carried out for a sinusoidal input signal. The resulting PID gains allow the trajectory to be tracked, as shown in Fig. 9a; the generated control torque is shown in Fig. 9b.

  (B) Disturbance torque. The objective is to test the robustness of the WOA-PID controller when the robot is subjected to disturbances. A white noise torque with a variance of 0.8 is added to the PID control signal, as depicted in Fig. 10a. Figure 10b shows that the robot controlled by the tuned PID tracks the desired trajectory despite the external disturbances. Figure 11 shows the cost function (ITAE) for both cases.
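
A disturbance of this kind can be emulated in a sampled simulation by adding zero-mean Gaussian samples to the control torque. The sketch below is an assumption about how such a test could be reproduced outside Simulink and does not describe the noise block used in the study; the function name `noisy_torque` is hypothetical, and the standard deviation is taken as the square root of the stated variance.

```python
import math
import random


def noisy_torque(tau, variance, rng):
    """Add zero-mean Gaussian noise of the given variance to each joint torque."""
    sigma = math.sqrt(variance)          # standard deviation from the variance
    return [t + rng.gauss(0.0, sigma) for t in tau]
```

For example, `noisy_torque(tau, 0.8, random.Random(0))` perturbs both components of `tau` with independent samples at each call.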

Fig. 9 Position and control torque of the first and second articulations without disturbances

Fig. 10 Position and control torque of both articulations with disturbance (variance = 0.8)

Fig. 11 Cost function of the WOA algorithm

8 Conclusion

The aim of this paper is to introduce a new application of the whale optimization algorithm: adjusting the PID parameters for the tracking control of a 2-DOF robot manipulator. The results demonstrate the effectiveness of the proposed WOA-PID in terms of settling time, error, and convergence time, as well as its robustness in tuning the PID controller parameters for robot tracking control with or without disturbances.

As future work, we propose to change the objective function and to test this controller on robots with more degrees of freedom. We also plan to compare the results of this study with other control techniques. From an optimization standpoint, we intend to test the WOA in other applications, such as determining the optimal durations of traffic lights.