1 Introduction

Gear transmissions are usually integrated into machines to provide high torque within a limited space. Among gear transmissions, harmonic drives offer a high gear reduction ratio, compact size, and a high torque-to-weight ratio with virtually no backlash. These salient features make harmonic drives ideal for precise motion mechanisms such as lightweight service robot manipulators [1], force-feedback haptic devices [2] and steer-by-wire systems [3]. For these human–machine interaction applications, high torque resolution is a necessity, and actuators including gear transmissions should play the role of ideal torque sources. However, the physical variable that is manipulated in practice is the armature current in a motor or, generally speaking, the motor torque to the gear. Owing to Coulomb friction, structural damping and flexibility of a harmonic drive, the relation between its input torque and output torque possesses complex dynamics [4]. To move the harmonic drive actuator toward an ideal torque source, its dynamics should be shaped by feeding back the gear’s output torque transmitted to the load, and torque ripples induced by the harmonic drive should be compensated for.

Please refer to [5] for a description of the harmonic drive. In a harmonic drive system, transmission flexibility causes output vibrations, and frictional forces degrade output accuracy. To control the output torque of a harmonic drive, many researchers [3, 6–9] used disturbance observers (DOBs) to estimate torque disturbances. DOBs [10–12] are useful in compensating for unknown system perturbations so that the plant’s dynamics are regulated to the nominal dynamics. Since the nominal dynamics do not necessarily yield satisfactory performance, the DOB is usually accompanied by a feedback compensator that shapes the nominal dynamics to meet performance requirements. Therefore, both a feedback compensator and a DOB need to be designed in building a DOB-based control system. In [5], the similarity between the DOB and the internal model control (IMC) [13] has been demonstrated. However, compared with the DOB-based control structure, the IMC scheme reduces the effort required in the controller design as well as in the practical implementation. Therefore, as in [5], the IMC is applied to the feedback control of the harmonic drive system, instead of using the DOB-based control configuration.

Due to mechanical imperfections such as misalignments of the gear assembly and dimensional inaccuracies of the gear itself, the output torque of the gear contains ripples that vary for different drives, assemblies, speeds and loads. However, a special characteristic of harmonic drives is that the dominating component of torque ripple is repeated every half turn of the input shaft [14], that is, the torque ripple is periodic in nature and its fundamental component corresponds to twice the rotational frequency of the motor shaft. To compensate for the kinematic error of a harmonic drive, Nye et al. [15] used an open-loop method by approximating the kinematic error with a simple sinusoidal term and superimposing it on the desired trajectory. Gandhi and Ghorbel [16] proposed a PD-type controller to compensate for the kinematic error of a harmonic drive in a closed-loop fashion. To alleviate speed ripple caused by the harmonic drive, Hirabayashi et al. [17] proposed a method of adaptive speed control, in which the controller senses speed ripple through a high-resolution encoder and modifies the speed command to the driving motor. Godler et al. [18] applied repetitive learning control to reduce speed ripple in a harmonic drive system. Han et al. [19] passed the load-side acceleration signals through a peak filter in parallel with an existing controller in order to reject the disturbance at the resonant frequency. However, the introduction of the peak filter deteriorates the transient performance, and a time-varying gain that softly switches the peak filter on and off is employed as a remedy. While the previous studies [15–19] aimed at reducing position or speed ripples, Lu and Lin [5] focused on minimizing torque ripples transmitted to the load. However, the robust stability issue was not addressed, and an adaptation gain for ripple compensation needed to be redesigned whenever the disturbance frequency varied.

Fig. 1 Experimental system. (a) Photo of the harmonic drive actuator. (b) Schematic representation of the hardware configuration

Extending the work in [5], this paper proposes a DOB-based repetitive learning control (RLC) scheme to compensate for torque ripples. The RLC is a technique in which the control signal is built iteratively from successive cycles, that is, the control in the present operation cycle is refined by feeding back the output error in the previous cycle. For every operation cycle, the reference command and external disturbance, which are periodic functions of time, remain constant at any specific instant of local time. Since constants are the simplest form of unknowns, they can be well compensated for through the betterment process, and performance on repetitive tasks can be enhanced from one cycle to the next until the final goal is achieved. The previous RLC schemes [18, 20–24] update the present control by referring to output errors in previous periods. However, the objective of a learning controller is to generate an effort that cancels out the input disturbance equivalent to the entire system perturbation. The scheme proposed in this paper learns directly from the DOB’s output, which approximates the compensation error of the learning control, rather than extracting the disturbance information from tracking errors. This speeds up the learning process. Since the torque ripple of a harmonic drive contains mainly a component whose frequency is twice that of the motor speed, we implement the RLC as a Fourier series expansion of the estimated compensation-error signal and update only the one relevant frequency component. In contrast to [5], the robust stability conditions for the overall system are given. Moreover, no adaptation gain for ripple compensation needs to be redesigned as the disturbance frequency varies. Experiments were conducted on a harmonic drive actuator to demonstrate the feasibility of the proposed scheme.

2 Torque control with the feedback structure of IMC

2.1 Experimental system

The harmonic drive actuator in the experimental system is a hollow-shaft actuator with an integrated torque sensor, SD-25B from Sensodrive GmbH. The experimental system is shown in Fig. 1 with a photo of the harmonic drive actuator. Please refer to [5] for details on the harmonic drive actuator as well as the experimental system. With a sampling period of 0.1024 ms, the DSP that is the controller core obtains the torque and position information from the FPGA, calculates the control algorithm, and sends the control effort to a regulated current converter through a 12-bit D/A converter and some analog signal processing circuits. In the experimental system, the input to the plant is the motor-torque command to the regulated current converter, and the information on the plant’s output is obtained from the torque sensor that measures the output torque of the harmonic drive gear to the load. Let \(y(s)\) denote the Laplace transform of the harmonic drive’s output torque \(y(t)\), and \(u(s)\) the Laplace transform of the commanded motor torque \(u(t)\) referred to the load side. The estimated nominal transfer function based on the measured frequency response is [5]

$$\begin{aligned} P_n (s)=\frac{y(s)}{u(s)}=\frac{4.8371\times 10^{10}}{s^{4}+998.95s^{3}+1.2272\times 10^{6}s^{2}+7.2805\times 10^{7}s+4.8480\times 10^{10}}. \end{aligned}$$
(1)
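As a quick sanity check on model (1), the following minimal sketch evaluates \(P_n (j\omega )\) at a few frequencies using only the coefficients quoted above. The helper names `polyval` and `p_nominal` and the probe frequencies are illustrative assumptions, not part of the original work.

```python
import cmath
import math

# Coefficients of the nominal model (1), highest power of s first.
NUM_GAIN = 4.8371e10
DEN = [1.0, 998.95, 1.2272e6, 7.2805e7, 4.8480e10]


def polyval(coeffs, s):
    """Evaluate a polynomial with Horner's rule (highest power first)."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc


def p_nominal(s):
    """Value of P_n(s) from equation (1) at the complex frequency s."""
    return NUM_GAIN / polyval(DEN, s)


# The DC gain is the ratio of the constant terms, slightly below one.
dc_gain = abs(p_nominal(0j))
print(f"DC gain = {dc_gain:.5f}")

# Probe frequencies in Hz are illustrative choices, not values from the paper.
for f_hz in (1.0, 10.0, 100.0, 1000.0):
    g = p_nominal(1j * 2.0 * math.pi * f_hz)
    print(f"{f_hz:7.1f} Hz  |P_n| = {abs(g):.5f}  "
          f"phase = {math.degrees(cmath.phase(g)):8.2f} deg")
```

The model has relative degree four, so its gain rolls off steeply at high frequency. This is why the filter \(Q_\mathrm{im} (s)\) introduced in the next subsection must have relative degree of at least four for the IMC controller to be proper.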

2.2 Design of an IMC torque controller

Consider a harmonic drive actuator described by

$$\begin{aligned} y=P(s) \left({u+d} \right) \end{aligned}$$
(2)

in which \(P(s)\) denotes the actual transfer function of the plant, and \(d\) represents all disturbances referred to the input. With a reference \(r\), the IMC structure using a nominal plant model \(P_n (s)\) in parallel with the actual plant \(P(s)\) is shown in Fig. 2, in which \(Q_\mathrm{im} (s)\) is a filter usually chosen so that the so-called IMC controller \(Q_\mathrm{im} (s)P_n^{-1} (s)\) is proper and then its implementation does not involve direct differentiation of the measured output signal. Whenever there is an output difference between the real plant and its nominal model, there is a nonzero feedback to the IMC controller. The output of the IMC system can be derived as

$$\begin{aligned} y(s)&= \frac{Q_\mathrm{im} (s)P(s)P_n^{-1} (s)}{1+Q_\mathrm{im} (s)\left( {P(s)P_n^{-1} (s)-1} \right) }r(s)\nonumber \\&+\frac{\left( {1-Q_\mathrm{im} (s)} \right)P(s)}{1+Q_\mathrm{im} (s)\left( {P(s)P_n^{-1} (s)-1} \right)}d(s). \end{aligned}$$
(3)

When the nominal model is exact \(\left( {P_n =P} \right)\) and there is no disturbance \(\left( {d=0} \right),\) we have \(y(s)=Q_\mathrm{im} (s)r(s),\) meaning that the nominal closed-loop transfer function of the IMC system is directly assigned to \(Q_\mathrm{im} (s).\) The IMC design is hence straightforward, and closed-loop characteristics are related straight to controller parameters [13].
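The statement that (3) collapses to \(y=Q_\mathrm{im} r\) for an exact model can be checked numerically at a single frequency. The sketch below is only an illustration: the value of \(\omega _c \), the probe frequency and the 20 % gain mismatch are hypothetical, and the fourth-order form of \(Q_\mathrm{im} \) anticipates the choice made in (4).

```python
import math

NUM_GAIN = 4.8371e10
DEN = [1.0, 998.95, 1.2272e6, 7.2805e7, 4.8480e10]


def polyval(coeffs, s):
    """Evaluate a polynomial with Horner's rule (highest power first)."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc


def p_nominal(s):
    """Nominal model P_n(s) of equation (1)."""
    return NUM_GAIN / polyval(DEN, s)


def q_im(s, wc):
    """IMC filter of the form used in equation (4)."""
    return wc ** 4 / (s + wc) ** 4


def imc_closed_loop(p, pn, q):
    """Return (T_r, T_d), the responses from r and d to y in equation (3)."""
    denom = 1.0 + q * (p / pn - 1.0)
    return q * (p / pn) / denom, (1.0 - q) * p / denom


WC = 2.0 * math.pi * 50.0       # hypothetical design value, rad/s
s = 1j * 2.0 * math.pi * 10.0   # hypothetical probe frequency, 10 Hz
pn = p_nominal(s)
q = q_im(s, WC)

t_exact, _ = imc_closed_loop(pn, pn, q)           # P = P_n
t_mismatch, _ = imc_closed_loop(1.2 * pn, pn, q)  # hypothetical 20 % gain error

print("exact model,  |T_r - Q_im| =", abs(t_exact - q))
print("gain mismatch, |T_r - Q_im| =", abs(t_mismatch - q))
```

With the exact model the deviation is at the level of rounding error, whereas the mismatched plant moves the closed-loop response away from \(Q_\mathrm{im} \).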

Fig. 2
figure 2

Structure of the IMC-based system

From (3), it is found that, when \(Q_\mathrm{im} (s)=1,\) we have \(y(s)=r(s);\) that is, the output signal \(y\) attains the reference command \(r\) instantaneously even in the presence of model mismatches and external disturbances. However, this perfect performance cannot be accomplished in practice since this usually requires control efforts larger than those the actuator can deliver, and the IMC controller \(Q_\mathrm{im} (s)P_n^{-1} (s)\) is hardly ever proper and cannot be implemented when \(Q_\mathrm{im} (s)=1\). For our fourth-order harmonic drive actuator, \(Q_\mathrm{im} (s)\) is chosen as

$$\begin{aligned} Q_\mathrm{im} (s)=\frac{\omega _c^4 }{(s+\omega _c )^{4}} \end{aligned}$$
(4)

in which \(\omega _c \) is a design parameter that specifies the desired closed-loop poles. The dynamics of the closed-loop system can be made faster (slower) by simply increasing (decreasing) \(\omega _c \). It has been experimentally shown in [5] that, although the IMC leads to well-damped output responses, it is not effective in compensating for torque ripples induced by the harmonic drive.
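The properness argument can be illustrated numerically: with the fourth-order filter (4), the IMC controller \(Q_\mathrm{im} (s)P_n^{-1} (s)\) has relative degree zero, so its gain should settle to the finite value \(\omega _c^4 /(4.8371\times 10^{10})\) at high frequency. The following sketch assumes a hypothetical \(\omega _c \); the text does not state the value used in the experiments.

```python
import math

NUM_GAIN = 4.8371e10
DEN = [1.0, 998.95, 1.2272e6, 7.2805e7, 4.8480e10]


def polyval(coeffs, s):
    """Evaluate a polynomial with Horner's rule (highest power first)."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc


def imc_controller(s, wc):
    """C(s) = Q_im(s) * P_n(s)^-1, with Q_im from (4) and P_n from (1)."""
    q = wc ** 4 / (s + wc) ** 4
    return q * polyval(DEN, s) / NUM_GAIN


WC = 2.0 * math.pi * 50.0       # hypothetical design value, rad/s
hf_limit = WC ** 4 / NUM_GAIN   # expected high-frequency gain of C(s)

for w in (1e2, 1e4, 1e6, 1e8):  # rad/s, illustrative probe points
    print(f"w = {w:8.0e} rad/s  |C| = {abs(imc_controller(1j * w, WC)):.6f}")
print(f"expected limit   |C| -> {hf_limit:.6f}")
```

A bounded high-frequency gain means the controller can be realized without differentiating the measured torque, but the limit grows with \(\omega _c^4 \), so a faster closed loop also amplifies sensor noise more strongly.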

3 Compensation for torque ripples

3.1 DOB-based learning control

Harmonic drive gears typically contain kinematic inaccuracies due to manufacturing and assembly errors, which lead to output-torque ripples that are periodic with respect to the angular displacement of the input shaft. Since the ripple has a period equal to half a rotation of the input shaft [25], the RLC scheme should be effective in reducing the torque ripple through repetitive trials. The previous learning control schemes [18, 20–24] update the learning control according to tracking errors, extracting disturbance information from output errors indirectly. The learning control’s objective, however, is to have a feedforward control cancel out the input disturbance that is equivalent to the whole system perturbation. Following this idea, this paper proposes a DOB-based learning control scheme, in which a DOB is used to evaluate the compensation error of the feedforward control for the learning. Let \(L\) denote the duration of one cycle, that is, the period of the periodic disturbance, so that \(d(t+L)=d(t)\). Figure 3 shows the structure of the proposed scheme, in which the IMC plays the role of a real-time feedback controller while the learning control is delayed by \(L\) before being applied to the plant. According to the proposed structure, we have the control law during the \(i\)th cycle

$$\begin{aligned} u=u_\mathrm{fb} +u_\mathrm{ff}^i \end{aligned}$$
(5)

in which \(u_\mathrm{fb} \) denotes the feedback control provided by the IMC, and \(u_\mathrm{ff}^i \) denotes the feedforward learning control during the \(i\)th cycle. Here, variables without a superscript denote signals in the current time frame. The learning control is updated by the following learning rule

$$\begin{aligned} u_\mathrm{ff}^{i+1} =\alpha (s)u_\mathrm{ff}^i +\left[ {Q_\mathrm{do} (s) u_\mathrm{fb} -Q_\mathrm{do} (s)P_n^{-1} (s)y} \right] \end{aligned}$$
(6)

in which \(Q_\mathrm{do} (s)\) determines the dynamics of the DOB, and the filter \(\alpha (s)\) is used to attenuate high-frequency components and thereby increase system robustness to high-frequency unmodeled dynamics and noise. The idea of using low-pass filters to increase the system’s insensitivity to imperfections at high frequencies can also be found in previous studies [21–24]. Define the tracking error \(e=r-y\). With the assumption that \(r^{i+1}=r^{i}\) and \(d^{i+1}=d^{i}\), it is shown in Appendix A that the relationship between the tracking errors in two consecutive cycles is described by

$$\begin{aligned} e^{i+1}&=\left( {\alpha -Q_\mathrm{do} H} \right)e^{i}+\left[ {P^{-1}+\left( {1-Q_\mathrm{im} } \right)^{-1}Q_\mathrm{im} P_n^{-1} } \right]^{-1} \\&\quad \times \left\{ {\left[ {\left( {1-\alpha } \right)P^{-1}+Q_\mathrm{do} P_n^{-1} } \right]r-\left( {1-\alpha } \right)d} \right\} \end{aligned}$$
(7)

in which \(e^{i}\) and \(e^{i+1}\) denote the tracking errors at the \(i\)th and (\(i+1\))th cycles, respectively, and \(H(s)=\left[ Q_\mathrm{im} +\left( {1-Q_\mathrm{im} } \right) P_n P^{-1}\right]^{-1}\). The robust stability conditions for the proposed learning system are then obtained as follows:

Fig. 3 Block diagram of the proposed DOB-based learning control system

(i) All roots of the following equation have negative real parts:

$$\begin{aligned} P^{-1}(s)+\left( {1-Q_\mathrm{im} (s)} \right)^{-1}Q_\mathrm{im} (s)P_n^{-1} (s)=0 \end{aligned}$$
(8)

(ii) For all values of \(s=j\omega \),

$$\begin{aligned} \left| {\alpha (s)-Q_\mathrm{do} (s)H(s)} \right|<1. \end{aligned}$$
(9)

Rearranging (8) after multiplying both sides by \(\left({1-Q_\mathrm{im} (s)}\right) P(s)\) yields \(1+Q_\mathrm{im} (s)\left( {P(s)P_n^{-1} (s)-1} \right)=0\). In view of (3), the stability condition (8) is thus equivalent to the stability condition for an IMC system, and a necessary condition for a stable learning system is that the IMC must stabilize the uncertain plant (2). On the other hand, the stability condition (9) is the requirement for a stable DOB-based learning process. When \(P_n (s)=P(s)\), we have \(H(s)=1\) and the condition (9) simplifies to \(\left| {\alpha (s)-Q_\mathrm{do} (s)} \right|<1\) for all values of \(s=j\omega \). Unlike in the previous learning control schemes, this stability condition is independent of the closed-loop transfer function of the real-time feedback system, and the dynamics of the proposed learning process can be adjusted independently of the tuning of the feedback compensator. Moreover, since the DOB in the proposed scheme is not directly involved in the real-time feedback loop, its dynamics can be made fast by increasing the cutoff frequency of \(Q_\mathrm{do} (s)\) without exciting high-frequency unmodeled dynamics. Fast dynamics of the DOB reduce the time lag in estimating the compensation error, and thus accelerate the learning process.

Assume that the stability conditions (8) and (9) are fulfilled. When \(\alpha =1\) and \(Q_\mathrm{do} \ne 0,\) we have from (7) the steady-state tracking error of the proposed DOB-based learning control system

$$\begin{aligned} e^{\infty }(s)=\left( {1-Q_\mathrm{im} (s)} \right) r(s) \end{aligned}$$
(10)

irrespective of model uncertainties and external disturbances. For comparison, the tracking error of the IMC system can be derived from (3) as

$$\begin{aligned} e(s)\!=\!\frac{1\!-\!Q_\mathrm{im} (s)}{1\!+\!Q_\mathrm{im} (s)\left(\!{P(s)P_n^{-1} (s)\!-\!1} \!\right) }\left(\! {r(s)\!-\!P(s)d(s)}\! \right) \end{aligned}$$
(11)

which shows that model uncertainties and external disturbances have certain influences on the tracking precision. Actually, (10) corresponds to (11) with \(P_n =P\) and \(d=0\), which means that the proposed learning control achieves the ideal closed-loop dynamics of the IMC system through repetitive trials. Therefore, the introduction of a plug-in DOB-based learning control to the IMC is advantageous in the sense that it reduces the sensitivity of system performance to modeling errors and unknown disturbances.
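The step from (7) to (10) rests on the identity \(\left[ {P^{-1}+\left( {1-Q_\mathrm{im} } \right)^{-1}Q_\mathrm{im} P_n^{-1} } \right]^{-1}=\left( {1-Q_\mathrm{im} } \right)HP_n \), which makes the fixed point of (7) with \(\alpha =1\) equal to \(\left( {1-Q_\mathrm{im} } \right)r\). The sketch below checks this numerically by iterating (7) at a single frequency, treating all signals as phasors. Every numerical choice in it is a hypothetical assumption made for illustration: the value of \(\omega _c \), a first-order stand-in for \(Q_\mathrm{do} \), a 20 % plant gain error, and the amplitudes of \(r\) and \(d\).

```python
import math

NUM_GAIN = 4.8371e10
DEN = [1.0, 998.95, 1.2272e6, 7.2805e7, 4.8480e10]


def polyval(coeffs, s):
    """Evaluate a polynomial with Horner's rule (highest power first)."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc


# Hypothetical values, chosen only to exercise the formulas.
WC = 2.0 * math.pi * 50.0       # IMC filter parameter, rad/s
WD = 2.0 * math.pi * 180.0      # corner of a first-order stand-in for Q_do
s = 1j * 2.0 * math.pi * 20.0   # probe frequency, 20 Hz
r, d = 1.0 + 0j, 0.5 + 0j       # phasors of reference and input disturbance
alpha = 1.0

pn = NUM_GAIN / polyval(DEN, s)  # nominal model (1)
p = 1.2 * pn                     # actual plant with 20 % gain error
q_im = WC ** 4 / (s + WC) ** 4   # filter (4)
q_do = WD / (s + WD)

h = 1.0 / (q_im + (1.0 - q_im) * pn / p)  # H(s) defined after (7)
rho = alpha - q_do * h                    # per-cycle error gain in (7)
g = 1.0 / (1.0 / p + q_im / ((1.0 - q_im) * pn))
forcing = g * (((1.0 - alpha) / p + q_do / pn) * r - (1.0 - alpha) * d)

# First cycle: no feedforward yet, so the error is that of the IMC alone, (11).
e_imc = (1.0 - q_im) / (1.0 + q_im * (p / pn - 1.0)) * (r - p * d)
e = e_imc
for _ in range(50):
    e = rho * e + forcing  # recursion (7)

e_ideal = (1.0 - q_im) * r  # steady-state error predicted by (10)
print("|rho|             =", abs(rho))
print("|e_imc - e_ideal| =", abs(e_imc - e_ideal))
print("|e_50  - e_ideal| =", abs(e - e_ideal))
```

In this setting the error gain has magnitude below one, so the recursion contracts and the error settles at the value given by (10) even though the plant gain and the disturbance differ from their nominal values.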

3.2 Implementation using Fourier series expansions

Currently, most advanced control schemes are implemented on digital microprocessor systems. Realizing the learning law (6) requires storing the feedforward signal \(u_\mathrm{ff} (\tau )\) for \(0\le \tau \le L\), which takes a lot of memory if the temporal resolution between two consecutive storage points needs to be fine. To avoid this problem, Fourier series expansions, which can be computed efficiently on a microprocessor, are applied to approximate the ideal feedforward signal with few parameters.

Fig. 4 Implementation of the proposed learning control scheme using truncated Fourier series

Let the Fourier series of the estimated disturbance signal during the \(i\)th cycle be expressed as

$$\begin{aligned} Q_\mathrm{do} \left[ {u_\mathrm{fb} -P_n^{-1} y} \right]=\sum _{m=-\infty }^\infty {\theta _m^i \phi _m } \end{aligned}$$
(12)

in which \(\theta _m^i \) is a Fourier coefficient, and \(\phi _m \) is a trigonometric function in Fourier series. Furthermore, let the feedforward compensation at the \(i\)th cycle be a trigonometric polynomial of degree \(M\), i.e.

$$\begin{aligned} u_\mathrm{ff}^i =\sum _{m=-M}^M {w_m^{i} \phi _m } \end{aligned}$$
(13)

in which \(M\) is a fixed integer, and \(w_m^{i} \) denotes the weight parameter associated with \(\phi _m \) at the \(i\)th cycle. The choice of \(M\) depends on how well the ideal feedforward compensation is to be approximated by the actual one (13). Increasing \(M\) reduces the approximation error between the ideal feedforward signal and the real one, but it complicates the implementation and requires more computational effort. The learning process adapts the weight parameters \(w_m^{i} \) so that the feedforward control compensates for the periodic disturbances that are related to \(\phi _m \) for \(-M\le m\le M\). The learning law is designed as

$$\begin{aligned} w_m^{i+1} =w_m^{i} +\theta _m^i \quad \text{ for}\quad -M\le m\le M. \end{aligned}$$
(14)

Figure 4 shows the structure of the proposed DOB-based learning control using Fourier series, in which CTFA and CTFS are the abbreviations of continuous-time Fourier analysis and continuous-time Fourier synthesis, respectively, and \(z^{-1}\) denotes the delay of one cycle in the discrete domain. The estimated disturbance signal from the DOB during one period is modeled by the CTFA using Fourier series expansions at selected frequencies. The resulting Fourier coefficients are then used to update parameters in the learning controller, and the feedforward control is obtained by restoring these parameters to a time-domain signal with the CTFS (13). It is shown in Appendix B that, for the proposed learning control system using the truncated Fourier series, the stability condition (8) remains the same, but the stability condition (9) is modified to

$$\begin{aligned} \left| {1-Q_\mathrm{do} (s)H(s)} \right|<1 \end{aligned}$$
(15)

for all values of \(s=j\omega _m ,\) in which \(\omega _m \) denotes the frequency corresponding to \(\phi _m \) for \(-M\le m\le M\). Using truncated Fourier series to compensate for disturbance components at selected frequencies has three benefits: (1) the Fourier series approximation is effective, since every trigonometric function in the series corresponds to a specific frequency and the functions are independent of one another in the frequency domain; (2) the resulting system is less sensitive to high-frequency imperfections, since high-frequency signals including noise are automatically eliminated by the CTFA at the selected frequencies; (3) it is convenient to realize, since only a few parameters are required to reconstruct the feedforward control signal by the CTFS, rather than time histories of the relevant signals, which would require a lot of memory in digital implementations.

3.3 Experimental results on ripple compensation

The DOB-based learning control using truncated Fourier series is applied to compensating for torque ripples that are periodic with respect to the angular position of the motor shaft. Since the torque ripples induced by harmonic drives are periodic functions of position instead of time, the time in the learning control formulation is implemented with the angular position. In our experiments, the length of one cycle is considered to be one motor revolution, i.e. \(L=2\pi \) (rad). Since the torque ripples contain a main component whose frequency is twice the angular frequency of the input shaft, the learning control is designed to compensate for that major component, and the feedforward compensation at the \(i\)th cycle is

$$\begin{aligned} u_\mathrm{ff}^i =w_a^i \cos (2q_m )+w_b^i \sin (2q_m ) \end{aligned}$$
(16)

in which \(q_m \) is the angular position of the motor shaft, and \(w_a^{i} \) and \(w_b^{i} \) denote the weight parameters at the \(i\)th cycle. Using the DOB’s output signal during the \(i\)th cycle, the proposed scheme calculates its Fourier coefficients \(a^{i}\) and \(b^{i}\), corresponding to \(\cos (2q_m )\) and \(\sin (2q_m )\), respectively. The weight parameters are then updated by

$$\begin{aligned} w_a^{i+1} =w_a^i +a^{i}, \quad w_b^{i+1} =w_b^i +b^{i}. \end{aligned}$$
(17)
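The update (16)–(17) can be sketched in a few lines. The example below is an idealised illustration, not the authors' implementation: it assumes an exact model and \(Q_\mathrm{do} =1\), so that the DOB output \(Q_\mathrm{do} \left[ {u_\mathrm{fb} -P_n^{-1} y} \right]\) reduces to \(-(u_\mathrm{ff} +d)\), and it uses a hypothetical ripple with an extra fifth harmonic that the learning is expected to ignore. The sample count and all amplitudes are made-up values.

```python
import math

N_SAMPLES = 360  # hypothetical number of position samples per motor revolution


def fourier_coeffs_2nd(signal):
    """cos(2q) and sin(2q) coefficients of one revolution sampled uniformly in q."""
    n = len(signal)
    a = b = 0.0
    for k, v in enumerate(signal):
        q = 2.0 * math.pi * k / n
        a += v * math.cos(2.0 * q)
        b += v * math.sin(2.0 * q)
    return 2.0 * a / n, 2.0 * b / n


def disturbance(q, amp_a, amp_b):
    """Hypothetical ripple: second harmonic plus a fifth harmonic left unlearned."""
    return (amp_a * math.cos(2.0 * q) + amp_b * math.sin(2.0 * q)
            + 0.3 * math.cos(5.0 * q))


def run_cycle(w_a, w_b, amp_a, amp_b):
    """One revolution: apply (16), record the idealised DOB output, update by (17)."""
    dob_output = []
    for k in range(N_SAMPLES):
        q = 2.0 * math.pi * k / N_SAMPLES
        u_ff = w_a * math.cos(2.0 * q) + w_b * math.sin(2.0 * q)  # equation (16)
        # Idealised DOB output (exact model, Q_do = 1): minus the residual disturbance.
        dob_output.append(-(u_ff + disturbance(q, amp_a, amp_b)))
    a, b = fourier_coeffs_2nd(dob_output)
    return w_a + a, w_b + b  # equation (17)


AMP_A, AMP_B = 0.8, -0.3  # hypothetical ripple amplitudes in Nm
w_a, w_b = 0.0, 0.0       # no feedforward during the first cycle
for cycle in range(3):
    w_a, w_b = run_cycle(w_a, w_b, AMP_A, AMP_B)
    print(f"after cycle {cycle + 1}: w_a = {w_a:+.6f}, w_b = {w_b:+.6f}")
```

Under these idealised assumptions the weights reach the negatives of the ripple amplitudes after a single cycle, while the fifth harmonic leaves them untouched. Only two numbers have to be stored between cycles, which is the memory saving argued for in Sect. 3.2.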

To verify the effectiveness of the proposed scheme, consider the harmonic drive actuator under two kinds of operating conditions: quasi-constant angular-speed motion and variable angular-speed motion.

Fig. 5 Tracking responses under quasi-constant speed motion

Quasi-constant angular-speed motion: The servomotor is forced to have an initial speed of approximately 11.4 (rps) at the beginning of a torque-control task, and the output torque of the harmonic drive actuator is then required to counteract the effects of the gravitational force exerted on the load, that is, the torque reference is \(r=-35.4\sin ({q_m }/N)\) (Nm) in our setup. When the output torque perfectly follows the reference, the load moves at a constant velocity as if it were in outer space, free of external forces. Figure 5 shows the tracking responses of the proposed scheme, in which \(Q_\mathrm{do} (s)\) is designed as a fourth-order low-pass Butterworth filter with a cutoff frequency of 180 Hz. Note that the feedforward control produced by the learning control is zero during the first cycle, and the torque response of the proposed scheme during the first motor revolution is hence nearly the same as that of the IMC without feedforward compensation. Figure 5 demonstrates that the torque ripples, whose frequency is twice the rotational frequency of the input shaft, are well compensated for by the proposed scheme after the first cycle under the quasi-constant speed motion.
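The design values quoted above allow a rough check of condition (15) at the ripple frequency. The sketch below is an approximation under stated assumptions: it takes \(H=1\) (exact model), treats the speed as constant at 11.4 rps so that the ripple sits near 22.8 Hz, and uses the standard normalized fourth-order Butterworth polynomial for \(Q_\mathrm{do} \). The function names are hypothetical.

```python
import math

CUTOFF_HZ = 180.0            # Butterworth cutoff quoted in the text
SPEED_RPS = 11.4             # approximate initial motor speed quoted in the text
RIPPLE_HZ = 2.0 * SPEED_RPS  # dominant ripple at twice the shaft frequency


def butterworth4(s, wc):
    """Fourth-order low-pass Butterworth response with cutoff wc (rad/s)."""
    x = s / wc
    k1 = 2.0 * math.sin(math.pi / 8.0)        # about 0.7654
    k2 = 2.0 * math.sin(3.0 * math.pi / 8.0)  # about 1.8478
    return 1.0 / ((x * x + k1 * x + 1.0) * (x * x + k2 * x + 1.0))


def learning_factor(f_hz, h=1.0):
    """Left-hand side of (15), |1 - Q_do(jw) H(jw)|, with H = 1 assumed by default."""
    s = 1j * 2.0 * math.pi * f_hz
    return abs(1.0 - butterworth4(s, 2.0 * math.pi * CUTOFF_HZ) * h)


at_ripple = learning_factor(RIPPLE_HZ)
at_cutoff = learning_factor(CUTOFF_HZ)
print(f"|1 - Q_do| at {RIPPLE_HZ:.1f} Hz: {at_ripple:.3f}")
print(f"|1 - Q_do| at {CUTOFF_HZ:.1f} Hz: {at_cutoff:.3f}")
```

Under these assumptions the factor at the ripple frequency is well below one, which is consistent with rapid convergence. At the cutoff frequency the same factor exceeds one because of the filter's phase lag, which suggests why it matters that the truncated-series condition (15) has to hold only at the selected frequencies \(\omega _m \).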

Fig. 6 Tracking responses under variable speed motion

Fig. 7 Tracking errors and their spectra under variable speed motion

Variable angular-speed motion: A torque-controlled actuator does not necessarily operate at a constant speed. To evaluate the performance of the proposed scheme further, we apply the following reference

$$\begin{aligned} r=\left\{ {\begin{array}{l@{\quad }l} -35.4\sin ({q_m }/N)-1.2\ \text{(Nm)}&\text{for}\quad q_m \ge -10\pi \ \text{(rad)}, \\ -35.4\sin ({q_m }/N)+1.2\ \text{(Nm)}&\text{for}\quad q_m <-10\pi \ \text{(rad)} \\ \end{array} } \right. \end{aligned}$$
(18)

which, when followed precisely, accelerates the load during the first five revolutions of the motor shaft, and decelerates the load afterwards. Figure 6 shows the dynamic response of the proposed scheme. From the output response of the IMC without feedforward compensation, it is seen that the magnitude of torque ripples varies with the angular speed of the input shaft. Moreover, the proposed scheme clearly improves on the tracking performance of the pure IMC even when the amplitude of torque ripples is time-varying. Based on the tracking responses shown in Fig. 6, Fig. 7 shows the tracking errors and their spectra. It reveals that the torque ripple at twice the frequency of the motor shaft is alleviated, while the other high-frequency components remain almost unchanged.

4 Conclusions

This paper presents a DOB-based learning control scheme to compensate for torque ripples induced by harmonic drives. The proposed scheme learns straight from the compensation-error signal evaluated by the DOB, rather than extracting the disturbance information from output errors. Since the DOB is not directly involved in the real-time feedback loop, its bandwidth can be set to be relatively large, and the proposed learning control has an excellent convergence property. The experimental results show that the proposed scheme effectively alleviates the major component of torque ripples induced by harmonic drives.